CalculusWithJuliaNotes.jl/quarto/308797b5/search.json
2022-08-11 13:00:43 -04:00

[
{
"objectID": "index.html",
"href": "index.html",
"title": "Calculus with Julia",
"section": "",
"text": "Calculus with Julia\n\n\n\nThis is a set of notes for learning calculus using the Julia language. Julia is an open-source programming language with an easy to learn syntax that is well suited for this task.\nRead “Getting started with Julia” to learn how to install and customize Julia for following along with these notes. Read “Julia interfaces to review different ways to interact with a Julia installation.\nSince the mid 90s there has been a push to teach calculus using many different points of view. The Harvard style rule of four says that as much as possible the conversation should include a graphical, numerical, algebraic, and verbal component. These notes use the programming language Julia to illustrate the graphical, numerical, and, at times, the algebraic aspects of calculus.\nThere are many examples of integrating a computer algebra system (such as Mathematica, Maple, or Sage) into the calculus conversation. Computer algebra systems can be magical. The popular WolframAlpha website calls the full power of Mathematica while allowing an informal syntax that is flexible enough to be used as a backend for Apples Siri feature. (“Siri what is the graph of x squared minus 4?”) For learning purposes, computer algebra systems model very well the algebraic/symbolic treatment of the material while providing means to illustrate the numeric aspects. Theses notes are a bit different in that Julia is primarily used for the numeric style of computing and the algebraic/symbolic treatment is added on. Doing the symbolic treatment by hand can be very beneficial while learning, and computer algebra systems make those exercises seem kind of redundant, as the finished product can be produced much easier.\nOur real goal is to get at the concepts using technology as much as possible without getting bogged down in the mechanics of the computer language. 
We feel Julia has a very natural syntax that makes the initial start up not so much more difficult than using a calculator, but with a language that has a tremendous upside. The notes restrict themselves to a reduced set of computational concepts. This set is sufficient for working many of the problems in calculus, but do not cover thoroughly many aspects of programming. (Those who are interested can go off on their own and Julia provides a rich opportunity to do so.) Within this restricted set, are operators that make many of the computations of calculus reduce to a function call of the form action(function, arguments...). With a small collection of actions that can be composed, many of the problems associated with introductory calculus can be attacked.\nThese notes are presented in pages covering a fairly focused concept, in a spirit similar to a section of a book. Just like a book, there are try-it-yourself questions at the end of each page. All have a limited number of self-graded answers. These notes borrow ideas from many sources, for example Strang (n.d.), Knill (n.d.), Schey (1997), Hass, Heil, and Weir (2018), Rogawski, Adams, and Franzosa (2019), several Wikipedia pages, and other sources..\nThese notes are accompanied by a Julia package CalculusWithJulia that provides some simple functions to streamline some common tasks and loads some useful packages that will be used repeatedly.\nThese notes are presented as a Quarto book. To learn more about Quarto books visit https://quarto.org/docs/books.\nThese notes may also be compiled into Pluto notebooks. As such, to accommodate Plutos design of only one global variable definition being allowed per notebook, there is frequent use of Unicode symbols for variable names.\nTo contribute say by suggesting addition topics, correcting a mistake, or fixing a typo click the “Edit this page” link.\n\nCalculus with Julia version {{< meta version }}, produced on {{< meta\n\n\n\n\nHass, Joel R., Christopher E. 
Heil, and Maurice D. Weir. 2018. Thomas Calculus. Pearson.\n\n\nKnill, Oliver. n.d. “Some Teaching Notes.” https://people.math.harvard.edu/~knill/teach/index.html.\n\n\nRogawski, Jon, Colin Adams, and Robert Franzosa. 2019. Calculus. Macmillan.\n\n\nSchey, H. M. 1997. Div, Grad, Curl, and All That. W.W. Norton.\n\n\nStrang, Gilbert. n.d. “Calculus Online Textbook.” https://ocw.mit.edu/courses/res-18-001-calculus-online-textbook-spring-2005/."
},
{
"objectID": "precalc/calculator.html",
"href": "precalc/calculator.html",
"title": "1  From calculator to computer",
"section": "",
"text": "Let us consider a basic calculator with buttons to add, subtract, multiply, divide, and take square roots. Using such a simple thing is certainly familiar for any reader of these notes. Indeed, a familiarity with a graphing calculator is expected. Julia makes these familiar tasks just as easy, offering numerous conveniences along the way. In this section we describe how.\nThe following image is the calculator that Google presents upon searching for “calculator.”\nThis calculator should have a familiar appearance with a keypad of numbers, a set of buttons for arithmetic operations, a set of buttons for some common mathematical functions, a degree/radian switch, and buttons for interacting with the calculator: Ans, AC (also CE), and =.\nThe goal here is to see the counterparts within Julia to these features.\nFor an illustration of a really basic calculator, have some fun watching this video:"
},
{
"objectID": "precalc/calculator.html#operations",
"href": "precalc/calculator.html#operations",
"title": "1  From calculator to computer",
"section": "1.1 Operations",
"text": "1.1 Operations\nPerforming a simple computation on the calculator typically involves hitting buttons in a sequence, such as “1”, “+”, “2”, “=” to compute 3 from adding 1 + 2. In Julia, the process is not so different. Instead of pressing buttons, the various values are typed in. So, we would have:\n\n1 + 2\n\n3\n\n\nSending an expression to Julias interpreter - the equivalent of pressing the “=” key on a calculator - is done at the command line by pressing the Enter or Return key, and in Pluto, also using the “play” icon, or the keyboard shortcut Shift-Enter. If the current expression is complete, then Julia evaluates it and shows any output. If the expression is not complete, Julias response depends on how it is being called. Within Pluto, a message about “premature end of input” is given. If the expression raises an error, this will be noted.\nThe basic arithmetic operations on a calculator are “+”, “-”, “×”, “÷”, and “\\(xʸ\\)”. These have parallels in Julia through the binary operators: +, -, *, /, and ^:\n\n1 + 2, 2 - 3, 3 * 4, 4 / 5, 5 ^ 6\n\n(3, -1, 12, 0.8, 15625)\n\n\nOn some calculators, there is a distinction between minus signs - the binary minus sign and the unary minus sign to create values such as \\(-1\\).\nIn Julia, the same symbol, “-”, is used for each:\n\n-1 - 2\n\n-3\n\n\nAn expression like \\(6 - -3\\), subtracting minus three from six, must be handled with some care. With the Google calculator, the expression must be entered with accompanying parentheses: \\(6 -(-3)\\). In Julia, parentheses may be used, but are not needed. However, if omitted, a space is required between the two minus signs:\n\n6 - -3\n\n9\n\n\n(If no space is included, the value “--” is parsed like a different, undefined, operation.)\n\n\nWarningJulia only uses one symbol for minus, but web pages may not! Copying and pasting an expression with a minus sign can lead to hard to understand errors such as: invalid character \"\". 
There are several Unicode symbols that look similar to the ASCII minus sign, but are different. These notes use a different character for the minus sign for the typeset math (e.g., \\(1 - \\pi\\)) than for the code within cells (e.g. 1 - 2). Thus, copying and pasting the typeset math may not work as expected.\n\n\n\n\n\n1.1.1 Examples\n\nExample\nFor everyday temperatures, the conversion from Celsius to Fahrenheit (\\(9/5 C + 32\\)) is well approximated by simply doubling and adding \\(30\\). Compare these values for an average room temperature, \\(C=20\\), and for a relatively chilly day, \\(C=5\\):\nFor \\(C=20\\):\n\n9 / 5 * 20 + 32\n\n68.0\n\n\nThe easy to compute approximate value is:\n\n2 * 20 + 30\n\n70\n\n\nThe difference is:\n\n(9/5*20 + 32) - (2 * 20 + 30)\n\n-2.0\n\n\nFor \\(C=5\\), we have the actual value of:\n\n9 / 5 * 5 + 32\n\n41.0\n\n\nand the easy to compute value is simply \\(40 = 10 + 30\\). The difference is\n\n(9 / 5 * 5 + 32) - 40\n\n1.0\n\n\n\n\nExample\nAdd the numbers \\(1 + 2 + 3 + 4 + 5\\).\n\n1 + 2 + 3 + 4 + 5\n\n15\n\n\n\n\nExample\nHow small is \\(1/2/3/4/5/6\\)? It is about \\(14/10,000\\), as this will show:\n\n1/2/3/4/5/6\n\n0.001388888888888889\n\n\n\n\nExample\nWhich is bigger \\(4^3\\) or \\(3^4\\)? We can check by computing their difference:\n\n4^3 - 3^4\n\n-17\n\n\nSo \\(3^4\\) is bigger.\n\n\nExample\nA right triangle has sides \\(a=11\\) and \\(b=12\\). Find the length of the hypotenuse squared. As \\(c^2 = a^2 + b^2\\) we have:\n\n11^2 + 12^2\n\n265"
},
{
"objectID": "precalc/calculator.html#order-of-operations",
"href": "precalc/calculator.html#order-of-operations",
"title": "1  From calculator to computer",
"section": "1.2 Order of operations",
"text": "1.2 Order of operations\nThe calculator must use some rules to define how it will evaluate its instructions when two or more operations are involved. We know mathematically, that when \\(1 + 2 \\cdot 3\\) is to be evaluated the multiplication is done first then the addition.\nWith the Google Calculator, typing 1 + 2 x 3 = will give the value \\(7\\), but if we evaluate the + sign first, via 1 + 2 = x 3 = the answer will be 9, as that will force the addition of 1+2 before multiplying. The more traditional way of performing that calculation is to use parentheses to force an evaluation. That is, (1 + 2) * 3 = will produce 9 (though one must type it in, and not use a mouse to enter). Except for the most primitive of calculators, there are dedicated buttons for parentheses to group expressions.\nIn Julia, the entire expression is typed in before being evaluated, so the usual conventions of mathematics related to the order of operations may be used. These are colloquially summarized by the acronym PEMDAS.\n\nPEMDAS. This acronym stands for Parentheses, Exponents, Multiplication, Division, Addition, Subtraction. The order indicates which operation has higher precedence, or should happen first. This isnt exactly the case, as “M” and “D” have the same precedence, as do “A” and “S”. In the case of two operations with equal precedence, associativity is used to decide which to do. For the operations +, -, *, / the associativity is left to right, as in the left one is done first, then the right. However, ^ has right associativity, so 4^3^2 is 4^(3^2) and not (4^3)^2. 
(Be warned that some calculators - and spreadsheets, such as Excel - will treat this expression with left associativity.)\n\nWith rules of precedence, an expression like the following has a clear interpretation to Julia without the need for parentheses:\n\n1 + 2 - 3 * 4 / 5 ^ 6\n\n2.999232\n\n\nWorking through PEMDAS we see that ^ is first, then * and then / (this is due to associativity and * being the leftmost expression of the two) and finally + and then -, again by associativity rules. So we should have the same value with:\n\n(1 + 2) - ((3 * 4) / (5 ^ 6))\n\n2.999232\n\n\nIf different parentheses are used, the answer will likely be different. For example, the following forces the operations to be -, then *, then +. The result of that is then divided by 5^6:\n\n(1 + ((2 - 3) * 4)) / (5 ^ 6)\n\n-0.000192\n\n\n\n1.2.1 Examples\n\nExample\nThe percentage error in \\(x\\) if \\(y\\) is the correct value is \\((x-y)/y \\cdot 100\\). Compute this if \\(x=100\\) and \\(y=98.6\\).\n\n(100 - 98.6) / 98.6 * 100\n\n1.4198782961460505\n\n\n\n\nExample\nThe marginal cost of producing one unit can be computed by finding the cost for \\(n+1\\) units and subtracting the cost for \\(n\\) units. If the cost of \\(n\\) units is \\(n^2 + 10\\), find the marginal cost when \\(n=100\\).\n\n(101^2 + 10) - (100^2 + 10)\n\n201\n\n\n\n\nExample\nThe average cost per unit is the total cost divided by the number of units. Again, if the cost of \\(n\\) units is \\(n^2 + 10\\), find the average cost for \\(n=100\\) units.\n\n(100^2 + 10) / 100\n\n100.1\n\n\n\n\nExample\nThe slope of the line through two points is \\(m=(y_1 - y_0) / (x_1 - x_0)\\). For the two points \\((1,2)\\) and \\((3,4)\\), find the slope of the line through them.\n\n(4 - 2) / (3 - 1)\n\n1.0\n\n\n\n\n\n1.2.2 Two ways to write division - and they are not the same\nThe expression \\(a + b / c + d\\) is equivalent to \\(a + (b/c) + d\\) due to the order of operations. 
It will generally have a different answer than \\((a + b) / (c + d)\\).\nHow would the following be expressed, were it written inline:\n\\[\n\\frac{1 + 2}{3 + 4}?\n\\]\nIt would have to be computed through \\((1 + 2) / (3 + 4)\\). This is because unlike /, the implied order of operation in the mathematical notation with the horizontal division symbol (the vinculum) is to compute the top and the bottom and then divide. That is, the vinculum is a grouping notation like parentheses, only implicitly so. Thus the above expression really represents the more verbose:\n\\[\n\\frac{(1 + 2)}{(3 + 4)}.\n\\]\nThis lends itself readily to the translation:\n\n(1 + 2) / (3 + 4)\n\n0.42857142857142855\n\n\nTo emphasize, this is not the same as the value without the parentheses:\n\n1 + 2 / 3 + 4\n\n5.666666666666666\n\n\n\n\n\n\n\n\nWarning\n\n\n\nThe vinculum also indicates grouping when used with the square root (the top bar), and complex conjugation. That usage is often clear enough, but the usage of the vinculum in division often leads to confusion. The example above is one where the parentheses are often, erroneously, omitted. However, more confusion can arise when there is more than one vinculum. An expression such as \\(a/b/c\\) written inline causes no confusion: it is \\((a/b) / c\\), as left association is used; but when written with a pair of vincula there is often the typographical convention of a slightly longer vinculum to indicate which is to be considered first. In the absence of that, top-to-bottom association is often implied.\n\n\n\n\n1.2.3 Infix, postfix, and prefix notation\nThe factorial button on the Google calculator creates an expression like 14! that is then evaluated. The operator, !, appears after the value (14) that it is applied to. This is called postfix notation. When a unary minus sign is used, as in -14, the minus sign occurs before the value it operates on. This uses prefix notation. 
These concepts can be extended to binary operations, where a third possibility is provided: infix notation, where the operator is between the two values. The infix notation is common for our familiar mathematical operations. We write 14 + 2 and not + 14 2 or 14 2 +. (Though if we had an old reverse-Polish notation calculator, we would enter 14 2 +!) In Julia, there are several infix operators, such as +, -, … and others that we may be unfamiliar with. These mirror the familiar notation from most math texts.\n\n\n\n\n\n\nNote\n\n\n\nIn Julia many infix operations can be written in a prefix manner. For example, 14 + 2 can also be evaluated by +(14,2). There are very few postfix operations, though in these notes we will overload one, the ' operation, to indicate a derivative."
},
{
"objectID": "precalc/calculator.html#constants",
"href": "precalc/calculator.html#constants",
"title": "1  From calculator to computer",
"section": "1.3 Constants",
"text": "1.3 Constants\nThe Google calculator has two built in constants, e and π. Julia provides these as well, though not quite as easily. First, π is just pi:\n\npi\n\nπ = 3.1415926535897...\n\n\nWhereas, e is is not simply the character e, but rather a Unicode character typed in as \\euler[tab].\n\n\n\n = 2.7182818284590...\n\n\n\n\n\n\n\n\nNote\n\n\n\nHowever, when the accompanying package, CalculusWithJulia, is loaded, the character e will refer to a floating point approximation to the Euler constant .\n\n\nIn the sequel, we will just use e for this constant (though more commonly the exp function), with the reminder that base Julia alone does not reserve this symbol.\nMathematically these are irrational values with decimal expansions that do not repeat. Julia represents these values internally with additional accuracy beyond that which is displayed. Math constants can be used as though they were numbers, such is done with this expression:\n\n^(1/(2*pi))\n\n1.17251960642002\n\n\n\n\n\n\n\n\nWarning\n\n\n\nIn most cases. There are occasional (basically rare) spots where using pi by itself causes an eror where 1*pi will not. The reason is 1*pi will create a floating point value from the irrational object, pi.\n\n\n\n1.3.1 Numeric literals\nFor some special cases, Julia implements multiplication without a multiplication symbol. This is when the value on the left is a number, as in 2pi, which has an equivalent value to 2*pi. However the two are not equivalent, in that multiplication with numeric literals does not have the same precedence as regular multiplication - it is higher. This has practical importance when used in division or powers. For instance, these two are not the same:\n\n1/2pi, 1/2*pi\n\n(0.15915494309189535, 1.5707963267948966)\n\n\nWhy? 
Because in the first, 2pi is computed before the division, as multiplication with numeric literals has higher precedence than regular multiplication, which is at the same level as division.\nTo confuse things even more, consider\n\n2pi^2pi\n\n2658.978166443007\n\n\nIs this the same as 2 * (pi^2) * pi or (2pi)^(2pi)? The former would be the case if powers had higher precedence than literal multiplication, the latter would be the case were it the reverse. In fact, the correct answer is 2 * (pi^(2*pi)):\n\n2pi^2pi, 2 * (pi^2) * pi, (2pi)^(2pi), 2 * (pi^(2pi))\n\n(2658.978166443007, 62.01255336059963, 103540.92043427199, 2658.978166443007)\n\n\nThis follows usual mathematical convention, but is a source of potential confusion. It can be best to be explicit about multiplication, save for the simplest of cases."
},
{
"objectID": "precalc/calculator.html#functions",
"href": "precalc/calculator.html#functions",
"title": "1  From calculator to computer",
"section": "1.4 Functions",
"text": "1.4 Functions\nOn the Google calculator, the square root button has a single purpose: for the current value find a square root if possible, and if not signal an error (such as what happens if the value is negative). For more general powers, the \\(x^y\\) key can be used.\nIn Julia, functions are used to perform the actions that a specialized button may do on the calculator. Julia provides many standard mathematical functions - more than there could be buttons on a calculator - and allows the user to easily define their own functions. For example, Julia provides the same set of functions as on Googles calculator, though with different names. For logarithms, \\(\\ln\\) becomes log and \\(\\log\\) is log10 (computer programs almost exclusively reserve log for the natural log); for factorials, \\(x!\\), there is factorial; for powers \\(\\sqrt{}\\) becomes sqrt, \\(EXP\\) becomes exp, and \\(x^y\\) is computed with the infix operator ^. For the trigonometric functions, the basic names are similar: sin, cos, tan. These expect radians. For angles in degrees, the convenience functions sind, cosd, and tand are provided. On the calculator, inverse functions like \\(\\sin^{-1}(x)\\) are done by combining \\(Inv\\) with \\(\\sin\\). With Julia, the function name is asin, an abbreviation for “arcsine.” (Which is a good thing, as the notation using a power of \\(-1\\) is often a source of confusion and is not supported by Julia without work.) 
Similarly, there are asind, acos, acosd, atan, and atand functions available to the Julia user.\nThe following table summarizes the above:\n\n\n\n\n\n\nCalculator | Julia\n\n\\(+\\), \\(-\\), \\(\\times\\), \\(\\div\\)\n+, -, *, /\n\n\\(x^y\\)\n^\n\n\\(\\sqrt{}, \\sqrt[3]{}\\)\nsqrt, cbrt\n\n\\(e^x\\)\nexp\n\n\\(\\ln\\), \\(\\log\\)\nlog, log10\n\n\\(\\sin, \\cos, \\tan, \\sec, \\csc, \\cot\\)\nsin, cos, tan, sec, csc, cot\n\nIn degrees, not radians\nsind, cosd, tand, secd, cscd, cotd\n\n\\(\\sin^{-1}, \\cos^{-1}, \\tan^{-1}\\)\nasin, acos, atan\n\n\\(n!\\)\nfactorial\n\n\n\n\n\n\n\nUsing a function is very straightforward. A function is called using parentheses, in a manner visually similar to how a function is called mathematically. So if we consider the sqrt function, we have:\n\nsqrt(4), sqrt(5)\n\n(2.0, 2.23606797749979)\n\n\nThe function is referred to by name (sqrt) and called with parentheses. Any arguments are passed into the function using commas to separate values, should there be more than one. When a function takes numerous arguments, they may need to be given in a specific order or may possibly be specified with keywords. (A semicolon can be used instead of a comma to separate keyword arguments.)\nSome more examples:\n\nexp(2), log(10), sqrt(100), 10^(1/2)\n\n(7.38905609893065, 2.302585092994046, 10.0, 3.1622776601683795)\n\n\n\n\n\n\n\n\nNote\n\n\n\nParentheses have many roles. We've just seen that parentheses may be used for grouping, and now we see they are used to indicate a function is being called. These are familiar from their parallel usage in traditional math notation. In Julia, a third usage is common, the making of a “tuple,” or a container of different objects, for example (1, sqrt(2), pi). In these notes, the output of multiple commands separated by commas is a printed tuple.\n\n\n\n1.4.1 Multiple arguments\nFor the logarithm, we mentioned that log is the natural log and log10 implements the logarithm base 10. 
There is also log2. However, in general there is no logb for any base b. Instead, the basic log function can take two arguments. When it does, the first is the base, and the second the value to take the logarithm of. This avoids forcing the user to remember that \\(\\log_b(x) = \\log(x)/\\log(b)\\).\nSo we have all these different, but related, ways to find logarithms:\n\nlog(e), log(2, e), log(10, e), log(e, 2)\n\n(1.0, 1.4426950408889634, 0.43429448190325176, 0.6931471805599453)\n\n\nIn Julia, the “generic” function log not only has different implementations for different types of arguments (real or complex), but also has a different implementation depending on the number of arguments.\n\n\n1.4.2 Examples\n\nExample\nA right triangle has sides \\(a=11\\) and \\(b=12\\). Find the length of the hypotenuse. As \\(c^2 = a^2 + b^2\\) we have:\n\nsqrt(11^2 + 12^2)\n\n16.278820596099706\n\n\n\n\nExample\nA formula from statistics to compute the standard deviation of a binomial random variable for parameters \\(p\\) and \\(n\\) is \\(\\sqrt{n p (1-p)}\\). Compute this value for \\(p=1/4\\) and \\(n=10\\).\n\nsqrt(10 * 1/4 * (1 - 1/4))\n\n1.3693063937629153\n\n\n\n\nExample\nFind the distance between the points \\((-3, -4)\\) and \\((5,6)\\). Using the distance formula \\(\\sqrt{(x_1-x_0)^2+(y_1-y_0)^2}\\), we have:\n\nsqrt((5 - -3)^2 + (6 - -4)^2)\n\n12.806248474865697\n\n\n\n\nExample\nThe formula to compute the resistance of two resistors in parallel is given by: \\(1/(1/r_1 + 1/r_2)\\). Suppose the resistance is \\(10\\) in one resistor and \\(20\\) in the other. What is the resistance in parallel?\n\n1 / (1/10 + 1/20)\n\n6.666666666666666"
},
{
"objectID": "precalc/calculator.html#errors",
"href": "precalc/calculator.html#errors",
"title": "1  From calculator to computer",
"section": "1.5 Errors",
"text": "1.5 Errors\nNot all computations on a calculator are valid. For example, the Google calculator will display Error as the output of \\(0/0\\) or \\(\\sqrt{-1}\\). These are also errors mathematically, though the second is not if the complex numbers are considered.\nIn Julia, there is a richer set of error types. The value 0/0 will in fact not be an error, but rather a value NaN. This is a special floating point value indicating “not a number” and is the result for various operations. The output of \\(\\sqrt{-1}\\) (computed via sqrt(-1)) will indicate a domain error:\n\nsqrt(-1)\n\nLoadError: DomainError with -1.0:\nsqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).\n\n\nFor integer or real-valued inputs, the sqrt function expects non-negative values, so that the output will always be a real number.\nThere are other types of errors. Overflow is a common one on most calculators. The value of \\(1000!\\) is actually very large (over 2500 digits large). On the Google calculator it returns Infinity, a slight stretch. For factorial(1000) Julia returns an OverflowError. This means that the answer is too large to be represented as a regular integer.\n\nfactorial(1000)\n\nLoadError: OverflowError: 1000 is too large to look up in the table; consider using `factorial(big(1000))` instead\n\n\nHow Julia handles overflow is a study in tradeoffs. For integer operations that demand high performance, Julia does not check for overflow. So, for example, if we are not careful strange answers can be had. Consider the difference here between powers of 2:\n\n2^62, 2^63\n\n(4611686018427387904, -9223372036854775808)\n\n\nOn a machine with \\(64\\)-bit integers, the first of these two values is correct, the second, clearly wrong, as the answer given is negative. This is due to overflow. The cost of checking is considered too high, so no error is thrown. 
The user is expected to have a sense that they need to be careful when their values are quite large. (Or the user can use floating point numbers, which though not always exact, can represent much bigger values and are exact for a reasonably wide range of integer values.)\n\n\n\n\n\n\nWarning\n\n\n\nIn a turnaround from a classic blues song, we can think of Julia as built for speed, not for comfort. All of these errors above could be worked around so that the end user doesn't see them. However, this would require slowing things down, either through checking of operations or allowing different types of outputs for similar types of inputs. These are tradeoffs that are not made, for performance reasons. For the most part, the tradeoffs don't get in the way, but learning where to be careful takes some time. Error messages often suggest a proper alternative.\n\n\n\nExample\nDid Homer Simpson disprove Fermat's Last Theorem?\nFermat's Last Theorem states there are no solutions over the integers to \\(a^n + b^n = c^n\\) when \\(n > 2\\). In the photo accompanying the linked article, we see:\n\\[\n3987^{12} + 4365^{12} - 4472^{12}.\n\\]\nIf you were to do this on most calculators, the answer would be \\(0\\). Were this true, it would show that there is at least one solution to \\(a^{12} + b^{12} = c^{12}\\) over the integers - hence Fermat would be wrong. So is it \\(0\\)?\nWell, let's try something with Julia to see. Being clever, we check if \\((3987^{12} + 4365^{12})^{1/12} = 4472\\):\n\n(3987^12 + 4365^12)^(1/12)\n\n28.663217591132355\n\n\nNot even close. Case closed. But wait! This number to be found must be at least as big as \\(3987\\) and we got \\(28\\). D'oh! Something can't be right. Well, maybe integer powers are the issue. (The largest \\(64\\)-bit integer is less than \\(10^{19}\\) and we can see that \\((4\\cdot 10^3)^{12}\\) is bigger than \\(10^{36}\\)). 
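A quick hedged check with the exact, arbitrary-precision integers mentioned above confirms the sum is not truly \(0\):

```julia
# exact integer arithmetic: the "counterexample" misses
@assert big(3987)^12 + big(4365)^12 - big(4472)^12 != 0
```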
Trying again using floating point values for the base, we see:\n\n(3987.0^12 + 4365.0^12)^(1/12)\n\n4472.000000007058\n\n\nAh, we see something really close to \\(4472\\), but not exactly. Why do most calculators get this last part wrong? It isn't that they don't use floating point, but rather the difference between the two numbers:\n\n(3987.0^12 + 4365.0^12)^(1/12) - 4472\n\n7.057678885757923e-9\n\n\nis less than \\(10^{-8}\\), so on a display with \\(8\\) digits it may be rounded to \\(0\\).\nMoral: with Julia and with calculators, we still have to be mindful not to blindly accept an answer."
},
{
"objectID": "precalc/calculator.html#questions",
"href": "precalc/calculator.html#questions",
"title": "1  From calculator to computer",
"section": "1.6 Questions",
"text": "1.6 Questions\n\nQuestion\nCompute \\(22/7\\) with Julia.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute \\(\\sqrt{220}\\) with Julia.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute \\(2^8\\) with Julia.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute the value of\n\\[\n\\frac{9 - 5 \\cdot (3-4)}{6 - 2}.\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute the following using Julia:\n\\[\n\\frac{(.25 - .2)^2}{(1/4)^2 + (1/3)^2}\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute the decimal representation of the following using Julia:\n\\[\n1 + \\frac{1}{2} + \\frac{1}{2^2} + \\frac{1}{2^3} + \\frac{1}{2^4}\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute the following using Julia:\n\\[\n\\frac{3 - 2^2}{4 - 2\\cdot3}\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute the following using Julia:\n\\[\n(1/2) \\cdot 32 \\cdot 3^2 + 100 \\cdot 3 - 20\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWich of the following is a valid Julia expression for\n\\[\n\\frac{3 - 2}{4 - 1}\n\\]\nthat uses the least number of parentheses?\n\n\n\n \n \n \n \n \n \n \n \n \n 3 - 2 / (4 - 1)\n \n \n\n\n \n \n \n \n (3 - 2) / (4 - 1)\n \n \n\n\n \n \n \n \n (3 - 2)/ 4 - 1\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWich of the following is a valid Julia expression for\n\\[\n\\frac{3\\cdot2}{4}\n\\]\nthat uses the least number of parentheses?\n\n\n\n \n \n \n \n \n \n \n \n \n (3 * 2) / 4\n \n \n\n\n \n \n \n \n 3 * 2 / 4\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhich of the following is a valid Julia expression for\n\\[\n2^{4 - 2}\n\\]\nthat uses the least number of parentheses?\n\n\n\n \n \n \n \n \n \n \n \n \n (2 ^ 4) - 2\n \n \n\n\n \n \n \n \n 2 ^ 4 - 2\n \n \n\n\n \n \n \n \n 2 ^ (4 - 
2)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIn the U.S. version of the Office, the opening credits include a calculator calculation. The key sequence shown is 9653 + which produces 11532. What value was added to?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWe saw that 1 / 2 / 3 / 4 / 5 / 6 is about \\(14\\) divided by \\(10,000\\). But what would be a more familiar expression representing it:\n\n\n\n \n \n \n \n \n \n \n \n \n 1 / (2 / 3 / 4 / 5 / 6)\n \n \n\n\n \n \n \n \n 1 /(2 * 3 * 4 * 5 * 6)\n \n \n\n\n \n \n \n \n 1 / 2 * 3 / 4 * 5 / 6\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nOne of these three expressions will produce a different answer, select that one:\n\n\n\n \n \n \n \n \n \n \n \n \n 2 - (3 - 4)\n \n \n\n\n \n \n \n \n 2 - 3 - 4\n \n \n\n\n \n \n \n \n (2 - 3) - 4\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nOne of these three expressions will produce a different answer, select that one:\n\n\n\n \n \n \n \n \n \n \n \n \n 2 - 3 * 4\n \n \n\n\n \n \n \n \n (2 - 3) * 4\n \n \n\n\n \n \n \n \n 2 - (3 * 4)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nOne of these three expressions will produce a different answer, select that one:\n\n\n\n \n \n \n \n \n \n \n \n \n -(1^2)\n \n \n\n\n \n \n \n \n (-1)^2\n \n \n\n\n \n \n \n \n -1^2\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat is the value of \\(\\sin(\\pi/10)\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat is the value of \\(\\sin(52^\\circ)\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat is the value of\n\\[\n\\frac{\\sin(\\pi/3) - 1/2}{\\pi/3 - \\pi/6}\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIs \\(\\sin^{-1}(\\sin(3\\pi/2))\\) equal to \\(3\\pi/2\\)? 
(The “arc” functions do not use power notation, but instead a prefix of a.)\n\nYes\n\nNo\n\nQuestion\nWhat is the value of round(3.5000)\n\nQuestion\nWhat is the value of sqrt(32 - 12)\n\nQuestion\nWhich is greater \\(e^\\pi\\) or \\(\\pi^e\\)?\n\n\\(e^{\\pi}\\)\n\n\\(\\pi^{e}\\)\n\nQuestion\nWhat is the value of \\(\\pi - (x - \\sin(x)/\\cos(x))\\) when \\(x=3\\)?\n\nQuestion\nFactorials in Julia are computed with the function factorial, not the postfix operator !, as with math notation. What is \\(10!\\)?\n\nQuestion\nWill -2^2 produce 4 (which is a unary - evaluated before ^) or -4 (which is a unary - evaluated after ^)?\n\n-4\n\n4\n\nQuestion\nA twitter post from popular mechanics generated some attention.\n\nWhat is the answer?\n\nDoes this expression return the correct answer using proper order of operations?\n\n8÷2(2+2)\n\n1\n\nYes\n\nNo\n\nWhy or why not:\n\nOf course it is correct.\n\nThe precedence of numeric literal coefficients used for implicit multiplication is higher than other binary operators such as multiplication (*), and division (/, \\, and //)"
},
{
"objectID": "precalc/variables.html",
"href": "precalc/variables.html",
"title": "2  Variables",
"section": "",
"text": "Screenshot of a calculator provided by the Google search engine.\n\n\nThe Google calculator has a button Ans to refer to the answer to the previous evaluation. This is a form of memory. The last answer is stored in a specific place in memory for retrieval when Ans is used. In some calculators, more advanced memory features are possible. For some, it is possible to push values onto a stack of values for them to be referred to at a later time. This proves useful for complicated expressions, say, as the expression can be broken into smaller intermediate steps to be computed. These values can then be appropriately combined. This strategy is a good one, though the memory buttons can make its implementation a bit cumbersome.\nWith Julia, as with other programming languages, it is very easy to refer to past evaluations. This is done by assignment whereby a computed value stored in memory is associated with a name. The name can be used to look up the value later. Assignment does not change the value of the object being assigned, it only introduces a reference to it.\nAssignment in Julia is handled by the equals sign and takes the general form variable_name = value. For example, here we assign values to the variables x and y\n\nx = sqrt(2)\ny = 42\n\n42\n\n\nIn an assignment, the right hand side is always returned, so it appears nothing has happened. However, the values are there, as can be checked by typing their name\n\nx\n\n1.4142135623730951\n\n\nJust typing a variable name (without a trailing semicolon) causes the assigned value to be displayed.\nVariable names can be reused, as here, where we redefine x:\n\nx = 2\n\n2\n\n\n\n\n\n\n\n\nNote\n\n\n\nThe Pluto interface for Julia is idiosyncratic, as variables are reactive. This interface allows changes to a variable x to propogate to all other cells referring to x. 
Consequently, the variable name can only be assigned once per notebook unless the name is in some other namespace, which can be arranged by including the assignment inside a function or a let block.\n\n\nJulia is referred to as a “dynamic language” which means (in most cases) that a variable can be reassigned with a value of a different type, as we did with x where first it was assigned to a floating point value then to an integer value. (Though we meet some cases - generic functions - where Julia balks at reassigning a variable if the type is different.)\nMore important than displaying a value is the use of variables to build up more complicated expressions. For example, to compute\n\\[\n\\frac{1 + 2 \\cdot 3^4}{5 - 6/7}\n\\]\nwe might break it into the grouped pieces implied by the mathematical notation:\n\ntop = 1 + 2*3^4\nbottom = 5 - 6/7\ntop/bottom\n\n39.34482758620689\n\n\n\n\n\n\nImagine we have the following complicated expression related to the trajectory of a projectile with wind resistance:\n\\[\n \\left(\\frac{g}{k v_0\\cos(\\theta)} + \\tan(\\theta) \\right) t + \\frac{g}{k^2}\\ln\\left(1 - \\frac{k}{v_0\\cos(\\theta)} t \\right)\n\\]\nHere \\(g\\) is the gravitational constant \\(9.8\\) and \\(v_0\\), \\(\\theta\\), and \\(k\\) are parameters, which we take to be \\(200\\), \\(45\\) degrees, and \\(1/2\\) respectively. With these values, the above expression can be computed when \\(t=100\\):\n\ng = 9.8\nv0 = 200\ntheta = 45\nk = 1/2\nt = 100\na = v0 * cosd(theta)\n(g/(k*a) + tand(theta))* t + (g/k^2) * log(1 - (k/a)*t)\n\n96.75771791632161\n\n\nBy defining a new variable a to represent a value that is repeated a few times in the expression, the last command is greatly simplified. Doing so makes it much easier to check for accuracy against the expression to compute.\n\n\n\nA common expression in mathematics is a polynomial expression, for example \\(-16s^2 + 32s - 12\\). 
Translating this to Julia at \\(s=3\\) we might have:\n\ns = 3\n-16*s^2 + 32*s - 12\n\n-60\n\n\nThis looks nearly identical to the mathematical expression, but we inserted * to indicate multiplication between the constant and the variable. In fact, this step is not needed as Julia allows numeric literals to have an implied multiplication:\n\n-16s^2 + 32s - 12\n\n-60"
},
{
"objectID": "precalc/variables.html#where-math-and-computer-notations-diverge",
"href": "precalc/variables.html#where-math-and-computer-notations-diverge",
"title": "2  Variables",
"section": "2.2 Where math and computer notations diverge",
"text": "2.2 Where math and computer notations diverge\nIt is important to recognize that = to Julia is not in analogy to how \\(=\\) is used in mathematical notation. The following Julia code is not an equation:\n\nx = 3\nx = x^2\n\n9\n\n\nWhat happens instead? The right hand side is evaluated (x is squared), the result is stored and bound to the variable x (so that x will end up pointing to the new value, 9, and not the original one, 3); finally the value computed on the right-hand side is returned and in this case displayed, as there is no trailing semicolon to suppress the output.\nThis is completely unlike the mathematical equation \\(x = x^2\\) which is typically solved for values of \\(x\\) that satisfy the equation (\\(0\\) and \\(1\\)).\n\nExample\nHaving = as assignment is usefully exploited when modeling sequences. For example, an application of Newtons method might end up with this expression:\n\\[\nx_{i+1} = x_i - \\frac{x_i^2 - 2}{2x_i}\n\\]\nAs a mathematical expression, for each \\(i\\) this defines a new value for \\(x_{i+1}\\) in terms of a known value \\(x_i\\). This can be used to recursively generate a sequence, provided some starting point is known, such as \\(x_0 = 2\\).\nThe above might be written instead with:\n\nx = 2\nx = x - (x^2 - 2) / (2x)\nx = x - (x^2 - 2) / (2x)\n\n1.4166666666666667\n\n\nRepeating this last line will generate new values of x based on the previous one - no need for subscripts. This is exactly what the mathematical notation indicates is to be done."
},
{
"objectID": "precalc/variables.html#context",
"href": "precalc/variables.html#context",
"title": "2  Variables",
"section": "2.3 Context",
"text": "2.3 Context\nThe binding of a value to a variable name happens within some context. For our simple illustrations, we are assigning values, as though they were typed at the command line. This stores the binding in the Main module. Julia looks for variables in this module when it encounters an expression and the value is substituted. Other uses, such as when variables are defined within a function, involve different contexts which may not be visible within the Main module.\n\n\n\n\n\n\nNote\n\n\n\nThe varinfo function will list the variables currently defined in the main workspace. There is no mechanism to delete a single variable.\n\n\n\n\n\n\n\n\nWarning\n\n\n\nShooting oneselves in the foot. Julia allows us to locally redefine variables that are built in, such as the value for pi or the function object assigned to sin. For example, this is a perfectly valid command sin=3. However, it will overwrite the typical value of sin so that sin(3) will be an error. At the terminal, the binding to sin occurs in the Main module. This shadows that value of sin bound in the Base module. Even if redefined in Main, the value in base can be used by fully qualifying the name, as in Base.sin(pi). This uses the notation module_name.variable_name to look up a binding in a module."
},
{
"objectID": "precalc/variables.html#variable-names",
"href": "precalc/variables.html#variable-names",
"title": "2  Variables",
"section": "2.4 Variable names",
"text": "2.4 Variable names\nJulia has a very wide set of possible names for variables. Variables are case sensitive and their names can include many Unicode characters. Names must begin with a letter or an appropriate Unicode value (but not a number). There are some reserved words, such as try or else which can not be assigned to. However, many built-in names can be locally overwritten. Conventionally, variable names are lower case. For compound names, it is not unusual to see them squished together, joined with underscores, or written in camelCase.\n\nvalue_1 = 1\na_long_winded_variable_name = 2\nsinOfX = sind(45)\n__private = 2 # a convention\n\n2\n\n\n\n2.4.1 Unicode names\nJulia allows variable names to use Unicode identifiers. Such names allow julia notation to mirror that of many mathematical texts. For example, in calculus the variable \\(\\epsilon\\) is often used to represent some small number. We can assign to a symbol that looks like \\(\\epsilon\\) using Julias LaTeX input mode. Typing \\epsilon[tab] will replace the text with the symbol within IJulia or the command line.\n\nϵ = 1e-10\n\n1.0e-10\n\n\nEntering Unicode names follows the pattern of “slash” + LaTeX name + [tab] key. Some other ones that are useful are \\delta[tab], \\alpha[tab], and \\beta[tab], though there are hundreds of other values defined.\nFor example, we could have defined theta (\\theta[tab]) and v0 (v\\_0[tab]) using Unicode to make them match more closely the typeset math:\n\nθ = 45; v₀ = 200\n\n200\n\n\n\n\n\n\n\n\nUnicode\n\n\n\nThese notes can be presented as HTML files or as Pluto notebooks. 
They often use Unicode alternatives to avoid the Pluto requirement that a variable name be assigned only once in a notebook unless the assignment is placed in a let block or a function body.\n\n\n\n\n\n\nEmojis\n\n\n\nThere is even support for tab-completion of emojis such as \\\\:snowman:[tab] or \\\\:koala:[tab]\n\n\n\nExample\nAs mentioned, the value of \\(e\\) is bound to the Unicode value \\euler[tab] and not the letter e, so Unicode entry is required to access this constant. (This isn't quite true: the MathConstants module defines e, as well as a few other values accessed via Unicode. When the CalculusWithJulia package is loaded, as will often be done in these notes, a value of exp(1) is assigned to e.)"
},
{
"objectID": "precalc/variables.html#tuple-assignment",
"href": "precalc/variables.html#tuple-assignment",
"title": "2  Variables",
"section": "2.5 Tuple assignment",
"text": "2.5 Tuple assignment\nIt is a common task to define more than one variable. Multiple definitions can be done in one line, using semicolons to break up the commands, as with:\n\na = 1; b = 2; c=3\n\n3\n\n\nFor convenience, Julia allows an alternate means to define more than one variable at a time. The syntax is similar:\n\na, b, c = 1, 2, 3\nb\n\n2\n\n\nThis sets a=1, b=2, and c=3, as suggested. This construct relies on tuple destructuring. The expression on the right hand side forms a tuple of values. A tuple is a container for different types of values, and in this case the tuple has 3 values. When the same number of variables match on the left-hand side as those in the container on the right, the names are assigned one by one.\nThe value on the right hand side is evaluated, then the assignment occurs. The following exploits this to swap the values assigned to a and b:\n\na, b = 1, 2\na, b = b, a\n\n(2, 1)\n\n\n\nExample, finding the slope\nFind the slope of the line connecting the points \\((1,2)\\) and \\((4,6)\\). We begin by defining the values and then applying the slope formula:\n\nx0, y0 = 1, 2\nx1, y1 = 4, 6\nm = (y1 - y0) / (x1 - x0)\n\n1.3333333333333333\n\n\nOf course, this could be computed directly with (6-2) / (4-1), but by using familiar names for the values we can be certain we apply the formula properly."
},
{
"objectID": "precalc/variables.html#questions",
"href": "precalc/variables.html#questions",
"title": "2  Variables",
"section": "2.6 Questions",
"text": "2.6 Questions\n\nQuestion\nLet \\(a=10\\), \\(b=2.3\\), and \\(c=8\\). Find the value of \\((a-b)/(a-c)\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet x = 4. Compute \\(y=100 - 2x - x^2\\). What is the value:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat is the answer to this computation?\na = 3.2; b=2.3\na^b - b^a\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor longer computations, it can be convenient to do them in parts, as this makes it easier to check for mistakes.\nFor example, to compute\n\\[\n\\frac{p - q}{\\sqrt{p(1-p)}}\n\\]\nfor \\(p=0.25\\) and \\(q=0.2\\) we might do:\np, q = 0.25, 0.2\ntop = p - q\nbottom = sqrt(p*(1-p))\nans = top/bottom\nWhat is the result of the above?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUsing variables to record the top and the bottom of the expression, compute the following for \\(x=3\\):\n\\[\ny = \\frac{x^2 - 2x - 8}{x^2 - 9x - 20}.\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhich if these is not a valid variable name (identifier) in Julia:\n\n\n\n \n \n \n \n \n \n \n \n \n some_really_long_name_that_is_no_fun_to_type\n \n \n\n\n \n \n \n \n 5degreesbelowzero\n \n \n\n\n \n \n \n \n aMiXeDcAsEnAmE\n \n \n\n\n \n \n \n \n fahrenheit451\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhich of these symbols is one of Julias built-in math constants?\n\n\n\n \n \n \n \n \n \n \n \n \n pi\n \n \n\n\n \n \n \n \n oo\n \n \n\n\n \n \n \n \n E\n \n \n\n\n \n \n \n \n I\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat key sequence will produce this assignment\nδ = 1/10\n\n\n\n \n \n \n \n \n \n \n \n \n \\delta[tab] = 1/10\n \n \n\n\n \n \n \n \n $\\\\delta$ = 1/10\n \n \n\n\n \n \n \n \n delta[tab] = 1/10\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhich of these three statements will not be a valid way to assign three variables at 
once:\n\na=1, b=2, c=3\n\na,b,c = 1,2,3\n\na=1; b=2; c=3\n\nQuestion\nThe fact that assignment always returns the value of the right hand side and the fact that the = sign associates from right to left means that the following idiom:\nx = y = z = 3\nWill always:\n\nCreate \\(3\\) linked values that will stay synced when any value changes\n\nThrow an error\n\nAssign all three variables at once to a value of 3"
},
{
"objectID": "precalc/numbers_types.html",
"href": "precalc/numbers_types.html",
"title": "3  Number systems",
"section": "",
"text": "In mathematics, there are many different number systems in common use. For example by the end of pre-calculus, all of the following have been introduced:\nOn top of these, we have special subsets, such as the natural numbers \\(\\{1, 2, \\dots\\}\\) (sometimes including \\(0\\)), the even numbers, the odd numbers, the positive numbers, the non-negative numbers, etc.\nMathematically, these number systems are naturally nested within each other as integers are rational numbers which are real numbers, which can be viewed as part of the complex numbers.\nCalculators typically have just one type of number - floating point values. These model the real numbers. Julia, on other other hand, has a rich type system, and within that has many different number types. There are types that model each of the four main systems above, and within each type, specializations for how these values are stored.\nMost of the details will not be of interest to all, and will be described later.\nFor now, lets consider the number \\(1\\). It can be viewed as either an integer, rational, real, or complex number. To construct “\\(1\\)” in each type within Julia we have these different styles:\nThe basic number types in Julia are Int, Float64, Rational and Complex, though in fact there are many more, and the last two arent even concrete types. This distinction is important, as the type of number dictates how it will be stored and how precisely the stored value can be expected to be to the mathematical value it models.\nThough there are explicit constructors for these types, these notes avoid them unless necessary, as Julias parser can distinguish these types through an easy to understand syntax:\nSimilarly, each type is printed slightly differently.\nThe key distinction is between integers and floating points. 
While floating point values include integers, and so can be used exclusively on the calculator, the difference is that an integer is guaranteed to be an exact value, whereas a floating point value, while often an exact representation of a number, is also often just an approximate value. This can be an advantage - floating point values can model a much wider range of numbers.\nNow in nearly all cases the differences are not noticeable. Take for instance this simple calculation involving mixed types.\nThe sum of an integer, a floating point number and rational number returns a floating point number without a complaint.\nThis is because behind the scenes, Julia will often “promote” a type to match, so for example to compute 1 + 1.25 the integer 1 will be promoted to a floating point value and the two values are then added. Similarly, with 2.25 + 3//2, where the fraction is promoted to the floating point value 1.5 and addition is carried out.\nAs floating point numbers may be approximations, some values are not quite what they would be mathematically:\nThese values are very small numbers, but not exactly \\(0\\), as they are mathematically.\nThe only common issue is with powers. Julia tries to keep a predictable output from the input types (not their values). Here are the two main cases that arise where this can cause unexpected results:\nRather than give an error though, Julia gives seemingly arbitrary answers, as can be seen in this example on a \\(64\\)-bit machine:\n(They aren't arbitrary, rather integer arithmetic is implemented as modular arithmetic.)\nThis could be worked around, as it is with some programming languages, but it isn't, as it would slow down this basic computation. So, it is up to the user to be aware of cases where their integer values can grow too big. 
The suggestion is to use floating point numbers in this domain, as they have more room, at the cost of sometimes being approximate values.\nThis is because for real-valued inputs Julia expects to return a real-valued output. Of course, this is true in mathematics until the complex numbers are introduced. Similarly in Julia - to take square roots of negative numbers, start with complex numbers:\ninteger bases and negative integer exponents. For example 2^(-1). This is now special cased, though only for numeric literals. If z=-1, 2^z will throw a DomainError. Historically, the desire to keep a predictable type for the output (integer) led to defining this case as a domain error, but its usefulness led to special casing."
},
{
"objectID": "precalc/numbers_types.html#additional-details.",
"href": "precalc/numbers_types.html#additional-details.",
"title": "3  Number systems",
"section": "3.1 Additional details.",
"text": "3.1 Additional details.\nWhat follows is only needed for those seeking more background.\nJulia has abstract number types Integer, Real, and Number. All four types described above are of type Number, but Complex is not of type Real.\nHowever, a specific value is an instance of a concrete type. A concrete type will also include information about how the value is stored. For example, the integer 1 could be stored using \\(64\\) bits as a signed integers, or, should storage be a concern, as an \\(8\\) bits signed or even unsigned integer, etc.. If storage isnt an issue, but exactness at all scales is, then it can be stored in a manner that allows for the storage to grow using “big” numbers.\nThese distinctions can be seen in how Julia parses these three values:\n\n1234567890 will be a \\(64\\)-bit integer (on newer machines), Int64\n12345678901234567890 will be a \\(128\\) bit integer, Int128\n1234567890123456789012345678901234567890 will be a big integer, BigInt\n\nHaving abstract types allows programmers to write functions that will work over a wide range of input values that are similar, but have different implementation details.\n\n3.1.1 Integers\nIntegers are often used casually, as they come about from parsing. As with a calculator, floating point numbers could be used for integers, but in Julia - and other languages - it proves useful to have numbers known to have exact values. In Julia there are built-in number types for integers stored in \\(8\\), \\(16\\), \\(32\\), \\(64\\), and \\(128\\) bits and BigInts if the previous arent large enough. (\\(8\\) bits can hold \\(8\\) binary values representing \\(1\\) of \\(256=2^8\\) possibilities, whereas the larger \\(128\\) bit can hold one of \\(2^{128}\\) possibilities.) 
Smaller storage sizes can be used more efficiently, and this is leveraged at the system level, but it is not a necessary distinction with calculus, where the default size along with an occasional usage of BigInt suffice.\n\n\n3.1.2 Floating point numbers\nFloating point numbers are a computational model for the real numbers. For floating point numbers, \\(64\\) bits are used by default for both \\(32\\)- and \\(64\\)-bit systems, though other storage sizes can be requested. This gives a large - but still finite - range of real numbers that can be represented. However, there are infinitely many real numbers just between \\(0\\) and \\(1\\), so there is no chance that all can be represented exactly on the computer with a floating point value. Floating point then is necessarily an approximation for all but a subset of the real numbers. Floating point values can be viewed in normalized scientific notation as \\(a\\cdot 2^b\\) where \\(a\\) is the significand and \\(b\\) is the exponent. Save for special values, the significand \\(a\\) is normalized to satisfy \\(1 \\leq \\lvert a\\rvert < 2\\); the exponent can be taken to be an integer, possibly negative.\nAs per IEEE Standard 754, the Float64 type gives 52 bits to the precision (with an additional implied one), 11 bits to the exponent and the other bit is used to represent the sign. Positive, finite, floating point numbers have a range approximately between \\(10^{-308}\\) and \\(10^{308}\\), as 308 is about \\(\\log_{10}(2^{1023})\\). 
The numbers are not evenly spread out over this range, but, rather, are much more concentrated closer to \\(0\\).\n\n\n\n\n\n\nMore on floating point numbers\n\n\n\nYou can discover more about the range of floating point values provided by calling a few different functions.\n\ntypemax(0.0) gives the largest value for the type (Inf in this case).\nprevfloat(Inf) gives the largest finite one, in general prevfloat is the next smallest floating point value.\n\n\n\n\nnextfloat(-Inf), similarly, gives the smallest finite floating point value, and in general returns the next largest floating point value.\nnextfloat(0.0) gives the closest positive value to 0.\neps() gives the distance to the next floating point number bigger than 1.0. This is sometimes referred to as machine precision.\n\n\nScientific notation\nFloating point numbers may print in a familiar manner:\n\nx = 1.23\n\n1.23\n\n\nor may be represented in scientific notation:\n\n6.23 * 10.0^23\n\n6.23e23\n\n\nThe special coding aeb (or if the exponent is negative ae-b) is used to represent the number \\(a \\cdot 10^b\\) (\\(1 \\leq a < 10\\)). This notation can be used directly to specify a floating point value:\n\navagadro = 6.23e23\n\n6.23e23\n\n\nHere e is decidedly not the Euler number, but rather syntax to separate the exponent from the mantissa.\nThe first way of representing this number required using 10.0 and not 10 as the integer power will return an integer and even for 64-bit systems is only valid up to 10^18. Using scientific notation avoids having to concentrate on such limitations.\n\nExample\nFloating point values in scientific notation will always be normalized. This is easy for the computer to do, but tedious to do by hand. Here we see:\n\n4e30 * 3e40\n\n1.2000000000000001e71\n\n\n\n3e40 / 4e30\n\n7.5e9\n\n\nThe power in the first is \\(71\\), not \\(70 = 30+40\\), as the product of \\(3\\) and \\(4\\) is \\(12\\), or 1.2e1. 
(We also see the artifact of 1.2 not being exactly representable in floating point.)\n\n\nExample: 32-bit floating point\nIn some uses, such as using a GPU, \\(32\\)-bit floating point (single precision) is also common. These values may be specified with an f in place of the e in scientific notation:\n\n1.23f0\n\n1.23f0\n\n\nAs with the use of e, some exponent is needed after the f, even if it is 0.\n\n\n\nSpecial values: Inf, -Inf, NaN\nThe coding of floating point numbers also allows for the special values of Inf, -Inf to represent positive and negative infinity. As well, a special value NaN (“not a number”) is used to represent a value that arises when an operation is not closed (e.g., \\(0.0/0.0\\) yields NaN). (Technically NaN has several possible “values,” a point ignored here.) Except for negative bases, the floating point numbers with the addition of Inf and NaN are closed under the operations +, -, *, /, and ^. Here are some computations that produce NaN:\n\n0/0, Inf/Inf, Inf - Inf, 0 * Inf\n\n(NaN, NaN, NaN, NaN)\n\n\nWhereas, these produce an infinity\n\n1/0, Inf + Inf, 1 * Inf\n\n(Inf, Inf, Inf)\n\n\nFinally, these are mathematically undefined, but still yield a finite value with Julia:\n\n0^0, Inf^0\n\n(1, 1.0)\n\n\n\n\nFloating point numbers and real numbers\nFloating point numbers are an abstraction for the real numbers. For the most part this abstraction works in the background, though there are cases where one needs to have it in mind. Here are a few:\n\nFor real and rational numbers, between any two numbers \\(a < b\\), there is another real number in between. This is not so for floating point numbers which have a finite precision. (Julia has some functions for working with this distinction.)\nFloating point numbers are approximations for most values, even simple rational ones like \\(1/3\\). 
This leads to oddities such as this value not being \\(0\\):\n\n\nsqrt(2)*sqrt(2) - 2\n\n4.440892098500626e-16\n\n\nIt is no surprise that an irrational number, like \\(\\sqrt{2}\\), cant be represented exactly within floating point, but it is perhaps surprising that simple numbers can not be, so \\(1/3\\), \\(1/5\\), \\(\\dots\\) are approximated. Here is a surprising-at-first consequence:\n\n1/10 + 2/10 == 3/10\n\nfalse\n\n\nThat is adding 1/10 and 2/10 is not exactly 3/10, as expected mathematically. Such differences are usually very small and are generally attributed to rounding error. The user needs to be mindful when testing for equality, as is done above with the == operator.\n\nFloating point addition is not necessarily associative, that is the property \\(a + (b+c) = (a+b) + c\\) may not hold exactly. For example:\n\n\n1/10 + (2/10 + 3/10) == (1/10 + 2/10) + 3/10\n\nfalse\n\n\n\nFor real numbers subtraction of similar-sized numbers is not exceptional, for example \\(1 - \\cos(x)\\) is positive if \\(0 < x < \\pi/2\\), say. This will not be the case for floating point values. If \\(x\\) is close enough to \\(0\\), then \\(\\cos(x)\\) and \\(1\\) will be so close, that they will be represented by the same floating point value, 1.0, so the difference will be zero:\n\n\n1.0 - cos(1e-8)\n\n0.0\n\n\n\n\n\n3.1.3 Rational numbers\nRational numbers can be used when the exactness of the number is more important than the speed or wider range of values offered by floating point numbers. In Julia a rational number is comprised of a numerator and a denominator, each an integer of the same type, and reduced to lowest terms. The operations of addition, subtraction, multiplication, and division will keep their answers as rational numbers. 
As well, raising a rational number to a positive, integer value will produce a rational number.\nAs mentioned, these are constructed using double slashes:\n\n1//2, 2//1, 6//4\n\n(1//2, 2//1, 3//2)\n\n\nRational numbers are exact, so the following are identical to their mathematical counterparts:\n\n1//10 + 2//10 == 3//10\n\ntrue\n\n\nand associativity:\n\n(1//10 + 2//10) + 3//10 == 1//10 + (2//10 + 3//10)\n\ntrue\n\n\nHere we see that the type is preserved under the basic operations:\n\n(1//2 + 1//3 * 1//4 / 1//5) ^ 6\n\n1771561//2985984\n\n\nFor powers, a non-integer exponent is converted to floating point, so this operation is defined, though will always return a floating point value:\n\n(1//2)^(1//2) # the first parentheses are necessary as `^` will be evaluated before `//`.\n\n0.7071067811865476\n\n\n\nExample: different types of real numbers\nThis table shows what attributes are implemented for the different types.\n\n\n\n\n\n\nAttributesIntegerRationalFloatingPoint\n\nconstruction\n1\n1//1\n1.0\n\nexact\ntrue\ntrue\nnot usually\n\nwide range\nfalse\nfalse\ntrue\n\nhas infinity\nfalse\nfalse\ntrue\n\nhas -0\nfalse\nfalse\ntrue\n\nfast\ntrue\nfalse\ntrue\n\nclosed under\n+, -, *, ^ (non-negative exponent)\n+, -, *, / (non zero denominator),^ (integer power)\n+, -, *, / (possibly NaN, Inf),^ (non-negative base)\n\n\n\n\n\n\n\n\n\n\n3.1.4 Complex numbers\nComplex numbers in Julia are stored as two numbers, a real and imaginary part, each some type of Real number. The special constant im is used to represent \\(i=\\sqrt{-1}\\). This makes the construction of complex numbers fairly standard:\n\n1 + 2im, 3 + 4.0im\n\n(1 + 2im, 3.0 + 4.0im)\n\n\n(These two arent exactly the same, the 3 is promoted from an integer to a float to match the 4.0. Each of the components must be of the same type of number.)\nMathematically, complex numbers are needed so that certain equations can be satisfied. 
For example \\(x^2 = -2\\) has solutions \\(-\\sqrt{2}i\\) and \\(\\sqrt{2}i\\) over the complex numbers. Finding this in Julia requires some attention, as we have both sqrt(-2) and sqrt(-2.0) throwing a DomainError, as the sqrt function expects non-negative real arguments. However first creating a complex number does work:\n\nsqrt(-2 + 0im)\n\n0.0 + 1.4142135623730951im\n\n\nFor complex arguments, the sqrt function will return complex values (even if the answer is a real number).\nThis means, if you wanted to apply the quadratic formula for any real inputs, your computations might involve something like the following:\n\na,b,c = 1,2,3 ## x^2 + 2x + 3\ndiscr = b^2 - 4a*c\n(-b + sqrt(discr + 0im))/(2a), (-b - sqrt(discr + 0im))/(2a)\n\n(-1.0 + 1.4142135623730951im, -1.0 - 1.4142135623730951im)\n\n\nWhen learning calculus, the only common usage of complex numbers arises when solving polynomial equations for roots, or zeros, though they are very important for subsequent work using the concepts of calculus.\n\n\n\n\n\n\nNote\n\n\n\nThough complex numbers are stored as pairs of numbers, the imaginary unit, im, is of type Complex{Bool}, a type that can be promoted to more specific types when im is used with different number types."
},
{
"objectID": "precalc/numbers_types.html#type-stability",
"href": "precalc/numbers_types.html#type-stability",
"title": "3  Number systems",
"section": "3.2 Type stability",
"text": "3.2 Type stability\nOne design priority of Julia is that it should be fast. How can Julia do this? In a simple model, Julia is an interface between the user and the computers processor(s). Processors consume a set of instructions, the user issues a set of commands. Julia is in charge of the translation between the two. Ultimately Julia calls a compiler to create the instructions. A basic premise is the shorter the instructions, the faster they are to process. Shorter instructions can come about by being more explicit about what types of values the instructions concern. Explicitness means, there is no need to reason about what a value can be. When Julia can reason about the type of value involved without having to reason about the values themselves, it can work with the compiler to produce shorter lists of instructions.\nSo knowing the type of the output of a function based only on the type of the inputs can be a big advantage. In Julia this is known as type stability. In the standard Julia library, this is a primary design consideration.\n\nExample: closure\nTo motivate this a bit, we discuss how mathematics can be shaped by a desire to stick to simple ideas. A desirable algebraic property of a set of numbers and an operation is closure. That is, if one takes an operation like + and then uses it to add two numbers in a set, will that result also be in the set? If this is so for any pair of numbers, then the set is closed with respect to the operation addition.\nLets suppose we start with the natural numbers: \\(1,2, \\dots\\). Natural, in that we can easily represent small values in terms of fingers. This set is closed under addition - as a child learns when counting using their fingers. However, if we started with the odd natural numbers, this set would not be closed under addition - \\(3+3=6\\).\nThe natural numbers are not all the numbers we need, as once a desire for subtraction is included, we find the set isnt closed. 
There isnt a \\(0\\), needed as \\(n-n=0\\), and there arent negative numbers. The set of integers is needed for closure under addition and subtraction.\nThe integers are also closed under multiplication, which for integer values can be seen as just regrouping into longer additions.\nHowever, the integers are not closed under division - even if you put aside the pesky issue of dividing by \\(0\\). For that, the rational numbers must be introduced. So aside from division by \\(0\\), the rationals are closed under addition, subtraction, multiplication, and division. There is one more fundamental operation though, powers.\nPowers are defined for positive integers in a simple enough manner\n\\[\na^n=a \\cdot a \\cdot a \\cdots a \\text{ (n times); } a, n \\text{ are integers, } n \\text{ is positive}.\n\\]\nWe can define \\(a^0\\) to be \\(1\\), except for the special case of \\(0^0\\), which is left undefined mathematically (though it is also defined as 1 within Julia). We can extend the above to include negative values of \\(a\\), but what about negative values of \\(n\\)? We cant say the integers are closed under powers, as the definition consistent with the rule that \\(a^{(-n)} = 1/a^n\\) requires rational numbers to be defined.\nWell, in the above a could be a rational number; is a^n closed for rational numbers? No again. Though it is fine for \\(n\\) as an integer (save the odd case of \\(0\\)), simple definitions like \\(2^{1/2}\\) are not answered within the rationals. For this, we need to introduce the real numbers. It is mentioned that Aristotle hinted at the irrationality of the square root of \\(2\\). To define terms like \\(a^{1/n}\\) for integer values \\(a,n > 0\\), a reference to a solution of the equation \\(x^n-a=0\\) is used. In general, such equations require the irrational numbers for their solutions. 
Hence the need for the real numbers (well, algebraic numbers at least, though once the exponent is no longer a rational number, the full set of real numbers is needed).\nSo, save the pesky cases, the real numbers will be closed under addition, subtraction, multiplication, division, and powers - provided the base is non-negative.\nFinally, for that last case, the complex numbers are introduced to give an answer to \\(\\sqrt{-1}\\).\n\nHow does this apply with Julia?\nThe point is, if we restrict our set of inputs, we can get more precise values for the output of basic operations, but to get more general inputs we need to have bigger output sets.\nA similar thing happens in Julia. For addition, say, the addition of two integers of the same type will be an integer of that type. This speed consideration is not solely for type stability, but also to avoid checking for overflow.\nAs another example, the division of two integers will always be of one fixed type - floating point - as that is the only type that ensures the answer will always fit. (The explicit use of rationals notwithstanding.) So even if two integers are the input and their answer could be an integer, in Julia it will be a floating point number (cf. 2/1).\nHopefully this helps explain the subtle issues around powers: in Julia an integer raised to an integer should be an integer, for speed, though certain cases are special cased, like 2^(-1). However, since a real number raised to a real number always makes sense when the base is non-negative, as long as real numbers are used as outputs, the expressions 2.0^(-1) and 2^(-1.0) are computed and real numbers (floating points) are returned. For type stability, even though \\(2.0^1\\) could be an integer, a floating point answer is returned.\nAs for negative bases, Julia could always return complex numbers, but in addition to this being slower, it would be irksome to users. So users must opt in. 
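A small sketch of this opt-in behavior, wrapping the failing call in try/catch so both outcomes can be seen side by side:

```julia
# sqrt of a negative Float64 throws a DomainError; passing a complex
# argument opts in to a complex result instead.
caught = try
    sqrt(-1.0)
    false
catch err
    err isa DomainError
end

caught, sqrt(-1.0 + 0im)    # (true, 0.0 + 1.0im)
```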
Hence sqrt(-1.0) will be an error, but the more explicit - but mathematically equivalent - sqrt(-1.0 + 0im) will not be a domain error, but rather a complex value will be returned."
},
{
"objectID": "precalc/numbers_types.html#questions",
"href": "precalc/numbers_types.html#questions",
"title": "3  Number systems",
"section": "3.3 Questions",
"text": "3.3 Questions\n\nQuestion\nThe number created by pi/2 is?\n\n\n\n \n \n \n \n \n \n \n \n \n Integer\n \n \n\n\n \n \n \n \n Rational\n \n \n\n\n \n \n \n \n Floating point\n \n \n\n\n \n \n \n \n Complex\n \n \n\n\n \n \n \n \n None, an error occurs\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe number created by 2/2 is?\n\n\n\n \n \n \n \n \n \n \n \n \n Integer\n \n \n\n\n \n \n \n \n Rational\n \n \n\n\n \n \n \n \n Floating point\n \n \n\n\n \n \n \n \n Complex\n \n \n\n\n \n \n \n \n None, an error occurs\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe number created by 2//2 is?\n\n\n\n \n \n \n \n \n \n \n \n \n Integer\n \n \n\n\n \n \n \n \n Rational\n \n \n\n\n \n \n \n \n Floating point\n \n \n\n\n \n \n \n \n Complex\n \n \n\n\n \n \n \n \n None, an error occurs\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe number created by 1 + 1//2 + 1/3 is?\n\n\n\n \n \n \n \n \n \n \n \n \n Integer\n \n \n\n\n \n \n \n \n Rational\n \n \n\n\n \n \n \n \n Floating point\n \n \n\n\n \n \n \n \n Complex\n \n \n\n\n \n \n \n \n None, an error occurs\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe number created by 2^3 is?\n\n\n\n \n \n \n \n \n \n \n \n \n Integer\n \n \n\n\n \n \n \n \n Rational\n \n \n\n\n \n \n \n \n Floating point\n \n \n\n\n \n \n \n \n Complex\n \n \n\n\n \n \n \n \n None, an error occurs\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe number created by sqrt(im) is?\n\n\n\n \n \n \n \n \n \n \n \n \n Integer\n \n \n\n\n \n \n \n \n Rational\n \n \n\n\n \n \n \n \n Floating point\n \n \n\n\n \n \n \n \n Complex\n \n \n\n\n \n \n \n \n None, an error occurs\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe number created by 2^(-1) is?\n\n\n\n \n \n \n \n \n \n \n \n \n Integer\n \n \n\n\n \n \n \n \n Rational\n \n \n\n\n \n \n \n \n Floating point\n \n \n\n\n \n \n \n \n Complex\n \n \n\n\n \n \n \n \n None, an error occurs\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe 
“number” created by 1/0 is?\n\n\n\n \n \n \n \n \n \n \n \n \n Integer\n \n \n\n\n \n \n \n \n Rational\n \n \n\n\n \n \n \n \n Floating point\n \n \n\n\n \n \n \n \n Complex\n \n \n\n\n \n \n \n \n None, an error occurs\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIs (2 + 6) + 7 equal to 2 + (6 + 7)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIs (2/10 + 6/10) + 7/10 equal to 2/10 + (6/10 + 7/10)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe following should compute 2^(-1), which if entered directly will return 0.5. Does it?\na, b = 2, -1\na^b\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n(This shows the special casing that is done when powers use literal numbers.)"
},
{
"objectID": "precalc/logical_expressions.html",
"href": "precalc/logical_expressions.html",
"title": "4  Inequalities, Logical expressions",
"section": "",
"text": "In this section we use the following package:"
},
{
"objectID": "precalc/logical_expressions.html#boolean-values",
"href": "precalc/logical_expressions.html#boolean-values",
"title": "4  Inequalities, Logical expressions",
"section": "4.1 Boolean values",
"text": "4.1 Boolean values\nIn mathematics it is common to test if an expression is true or false. For example, is the point \\((1,2)\\) inside the disc \\(x^2 + y^2 \\leq 1\\)? We would check this by substituting \\(1\\) for \\(x\\) and \\(2\\) for \\(y\\), evaluating both sides of the inequality and then assessing if the relationship is true or false. In this case, we end up with a comparison of \\(5 \\leq 1\\), which we of course know is false.\nJulia provides numeric comparisons that allow this notation to be exactly mirrored:\n\nx, y = 1, 2\nx^2 + y^2 <= 1\n\nfalse\n\n\nThe response is false, as expected. Julia provides Boolean values true and false for such questions. The same process is followed as was described mathematically.\nThe set of numeric comparisons is nearly the same as the mathematical counterparts: <, <=, ==, >=, >. The syntax for less than or equal can also be represented with the Unicode ≤ (generated by \\le[tab]). Similarly, for greater than or equal, there is \\ge[tab].\n\n\n\n\n\n\nWarning\n\n\n\nThe use of == is necessary, as = is used for assignment and mutation.”)\n\n\nThe ! operator takes a boolean value and negates it. It uses prefix notation:\n\n!true\n\nfalse\n\n\nFor convenience, a != b can be used in place of !(a == b)."
},
{
"objectID": "precalc/logical_expressions.html#algebra-of-inequalities",
"href": "precalc/logical_expressions.html#algebra-of-inequalities",
"title": "4  Inequalities, Logical expressions",
"section": "4.2 Algebra of inequalities",
"text": "4.2 Algebra of inequalities\nTo illustrate, lets see that the algebra of expressions works as expected.\nFor example, if \\(a < b\\) then for any \\(c\\) it is also true that \\(a + c < b + c\\).\nWe cant “prove” this through examples, but we can investigate it by the choice of various values of \\(a\\), \\(b\\), and \\(c\\). For example:\n\na,b,c = 1,2,3\na < b, a + c < b + c\n\n(true, true)\n\n\nOr in reverse:\n\na,b,c = 3,2,1\na < b, a + c < b + c\n\n(false, false)\n\n\nTrying other choices will show that the two answers are either both false or both true.\n\n\n\n\n\n\nWarning\n\n\n\nWell, almost… When Inf or NaN are involved, this may not hold, for example 1 + Inf < 2 + Inf is actually false. As would be 1 + (typemax(1)-1) < 2 + (typemax(1)-1).\n\n\nSo adding or subtracting most any finite value from an inequality will preserve the inequality, just as it does for equations.\nWhat about addition and multiplication?\nConsider the case \\(a < b\\) and \\(c > 0\\). Then \\(ca < cb\\). Here we investigate using \\(3\\) random values (which will be positive):\n\na,b,c = rand(3) # 3 random numbers in [0,1)\na < b, c*a < c*b\n\n(true, true)\n\n\nWhenever these two commands are run, the two logical values should be identical, even though the specific values of a, b, and c will vary.\nThe restriction that \\(c > 0\\) is needed. For example, if \\(c = -1\\), then we have \\(a < b\\) if and only if \\(-a > -b\\). That is the inequality is “flipped.”\n\na,b = rand(2)\na < b, -a > -b\n\n(false, false)\n\n\nAgain, whenever this is run, the two logical values should be the same. The values \\(a\\) and \\(-a\\) are the same distance from \\(0\\), but on opposite sides. Hence if \\(0 < a < b\\), then \\(b\\) is farther from \\(0\\) than \\(a\\), so \\(-b\\) will be farther from \\(0\\) than \\(-a\\), which in this case says \\(-b < -a\\), as expected.\nFinally, we have the case of division. 
The relation of \\(x\\) and \\(1/x\\) (for \\(x > 0\\)) is that the farther \\(x\\) is from \\(0\\), the closer \\(1/x\\) is to \\(0\\). So large values of \\(x\\) make small values of \\(1/x\\). This leads to this fact for \\(a,b > 0\\): \\(a < b\\) if and only if \\(1/a > 1/b\\).\nWe can check with random values again:\n\na,b = rand(2)\na < b, 1/a > 1/b\n\n(true, true)\n\n\nIn summary we investigated numerically that the following hold:\n\na < b if and only if a + c < b + c for all finite a, b, and c.\na < b if and only if c*a < c*b for all finite a and b, and finite, positive c.\na < b if and only if -a > -b for all finite a and b.\na < b if and only if 1/a > 1/b for all finite, positive a and b.\n\n\n4.2.1 Examples\nWe now show some inequalities highlighted on this Wikipedia page.\nNumerically investigate the fact \\(e^x \\geq 1 + x\\) by showing it is true for three different values of \\(x\\). We pick \\(x=-1\\), \\(0\\), and \\(1\\):\n\nx = -1; exp(x) >= 1 + x\nx = 0; exp(x) >= 1 + x\nx = 1; exp(x) >= 1 + x\n\ntrue\n\n\nNow, lets investigate that for any distinct real numbers, \\(a\\) and \\(b\\), that\n\\[\n\\frac{e^b - e^a}{b - a} > e^{(a+b)/2}\n\\]\nFor this, we use rand(2) to generate two random numbers in \\([0,1)\\):\n\na, b = rand(2)\n(exp(b) - exp(a)) / (b-a) > exp((a+b)/2)\n\ntrue\n\n\nThis should evaluate to true for any random choice of a and b returned by rand(2).\nFinally, lets investigate the fact that the harmonic mean, \\(2/(1/a + 1/b)\\) is less than or equal to the geometric mean, \\(\\sqrt{ab}\\), which is less than or equal to the quadratic mean, \\(\\sqrt{a^2 + b^2}/\\sqrt{2}\\), using two randomly chosen values:\n\na, b = rand(2)\nh = 2 / (1/a + 1/b)\ng = (a * b) ^ (1 / 2)\nq = sqrt((a^2 + b^2) / 2)\nh <= g, g <= q\n\n(true, true)"
},
{
"objectID": "precalc/logical_expressions.html#chaining-combining-expressions-absolute-values",
"href": "precalc/logical_expressions.html#chaining-combining-expressions-absolute-values",
"title": "4  Inequalities, Logical expressions",
"section": "4.3 Chaining, combining expressions: absolute values",
"text": "4.3 Chaining, combining expressions: absolute values\nThe absolute value notation can be defined through cases:\n\\[\n\\lvert x\\rvert = \\begin{cases}\nx & x \\geq 0\\\\\n-x & \\text{otherwise}.\n\\end{cases}\n\\]\nThe interpretation of \\(\\lvert x\\rvert\\), as the distance on the number line of \\(x\\) from \\(0\\), means that many relationships are naturally expressed in terms of absolute values. For example, a simple shift: \\(\\lvert x -c\\rvert\\) is related to the distance \\(x\\) is from the number \\(c\\). As common as they are, the concept can still be confusing when inequalities are involved.\nFor example, the expression \\(\\lvert x - 5\\rvert < 7\\) has solutions which are all values of \\(x\\) within \\(7\\) units of \\(5\\). This would be the values \\(-2< x < 12\\). If this isnt immediately intuited, then formally \\(\\lvert x - 5\\rvert <7\\) is a compact representation of a chain of inequalities: \\(-7 < x-5 < 7\\). (Which is really two combined inequalities: \\(-7 < x-5\\) and \\(x-5 < 7\\).) We can “add” \\(5\\) to each side to get \\(-2 < x < 12\\), using the fact that adding by a finite number does not change the inequality sign.\nJulias precedence for logical expressions, allows such statements to mirror the mathematical notation:\n\nx = 18\nabs(x - 5) < 7\n\nfalse\n\n\nThis is to be expected, but we could also have written:\n\n-7 < x - 5 < 7\n\nfalse\n\n\nRead aloud this would be “minus \\(7\\) is less than \\(x\\) minus \\(5\\) and \\(x\\) minus \\(5\\) is less than \\(7\\)”.\nThe “and” equations can be combined as above with a natural notation. However, an equation like \\(\\lvert x - 5\\rvert > 7\\) would emphasize an or and be “\\(x\\) minus \\(5\\) less than minus \\(7\\) or \\(x\\) minus \\(5\\) greater than \\(7\\)”. 
Expressing this requires some new notation.\nThe boolean shortcut operators && and || implement “and” and “or.” (There are also bitwise boolean operators & and |, but we only describe the former.)\nThus we could write \\(-7 < x-5 < 7\\) as\n\n(-7 < x - 5) && (x - 5 < 7)\n\nfalse\n\n\nand could write \\(\\lvert x-5\\rvert > 7\\) as\n\n(x - 5 < -7) || (x - 5 > 7)\n\ntrue\n\n\n(The first expression is false for \\(x=18\\) and the second expression true, so the “or”ed result is true and the “and” result is false.)\n\nExample\nOne of DeMorgans Laws states that “not (A and B)” is the same as “(not A) or (not B)”. This is a kind of distributive law for “not”, but note how the “and” changes to “or”. We can verify this law systematically. For example, the following shows it true for \\(1\\) of the \\(4\\) possible cases for the pair A, B to take:\n\nA,B = true, false ## also true, true; false, true; and false, false\n!(A && B) == !A || !B\n\ntrue"
},
{
"objectID": "precalc/logical_expressions.html#precedence",
"href": "precalc/logical_expressions.html#precedence",
"title": "4  Inequalities, Logical expressions",
"section": "4.4 Precedence",
"text": "4.4 Precedence\nThe question of when parentheses are needed and when they are not is answered by the precedence rules implemented. Earlier, we wrote\n\n(x - 5 < -7) || (x - 5 > 7)\n\ntrue\n\n\nTo represent \\(\\lvert x-5\\rvert > 7\\). Were the parentheses necessary? Lets just check.\n\nx - 5 < -7 || x - 5 > 7\n\ntrue\n\n\nSo no, they were not in this case.\nAn operator (such as <, >, || above) has an associated associativity and precedence. The associativity is whether an expression like a - b - c is (a-b) - c or a - (b-c). The former being left associative, the latter right. Of issue here is precedence, as in with two or more different operations, which happens first, second, \\(\\dots\\).\nThe table in the manual on operator precedence and associativity shows that for these operations “control flow” (the && above) is lower than “comparisons” (the <, >), which are lower than “Addition” (the - above). So the expression without parentheses would be equivalent to:\n\n((x-5) < -7) && ((x-5) > 7)\n\nfalse\n\n\n(This is different than the precedence of the bitwise boolean operators, which have & with “Multiplication” and | with “Addition”, so x-5 < 7 | x - 5 > 7 would need parentheses.)\nA thorough understanding of the precedence rules can help eliminate unnecessary parentheses, but in most cases it is easier just to put them in."
},
{
"objectID": "precalc/logical_expressions.html#arithmetic-with",
"href": "precalc/logical_expressions.html#arithmetic-with",
"title": "4  Inequalities, Logical expressions",
"section": "4.5 Arithmetic with",
"text": "4.5 Arithmetic with\nFor convenience, basic arithmetic can be performed with Boolean values, false becomes \\(0\\) and true \\(1\\). For example, both these expressions make sense:\n\ntrue + true + false, false * 1000\n\n(2, 0)\n\n\nThe first example shows a common means used to count the number of true values in a collection of Boolean values - just add them.\nThis can be cleverly exploited. For example, the following expression returns x when it is positive and \\(0\\) otherwise:\n\n(x > 0) * x\n\n18\n\n\nThere is a built in function, max that can be used for this: max(0, x).\nThis expression returns x if it is between \\(-10\\) and \\(10\\) and otherwise \\(-10\\) or \\(10\\) depending on whether \\(x\\) is negative or positive.\n\n(x < -10)*(-10) + (x >= -10)*(x < 10) * x + (x>=10)*10\n\n10\n\n\nThe clamp(x, a, b) performs this task more generally, and is used as in clamp(x, -10, 10)."
},
{
"objectID": "precalc/logical_expressions.html#questions",
"href": "precalc/logical_expressions.html#questions",
"title": "4  Inequalities, Logical expressions",
"section": "4.6 Questions",
"text": "4.6 Questions\n\nQuestion\nIs e^pi or pi^e greater?\n\n\n\n \n \n \n \n \n \n \n \n \n e^pi is less than pi^e\n \n \n\n\n \n \n \n \n e^pi is greater than pi^e\n \n \n\n\n \n \n \n \n e^pi is equal to pi^e\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIs \\(\\sin(1000)\\) positive?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose you know \\(0 < a < b\\). What can you say about the relationship between \\(-1/a\\) and \\(-1/b\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(-1/a > -1/b\\)\n \n \n\n\n \n \n \n \n \\(-1/a \\geq -1/b\\)\n \n \n\n\n \n \n \n \n \\(-1/a < -1/b\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose you know \\(a < 0 < b\\), is it true that \\(1/a > 1/b\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes, it is always true.\n \n \n\n\n \n \n \n \n It is never true, as \\(1/a\\) is negative and \\(1/b\\) is positive\n \n \n\n\n \n \n \n \n It can sometimes be true, though not always.\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe airyai function is a special function named after a British Astronomer who realized the functions value in his studies of the rainbow. The SpecialFunctions package must be loaded to include this function, which is done with the accompanying package CalculusWithJulia.\n\nairyai(0)\n\n0.3550280538878172\n\n\nIt is known that this function is always positive for \\(x > 0\\), though not so for negative values of \\(x\\). 
Which of these indicates the first negative value : airyai(-1) <0, airyai(-2) < 0, …, or airyai(-5) < 0?\n\n\n\n \n \n \n \n \n \n \n \n \n airyai(-1) < 0\n \n \n\n\n \n \n \n \n airyai(-2) < 0\n \n \n\n\n \n \n \n \n airyai(-3) < 0\n \n \n\n\n \n \n \n \n airyai(-4) < 0\n \n \n\n\n \n \n \n \n airyai(-5) < 0\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nBy trying three different values of \\(x > 0\\) which of these could possibly be always true:\n\n\n\n \n \n \n \n \n \n \n \n \n x^x <= (1/e)^(1/e)\n \n \n\n\n \n \n \n \n x^x >= (1/e)^(1/e)\n \n \n\n\n \n \n \n \n x^x == (1/e)^(1/e)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nStudent logic says \\((x+y)^p = x^p + y^p\\). Of course, this isnt correct for all \\(p\\) and \\(x\\). By trying a few points, which is true when \\(x,y > 0\\) and \\(0 < p < 1\\):\n\n\n\n \n \n \n \n \n \n \n \n \n (x+y)^p > x^p + y^p\n \n \n\n\n \n \n \n \n (x+y)^p < x^p + y^p\n \n \n\n\n \n \n \n \n (x+y)^p == x^p + y^p\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nAccording to Wikipedia, one of the following inequalities is always true for \\(a, b > 0\\) (as proved by I. Ilani in JSTOR, AMM, Vol.97, No.1, 1990). 
Which one?\n\n\n\n \n \n \n \n \n \n \n \n \n a^b + b^a <= 1\n \n \n\n\n \n \n \n \n a^a + b^b <= a^b + b^a\n \n \n\n\n \n \n \n \n a^a + b^b >= a^b + b^a\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIs \\(3\\) in the set \\(\\lvert x - 2\\rvert < 1/2\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhich of the following is equivalent to \\(\\lvert x - a\\rvert > b\\):\n\n\n\n \n \n \n \n \n \n \n \n \n \\(-b < x - a < b\\)\n \n \n\n\n \n \n \n \n \\(-b < x-a \\text{ and } x - a < b\\)\n \n \n\n\n \n \n \n \n \\(x - a < -b \\text{ or } x - a > b\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIf \\(\\lvert x - \\pi\\rvert < 1/10\\) is \\(\\lvert \\sin(x) - \\sin(\\pi)\\rvert < 1/10\\)?\nGuess an answer based on a few runs of\nx = pi + 1/10 * (2rand()-1)\nabs(x - pi) < 1/10, abs(sin(x) - sin(pi)) < 1/10\n\n\n\n \n \n \n \n \n \n \n \n \n true\n \n \n\n\n \n \n \n \n false\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nDoes 12 satisfy \\(\\lvert x - 3\\rvert + \\lvert x-9\\rvert > 12\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhich of these will show DeMorgans law holds when both values are false:\n\n\n\n \n \n \n \n \n \n \n \n \n !(false && false) == (false || false)\n \n \n\n\n \n \n \n \n !(false && false) == (!false || !false)\n \n \n\n\n \n \n \n \n !(false && false) == (!false && !false)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor floating point numbers there are two special values Inf and NaN. 
For which of these is the answer always false:\n\nNaN < 3.0 and 3.0 <= NaN\n\nInf < 3.0 and 3.0 <= Inf\n\n\nQuestion\nThe IEEE 754 standard is about floating point numbers, for which there are the special values Inf, -Inf, NaN, and, surprisingly, -0.0 (as a floating point number and not -0, an integer). Here are 4 facts that seem reasonable:\n\nPositive zero is equal but not greater than negative zero.\nInf is equal to itself and greater than everything else except NaN.\n-Inf is equal to itself and less than everything else except NaN.\nNaN is not equal to, not less than, and not greater than anything, including itself.\n\nDo all four seem to be the case within Julia? Find your answer by trial and error.\n\nYes\nNo\n\n\nQuestion\nThe NaN value is meant to signal an error in computation. Julia has a value to indicate some data is missing or unavailable. This is missing. For missing values we have these computations:\n\ntrue && missing, true || missing\n\n(missing, true)\n\n\nWe see the value of true || missing is true. Why?\n\nSince the second value is \"missing\", only the first is used. So false || missing would also be false\n\nIn the manual we can read that \"In the expression a || b, the subexpression b is only evaluated if a evaluates to false.\" In this case a is true and so a is returned.\n\n\nThe value for true && missing is missing, not a boolean value. What happens?\n\nIn the manual we can read that \"In the expression a && b, the subexpression b is only evaluated if a evaluates to true.\" In this case, a is true so b is evaluated and returned. As b is just missing, that is the return value.\n\nSince the second value is \"missing\" all such answers would be missing."
},
{
"objectID": "precalc/vectors.html",
"href": "precalc/vectors.html",
"title": "5  Vectors",
"section": "",
"text": "One of the first models learned in physics are the equations governing the laws of motion with constant acceleration: \\(x(t) = x_0 + v_0 t + 1/2 \\cdot a t^2\\). This is a consequence of Newtons second law of motion applied to the constant acceleration case. A related formula for the velocity is \\(v(t) = v_0 + at\\). The following figure is produced using these formulas applied to both the vertical position and the horizontal position:\nFor the motion in the above figure, the objects \\(x\\) and \\(y\\) values change according to the same rule, but, as the acceleration is different in each direction, we get different formula, namely: \\(x(t) = x_0 + v_{0x} t\\) and \\(y(t) = y_0 + v_{0y}t - 1/2 \\cdot gt^2\\).\nIt is common to work with both formulas at once. Mathematically, when graphing, we naturally pair off two values using Cartesian coordinates (e.g., \\((x,y)\\)). Another means of combining related values is to use a vector. The notation for a vector varies, but to distinguish them from a point we will use \\(\\langle x,~ y\\rangle\\). With this notation, we can use it to represent the position, the velocity, and the acceleration at time \\(t\\) through:\n\\[\n\\begin{align}\n\\vec{x} &= \\langle x_0 + v_{0x}t,~ -(1/2) g t^2 + v_{0y}t + y_0 \\rangle,\\\\\n\\vec{v} &= \\langle v_{0x},~ -gt + v_{0y} \\rangle, \\text{ and }\\\\\n\\vec{a} &= \\langle 0,~ -g \\rangle.\n\\end{align}\n\\]\nDont spend time thinking about the formulas if they are unfamiliar. The point emphasized here is that we have used the notation \\(\\langle x,~ y \\rangle\\) to collect the two values into a single object, which we indicate through a label on the variable name. 
These are vectors, and we shall see they find use far beyond this application.\nInitially, our primary use of vectors will be as containers, but it is worthwhile to spend some time to discuss properties of vectors and their visualization.\nA line segment in the plane connects two points \\((x_0, y_0)\\) and \\((x_1, y_1)\\). The length of a line segment (its magnitude) is given by the distance formula \\(\\sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2}\\). A line segment can be given a direction by assigning an initial point and a terminal point. A directed line segment has both a direction and a magnitude. A vector is an abstraction where just these two properties \\(-\\) a direction and a magnitude \\(-\\) are intrinsic. While a directed line segment can be represented by a vector, a single vector describes all such line segments found by translation. That is, how the vector is located when visualized is for convenience; it is not a characteristic of the vector. In the figure above, all vectors are drawn with their tails at the position of the projectile over time.\nWe can visualize a (two-dimensional) vector as an arrow in space. This arrow has two components. We represent a vector mathematically as \\(\\langle x,~ y \\rangle\\). For example, the vector connecting the point \\((x_0, y_0)\\) to \\((x_1, y_1)\\) is \\(\\langle x_1 - x_0,~ y_1 - y_0 \\rangle\\).\nThe magnitude of a vector comes from the distance formula applied to a line segment, and is \\(\\| \\vec{v} \\| = \\sqrt{x^2 + y^2}\\).\nWe call the values \\(x\\) and \\(y\\) of the vector \\(\\vec{v} = \\langle x,~ y \\rangle\\) the components of \\(\\vec{v}\\).\nTwo operations on vectors are fundamental.\nThe concepts of scalar multiplication and addition allow the decomposition of vectors into standard vectors. The standard unit vectors in two dimensions are \\(e_x = \\langle 1,~ 0 \\rangle\\) and \\(e_y = \\langle 0,~ 1 \\rangle\\). 
Any two dimensional vector can be written uniquely as \\(a e_x + b e_y\\) for some pair of scalars \\(a\\) and \\(b\\) (or as, \\(\\langle a, b \\rangle\\)). This is true more generally where the two vectors are not the standard unit vectors - they can be any two non-parallel vectors.\nThe two operations of scalar multiplication and vector addition are defined in a component-by-component basis. We will see that there are many other circumstances where performing the same action on each component in a vector is desirable.\nWhen a vector is placed with its tail at the origin, it can be described in terms of the angle it makes with the \\(x\\) axis, \\(\\theta\\), and its length, \\(r\\). The following formulas apply:\n\\[\nr = \\sqrt{x^2 + y^2}, \\quad \\tan(\\theta) = y/x.\n\\]\nIf we are given \\(r\\) and \\(\\theta\\), then the vector is \\(v = \\langle r \\cdot \\cos(\\theta),~ r \\cdot \\sin(\\theta) \\rangle\\)."
},
{
"objectID": "precalc/vectors.html#vectors-in-julia",
"href": "precalc/vectors.html#vectors-in-julia",
"title": "5  Vectors",
"section": "5.1 Vectors in Julia",
"text": "5.1 Vectors in Julia\nA vector in Julia can be represented by its individual components, but it is more convenient to combine them into a collection using the [,] notation:\n\nx, y = 1, 2\nv = [x, y] # square brackets, not angles\n\n2-element Vector{Int64}:\n 1\n 2\n\n\nThe basic vector operations are implemented for vector objects. For example, the vector v has scalar multiplication defined for it:\n\n10 * v\n\n2-element Vector{Int64}:\n 10\n 20\n\n\nThe norm function returns the magnitude of the vector (by default):\nimport LinearAlgebra: norm\n\nnorm(v)\n\n2.23606797749979\n\n\nA unit vector is then found by scaling by the reciprocal of the magnitude:\n\nv / norm(v)\n\n2-element Vector{Float64}:\n 0.4472135954999579\n 0.8944271909999159\n\n\nIn addition, if w is another vector, we can add and subtract:\n\nw = [3, 2]\nv + w, v - 2w\n\n([4, 4], [-5, -2])\n\n\nWe see above that scalar multiplication, addition, and subtraction can be done without new notation. This is because the usual operators have methods defined for vectors.\nFinally, to find an angle \\(\\theta\\) from a vector \\(\\langle x,~ y\\rangle\\), we can employ the atan function using two arguments:\n\nnorm(v), atan(y, x) # v = [x, y]\n\n(2.23606797749979, 1.1071487177940904)"
},
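The polar description of a vector (magnitude \(r\) and angle \(\theta\)) can be round-tripped in code. This is an illustrative sketch, not from the notes; the names `v`, `r`, `θ`, and `w` are ours:

```julia
import LinearAlgebra: norm

v = [1, 2]
r, θ = norm(v), atan(v[2], v[1])  # magnitude and angle from the components
w = [r * cos(θ), r * sin(θ)]      # rebuild the vector from its polar form
isapprox(v, w)                    # true, up to floating point roundoff
```

Using `atan` with two arguments keeps the angle in the correct quadrant even when the first component is negative.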
{
"objectID": "precalc/vectors.html#higher-dimensional-vectors",
"href": "precalc/vectors.html#higher-dimensional-vectors",
"title": "5  Vectors",
"section": "5.2 Higher dimensional vectors",
"text": "5.2 Higher dimensional vectors\nMathematically, vectors can be generalized to more than \\(2\\) dimensions. For example, \\(3\\)-dimensional vectors are common when modeling events happening in space, and \\(4\\)-dimensional vectors are common when modeling space and time.\nIn Julia there are many uses for vectors outside of physics applications. A vector in Julia is just a one-dimensional collection of similarly typed values and a special case of an array. Such objects find widespread usage. For example:\n\nIn plotting graphs with Julia, vectors are used to hold the \\(x\\) and \\(y\\) coordinates of a collection of points to plot and connect with straight lines. There can be hundreds of such points in a plot.\nVectors are a natural container to hold the roots of a polynomial or zeros of a function.\nVectors may be used to record the state of an iterative process.\nVectors are naturally used to represent a data set, such as arises when collecting survey data.\n\nCreating higher-dimensional vectors is similar to creating a two-dimensional vector; we just include more components:\n\nfibs = [1, 1, 2, 3, 5, 8, 13]\n\n7-element Vector{Int64}:\n 1\n 1\n 2\n 3\n 5\n 8\n 13\n\n\nLater we will discuss different ways to modify the values of a vector to create new ones, similar to how scalar multiplication does.\nAs mentioned, vectors in Julia are comprised of elements of a similar type, but the type is not limited to numeric values. For example, a vector of strings might be useful for text processing, a vector of Boolean values can naturally arise, and some applications are even naturally represented in terms of vectors of vectors (such as happens when plotting a collection of points). Look at the output of these two vectors:\n\n[\"one\", \"two\", \"three\"] # Array{T, 1} is shorthand for Vector{T}. 
Here T - the type - is String\n\n3-element Vector{String}:\n \"one\"\n \"two\"\n \"three\"\n\n\n\n[true, false, true] # vector of Bool values\n\n3-element Vector{Bool}:\n 1\n 0\n 1\n\n\nFinally, we mention that if Julia has values of different types it will promote them to a common type if possible. Here we combine three types of numbers, and see that each is promoted to Float64:\n\n[1, 2.0, 3//1]\n\n3-element Vector{Float64}:\n 1.0\n 2.0\n 3.0\n\n\nWhereas, in this example where there is no common type to promote the values to, a catch-all type of Any is used to hold the components.\n\n[\"one\", 2, 3.0, 4//1]\n\n4-element Vector{Any}:\n \"one\"\n 2\n 3.0\n 4//1"
},
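A small sketch (ours, not from the notes) showing the promotion rules just described:

```julia
# Mixed numeric types promote to a common numeric type...
typeof([1, 2.0, 3//1])    # Vector{Float64}
# ...but with no common type, the catch-all `Any` is used.
typeof(["one", 2, 3.0])   # Vector{Any}
```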
{
"objectID": "precalc/vectors.html#indexing",
"href": "precalc/vectors.html#indexing",
"title": "5  Vectors",
"section": "5.3 Indexing",
"text": "5.3 Indexing\nGetting the components out of a vector can be done in a manner similar to multiple assignment:\n\nvs = [1, 2]\nv₁, v₂ = vs\n\n2-element Vector{Int64}:\n 1\n 2\n\n\nWhen the same number of variable names are on the left hand side of the assignment as in the container on the right, each is assigned in order.\nThough this is convenient for small vectors, it is far from being so if the vector has a large number of components. However, the vector is stored in order with a first, second, third, \\(\\dots\\) component. Julia allows these values to be referred to by index. This too uses the [] notation, though differently. Here is how we get the second component of vs:\n\nvs[2]\n\n2\n\n\nThe last value of a vector is usually denoted by \\(v_n\\). In Julia, the length function will return \\(n\\), the number of items in the container. So v[length(v)] will refer to the last component. However, the special keyword end will do so as well, when put into the context of indexing. So v[end] is more idiomatic. (Similarly, there is a begin keyword that is useful when the vector is not \\(1\\)-based, as is typical but not mandatory.)\n\n\n\n\n\n\nMore on indexing\n\n\n\nThere is much more to indexing than just indexing by a single integer value. For example, the following can be used for indexing:\n\na scalar integer (as seen)\n\n\n\n\na range\na vector of integers\na boolean vector\n\nSome add-on packages extend this further.\n\n5.3.1 Assignment and indexing\nIndexing notation can also be used with assignment, meaning it can appear on the left hand side of an equals sign. The following expression replaces the second component with a new value:\n\nvs[2] = 10\n\n10\n\n\nThe value of the right hand side is returned, not the value for vs. 
We can check that vs is then \\(\\langle 1,~ 10 \\rangle\\) by showing it:\n\nvs = [1,2]\nvs[2] = 10\nvs\n\n2-element Vector{Int64}:\n 1\n 10\n\n\nThe assignment vs[2] is different from the initial assignment vs=[1,2] in that vs[2]=10 modifies the container that vs points to, whereas vs=[1,2] replaces the binding for vs. The indexed assignment is then more memory efficient when vectors are large. This point is also of interest when passing vectors to functions, as a function may modify components of the vector passed to it, though it cannot replace the container itself."
},
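The note above lists other valid ways to index (a range, a vector of integers, a boolean vector). A brief sketch of each, with illustrative values:

```julia
v = [10, 20, 30, 40]

v[2:3]                         # range index: [20, 30]
v[[1, 4]]                      # vector-of-integers index: [10, 40]
v[[true, false, false, true]]  # boolean index keeps positions 1 and 4
v[end]                         # the last component, 40
```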
{
"objectID": "precalc/vectors.html#some-useful-functions-for-working-with-vectors.",
"href": "precalc/vectors.html#some-useful-functions-for-working-with-vectors.",
"title": "5  Vectors",
"section": "5.4 Some useful functions for working with vectors.",
"text": "5.4 Some useful functions for working with vectors.\nAs mentioned, the length function returns the number of components in a vector. It is one of several useful functions for vectors.\nThe sum and prod functions will add and multiply the elements in a vector:\n\nv1 = [1,1,2,3,5,8]\nsum(v1), prod(v1)\n\n(20, 240)\n\n\nThe unique function will throw out any duplicates:\n\nunique(v1) # drop a `1`\n\n5-element Vector{Int64}:\n 1\n 2\n 3\n 5\n 8\n\n\nThe functions maximum and minimum will return the largest and smallest values of an appropriate vector.\n\nmaximum(v1)\n\n8\n\n\n(These should not be confused with max and min which give the largest or smallest value over all their arguments.)\nThe extrema function returns both the smallest and largest value of a collection:\n\nextrema(v1)\n\n(1, 8)\n\n\nConsider now\n\n𝒗 = [1,4,2,3]\n\n4-element Vector{Int64}:\n 1\n 4\n 2\n 3\n\n\nThe sort function will rearrange the values in 𝒗:\n\nsort(𝒗)\n\n4-element Vector{Int64}:\n 1\n 2\n 3\n 4\n\n\nThe keyword argument rev=true can be given to get values in decreasing order:\n\nsort(𝒗, rev=true)\n\n4-element Vector{Int64}:\n 4\n 3\n 2\n 1\n\n\nFor adding a new element to a vector, the push! method can be used, as in\n\npush!(𝒗, 5)\n\n5-element Vector{Int64}:\n 1\n 4\n 2\n 3\n 5\n\n\nTo append more than one value, the append! function can be used:\n\nappend!(v1, [6,8,7])\n\n9-element Vector{Int64}:\n 1\n 1\n 2\n 3\n 5\n 8\n 6\n 8\n 7\n\n\nThese two functions modify or mutate the values stored within the vector 𝒗 that is passed as an argument. In the push! example above, the value 5 is added to the vector of \\(4\\) elements. In Julia, a convention is to name mutating functions with a trailing exclamation mark. (Again, these do not mutate the binding of 𝒗 to the container, but do mutate the contents of the container.) 
Some functions come in both mutating and non-mutating versions; an example is sort and sort!.\nIf only a mutating function is available, like push!, and mutation is not desired, a copy of the vector can be made first. It is not enough to copy by assignment, as with w = 𝒗, since both w and 𝒗 will then be bound to the same container. Rather, call copy to make a new container with copied contents, as in w = copy(𝒗).\nCreating new vectors of a given size is common in programming, though not much use will be made of it here. There are many different functions to do so: ones to make a vector of ones, zeros to make a vector of zeros, trues and falses to make Boolean vectors of a given size, and similar to make a similar-sized vector (with no particular values assigned)."
},
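The distinction between rebinding and copying described above can be seen directly. A sketch, with made-up values:

```julia
v = [1, 2, 3]
w = v          # not a copy: w and v name the same container
w[1] = 99
v[1] == 99     # true; the mutation shows through v

u = copy(v)    # an independent container with the same contents
u[1] = 0
v[1] == 99     # still true; v is unaffected
```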
{
"objectID": "precalc/vectors.html#applying-functions-element-by-element-to-values-in-a-vector",
"href": "precalc/vectors.html#applying-functions-element-by-element-to-values-in-a-vector",
"title": "5  Vectors",
"section": "5.5 Applying functions element by element to values in a vector",
"text": "5.5 Applying functions element by element to values in a vector\nFunctions such as sum or length are known as reductions as they reduce the “dimensionality” of the data: a vector is in some sense \\(1\\)-dimensional, the sum or length are \\(0\\)-dimensional numbers. Applying a reduction is straightforward: it is just a regular function call.\n\nv = [1, 2, 3, 4]\nsum(v), length(v)\n\n(10, 4)\n\n\nOther desired operations with vectors act differently. Rather than reduce a collection of values using some formula, the goal is to apply some formula to each of the values, returning a modified vector. A simple example might be to square each element, or subtract the average value from each element. An example comes from statistics. When computing a variance, we start with data \\(x_1, x_2, \\dots, x_n\\) and along the way form the values \\((x_1-\\bar{x})^2, (x_2-\\bar{x})^2, \\dots, (x_n-\\bar{x})^2\\).\nSuch things can be done in many different ways. Here we describe two, but will primarily utilize the first.\n\n5.5.1 Broadcasting a function call\nIf we have a vector, xs, and a function, f, to apply to each value, there is a simple means to achieve this task. Adding a “dot” between the function name and the parentheses that enclose the arguments instructs Julia to “broadcast” the function call. The details allow for more flexibility, but, for this purpose, broadcasting will take each value in xs and apply f to it, returning a vector of the same size as xs. 
When more than one argument is involved, broadcasting will try to fill out different sized objects.\nFor example, the following will find, using sqrt, the square root of each value in a vector:\n\nxs = [1, 1, 3, 4, 7]\nsqrt.(xs)\n\n5-element Vector{Float64}:\n 1.0\n 1.0\n 1.7320508075688772\n 2.0\n 2.6457513110645907\n\n\nThis would find the sine of each number in xs:\n\nsin.(xs)\n\n5-element Vector{Float64}:\n 0.8414709848078965\n 0.8414709848078965\n 0.1411200080598672\n -0.7568024953079282\n 0.6569865987187891\n\n\nFor each function, the .( (and not () after the name is the surface syntax for broadcasting.\nThe ^ operator is an infix operator. Infix operators can be broadcast, as well, by using the form . prior to the operator, as in:\n\nxs .^ 2\n\n5-element Vector{Int64}:\n 1\n 1\n 9\n 16\n 49\n\n\nHere is an example involving the logarithm of a set of numbers. In astronomy, a logarithm with base \\(100^{1/5}\\) is used for star brightness. We can use broadcasting to find this value for several values at once through:\n\nys = [1/5000, 1/500, 1/50, 1/5, 5, 50]\nbase = (100)^(1/5)\nlog.(base, ys)\n\n6-element Vector{Float64}:\n -9.247425010840049\n -6.747425010840047\n -4.247425010840047\n -1.747425010840047\n 1.747425010840047\n 4.247425010840047\n\n\nBroadcasting with multiple arguments allows for mixing of vectors and scalar values, as above, making it convenient when parameters are used.\nAs a final example, the task from statistics of centering and then squaring can be done with broadcasting. We go a bit further, showing how to compute the sample variance of a data set. This has the formula\n\\[\n\\frac{1}{n-1}\\cdot ((x_1-\\bar{x})^2 + \\cdots + (x_n - \\bar{x})^2).\n\\]\nThis can be computed, with broadcasting, through:\n\nimport Statistics: mean\nxs = [1, 1, 2, 3, 5, 8, 13]\nn = length(xs)\n(1/(n-1)) * sum(abs2.(xs .- mean(xs)))\n\n19.57142857142857\n\n\nThis shows many of the manipulations that can be made with vectors. 
Rather than write .^2, we follow the definition of var and choose the possibly more performant abs2 function which, in general, efficiently finds \\(|x|^2\\) for various number types. The .- uses broadcasting to subtract a scalar (mean(xs)) from a vector (xs). Without the ., this would error.\n\n\n\n\n\n\nNote\n\n\n\nThe map function is very much related to broadcasting and similarly named functions are found in many different programming languages. (The “dot” broadcast is mostly limited to Julia and mirrors a similar usage of a dot in MATLAB.) For those familiar with other programming languages, using map may seem more natural. Its syntax is map(f, xs).\n\n\n\n\n5.5.2 Comprehensions\nIn mathematics, set notation is often used to describe elements in a set.\nFor example, the first \\(5\\) cubed numbers can be described by:\n\\[\n\\{x^3: x \\text{ in } 1, 2,\\dots, 5\\}\n\\]\nComprehension notation is similar. The above could be created in Julia with:\n\n𝒙s = [1,2,3,4,5]\n[x^3 for x in 𝒙s]\n\n5-element Vector{Int64}:\n 1\n 8\n 27\n 64\n 125\n\n\nSomething similar can be done more succinctly:\n\n𝒙s .^ 3\n\n5-element Vector{Int64}:\n 1\n 8\n 27\n 64\n 125\n\n\nHowever, comprehensions have value when more complicated expressions are desired, as they work with an expression of 𝒙s, and not a pre-defined or user-defined function.\nAnother typical example of set notation might include a condition, such as the numbers divisible by \\(7\\) between \\(1\\) and \\(100\\). Set notation might be:\n\\[\n\\{x: \\text{rem}(x, 7) = 0 \\text{ for } x \\text{ in } 1, 2, \\dots, 100\\}.\n\\]\nThis would be read: “the set of \\(x\\) such that the remainder on division by \\(7\\) is \\(0\\) for all x in \\(1, 2, \\dots, 100\\).”\nIn Julia, a comprehension can include an if clause to mirror, somewhat, the math notation. 
For example, the above would become (using 1:100 as a means to create the numbers \\(1,2,\\dots, 100\\), as will be described in an upcoming section):\n\n[x for x in 1:100 if rem(x,7) == 0]\n\n14-element Vector{Int64}:\n 7\n 14\n 21\n 28\n 35\n 42\n 49\n 56\n 63\n 70\n 77\n 84\n 91\n 98\n\n\nComprehensions can be a convenient means to describe a collection of numbers, especially when no function is defined, but the simplicity of the broadcast notation (just adding a judicious “.”) leads to its more common use in these notes.\n\nExample: creating a “T” table for creating a graph\nThe process of plotting a function is usually first taught by generating a “T” table: values of \\(x\\) and corresponding values of \\(y\\). These pairs are then plotted on a Cartesian grid and the points are connected with lines to form the graph. Generating a “T” table in Julia is easy: create the \\(x\\) values, then create the \\(y\\) values for each \\(x\\).\nTo be concrete, lets generate \\(7\\) points to plot \\(f(x) = x^2\\) over \\([-1,1]\\).\nThe first task is to create the data. We will soon see more convenient ways to generate patterned data, but for now, we do this by hand:\n\na,b, n = -1, 1, 7\nd = (b-a) // (n-1)\n𝐱s = [a, a+d, a+2d, a+3d, a+4d, a+5d, a+6d] # 7 points\n\n7-element Vector{Rational{Int64}}:\n -1//1\n -2//3\n -1//3\n 0//1\n 1//3\n 2//3\n 1//1\n\n\nTo get the corresponding \\(y\\) values, we can use a comprehension (or define a function and broadcast):\n\n𝐲s = [x^2 for x in 𝐱s]\n\n7-element Vector{Rational{Int64}}:\n 1//1\n 4//9\n 1//9\n 0//1\n 1//9\n 4//9\n 1//1\n\n\nVectors can be compared together by combining them into a separate container, as follows:\n\n[𝐱s 𝐲s]\n\n7×2 Matrix{Rational{Int64}}:\n -1//1 1//1\n -2//3 4//9\n -1//3 1//9\n 0//1 0//1\n 1//3 1//9\n 2//3 4//9\n 1//1 1//1\n\n\n(If there is a space between objects they are horizontally combined. In our construction of vectors using [] we used a comma for vertical combination. 
More generally we should use a ; for vertical concatenation.)\nIn the sequel, we will typically use broadcasting for this task using two steps: one to define a function the second to broadcast it.\n\n\n\n\n\n\nNote\n\n\n\nThe style generally employed here is to use plural variable names for a collection of values, such as the vector of \\(y\\) values and singular names when a single value is being referred to, leading to expressions like “x in xs”."
},
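The sample-variance computation shown with broadcasting can equally be written with a comprehension or generator. This sketch (ours, not from the notes) mirrors the formula in the text:

```julia
import Statistics: mean

xs = [1, 1, 2, 3, 5, 8, 13]
n, x̄ = length(xs), mean(xs)
# sum of squared deviations via a generator, divided by n - 1
s² = sum((x - x̄)^2 for x in xs) / (n - 1)   # same value as the broadcast version
```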
{
"objectID": "precalc/vectors.html#other-container-types",
"href": "precalc/vectors.html#other-container-types",
"title": "5  Vectors",
"section": "5.6 Other container types",
"text": "5.6 Other container types\nVectors in Julia are a container, one of many different types. Another useful type for programming purposes is the tuple. If a vector is formed by placing comma-separated values within a [] pair (e.g., [1,2,3]), a tuple is formed by placing comma-separated values within a () pair. A tuple of length \\(1\\) uses a convention of a trailing comma to distinguish it from a parenthesized expression (e.g. (1,) is a tuple, (1) is just the value 1).\nTuples are used in programming, as they dont typically require allocated memory to be used, so they can be faster. Internal usages are for function arguments and function return types. Unlike vectors, tuples can be heterogeneous collections. (When commas are used to combine more than one output into a cell, a tuple is being used.) (Also, a big technical distinction is that tuples are also different from vectors and other containers in that tuple types are covariant in their parameters, not invariant.)\nUnlike vectors, the components of a tuple can have names, which can be used for referencing a value, similar to indexing but possibly more convenient. Named tuples are similar to dictionaries, which are used to associate a key (like a name) with a value.\nFor example, here a named tuple is constructed, and then its elements referenced:\n\nnt = (one=1, two=\"two\", three=:three) # heterogeneous values (Int, String, Symbol)\nnt.one, nt[2], nt[end] # named tuples have name or index access\n\n(1, \"two\", :three)"
},
{
"objectID": "precalc/vectors.html#questions",
"href": "precalc/vectors.html#questions",
"title": "5  Vectors",
"section": "5.7 Questions",
"text": "5.7 Questions\n\nQuestion\nWhich command will create the vector \\(\\vec{v} = \\langle 4,~ 3 \\rangle\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n v = [4,3]\n \n \n\n\n \n \n \n \n v = '4, 3'\n \n \n\n\n \n \n \n \n v = {4, 3}\n \n \n\n\n \n \n \n \n v = (4,3)\n \n \n\n\n \n \n \n \n v = <4,3>\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhich command will create the vector with components “4,3,2,1”?\n\n\n\n \n \n \n \n \n \n \n \n \n v = (4,3,2,1)\n \n \n\n\n \n \n \n \n v = <4,3,2,1>\n \n \n\n\n \n \n \n \n v = '4, 3, 2, 1'\n \n \n\n\n \n \n \n \n v = [4,3,2,1]\n \n \n\n\n \n \n \n \n v = {4,3,2,1}\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat is the magnitude of the vector \\(\\vec{v} = \\langle 10,~ 15 \\rangle\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhich of the following is the unit vector in the direction of \\(\\vec{v} = \\langle 3,~ 4 \\rangle\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n [0.6, 0.8]\n \n \n\n\n \n \n \n \n [3, 4]\n \n \n\n\n \n \n \n \n [1, 1]\n \n \n\n\n \n \n \n \n [1.0, 1.33333]\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat vector is in the same direction as \\(\\vec{v} = \\langle 3,~ 4 \\rangle\\) but is 10 times as long?\n\n\n\n \n \n \n \n \n \n \n \n \n [3, 4]\n \n \n\n\n \n \n \n \n [9.48683, 12.6491 ]\n \n \n\n\n \n \n \n \n [30, 40]\n \n \n\n\n \n \n \n \n [10, 10]\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIf \\(\\vec{v} = \\langle 3,~ 4 \\rangle\\) and \\(\\vec{w} = \\langle 1,~ 2 \\rangle\\) find \\(2\\vec{v} + 5 \\vec{w}\\).\n\n\n\n \n \n \n \n \n \n \n \n \n [11, 18]\n \n \n\n\n \n \n \n \n [6, 8]\n \n \n\n\n \n \n \n \n [5, 10]\n \n \n\n\n \n \n \n \n [4, 6]\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet v be defined by:\nv = [1, 1, 2, 3, 5, 8, 13, 21]\nWhat is the length of v?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the sum of v?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the prod of 
v?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFrom transum.org.\n\n\n\n\n\nThe figure shows \\(5\\) vectors.\nExpress vector c in terms of a and b:\n\n\n\n \n \n \n \n \n \n \n \n \n 3b\n \n \n\n\n \n \n \n \n b-a\n \n \n\n\n \n \n \n \n a + b\n \n \n\n\n \n \n \n \n 3a\n \n \n\n\n \n \n \n \n a - b\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nExpress vector d in terms of a and b:\n\n\n\n \n \n \n \n \n \n \n \n \n 3a\n \n \n\n\n \n \n \n \n 3b\n \n \n\n\n \n \n \n \n b-a\n \n \n\n\n \n \n \n \n a - b\n \n \n\n\n \n \n \n \n a + b\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nExpress vector e in terms of a and b:\n\n\n\n \n \n \n \n \n \n \n \n \n 3b\n \n \n\n\n \n \n \n \n a + b\n \n \n\n\n \n \n \n \n a - b\n \n \n\n\n \n \n \n \n 3a\n \n \n\n\n \n \n \n \n b-a\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIf xs=[1, 2, 3, 4] and f(x) = x^2 which of these will not produce the vector [1, 4, 9, 16]?\n\n\n\n \n \n \n \n \n \n \n \n \n f.(xs)\n \n \n\n\n \n \n \n \n map(f, xs)\n \n \n\n\n \n \n \n \n [f(x) for x in xs]\n \n \n\n\n \n \n \n \n All three of them work\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x) = \\sin(x)\\) and \\(g(x) = \\cos(x)\\). In the interval \\([0, 2\\pi]\\) the zeros of \\(g(x)\\) are given by\n\nzs = [pi/2, 3pi/2]\n\n2-element Vector{Float64}:\n 1.5707963267948966\n 4.71238898038469\n\n\nWhat construct will give the function values of \\(f\\) at the zeros of \\(g\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n sin(zs)\n \n \n\n\n \n \n \n \n sin.(zs)\n \n \n\n\n \n \n \n \n sin(.zs)\n \n \n\n\n \n \n \n \n .sin(zs)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIf zs = [1,4,9,16] which of these commands will return [1.0, 2.0, 3.0, 4.0]?\n\n\n\n \n \n \n \n \n \n \n \n \n sqrt(zs)\n \n \n\n\n \n \n \n \n sqrt.(zs)\n \n \n\n\n \n \n \n \n zs^(1/2)\n \n \n\n\n \n \n \n \n zs^(1./2)"
},
{
"objectID": "precalc/ranges.html",
"href": "precalc/ranges.html",
"title": "6  Ranges and Sets",
"section": "",
"text": "Sequences of numbers are prevalent in math. A simple one is just counting by ones:\n\\[\n1, 2, 3, 4, 5, 6, 7, 8, 9, 10, \\dots\n\\]\nOr counting by sevens:\n\\[\n7, 14, 21, 28, 35, 42, 49, \\dots\n\\]\nMore challenging for humans is counting backwards by 7:\n\\[\n100, 93, 86, 79, \\dots\n\\]\nThese are examples of arithmetic sequences. The form of the first \\(n+1\\) terms in such a sequence is:\n\\[\na_0, a_0 + h, a_0 + 2h, a_0 + 3h, \\dots, a_0 + nh\n\\]\nThe formula for the \\(n\\)th term, \\(a_n\\), can be written in terms of \\(a_0\\), or any other \\(a_m\\) with \\(0 \\leq m \\leq n\\), as \\(a_n = a_m + (n-m)\\cdot h\\).\nA typical question might be: The first term of an arithmetic sequence is equal to \\(200\\) and the common difference is equal to \\(-10\\). Find the value of \\(a_{20}\\). We could find this using \\(a_n = a_0 + n\\cdot h\\):\n\na0, h, n = 200, -10, 20\na0 + n * h\n\n0\n\n\nMore complicated questions involve an unknown first value, as with: an arithmetic sequence has a common difference equal to \\(10\\) and its \\(6\\)th term is equal to \\(52\\). Find its \\(15\\)th term, \\(a_{15}\\). Here we have to answer: \\(a_0 + 15 \\cdot 10\\). Either we could find \\(a_0\\) (using \\(52 = a_0 + 6\\cdot(10)\\)) or use the above formula\n\na6, h, m, n = 52, 10, 6, 15\na15 = a6 + (n-m)*h\n\n142\n\n\n\n\nRather than express sequences by the \\(a_0\\), \\(h\\), and \\(n\\), Julia uses the starting point (a), the difference (h), and a suggested stopping value (b). That is, we need three values to specify these ranges of numbers: a start, a step, and an end. Julia gives a convenient syntax for this: a:h:b. When the difference is just \\(1\\), all numbers between the start and end are specified by a:b, as in\n\n1:10\n\n1:10\n\n\nBut wait, nothing different printed? This is because 1:10 is efficiently stored. 
Basically, a recipe to generate the next number from the previous number is created; 1:10 just stores the start and end point, and that recipe is used to generate the set of all values. To expand the values, you have to ask for them to be collected (though this typically isnt needed in practice):\n\ncollect(1:10)\n\n10-element Vector{Int64}:\n 1\n 2\n 3\n 4\n 5\n 6\n 7\n 8\n 9\n 10\n\n\nWhen a non-default step size is needed, it goes in the middle, as in a:h:b. For example, counting by sevens from \\(1\\) to \\(50\\) is achieved by:\n\ncollect(1:7:50)\n\n8-element Vector{Int64}:\n 1\n 8\n 15\n 22\n 29\n 36\n 43\n 50\n\n\nOr counting down from 100:\n\ncollect(100:-7:1)\n\n15-element Vector{Int64}:\n 100\n 93\n 86\n 79\n 72\n 65\n 58\n 51\n 44\n 37\n 30\n 23\n 16\n 9\n 2\n\n\nIn this last example, we said to end with \\(1\\), but it ended with \\(2\\). The ending value in the range is a suggestion to go up to, but not exceed. Negative values for h are used to make decreasing sequences.\n\n\n\nFor generating points to make graphs, a natural set of points to specify is \\(n\\) evenly spaced points between \\(a\\) and \\(b\\). We can mimic creating this set with the range operation by solving for the correct step size. We have \\(a_0=a\\) and \\(a_0 + (n-1) \\cdot h = b\\). (Why \\(n-1\\) and not \\(n\\)?) Solving yields \\(h = (b-a)/(n-1)\\). To be concrete we might ask for \\(9\\) points between \\(-1\\) and \\(1\\):\n\na, b, n = -1, 1, 9\nh = (b-a)/(n-1)\ncollect(a:h:b)\n\n9-element Vector{Float64}:\n -1.0\n -0.75\n -0.5\n -0.25\n 0.0\n 0.25\n 0.5\n 0.75\n 1.0\n\n\nPretty neat. 
If we were doing this many times - such as once per plot - wed want to encapsulate this into a function, for example:\n\nfunction evenly_spaced(a, b, n)\n h = (b-a)/(n-1)\n collect(a:h:b)\nend\n\nevenly_spaced (generic function with 1 method)\n\n\nGreat, lets try it out:\n\nevenly_spaced(0, 2pi, 5)\n\n5-element Vector{Float64}:\n 0.0\n 1.5707963267948966\n 3.141592653589793\n 4.71238898038469\n 6.283185307179586\n\n\nNow, our implementation was straightforward, but only because it avoids some subtleties. Look at something simple:\n\nevenly_spaced(1/5, 3/5, 3)\n\n3-element Vector{Float64}:\n 0.2\n 0.4\n 0.6\n\n\nIt seems to work as expected. But looking just at the algorithm it isnt quite so clear:\n\n1/5 + 2*1/5 # last value\n\n0.6000000000000001\n\n\nFloating point roundoff leads to the last value exceeding 0.6, so should it be included? Well, here it is pretty clear it should be, but better to have something programmed that hits both a and b and adjusts h accordingly.\nEnter the base function range which solves this seemingly simple - but not really - task. It can use a, b, and n. Like the range operation, this function returns a generator which can be collected to realize the values.\nThe number of points is specified with keyword arguments, as in:\n\nxs = range(-1, 1, length=9) # or simply range(-1, 1, 9) as of v\"1.7\"\n\n-1.0:0.25:1.0\n\n\nand\n\ncollect(xs)\n\n9-element Vector{Float64}:\n -1.0\n -0.75\n -0.5\n -0.25\n 0.0\n 0.25\n 0.5\n 0.75\n 1.0\n\n\n\n\n\n\n\n\nNote\n\n\n\nThere is also the LinRange(a, b, n) function which can be more performant than range, as it doesnt try to correct for floating point errors."
},
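Given the roundoff discussion above, the hand-rolled helper could simply defer to range, which adjusts the step so both endpoints are hit exactly. A sketch; the name evenly_spaced2 is ours, not from the notes:

```julia
# A variant of the text's `evenly_spaced` built on `range`, which
# corrects for floating point roundoff so both endpoints are included.
evenly_spaced2(a, b, n) = collect(range(a, b, length=n))

evenly_spaced2(1/5, 3/5, 3)   # endpoints 0.2 and 0.6 are hit exactly
```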
{
"objectID": "precalc/ranges.html#modifying-sequences",
"href": "precalc/ranges.html#modifying-sequences",
"title": "6  Ranges and Sets",
"section": "6.2 Modifying sequences",
"text": "6.2 Modifying sequences\nNow we concentrate on some more general styles to modify a sequence to produce a new sequence.\n\n6.2.1 Filtering\nFor example, another way to get the values between \\(0\\) and \\(100\\) that are multiples of \\(7\\) is to start with all \\(101\\) values and throw out those that dont match. To check if a number is divisible by \\(7\\), we could use the rem function. It gives the remainder upon division. Multiples of 7 match rem(m, 7) == 0. Checking for divisibility by seven is unusual enough there is nothing built in for that, but checking for division by \\(2\\) is common, and for that, there is a built-in function iseven.\nThe act of throwing out elements of a collection based on some condition is called filtering. The filter function does this in Julia; the basic syntax being filter(predicate_function, collection). The “predicate_function” is one that returns either true or false, such as iseven. The output of filter consists of the new collection of values - those where the predicate returns true.\nTo see it used, lets start with the numbers between 0 and 25 (inclusive) and filter out those that are even:\n\nfilter(iseven, 0:25)\n\n13-element Vector{Int64}:\n 0\n 2\n 4\n 6\n 8\n 10\n 12\n 14\n 16\n 18\n 20\n 22\n 24\n\n\nTo get the numbers between \\(1\\) and \\(100\\) that are divisible by \\(7\\) requires us to write a function akin to iseven, which isnt hard (e.g., is_seven(x) = x%7 == 0 or if being fancy Base.Fix2(iszero∘rem, 7)), but isnt something we continue with just yet.\nFor another example, here is an inefficient way to list the prime numbers between \\(100\\) and \\(200\\). 
This uses the isprime function from the Primes package\nusing Primes\n\nfilter(isprime, 100:200)\n\n21-element Vector{Int64}:\n 101\n 103\n 107\n 109\n 113\n 127\n 131\n 137\n 139\n 149\n 151\n 157\n 163\n 167\n 173\n 179\n 181\n 191\n 193\n 197\n 199\n\n\nIllustrating filter at this point is mainly a motivation to illustrate that we can start with a regular set of numbers and then modify or filter them. The function takes on more value once we discuss how to write predicate functions.\n\n\n6.2.2 Comprehensions\nLets return to the case of the set of even numbers between \\(0\\) and \\(100\\). We have many ways to describe this set:\n\nThe collection of numbers \\(0, 2, 4, 6 \\dots, 100\\), or the arithmetic sequence with step size \\(2\\), which is returned by 0:2:100.\nThe numbers between \\(0\\) and \\(100\\) that are even, that is filter(iseven, 0:100).\nThe set of numbers \\(\\{2k: k=0, \\dots, 50\\}\\).\n\nWhile Julia has a special type for dealing with sets, we will use a vector for such a set. (Unlike a set, vectors can have repeated values, but as vectors are more widely used, we demonstrate them.) Vectors are described more fully in a previous section, but as a reminder, vectors are constructed using square brackets: [] (a special syntax for concatenation). Square brackets are used in different contexts within Julia, in this case we use them to create a collection. If we separate single values in our collection by commas (or semicolons), we will create a vector:\n\nx = [0, 2, 4, 6, 8, 10]\n\n6-element Vector{Int64}:\n 0\n 2\n 4\n 6\n 8\n 10\n\n\nThat is of course only part of the set of even numbers we want. Creating more might be tedious were we to type them all out, as above. In such cases, it is best to generate the values.\nFor this simple case, a range can be used, but more generally a comprehension provides this ability using a construct that closely mirrors a set definition, such as \\(\\{2k: k=0, \\dots, 50\\}\\). 
The simplest use of a comprehension takes this form (as we described in the section on vectors):\n[expr for variable in collection]\nThe expression typically involves the variable specified after the keyword for. The collection can be a range, a vector, or many other items that are iterable. Here is how the mathematical set \\(\\{2k: k=0, \\dots, 50\\}\\) may be generated by a comprehension:\n\n[2k for k in 0:50]\n\n51-element Vector{Int64}:\n 0\n 2\n 4\n 6\n 8\n 10\n 12\n 14\n 16\n 18\n 20\n 22\n 24\n ⋮\n 78\n 80\n 82\n 84\n 86\n 88\n 90\n 92\n 94\n 96\n 98\n 100\n\n\nThe expression is 2k, the variable k, and the collection is the range of values, 0:50. The syntax is basically identical to how the math expression is typically read aloud.\nFor some other examples, here is how we can create the first \\(10\\) numbers divisible by \\(7\\):\n\n[7k for k in 1:10]\n\n10-element Vector{Int64}:\n 7\n 14\n 21\n 28\n 35\n 42\n 49\n 56\n 63\n 70\n\n\nHere is how we can square the numbers between \\(1\\) and \\(10\\):\n\n[x^2 for x in 1:10]\n\n10-element Vector{Int64}:\n 1\n 4\n 9\n 16\n 25\n 36\n 49\n 64\n 81\n 100\n\n\nTo generate other progressions, such as powers of \\(2\\), we could do:\n\n[2^i for i in 1:10]\n\n10-element Vector{Int64}:\n 2\n 4\n 8\n 16\n 32\n 64\n 128\n 256\n 512\n 1024\n\n\nHere are decreasing powers of \\(2\\):\n\n[1/2^i for i in 1:10]\n\n10-element Vector{Float64}:\n 0.5\n 0.25\n 0.125\n 0.0625\n 0.03125\n 0.015625\n 0.0078125\n 0.00390625\n 0.001953125\n 0.0009765625\n\n\nSometimes, the comprehension does not produce the type of output that may be expected. This is related to Julias more limited abilities to infer types at the command line. If the output type is important, the extra prefix of T[] can be used, where T is the desired type. We will see that this will be needed at times with symbolic math.\n\n\n6.2.3 Generators\nA typical pattern would be to generate a collection of numbers and then apply a function to them. 
For example, here is one way to sum the powers of \\(2\\):\n\nsum([2^i for i in 1:10])\n\n2046\n\n\nConceptually this is easy to understand, but computationally it is a bit inefficient. The generator syntax allows this type of task to be done more efficiently. To use this syntax, we just need to drop the []:\n\nsum(2^i for i in 1:10)\n\n2046\n\n\n(The difference being no intermediate object is created to store the collection of all values specified by the generator.)\n\n\n6.2.4 Filtering generated expressions\nBoth comprehensions and generators allow for filtering through the keyword if. The following shows one way to add the prime numbers in \\([1,100]\\):\n\nsum(p for p in 1:100 if isprime(p))\n\n1060\n\n\nThe value on the other side of if should be an expression that evaluates to either true or false for a given p (like a predicate function, but here specified as an expression). The value returned by isprime(p) is such.\nIn this example, we use the fact that rem(k, 7) returns the remainder found from dividing k by 7, and so is 0 when k is a multiple of 7:\n\nsum(k for k in 1:100 if rem(k,7) == 0) ## add multiples of 7\n\n735\n\n\nThe same if can be used in a comprehension. For example, this is an alternative to filter for identifying the numbers divisible by 7 in a range of numbers:\n\n[k for k in 1:100 if rem(k,7) == 0]\n\n14-element Vector{Int64}:\n 7\n 14\n 21\n 28\n 35\n 42\n 49\n 56\n 63\n 70\n 77\n 84\n 91\n 98\n\n\n\nExample: Making change\nThis example of Stefan Karpinski comes from a blog post highlighting changes to the Julia language with version v\"0.5.0\", which added features to comprehensions that made this example possible.\nFirst, a simple question: using pennies, nickels, dimes, and quarters, how many different ways can we generate one dollar? 
Clearly \\(100\\) pennies, or \\(20\\) nickels, or \\(10\\) dimes, or \\(4\\) quarters will do this, so the answer is at least four, but how much more than four?\nWell, we can use a comprehension to enumerate the possibilities. This example illustrates how comprehensions and generators can involve one or more variables for the iteration.\nFirst, we either have \\(0,1,2,3\\), or \\(4\\) quarters, or \\(0\\), \\(25\\) cents, \\(50\\) cents, \\(75\\) cents, or a dollars worth. If we have, say, \\(1\\) quarter, then we need to make up \\(75\\) cents with the rest. If we had \\(3\\) dimes, then we need to make up \\(45\\) cents out of nickels and pennies; if we then had \\(6\\) nickels, we know we must need \\(15\\) pennies.\nThe following expression shows how counting this can be done through enumeration. Here q is the amount contributed by quarters, d the amount from dimes, n the amount from nickels, and p the amount from pennies. q ranges over \\(0, 25, 50, 75, 100\\) or 0:25:100, etc. If we know that the sum of quarters, dimes, nickels contributes a certain amount, then the number of pennies must round things up to \\(100\\).\n\nways = [(q, d, n, p) for q = 0:25:100 for d = 0:10:(100 - q) for n = 0:5:(100 - q - d) for p = (100 - q - d - n)]\nlength(ways)\n\n242\n\n\nWe see \\(242\\) cases, each distinct. The first \\(3\\) are:\n\nways[1:3]\n\n3-element Vector{NTuple{4, Int64}}:\n (0, 0, 0, 100)\n (0, 0, 5, 95)\n (0, 0, 10, 90)\n\n\nThe generating expression reads naturally. It introduces the use of multiple for statements, each subsequent one depending on the value of the previous (working left to right). Now suppose we want to ensure that the amount in pennies is less than the amount in nickels, etc. We could use filter somehow to do this for our last answer, but using if allows for filtering while the events are generating. 
Here our condition is simply expressed: q > d > n > p:\n\n[(q, d, n, p) for q = 0:25:100\n for d = 0:10:(100 - q)\n for n = 0:5:(100 - q - d)\n for p = (100 - q - d - n)\n if q > d > n > p]\n\n4-element Vector{NTuple{4, Int64}}:\n (50, 30, 15, 5)\n (50, 30, 20, 0)\n (50, 40, 10, 0)\n (75, 20, 5, 0)"
},
{
"objectID": "precalc/ranges.html#random-numbers",
"href": "precalc/ranges.html#random-numbers",
"title": "6  Ranges and Sets",
"section": "6.3 Random numbers",
"text": "6.3 Random numbers\nWe have been discussing structured sets of numbers. On the opposite end of the spectrum are random numbers. Julia makes them easy to generate, especially random numbers chosen uniformly from \\([0,1)\\).\n\nThe rand() function returns a randomly chosen number in \\([0,1)\\).\nThe rand(n) function returns a vector of n randomly chosen numbers in \\([0,1)\\).\n\nTo illustrate, this command will return a single number:\n\nrand()\n\n0.7091665095082278\n\n\nIf the command is run again, it is almost certain that a different value will be returned:\n\nrand()\n\n0.49047268527669474\n\n\nThis call will return a vector of \\(10\\) such random numbers:\n\nrand(10)\n\n10-element Vector{Float64}:\n 0.9297209023460982\n 0.4435368429176365\n 0.5398096862580283\n 0.007172948822441794\n 0.95740601571626\n 0.36712852348390046\n 0.8258955993805087\n 0.23389908444145213\n 0.6865547101208402\n 0.8362747495470747\n\n\nThe rand function is easy to use. The only common source of confusion is the subtle distinction between rand() and rand(1), as the latter is a vector of \\(1\\) random number and the former just \\(1\\) random number."
},
{
"objectID": "precalc/ranges.html#questions",
"href": "precalc/ranges.html#questions",
"title": "6  Ranges and Sets",
"section": "6.4 Questions",
"text": "6.4 Questions\n\nQuestion\nWhich of these will produce the odd numbers between \\(1\\) and \\(99\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n 1:3:99\n \n \n\n\n \n \n \n \n 1:2:99\n \n \n\n\n \n \n \n \n 1:99\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhich of these will create the sequence \\(2, 9, 16, 23, \\dots, 72\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n 2:72\n \n \n\n\n \n \n \n \n 72:-7:2\n \n \n\n\n \n \n \n \n 2:9:72\n \n \n\n\n \n \n \n \n 2:7:72\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nHow many numbers are in the sequence produced by 0:19:1000?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe range operation (a:h:b) can also be used to count down. Which of these will do so, counting down from 10 to 1? (You can call collect to visualize the generated numbers.)\n\n\n\n \n \n \n \n \n \n \n \n \n 1:-1:10\n \n \n\n\n \n \n \n \n 10:-1:1\n \n \n\n\n \n \n \n \n 10:1\n \n \n\n\n \n \n \n \n 1:10\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat is the last number generated by 1:4:7?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhile the range operation can generate vectors by collecting, do the objects themselves act like vectors?\nDoes scalar multiplication work as expected? In particular, is the result of 2*(1:5) basically the same as 2 * [1,2,3,4,5]?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nDoes vector addition work as expected? In particular, is the result of (1:4) + (2:5) basically the same as [1,2,3,4] + [2,3,4,5]?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat if parentheses are left off? 
Explain the output of 1:4 + 2:5?\n\n\n\n \n \n \n \n \n \n \n \n \n Addition happens prior to the use of : so this is like 1:(4+2):5\n \n \n\n\n \n \n \n \n It is just random\n \n \n\n\n \n \n \n \n It gives the correct answer, a generator for the vector [3,5,7,9]\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nHow is a:b-1 interpreted:\n\n\n\n \n \n \n \n \n \n \n \n \n as a:(b-1)\n \n \n\n\n \n \n \n \n as (a:b) - 1, which is (a-1):(b-1)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCreate the sequence \\(10, 100, 1000, \\dots, 1,000,000\\) using a list comprehension. Which of these works?\n\n\n\n \n \n \n \n \n \n \n \n \n [10^i for i in [10, 100, 1000]]\n \n \n\n\n \n \n \n \n [i^10 for i in [1:6]]\n \n \n\n\n \n \n \n \n [10^i for i in 1:6]\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCreate the sequence \\(0.1, 0.01, 0.001, \\dots, 0.0000001\\) using a list comprehension. Which of these will work:\n\n\n\n \n \n \n \n \n \n \n \n \n [(1/10)^i for i in 1:7]\n \n \n\n\n \n \n \n \n [10^-i for i in 1:7]\n \n \n\n\n \n \n \n \n [i^(1/10) for i in 1:7]\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nEvaluate the expression \\(x^3 - 2x + 3\\) for each of the values \\(-5, -4, \\dots, 4, 5\\) using a comprehension. Which of these will work?\n\n\n\n \n \n \n \n \n \n \n \n \n [x^3 - 2x + 3 for x in -(5:5)]\n \n \n\n\n \n \n \n \n [x^3 - 2x + 3 for i in -5:5]\n \n \n\n\n \n \n \n \n [x^3 - 2x + 3 for x in -5:5]\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nHow many prime numbers are there between \\(1100\\) and \\(1200\\)? (Use filter and isprime)\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhich has more prime numbers the range 1000:2000 or the range 11000:12000?\n\n\n\n \n \n \n \n \n \n \n \n \n 1000:2000\n \n \n\n\n \n \n \n \n 11000:12000\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWe can easily add an arithmetic progression with the sum function. 
For example, sum(1:100) will add the numbers \\(1, 2, ..., 100\\).\nWhat is the sum of the odd numbers between \\(0\\) and \\(100\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe sum of the arithmetic progression \\(a, a+h, \\dots, a+n\\cdot h\\) has a simple formula. Using a few cases, can you tell if this is the correct one:\n\\[\n(n+1)\\cdot a + h \\cdot n(n+1)/2\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n Yes, this is true\n \n \n\n\n \n \n \n \n No, this is false\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA geometric progression is of the form \\(a^0, a^1, a^2, \\dots, a^n\\). These are easily generated by comprehensions of the form [a^i for i in 0:n]. Find the sum of the geometric progression \\(1, 2^1, 2^2, \\dots, 2^{10}\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIs your answer of the form \\((1 - a^{n+1}) / (1-a)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe product of the terms in an arithmetic progression has a known formula. The product can be found by an expression of the form prod(a:h:b). Find the product of the terms in the sequence \\(1,3,5,\\dots,19\\)."
},
{
"objectID": "precalc/functions.html",
"href": "precalc/functions.html",
"title": "7  Functions",
"section": "",
"text": "This section will use the following add-on packages:\nA mathematical function is defined abstractly by:\nThat is, a function gives a correspondence between values in its domain and values in its range.\nThis definition is abstract, as functions can be very general. With single-variable calculus, we generally specialize to real-valued functions of a single variable (univariate, scalar functions). These typically have the correspondence given by a rule, such as \\(f(x) = x^2\\) or \\(f(x) = \\sqrt{x}\\). The functions domain may be implicit (as in all \\(x\\) for which the rule is defined) or may be explicitly given as part of the rule. The functions range is then the image of its domain, or the set of all \\(f(x)\\) for each \\(x\\) in the domain (\\(\\{f(x): x \\in \\text{ domain}\\}\\)).\nSome examples of mathematical functions are:\n\\[\nf(x) = \\cos(x), \\quad g(x) = x^2 - x, \\quad h(x) = \\sqrt{x}, \\quad\ns(x) = \\begin{cases} -1 & x < 0\\\\1&x>0\\end{cases}.\n\\]\nFor these examples, the domain of both \\(f(x)\\) and \\(g(x)\\) is all real values of \\(x\\), whereas for \\(h(x)\\) it is implicitly just the set of non-negative numbers, \\([0, \\infty)\\). Finally, for \\(s(x)\\), we can see that \\(s(x)\\) is defined for every \\(x\\) but \\(0\\).\nIn general the range is harder to identify than the domain, and this is the case for these functions too. For \\(f(x)\\) we may know the \\(\\cos\\) function is trapped in \\([-1,1]\\) and it is intuitively clear that all values in that set are possible. The function \\(h(x)\\) would have range \\([0,\\infty)\\). The \\(s(x)\\) function is either \\(-1\\) or \\(1\\), so only has two possible values in its range. What about \\(g(x)\\)? It is a parabola that opens upward, so any \\(y\\) values below the \\(y\\) value of its vertex will not appear in the range. 
In this case, the symmetry indicates that the vertex will be at \\((1/2, -1/4)\\), so the range is \\([-1/4, \\infty)\\).\nEuler defined a function as an “analytic expression composed in any way whatsoever of the variable quantity and numbers or constant quantities.” He goes on to indicate that as Euler matured, so did his notion of function, ending up closer to the modern idea of a correspondence not necessarily tied to a particular formula or “analytic expression.” He finishes by saying: “It is fair to say that we now study functions in analysis because of him.”\nWe will see that defining functions within Julia can be as simple a concept as Euler started with, but that the more abstract concept has a great advantage that is exploited in the design of the language."
},
{
"objectID": "precalc/functions.html#defining-simple-mathematical-functions",
"href": "precalc/functions.html#defining-simple-mathematical-functions",
"title": "7  Functions",
"section": "7.1 Defining simple mathematical functions",
"text": "7.1 Defining simple mathematical functions\nThe notation Julia uses to define simple mathematical functions could not be more closely related to how they are written mathematically. For example, the functions \\(f(x)\\), \\(g(x)\\), and \\(h(x)\\) above may be defined by:\n\nf(x) = cos(x)\ng(x) = x^2 - x\nh(x) = sqrt(x)\n\nh (generic function with 1 method)\n\n\nThe left-hand side of the equals sign is an assignment. In this use, a function with a given signature is defined and attached to a method table for the given function name. The right-hand side is simply Julia code to compute the rule corresponding to the function.\nCalling the function also follows standard math notation:\n\nf(pi), g(2), h(4)\n\n(-1.0, 2, 2.0)\n\n\nFor typical cases like the three above, there isnt really much new to learn.\n\n\n\n\n\n\nNote\n\n\n\nThe equals sign in Julia always indicates either an assignment or a mutation of the object on the left side. The definition of a function above is an assignment, in that a function is added (or modified) in a table holding the methods associated with the functions name.\nThe equals sign restricts the expressions available on the left-hand side to a) a variable name, for assignment; b) mutating an object at an index, as in xs[1]; c) mutating a property of a struct; or d) a function assignment following the form function_name(args...).\nWhereas function definitions and usage in Julia mirror standard math notation, equations in math are not so mirrored in Julia. In mathematical equations, the left-hand side of an equation is typically a complicated algebraic expression. Not so with Julia, where the left hand side of the equals sign is prescribed and quite limited.\n\n\n\n7.1.1 The domain of a function\nFunctions in Julia have an implicit domain, just as they do mathematically. In the case of \\(f(x)\\) and \\(g(x)\\), the right-hand side is defined for all real values of \\(x\\), so the domain is all \\(x\\). 
For \\(h(x)\\) this isnt the case, of course. Trying to call \\(h(x)\\) when \\(x < 0\\) will give an error:\n\nh(-1)\n\nLoadError: DomainError with -1.0:\nsqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).\n\n\nThe DomainError is one of many different error types Julia has; in this case it is quite apt: the value \\(-1\\) is not in the domain of the function.\n\n\n7.1.2 Equations, functions, calling a function\nMathematically we tend to blur the distinction between the equation\n\\[\ny = 5/9 \\cdot (x - 32)\n\\]\nand the function\n\\[\nf(x) = 5/9 \\cdot (x - 32)\n\\]\nIn fact, the graph of a function \\(f(x)\\) is simply defined as the graph of the equation \\(y=f(x)\\). There is a distinction in Julia, as a command such as\n\nx = -40\ny = 5/9 * (x - 32)\n\n-40.0\n\n\nwill evaluate the right-hand side with the value of x bound at the time of assignment to y, whereas assignment to a function\n\nf(x) = 5/9 * (x - 32)\nf(72) ## room temperature\n\n22.22222222222222\n\n\nwill create a function object with a value of x determined at a later time - the time the function is called. So the value of x defined when the function is created is not important here (as the value of x used by f is passed in as an argument).\nWithin Julia, we make note of the distinction between a function object versus a function call. In the definition f(x)=cos(x), the variable f refers to a function object, whereas the expression f(pi) is a function call. 
This mirrors the math notation where an \\(f\\) is used when properties of a function are being emphasized (such as \\(f \\circ g\\) for composition) and \\(f(x)\\) is used when the values related to the function are being emphasized (such as saying “the plot of the equation \\(y=f(x)\\)”).\nDistinguishing these three related but different concepts (equations, function objects, and function calls) is important when modeling on the computer.\n\n\n7.1.3 Cases\nThe definition of \\(s(x)\\) above has two cases:\n\\[\ns(x) = \\begin{cases} -1 & x < 0\\\\ 1 & x > 0. \\end{cases}\n\\]\nWe learn to read this as: when \\(x\\) is less than \\(0\\), then the answer is \\(-1\\). If \\(x\\) is greater than \\(0\\) the answer is \\(1.\\) Often - but not in this example - there is an “otherwise” case to catch those values of \\(x\\) that are not explicitly mentioned. As there is no such “otherwise” case here, we can see that this function has no definition when \\(x=0\\). This function is often called the “sign” function and is also defined by \\(\\lvert x\\rvert/x\\). (Julias sign function actually defines sign(0) to be 0.)\nHow do we create conditional statements in Julia? Programming languages generally have “if-then-else” constructs to handle conditional evaluation. In Julia, the following code will handle the above condition:\nif x < 0\n -1\nelseif x > 0\n 1\nend\nThe “otherwise” case would be caught with an else addition. So, for example, this would implement Julias definition of sign (which also assigns \\(0\\) to \\(0\\)):\nif x < 0\n -1\nelseif x > 0\n 1\nelse\n 0\nend\nThe conditions for the if statements are expressions that evaluate to either true or false, such as those generated by the Boolean operators <, <=, ==, !=, >=, and >.\nIf familiar with if conditions, they are natural to use. However, for simpler cases of “if-else” Julia provides the more convenient ternary operator: cond ? if_true : if_false. 
(The name comes from the fact that there are three arguments specified.) The ternary operator checks the condition and if true returns the first expression, whereas if the condition is false the second expression is returned. Only the expression selected by the condition is evaluated.\nFor example, here is one way to define an absolute value function:\n\nabs_val(x) = x >= 0 ? x : -x\n\nabs_val (generic function with 1 method)\n\n\nThe condition is x >= 0 - or is x non-negative? If so, the value x is used, otherwise -x is used.\nHere is a means to implement a function which takes the larger of x or 10:\n\nbigger_10(x) = x > 10 ? x : 10.0\n\nbigger_10 (generic function with 1 method)\n\n\n(This could also utilize the max function: f(x) = max(x, 10.0).)\nOr similarly, a function to represent a cell phone plan where the first \\(500\\) minutes are \\(20\\) dollars and every additional minute is \\(5\\) cents:\n\ncellplan(x) = x < 500 ? 20.0 : 20.0 + 0.05 * (x-500)\n\ncellplan (generic function with 1 method)\n\n\n\n\n\n\n\n\nWarning\n\n\n\nType stability. These last two definitions used 10.0 and 20.0 instead of the integers 10 and 20 for the answer. Why the extra typing? When Julia can predict the type of the output from the types of the inputs, it can be more efficient. So when possible, we help out and ensure the output is always the same type.\n\n\n\nExample\nThe ternary operator can be used to define an explicit domain. For example, a falling body might have height given by \\(h(t) = 10 - 16t^2\\). This model only applies for non-negative \\(t\\) and non-negative \\(h\\) values. So, in particular \\(0 \\leq t \\leq \\sqrt{10/16}\\). To implement this function we might have:\n\nhᵣ(t) = 0 <= t <= sqrt(10/16) ? 10.0 - 16t^2 : error(\"t is not in the domain\")\n\nhᵣ (generic function with 1 method)\n\n\n\n\nNesting ternary operators\nThe function s(x) isnt quite so easy to implement, as there isnt an “otherwise” case. 
We could use an if statement, but instead illustrate using a second, nested ternary operator:\n\ns(x) = x < 0 ? -1 :\n x > 0 ? 1 : error(\"0 is not in the domain\")\n\ns (generic function with 1 method)\n\n\nWith nested ternary operators, the advantage over the if condition is not always compelling, but for simple cases the ternary operator is quite useful."
},
{
"objectID": "precalc/functions.html#functions-defined-with-the-function-keyword",
"href": "precalc/functions.html#functions-defined-with-the-function-keyword",
"title": "7  Functions",
"section": "7.2 Functions defined with the “function” keyword",
"text": "7.2 Functions defined with the “function” keyword\nFor more complicated functions, say one with a few steps to compute, an alternate form for defining a function can be used:\nfunction function_name(function_arguments)\n ...function_body...\nend\nThe last value computed is returned unless the function_body contains an explicit return statement.\nFor example, the following is a more verbose way to define \\(sq(x) = x^2\\):\n\nfunction sq(x)\n return x^2\nend\n\nsq (generic function with 1 method)\n\n\nThe line return x^2 could have just been x^2, as it is the last (and only) line evaluated.\n\n\n\n\n\n\nNote\n\n\n\nThe return keyword is not a function, so is not called with parentheses. An empty return statement will return a value of nothing.\n\n\n\nExample\nImagine we have the following complicated function related to the trajectory of a projectile with wind resistance:\n\\[\n f(x) = \\left(\\frac{g}{k v_0\\cos(\\theta)} + \\tan(\\theta) \\right) x + \\frac{g}{k^2}\\ln\\left(1 - \\frac{k}{v_0\\cos(\\theta)} x \\right)\n\\]\nHere \\(g\\) is the gravitational constant \\(9.8\\) and \\(v_0\\), \\(\\theta\\), and \\(k\\) are parameters, which we take to be \\(200\\), \\(45\\) degrees and \\(1/2\\) respectively. With these values, the above function can be computed when \\(x=100\\) with:\n\nfunction trajectory(x)\n g, v0, theta, k = 9.8, 200, 45*pi/180, 1/2\n a = v0 * cos(theta)\n\n (g/(k*a) + tan(theta))* x + (g/k^2) * log(1 - k/a*x)\nend\n\ntrajectory (generic function with 1 method)\n\n\n\ntrajectory(100)\n\n96.7577179163216\n\n\nBy using a multi-line function our work is much easier to look over for errors.\n\n\nExample: the secant method for finding a solution to \\(f(x) = 0\\).\nThis next example shows how using functions to collect a set of computations for simpler reuse can be very helpful.\nAn old method for finding a zero of an equation is the secant method. We illustrate the method with the function \\(f(x) = x^2 - 2\\). 
In a previous example we saw how to create a function to evaluate the secant line between \\((a,f(a))\\) and \\((b, f(b))\\) at any point. In this example, we define a function to compute the \\(x\\) coordinate of where the secant line crosses the \\(x\\) axis. This can be defined as follows:\n\nfunction secant_intersection(f, a, b)\n # solve 0 = f(b) + m * (x-b) where m is the slope of the secant line\n # x = b - f(b) / m\n m = (f(b) - f(a)) / (b - a)\n b - f(b) / m\nend\n\nsecant_intersection (generic function with 1 method)\n\n\nWe utilize this as follows. Suppose we wish to solve \\(f(x) = 0\\) and we have two “rough” guesses for the answer. In our example, we wish to solve \\(q(x) = x^2 - 2\\) and our “rough” guesses are \\(1\\) and \\(2\\). Call these values \\(a\\) and \\(b\\). We improve our rough guesses by finding a value \\(c\\) which is the intersection point of the secant line.\n\nq(x) = x^2 - 2\n𝒂, 𝒃 = 1, 2\n𝒄 = secant_intersection(q, 𝒂, 𝒃)\n\n1.3333333333333335\n\n\nIn our example, we see that in trying to find an answer to \\(f(x) = 0\\) (\\(\\sqrt{2}\\approx 1.414\\dots\\)) our value found from the intersection point is a better guess than either \\(a=1\\) or \\(b=2\\):\n\n\n\n\n\nStill, q(𝒄) is not really close to \\(0\\):\n\nq(𝒄)\n\n-0.22222222222222188\n\n\nBut it is much closer than either \\(q(a)\\) or \\(q(b)\\), so it is an improvement. This suggests renaming \\(a\\) and \\(b\\) with the old \\(b\\) and \\(c\\) values and trying again; we might do better still:\n\n𝒂, 𝒃 = 𝒃, 𝒄\n𝒄 = secant_intersection(q, 𝒂, 𝒃)\nq(𝒄)\n\n-0.03999999999999959\n\n\nYes, now the function value at this new \\(c\\) is even closer to \\(0\\). Trying a few more times we see we just get closer and closer. 
Here we start again to see the progress:\n\n𝒂,𝒃 = 1, 2\nfor step in 1:6\n 𝒂, 𝒃 = 𝒃, secant_intersection(q, 𝒂, 𝒃)\n current = (c=𝒃, qc=q(𝒃))\n @show current\nend\n\ncurrent = (c = 1.3333333333333335, qc = -0.22222222222222188)\ncurrent = (c = 1.4000000000000001, qc = -0.03999999999999959)\ncurrent = (c = 1.4146341463414633, qc = 0.0011897679952408424)\ncurrent = (c = 1.41421143847487, qc = -6.007286838860537e-6)\ncurrent = (c = 1.4142135620573204, qc = -8.931455575122982e-10)\ncurrent = (c = 1.4142135623730954, qc = 8.881784197001252e-16)\n\n\nNow our guess \\(c\\) is basically the same as sqrt(2). Repeating the above leads to only a slight improvement in the guess, as we are about as close as floating point values will allow.\nHere we see a visualization with all these points. As can be seen, it quickly converges at the scale of the visualization, as we cant see much closer than 1e-2.\n\n\n\n\n\nIn most cases, this method can fairly quickly find a zero provided two good starting points are used."
},
{
"objectID": "precalc/functions.html#parameters-function-context-scope-keyword-arguments",
"href": "precalc/functions.html#parameters-function-context-scope-keyword-arguments",
"title": "7  Functions",
"section": "7.3 Parameters, function context (scope), keyword arguments",
"text": "7.3 Parameters, function context (scope), keyword arguments\nConsider two functions implementing the slope-intercept form and point-slope form of a line:\n\\[\nf(x) = m \\cdot x + b, \\quad g(x) = y_0 + m \\cdot (x - x_0).\n\\]\nBoth functions use the variable \\(x\\), but there is no confusion, as we learn that this is just a dummy variable to be substituted for and so could have any name. Both also share a variable \\(m\\) for a slope. Where does that value come from? In practice, there is a context that gives an answer. Despite the same name, there is no expectation that the slope will be the same for each function if the context is different. So when parameters are involved, a function involves a rule and a context to give specific values to the parameters. Euler had said initially that functions were composed of “the variable quantity and numbers or constant quantities.” The term “variable,” we still use, but instead of “constant quantities,” we use the name “parameters.”\nSomething similar is also true with Julia. Consider the example of writing a function to model a linear equation with slope \\(m=2\\) and \\(y\\)-intercept \\(3\\). A typical means to do this would be to define constants, and then use the familiar formula:\n\nm, b = 2, 3\nmxb(x) = m*x + b\n\nmxb (generic function with 1 method)\n\n\nThis will work as expected. For example, mxb(0) will be \\(b\\) and mxb(2) will be \\(7\\):\n\nmxb(0), mxb(2)\n\n(3, 7)\n\n\nAll fine, but what if somewhere later the values for \\(m\\) and \\(b\\) were redefined, say with \\(m,b = 3,2\\)?\nNow what happens with mxb(0)? When mxb was defined b was \\(3\\), but now if we were to call mxb, b is \\(2\\). Which value will we get? More generally, when mxb is being evaluated, in what context does Julia look up the bindings for the variables it encounters? 
It could be that the values are assigned when the function is defined, or it could be that the values for the parameters are resolved when the function is called. If the latter, what context will be used?\nBefore discussing this, lets just see in this case:\n\nm, b = 3, 2\nmxb(0)\n\n2\n\n\nSo the b is found from the currently stored value. This fact can be exploited: we can write template-like functions, such as f(x)=m*x+b, and reuse them just by updating the parameters separately.\nHow Julia resolves what a variable refers to is described in detail in the manual page Scope of Variables. In this case, the function definition finds variables in the context of where the function was defined, the main workspace. As seen, this context can be modified after the function definition and prior to the function call. It is only when b is needed that the context is consulted, so the most recent binding is retrieved. Contexts (more formally known as environments) allow the user to repurpose variable names without there being name collision. For example, we typically use x as a function argument, and different contexts allow this x to refer to different values.\nMostly this works as expected, but at times it can be complicated to reason about. In our example, definitions of the parameters can be forgotten, or the same variable name may have been used for some other purpose. The potential issue is with the parameters; the value for x is straightforward, as it is passed into the function. However, we can also pass the parameters, such as \\(m\\) and \\(b\\), as arguments. For parameters, we suggest using keyword arguments. These allow the specification of parameters, but also give a default value. This can make usage explicit, yet still convenient. 
For example, here is an alternate way of defining a line with parameters m and b:\n\nmxplusb(x; m=1, b=0) = m*x + b\n\nmxplusb (generic function with 1 method)\n\n\nThe right-hand side is identical to before, but the left hand side is different. Arguments defined after a semicolon are keyword arguments. They are specified as var=value (or var::Type=value to restrict the type) where the value is used as the default, should a value not be specified when the function is called.\nCalling a function with keyword arguments can be identical to before:\n\nmxplusb(0)\n\n0\n\n\nDuring this call, values for m and b are found from how the function is called, not the main workspace. In this case, nothing is specified so the defaults of \\(m=1\\) and \\(b=0\\) are used. Whereas, this call will use the user-specified values for m and b:\n\nmxplusb(0; m=3, b=2)\n\n2\n\n\nKeywords are used to mark the parameters whose values are to be changed from the default. Though one can use positional arguments for parameters - and there are good reasons to do so - using keyword arguments is a good practice if performance isnt paramount, as their usage is more explicit yet the defaults mean that a minimum amount of typing needs to be done.\n\nExample\nIn the example for multi-line functions we hard coded many variables inside the body of the function. In practice it can be better to pass these in as parameters along the lines of:\n\nfunction trajectory(x; g = 9.8, v0 = 200, theta = 45*pi/180, k = 1/2)\n a = v0 * cos(theta)\n (g/(k*a) + tan(theta))* x + (g/k^2) * log(1 - k/a*x)\nend\ntrajectory(100)\n\n96.7577179163216\n\n\n\n\n7.3.1 The f(x,p) style for parameterization\nAn alternative to keyword arguments is to bundle the parameters into a container and pass them as a single argument to the function. The idiom in Julia is to use the second argument for parameters, or f(x, p) for the function argument specifications. 
This style is used in the very popular SciML suite of packages.\nFor example, here we use a named tuple to pass parameters to trajectory:\n\nfunction trajectory(x, p)\n g, v0, theta, k = p.g, p.v0, p.theta, p.k # unpack parameters\n\n a = v0 * cos(theta)\n (g/(k*a) + tan(theta))* x + (g/k^2) * log(1 - k/a*x)\nend\n\np = (g=9.8, v0=200, theta = 45*pi/180, k=1/2)\ntrajectory(100, p)\n\n96.7577179163216\n\n\nThe style isnt so different from using keyword arguments, save the extra step of unpacking the parameters. The big advantage is consistency: the function is always called in an identical manner regardless of the number of parameters (or variables)."
},
{
"objectID": "precalc/functions.html#multiple-dispatch",
"href": "precalc/functions.html#multiple-dispatch",
"title": "7  Functions",
"section": "7.4 Multiple dispatch",
"text": "7.4 Multiple dispatch\nThe concept of a function is of much more general use than its restriction to mathematical functions of single real variable. A natural application comes from describing basic properties of geometric objects. The following function definitions likely will cause no great concern when skimmed over:\n\nArea(w, h) = w * h # of a rectangle\nVolume(r, h) = pi * r^2 * h # of a cylinder\nSurfaceArea(r, h) = pi * r * (r + sqrt(h^2 + r^2)) # of a right circular cone, including the base\n\nSurfaceArea (generic function with 1 method)\n\n\nThe right-hand sides may or may not be familiar, but it should be reasonable to believe that if push came to shove, the formulas could be looked up. However, the left-hand sides are subtly different - they have two arguments, not one. In Julia it is trivial to define functions with multiple arguments - we just did.\nEarlier we saw the log function can use a second argument to express the base. This function is basically defined by log(b,x)=log(x)/log(b). The log(x) value is the natural log, and this definition just uses the change-of-base formula for logarithms.\nBut not so fast, on the left side is a function with two arguments and on the right side the functions have one argument - yet they share the same name. How does Julia know which to use? Julia uses the number, order, and type of the positional arguments passed to a function to determine which function definition to use. This is technically known as multiple dispatch or polymorphism. As a feature of the language, it can be used to greatly simplify the number of functions the user must learn. The basic idea is that many functions are “generic” in that they have methods which will work differently in different scenarios.\n\n\n\n\n\n\nWarning\n\n\n\nMultiple dispatch is very common in mathematics. 
For example, we learn different ways to add: integers (fingers, carrying), real numbers (align the decimal points), rational numbers (common denominators), complex numbers (add components), vectors (add components), polynomials (combine like monomials), … yet we just use the same + notation for each operation. The concepts are related, the details different.\n\n\nJulia is similarly structured. Julia terminology would be to call the operation “+” a generic function and the different implementations methods of “+”. This allows the user to just need to know a smaller collection of generic concepts yet still have the power of detail-specific implementations. To see how many different methods are defined in the base Julia language for the + operator, we can use the command methods(+). As there are so many (\\(\\approx 200\\)) and that number is growing, we illustrate how many different logarithm methods are implemented for “numbers:”\n\nmethods(log, (Number,))\n\n# 11 methods for generic function log: log(::Static.StaticFloat64{M}) where M in Static at /Users/verzani/.julia/packages/Static/sVI3g/src/Static.jl:460 log(d::ForwardDiff.Dual{T}) where T in ForwardDiff at /Users/verzani/.julia/packages/ForwardDiff/Z1voq/src/dual.jl:238 log(x::Float32) in Base.Math at special/log.jl:266 log(::Irrational{:}) in Base.MathConstants at mathconstants.jl:123 log(x::SymPy.Sym) in SymPy at /Users/verzani/julia/SymPy/src/mathfuns.jl:43 log(x::BigFloat) in Base.MPFR at mpfr.jl:678 log(x::Float64) in Base.Math at special/log.jl:269 log(a::ComplexF16) in Base.Math at math.jl:1202 log(z::Complex) in Base at complex.jl:608 log(a::Float16) in Base.Math at math.jl:1201 log(x::Real) in Base.Math at math.jl:1218 \n\n\n(The arguments have type annotations such as x::Float64 or x::BigFloat. Julia uses these to help resolve which method should be called for a given set of arguments. This allows for different operations depending on the variable type. 
For example, in this case, the log function for Float64 values uses a fast algorithm, whereas for BigFloat values an algorithm that can handle multiple precision is used.)\n\nExample: An application of composition and multiple dispatch\nAs mentioned, Julia's multiple dispatch allows multiple functions with the same name. The function that gets selected depends not just on the type of the arguments, but also on the number of arguments given to the function. We can exploit this to simplify our tasks. For example, consider this optimization problem:\n\nFor all rectangles of perimeter \\(20\\), what is the one with largest area?\n\nThe start of this problem is to represent the area in terms of one variable. We see next that composition can simplify this task, which when done by hand requires a certain amount of algebra.\nRepresenting the area of a rectangle in terms of two variables is easy, as the familiar formula of width times height applies:\n\nArea(w, h) = w * h\n\nArea (generic function with 1 method)\n\n\nBut the other fact about this problem - that the perimeter is \\(20\\) - means that height depends on width. For this question, we can see that \\(P=2w + 2h\\) so that - as a function - height depends on w as follows:\n\nheight(w) = (20 - 2*w)/2\n\nheight (generic function with 1 method)\n\n\nBy hand we would substitute this last expression into that for the area and simplify (to get \\(A=w\\cdot (20-2 \\cdot w)/2 = -w^2 + 10w\\)). However, within Julia we can let composition do the substitution and leave the algebraic simplification for Julia to do:\n\nArea(w) = Area(w, height(w))\n\nArea (generic function with 2 methods)\n\n\nThis might seem odd, but just like with log, we now have two different but related functions named Area. Julia will decide which to use based on the number of arguments when the function is called. This setup allows both to be used on the same line, as above. 
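As a quick numeric check of the two-method setup (restating the definitions so the snippet stands alone):

```julia
Area(w, h) = w * h                # two-argument method: any rectangle
height(w) = (20 - 2w) / 2         # height from the perimeter-20 constraint
Area(w) = Area(w, height(w))      # one-argument method, via composition

Area(4, 6)    # dispatches on two arguments
Area(4)       # dispatches on one argument; same 4-by-6 rectangle
```

Both calls describe the same rectangle, so both return 24, and methods(Area) lists the two methods.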
This usage style is not so common with many computer languages, but is a feature of Julia which is built around the concept of generic functions with multiple dispatch rules to decide which method to call.\nFor example, jumping ahead a bit, the plot function of Plots expects functions of a single numeric variable. Behind the scenes, the one-argument method Area(w) will be used in this graph:\n\nplot(Area, 0, 10)\n\n\n\n\nFrom the graph, we can see that the width for maximum area is \\(w=5\\) and so \\(h=5\\) as well."
},
{
"objectID": "precalc/functions.html#function-application",
"href": "precalc/functions.html#function-application",
"title": "7  Functions",
"section": "7.5 Function application",
"text": "7.5 Function application\nThe typical calling pattern for a function simply follows mathematical notation, that is, f(x) calls the function f with the argument x. There are times, especially with function composition, when an alternative piping syntax is desirable. Julia provides the infix operation |> for piping, defining it by |>(x, f) = f(x). This allows composition to work left to right, instead of right to left. For example, these two calls produce the same answer:\n\nexp(sin(log(3))), 3 |> log |> sin |> exp\n\n(2.436535228064216, 2.436535228064216)"
},
{
"objectID": "precalc/functions.html#other-types-of-functions",
"href": "precalc/functions.html#other-types-of-functions",
"title": "7  Functions",
"section": "7.6 Other types of functions",
"text": "7.6 Other types of functions\nJulia has both generic functions and anonymous functions. Generic functions participate in multiple dispatch, a central feature of Julia. Anonymous functions are very useful with higher-order programming (passing functions as arguments). These notes occasionally take advantage of anonymous functions for convenience.\n\n7.6.1 Anonymous functions\nSimple mathematical functions have a domain and range which are a subset of the real numbers, and generally have a concrete mathematical rule. However, the definition of a function is much more abstract. Weve seen that functions for computer languages can be more complicated too, with, for example, the possibility of multiple input values. Things can get more abstract still.\nTake for example, the idea of the shift of a function. The following mathematical definition of a new function \\(g\\) related to a function \\(f\\):\n\\[\ng(x) = f(x-c)\n\\]\nhas an interpretation - the graph of \\(g\\) will be the same as the graph of \\(f\\) shifted to the right by \\(c\\) units. That is \\(g\\) is a transformation of \\(f\\). From one perspective, the act of replacing \\(x\\) with \\(x-c\\) transforms a function into a new function. Mathematically, when we focus on transforming functions, the word operator is sometimes used. This concept of transforming a function can be viewed as a certain type of function, in an abstract enough way. The relation would be to just pair off the functions \\((f,g)\\) where \\(g(x) = f(x-c)\\).\nWith Julia we can represent such operations. The simplest thing would be to do something like:\n\nf(x) = x^2 - 2x\ng(x) = f(x -3)\n\ng (generic function with 1 method)\n\n\nThen \\(g\\) has the graph of \\(f\\) shifted by 3 units to the right. Now f above refers to something in the main workspace, in this example a specific function. 
Better would be to allow f to be an argument of a function, like this:\n\nfunction shift_right(f; c=0)\n function(x)\n f(x - c)\n end\nend\n\nshift_right (generic function with 1 method)\n\n\nThat takes some parsing. In the body of shift_right is the definition of a function. But this function has no name; it is anonymous. But what it does should be clear - it subtracts \\(c\\) from \\(x\\) and evaluates \\(f\\) at this new value. Since the last expression creates a function, this function is returned by shift_right.\nSo we could have done something more complicated like:\n\nf(x) = x^2 - 2x\nl = shift_right(f, c=3)\n\n#15 (generic function with 1 method)\n\n\nThen l is a function that is derived from f.\n\n\n\n\n\n\nNote\n\n\n\nThe value of c used when l is called is the one passed to shift_right. Functions like l that are returned by other functions also are called closures, as the context they are evaluated within includes the context of the function that constructs them.\n\n\nAnonymous functions can be created with the function keyword, but we will use the “arrow” notation, arg->body, to create them. The above could have been defined as:\n\nshift_right_alt(f; c=0) = x -> f(x-c)\n\nshift_right_alt (generic function with 1 method)\n\n\nWhen the -> is seen, a function is being created.\n\n\n\n\n\n\nWarning\n\n\n\nGeneric versus anonymous functions. Julia has two types of functions, generic ones, as defined by f(x)=x^2 and anonymous ones, as defined by x -> x^2. One gotcha is that Julia does not like to use the same variable name for the two types. In general, Julia is a dynamic language, meaning variable names can be reused with different types of variables. But generic functions take more care, as when a new method is defined it gets added to a method table. So repurposing the name of a generic function for something else is not allowed. Similarly, repurposing an already defined variable name for a generic function is not allowed. 
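The distinction can be seen directly (a minimal sketch; the names sq and cube are arbitrary):

```julia
sq = x -> x^2      # an anonymous function bound to the variable sq
sq(3)              # 9
sq = 3             # fine: sq is just a variable and may be rebound

cube(x) = x^3      # a generic function; the name cube now holds a method table
# cube = 3         # would error: a generic function's name can't be reassigned
cube(2)            # 8
```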
This comes up when we use functions that return functions as we have different styles that can be used: When we defined l = shift_right(f, c=3) the value of l is assigned an anonymous function. This binding can be reused to define other variables. However, we could have defined the function l through l(x) = shift_right(f, c=3)(x), being explicit about what happens to the variable x. This would add a method to the generic function l. Meaning, we would get an error if we then tried to assign a value to l, as in the expression l=3. We generally employ the latter style, even though it involves a bit more typing, as we tend to stick to methods of generic functions for consistency.\n\n\n\nExample: the secant line\nA secant line is a line through two points on the graph of a function. If we have a function \\(f(x)\\), and two \\(x\\)-values \\(x=a\\) and \\(x=b\\), then we can find the slope between the points \\((a,f(a))\\) and \\((b, f(b))\\) with:\n\\[\nm = \\frac{f(b) - f(a)}{b - a}.\n\\]\nThe point-slope form of a line then gives the equation of the secant line as \\(y = f(a) + m \\cdot (x - a)\\).\nTo model this in Julia, we would want to turn the inputs f, a, b into a function that implements the secant line (functions are much easier to work with than equations). Here is how we can do it:\n\nfunction secant(f, a, b)\n m = (f(b) - f(a)) / (b-a)\n x -> f(a) + m * (x - a)\nend\n\nsecant (generic function with 1 method)\n\n\nThe body of the function nearly mirrors the mathematical treatment. The main difference is in place of \\(y = \\dots\\) we have an x -> ... to create an anonymous function.\nTo illustrate the use, suppose \\(f(x) = x^2 - 2\\) and we have the secant line between \\(a=1\\) and \\(b=2\\). The value at \\(x=3/2\\) is given by:\n\nf(x) = x^2 - 2\na,b = 1, 2\nsecant(f,a,b)(3/2)\n\n0.5\n\n\nThe last line employs double parentheses. 
The first pair, secant(f,a,b), returns a function, and the second pair, (3/2), is used to call the returned function.\n\n\nClosures\nOne main use of anonymous functions is to make closures. We've touched on two concepts: functions with parameters and functions as arguments to other functions. The creation of a function for a given set of parameters may be needed. Anonymous functions are used to create closures which capture the values of the parameters. For a simple example, mxplusb parameterizes any line, but to use a function to represent a specific line, a new function can be created:\n\nmxplusb(x; m=0, b=0) = m*x + b\nspecific_line(m,b) = x -> mxplusb(x; m=m, b=b)\n\nspecific_line (generic function with 1 method)\n\n\nThe returned object will have its parameters (m and b) fixed when used.\nIn Julia, the functions Base.Fix1 and Base.Fix2 are provided to take functions of two variables and create callable objects of just one variable, with the other argument fixed. This partial function application is provided by some of the logical comparison operators, which can be useful with filtering, say.\nFor example, <(2) is a funny looking way of expressing the function x -> x < 2. (Think of x < y as <(x,y) and then “fix” the value of y to be 2.) This is useful with filtering by a predicate function, for example:\n\nfilter(<(2), 0:4)\n\n2-element Vector{Int64}:\n 0\n 1\n\n\nwhich picks off the values of 0 and 1 in a somewhat obscure way but less verbose than filter(x -> x < 2, 0:4).\nThe Fix2 function is also helpful when using the f(x, p) form for passing parameters to a function. The result of Base.Fix2(f, p) is a function with its parameters fixed that can be passed along for plotting or other uses.\n\n\n\n7.6.2 The do notation\nMany functions in Julia accept a function as the first argument. A common pattern for calling some function is action(f, args...) where action is the function that will act on another function f using the value(s) in args.... 
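The action(function, args...) pattern is pervasive in base Julia; map is a familiar instance (this example is illustrative, not from the text):

```julia
# map is an "action": it applies the passed function to each element
map(x -> x^2, 1:4)        # returns [1, 4, 9, 16]
```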
The do notation is syntactic sugar for creating an anonymous function, which is useful when more complicated function bodies are needed.\nHere is an artificial example to illustrate a task we won't have cause to use in these notes, but is an important skill in some contexts. The do notation can be confusing to read, as it moves the function definition to the end and not the beginning, but is convenient to write and is used very often with the task of this example.\nTo save some text to a file requires a few steps: opening the file; writing to the file; closing the file. The open function does the first. One method has this signature open(f::Function, args...; kwargs...) and is documented to “Apply the function f to the result of open(args...; kwargs...) and close the resulting file descriptor upon completion.” This is great: the open and close stages are handled by Julia and only the writing is up to the user.\nThe writing is done in the body of a function, so the do notation allows the creation of the function to be handled anonymously. In this context, the argument to this function will be an IO handle, which is typically called io.\nSo the pattern would be\nopen(\"somefile.txt\", \"w\") do io\n write(io, \"Four score and seven\")\n write(io, \"years ago...\")\nend\nThe name of the file to open appears, how the file is to be opened (w means write, r would mean read), and then a function with argument io which writes two lines to io."
},
{
"objectID": "precalc/functions.html#questions",
"href": "precalc/functions.html#questions",
"title": "7  Functions",
"section": "7.7 Questions",
"text": "7.7 Questions\n\nQuestion\nState the domain and range of \\(f(x) = |x + 2|\\).\n\n\n\n \n \n \n \n \n \n \n \n \n Domain is all real numbers, range is all real numbers\n \n \n\n\n \n \n \n \n Domain is all real numbers, range is all non-negative numbers\n \n \n\n\n \n \n \n \n Domain is all non-negative numbers, range is all non-negative numbers\n \n \n\n\n \n \n \n \n Domain is all non-negative numbers, range is all real numbers\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nState the domain and range of \\(f(x) = 1/(x-2)\\).\n\n\n\n \n \n \n \n \n \n \n \n \n Domain is all real numbers, range is all real numbers\n \n \n\n\n \n \n \n \n Domain is all real numbers except \\(2\\), range is all real numbers except \\(0\\)\n \n \n\n\n \n \n \n \n Domain is all non-negative numbers except \\(-2\\), range is all non-negative numbers except \\(0\\)\n \n \n\n\n \n \n \n \n Domain is all non-negative numbers except \\(0\\), range is all real numbers except \\(2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhich of these functions has a domain of all real \\(x\\), but a range of \\(x > 0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f(x) = \\sqrt{x}\\)\n \n \n\n\n \n \n \n \n \\(f(x) = 1/x^2\\)\n \n \n\n\n \n \n \n \n \\(f(x) = 2^x\\)\n \n \n\n\n \n \n \n \n \\(f(x) = |x|\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nQuestion\nWhich of these commands will make a function for \\(f(x) = \\sin(x + \\pi/3)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n f: x -> sin(x + pi/3)\n \n \n\n\n \n \n \n \n f x = sin(x + pi/3)\n \n \n\n\n \n \n \n \n f = sin(x + pi/3)\n \n \n\n\n \n \n \n \n f(x) = sin(x + pi/3)\n \n \n\n\n \n \n \n \n function f(x) = sin(x + pi/3)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhich of these commands will create a function for \\(f(x) = (1 + x^2)^{-1}\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n f[x] = (1 + x^2)^(-1)\n \n \n\n\n \n \n \n \n f(x) = (1 + x^2)^(-1)\n \n \n\n\n \n \n \n \n function f(x) = (1 + x^2)^(-1)\n \n \n\n\n \n \n \n \n 
def f(x): (1 + x^2)^(-1)\n \n \n\n\n \n \n \n \n f(x) := (1 + x^2)^(-1)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWill the following Julia commands create a function for\n\\[\nf(x) = \\begin{cases}\n30 & x < 500\\\\\n30 + 0.10 \\cdot (x-500) & \\text{otherwise.}\n\\end{cases}\n\\]\nphone_plan(x) = x < 500 ? 30.0 : 30 + 0.10 * (x-500);\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe expression max(0, x) will be 0 if x is negative, but otherwise will take the value of x. Is this the same?\na_max(x) = x < 0 ? x : 0.0;\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIn statistics, the normal distribution has two parameters \\(\\mu\\) and \\(\\sigma\\) appearing as:\n\\[\nf(x; \\mu, \\sigma) = \\frac{1}{\\sqrt{2\\pi\\sigma}} e^{-\\frac{1}{2}\\frac{(x-\\mu)^2}{\\sigma}}.\n\\]\nDoes this function implement this with the default values of \\(\\mu=0\\) and \\(\\sigma=1\\)?\n\na_normal(x; mu=0, sigma=1) = 1/sqrt(2pi*sigma) * exp(-(1/2)*(x-mu)^2/sigma)\n\na_normal (generic function with 1 method)\n\n\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat value of \\(\\mu\\) is used if the function is called as f(x, sigma=2.7)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat value of \\(\\mu\\) is used if the function is called as f(x, mu=70)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat value of \\(\\mu\\) is used if the function is called as f(x, mu=70, sigma=2.7)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nJulia has keyword arguments (as just illustrated) but also positional arguments. These are matched by how the function is called. 
For example,\n\nA(w, h) = w * h\n\nA (generic function with 1 method)\n\n\nwhen called as A(10, 5) will use 10 for w and 5 for h, as the order of w and h matches that of 10 and 5 in the call.\nThis is clear enough, but in fact positional arguments can have default values (then called optional arguments). For example,\n\nB(w, h=5) = w * h\n\nB (generic function with 2 methods)\n\n\nThis actually creates two functions: B(w,h) for when the call is, say, B(10,5) and B(w) when the call is B(10).\nSuppose a function C is defined by\n\nC(x, mu=0, sigma=1) = 1/sqrt(2pi*sigma) * exp(-(1/2)*(x-mu)^2/sigma)\n\nC (generic function with 3 methods)\n\n\nThis is nearly identical to the last question, save for a comma instead of a semicolon after the x.\nWhat value of mu is used by the call C(1, 70, 2.7)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat value of mu is used by the call C(1, 70)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat value of mu is used by the call C(1)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWill the call C(1, mu=70) use a value of 70 for mu?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes, this will work just as it does for keyword arguments\n \n \n\n\n \n \n \n \n No, there will be an error that the function does not accept keyword arguments\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThis function mirrors that of the built-in clamp function:\n\nklamp(x, a, b) = x < a ? a : (x > b ? 
b : x)\n\nklamp (generic function with 1 method)\n\n\nCan you tell what it does?\n\n\n\n \n \n \n \n \n \n \n \n \n x is the larger of the minimum of x and a and the value of b, aka max(min(x,a),b)\n \n \n\n\n \n \n \n \n If x is in [a,b] it returns x, otherwise it returns a when x is less than a and b when x is greater than b.\n \n \n\n\n \n \n \n \n If x is in [a,b] it returns x, otherwise it returns NaN\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nJulia has syntax for the composition of functions \\(f\\) and \\(g\\) using the Unicode operator ∘ entered as \\circ[tab].\nThe notation to call a composition follows the math notation, where parentheses are necessary to separate the act of composition from the act of calling the function:\n\\[\n(f \\circ g)(x)\n\\]\nFor example\n\n(sin ∘ cos)(pi/4)\n\n0.6496369390800625\n\n\nWhat happens if you forget the extra parentheses and were to call sin ∘ cos(pi/4)?\n\n\n\n \n \n \n \n \n \n \n \n \n You still get \\(0.649...\\)\n \n \n\n\n \n \n \n \n You get a MethodError, as cos(pi/4) is evaluated as a number and ∘ is not defined for functions and numbers\n \n \n\n\n \n \n \n \n You get a generic function, but this won't be callable. If tried, it will give a method error.\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe pipe notation ex |> f takes the output of ex and uses it as the input to the function f. That is composition. What is the value of this expression 1 |> sin |> cos?\n\n\n\n \n \n \n \n \n \n \n \n \n It is 0.6663667453928805, the same as cos(sin(1))\n \n \n\n\n \n \n \n \n It is 0.5143952585235492, the same as sin(cos(1))\n \n \n\n\n \n \n \n \n It gives an error\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nJulia has implemented this limited set of algebraic operations on functions: ∘ for composition and ! for negation. (Read ! as “not.”) The latter is useful for “predicate” functions (ones that return either true or false). 
What is output by this command?\nfn = !iseven\nfn(3)\n\n\n\n \n \n \n \n \n \n \n \n \n false\n \n \n\n\n \n \n \n \n true\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nGeneric functions in Julia allow many algorithms to work without change for different number types. For example, 3000 years ago, floating point numbers wouldnt have been used to carry out the secant method computations, rather rational numbers would have been. We can see the results of using rational numbers with no change to our key function, just by starting with rational numbers for a and b:\n\nsecant_intersection(f, a, b) = b - f(b) * (b - a) / (f(b) - f(a)) # rewritten\nf(x) = x^2 - 2\na, b = 1//1, 2//1\nc = secant_intersection(f, a, b)\n\n4//3\n\n\nNow c is 4//3 and not 1.333.... This works as the key operations used: division, squaring, subtraction all have different implementations for rational numbers that preserve this type.\nRepeat the secant method two more times to find a better approximation for \\(\\sqrt{2}\\). What is the value of c found?\n\n\n\n \n \n \n \n \n \n \n \n \n 4//3\n \n \n\n\n \n \n \n \n 7//5\n \n \n\n\n \n \n \n \n 58//41\n \n \n\n\n \n \n \n \n 816//577\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nHow small is the value of \\(f(c)\\) for this value?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nHow close is this answer to the true value of \\(\\sqrt{2}\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n about \\(8\\) parts in \\(100\\)\n \n \n\n\n \n \n \n \n about \\(1\\) parts in \\(100\\)\n \n \n\n\n \n \n \n \n about \\(4\\) parts in \\(10,000\\)\n \n \n\n\n \n \n \n \n about \\(2\\) parts in \\(1,000,000\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n(Finding a good approximation to \\(\\sqrt{2}\\) would be helpful to builders, for example, as it could be used to verify the trueness of a square room, say.)\n\n\nQuestion\nJulia does not have surface syntax for the difference of functions. This is a common thing to want when solving equations. 
The tools available solve \\(f(x)=0\\), but problems may present as solving for \\(h(x) = g(x)\\) or even \\(h(x) = c\\), for some constant. Which of these solutions is not helpful if \\(h\\) and \\(g\\) are already defined?\n\n\n\n \n \n \n \n \n \n \n \n \n Just use f = h - g\n \n \n\n\n \n \n \n \n Define f(x) = h(x) - g(x)\n \n \n\n\n \n \n \n \n Use x -> h(x) - g(x) when the difference is needed\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIdentifying the range of a function can be a difficult task. We see in this question that in some cases, a package can be of assistance.\nA mathematical interval is a set of values of the form\n\nan open interval: \\(a < x < b\\), or \\((a,b)\\);\na closed interval: \\(a \\leq x \\leq b\\), or \\([a,b]\\);\nor a half-open interval: \\(a < x \\leq b\\) or \\(a \\leq x < b\\), respectively \\((a,b]\\) or \\([a,b)\\).\n\nThey all contain all real numbers between the endpoints; the distinction is whether the endpoints are included or not.\nA domain is some set, but typically that set is an interval such as all real numbers (\\((-\\infty,\\infty)\\)), all non-negative numbers (\\([0,\\infty)\\)), or, say, all positive numbers (\\((0,\\infty)\\)).\nThe IntervalArithmetic package provides an easy means to define closed intervals using the symbol .., but this is also used by the already loaded CalculusWithJulia package in a different manner, so we use the fully qualified named constructor in the following to construct intervals:\nimport IntervalArithmetic\n\nI1 = IntervalArithmetic.Interval(-Inf, Inf)\n\n[-∞, ∞]\n\n\n\nI2 = IntervalArithmetic.Interval(0, Inf)\n\n[0, ∞]\n\n\nThe main feature of the package is not to construct intervals, but rather to rigorously bound, with an interval, the image of a closed interval under a function. That is, for a function \\(f\\) and closed interval \\([a,b]\\), a bound for the set \\(\\{f(x) \\text{ for } x \\text{ in } [a,b]\\}\\). 
When [a,b] is the domain of \\(f\\), then this is a bound for the range of \\(f\\).\nFor example, the function \\(f(x) = x^2 + 2\\) has a domain of all real \\(x\\); the range can be found with:\n\nab = IntervalArithmetic.Interval(-Inf, Inf)\nu(x) = x^2 + 2\nu(ab)\n\n[2, ∞]\n\n\nFor this problem, the actual range can easily be identified. Does the bound computed match exactly?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nDoes sin(0..pi) exactly match the interval of \\([-1,1]\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nGuess why or why not?\n\n\n\n \n \n \n \n \n \n \n \n \n Well it does, because \\([-1,1]\\) is the range\n \n \n\n\n \n \n \n \n It does not. The bound found is a provably known bound. The small deviation is due to the possible errors in evaluation of the sin function near the floating point approximation of pi.\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nNow consider the evaluation\n\nf(x) = x^x\nI = IntervalArithmetic.Interval(0, Inf)\nf(I)\n\n[0, ∞]\n\n\nMake a graph of f. Does the interval found above provide a nearly exact estimate of the true range (as the previous two questions have)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nAny thoughts on why?\n\n\n\n \n \n \n \n \n \n \n \n \n The interval is a nearly exact estimate, as guaranteed by IntervalArithmetic.\n \n \n\n\n \n \n \n \n The guarantee of IntervalArithmetic is a bound on the interval, not the exact interval. In the case where the variable x appears more than once, it is treated formulaically as an independent quantity (meaning it has its full set of values considered in each instance) which is not the actual case mathematically. This is the \"dependence problem\" in interval arithmetic."
},
{
"objectID": "precalc/plotting.html",
"href": "precalc/plotting.html",
"title": "8  The Graph of a Function",
"section": "",
"text": "This section will use the following packages:\nA scalar, univariate function, such as \\(f(x) = 1 - x^2/2\\), can be thought of in many different ways. For example:\nThe graph of a univariate function is just a set of points in the Cartesian plane. These points come from the relation \\((x,f(x))\\) that defines the function. Operationally, a sketch of the graph will consider a handful of such pairs and then the rest of the points will be imputed.\nFor example, a typical approach to plot \\(f(x) = 1 - x^2/2\\) would be to choose some values for \\(x\\) and find the corresponding values of \\(y\\). This might be organized in a “T”-table:\nThese pairs would be plotted in a Cartesian plane and then connected with curved lines. A good sketch is aided by knowing ahead of time that this function describes a parabola which is curving downwards.\nWe note that this sketch would not include all the pairs \\((x,f(x))\\), as their extent is infinite, rather a well chosen collection of points over some finite domain."
},
{
"objectID": "precalc/plotting.html#graphing-a-function-with-julia",
"href": "precalc/plotting.html#graphing-a-function-with-julia",
"title": "8  The Graph of a Function",
"section": "8.1 Graphing a function with Julia",
"text": "8.1 Graphing a function with Julia\nJulia has several different options for rendering graphs, all in external packages. We will focus in these notes on the Plots package, which provides a common interface to several different plotting backends. (Click through for instructions for plotting with the Makie package or the PlotlyLight package.) At the top of this section the accompanying CalculusWithJulia package and the Plots package were loaded with the using command, like this:\nusing CalculusWithJulia\nusing Plots\n\n\n\n\n\n\nNote\n\n\n\nPlots is a frontend for one of several backends. Plots comes with a backend for web-based graphics (call plotly() to specify that); a backend for static graphs (call gr() for that). If the PyPlot package is installed, calling pyplot() will set that as a backend. For terminal usage, if the UnicodePlots package is installed, calling unicodeplots() will enable that usage. There are still other backends.\n\n\nThe plotly backend is part of the Plots package, as is gr. Other backends require installation, such as PyPlot and PlotlyJS. We use gr in these notes, for the most part. (The plotly backend is also quite nice for interactive usage, but doesnt work as well with the static HTML pages.)\nWith Plots loaded, it is straightforward to graph a function.\nFor example, to graph \\(f(x) = 1 - x^2/2\\) over the interval \\([-3,3]\\) we have:\n\nf(x) = 1 - x^2/2\nplot(f, -3, 3)\n\n\n\n\nThe plot command does the hard work behind the scenes. It needs \\(2\\) pieces of information declared:\n\nWhat to plot. With this invocation, this detail is expressed by passing a function object to plot\nWhere to plot; the xmin and xmax values. As with a sketch, it is impossible in this case to render a graph with all possible \\(x\\) values in the domain of \\(f\\), so we need to pick some viewing window. 
In the example this is \\([-3,3]\\) which is expressed by passing the two endpoints as the second and third arguments.\n\nPlotting a function is then this simple: plot(f, xmin, xmax).\n\nA basic template: Many operations we meet will take the form action(function, args...), as the call to plot does. The template shifts the focus to the action to be performed. This is a declarative style, where the details to execute the action are only exposed as needed.\n\n\n\n\n\n\n\nNote\n\n\n\nThe time to first plot can feel sluggish, but subsequent plots will be speedy. See the technical note at the end of this section for an explanation.\n\n\nLets see some other graphs.\nThe sin function over one period is plotted through:\n\nplot(sin, 0, 2pi)\n\n\n\n\nWe can make a graph of \\(f(x) = (1+x^2)^{-1}\\) over \\([-3,3]\\) with\n\nf(x) = 1 / (1 + x^2)\nplot(f, -3, 3)\n\n\n\n\nA graph of \\(f(x) = e^{-x^2/2}\\) over \\([-2,2]\\) is produced with:\n\nf(x) = exp(-x^2/2)\nplot(f, -2, 2)\n\n\n\n\nWe could skip the first step of defining a function by using an anonymous function. For example, to plot \\(f(x) = \\cos(x) - x\\) over \\([0, \\pi/2]\\) we could do:\n\nplot(x -> cos(x) - x, 0, pi/2)\n\n\n\n\nAnonymous functions are especially helpful when parameterized functions are involved:\n\nmxplusb(x; m=1, b=0) = m*x + b\nplot(x -> mxplusb(x; m=-1, b=1), -1, 2)\n\n\n\n\nHad we parameterized using the f(x,p) style, the result would be similar:\n\nfunction mxplusb(x, p)\n m, b = p.m, p.b\n m * x + b\nend\nplot(x -> mxplusb(x, (m=-1, b=1)), -1, 2)\n\n\n\n\n\n\n\n\n\n\nNote\n\n\n\nThe function object in the general pattern action(function, args...) 
is commonly specified in one of three ways: by a name, as with f; as an anonymous function; or as the return value of some other action through composition.\n\n\nAnonymous functions are also created by Julia's do notation, which is useful when the first argument to function (like plot) accepts a function:\n\nplot(0, pi/2) do x\n cos(x) - x\nend\n\n\n\n\nThe do notation can be a bit confusing to read when unfamiliar, though its convenience makes it appealing.\n\n\n\n\n\n\nNote\n\n\n\nSome types we will encounter, such as the one for symbolic values or the special polynomial one, have their own plot recipes that allow them to be plotted similarly as above, even though they are not functions.\n\n\n\nMaking a graph with Plots is easy, but producing a graph that is informative can be a challenge, as the choice of a viewing window can make a big difference in what is seen. For example, trying to make a graph of \\(f(x) = \\tan(x)\\), as below, will result in a bit of a mess - the chosen viewing window crosses several places where the function blows up:\n\nf(x) = tan(x)\nplot(f, -10, 10)\n\n\n\n\nThough this graph shows the asymptote structure and periodicity, it doesnt give much insight into each period or even into the fact that the function is periodic."
},
{
"objectID": "precalc/plotting.html#the-details-of-graph-making",
"href": "precalc/plotting.html#the-details-of-graph-making",
"title": "8  The Graph of a Function",
"section": "8.2 The details of graph making",
"text": "8.2 The details of graph making\nThe actual details of making a graph of \\(f\\) over \\([a,b]\\) are pretty simple and follow the steps in making a “T”-table:\n\nA set of \\(x\\) values are created between \\(a\\) and \\(b\\).\nA corresponding set of \\(y\\) values are created.\nThe pairs \\((x,y)\\) are plotted as points and connected with straight lines.\n\nThe only real difference is that when drawing by hand, we might know to curve the lines connecting points based on an analysis of the function. As Julia doesnt consider this, the points are connected with straight lines like a dot-to-dot puzzle.\nIn general, the x values are often generated by range or the colon operator and the y values produced by mapping or broadcasting a function over the generated x values.\nHowever, the plotting directive plot(f, xmin, xmax) calls an adaptive algorithm to use more points where needed, as judged by PlotUtils.adapted_grid(f, (xmin, xmax)). It computes both the x and y values. This algorithm is wrapped up into the unzip(f, xmin, xmax) function from CalculusWithJulia. The algorithm adds more points where the function is more “curvy” and uses fewer points where it is “straighter.” Here we see the linear function is identified as needing far fewer points than the oscillating function when plotted over the same range:\n\npts_needed(f, xmin, xmax) = length(unzip(f, xmin, xmax)[1])\npts_needed(x -> 10x, 0, 10), pts_needed(x -> sin(10x), 0, 10)\n\n(31, 1605)\n\n\n(In fact, the 31 is the minimum number of points used for any function; a linear function only needs two.)\n\nFor instances where a specific set of \\(x\\) values is desired to be used, the range function or colon operator can be used to create the \\(x\\) values and broadcasting used to create the \\(y\\) values. 
For example, if we were to plot \\(f(x) = \\sin(x)\\) over \\([0,2\\pi]\\) using \\(10\\) points, we might do:\n\n𝒙s = range(0, 2pi, length=10)\n𝒚s = sin.(𝒙s)\n\n10-element Vector{Float64}:\n 0.0\n 0.6427876096865393\n 0.984807753012208\n 0.8660254037844387\n 0.3420201433256689\n -0.34202014332566866\n -0.8660254037844385\n -0.9848077530122081\n -0.6427876096865396\n -2.4492935982947064e-16\n\n\nFinally, to plot the set of points and connect with lines, the \\(x\\) and \\(y\\) values are passed along as vectors:\n\nplot(𝒙s, 𝒚s)\n\n\n\n\nThis plots the points as pairs and then connects them in order using straight lines. Basically, it creates a dot-to-dot graph. The above graph looks primitive, as it doesnt utilize enough points.\n\nExample: Reflections\nThe graph of a function may be reflected through a line, as those seen with a mirror. For example, a reflection through the \\(y\\) axis takes a point \\((x,y)\\) to the point \\((-x, y)\\). We can easily see this graphically, when we have sets of \\(x\\) and \\(y\\) values through a judiciously placed minus sign.\nFor example, to plot \\(\\sin(x)\\) over \\((-\\pi,\\pi)\\) we might do:\n\nxs = range(-pi, pi, length=100)\nys = sin.(xs)\nplot(xs, ys)\n\n\n\n\nTo reflect this graph through the \\(y\\) axis, we only need to plot -xs and not xs:\n\nplot(-xs, ys)\n\n\n\n\nLooking carefully we see there is a difference. 
(How?)\nThere are four very common reflections:\n\nreflection through the \\(y\\)-axis takes \\((x,y)\\) to \\((-x, y)\\).\nreflection through the \\(x\\)-axis takes \\((x,y)\\) to \\((x, -y)\\).\nreflection through the origin takes \\((x,y)\\) to \\((-x, -y)\\).\nreflection through the line \\(y=x\\) takes \\((x,y)\\) to \\((y,x)\\).\n\nFor the \\(\\sin(x)\\) graph, we see that reflecting through the \\(x\\) axis produces the same graph as reflecting through the \\(y\\) axis:\n\nplot(xs, -ys)\n\n\n\n\nHowever, reflecting through the origin leaves this graph unchanged:\n\nplot(-xs, -ys)\n\n\n\n\n\nAn even function is one where reflection through the \\(y\\) axis leaves the graph unchanged. That is, \\(f(-x) = f(x)\\). An odd function is one where a reflection through the origin leaves the graph unchanged, or symbolically \\(f(-x) = -f(x)\\).\n\nIf we try reflecting the graph of \\(\\sin(x)\\) through the line \\(y=x\\), we have:\n\nplot(ys, xs)\n\n\n\n\nThis is the graph of the equation \\(x = \\sin(y)\\), but is not the graph of a function as the same \\(x\\) can map to more than one \\(y\\) value. (The new graph does not pass the “vertical line” test.)\nHowever, for the sine function we can get a function from this reflection if we choose a narrower viewing window:\n\nxs = range(-pi/2, pi/2, length=100)\nys = sin.(xs)\nplot(ys, xs)\n\n\n\n\nThe graph is that of the “inverse function” for \\(\\sin(x), x \\text{ in } [-\\pi/2, \\pi/2]\\).\n\n\nThe plot(xs, f) syntax\nWhen plotting a univariate function there are three basic patterns that can be employed. 
We have examples above of:\n\nplot(f, xmin, xmax) uses an adaptive algorithm to identify values for \\(x\\) in the interval [xmin, xmax],\nplot(xs, f.(xs)) to manually choose the values of \\(x\\) to plot points for, and\n\nFinally, there is a merging of these, following either of these patterns:\n\nplot(f, xs) or plot(xs, f)\n\nBoth require a manual choice of the \\(x\\)-values to plot, but the broadcasting is carried out in the plot command. This style is convenient, for example, to down sample the \\(x\\) range to see the plotting mechanics, such as:\n\nplot(0:pi/4:2pi, sin)\n\n\n\n\n\n\nNaN values\nAt times it is not desirable to draw lines between each successive point. For example, if there is a discontinuity in the function or if there were a vertical asymptote, such as what happens at \\(0\\) with \\(f(x) = 1/x\\).\nThe most straightforward plot is dominated by the vertical asymptote at \\(x=0\\):\n\nq(x) = 1/x\nplot(q, -1, 1)\n\n\n\n\nWe can attempt to improve this graph by adjusting the viewport. The viewport of a graph is the \\(x\\)-\\(y\\) range of the viewing window. By default, the \\(y\\)-part of the viewport is determined by the range of the function over the specified interval, \\([a,b]\\). As just seen, this approach can produce poor graphs. The ylims=(ymin, ymax) argument can modify what part of the \\(y\\) axis is shown. (Similarly xlims=(xmin, xmax) will modify the viewport in the \\(x\\) direction.)\nAs we see, even with this adjustment, the spurious line connecting the points with \\(x\\) values closest to \\(0\\) is still drawn:\n\nplot(q, -1, 1, ylims=(-10,10))\n\n\n\n\nThe dot-to-dot algorithm, at some level, assumes the underlying function is continuous; here \\(q(x)=1/x\\) is not.\nThere is a convention for most plotting programs that if the \\(y\\) value for a point is NaN, then no lines will connect to that point, (x,NaN). 
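To see the masking idea on its own, without any plotting, here is a small sketch; the helper name mask_nonfinite is our own, not part of Plots:

```julia
# Replace any non-finite value (Inf, -Inf, or NaN) with NaN, so a plotting
# routine that follows the NaN convention will break the line at that point.
mask_nonfinite(ys) = [isfinite(y) ? y : NaN for y in ys]

xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = mask_nonfinite(1 ./ xs)   # 1/0.0 is Inf, which becomes NaN
```

Passing xs and this masked ys to plot would then leave a gap at x = 0 rather than draw a spurious vertical line.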
NaN conveniently appears in many cases where a plot may have an issue, though not with \\(1/x\\) as 1/0 is Inf and not NaN. (Unlike, say, 0/0 which is NaN.)\nHere is one way to plot \\(q(x) = 1/x\\) over \\([-1,1]\\) taking advantage of this convention:\n\nxs = range(-1, 1, length=251)\nys = q.(xs)\nys[xs .== 0.0] .= NaN\nplot(xs, ys)\n\n\n\n\nBy using an odd number of points, we should have that \\(0.0\\) is amongst the xs. The next to last line replaces the \\(y\\) value that would be infinite with NaN.\nAs a recommended alternative, we might modify the function so that if it is too large, the values are replaced by NaN. Here is one such function consuming a function and returning a modified function put to use to make this graph:\n\nrangeclamp(f, hi=20, lo=-hi; replacement=NaN) = x -> lo < f(x) < hi ? f(x) : replacement\nplot(rangeclamp(x -> 1/x), -1, 1)\n\n\n\n\n(The clamp function is a base Julia function which clamps a number between lo and hi, returning lo or hi if x is outside that range.)"
},
{
"objectID": "precalc/plotting.html#layers",
"href": "precalc/plotting.html#layers",
"title": "8  The Graph of a Function",
"section": "8.3 Layers",
"text": "8.3 Layers\nGraphing more than one function over the same viewing window is often desirable. Though this is easily done in Plots by specifying a vector of functions as the first argument to plot instead of a single function object, we instead focus on building the graph layer by layer.\nFor example, to see that a polynomial and the cosine function are “close” near \\(0\\), we can plot both \\(\\cos(x)\\) and the function \\(f(x) = 1 - x^2/2\\) over \\([-\\pi/2,\\pi/2]\\):\n\nf(x) = 1 - x^2/2\nplot(cos, -pi/2, pi/2, label=\"cos\")\nplot!(f, -pi/2, pi/2, label=\"f\")\n\n\n\n\nAnother useful function to add to a plot is one to highlight the \\(x\\) axis. This makes identifying zeros of the function easier. The anonymous function x -> 0 will do this. But, perhaps less cryptically, so will the base function zero. For example\n\nf(x) = x^5 - x + 1\nplot(f, -1.5, 1.4, label=\"f\")\nplot!(zero, label=\"zero\")\n\n\n\n\n(The job of zero is to return “\\(0\\)” in the appropriate type. There is also a similar one function in base Julia.)\nThe plot! call adds a layer. We could still specify the limits for the plot, though as this can be computed from the figure, to plot zero we let Plots do it.\nFor another example, suppose we wish to plot the function \\(f(x)=x\\cdot(x-1)\\) over the interval \\([-1,2]\\) and emphasize with points the fact that \\(0\\) and \\(1\\) are zeros. 
We can do this with three layers: the first to graph the function, the second to emphasize the \\(x\\) axis, the third to graph the points.\n\nf(x) = x*(x-1)\nplot(f, -1, 2, legend=false) # turn off legend\nplot!(zero)\nscatter!([0,1], [0,0])\n\n\n\n\nThe \\(3\\) main functions used in these notes for adding layers are:\n\nplot!(f, a, b) to add the graph of the function f; also plot!(xs, ys)\nscatter!(xs, ys) to add points \\((x_1, y_1), (x_2, y_2), \\dots\\).\nannotate!((x,y, label)) to add a label at \\((x,y)\\)\n\n\n\n\n\n\n\nWarning\n\n\n\nJulia has a convention to use functions named with a ! suffix to indicate that they mutate some object. In this case, the object is the current graph, though it is implicit. The functions plot!, scatter!, and annotate! (and others) do this by adding a layer."
},
{
"objectID": "precalc/plotting.html#additional-arguments",
"href": "precalc/plotting.html#additional-arguments",
"title": "8  The Graph of a Function",
"section": "8.4 Additional arguments",
"text": "8.4 Additional arguments\nThe Plots package provides many arguments for adjusting a graphic; here we mention just a few of the attributes:\n\nplot(..., title=\"main title\", xlab=\"x axis label\", ylab=\"y axis label\"): add title and label information to a graphic\nplot(..., color=\"green\"): this argument can be used to adjust the color of the drawn figure (color can be a string,\"green\", or a symbol, :green, among other specifications)\nplot(..., linewidth=5): this argument can be used to adjust the width of drawn lines\nplot(..., xlims=(a,b), ylims=(c,d)): either or both xlims and ylims can be used to control the viewing window\nplot(..., linestyle=:dash): will change the line style of the plotted lines to dashed lines. Also :dot, …\nplot(..., aspect_ratio=:equal): will keep \\(x\\) and \\(y\\) axis on same scale so that squares look square.\nplot(..., legend=false): by default, different layers will be indicated with a legend, this will turn off this feature\nplot(..., label=\"a label\") the label attribute will show up when a legend is present. Using an empty string, \"\", will suppress adding the layer to the legend.\n\nFor plotting points with scatter, or scatter! the markers can be adjusted via\n\nscatter(..., markersize=5): increase marker size\nscatter(..., marker=:square): change the marker (uses a symbol, not a string to specify)\n\nOf course, zero, one, or more of these can be used on any given call to plot, plot!, scatter or scatter!."
},
{
"objectID": "precalc/plotting.html#graphs-of-parametric-equations",
"href": "precalc/plotting.html#graphs-of-parametric-equations",
"title": "8  The Graph of a Function",
"section": "8.5 Graphs of parametric equations",
"text": "8.5 Graphs of parametric equations\nIf we have two functions \\(f(x)\\) and \\(g(x)\\) there are a few ways to investigate their joint behavior. As just mentioned, we can graph both \\(f\\) and \\(g\\) over the same interval using layers. Such a graph allows an easy comparison of the shape of the two functions and can be useful in solving \\(f(x) = g(x)\\). For the latter, the graph of \\(h(x) = f(x) - g(x)\\) is also of value: solutions to \\(f(x)=g(x)\\) appear as crossing points on the graphs of f and g, whereas they appear as zeros (crossings of the \\(x\\)-axis) when h is plotted.\nA different graph can be made to compare the two functions side-by-side. This is a parametric plot. Rather than plotting points \\((x,f(x))\\) and \\((x,g(x))\\) with two separate graphs, the graph consists of points \\((f(x), g(x))\\). We illustrate with some examples below:\n\nExample\nThe most “famous” parametric graph is one that is likely already familiar, as it follows the parametrization of points on the unit circle by the angle made between the \\(x\\) axis and the ray from the origin through the point. (If not familiar, this will soon be discussed in these notes.)\n\n𝒇(x) = cos(x); 𝒈(x) = sin(x)\n𝒕s = range(0, 2pi, length=100)\nplot(𝒇.(𝒕s), 𝒈.(𝒕s), aspect_ratio=:equal) # make equal axes\n\n\n\n\nAny point \\((a,b)\\) on this graph is represented by \\((\\cos(t), \\sin(t))\\) for some value of \\(t\\), and in fact multiple values of \\(t\\), since \\(t + 2k\\pi\\) will produce the same \\((a,b)\\) value as \\(t\\) will.\nMaking the parametric plot is similar to creating a plot using lower level commands. 
There, a sequence of values is generated to approximate the \\(x\\) values in the graph (xs), a set of commands to create the corresponding function values (e.g., f.(xs)), and some instruction on how to represent the values, in this case with lines connecting the points (the default for plot for two sets of numbers).\nIn this next plot, the angle values are chosen to be the familiar ones, so the mechanics of the graph can be emphasized. Only the upper half is plotted; the values used are:\n\nθ: 0, pi/6, pi/4, pi/3, pi/2, 2pi/3, 3pi/4, 5pi/6, pi\nx = cos(θ): 1, sqrt(3)/2, sqrt(2)/2, 1/2, 0, -1/2, -sqrt(2)/2, -sqrt(3)/2, -1\ny = sin(θ): 0, 1/2, sqrt(2)/2, sqrt(3)/2, 1, sqrt(3)/2, sqrt(2)/2, 1/2, 0\n\n\nθs =[0, pi/6, pi/4, pi/3, pi/2, 2pi/3, 3pi/4, 5pi/6, pi]\nplot(𝒇.(θs), 𝒈.(θs), legend=false, aspect_ratio=:equal)\nscatter!(𝒇.(θs), 𝒈.(θs))\n\n\n\n\n\nAs with the plot of a univariate function, there is a convenience interface for these plots - just pass the two functions in:\n\nplot(𝒇, 𝒈, 0, 2pi, aspect_ratio=:equal)\n\n\n\n\n\n\nExample\nLooking at growth. Comparing \\(x^2\\) with \\(x^3\\) can run into issues, as the scale gets big:\n\nx²(x) = x^2\nx³(x) = x^3\nplot(x², 0, 25)\nplot!(x³, 0, 25)\n\n\n\n\nIn the above, x³ is already \\(25\\) times larger on the scale of \\([0,25]\\) and this only gets worse if the viewing window were to get larger. However, the parametric graph is quite different:\n\nplot(x², x³, 0, 25)\n\n\n\n\nIn this graph, since \\(x^3/x^2 = x\\), the ratio stays reasonable as \\(x\\) gets large.\n\n\nExample\nParametric plots are useful to compare the ratio of values near a point. In the above example, we see how this is helpful for large x. This example shows it is convenient for a fixed x, in this case x=0.\nPlot \\(f(x) = x^3\\) and \\(g(x) = x - \\sin(x)\\) around \\(x=0\\):\n\nf(x) = x^3\ng(x) = x - sin(x)\nplot(f, g, -pi/2, pi/2)\n\n\n\n\nThis graph is nearly a straight line. 
At the point \\((0,0)=(f(0), g(0))\\), we see that both functions are behaving in a similar manner, though the slope is not \\(1\\), so they do not increase at exactly the same rate.\n\n\nExample: Etch A Sketch\nEtch A Sketch is a drawing toy where two knobs control the motion of a pointer, one knob controlling the \\(x\\) motion, the other the \\(y\\) motion. The trace of the movement of the pointer is recorded until the display is cleared by shaking. Shake to clear is now a motion incorporated by some smart-phone apps.\nPlaying with the toy makes a few things clear:\n\nTwisting just the left knob (the horizontal or \\(x\\) motion) will move the pointer left or right, leaving a horizontal line. Parametrically, this would follow the equations \\(f(t) = \\xi(t)\\) for some \\(\\xi\\) and \\(g(t) = c\\).\nTwisting just the right knob (the vertical or \\(y\\) motion) will move the pointer up or down, leaving a vertical line. Parametrically, this would follow the equations \\(f(t) = c\\) and \\(g(t) = \\psi(t)\\) for some \\(\\psi\\).\nDrawing a line with a slope different from \\(0\\) or \\(\\infty\\) requires moving both knobs at the same time. A \\(45\\)\\(^\\circ\\) line with slope \\(m=1\\) can be made by twisting both at the same rate, say through \\(f(t) = ct\\), \\(g(t)=ct\\). It doesnt matter how big \\(c\\) is, just that it is the same for both \\(f\\) and \\(g\\). Creating a different slope is done by twisting at different rates, say \\(f(t)=ct\\) and \\(g(t)=dt\\). The slope of the resulting line will be \\(d/c\\).\nDrawing a curve is done by twisting the two knobs with varying rates.\n\nThese all apply to parametric plots, as the Etch A Sketch trace is no more than a plot of \\((f(t), g(t))\\) over some range of values for \\(t\\), where \\(f\\) describes the movement in time of the left knob and \\(g\\) the movement in time of the right.\nNow, we revisit the last problem in this context. 
We saw in the last problem that the parametric graph was nearly a line - so close the eye cant really tell otherwise. That means that the growth in both \\(f(t) = t^3\\) and \\(g(t)=t - \\sin(t)\\) for \\(t\\) around \\(0\\) is in a nearly fixed ratio, as otherwise the graph would have more curve in it.\n\n\nExample: Spirograph\nParametric plots can describe a richer set of curves than can plots of functions. Plots of functions must pass the “vertical-line test”, as there can be at most one \\(y\\) value for a given \\(x\\) value. This is not so for parametric plots, as the circle example above shows. Plotting sines and cosines this way is the basis for the once popular Spirograph toy. The curves drawn there are parametric plots where the functions come from rolling a smaller disc either around the outside or inside of a larger disc.\nHere is an example using a parameterization provided on the Wikipedia page where \\(R\\) is the radius of the larger disc, \\(r\\) the radius of the smaller disc and \\(\\rho < r\\) indicating the position of the pencil within the smaller disc.\n\nR, r, rho = 1, 1/4, 1/4\nf(t) = (R-r) * cos(t) + rho * cos((R-r)/r * t)\ng(t) = (R-r) * sin(t) - rho * sin((R-r)/r * t)\n\nplot(f, g, 0, max((R-r)/r, r/(R-r))*2pi)\n\n\n\n\nIn the above, one can fix \\(R=1\\). Then different values for r and rho will produce different graphs. These graphs will be periodic if \\((R-r)/r\\) is rational. (Nothing about these equations requires \\(\\rho < r\\).)"
},
{
"objectID": "precalc/plotting.html#questions",
"href": "precalc/plotting.html#questions",
"title": "8  The Graph of a Function",
"section": "8.6 Questions",
"text": "8.6 Questions\n\nQuestion\nPlot the function \\(f(x) = x^3 - x\\). When is the function positive?\n\n\n\n \n \n \n \n \n \n \n \n \n (-1, 0) and (1, Inf)\n \n \n\n\n \n \n \n \n (-Inf, -0.577) and (0.577, Inf)\n \n \n\n\n \n \n \n \n (-Inf, -1) and (0,1)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nPlot the function \\(f(x) = 3x^4 + 8x^3 - 18x^2\\). Where (what \\(x\\) value) is the smallest value? (That is, for which input \\(x\\) is the output \\(f(x)\\) as small as possible.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nPlot the function \\(f(x) = 3x^4 + 8x^3 - 18x^2\\). When is the function increasing?\n\n\n\n \n \n \n \n \n \n \n \n \n (-3, 0) and (1, Inf)\n \n \n\n\n \n \n \n \n (-Inf, -3) and (0, 1)\n \n \n\n\n \n \n \n \n (-Inf, -4.1) and (1.455, Inf)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nGraphing both f and the line \\(y=0\\) helps focus on the zeros of f. When f(x)=log(x)-2, plot f and the line \\(y=0\\). Identify the lone zero.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nPlot the function \\(f(x) = x^3 - x\\) over \\([-2,2]\\). How many zeros are there?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe function \\(f(x) = (x^3 - 2x) / (2x^2 -10)\\) is a rational function with issues when \\(2x^2 = 10\\), or \\(x = -\\sqrt{5}\\) or \\(\\sqrt{5}\\).\nPlot this function from \\(-5\\) to \\(5\\). How many times does it cross the \\(x\\) axis?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA trash collection plan charges a flat rate of 35 dollars a month for the first 10 bags of trash and is 4 dollars a bag thereafter. Which function will model this:\n\n\n\n \n \n \n \n \n \n \n \n \n f(x) = x <= 4 ? 35.0 : 35.0 + 10.0 * (x-4)\n \n \n\n\n \n \n \n \n f(x) = x <= 10 ? 35.0 : 35.0 + 4.0 * (x-10)\n \n \n\n\n \n \n \n \n f(x) = x <= 35.0 ? 
10.0 : 10.0 + 35.0 * (x-4)\n\nMake a plot of the model. Graphically estimate how many bags of trash will cost 55 dollars.\n\n\nQuestion\nPlot the functions \\(f(x) = \\cos(x)\\) and \\(g(x) = x\\). Estimate the \\(x\\) value where the two graphs intersect.\n\n\nQuestion\nThe fact that only a finite number of points are used in a graph can introduce artifacts. An example can appear when plotting sinusoidal functions, such as the graph of f(x) = sin(500*pi*x) over [0,1].\nMake its graph using 250 evenly spaced points, as follows:\nxs = range(0, 1, length=250)\nf(x) = sin(500*pi*x)\nplot(xs, f.(xs))\nWhat is seen?\n\nOddly, it looks exactly like the graph of \\(f(x) = \\sin(2\\pi x)\\).\nIt oscillates wildly, as the period is \\(T=2\\pi/(500 \\pi)\\) so there are 250 oscillations.\nIt should oscillate evenly, but instead doesn't oscillate very much near 0 and 1\n\nThe algorithm to plot a function works to avoid aliasing issues. Does the graph generated by plot(f, 0, 1) look the same as the one above?\n\nNo, but it still looks pretty bad, as fitting 250 periods into too small a number of pixels is a problem.\nYes\nNo, the graph shows clearly all 250 periods.\n\n\nQuestion\nMake this parametric plot for the specific values of the parameters R, r, and rho. 
What shape best describes it?\nR, r, rho = 1, 3/4, 1/4\nf(t) = (R-r) * cos(t) + rho * cos((R-r)/r * t)\ng(t) = (R-r) * sin(t) - rho * sin((R-r)/r * t)\n\nplot(f, g, 0, max((R-r)/r, r/(R-r))*2pi, aspect_ratio=:equal)\n\n\n\n \n \n \n \n \n \n \n \n \n Four sharp points, like a star\n \n \n\n\n \n \n \n \n Four petals, like a flower\n \n \n\n\n \n \n \n \n An ellipse\n \n \n\n\n \n \n \n \n A straight line\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor these next questions, we use this function:\n\nfunction spirograph(R, r, rho)\n f(t) = (R-r) * cos(t) + rho * cos((R-r)/r * t)\n g(t) = (R-r) * sin(t) - rho * sin((R-r)/r * t)\n\n plot(f, g, 0, max((R-r)/r, r/(R-r))*2pi, aspect_ratio=:equal)\nend\n\nspirograph (generic function with 1 method)\n\n\nMake this plot for the following specific values of the parameters R, r, and rho. What shape best describes it?\nR, r, rho = 1, 3/4, 1/4\n\n\n\n \n \n \n \n \n \n \n \n \n Four sharp points, like a star\n \n \n\n\n \n \n \n \n Four petals, like a flower\n \n \n\n\n \n \n \n \n An ellipse\n \n \n\n\n \n \n \n \n A straight line\n \n \n\n\n \n \n \n \n None of the above\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nMake this plot for the following specific values of the parameters R, r, and rho. What shape best describes it?\nR, r, rho = 1, 1/2, 1/4\n\n\n\n \n \n \n \n \n \n \n \n \n Four sharp points, like a star\n \n \n\n\n \n \n \n \n Four petals, like a flower\n \n \n\n\n \n \n \n \n An ellipse\n \n \n\n\n \n \n \n \n A straight line\n \n \n\n\n \n \n \n \n None of the above\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nMake this plot for the specific values of the parameters R, r, and rho. 
What shape best describes it?\nR, r, rho = 1, 1/4, 1\n\n\n\n \n \n \n \n \n \n \n \n \n Four sharp points, like a star\n \n \n\n\n \n \n \n \n Four petals, like a flower\n \n \n\n\n \n \n \n \n A circle\n \n \n\n\n \n \n \n \n A straight line\n \n \n\n\n \n \n \n \n None of the above\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nMake this plot for the specific values of the parameters R, r, and rho. What shape best describes it?\nR, r, rho = 1, 1/8, 1/4\n\n\n\n \n \n \n \n \n \n \n \n \n Four sharp points, like a star\n \n \n\n\n \n \n \n \n Four petals, like a flower\n \n \n\n\n \n \n \n \n A circle\n \n \n\n\n \n \n \n \n A straight line\n \n \n\n\n \n \n \n \n None of the above"
},
{
"objectID": "precalc/plotting.html#technical-note",
"href": "precalc/plotting.html#technical-note",
"title": "8  The Graph of a Function",
"section": "8.7 Technical note",
"text": "8.7 Technical note\nThe slow “time to first plot” in Julia is a well-known hiccup that is related to how Julia can be so fast. Loading Plots and making the first plot are both somewhat time consuming, though the second and subsequent plots are speedy. Why?\nJulia is an interactive language that attains its speed by compiling functions on the fly using the LLVM compiler. When Julia encounters a new combination of a function method and argument types it will compile and cache a function for subsequent speedy execution. The first plot is slow, as there are many internal functions that get compiled. This has sped up of late, as excessive recompilations have been trimmed down, but still has a way to go. This is different from “precompilation” which also helps trim down time for initial executions. There are also some more technically challenging means to create Julia images for faster start up that can be pursued if needed."
},
{
"objectID": "precalc/transformations.html",
"href": "precalc/transformations.html",
"title": "9  Function manipulations",
"section": "",
"text": "In this section we will use these add-on packages:\nThinking of functions as objects themselves that can be manipulated - rather than just black boxes for evaluation - is a major abstraction of calculus. The main operations to come: the limit of a function, the derivative of a function, and the integral of a function all operate on functions. Hence the idea of an operator. Here we discuss manipulations of functions from pre-calculus that have proven to be useful abstractions."
},
{
"objectID": "precalc/transformations.html#the-algebra-of-functions",
"href": "precalc/transformations.html#the-algebra-of-functions",
"title": "9  Function manipulations",
"section": "9.1 The algebra of functions",
"text": "9.1 The algebra of functions\nWe can talk about the algebra of functions. For example, the sum of functions \\(f\\) and \\(g\\) would be a function whose value at \\(x\\) was just \\(f(x) + g(x)\\). More formally, we would have:\n\\[\n(f + g)(x) = f(x) + g(x).\n\\]\nWe have given meaning to a new function \\(f+g\\) by defining what it does to \\(x\\) with the rule on the right hand side. Similarly, we can define operations for subtraction, multiplication, division, and powers.\nThese mathematical concepts arent defined for functions in base Julia, though they could be if desired, by commands such as:\n\nimport Base: +\nf::Function + g::Function = x -> f(x) + g(x)\n\n+ (generic function with 314 methods)\n\n\nThis adds a method to the generic + function for functions. The type annotations ::Function ensure this applies only to functions. To see that it would work, we could do odd-looking things like:\n\nss = sin + sqrt\nss(4)\n\n1.2431975046920718\n\n\nDoing this works, as Julia treats functions as first class objects, lending itself to higher order programming. However, this definition in general is kind of limiting, as functions in mathematics and Julia can be much more varied than just the univariate functions we have defined addition for. We wont pursue this further.\n\n9.1.1 Composition of functions\nAs seen, just like with numbers, it can make sense mathematically to define addition, subtraction, multiplication and division of functions. Unlike numbers though, we can also define a new operation on functions called composition that involves chaining the output of one function to the input of another. Composition is a common practice in life, where the result of some act is fed into another process. For example, making a pie from scratch involves first making a crust, then composing this with a filling. A better abstraction might be how we “surf” the web. 
The output of one search leads us to another search whose output then is a composition.\nMathematically, a composition of univariate functions \\(f\\) and \\(g\\) is written \\(f \\circ g\\) and defined by what it does to a value in the domain of \\(g\\) by:\n\\[\n(f \\circ g)(x) = f(g(x)).\n\\]\nThe output of \\(g\\) becomes the input of \\(f\\).\nComposition depends on the order of things. There is no guarantee that \\(f \\circ g\\) should be the same as \\(g \\circ f\\). (Putting on socks then shoes is quite different from putting on shoes then socks.) Mathematically, we can see this quite clearly with the functions \\(f(x) = x^2\\) and \\(g(x) = \\sin(x)\\). Algebraically we have:\n\\[\n(f \\circ g)(x) = \\sin(x)^2, \\quad (g \\circ f)(x) = \\sin(x^2).\n\\]\nThough they may be typographically similar, dont be fooled: the following graph shows that the two functions arent even close except for \\(x\\) near \\(0\\) (for example, one composition is always non-negative, whereas the other is not):\n\nf(x) = x^2\ng(x) = sin(x)\nfg = f ∘ g # typed as f \\circ[tab] g\ngf = g ∘ f # typed as g \\circ[tab] f\nplot(fg, -2, 2, label=\"f∘g\")\nplot!(gf, label=\"g∘f\")\n\n\n\n\n\n\n\n\n\n\nNote\n\n\n\nUnlike how the basic arithmetic operations are treated, Julia defines the infix Unicode operator \\\\circ[tab] to represent composition of functions, mirroring mathematical notation. This infix operation takes in two functions and returns an anonymous function. It can be useful and will mirror standard mathematical usage up to issues with precedence rules.\n\n\nStarting with two functions and composing them requires nothing more than a solid grasp of knowing the rules of function evaluation. 
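The order-dependence seen in the graph can also be confirmed numerically by evaluating both compositions at a single point; this small check uses nothing but function evaluation:

```julia
f(x) = x^2
g(x) = sin(x)

# f ∘ g squares after applying sin; g ∘ f applies sin to the square.
fg = f ∘ g
gf = g ∘ f

fg(2) == sin(2)^2    # true: (f∘g)(2) = f(g(2))
gf(2) == sin(2^2)    # true: (g∘f)(2) = g(f(2))
fg(2) == gf(2)       # false: composition is order-dependent
```

Evaluating at a point like this is a quick sanity check when forming compositions by hand.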
If \\(f(x)\\) is defined by some rule involving \\(x\\), then \\(f(g(x))\\) just replaces each \\(x\\) in the rule with a \\(g(x)\\).\nSo if \\(f(x) = x^2 + 2x - 1\\) and \\(g(x) = e^x - x\\) then \\(f \\circ g\\) would be (before any simplification)\n\\[\n(f \\circ g)(x) = (e^x - x)^2 + 2(e^x - x) - 1.\n\\]\nIt can be helpful to think of the argument to \\(f\\) as a “box” that gets filled in by \\(g\\):\n\\[\n\\begin{align*}\ng(x) &=e^x - x\\\\\nf(\\square) &= (\\square)^2 + 2(\\square) - 1\\\\\nf(g(x)) &= (g(x))^2 + 2(g(x)) - 1 = (e^x - x)^2 + 2(e^x - x) - 1.\n\\end{align*}\n\\]\nHere we look at a few compositions:\n\nThe function \\(h(x) = \\sqrt{1 - x^2}\\) can be seen as \\(f\\circ g\\) with \\(f(x) = \\sqrt{x}\\) and \\(g(x) = 1-x^2\\).\nThe function \\(h(x) = \\sin(x/3 + x^2)\\) can be viewed as \\(f\\circ g\\) with \\(f(x) = \\sin(x)\\) and \\(g(x) = x/3 + x^2\\).\nThe function \\(h(x) = e^{-1/2 \\cdot x^2}\\) can be viewed as \\(f\\circ g\\) with \\(f(x) = e^{-x}\\) and \\(g(x) = (1/2) \\cdot x^2\\).\n\nDecomposing a function into a composition of functions is not unique; other compositions could have been given above. For example, the last function is also \\(f(x) = e^{-x/2}\\) composed with \\(g(x) = x^2\\).\n\n\n\n\n\n\nNote\n\n\n\nThe real value of composition is to break down more complicated things into a sequence of easier steps. This is good mathematics, but also good practice more generally. For example, when we approach a problem with the computer, we generally use a smallish set of functions and piece them together (that is, compose them) to find a solution.\n\n\n\n\n9.1.2 Shifting and scaling graphs\nIt is very useful to mentally categorize functions within families. The difference between \\(f(x) = \\cos(x)\\) and \\(g(x) = 12\\cos(2(x - \\pi/4))\\) is not that much - both are cosine functions, one is just a simple enough transformation of the other. 
As such, we expect bounded, oscillatory behaviour with the details of how large and how fast the oscillations are to depend on the specifics of the function. Similarly, both these functions \\(f(x) = 2^x\\) and \\(g(x)=e^x\\) behave like exponential growth, the difference being only in the rate of growth. There are families of functions that are qualitatively similar, but quantitatively different, linked together by a few basic transformations.\nThere is a set of operations on functions which does not really change the type of function. Rather, it basically moves and stretches how the functions are graphed. We discuss these four main transformations of \\(f\\):\n\n\n\n\n\n\nTransformation | Description\n\nvertical shifts\nThe function \\(h(x) = k + f(x)\\) will have the same graph as \\(f\\) shifted up by \\(k\\) units.\n\nhorizontal shifts\nThe function \\(h(x) = f(x - k)\\) will have the same graph as \\(f\\) shifted right by \\(k\\) units.\n\nstretching\nThe function \\(h(x) = kf(x)\\) will have the same graph as \\(f\\) stretched by a factor of \\(k\\) in the \\(y\\) direction.\n\nscaling\nThe function \\(h(x) = f(kx)\\) will have the same graph as \\(f\\) compressed horizontally by a factor of \\(1\\) over \\(k\\).\n\n\n\n\n\n\n\nThe functions \\(h\\) are derived from \\(f\\) in a predictable way. To implement these transformations within Julia, we define operators (functions which transform one function into another). As these return functions, the function bodies are anonymous functions. The basic definitions are similar, save for the x -> ... 
part that signals the creation of an anonymous function to return:\n\nup(f, k) = x -> f(x) + k\nover(f, k) = x -> f(x - k)\nstretch(f, k) = x -> k * f(x)\nscale(f, k) = x -> f(k * x)\n\nscale (generic function with 1 method)\n\n\nTo illustrate, let's define a hat-shaped function as follows:\n\n𝒇(x) = max(0, 1 - abs(x))\n\n𝒇 (generic function with 1 method)\n\n\nA plot over the interval \\([-2,2]\\) is shown here:\n\nplot(𝒇, -2,2)\n\n\n\n\nThe same graph of \\(f\\) and its image shifted up by \\(2\\) units would be given by:\n\nplot(𝒇, -2, 2, label=\"f\")\nplot!(up(𝒇, 2), label=\"up\")\n\n\n\n\nA graph of \\(f\\) and its shift over by \\(2\\) units would be given by:\n\nplot(𝒇, -2, 4, label=\"f\")\nplot!(over(𝒇, 2), label=\"over\")\n\n\n\n\nA graph of \\(f\\) and its stretch by a factor of \\(2\\) would be given by:\n\nplot(𝒇, -2, 2, label=\"f\")\nplot!(stretch(𝒇, 2), label=\"stretch\")\n\n\n\n\nFinally, a graph of \\(f\\) and its scaling by \\(2\\) would be given by:\n\nplot(𝒇, -2, 2, label=\"f\")\nplot!(scale(𝒇, 2), label=\"scale\")\n\n\n\n\nScaling by \\(2\\) shrinks the non-zero domain, scaling by \\(1/2\\) would stretch it. If this is not intuitive, the definition x -> f(x/c) could have been used, which would have opposite behaviour for scaling.\n\nMore exciting is what happens if we combine these operations.\nA shift right by \\(2\\) and up by \\(1\\) is achieved through\n\nplot(𝒇, -2, 4, label=\"f\")\nplot!(up(over(𝒇,2), 1), label=\"over and up\")\n\n\n\n\nShifting and scaling can be confusing. Here we graph scale(over(𝒇,2),1/3):\n\nplot(𝒇, -1,9, label=\"f\")\nplot!(scale(over(𝒇,2), 1/3), label=\"over and scale\")\n\n\n\n\nThis graph is over by \\(6\\) with a width of \\(3\\) on each side of the center. 
Mathematically, we have \\(h(x) = f((1/3)\\cdot x - 2)\\)\nCompare this to the same operations in opposite order:\n\nplot(𝒇, -1, 5, label=\"f\")\nplot!(over(scale(𝒇, 1/3), 2), label=\"scale and over\")\n\n\n\n\nThis graph first scales the symmetric graph, stretching from \\(-3\\) to \\(3\\), then shifts over right by \\(2\\). The resulting function is \\(f((1/3)\\cdot (x-2))\\).\nAs a last example, following up on the last example, a common transformation mathematically is\n\\[\nh(x) = \\frac{1}{a}f(\\frac{x - b}{a}).\n\\]\nWe can view this as a composition of “scale” by \\(1/a\\), then “over” by \\(b\\), and finally “stretch” by \\(1/a\\):\n\na = 2; b = 5\n𝒉(x) = stretch(over(scale(𝒇, 1/a), b), 1/a)(x)\nplot(𝒇, -1, 8, label=\"f\")\nplot!(𝒉, label=\"h\")\n\n\n\n\n(This transformation keeps the same amount of area in the triangles, can you tell from the graph?)\n\nExample\nA model for the length of a day in New York City must take into account periodic seasonal effects. A simple model might be a sine curve. However, there would need to be many modifications: Obvious ones would be that the period would need to be about \\(365\\) days, the oscillation around \\(12\\) and the amplitude of the oscillations no more than \\(12\\).\nWe can be more precise. According to dateandtime.info in \\(2015\\) the longest day will be June \\(21\\)st when there will be \\(15\\)h \\(5\\)m \\(46\\)s of sunlight, the shortest day will be December \\(21\\)st when there will be \\(9\\)h \\(15\\)m \\(19\\)s of sunlight. On January \\(1\\), there will be \\(9\\)h \\(19\\)m \\(44\\)s of sunlight.\nA model for a transformed sine curve is\n\\[\na + b\\sin(d(x - c))\n\\]\nWhere \\(b\\) is related to the amplitude, \\(c\\) the shift and the period is \\(T=2\\pi/d\\). 
We can find some of these easily from the above:\n\na = 12\nb = ((15 + 5/60 + 46/60/60) - (9 + 19/60 + 44/60/60)) / 2\nd = 2pi/365\n\n0.01721420632103996\n\n\nIf we let January \\(1\\) be \\(x=0\\) then the first day of spring, March \\(21\\), is day \\(80\\) (Date(2017, 3, 21) - Date(2017, 1, 1) + 1). This day aligns with the shift of the sine curve. This shift is \\(80\\):\n\nc = 80\n\n80\n\n\nPutting this together, we have that our graph is “scaled” by \\(d\\), “over” by \\(c\\), “stretched” by \\(b\\) and “up” by \\(a\\). Here we plot it over slightly more than one year so that we can see that the shortest day of light is in late December (\\(x \\approx -10\\) or \\(x \\approx 355\\)).\n\nnewyork(t) = up(stretch(over(scale(sin, d), c), b), a)(t)\nplot(newyork, -20, 385)\n\n\n\n\nTo test, if we match up with the model powering dateandtime.info, we note that it predicts “\\(15\\)h \\(0\\)m \\(4\\)s” on July \\(4\\), \\(2015\\). This is day \\(185\\) (Date(2015, 7, 4) - Date(2015, 1, 1) + 1). Our model prediction has a difference of\n\ndatetime = 15 + 0/60 + 4/60/60\ndelta = (newyork(185) - datetime) * 60\n\n-11.874016679895263\n\n\nThis is off by a fair amount - almost \\(12\\) minutes. Clearly a trigonometric model, based on the assumption of circular motion of the earth around the sun, is not accurate enough for precise work, but it does help one understand how summer days are longer than winter days and how the length of a day changes fastest at the spring and fall equinoxes.\n\n\nExample: a growth model in fisheries\nThe von Bertalanffy growth equation is \\(L(t) = L_\\infty \\cdot (1 - e^{-k\\cdot(t-t_0)})\\). This family of functions can be viewed as a transformation of the exponential function \\(f(t)=e^t\\). Part of it is a scaling and shifting (the \\(e^{-k \\cdot (t - t_0)}\\)) along with some shifting and stretching. 
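As a hedged sketch (using the conventional form \\(L(t) = L_\\infty(1 - e^{-k(t-t_0)})\\), with made-up parameter values for illustration, not data), this family can be assembled from the transformation operators used above:

```julia
# Sketch: build the von Bertalanffy family out of the transformation operators.
# Parameter values below are illustrative only.
up(f, k)      = x -> f(x) + k
over(f, k)    = x -> f(x - k)
stretch(f, k) = x -> k * f(x)
scale(f, k)   = x -> f(k * x)

Linf, k, t0 = 100.0, 0.2, 0.0
inner = over(scale(exp, -k), t0)              # t -> exp(-k * (t - t0))
L = stretch(up(stretch(inner, -1), 1), Linf)  # t -> Linf * (1 - exp(-k * (t - t0)))
L(0.0)   # no length at t = t0; L(t) approaches Linf as t grows
```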
The various parameters have physical importance which can be measured: \\(L_\\infty\\) is a carrying capacity for the species or organism, and \\(k\\) is a rate of growth. These parameters may be estimated from data by finding the “closest” curve to a given data set.\n\n\nExample: the pipeline operator\nIn the last example, we described our sequence as scale, over, stretch, and up, but code this in reverse order, as the composition \\(f \\circ g\\) is done from right to left. A more convenient notation would be to have syntax that allows the composition of \\(g\\) then \\(f\\) to be written \\(x \\rightarrow g \\rightarrow f\\). Julia provides the pipeline operator for chaining function calls together.\nFor example, if \\(g(x) = \\sqrt{x}\\) and \\(f(x) =\\sin(x)\\) we could call \\(f(g(x))\\) through:\n\ng(x) = sqrt(x)\nf(x) = sin(x)\npi/2 |> g |> f\n\n0.9500244274657834\n\n\nThe output of the preceding expression is passed as the input to the next. This notation is especially convenient when the enclosing function is not the main focus. (Some programming languages have more developed fluent interfaces for chaining function calls. Julia has more powerful chaining macros provided in packages, such as DataPipes.jl or Chain.jl.)\n\n\n\n9.1.3 Operators\nThe functions up, over, etc. are operators that take a function as an argument and return a function. The use of operators fits in with the template action(f, args...). The action is what we are doing, such as plot, over, and others to come. The function f here is just an object that we are performing the action on. For example, a plot takes a function and renders a graph using the additional arguments to select the domain to view, etc.\nCreating operators that return functions involves the use of anonymous functions, using these operators is relatively straightforward. 
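To give one more operator in this spirit (a sketch; reflect is not a standard name, just an illustration of an operator that takes a function and returns a function):

```julia
# Sketch: an operator like up/over/stretch/scale above.
# reflect(f) returns a function whose graph is that of f flipped about the y axis.
reflect(f) = x -> f(-x)

h(x) = max(0, 1 - abs(x - 1))   # a hat-shaped bump centered at x = 1
r = reflect(h)                  # the same bump, now centered at x = -1
r(-1)                           # same height as h(1)
```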
Two basic patterns are:\n\nStoring the returned function, then calling it:\n\nl(x) = action1(f, args...)(x)\nl(10)\n\nComposing two operators:\n\naction2(action1(f, args...), other_args...)\nComposition like the above is convenient, but can get confusing if more than one composition is involved.\n\nExample: two operators\n(See Krill for background on this example.) Consider two operations on functions. The first takes the difference between adjacent points. We call this D:\n\nD(f::Function) = k -> f(k) - f(k-1)\n\nD (generic function with 1 method)\n\n\nTo see that it works, we take a typical function\n\n𝐟(k) = 1 + k^2\n\n𝐟 (generic function with 1 method)\n\n\nand check:\n\nD(𝐟)(3), 𝐟(3) - 𝐟(3-1)\n\n(5, 5)\n\n\nThat the two are the same value is no coincidence. (Again, pause for a second to make sure you understand why D(f)(3) makes sense. If this is unclear, you could name the function D(f) and then call this with a value of 3.)\nNow we want a function to cumulatively sum the values \\(S(f)(k) = f(1) + f(2) + \\cdots + f(k-1) + f(k)\\), as a function of \\(k\\). Adding up \\(k\\) terms is easy to do with a generator and the function sum:\n\nS(f) = k -> sum(f(i) for i in 1:k)\n\nS (generic function with 1 method)\n\n\nTo check if this works as expected, compare these two values:\n\nS(𝐟)(4), 𝐟(1) + 𝐟(2) + 𝐟(3) + 𝐟(4)\n\n(34, 34)\n\n\nSo one function adds, the other subtracts. Addition and subtraction are in some sense inverse to each other, so they should “cancel” out. This holds for these two operations as well, in the following sense: subtracting after adding leaves the function alone:\n\nk = 10 # some arbitrary value k >= 1\nD(S(𝐟))(k), 𝐟(k)\n\n(101, 101)\n\n\nAny positive integer value of k will give the same answer (up to overflow). This says the difference of the accumulation process is just the last value to accumulate.\nAdding after subtracting also leaves the function alone, save for a vestige of \\(f(0)\\). 
For example, k=15:\n\nS(D(𝐟))(15), 𝐟(15) - 𝐟(0)\n\n(225, 225)\n\n\nThat is, the accumulation of differences is just the difference of the end values.\nThese two operations are discrete versions of the two main operations of calculus - the derivative and the integral. This relationship will be known as the “fundamental theorem of calculus.”"
},
{
"objectID": "precalc/transformations.html#questions",
"href": "precalc/transformations.html#questions",
"title": "9  Function manipulations",
"section": "9.2 Questions",
"text": "9.2 Questions\n\nQuestion\nIf \\(f(x) = 1/x\\) and \\(g(x) = x-2\\), what is \\(g(f(x))\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(1/(x-2)\\)\n \n \n\n\n \n \n \n \n \\(1/x - 2\\)\n \n \n\n\n \n \n \n \n \\(-2\\)\n \n \n\n\n \n \n \n \n \\(x - 2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIf \\(f(x) = e^{-x}\\) and \\(g(x) = x^2\\) and \\(h(x) = x-3\\), what is \\(f \\circ g \\circ h\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\((e^x -3)^2\\)\n \n \n\n\n \n \n \n \n \\(e^{-(x-3)^2}\\)\n \n \n\n\n \n \n \n \n \\(e^x+x^2+x-3\\)\n \n \n\n\n \n \n \n \n \\(e^{-x^2 - 3}\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIf \\(h(x) = (f \\circ g)(x) = \\sin^2(x)\\) which is a possibility for \\(f\\) and \\(g\\):\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f(x)=\\sin(x); \\quad g(x) = x^2\\)\n \n \n\n\n \n \n \n \n `\\(f(x)=x^2; \\quad g(x) = \\sin(x)\\)\n \n \n\n\n \n \n \n \n \\(f(x)=x^2; \\quad g(x) = \\sin^2(x)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhich function would have the same graph as the sine curve shifted over by 4 and up by 6?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(h(x) = 4 + \\sin(6x)\\)\n \n \n\n\n \n \n \n \n \\(h(x) = 6 + \\sin(x + 4)\\)\n \n \n\n\n \n \n \n \n \\(h(x) = 6 + \\sin(x-4)\\)\n \n \n\n\n \n \n \n \n \\(h(x) = 6\\sin(x-4)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(h(x) = 4x^2\\) and \\(f(x) = x^2\\). 
Which is not true:\n\n\n\n \n \n \n \n \n \n \n \n \n The graph of \\(h(x)\\) is the graph of f(x) shifted up by \\(4\\) units\n \n \n\n\n \n \n \n \n The graph of \\(h(x)\\) is the graph of \\(f(x)\\) stretched by a factor of \\(4\\)\n \n \n\n\n \n \n \n \n The graph of \\(h(x)\\) is the graph of \\(f(x)\\) scaled by a factor of \\(2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe transformation \\(h(x) = (1/a) \\cdot f((x-b)/a)\\) can be viewed in one sequence:\n\n\n\n \n \n \n \n \n \n \n \n \n shifting by \\(a\\), then scaling by \\(b\\), and then scaling by \\(1/a\\)\n \n \n\n\n \n \n \n \n scaling by \\(1/a\\), then shifting by \\(b\\), then stretching by \\(1/a\\)\n \n \n\n\n \n \n \n \n shifting by \\(a\\), then scaling by \\(a\\), and then scaling by \\(b\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThis is the graph of a transformed sine curve.\n\n\n\n\n\nWhat is the period of the graph?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the amplitude of the graph?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the form of the function graphed?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\sin(2x)\\)\n \n \n\n\n \n \n \n \n \\(\\sin(\\pi x)\\)\n \n \n\n\n \n \n \n \n \\(2 \\sin(x)\\)\n \n \n\n\n \n \n \n \n \\(2 \\sin(\\pi x)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider this expression\n\\[\n\\left(f(1) - f(0)\\right) + \\left(f(2) - f(1)\\right) + \\cdots + \\left(f(n) - f(n-1)\\right) =\n-f(0) + f(1) - f(1) + f(2) - f(2) + \\cdots + f(n-1) - f(n-1) + f(n) =\nf(n) - f(0).\n\\]\nReferring to the definitions of D and S in the example on operators, which relationship does this support:\n\n\n\n \n \n \n \n \n \n \n \n \n D(S(f))(n) = f(n)\n \n \n\n\n \n \n \n \n S(D(f))(n) = f(n) - f(0)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider this expression:\n\\[\n\\left(f(1) + f(2) + \\cdots + f(n-1) + f(n)\\right) - \\left(f(1) + f(2) + \\cdots + f(n-1)\\right) = 
f(n).\n\\]\nReferring to the definitions of D and S in the example on operators, which relationship does this support:\n\n\n\n \n \n \n \n \n \n \n \n \n D(S(f))(n) = f(n)\n \n \n\n\n \n \n \n \n S(D(f))(n) = f(n) - f(0)"
},
{
"objectID": "precalc/inversefunctions.html",
"href": "precalc/inversefunctions.html",
"title": "10  The Inverse of a Function",
"section": "",
"text": "In this section we will use these add-on packages:\nA (univariate) mathematical function relates or associates values of \\(x\\) to values \\(y\\) using the notation \\(y=f(x)\\). A key point is a given \\(x\\) is associated with just one \\(y\\) value, though a given \\(y\\) value may be associated with several different \\(x\\) values. (Graphically, this is the vertical line test.)\nWe may conceptualize such a relation in many ways: through an algebraic rule; through the graph of \\(f;\\) through a description of what \\(f\\) does; or through a table of paired values, say. For the moment, lets consider a function as rule that takes in a value of \\(x\\) and outputs a value \\(y\\). If a rule is given defining the function, the computation of \\(y\\) is straightforward. A different question is not so easy: for a given value \\(y\\) what value - or values - of \\(x\\) (if any) produce an output of \\(y\\)? That is, what \\(x\\) value(s) satisfy \\(f(x)=y\\)?\nIf for each \\(y\\) in some set of values there is just one \\(x\\) value, then this operation associates to each value \\(y\\) a single value \\(x\\), so it too is a function. When that is the case we call this an inverse function.\nWhy is this useful? When available, it can help us solve equations. If we can write our equation as \\(f(x) = y\\), then we can “solve” for \\(x\\) through \\(x = g(y)\\), where \\(g\\) is this inverse function.\nLets explore when we can “solve” for an inverse function.\nConsider the graph of the function \\(f(x) = 2^x\\):\nThe graph of a function is a representation of points \\((x,f(x))\\), so to find \\(f(c)\\) from the graph, we begin on the \\(x\\) axis at \\(c\\), move vertically to the graph (the point \\((c, f(c))\\)), and then move horizontally to the \\(y\\) axis, intersecting it at \\(f(c)\\). The figure shows this for \\(c=2\\), from which we can read that \\(f(c)\\) is about \\(4\\). 
This is how an \\(x\\) is associated to a single \\(y\\).\nIf we were to reverse the direction, starting at \\(f(c)\\) on the \\(y\\) axis and then moving horizontally to the graph, and then vertically to the \\(x\\)-axis we end up at a value \\(c\\) with the correct \\(f(c)\\). This operation will form a function if the initial movement horizontally is guaranteed to find no more than one value on the graph. That is, to have an inverse function, there can not be two \\(x\\) values corresponding to a given \\(y\\) value. This observation is often visualized through the “horizontal line test” - the graph of a function with an inverse function can only intersect a horizontal line in at most one place.\nMore formally, a function is called one-to-one if for any two \\(a \\neq b\\), it must be that \\(f(a) \\neq f(b)\\). Many functions are one-to-one, many are not. Familiar one-to-one functions are linear functions (\\(f(x)=a \\cdot x + b\\) with \\(a\\neq 0\\)), odd powers of \\(x\\) (\\(f(x)=x^{2k+1}\\)), and functions of the form \\(f(x)=x^{1/n}\\) for \\(x \\geq 0\\). In contrast, no even function is one-to-one, as \\(f(x) = f(-x)\\) for any nonzero \\(x\\) in the domain of \\(f\\).\nA class of functions that are guaranteed to be one-to-one are the strictly increasing functions (which satisfy \\(a < b\\) implies \\(f(a) < f(b)\\)). Similarly, strictly decreasing functions are one-to-one. The term strictly monotonic is used to describe either strictly increasing or strictly decreasing. By the above observations, strictly monotonic functions will have inverse functions.\nThe function \\(2^x\\), graphed above, is strictly increasing, so it will have an inverse function. That is, we can solve for \\(x\\) in an equation like \\(2^x = 9\\) using the inverse function of \\(f(x) = 2^x\\), provided we can identify the inverse function."
},
{
"objectID": "precalc/inversefunctions.html#how-to-solve-for-an-inverse-function",
"href": "precalc/inversefunctions.html#how-to-solve-for-an-inverse-function",
"title": "10  The Inverse of a Function",
"section": "10.1 How to solve for an inverse function?",
"text": "10.1 How to solve for an inverse function?\nIf we know an inverse function exists, how can we find it?\nIf our function is given by a graph, the process above describes how to find the inverse function.\nHowever, typically we have a rule describing our function. What is the process then? A simple example helps illustrate. The linear function \\(f(x) = 9/5\\cdot x + 32\\) is strictly increasing, hence has an inverse function. What should it be? Lets describe the action of \\(f\\): it multiplies \\(x\\) by \\(9/5\\) and then adds \\(32\\). To “invert” this we first invert the adding of \\(32\\) by subtracting \\(32\\), then we would “invert” multiplying by \\(9/5\\) by dividing by \\(9/5\\). Hence \\(g(x)=(x-32)/(9/5)\\). We would generally simplify this, but lets not for now. If we view a function as a composition of many actions, then we find the inverse by composing the inverse of these actions in reverse order. The reverse order might seem confusing, but this is how we get dressed and undressed: to dress we put on socks and then shoes. 
To undress we take off the shoes and then take off the socks.\nWhen we solve algebraically for \\(x\\) in \\(y=9/5 \\cdot x + 32\\) we do the same thing as we do verbally: we subtract \\(32\\) from each side, and then divide by \\(9/5\\) to isolate \\(x\\):\n\\[\n\\begin{align}\ny &= 9/5 \\cdot x + 32\\\\\ny - 32 &= 9/5 \\cdot x\\\\\n(y-32) / (9/5) &= x.\n\\end{align}\n\\]\nFrom this, we have the function \\(g(y) = (y-32) / (9/5)\\) is the inverse function of \\(f(x) = 9/5\\cdot x + 32\\).\nUsually univariate functions are written with \\(x\\) as the dummy variable, so it is typical to write \\(g(x) = (x-32) / (9/5)\\) as the inverse function.\nUsually we use the name \\(f^{-1}\\) for the inverse function of \\(f\\), so this would be most often seen as \\(f^{-1}(x) = (x-32)/(9/5)\\) or after simplification \\(f^{-1}(x) = (5/9) \\cdot (x-32)\\).\n\n\n\n\n\n\nNote\n\n\n\nThe use of a negative exponent on the function name is easily confused for the notation for a reciprocal when it is used on a mathematical expression. An example might be the notation \\((1/x)^{-1}\\). As this is an expression this would simplify to \\(x\\) and not the inverse of the function \\(f(x)=1/x\\) (which is \\(f^{-1}(x) = 1/x\\)).\n\n\n\nExample\nSuppose a transformation of \\(x\\) is given by \\(y = f(x) = (ax + b)/(cx+d)\\). This function is invertible for most choices of the parameters. Find the inverse and describe its domain.\nFrom the expression \\(y=f(x)\\) we algebraically solve for \\(x\\):\n\\[\n\\begin{align*}\ny &= \\frac{ax +b}{cx+d}\\\\\ny \\cdot (cx + d) &= ax + b\\\\\nycx - ax &= b - yd\\\\\n(cy-a) \\cdot x &= b - dy\\\\\nx &= -\\frac{dy - b}{cy-a}.\n\\end{align*}\n\\]\nWe see that to solve for \\(x\\) we need to divide by \\(cy-a\\), so this expression can not be zero. 
So, using \\(x\\) as the dummy variable, we have\n\\[\nf^{-1}(x) = -\\frac{dx - b}{cx-a},\\quad cx-a \\neq 0.\n\\]\n\n\nExample\nThe function \\(f(x) = (x-1)^5 + 2\\) is strictly increasing and so will have an inverse function. Find it.\nAgain, we solve algebraically starting with \\(y=(x-1)^5 + 2\\) and solving for \\(x\\):\n\\[\n\\begin{align*}\ny &= (x-1)^5 + 2\\\\\ny - 2 &= (x-1)^5\\\\\n(y-2)^{1/5} &= x - 1\\\\\n(y-2)^{1/5} + 1 &= x.\n\\end{align*}\n\\]\nWe see that \\(f^{-1}(x) = 1 + (x - 2)^{1/5}\\). The fact that the power \\(5\\) is an odd power is important, as this ensures a unique (real) solution to the fifth root of a value, in the above \\(y-2\\).\n\n\nExample\nThe function \\(f(x) = x^x, x \\geq 1/e\\) is strictly increasing. However, trying to algebraically solve for an inverse function will quickly run into problems (without using specially defined functions). The existence of an inverse does not imply there will always be luck in trying to find a mathematical rule defining the inverse."
},
{
"objectID": "precalc/inversefunctions.html#functions-which-are-not-always-invertible",
"href": "precalc/inversefunctions.html#functions-which-are-not-always-invertible",
"title": "10  The Inverse of a Function",
"section": "10.2 Functions which are not always invertible",
"text": "10.2 Functions which are not always invertible\nConsider the function \\(f(x) = x^2\\). The graph - a parabola - is clearly not monotonic. Hence no inverse function exists. Yet, we can solve equations \\(y=x^2\\) quite easily: \\(y=\\sqrt{x}\\) or \\(y=-\\sqrt{x}\\). We know the square root undoes the squaring, but we need to be a little more careful to say the square root is the inverse of the squaring function.\nThe issue is there are generally two possible answers. To avoid this, we might choose to only take the non-negative answer. To make this all work as above, we restrict the domain of \\(f(x)\\) and now consider the related function \\(f(x)=x^2, x \\geq 0\\). This is now a monotonic function, so will have an inverse function. This is clearly \\(f^{-1}(x) = \\sqrt{x}\\). (The \\(\\sqrt{x}\\) being defined as the principle square root or the unique non-negative answer to \\(u^2-x=0\\).)\nThe inverse function theorem basically says that if \\(f\\) is locally monotonic, then an inverse function will exist locally. By “local” we mean in a neighborhood of \\(c\\).\n\nExample\nConsider the function \\(f(x) = (1+x^2)^{-1}\\). This bell-shaped function is even (symmetric about \\(0\\)), so can not possibly be one-to-one. However, if the domain is restricted to \\([0,\\infty)\\) it is. The restricted function is strictly decreasing and its inverse is found, as follows:\n\\[\n\\begin{align*}\ny &= \\frac{1}{1 + x^2}\\\\\n1+x^2 &= \\frac{1}{y}\\\\\nx^2 &= \\frac{1}{y} - 1\\\\\nx &= \\sqrt{(1-y)/y}, \\quad 0 \\leq y \\leq 1.\n\\end{align*}\n\\]\nThen \\(f^{-1}(x) = \\sqrt{(1-x)/x}\\) where \\(0 < x \\leq 1\\). The somewhat complicated restriction for the the domain coincides with the range of \\(f(x)\\). We shall see next that this is no coincidence."
},
{
"objectID": "precalc/inversefunctions.html#formal-properties-of-the-inverse-function",
"href": "precalc/inversefunctions.html#formal-properties-of-the-inverse-function",
"title": "10  The Inverse of a Function",
"section": "10.3 Formal properties of the inverse function",
"text": "10.3 Formal properties of the inverse function\nConsider again the graph of a monotonic function, in this case \\(f(x) = x^2 + 2, x \\geq 0\\):\n\nf(x) = x^2 + 2\nplot(f, 0, 4, legend=false)\nplot!([2,2,0], [0,f(2),f(2)])\n\n\n\n\nThe graph is shown over the interval \\((0,4)\\), but the domain of \\(f(x)\\) is all \\(x \\geq 0\\). The range of \\(f(x)\\) is clearly \\(2 \\leq x \\leq \\infty\\).\nThe lines layered on the plot show how to associate an \\(x\\) value to a \\(y\\) value or vice versa (as \\(f(x)\\) is one-to-one). The domain then of the inverse function is all the \\(y\\) values for which a corresponding \\(x\\) value exists: this is clearly all values bigger or equal to \\(2\\). The range of the inverse function can be seen to be all the images for the values of \\(y\\), which would be all \\(x \\geq 0\\). This gives the relationship:\n\nthe range of \\(f(x)\\) is the domain of \\(f^{-1}(x)\\); furthermore the domain of \\(f(x)\\) is the range for \\(f^{-1}(x)\\);\n\nFrom this we can see if we start at \\(x\\), apply \\(f\\) we get \\(y\\), if we then apply \\(f^{-1}\\) we will get back to \\(x\\) so we have:\n\nFor all \\(x\\) in the domain of \\(f\\): \\(f^{-1}(f(x)) = x\\).\n\nSimilarly, were we to start on the \\(y\\) axis, we would see:\n\nFor all \\(x\\) in the domain of \\(f^{-1}\\): \\(f(f^{-1}(x)) = x\\).\n\nIn short \\(f^{-1} \\circ f\\) and \\(f \\circ f^{-1}\\) are both identity functions, though on possibly different domains."
},
{
"objectID": "precalc/inversefunctions.html#the-graph-of-the-inverse-function",
"href": "precalc/inversefunctions.html#the-graph-of-the-inverse-function",
"title": "10  The Inverse of a Function",
"section": "10.4 The graph of the inverse function",
"text": "10.4 The graph of the inverse function\nThe graph of \\(f(x)\\) is a representation of all values \\((x,y)\\) where \\(y=f(x)\\). As the inverse flips around the role of \\(x\\) and \\(y\\) we have:\n\nIf \\((x,y)\\) is a point on the graph of \\(f(x)\\), then \\((y,x)\\) will be a point on the graph of \\(f^{-1}(x)\\).\n\nLets see this in action. Take the function \\(2^x\\). We can plot it by generating points to plot as follows:\n\nf(x) = 2^x\nxs = range(0, 2, length=50)\nys = f.(xs)\nplot(xs, ys, color=:blue, label=\"f\")\nplot!(ys, xs, color=:red, label=\"f⁻¹\") # the inverse\n\n\n\n\nBy flipping around the \\(x\\) and \\(y\\) values in the plot! command, we produce the graph of the inverse function - when viewed as a function of \\(x\\). We can see that the domain of the inverse function (in red) is clearly different from that of the function (in blue).\nThe inverse function graph can be viewed as a symmetry of the graph of the function. Flipping the graph for \\(f(x)\\) around the line \\(y=x\\) will produce the graph of the inverse function: Here we see for the graph of \\(f(x) = x^{1/3}\\) and its inverse function:\n\nf(x) = cbrt(x)\nxs = range(-2, 2, length=150)\nys = f.(xs)\nplot(xs, ys, color=:blue, aspect_ratio=:equal, legend=false)\nplot!(ys, xs, color=:red)\nplot!(identity, color=:green, linestyle=:dash)\nx, y = 1/2, f(1/2)\nplot!([x,y], [y,x], color=:green, linestyle=:dot)\n\n\n\n\nWe drew a line connecting \\((1/2, f(1/2))\\) to \\((f(1/2),1/2)\\). We can see that it crosses the line \\(y=x\\) perpendicularly, indicating that points are symmetric about this line. (The plotting argument aspect_ratio=:equal ensures that the \\(x\\) and \\(y\\) axes are on the same scale, so that this type of line will look perpendicular.)\nOne consequence of this symmetry, is that if \\(f\\) is strictly increasing, then so is its inverse.\n!!!note In the above we used cbrt(x) and not x^(1/3). 
The latter usage assumes that \\(x \\geq 0\\) as it isn't guaranteed that for all real exponents the answer will be a real number. The cbrt function knows there will always be a real answer and provides it.\n\n10.4.1 Lines\nThe slope of \\(f(x) = 9/5 \\cdot x + 32\\) is clearly \\(9/5\\) and the slope of the inverse function \\(f^{-1}(x) = 5/9 \\cdot (x-32)\\) is clearly \\(5/9\\) - or the reciprocal. This makes sense, as the slope is the rise over the run, and by flipping the \\(x\\) and \\(y\\) values we merely flip over the rise and the run.\nNow consider the graph of the tangent line to a function. This concept will be better defined later; for now, it is a line “tangent” to the graph of \\(f(x)\\) at a point \\(x=c\\).\nFor concreteness, we consider \\(f(x) = \\sqrt{x}\\) at \\(c=2\\). The tangent line will have slope \\(1/(2\\sqrt{2})\\) and will go through the point \\((2, f(2))\\). We graph the function, its tangent line, and their inverses:\n\nf(x) = sqrt(x)\nc = 2\ntl(x) = f(c) + 1/(2 * sqrt(2)) * (x - c)\nxs = range(0, 3, length=150)\nys = f.(xs)\nzs = tl.(xs)\nplot(xs, ys, color=:blue, legend=false)\nplot!(xs, zs, color=:blue) # the tangent line\nplot!(ys, xs, color=:red) # the inverse function\nplot!(zs, xs, color=:red) # inverse of tangent line\n\n\n\n\nWhat do we see? In blue, we can see the familiar square root graph along with a “tangent” line through the point \\((2, f(2))\\). The red graph of \\(f^{-1}(x) = x^2, x \\geq 0\\) is seen and, perhaps surprisingly, a tangent line. This is at the point \\((f(2), 2)\\). We know the slope of this tangent line is the reciprocal of the slope of the blue tangent line. 
This gives this informal observation:\n\nIf the graph of \\(f(x)\\) has a tangent line at \\((c, f(c))\\) with slope \\(m\\), then the graph of \\(f^{-1}(x)\\) will have a tangent line at \\((f(c), c)\\) with slope \\(1/m\\).\n\nThis is reminiscent of the formula for the slope of a perpendicular line, \\(-1/m\\), but quite different, as this formula implies the two lines have either both positive slopes or both negative slopes, unlike the relationship in slopes between a line and a perpendicular line.\nThe key here is that the shape of \\(f(x)\\) near \\(x=c\\) is somewhat related to the shape of \\(f^{-1}(x)\\) at \\(f(c)\\). In this case, if we use the tangent line as a fill in for how steep a function is, we see from the relationship that if \\(f(x)\\) is “steep” at \\(x=c\\), then \\(f^{-1}(x)\\) will be “shallow” at \\(x=f(c)\\)."
},
{
"objectID": "precalc/inversefunctions.html#questions",
"href": "precalc/inversefunctions.html#questions",
"title": "10  The Inverse of a Function",
"section": "10.5 Questions",
"text": "10.5 Questions\n\nQuestion\nIs it possible that a function have two different inverses?\n\n\n\n \n \n \n \n \n \n \n \n \n No, for all \\(x\\) in the domain an an inverse, the value of any inverse will be the same, hence all inverse functions would be identical.\n \n \n\n\n \n \n \n \n Yes, the function \\(f(x) = x^2, x \\geq 0\\) will have a different inverse than the same function \\(f(x) = x^2, x \\leq 0\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA function takes a value \\(x\\) adds \\(1\\), divides by \\(2\\), and then subtracts \\(1\\). Is the function “one-to-one”?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes, the function is the linear function \\(f(x)=(x+1)/2 + 1\\) and so is monotonic.\n \n \n\n\n \n \n \n \n No, the function is \\(1\\) then \\(2\\) then \\(1\\), but not \"one-to-one\"\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIs the function \\(f(x) = x^5 - x - 1\\) one-to-one?\n\n\n\n \n \n \n \n \n \n \n \n \n No, a graph over \\((-2,2)\\) will show this.\n \n \n\n\n \n \n \n \n Yes, a graph over \\((-100, 100)\\) will show this.\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA function is given by the table\nx | y\n--------\n1 | 3\n2 | 4\n3 | 5\n4 | 3\n5 | 4\n6 | 5\nIs the function one-to-one?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA function is defined by its graph.\n\n\n\n\n\nOver the domain shown, is the function one-to-one?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose \\(f(x) = x^{-1}\\).\nWhat is \\(g(x) = (f(x))^{-1}\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(g(x) = x\\)\n \n \n\n\n \n \n \n \n \\(g(x) = x^{-1}\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is \\(g(x) = f^{-1}(x)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(g(x) = x\\)\n \n \n\n\n \n \n \n \n \\(g(x) = x^{-1}\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA function, \\(f\\), is 
given by its graph:\n\n\n\n\n\nWhat is the value of \\(f(1)\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the value of \\(f^{-1}(1)\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the value of \\((f(1))^{-1}\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the value of \\(f^{-1}(1/2)\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA function is described as follows: for \\(x > 0\\) it takes the square root, adds \\(1\\) and divides by \\(2\\).\nWhat is the inverse of this function?\n\n\n\n \n \n \n \n \n \n \n \n \n The function that takes square of the value, then subtracts \\(1\\), and finally multiplies by \\(2\\).\n \n \n\n\n \n \n \n \n The function that divides by \\(2\\), adds \\(1\\), and then takes the square root of the value.\n \n \n\n\n \n \n \n \n The function that multiplies by \\(2\\), subtracts \\(1\\) and then squares the value.\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA function, \\(f\\), is specified by a table:\nx | y\n-------\n1 | 2\n2 | 3\n3 | 5\n4 | 8\n5 | 13\nWhat is \\(f(3)\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is \\(f^{-1}(3)\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is \\(f(5)^{-1}\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is \\(f^{-1}(5)\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the inverse function of \\(f(x) = (x^3 + 4)/5\\).\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f^{-1}(x) = (5y-4)^{1/3}\\)\n \n \n\n\n \n \n \n \n \\(f^{-1}(x) = 5/(x^3 + 4)\\)\n \n \n\n\n \n \n \n \n \\(f^{-1}(x) = (5y-4)^3\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the inverse function of \\(f(x) = x^\\pi + e, x \\geq 0\\).\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f^{-1}(x) = (x-\\pi)^{e}\\)\n \n \n\n\n \n \n \n \n \\(f^{-1}(x) = (x-e)^{\\pi}\\)\n \n \n\n\n \n \n \n \n \\(f^{-1}(x) = (x-e)^{1/\\pi}\\)\n \n \n\n\n \n \n \n \n 
\n \n\n\n\n\n\n\n\nQuestion\nWhat is the domain of the inverse function for \\(f(x) = x^2 + 7, x \\geq 0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\([0, \\infty)\\)\n \n \n\n\n \n \n \n \n \\((-\\infty, \\infty)\\)\n \n \n\n\n \n \n \n \n \\([7, \\infty)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat is the range of the inverse function for \\(f(x) = x^2 + 7, x \\geq 0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\([7, \\infty)\\)\n \n \n\n\n \n \n \n \n \\((-\\infty, \\infty)\\)\n \n \n\n\n \n \n \n \n \\([0, \\infty)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFrom the plot, are blue and red inverse functions?\n\n\n\n\n\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nFrom the plot, are blue and red inverse functions?\n\n\n\n\n\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe function \\(f(x) = (ax + b)/(cx + d)\\) is known as a Mobius transformation and can be expressed as a composition of \\(4\\) functions, \\(f_4 \\circ f_3 \\circ f_2 \\circ f_1\\):\n\nwhere \\(f_1(x) = x + d/c\\) is a translation,\nwhere \\(f_2(x) = x^{-1}\\) is inversion and reflection,\nwhere \\(f_3(x) = ((bc-ad)/c^2) \\cdot x\\) is scaling,\nand \\(f_4(x) = x + a/c\\) is a translation.\n\nFor \\(x=10\\), what is \\(f(10)\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nFor \\(x=10\\), what is \\(f_4(f_3(f_2(f_1(10))))\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe last two answers should be the same, why?\n\n\n\n \n \n \n \n \n \n \n \n \n As the latter is more complicated than the former.\n \n \n\n\n \n \n \n \n As \\(f_4(f_3(f_2(f_1(x))))=(f_1 \\circ f_2 \\circ f_3 \\circ f_4)(x)\\)\n \n \n\n\n \n \n \n \n As \\(f_4(f_3(f_2(f)_1(x))))=(f_4 \\circ f_3 \\circ f_2 \\circ f_1)(x)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nLet \\(g_1\\), \\(g_2\\), \\(g_3\\), and \\(g_4\\) denote the inverse functions. 
Clearly, \\(g_1(x) = x - d/c\\) and \\(g_4(x) = x - a/c\\), as the inverse of adding a constant is subtracting the constant.\nWhat is \\(g_2(x)=f_2^{-1}(x)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(g_2(x) = x\\)\n \n \n\n\n \n \n \n \n \\(g_2(x) = x^{-1}\\)\n \n \n\n\n \n \n \n \n \\(g_2(x) = x -1\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is \\(g_3(x)=f_3^{-1}(x)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(c^2/(b\\cdot c - a\\cdot d) \\cdot x\\)\n \n \n\n\n \n \n \n \n \\((b\\cdot c-a\\cdot d)/c^2 \\cdot x\\)\n \n \n\n\n \n \n \n \n \\(c^2 x\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nGiven these, what is the value of \\(g_4(g_3(g_2(g_1(f_4(f_3(f_2(f_1(10))))))))\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat about the value of \\(g_1(g_2(g_3(g_4(f_4(f_3(f_2(f_1(10))))))))\\)?"
},
{
"objectID": "precalc/polynomial.html",
"href": "precalc/polynomial.html",
"title": "11  Polynomials",
"section": "",
"text": "In this section we use the following add-on packages:\nPolynomials are a particular class of expressions that are simple enough to have many properties that can be analyzed. In particular, the key concepts of calculus: limits, continuity, derivatives, and integrals are all relatively trivial for polynomial functions. However, polynomials are flexible enough that they can be used to approximate a wide variety of functions. Indeed, though we dont pursue this, we mention that Julias ApproxFun package exploits this to great advantage.\nHere we discuss some vocabulary and basic facts related to polynomials and show how the add-on SymPy package can be used to model polynomial expressions within SymPy. SymPy provides a Computer Algebra System (CAS) for Julia. In this case, by leveraging a mature Python package SymPy. Later we will discuss the Polynomials package for polynomials.\nFor our purposes, a monomial is simply a non-negative integer power of \\(x\\) (or some other indeterminate symbol) possibly multiplied by a scalar constant. For example, \\(5x^4\\) is a monomial, as are constants, such as \\(-2=-2x^0\\) and the symbol itself, as \\(x = x^1\\). In general, one may consider restrictions on where the constants can come from, and consider more than one symbol, but we wont pursue this here, restricting ourselves to the case of a single variable and real coefficients.\nA polynomial is a sum of monomials. After combining terms with same powers, a non-zero polynomial may be written uniquely as:\n\\[\na_n x^n + a_{n-1}x^{n-1} + \\cdots a_1 x + a_0, \\quad a_n \\neq 0\n\\]\nThe numbers \\(a_0, a_1, \\dots, a_n\\) are the coefficients of the polynomial in the standard basis. With the identifications that \\(x=x^1\\) and \\(1 = x^0\\), the monomials above have their power match their coefficients index, e.g., \\(a_ix^i\\). Outside of the coefficient \\(a_n\\), the other coefficients may be negative, positive, or \\(0\\). 
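With that indexing convention, evaluating a polynomial from its coefficients can be sketched in base Julia with evalpoly, which expects the coefficients ordered from \\(a_0\\) up to \\(a_n\\). (The particular coefficients below are chosen only for illustration.)\n\n```julia\n# evalpoly computes a_0 + a_1*x + a_2*x^2 + ... from low-order coefficients\ncoeffs = (100, -32, -16)   # represents 100 - 32x - 16x^2\nevalpoly(2, coeffs)        # 100 - 32*2 - 16*2^2 = -28\n```\n\nHere evalpoly returns \\(-28\\), matching direct substitution. 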
Except for the zero polynomial, the largest power \\(n\\) is called the degree. The degree of the zero polynomial is typically not defined or defined to be \\(-1\\), so as to make certain statements easier to express. The term \\(a_n\\) is called the leading coefficient. When the leading coefficient is \\(1\\), the polynomial is called a monic polynomial. The monomial \\(a_n x^n\\) is the leading term.\nFor example, the polynomial \\(-16x^2 - 32x + 100\\) has degree \\(2\\), leading coefficient \\(-16\\) and leading term \\(-16x^2\\). It is not monic, as the leading coefficient is not \\(1\\).\nLower degree polynomials have special names: a degree \\(0\\) polynomial (\\(a_0\\)) is a non-zero constant, a degree \\(1\\) polynomial (\\(a_0+a_1x\\)) is called linear, a degree \\(2\\) polynomial is quadratic, and a degree \\(3\\) polynomial is called cubic."
},
{
"objectID": "precalc/polynomial.html#linear-polynomials",
"href": "precalc/polynomial.html#linear-polynomials",
"title": "11  Polynomials",
"section": "11.1 Linear polynomials",
"text": "11.1 Linear polynomials\nA special place is reserved for polynomials with degree \\(1\\). These are linear, as their graphs are straight lines. The general form,\n\\[\na_1 x + a_0, \\quad a_1 \\neq 0,\n\\]\nis often written as \\(mx + b\\), which is the slope-intercept form. The slope of a line determines how steeply it rises. The value of \\(m\\) can be found from two points through the well-known formula:\n\\[\nm = \\frac{y_1 - y_0}{x_1 - x_0} = \\frac{\\text{rise}}{\\text{run}}\n\\]\n\n\n \n Graphs of y = mx for different values of m\n \n \n\n\n\nThe intercept, \\(b\\), comes from the fact that when \\(x=0\\) the expression is \\(b\\). That is, the graph of the function \\(f(x) = mx + b\\) will have \\((0,b)\\) as a point on it.\nMore generally, we have the point-slope form of a line, written as a polynomial through\n\\[\ny_0 + m \\cdot (x - x_0).\n\\]\nThe slope is \\(m\\) and the point \\((x_0, y_0)\\). Again, the line graphing this as a function of \\(x\\) would have the point \\((x_0,y_0)\\) on it and have slope \\(m\\). This form is more useful in calculus, as the information we conveniently have is more likely to be related to a specific value of \\(x\\), not the special value \\(x=0\\).\nThinking in terms of transformations, this looks like the function \\(f(x) = x\\) (whose graph is a line with slope \\(1\\)) stretched in the \\(y\\) direction by a factor of \\(m\\) then shifted right by \\(x_0\\) units, and then shifted up by \\(y_0\\) units. When \\(m>1\\), this means the line grows faster. When \\(m< 0\\), the line \\(f(x)=x\\) is flipped through the \\(x\\)-axis so would head downwards, not upwards like \\(f(x) = x\\)."
},
{
"objectID": "precalc/polynomial.html#symbolic-math-in-julia",
"href": "precalc/polynomial.html#symbolic-math-in-julia",
"title": "11  Polynomials",
"section": "11.2 Symbolic math in Julia",
"text": "11.2 Symbolic math in Julia\nThe indeterminate value x (or some other symbol) in a polynomial is like a variable in a function and unlike a variable in Julia. Variables in Julia are identifiers, just a means to look up a specific, already determined, value. Rather, the symbol x is not yet determined, it is essentially a placeholder for a future value. Although we have seen that Julia makes it very easy to work with mathematical functions, it is not the case that base Julia makes working with expressions of algebraic symbols easy. This makes sense, as Julia is primarily designed for technical computing, where numeric approaches rule the day. However, symbolic math can be used from within Julia through add-on packages.\nSymbolic math programs include well-known ones like the commercial programs Mathematica and Maple. Mathematica powers the popular WolframAlpha website, which turns “natural” language into the specifics of a programming language. The open-source Sage project is an alternative to these two commercial giants. It includes a wide range of open-source math projects available within its umbrella framework. (Julia can even be run from within the free service cloud.sagemath.com.) A more focused project for symbolic math is the SymPy Python library. SymPy is also used within Sage. However, SymPy provides a self-contained library that can be used standalone within a Python session. That is great for Julia users, as the PyCall and PythonCall packages glue Julia to Python in a seamless manner. This allows the Julia package SymPy to provide functionality from SymPy within Julia.\n\n\n\n\n\n\nNote\n\n\n\nWhen SymPy is installed through the package manager, the underlying Python libraries will also be installed.\n\n\n\n\n\n\n\n\nNote\n\n\n\nThe Symbolics package is a rapidly developing Julia-only package that provides symbolic math options.\n\n\n\nTo use SymPy, we create symbolic objects to be our indeterminate symbols. The symbols function does this. 
However, we will use the more convenient @syms macro front end for symbols.\n\n@syms a, b, c, x::real, zs[1:10]\n\n(a, b, c, x, Sym[zs₁, zs₂, zs₃, zs₄, zs₅, zs₆, zs₇, zs₈, zs₉, zs₁₀])\n\n\nThe above shows that multiple symbols can be defined at once. The annotation x::real instructs SymPy to assume the x is real, as otherwise it assumes it is possibly complex. There are many other assumptions that can be made. The @syms macro documentation lists them. The zs[1:10] tensor notation creates a container with \\(10\\) different symbols. The macro @syms does not need assignment, as the variable(s) are created behind the scenes by the macro.\n\n\n\n\n\n\nNote\n\n\n\nMacros in Julia are just transformations of the syntax into other syntax. The @ indicates they behave differently than regular function calls.\n\n\nThe SymPy package does three basic things:\n\nIt imports some of the functionality provided by SymPy, including the ability to create symbolic variables.\nIt overloads many Julia functions to work seamlessly with symbolic expressions. This makes working with polynomials quite natural.\nIt gives access to a wide range of SymPys functionality through the sympy object.\n\nTo illustrate, using the just defined x, here is how we can create the polynomial \\(-16x^2 + 100\\):\n\n𝒑 = -16x^2 + 100\n\n \n\\[\n100 - 16 x^{2}\n\\]\n\n\n\nThat is, the expression is created just as you would create it within a function body. But here the result is still a symbolic object. We have assigned this expression to a variable p, and have not defined it as a function p(x). Mentally keeping the distinction between symbolic expressions and functions is very important.\nThe typeof function shows that 𝒑 is of a symbolic type (Sym):\n\ntypeof(𝒑)\n\nSym\n\n\nWe can mix and match symbolic objects. 
This command creates an arbitrary quadratic polynomial:\n\nquad = a*x^2 + b*x + c\n\n \n\\[\na x^{2} + b x + c\n\\]\n\n\n\nAgain, this is entered in a manner nearly identical to how we see such expressions typeset (\\(ax^2 + bx+c\\)), though we must remember to explicitly place the multiplication operator, as the symbols are not numeric literals.\nWe can apply many of Julias mathematical functions and the result will still be symbolic:\n\nsin(a*(x - b*pi) + c)\n\n \n\\[\n\\sin{\\left(a \\left(- \\pi b + x\\right) + c \\right)}\n\\]\n\n\n\nAnother example, might be the following combination:\n\nquad + quad^2 - quad^3\n\n \n\\[\na x^{2} + b x + c - \\left(a x^{2} + b x + c\\right)^{3} + \\left(a x^{2} + b x + c\\right)^{2}\n\\]\n\n\n\nOne way to create symbolic expressions is simply to call a Julia function with symbolic arguments. The first line in the next example defines a function, the second evaluates it at the symbols x, a, and b resulting in a symbolic expression ex:\n\nf(x, m, b) = m*x + b\nex = f(x, a, b)\n\n \n\\[\na x + b\n\\]"
},
{
"objectID": "precalc/polynomial.html#substitution-subs-replace",
"href": "precalc/polynomial.html#substitution-subs-replace",
"title": "11  Polynomials",
"section": "11.3 Substitution: subs, replace",
"text": "11.3 Substitution: subs, replace\nAlgebraically working with symbolic expressions is straightforward. A different symbolic task is substitution. For example, replacing each instance of x in a polynomial, with, say, (x-1)^2. Substitution requires three things to be specified: an expression to work on, a variable to substitute, and a value to substitute in.\nSymPy provides its subs function for this. This function is available in Julia, but it is easier to use notation reminiscent of function evaluation.\nTo illustrate, to do the task above for the polynomial \\(-16x^2 + 100\\) we could have:\n\n𝒑(x => (x-1)^2)\n\n \n\\[\n100 - 16 \\left(x - 1\\right)^{4}\n\\]\n\n\n\nThis “call” notation takes pairs (designated by a=>b) where the left-hand side is the variable to substitute for, and the right-hand side the new value. The value to substitute can depend on the variable, as illustrated; be a different variable; or be a numeric value, such as \\(2\\):\n\n𝒚 = 𝒑(x=>2)\n\n \n\\[\n36\n\\]\n\n\n\nThe result will always be of a symbolic type, even if the answer is just a number:\n\ntypeof(𝒚)\n\nSym\n\n\nIf there is just one free variable in an expression, the pair notation can be dropped:\n\n𝒑(4) # substitutes x=>4\n\n \n\\[\n-156\n\\]\n\n\n\n\nExample\nSuppose we have the polynomial \\(p = ax^2 + bx +c\\). What would it look like if we shifted right by \\(E\\) units and up by \\(F\\) units?\n\n@syms E F\np₂ = a*x^2 + b*x + c\np₂(x => x-E) + F\n\n \n\\[\nF + a \\left(- E + x\\right)^{2} + b \\left(- E + x\\right) + c\n\\]\n\n\n\nAnd expanded this becomes:\n\nexpand(p₂(x => x-E) + F)\n\n \n\\[\nE^{2} a - 2 E a x - E b + F + a x^{2} + b x + c\n\\]\n\n\n\n\n\n11.3.1 Conversion of symbolic numbers to Julia numbers\nIn the above, we substituted 2 in for x to get y:\n\np = -16x^2 + 100\ny = p(2)\n\n \n\\[\n36\n\\]\n\n\n\nThe value, \\(36\\) is still symbolic, but clearly an integer. 
If we are just looking at the output, we can easily translate from the symbolic value to an integer, as they print similarly. However, the conversion to an integer, or another type of number, does not happen automatically. If a number is needed to pass along to another Julia function, it may need to be converted. In general, conversions between different types are handled through various methods of convert. However, with SymPy, the N function will attempt to do the conversion for you:\n\np = -16x^2 + 100\nN(p(2))\n\n36\n\n\nWhereas convert(T,x) requires a specification of the type to convert x to, N attempts to match the data type used by SymPy to store the number. As such, the output type of N may vary (rational, a BigFloat, a float, etc.) For getting more digits of accuracy, a precision can be passed to N. The following command will take the symbolic value for \\(\\pi\\), PI, and produce about \\(60\\) digits worth as a BigFloat value:\n\nN(PI, 60)\n\n3.141592653589793238462643383279502884197169399375105820974939\n\n\nConversion by N will fail if the value to be converted contains free symbols, as would be expected.\n\n\n11.3.2 Converting symbolic expressions into Julia functions\nEvaluating a symbolic expression and returning a numeric value can be done by composing the two just discussed concepts. For example:\n\n𝐩 = 200 - 16x^2\nN(𝐩(2))\n\n136\n\n\nThis approach is direct, but can be slow if many such evaluations are needed (such as with a plot). An alternative is to turn the symbolic expression into a Julia function and then evaluate that as usual.\nThe lambdify function turns a symbolic expression into a Julia function:\n\npp = lambdify(𝐩)\npp(2)\n\n136\n\n\nThe lambdify function uses the name of the similar SymPy function, which is named after Pythons convention of calling anonymous functions “lambdas.” The use above is straightforward. Only slightly more complicated is the use when there are multiple symbolic values. 
For example:\n\np = a*x^2 + b\npp = lambdify(p)\npp(1,2,3)\n\n11\n\n\nThis evaluation matches a with 1, b with 2, and x with 3 as that is the order returned by the function call free_symbols(p). To adjust that, a second vars argument can be given:\n\npp = lambdify(p, (x,a,b))\npp(1,2,3) # computes 2*1^2 + 3\n\n5"
},
{
"objectID": "precalc/polynomial.html#graphical-properties-of-polynomials",
"href": "precalc/polynomial.html#graphical-properties-of-polynomials",
"title": "11  Polynomials",
"section": "11.4 Graphical properties of polynomials",
"text": "11.4 Graphical properties of polynomials\nConsider the graph of the polynomial x^5 - x + 1:\n\nplot(x^5 - x + 1, -3/2, 3/2)\n\n\n\n\n(Plotting symbolic expressions is similar to plotting a function, in that the expression is passed in as the first argument. The expression must have only one free variable, as above, or an error will occur.)\nThis graph illustrates the key features of polynomial graphs:\n\nthere may be values for x where the graph crosses the \\(x\\) axis (real roots of the polynomial);\nthere may be peaks and valleys (local maxima and local minima);\nexcept for constant polynomials, the ultimate behaviour for large values of \\(\\lvert x\\rvert\\) is one of three possibilities: both sides of the graph going to positive infinity; both sides going to negative infinity; or, as in this graph, one side going to positive infinity and the other to negative infinity. In particular, there is no horizontal asymptote.\n\nTo investigate this last point, lets consider the case of the monomial \\(x^n\\). When \\(n\\) is even, the following animation shows that larger values of \\(n\\) have greater growth once outside of \\([-1,1]\\):\n\n\n \n Demonstration that \\(x^{10}\\) grows faster than \\(x^8\\), ... and \\(x^2\\) grows faster than \\(x^0\\) (which is constant).\n \n \n\n\n\nOf course, this is expected, as, for example, \\(2^2 < 2^4 < 2^6 < \\cdots\\). The general shape of these terms is similar - \\(U\\) shaped, and larger powers dominate the smaller powers as \\(\\lvert x\\rvert\\) gets big.\nFor odd powers of \\(n\\), the graph of the monomial \\(x^n\\) is no longer \\(U\\) shaped, but rather constantly increasing. This graph of \\(x^5\\) is typical:\n\nplot(x^5, -2, 2)\n\n\n\n\nAgain, for larger powers the shape is similar, but the growth is faster.\n\n11.4.1 Leading term dominates\nTo see the roots and/or the peaks and valleys of a polynomial requires a judicious choice of viewing window, as ultimately the leading term will dominate the graph. 
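This domination by the leading term can also be checked numerically. A sketch (the polynomial and the evaluation point here are illustrative choices, not from the text):\n\n```julia\n# For large x, p(x) is essentially its leading term, here 2x^5.\np(x) = 2x^5 - x + 1\nx0 = 1e6                   # a large (illustrative) evaluation point\np(x0) / (2 * x0^5)         # the ratio is essentially 1\n```\n\nThe ratio is within floating-point error of \\(1\\), reflecting that the lower-order terms become negligible. 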
The following animation of the graph of \\((x-5)(x-3)(x-2)(x-1)\\) illustrates. Subsequent images show a widening of the plot window until the graph appears U-shaped.\n\n\n \n The previous graph is highlighted in red. Ultimately the leading term (\\(x^4\\) here) dominates the graph.\n \n \n\n\n\nThe leading term in the animation is \\(x^4\\), of even degree, so the graphic is U-shaped; were the leading term of odd degree, the left and right sides would each head off to different signs of infinity.\nTo illustrate analytically why the leading term dominates, consider the polynomial \\(2x^5 - x + 1\\) and then factor out the largest power, \\(x^5\\), leaving a product:\n\\[\nx^5 \\cdot (2 - \\frac{1}{x^4} + \\frac{1}{x^5}).\n\\]\nFor large \\(\\lvert x\\rvert\\), the last two terms in the product on the right get close to \\(0\\), so this expression is basically just \\(2x^5\\) - the leading term.\n\nThe following graphic illustrates the \\(4\\) basic overall shapes that can result when plotting a polynomial as \\(x\\) grows without bound:\n\n\n\n\n\n\nExample\nSuppose \\(p = a_n x^n + \\cdots + a_1 x + a_0\\) with \\(a_n > 0\\). Then by the above, eventually for large \\(x > 0\\) we have \\(p > 0\\), as that is the behaviour of \\(a_n x^n\\). Were \\(a_n < 0\\), then eventually for large \\(x>0\\), \\(p < 0\\).\nNow consider the related polynomial, \\(q\\), where we multiply \\(p\\) by \\(x^n\\) and substitute in \\(1/x\\) for \\(x\\). This is the “reversed” polynomial, as we see in this illustration for \\(n=2\\):\n\np = a*x^2 + b*x + c\nn = 2 # the degree of p\nq = expand(x^n * p(x => 1/x))\n\n \n\\[\na + b x + c x^{2}\n\\]\n\n\n\nIn particular, from the reversal, the behaviour of \\(q\\) for large \\(x\\) depends on the sign of \\(a_0\\). As well, due to the \\(1/x\\), the behaviour of \\(q\\) for large \\(x>0\\) is the same as the behaviour of \\(p\\) for small positive \\(x\\). 
In particular if \\(a_n > 0\\) but \\(a_0 < 0\\), then p is eventually positive and q is eventually negative.\nThat is, if \\(p\\) has \\(a_n > 0\\) but \\(a_0 < 0\\) then the graph of \\(p\\) must cross the \\(x\\) axis.\nThis observation is the start of Descartes rule of signs, which counts the change of signs of the coefficients in p to say something about how many possible crossings there are of the \\(x\\) axis by the graph of the polynomial \\(p\\)."
},
{
"objectID": "precalc/polynomial.html#factoring-polynomials",
"href": "precalc/polynomial.html#factoring-polynomials",
"title": "11  Polynomials",
"section": "11.5 Factoring polynomials",
"text": "11.5 Factoring polynomials\nAmong numerous others, there are two common ways of representing a non-zero polynomial:\n\nexpanded form, as in \\(a_n x^n + a_{n-1}x^{n-1} + \\cdots a_1 x + a_0, a_n \\neq 0\\); or\nfactored form, as in \\(a\\cdot(x-r_1)\\cdot(x-r_2)\\cdots(x-r_n), a \\neq 0\\).\n\nThe latter writes \\(p\\) as a product of linear factors, though this is only possible in general if we consider complex roots. With real roots only, then the factors are either linear or quadratic, as will be discussed later.\nThere are values to each representation. One value of the expanded form is that polynomial addition and scalar multiplication is much easier than in factored form. For example, adding polynomials just requires matching up the monomials of similar powers. For the factored form, polynomial multiplication is much easier than expanded form. For the factored form it is easy to read off roots of the polynomial (values of \\(x\\) where \\(p\\) is \\(0\\)), as a product is \\(0\\) only if a term is \\(0\\), so any zero must be a zero of a factor. Factored form has other technical advantages. For example, the polynomial \\((x-1)^{1000}\\) can be compactly represented using the factored form, but would require \\(1001\\) coefficients to store in expanded form. (As well, due to floating point differences, the two would evaluate quite differently as one would require over a \\(1000\\) operations to compute, the other just two.)\nTranslating from factored form to expanded form can be done by carefully following the distributive law of multiplication. For example, with some care it can be shown that:\n\\[\n(x-1) \\cdot (x-2) \\cdot (x-3) = x^3 - 6x^2 +11x - 6.\n\\]\nThe SymPy function expand will perform these algebraic manipulations without fuss:\n\nexpand((x-1)*(x-2)*(x-3))\n\n \n\\[\nx^{3} - 6 x^{2} + 11 x - 6\n\\]\n\n\n\nFactoring a polynomial is several weeks worth of lessons, as there is no one-size-fits-all algorithm to follow. 
There are some tricks that are taught: for example factoring differences of perfect squares, completing the square, the rational root theorem, \\(\\dots\\). But in general the solution is not automated. The SymPy function factor will find all rational factors (terms like \\((qx-p)\\)), but will leave terms that do not have rational factors alone. For example:\n\nfactor(x^3 - 6x^2 + 11x -6)\n\n \n\\[\n\\left(x - 3\\right) \\left(x - 2\\right) \\left(x - 1\\right)\n\\]\n\n\n\nOr\n\nfactor(x^5 - 5x^4 + 8x^3 - 8x^2 + 7x - 3)\n\n \n\\[\n\\left(x - 3\\right) \\left(x - 1\\right)^{2} \\left(x^{2} + 1\\right)\n\\]\n\n\n\nBut it will not factor expressions without rational factors, even when a factorization is not hard to see:\n\nx^2 - 2\n\n \n\\[\nx^{2} - 2\n\\]\n\n\n\nThe factoring \\((x-\\sqrt{2})\\cdot(x + \\sqrt{2})\\) is not found, as \\(\\sqrt{2}\\) is not rational.\n(For those, it may be possible to solve to get the roots, which can then be used to produce the factored form.)\n\n11.5.1 Polynomial functions and polynomials.\nOur definition of a polynomial is in terms of algebraic expressions which are easily represented by SymPy objects, but not objects from base Julia. (Later we discuss the Polynomials package for representing polynomials. There is also the AbstractAlgebra package for a more algebraic treatment of polynomials.)\nHowever, polynomial functions are easily represented by Julia, for example,\n\nf(x) = -16x^2 + 100\n\nf (generic function with 2 methods)\n\n\nThe distinction is subtle: the expression is turned into a function just by adding the f(x) = preface. But to Julia there is a big distinction. The function form never does any computation until after a value of \\(x\\) is passed to it. Whereas symbolic expressions can be manipulated quite freely before any numeric values are specified.\nIt is easy to create a symbolic expression from a function - just evaluate the function on a symbolic value:\n\nf(x)\n\n \n\\[\n100 - 16 x^{2}\n\\]\n\n\n\nThis is easy - but can also be confusing. 
The function object is f, the expression is f(x) - the function evaluated on a symbolic object. Moreover, as seen, the symbolic expression can be evaluated using the same syntax as a function call:\n\np = f(x)\np(2)\n\n \n\\[\n36\n\\]\n\n\n\nFor many uses, the distinction is unnecessary to make, as many functions will work with any callable expression. One such is plot: either plot(f, a, b) or plot(f(x), a, b) will produce the same plot using the Plots package."
},
{
"objectID": "precalc/polynomial.html#questions",
"href": "precalc/polynomial.html#questions",
"title": "11  Polynomials",
"section": "11.6 Questions",
"text": "11.6 Questions\n\nQuestion\nLet \\(p\\) be the polynomial \\(3x^2 - 2x + 5\\).\nWhat is the degree of \\(p\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the leading coefficient of \\(p\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe graph of \\(p\\) would have what \\(y\\)-intercept?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIs \\(p\\) a monic polynomial?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIs \\(p\\) a quadratic polynomial?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe graph of \\(p\\) would be \\(U\\)-shaped?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the leading term of \\(p\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(3\\)\n \n \n\n\n \n \n \n \n \\(3x^2\\)\n \n \n\n\n \n \n \n \n \\(-2x\\)\n \n \n\n\n \n \n \n \n \\(5\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(p = x^3 - 2x^2 +3x - 4\\).\nWhat is \\(a_2\\), using the standard numbering of coefficient?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is \\(a_n\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is \\(a_0\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe linear polynomial \\(p = 2x + 3\\) is written in which form:\n\n\n\n \n \n \n \n \n \n \n \n \n general form\n \n \n\n\n \n \n \n \n point-slope form\n \n \n\n\n \n \n \n \n slope-intercept form\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe polynomial p is defined in Julia as follows:\n@syms x\np = -16x^2 + 64\nWhat command will return the value of the polynomial when \\(x=2\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n p_2\n \n \n\n\n \n \n \n \n p[2]\n \n \n\n\n \n \n \n \n p*2\n \n \n\n\n \n \n \n \n p(x=>2)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIn the large, the graph of 
\\(p=x^{101} - x + 1\\) will\n\n\n\n \n \n \n \n \n \n \n \n \n Be \\(U\\)-shaped, opening upward\n \n \n\n\n \n \n \n \n Be \\(U\\)-shaped, opening downward\n \n \n\n\n \n \n \n \n Overall, go upwards from \\(-\\infty\\) to \\(+\\infty\\)\n \n \n\n\n \n \n \n \n Overall, go downwards from \\(+\\infty\\) to \\(-\\infty\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIn the large, the graph of \\(p=x^{102} - x^{101} + x + 1\\) will\n\n\n\n \n \n \n \n \n \n \n \n \n Be \\(U\\)-shaped, opening upward\n \n \n\n\n \n \n \n \n Be \\(U\\)-shaped, opening downward\n \n \n\n\n \n \n \n \n Overall, go upwards from \\(-\\infty\\) to \\(+\\infty\\)\n \n \n\n\n \n \n \n \n Overall, go downwards from \\(+\\infty\\) to \\(-\\infty\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIn the large, the graph of \\(p=-x^{10} + x^9 + x^8 + x^7 + x^6\\) will\n\n\n\n \n \n \n \n \n \n \n \n \n Be \\(U\\)-shaped, opening upward\n \n \n\n\n \n \n \n \n Be \\(U\\)-shaped, opening downward\n \n \n\n\n \n \n \n \n Overall, go upwards from \\(-\\infty\\) to \\(+\\infty\\)\n \n \n\n\n \n \n \n \n Overall, go downwards from \\(+\\infty\\) to \\(-\\infty\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUse SymPy to factor the polynomial \\(x^{11} - x\\). How many factors are found?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUse SymPy to factor the polynomial \\(x^{12} - 1\\). How many factors are found?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat is the monic polynomial with roots \\(x=-1\\), \\(x=0\\), and \\(x=2\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n x^3 + x^2 - 2x\n \n \n\n\n \n \n \n \n x^3 - x^2 - 2x\n \n \n\n\n \n \n \n \n x^3 - 3x^2 + 2x\n \n \n\n\n \n \n \n \n x^3 + x^2 + 2x\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUse expand to expand the expression ((x-h)^3 - x^3) / h where x and h are symbolic constants. 
What is the value:\n\n\n\n \n \n \n \n \n \n \n \n \n 0\n \n \n\n\n \n \n \n \n -h^2 + 3hx - 3x^2\n \n \n\n\n \n \n \n \n h^3 + 3h^2x + 3hx^2 + x^3 -x^3/h\n \n \n\n\n \n \n \n \n x^3 - x^3/h"
},
{
"objectID": "precalc/polynomial_roots.html",
"href": "precalc/polynomial_roots.html",
"title": "12  Roots of a polynomial",
"section": "",
"text": "In this section we use the following add on packages:\nThe roots of a polynomial are the values of \\(x\\) that when substituted into the expression yield \\(0\\). For example, the polynomial \\(x^2 - x\\) has two roots, \\(0\\) and \\(1\\). A simple graph verifies this:\nThe graph crosses the \\(x\\)-axis at both \\(0\\) and \\(1\\).\nWhat is known about polynomial roots? Some simple questions might be:\nWe look at such questions here."
},
{
"objectID": "precalc/polynomial_roots.html#finding-roots-of-a-polynomial",
"href": "precalc/polynomial_roots.html#finding-roots-of-a-polynomial",
"title": "12  Roots of a polynomial",
"section": "12.1 Finding roots of a polynomial",
"text": "12.1 Finding roots of a polynomial\nKnowing that a certain number of roots exist and actually finding those roots are different matters. For the simplest cases (the linear case) with \\(a_0 + a_1x\\), we know by solving algebraically that the root is \\(-a_0/a_1\\). (We assume \\(a_1\\neq 0\\).) Of course, when \\(a_1 \\neq 0\\), the graph of the polynomial will be a line with some non-zero slope, so will cross the \\(x\\)-axis as the line and this axis are not parallel.\nFor the quadratic case, there is the famous quadratic formula (known since \\(2000\\) BC) to find the two roots guaranteed by the formula:\n\\[\n\\frac{-b \\pm \\sqrt{b^2 - 4ac}}{2a}.\n\\]\nThe discriminant is defined as \\(b^2 - 4ac\\). When this is negative, the square root requires the concept of complex numbers to be defined, and the formula shows the two complex roots are conjugates. When the discriminant is \\(0\\), then the root has multiplicity two, e.g., the polynomial will factor as \\(a_2(x-r)^2\\). Finally, when the discriminant is positive, there will be two distinct, real roots. This figure shows the \\(3\\) cases, that are illustrated by \\(x^2 -1\\), \\(x^2\\) and \\(x^2 + 1\\):\n\nplot(x^2 - 1, -2, 2, legend=false) # two roots\nplot!(x^2, -2, 2) # one (double) root\nplot!(x^2 + 1, -2, 2) # no real root\nplot!(zero, -2, 2)\n\n\n\n\nThere are similar formulas for the cubic and quartic cases. (The cubic formula was known to Cardano in \\(1545\\), though through Tartagli, and the quartic was solved by Ferrari, Cardanos roommate.)\nIn general, there is no such formula using radicals for \\(5\\)th degree polynomials or higher, a proof first given by Ruffini in \\(1803\\) with improvement by Abel in \\(1824\\). Even though the fundamental theorem shows that any polynomial can be factored into linear and quadratic terms, there is no general method as to how. 
(It is the case that some such polynomials may be solvable by radicals, just not all of them.)\nThe factor function of SymPy only finds factors of polynomials with integer or rational coefficients corresponding to rational roots. There are alternatives.\nFinding roots with SymPy can also be done through its solve function, a function which also has a more general usage, as it can solve simple expressions or more than one expression. Here we illustrate that solve can easily handle quadratic expressions:\n\nsolve(x^2 + 2x - 3)\n\n2-element Vector{Sym}:\n -3\n 1\n\n\nThe answer is a vector of values that when substituted in for the free variable x produce \\(0.\\) The call to solve does not have an equals sign. To solve a more complicated expression of the type \\(f(x) = g(x),\\) one can solve \\(f(x) - g(x) = 0,\\) use the Eq function, or use f ~ g.\nWhen the expression to solve has more than one free variable, the variable to solve for should be explicitly stated with a second argument. For example, here we show that solve is aware of the quadratic formula:\n\n@syms a b::real c::positive\nsolve(a*x^2 + b*x + c, x)\n\n2-element Vector{Sym}:\n (-b - sqrt(-4*a*c + b^2))/(2*a)\n (-b + sqrt(-4*a*c + b^2))/(2*a)\n\n\nThe solve function will respect assumptions made when a variable is defined through symbols or @syms:\n\nsolve(a^2 + 1) # works, as a can be complex\n\n2-element Vector{Sym}:\n -\n \n\n\n\nsolve(b^2 + 1) # fails, as b is assumed real\n\nAny[]\n\n\n\nsolve(c + 1) # fails, as c is assumed positive\n\nAny[]\n\n\nPreviously, it was mentioned that factor only factors polynomials with integer coefficients over rational roots. However, solve can be used to factor. 
Here is an example:\n\nfactor(x^2 - 2)\n\n \n\\[\nx^{2} - 2\n\\]\n\n\n\nNothing is found, as the roots are \\(\\pm \\sqrt{2}\\), irrational numbers.\n\nrts = solve(x^2 - 2)\nprod(x-r for r in rts)\n\n \n\\[\n\\left(x - \\sqrt{2}\\right) \\left(x + \\sqrt{2}\\right)\n\\]\n\n\n\nSolving cubics and quartics can be done exactly using radicals. For example, here we see the solutions to a quartic equation can be quite involved, yet still explicit. (We use y so that complex-valued solutions, if any, will be found.)\n\n@syms y # possibly complex\nsolve(y^4 - 2y - 1)\n\n4-element Vector{Sym}:\n -sqrt(-2/(3*(1/4 + sqrt(129)/36)^(1/3)) + 2*(1/4 + sqrt(129)/36)^(1/3))/2 - sqrt(-4/sqrt(-2/(3*(1/4 + sqrt(129)/36)^(1/3)) + 2*(1/4 + sqrt(129)/36)^(1/3)) - 2*(1/4 + sqrt(129)/36)^(1/3) + 2/(3*(1/4 + sqrt(129)/36)^(1/3)))/2\n -sqrt(-2/(3*(1/4 + sqrt(129)/36)^(1/3)) + 2*(1/4 + sqrt(129)/36)^(1/3))/2 + sqrt(-4/sqrt(-2/(3*(1/4 + sqrt(129)/36)^(1/3)) + 2*(1/4 + sqrt(129)/36)^(1/3)) - 2*(1/4 + sqrt(129)/36)^(1/3) + 2/(3*(1/4 + sqrt(129)/36)^(1/3)))/2\n sqrt(-2/(3*(1/4 + sqrt(129)/36)^(1/3)) + 2*(1/4 + sqrt(129)/36)^(1/3))/2 + sqrt(-2*(1/4 + sqrt(129)/36)^(1/3) + 2/(3*(1/4 + sqrt(129)/36)^(1/3)) + 4/sqrt(-2/(3*(1/4 + sqrt(129)/36)^(1/3)) + 2*(1/4 + sqrt(129)/36)^(1/3)))/2\n -sqrt(-2*(1/4 + sqrt(129)/36)^(1/3) + 2/(3*(1/4 + sqrt(129)/36)^(1/3)) + 4/sqrt(-2/(3*(1/4 + sqrt(129)/36)^(1/3)) + 2*(1/4 + sqrt(129)/36)^(1/3)))/2 + sqrt(-2/(3*(1/4 + sqrt(129)/36)^(1/3)) + 2*(1/4 + sqrt(129)/36)^(1/3))/2\n\n\nThird- and fourth-degree polynomials can be solved in general, with increasingly more complicated answers. 
The following finds one of the answers for a general third-degree polynomial:\n\n@syms a[0:3]\np = sum(a*x^(i-1) for (i,a) in enumerate(a))\nrts = solve(p, x)\nrts[1] # there are three roots\n\n \n\\[\n- \\frac{a₂}{3 a₃} - \\frac{- \\frac{3 a₁}{a₃} + \\frac{a₂^{2}}{a₃^{2}}}{3 \\sqrt[3]{\\frac{27 a₀}{2 a₃} - \\frac{9 a₁ a₂}{2 a₃^{2}} + \\frac{a₂^{3}}{a₃^{3}} + \\frac{\\sqrt{- 4 \\left(- \\frac{3 a₁}{a₃} + \\frac{a₂^{2}}{a₃^{2}}\\right)^{3} + \\left(\\frac{27 a₀}{a₃} - \\frac{9 a₁ a₂}{a₃^{2}} + \\frac{2 a₂^{3}}{a₃^{3}}\\right)^{2}}}{2}}} - \\frac{\\sqrt[3]{\\frac{27 a₀}{2 a₃} - \\frac{9 a₁ a₂}{2 a₃^{2}} + \\frac{a₂^{3}}{a₃^{3}} + \\frac{\\sqrt{- 4 \\left(- \\frac{3 a₁}{a₃} + \\frac{a₂^{2}}{a₃^{2}}\\right)^{3} + \\left(\\frac{27 a₀}{a₃} - \\frac{9 a₁ a₂}{a₃^{2}} + \\frac{2 a₂^{3}}{a₃^{3}}\\right)^{2}}}{2}}}{3}\n\\]\n\n\n\nSome fifth degree polynomials are solvable in terms of radicals; however, solve seems to have no luck with this particular fifth degree polynomial:\n\nsolve(x^5 - x + 1)\n\n1-element Vector{Sym}:\n CRootOf(x^5 - x + 1, 0)\n\n\n(Though there is no formula involving only radicals like the quadratic equation, there is a formula for the roots in terms of a function called the Bring radical.)\n\n12.1.1 The roots function\nRelated to solve is the specialized roots function for identifying roots. Unlike solve, it will identify multiplicities.\nFor a polynomial with only one indeterminate the usage is straightforward:\n\nroots((x-1)^2 * (x-2)^2) # solve doesn't identify multiplicities\n\nDict{Any, Any} with 2 entries:\n 1 => 2\n 2 => 2\n\n\nFor a polynomial with symbolic coefficients, the difference between the symbol and the coefficients must be identified. SymPy has a Poly type to do so. 
The following call illustrates:\n\n@syms a b c\np = a*x^2 + b*x + c\nq = sympy.Poly(p, x) # identify `x` as indeterminate; alternatively p.as_poly(x)\nroots(q)\n\nDict{Any, Any} with 2 entries:\n -b/(2*a) - sqrt(-4*a*c + b^2)/(2*a) => 1\n -b/(2*a) + sqrt(-4*a*c + b^2)/(2*a) => 1\n\n\n\n\n\n\n\nNote\n\n\n\nThe sympy Poly function must be found within the underlying sympy module, a Python object, hence is qualified as sympy.Poly. This is common when using SymPy, as only a small handful of the many functions available are turned into Julia functions; the rest are used as would be done in Python. (This is similar to, but different from, qualifying by a Julia module when there are two conflicting names. An example is the use of the name roots in both SymPy and Polynomials to refer to a function that finds the roots of a polynomial. If both functions were loaded, then the last line in the above example would need to be SymPy.roots(q), note the capitalization.)\n\n\n\n\n12.1.2 Numerically finding roots\nThe solve function can be used to get numeric approximations to the roots. It is as easy as calling N on the solutions:\n\nrts = solve(x^5 - x + 1 ~ 0)\nN.(rts) # note the `.(` to broadcast over all values in rts\n\n1-element Vector{BigFloat}:\n -1.167303978261418684256045899854842180720560371525489039140082449275651903429536\n\n\nThis polynomial has \\(1\\) real root found by solve, as x is assumed to be real.\nHere we see another example:\n\nex = x^7 -3x^6 + 2x^5 -1x^3 + 2x^2 + 1x^1 - 2\nsolve(ex)\n\n3-element Vector{Sym}:\n 1\n 2\n CRootOf(x^5 - x - 1, 0)\n\n\nThis finds two of the seven possible roots; the remainder of the real roots can be found numerically:\n\nN.(solve(ex))\n\n3-element Vector{Real}:\n 1\n 2\n 1.167303978261418684256045899854842180720560371525489039140082449275651903429536\n\n\n\n\n12.1.3 The solveset function\nSymPy is phasing in the solveset function to replace solve. 
The main reason is that solve has too many different output types (a vector, a dictionary, …). The output of solveset is always a set. For tasks like this, which return a finite set, we use the elements function to access the individual answers. To illustrate:\n\n𝒑 = 8x^4 - 8x^2 + 1\n𝒑_rts = solveset(𝒑)\n\n \n\\[\n\\left\\{- \\sqrt{\\frac{1}{2} - \\frac{\\sqrt{2}}{4}}, \\sqrt{\\frac{1}{2} - \\frac{\\sqrt{2}}{4}}, - \\sqrt{\\frac{\\sqrt{2}}{4} + \\frac{1}{2}}, \\sqrt{\\frac{\\sqrt{2}}{4} + \\frac{1}{2}}\\right\\}\n\\]\n\n\n\nThe 𝒑_rts object, a FiniteSet, does not allow immediate access to its elements. For that, elements will work to return a vector:\n\nelements(𝒑_rts)\n\n4-element Vector{Sym}:\n -sqrt(1/2 - sqrt(2)/4)\n sqrt(sqrt(2)/4 + 1/2)\n sqrt(1/2 - sqrt(2)/4)\n -sqrt(sqrt(2)/4 + 1/2)\n\n\nTo get the numeric approximation, we compose these function calls:\n\nN.(elements(solveset(𝒑)))\n\n4-element Vector{BigFloat}:\n -0.3826834323650897717284599840303988667613445624856270414338006356275460339600903\n 0.9238795325112867561281831893967882868224166258636424861150977312805350075011054\n 0.3826834323650897717284599840303988667613445624856270414338006356275460339600903\n -0.9238795325112867561281831893967882868224166258636424861150977312805350075011054"
},
{
"objectID": "precalc/polynomial_roots.html#do-numeric-methods-matter-when-you-can-just-graph",
"href": "precalc/polynomial_roots.html#do-numeric-methods-matter-when-you-can-just-graph",
"title": "12  Roots of a polynomial",
"section": "12.2 Do numeric methods matter when you can just graph?",
"text": "12.2 Do numeric methods matter when you can just graph?\nIt may seem that certain practices related to roots of polynomials are unnecessary as we could just graph the equation and look for the roots. This feeling is perhaps motivated by the examples given in textbooks to be worked by hand, which necessarily focus on smallish solutions. But, in general, without some sense of where the roots are, an informative graph itself can be hard to produce. That is, technology doesnt displace thinking - it only supplements it.\nFor another example, consider the polynomial \\((x-20)^5 - (x-20) + 1\\). In this form we might think the roots are near \\(20\\). However, were we presented with this polynomial in expanded form: \\(x^5 - 100x^4 + 4000x^3 - 80000x^2 + 799999x - 3199979\\), we might be tempted to just graph it to find roots. A naive graph might be to plot over \\([-10, 10]\\):\n\n𝐩 = x^5 - 100x^4 + 4000x^3 - 80000x^2 + 799999x - 3199979\nplot(𝐩, -10, 10)\n\n\n\n\nThis seems to indicate a root near \\(10\\). But look at the scale of the \\(y\\) axis. The value at \\(-10\\) is around \\(-25,000,000\\) so it is really hard to tell if \\(f\\) is near \\(0\\) when \\(x=10\\), as the range is too large.\nA graph over \\([10,20]\\) is still unclear:\n\nplot(𝐩, 10,20)\n\n\n\n\nWe see that what looked like a zero near \\(10\\), was actually a number around \\(-100,000\\).\nContinuing, a plot over \\([15, 20]\\) still isnt that useful. It isnt until we get close to \\(18\\) that the large values of the polynomial allow a clear vision of the values near \\(0\\). That being said, plotting anything bigger than \\(22\\) quickly makes the large values hide those near \\(0\\), and might make us think where the function dips back down there is a second or third zero, when only \\(1\\) is the case. 
(We know that, as this is the same \\(x^5 - x + 1\\) shifted to the right by \\(20\\) units.)\n\nplot(𝐩, 18, 22)\n\n\n\n\nNot that it cant be done, but graphically solving for a root here can require some judicious choice of viewing window. Even worse is the case where something might graphically look like a root, but in fact not be a root. Something like \\((x-100)^2 + 0.1\\) will demonstrate.\nFor another example, the following polynomial when plotted over \\([-5,7]\\) appears to have two real roots:\n\nh = x^7 - 16129x^2 + 254x - 1\nplot(h, -5, 7)\n\n\n\n\nin fact there are three, two are very close together:\n\nN.(solve(h))\n\n3-element Vector{BigFloat}:\n 0.007874015406930341157555003028161633376551552518768059431667490175426147404348286\n 0.007874016089132754403608727898779727134193464194254785228308443233139693780655877\n 6.939437409621392124436713492447610272200680501712185816507667632045076114762801\n\n\n\n\n\n\n\n\nNote\n\n\n\nThe difference of the two roots is around 1e-10. For the graph over the interval of \\([-5,7]\\) there are about \\(800\\) “pixels” used, so each pixel represents a size of about 1.5e-2. So the cluster of roots would safely be hidden under a single “pixel.”\n\n\nThe point of this is to say, that it is useful to know where to look for roots, even if graphing calculators or graphing programs make drawing graphs relatively painless. A better way in this case would be to find the real roots first, and then incorporate that information into the choice of plot window."
},
{
"objectID": "precalc/polynomial_roots.html#some-facts-about-the-real-roots-of-a-polynomial",
"href": "precalc/polynomial_roots.html#some-facts-about-the-real-roots-of-a-polynomial",
"title": "12  Roots of a polynomial",
"section": "12.3 Some facts about the real roots of a polynomial",
"text": "12.3 Some facts about the real roots of a polynomial\nA polynomial with real coefficients may or may not have real roots. The following discusses some simple checks on the number of real roots and bounds on how big they can be. This can be roughly used to narrow viewing windows when graphing polynomials.\n\n12.3.1 Descartes rule of signs\nThe study of polynomial roots is an old one. In \\(1637\\) Descartes published a simple method to determine an upper bound on the number of positive real roots of a polynomial.\n\nDescartes rule of signs: if \\(p=a_n x^n + a_{n-1}x^{n-1} + \\cdots a_1x + a_0\\) then the number of positive real roots is either equal to the number of sign differences between consecutive nonzero coefficients, or is less than it by an even number. Repeated roots are counted separately.\n\nOne method of proof (sketched at the end of this section) first shows that in synthetic division by \\((x-c)\\) with \\(c > 0\\), we must have that any sign change in \\(q\\) is related to a sign change in \\(p\\) and there must be at least one more in \\(p\\). This is then used to show that there can be only as many positive roots as sign changes. That the difference comes in pairs is related to complex roots of real polynomials always coming in pairs.\nAn immediate consequence, is that a polynomial whose coefficients are all non-negative will have no positive real roots.\nApplying this to the polynomial \\(x^5 -x + 1\\) we get That the coefficients have signs: + 0 0 0 - + which collapses to the sign pattern +, -, +. This pattern has two changes of sign. The number of positive real roots is either \\(2\\) or \\(0\\). In fact there are \\(0\\) for this case.\nWhat about negative roots? Cleary, any negative root of \\(p\\) is a positive root of \\(q(x) = p(-x)\\), as the graph of \\(q\\) is just that of \\(p\\) flipped through the \\(y\\) axis. 
But the coefficients of \\(q\\) are the same as \\(p\\), except that the odd-indexed coefficients (\\(a_1, a_3, \\dots\\)) have a changed sign. Continuing with our example, for \\(q(x) = -x^5 + x + 1\\) we get the new sign pattern -, +, + which yields one sign change. That is, there must be a negative real root, and indeed there is, \\(x \\approx -1.1673\\).\nWith this knowledge, we could have known that in an earlier example the graph of p = x^7 - 16129x^2 + 254x - 1 which indicated two positive real roots was misleading, as there must be \\(1\\) or \\(3\\) by a count of the sign changes.\nFor another example, if we looked at \\(f(x) = x^5 - 100x^4 + 4000x^3 - 80000x^2 + 799999x - 3199979\\) again, we see that there could be \\(1\\), \\(3\\), or \\(5\\) positive roots. However, changing the signs of the odd powers leaves all “-” signs, so there are \\(0\\) negative roots. From the graph, we saw just \\(1\\) real root, not \\(3\\) or \\(5\\). We can verify numerically with:\n\nj = x^5 - 100x^4 + 4000x^3 - 80000x^2 + 799999x - 3199979\nN.(solve(j))\n\n1-element Vector{BigFloat}:\n 18.83269602173858131574395410014515781927943962847451096085991755072434809657041\n\n\n\n\n12.3.2 Cauchys bound on the magnitude of the real roots.\nDescartes rule gives a bound on how many real roots there may be. Cauchy provided a bound on how large they can be. Assume our polynomial is monic (if not, divide by \\(a_n\\) to make it so, as this wont affect the roots). Then any real root is no larger in absolute value than \\(|a_0| + |a_1| + |a_2| + \\cdots + |a_n|\\), (this is expressed in different ways.)\nTo see precisely why this bound works, suppose \\(x\\) is a root with \\(|x| > 1\\) and let \\(h\\) be the bound. 
Then since \\(x\\) is a root, we can solve \\(a_0 + a_1x + \\cdots + 1 \\cdot x^n = 0\\) for \\(x^n\\) as:\n\\[\nx^n = -(a_0 + a_1 x + \\cdots a_{n-1}x^{n-1})\n\\]\nWhich after taking absolute values of both sides, yields:\n\\[\n|x^n| \\leq |a_0| + |a_1||x| + |a_2||x^2| + \\cdots |a_{n-1}| |x^{n-1}| \\leq (h-1) (1 + |x| + |x^2| + \\cdots |x^{n-1}|).\n\\]\nThe last sum can be computed using a formula for geometric sums, \\((|x^n| - 1)/(|x|-1)\\). Rearranging, gives the inequality:\n\\[\n|x| - 1 \\leq (h-1) \\cdot (1 - \\frac{1}{|x^n|} ) \\leq (h-1)\n\\]\nfrom which it follows that \\(|x| \\leq h\\), as desired.\nFor our polynomial \\(x^5 -x + 1\\) we have the sum above is \\(3\\). The lone real root is approximately \\(-1.1673\\) which satisfies \\(|-1.1673| \\leq 3\\)."
},
{
"objectID": "precalc/polynomial_roots.html#questions",
"href": "precalc/polynomial_roots.html#questions",
"title": "12  Roots of a polynomial",
"section": "12.4 Questions",
"text": "12.4 Questions\n\nQuestion\nWhat is the remainder of dividing \\(x^4 - x^3 - x^2 + 2\\) by \\(x-2\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(x^3 + x^2 + x + 2\\)\n \n \n\n\n \n \n \n \n \\(0\\)\n \n \n\n\n \n \n \n \n \\(x-2\\)\n \n \n\n\n \n \n \n \n \\(6\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat is the remainder of dividing \\(x^4 - x^3 - x^2 + 2\\) by \\(x^3 - 2x\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(x^2 - 2x + 2\\)\n \n \n\n\n \n \n \n \n \\(x - 1\\)\n \n \n\n\n \n \n \n \n \\(2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWe have that \\(x^5 - x + 1 = (x^3 + x^2 - 1) \\cdot (x^2 - x + 1) + (-2x + 2)\\).\nWhat is the remainder of dividing \\(x^5 - x + 1\\) by \\(x^2 - x + 1\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(x^3 + x^2 - 1\\)\n \n \n\n\n \n \n \n \n \\(-2x + 2\\)\n \n \n\n\n \n \n \n \n \\(x^2 - x + 1\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider this output from synthetic division\n2 | 1 0 0 0 -1 1\n | 2 4 8 16 30\n ---------------\n 1 2 4 8 15 31\nrepresenting \\(p(x) = q(x)\\cdot(x-c) + r\\).\nWhat is \\(p(x)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(x^4 +2x^3 + 4x^2 + 8x + 15\\)\n \n \n\n\n \n \n \n \n \\(x^5 - x + 1\\)\n \n \n\n\n \n \n \n \n \\(31\\)\n \n \n\n\n \n \n \n \n \\(2x^4 + 4x^3 + 8x^2 + 16x + 30\\)\n \n \n\n\n \n \n \n \n \\(x^5 + 2x^4 + 4x^3 + 8x^2 + 15x + 31\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is \\(q(x)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(2x^4 + 4x^3 + 8x^2 + 16x + 30\\)\n \n \n\n\n \n \n \n \n \\(x^5 - x + 1\\)\n \n \n\n\n \n \n \n \n \\(31\\)\n \n \n\n\n \n \n \n \n \\(x^4 +2x^3 + 4x^2 + 8x + 15\\)\n \n \n\n\n \n \n \n \n \\(x^5 + 2x^4 + 4x^3 + 8x^2 + 15x + 31\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is \\(r\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(x^5 + 2x^4 + 4x^3 + 8x^2 + 15x + 31\\)\n \n \n\n\n \n \n \n \n \\(2x^4 + 4x^3 + 8x^2 + 16x + 30\\)\n \n \n\n\n \n \n \n \n \\(31\\)\n \n \n\n\n \n \n \n \n \\(x^4 +2x^3 + 4x^2 + 8x + 15\\)\n \n \n\n\n \n 
\n \n \n \\(x^5 - x + 1\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(p=x^4 -9x^3 +30x^2 -44x + 24\\)\nFactor \\(p\\). What are the factors?\n\n\n\n \n \n \n \n \n \n \n \n \n \\((x-2)\\) and \\((x-3)\\)\n \n \n\n\n \n \n \n \n \\((x+2)\\) and \\((x+3)\\)\n \n \n\n\n \n \n \n \n \\(2\\) and \\(3\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nDoes the expression \\(x^4 - 5\\) factor over the rational numbers?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nUsing solve, how many real roots does \\(x^4 - 5\\) have:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe Soviet historian I. Y. Depman claimed that in \\(1486\\), Spanish mathematician Valmes was burned at the stake for claiming to have solved the quartic equation. Here we dont face such consequences.\nFind the largest real root of \\(x^4 - 10x^3 + 32x^2 - 38x + 15\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat are the numeric values of the real roots of \\(f(x) = x^6 - 5x^5 + x^4 - 3x^3 + x^2 - x + 1\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n [0.578696, 4.91368]\n \n \n\n\n \n \n \n \n [-0.434235+0.613836im, -0.434235-0.613836im]\n \n \n\n\n \n \n \n \n [-0.434235, -0.434235, 0.188049, 0.188049]\n \n \n\n\n \n \n \n \n [-0.434235, -0.434235, 0.188049, 0.188049, 0.578696, 4.91368]\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nOdd polynomials must have at least one real root.\nConsider the polynomial \\(x^5 - 3x + 1\\). Does it have more than one real root?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nConsider the polynomial \\(x^5 - 1.5x + 1\\). 
Does it have more than one real root?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat is the maximum number of positive, real roots that Descartes bound says \\(p=x^5 + x^4 - x^3 + x^2 + x + 1\\) can have?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nHow many positive, real roots does it actually have?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the maximum number of negative, real roots that Descartes bound says \\(p=x^5 + x^4 - x^3 + x^2 + x + 1\\) can have?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nHow many negative, real roots does it actually have?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x) = x^5 - 4x^4 + x^3 - 2x^2 + x\\). What does Cauchys bound say is the largest possible magnitude of a root?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the largest magnitude of a real root?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nAs \\(1 + 2 + 3 + 4\\) is \\(10\\), Cauchys bound says that the magnitude of the largest real root of \\(x^3 - ax^2 + bx - c\\) is \\(10\\) where \\(a,b,c\\) is one of \\(2,3,4\\). By considering all 6 such possible polynomials (such as \\(x^3 - 3x^2 + 2x - 4\\)) what is the largest magnitude or a root?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe roots of the Chebyshev polynomials are helpful for some numeric algorithms. These are a family of polynomials related by \\(T_{n+1}(x) = 2xT_n(x) - T_{n-1}(x)\\) (a recurrence relation in the manner of the Fibonacci sequence). The first two are \\(T_0(x) = 1\\) and \\(T_1(x) =x\\).\n\nBased on the relation, figure out \\(T_2(x)\\). 
It is\n\n\n\n\n \n \n \n \n \n \n \n \n \n \\(4x^2 - 1\\)\n \n \n\n\n \n \n \n \n \\(2x^2\\)\n \n \n\n\n \n \n \n \n \\(x\\)\n \n \n\n\n \n \n \n \n \\(2x\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nTrue or false, the \\(degree\\) of \\(T_n(x)\\) is \\(n\\): (Look at the defining relation and reason this out).\n\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nThe fifth one is \\(T_5(x) = 32x^5 - 32x^3 + 6x\\). Cauchys bound says that the largest root has absolute value\n\n\n1 + 1 + 6/32\n\n2.1875\n\n\nThe Chebyshev polynomials have the property that in fact all \\(n\\) roots are real, distinct, and in \\([-1, 1]\\). Using SymPy, find the magnitude of the largest root:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nPlotting p over the interval \\([-2,2]\\) does not help graphically identify the roots:\n\n\nplot(16x^5 - 20x^3 + 5x, -2, 2)\n\n\n\n\nDoes graphing over \\([-1,1]\\) show clearly the \\(5\\) roots?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No"
},
{
"objectID": "precalc/polynomial_roots.html#appendix-proof-of-descartes-rule-of-signs",
"href": "precalc/polynomial_roots.html#appendix-proof-of-descartes-rule-of-signs",
"title": "12  Roots of a polynomial",
"section": "12.5 Appendix: Proof of Descartes rule of signs",
"text": "12.5 Appendix: Proof of Descartes rule of signs\nProof modified from this post.\nFirst, we can assume \\(p\\) is monic (\\(p_n=1\\) and positive), and \\(p_0\\) is non zero. The latter, as we can easily deflate the polynomial by dividing by \\(x\\) if \\(p_0\\) is zero.\nLet var(p) be the number of sign changes and pos(p) the number of positive real roots of p.\nFirst: For a monic \\(p\\) if \\(p_0 < 0\\) then var(p) is odd and if \\(p_0 > 0\\) then var(p) is even.\nThis is true for degree \\(n=1\\) the two sign patterns under the assumption are +- (\\(p_0 < 0\\)) or ++ (\\(p_0 > 0\\)). If it is true for degree \\(n-1\\), then the we can consider the sign pattern of such an \\(n\\) degree polynomial having one of these patterns: +...+- or +...-- (if \\(p_0 < 0\\)) or +...++ or +...-+ if (\\(p_0>0\\)). An induction step applied to all but the last sign for these four patterns leads to even, odd, even, odd as the number of sign changes. Incorporating the last sign leads to odd, odd, even, even as the number of sign changes.\nSecond: For a monic \\(p\\) if p_0 < 0 then pos(p) is odd, if p_0 > 0 then pos(p) is even.\nThis is clearly true for monic degree \\(1\\) polynomials: if \\(c\\) is positive \\(p = x - c\\) has one real root (an odd number) and \\(p = x + c\\) has \\(0\\) real roots (an even number). Now, suppose \\(p\\) has degree \\(n\\) and is monic. Then as \\(x\\) goes to \\(\\infty\\), it must be \\(p\\) goes to \\(\\infty\\).\nIf \\(p_0 < 0\\) then there must be a positive real root, say \\(r\\), (Bolzanos intermediate value theorem). Dividing \\(p\\) by \\((x-r)\\) to produce \\(q\\) requires \\(q_0\\) to be positive and of lower degree. By induction \\(q\\) will have an even number of roots. Add in the root \\(r\\) to see that \\(p\\) will have an odd number of roots.\nNow consider the case \\(p_0 > 0\\). There are two possibilities either pos(p) is zero or positive. If pos(p) is \\(0\\) then there are an even number of roots. 
If pos(p) is positive, then call \\(r\\) one of the real positive roots. Again divide by \\(x-r\\) to produce \\(p = (x-r) \\cdot q\\). Then \\(q_0\\) must be negative for \\(p_0\\) to be positive. By induction \\(q\\) must have an odd number of roots, meaning \\(p\\) must have an even number.\nSo there is parity between var(p) and pos(p): if \\(p\\) is monic and \\(p_0 < 0\\) then var(p) and pos(p) are both odd; and if \\(p_0 > 0\\), var(p) and pos(p) are both even.\nDescartes rule of signs will be established if it can be shown that var(p) is at least as big as pos(p). Suppose \\(r\\) is a positive real root of \\(p\\) with \\(p = (x-r)q\\). We show that var(p) > var(q), which can be repeatedly applied to show that if \\(p=(x-r_1)\\cdot(x-r_2)\\cdot \\cdots \\cdot (x-r_l) q\\), where the \\(r_i\\)s are the positive real roots, then var(p) >= l + var(q) >= l = pos(p).\nAs \\(p = (x-r)q\\) we must have the leading term is \\(p_nx^n = x \\cdot q_{n-1} x^{n-1}\\) so \\(q_{n-1}\\) will also be + under our monic assumption. Looking at a possible pattern for the signs of \\(q\\), we might see the following unfinished synthetic division table for a specific \\(q\\):\n + ? ? ? ? ? ? ? ?\n+ ? ? ? ? ? ? ? ?\n -----------------\n + - - - + - + + 0\nBut actually, we can fill in more, as the second row is formed by multiplying by the positive \\(r\\):\n + ? ? ? ? ? ? ? ?\n+ + - - - + - + +\n -----------------\n + - - - + - + + 0\nWhats more, using the fact that to get 0 the two summands must differ in sign and to have a ? plus + yield a -, the ? must be - (and reverse), the following must be the case for the signs of p:\n + - ? ? + - + ? -\n+ + - - - + - + +\n -----------------\n + - - - + - + + 0\nIf the bottom row represents \\(q_7, q_6, \\dots, q_0\\) and the top row \\(p_8, p_7, \\dots, p_0\\), then the sign changes in \\(q\\) from + to - are matched by sign changes in \\(p\\). 
The ones in \\(q\\) from \\(-\\) to \\(+\\) are also matched regardless of the sign of the first two question marks (though \\(p\\) could possibly have more). The last sign change in \\(p\\) between \\(p_2\\) and \\(p_0\\) has no counterpart in \\(q\\), so there is at least one more sign change in \\(p\\) than in \\(q\\).\nAs such, var(p) \\(\\geq 1 +\\) var(q)."
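The quantity var(p) used throughout the proof above is easy to compute directly. Below is a minimal base-Julia sketch; the `signchanges` helper is our own illustration, not part of any package:

```julia
# Count sign changes in a coefficient sequence [p0, p1, ..., pn],
# skipping zero coefficients, as in Descartes' rule of signs.
function signchanges(coeffs)
    s = [sign(c) for c in coeffs if !iszero(c)]
    count(i -> s[i] != s[i+1], 1:length(s)-1)
end

# p = (x-1)(x-2) = 2 - 3x + x^2 has 2 positive roots and 2 sign changes:
signchanges([2, -3, 1])   # → 2
```

This is consistent with the parity statement: here \\(p_0 = 2 > 0\\), and both var(p) and pos(p) are even.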
},
{
"objectID": "precalc/polynomials_package.html",
"href": "precalc/polynomials_package.html",
"title": "13  The Polynomials package",
"section": "",
"text": "This section will use the following add-on packages:\nWhile SymPy can be used to represent polynomials, there are also native Julia packages available for this and related tasks. These packages include Polynomials, MultivariatePolynomials, and AbstractAlgebra, among many others. (A search on juliahub.com found over \\(50\\) packages matching “polynomial”.) We will look at the Polynomials package in the following, as it is straightforward to use and provides the features we are looking at for univariate polynomials."
},
{
"objectID": "precalc/polynomials_package.html#construction",
"href": "precalc/polynomials_package.html#construction",
"title": "13  The Polynomials package",
"section": "13.1 Construction",
"text": "13.1 Construction\nThe polynomial expression \\(p = a_0 + a_1\\cdot x + a_2\\cdot x^2 + \\cdots + a_n\\cdot x^n\\) can be viewed mathematically as a vector of numbers with respect to some “basis”, which for standard polynomials, as above, is just the set of monomials, \\(1, x, x^2, \\dots, x^n\\). With this viewpoint, the polynomial \\(p\\) can be identified with the vector [a0, a1, a2, ..., an]. The Polynomials package provides a wrapper for such an identification through the Polynomial constructor. We have previously loaded this add-on package.\nTo illustrate, the polynomial \\(p = 3 + 4x + 5x^2\\) is constructed with\n\np = Polynomial([3,4,5])\n\n3 + 4∙x + 5∙x2\n\n\nwhere the vector [3,4,5] represents the coefficients. The polynomial \\(q = 3 + 5x^2 + 7x^4\\) has some coefficients that are \\(0\\), these too must be indicated on construction, so we would have:\n\nq = Polynomial([3,0,5,0,7])\n\n3 + 5∙x2 + 7∙x4\n\n\nThe coeffs function undoes Polynomial, returning the coefficients from a Polynomial object.\n\ncoeffs(q)\n\n5-element Vector{Int64}:\n 3\n 0\n 5\n 0\n 7\n\n\nOnce defined, the usual arithmetic operations for polynomials follow:\n\np + q\n\n6 + 4∙x + 10∙x2 + 7∙x4\n\n\n\np*q + p^2\n\n18 + 36∙x + 76∙x2 + 60∙x3 + 71∙x4 + 28∙x5 + 35∙x6\n\n\nA polynomial has several familiar methods, such as degree:\n\ndegree(p), degree(q)\n\n(2, 4)\n\n\nThe zero polynomial has degree -1, by convention.\nPolynomials may be evaluated using function notation, that is:\n\np(1)\n\n12\n\n\nThis blurs the distinction between a polynomial expression a formal object consisting of an indeterminate, coefficients, and the operations of addition, subtraction, multiplication, and non-negative integer powers and a polynomial function.\nThe polynomial variable, in this case 1x, can be returned by variable:\n\nx = variable(p)\n\nx\n\n\nThis variable is a Polynomial object, so can be manipulated as a polynomial; we can then construct polynomials through expressions like:\n\nr = 
(x-2)^3 * (x-1) * (x+1)\n\n8 - 12∙x - 2∙x2 + 11∙x3 - 6∙x4 + x5\n\n\nThe product is expanded for storage by Polynomials, which may not be desirable for some uses. A new variable can be produced by calling variable(); so we could have constructed p by:\n\nx = variable()\n3 + 4x + 5x^2\n\n3 + 4∙x + 5∙x2\n\n\nA polynomial in factored form, as r above is, can be constructed from its roots. For example, passing the roots \\(2\\) (twice), \\(1\\), and \\(-1\\) as a vector to fromroots produces the corresponding polynomial:\n\nfromroots([2,2,1,-1])\n\n-4 + 4∙x + 3∙x2 - 4∙x3 + x4\n\n\nThe fromroots function is basically the factor theorem, which links the factored form of the polynomial with the roots of the polynomial: \\((x-k)\\) is a factor of \\(p\\) if and only if \\(k\\) is a root of \\(p\\). By combining a factor of the type \\((x-k)\\) for each specified root, the polynomial can be constructed by multiplying its factors. For example, using prod and a generator, we would have:\n\nx = variable()\nprod(x - k for k in [2,2,1,-1])\n\n-4 + 4∙x + 3∙x2 - 4∙x3 + x4\n\n\nThe Polynomials package has different ways to represent polynomials, and a factored form can also be used. For example, the fromroots function constructs polynomials from the specified roots and FactoredPolynomial leaves these in a factored form:\n\nfromroots(FactoredPolynomial, [2, 2, 1, -1])\n\n(x - 2)² * (x + 1) * (x - 1)\n\n\nThis form is helpful for some operations, for example polynomial multiplication and positive integer exponentiation, but not others, such as addition of polynomials, where such polynomials must first be converted to the standard basis to add and are then converted back into a factored form.\n\nThe indeterminate, or polynomial symbol, is a related, but different, concept to variable. Polynomials are stored as a collection of coefficients, an implicit basis, and a symbol; in the above this symbol is :x. 
A polynomials symbol is checked to ensure that polynomials with different symbols are not algebraically combined, except for the special case of constant polynomials. The symbol is specified through a second argument on construction:\n\ns = Polynomial([1,2,3], \"t\")\n\n1 + 2∙t + 3∙t2\n\n\nAs r uses “x” and s uses “t”, the two cannot be added, say:\n\nr + s\n\nLoadError: ArgumentError: Polynomials have different indeterminates"
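The identification of a coefficient vector with a polynomial function can also be sketched without the package, using Horners rule in base Julia; the `horner` helper below is our own illustration, not from Polynomials:

```julia
# Evaluate the polynomial with coefficients [a0, a1, ..., an] at x
# using Horner's rule; mirrors how Polynomial([3,4,5]) evaluates p(1).
horner(coeffs, x) = foldr((a, acc) -> a + x * acc, coeffs; init = zero(x))

horner([3, 4, 5], 1)   # → 12, i.e. 3 + 4·1 + 5·1²
```

This makes explicit what the `Polynomial` wrapper stores: just the coefficient vector, with evaluation defined on top of it.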
},
{
"objectID": "precalc/polynomials_package.html#graphs",
"href": "precalc/polynomials_package.html#graphs",
"title": "13  The Polynomials package",
"section": "13.2 Graphs",
"text": "13.2 Graphs\nPolynomial objects have a plot recipe defined plotting from the Plots package should be as easy as calling plot:\n\nplot(r, legend=false) # suppress the legend\n\n\n\n\nThe choice of domain is heuristically identified; it and can be manually adjusted, as with:\n\nplot(r, 1.5, 2.5, legend=false)"
},
{
"objectID": "precalc/polynomials_package.html#roots",
"href": "precalc/polynomials_package.html#roots",
"title": "13  The Polynomials package",
"section": "13.3 Roots",
"text": "13.3 Roots\nThe default plot recipe checks to ensure the real roots of the polynomial are included in the domain of the plot. To do this, it must identify the roots. This is done numerically by the roots function, as in this example:\n\nx = variable()\np = x^5 - x - 1\nroots(p)\n\n5-element Vector{ComplexF64}:\n -0.7648844336005849 - 0.35247154603172626im\n -0.7648844336005849 + 0.35247154603172626im\n 0.18123244446987605 - 1.0839541013177107im\n 0.18123244446987605 + 1.0839541013177107im\n 1.1673039782614187 + 0.0im\n\n\nA consequence of the fundamental theorem of algebra and the factor theorem is that any fifth degree polynomial with integer coefficients has \\(5\\) roots, where possibly some are complex. For real coefficients, these complex values must come in conjugate pairs, which can be observed from the output. The lone real root is approximately 1.1673039782614187. This value being a numeric approximation to the irrational root.\n\n\n\n\n\n\nNote\n\n\n\nSymPy also has a roots function. If both Polynomials and SymPy are used together, calling roots must be qualified, as with Polynomials.roots(...). Similarly, degree is provided in both, so it too must be qualified.\n\n\nThe roots function numerically identifies roots. As such, it is susceptible to floating point issues. For example, the following polynomial has one root with multiplicity \\(5\\), but \\(5\\) distinct roots are numerically identified:\n\nx = variable()\np = (x-1)^5\nroots(p)\n\n5-element Vector{ComplexF64}:\n 0.9990471550471702 + 0.0im\n 0.9997060762685409 - 0.0009060415877147721im\n 0.9997060762685409 + 0.0009060415877147721im\n 1.000770346207878 - 0.0005593476807788428im\n 1.000770346207878 + 0.0005593476807788428im\n\n\nThe Polynomials package has the multroot function to identify roots of polynomials when there are multiplicities expected. 
This function is not exported, so is called through:\n\nx = variable()\np = (x-1)^5\nPolynomials.Multroot.multroot(p)\n\n(values = [1.0], multiplicities = [5], κ = 0.1348399724926484, ϵ = 0.0)\n\n\nFloating point error can also prevent the finding of real roots. For example, this polynomial has \\(3\\) real roots, but roots finds but \\(1\\), as the two nearby ones are identified as complex:\n\nx = variable()\np = -1 + 254x - 16129x^2 + x^9\nroots(p)\n\n9-element Vector{ComplexF64}:\n -3.5980557124631396 - 1.7316513703738028im\n -3.5980557124631396 + 1.7316513703738028im\n -0.8903404519370821 - 3.8909853177372544im\n -0.8903404519370821 + 3.8909853177372544im\n 0.007874015748031492 - 2.1956827901729616e-10im\n 0.007874015748031492 + 2.1956827901729616e-10im\n 2.486125235439536 - 3.1203279635732852im\n 2.486125235439536 + 3.1203279635732852im\n 3.9887938264253098 + 0.0im\n\n\nThe RealPolynomialRoots package, loaded at the top of this section, can assist in the case of identifying real roots of square-free polynomials (no multiple roots). For example:\n\nps = coeffs(-1 + 254x - 16129x^2 + x^9)\nst = ANewDsc(ps)\nrefine_roots(st)\n\n3-element Vector{BigFloat}:\n 3.988793826425306473641427181104981832147640952968369063329423338980976674467138\n 0.007874015750717319072560781468357231530618080530596497294257518612423955993846073\n 0.007874015745345658111832733197085506951679614123256514007974427831118237687743416"
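As a quick, package-free check on the lone real root of \\(x^5 - x - 1\\) found above, a bisection iteration can be sketched in base Julia; the `bisect` helper is our own, and it assumes a sign change on the bracketing interval:

```julia
# Bisection sketch: halve a bracketing interval [a, b] with f(a)*f(b) < 0
# until it is tiny, returning the midpoint as an approximate real root.
function bisect(f, a, b; tol = 1e-12)
    fa = f(a)
    while b - a > tol
        m = (a + b) / 2
        if sign(f(m)) == sign(fa)
            a, fa = m, f(m)
        else
            b = m
        end
    end
    (a + b) / 2
end

bisect(x -> x^5 - x - 1, 1, 2)   # ≈ 1.1673039782614187, matching roots(p)
```

Being a simple sign-based method, bisection shares the limitation mentioned above: it cannot see real roots where the function touches zero without crossing.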
},
{
"objectID": "precalc/polynomials_package.html#fitting-a-polynomial-to-data",
"href": "precalc/polynomials_package.html#fitting-a-polynomial-to-data",
"title": "13  The Polynomials package",
"section": "13.4 Fitting a polynomial to data",
"text": "13.4 Fitting a polynomial to data\nThe fact that two distinct points determine a line is well known. Deriving the line is easy. Say we have two points \\((x_0, y_0)\\) and \\((x_1, y_1)\\). The slope is then\n\\[\nm = \\frac{y_1 - y_0}{x_1 - x_0}, \\quad x_1 \\neq x_0\n\\]\nThe line is then given from the point-slope form by, say, \\(y= y_0 + m\\cdot (x-x_0)\\). This all assumes, \\(x_1 \\neq x_0\\), as were that the case the slope would be infinite (though the vertical line \\(x=x_0\\) would still be determined).\nA line, \\(y=mx+b\\) can be a linear polynomial or a constant depending on \\(m\\), so we could say \\(2\\) points determine a polynomial of degree \\(1\\) or less. Similarly, \\(3\\) distinct points determine a degree \\(2\\) polynomial or less, \\(\\dots\\), \\(n+1\\) distinct points determine a degree \\(n\\) or less polynomial. Finding a polynomial, \\(p\\) that goes through \\(n+1\\) points (i.e., \\(p(x_i)=y_i\\) for each \\(i\\)) is called polynomial interpolation. The main theorem is:\n\nPolynomial interpolation theorem: There exists a unique polynomial of degree \\(n\\) or less that interpolates the points \\((x_0,y_0), (x_1,y_1), \\dots, (x_n, y_n)\\) when the \\(x_i\\) are distinct.\n\n(Uniqueness follows as suppose \\(p\\) and \\(q\\) satisfy the above, then \\((p-q)(x) = 0\\) at each of the \\(x_i\\) and is of degree \\(n\\) or less, so must be the \\(0\\) polynomial. Existence comes by construction. See the Lagrange basis in the questions.)\nKnowing we can succeed, we approach the problem of \\(3\\) points, say \\((x_0, y_0)\\), \\((x_1,y_1)\\), and \\((x_2, y_2)\\). There is a polynomial \\(p = a\\cdot x^2 + b\\cdot x + c\\) with \\(p(x_i) = y_i\\). 
This gives \\(3\\) equations for the \\(3\\) unknown values \\(a\\), \\(b\\), and \\(c\\):\n\\[\n\\begin{align*}\na\\cdot x_0^2 + b\\cdot x_0 + c &= y_0\\\\\na\\cdot x_1^2 + b\\cdot x_1 + c &= y_1\\\\\na\\cdot x_2^2 + b\\cdot x_2 + c &= y_2\\\\\n\\end{align*}\n\\]\nSolving this with SymPy is tractable. A comprehension is used below to create the \\(3\\) equations; the zip function is a simple means to iterate over \\(2\\) or more iterables simultaneously:\n\nSymPy.@syms a b c xs[0:2] ys[0:2]\neqs = [a*xi^2 + b*xi + c ~ yi for (xi,yi) in zip(xs, ys)]\nabc = SymPy.solve(eqs, [a,b,c])\n\nDict{Any, Any} with 3 entries:\n a => (-xs₀*ys₁ + xs₀*ys₂ + xs₁*ys₀ - xs₁*ys₂ - xs₂*ys₀ + xs₂*ys₁)/(xs₀^2*xs₁ …\n c => (xs₀^2*xs₁*ys₂ - xs₀^2*xs₂*ys₁ - xs₀*xs₁^2*ys₂ + xs₀*xs₂^2*ys₁ + xs₁^2*x…\n b => (xs₀^2*ys₁ - xs₀^2*ys₂ - xs₁^2*ys₀ + xs₁^2*ys₂ + xs₂^2*ys₀ - xs₂^2*ys₁)/…\n\n\nAs can be seen, the terms do get quite unwieldy when treated symbolically. Numerically, the fit function from the Polynomials package will return the interpolating polynomial. To compare,\n\nfit(Polynomial, [1,2,3], [3,1,2])\n\n8.0 - 6.5∙x + 1.5∙x2\n\n\nand we can compare that the two give the same answer with, for example:\n\nabc[b]((xs .=> [1,2,3])..., (ys .=> [3,1,2])...)\n\n-13/2\n\n\n(Ignore the tricky way of substituting in each value of xs and ys for the symbolic values in x and y.)\n\nExample Inverse quadratic interpolation\nA related problem, that will arise when finding iterative means to solve for zeros of functions, is inverse quadratic interpolation. That is finding \\(q\\) that goes through the points \\((x_0,y_0), (x_1, y_1), \\dots, (x_n, y_n)\\) satisfying \\(q(y_i) = x_i\\). (That is \\(x\\) and \\(y\\) are reversed, as with inverse functions.) For the envisioned task, where the inverse quadratic function intersects the \\(x\\) axis is of interest, which is at the constant term of the polynomial (as it is like the \\(y\\) intercept of typical polynomial). 
Lets see what that is in general by replicating the above steps (though now the assumption is the \\(y\\) values are distinct):\n\nSymPy.@syms a b c xs[0:2] ys[0:2]\neqs = [a*yi^2 + b*yi + c ~ xi for (xi, yi) in zip(xs,ys)]\nabc = SymPy.solve(eqs, [a,b,c])\nabc[c]\n\n 2 2 2 2 2 \nxs₀⋅ys₁ ⋅ys₂ - xs₀⋅ys₁⋅ys₂ - xs₁⋅ys₀ ⋅ys₂ + xs₁⋅ys₀⋅ys₂ + xs₂⋅ys₀ ⋅ys₁ - xs₂\n──────────────────────────────────────────────────────────────────────────────\n 2 2 2 2 2 2 \n ys₀ ⋅ys₁ - ys₀ ⋅ys₂ - ys₀⋅ys₁ + ys₀⋅ys₂ + ys₁ ⋅ys₂ - ys₁⋅ys₂ \n\n 2\n⋅ys₀⋅ys₁ \n─────────\n \n \n\n\nWe can graphically see the result for the specific values of xs and ys as follows:"
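The interpolation described in this section can also be sketched directly in base Julia using the Lagrange form; the `lagrange` helper below is our own illustration, not the packages `fit`:

```julia
# Evaluate, at t, the unique degree ≤ n polynomial through (xs[i], ys[i])
# using the Lagrange basis: p(t) = Σ ys[i] · l_i(t).
function lagrange(xs, ys, t)
    n = length(xs)
    sum(ys[i] * prod((t - xs[j]) / (xs[i] - xs[j]) for j in 1:n if j != i)
        for i in 1:n)
end

lagrange([1, 2, 3], [3, 1, 2], 2)   # → 1.0, reproducing the data point
```

Evaluating at the nodes returns the given `ys` values; evaluating elsewhere agrees with the polynomial returned by `fit(Polynomial, [1,2,3], [3,1,2])` above.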
},
{
"objectID": "precalc/polynomials_package.html#questions",
"href": "precalc/polynomials_package.html#questions",
"title": "13  The Polynomials package",
"section": "13.5 Questions",
"text": "13.5 Questions\n\nQuestion\nDo the polynomials \\(p = x^4\\) and \\(q = x^2 - 2\\) intersect?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nDo the polynomials \\(p = x^4-4\\) and \\(q = x^2 - 2\\) intersect?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nHow many real roots does \\(p = 1 + x + x^2 + x^3 + x^4 + x^5\\) have?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nMathematically we say the \\(0\\) polynomial has no degree. What convention does Polynomials use? (Look at degree(zero(Polynomial)).)\n\n\n\n \n \n \n \n \n \n \n \n \n nothing\n \n \n\n\n \n \n \n \n -1\n \n \n\n\n \n \n \n \n 0\n \n \n\n\n \n \n \n \n Inf\n \n \n\n\n \n \n \n \n -Inf\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the polynomial \\(p(x) = a_1 x - a_3 x^3 + a_5 x^5\\) where\n\\[\n\\begin{align*}\na_1 &= 4(\\frac{3}{\\pi} - \\frac{9}{16}) \\\\\na_3 &= 2a_1 -\\frac{5}{2}\\\\\na_5 &= a_1 - \\frac{3}{2}.\n\\end{align*}\n\\]\n\nForm the polynomial p by first computing the \\(a\\)s and forming p=Polynomial([0,a1,0,-a3,0,a5])\nForm the polynomial q by these commands x=variable(); q=p(2x/pi)\n\nThe polynomial q, a \\(5\\)th-degree polynomial, is a good approximation of for the sine function.\nMake graphs of both q and sin. Over which interval is the approximation (visually) a good one?\n\n\n\n \n \n \n \n \n \n \n \n \n \\([0,1]\\)\n \n \n\n\n \n \n \n \n \\([0,\\pi]\\)\n \n \n\n\n \n \n \n \n \\([0,2\\pi]\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n(This blog post shows how this approximation is valuable under some specific circumstances.)\n\n\nQuestion\nThe polynomial\n\nfromroots([1,2,3,3,5])\n\n-90 + 213∙x - 184∙x2 + 74∙x3 - 14∙x4 + x5\n\n\nhas \\(5\\) sign changes and \\(5\\) real roots. For x = variable() use div(p, x-3) to find the result of dividing \\(p\\) by \\(x-3\\). 
How many sign changes are there in the new polynomial?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe identification of a collection of coefficients with a polynomial depends on an understood basis. A basis for the polynomials of degree \\(n\\) or less consists of a minimal collection of polynomials for which all the polynomials of degree \\(n\\) or less can be expressed through a combination of sums of terms, each of which is just a coefficient times a basis member. The typical basis is the \\(n+1\\) polynomials \\(1, x, x^2, \\dots, x^n\\). However, though every basis must have \\(n+1\\) members, they need not be these.\nA basis used by Lagrange is the following. Let there be \\(n+1\\) distinct points \\(x_0, x_1, \\dots, x_n\\). For each \\(i\\) in \\(0\\) to \\(n\\) define\n\\[\nl_i(x) = \\prod_{0 \\leq j \\leq n; j \\ne i} \\frac{x-x_j}{x_i - x_j} =\n\\frac{(x-x_0)\\cdot(x-x_1)\\cdot \\cdots \\cdot (x-x_{i-1}) \\cdot (x-x_{i+1}) \\cdot \\cdots \\cdot (x-x_n)}{(x_i-x_0)\\cdot(x_i-x_1)\\cdot \\cdots \\cdot (x_i-x_{i-1}) \\cdot (x_i-x_{i+1}) \\cdot \\cdots \\cdot (x_i-x_n)}.\n\\]\nThat is, \\(l_i(x)\\) is a product of terms like \\((x-x_j)/(x_i-x_j)\\) except when \\(j=i\\).\nWhat is the value of \\(l_0(x_0)\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhy?\n\n\n\n \n \n \n \n \n \n \n \n \n All terms like \\((x-x_j)/(x_0 - x_j)\\) will be \\(1\\) when \\(x=x_0\\) and these are all the terms in the product defining \\(l_0\\).\n \n \n\n\n \n \n \n \n The term \\((x_0-x_0)\\) will be \\(0\\), so the product will be zero\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the value of \\(l_i(x_i)\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the value of \\(l_0(x_1)\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhy?\n\n\n\n \n \n \n \n \n \n \n \n \n The term \\((x-x_1)/(x_0-x_1)\\) is omitted from the product, so the answer is non-zero.\n \n \n\n\n \n \n \n \n The term 
like \\((x-x_1)/(x_0 - x_1)\\) will be \\(0\\) when \\(x=x_1\\) and so the product will be \\(0\\).\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the value of \\(l_i(x_j)\\) if \\(i \\ne j\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nSuppose the \\(x_0, x_1, \\dots, x_n\\) are the \\(x\\) coordinates of \\(n\\) distinct points \\((x_0,y_0)\\), \\((x_1, y_1), \\dots, (x_n,y_n).\\) Form the polynomial with the above basis and coefficients being the \\(y\\) values. That is consider:\n\\[\np(x) = \\sum_{i=0}^n y_i l_i(x) = y_0l_0(x) + y_1l_1(x) + \\dots + y_nl_n(x)\n\\]\nWhat is the value of \\(p(x_j)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(0\\)\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \\(y_j\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThis last answer is why \\(p\\) is called an interpolating polynomial and this question shows an alternative way to identify interpolating polynomials from solving a system of linear equations.\n\n\nQuestion\nThe Chebyshev (\\(T\\)) polynomials are polynomials which use a different basis from the standard basis. Denote the basis elements \\(T_0\\), \\(T_1\\), … where we have \\(T_0(x) = 1\\), \\(T_1(x) = x\\), and for bigger indices \\(T_{i+1}(x) = 2xT_i(x) - T_{i-1}(x)\\). 
The next few are then:\n\\[\n\\begin{align*}\nT_2(x) &= 2xT_1(x) - T_0(x) = 2x^2 - 1\\\\\nT_3(x) &= 2xT_2(x) - T_1(x) = 2x(2x^2-1) - x = 4x^3 - 3x\\\\\nT_4(x) &= 2xT_3(x) - T_2(x) = 2x(4x^3-3x) - (2x^2-1) = 8x^4 - 8x^2 + 1\n\\end{align*}\n\\]\nWith these definitions, what is the polynomial associated to the coefficients \\([0,1,2,3]\\) with this basis?\n\n\n\n \n \n \n \n \n \n \n \n \n It is \\(0\\cdot 1 + 1 \\cdot x + 2 \\cdot x^2 + 3\\cdot x^3 = x + 2x^2 + 3x^3\\)\n \n \n\n\n \n \n \n \n It is \\(0\\cdot T_0(x) + 1\\cdot T_1(x) + 2\\cdot T_2(x) + 3\\cdot T_3(x) = -2 - 8\\cdot x + 4\\cdot x^2 + 12\\cdot x^3\\)\n \n \n\n\n \n \n \n \n It is \\(0\\cdot T_0(x) + 1\\cdot T_1(x) + 2\\cdot T_2(x) + 3\\cdot T_3(x) = 0\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\n\n\n\n\nNote\n\n\n\nThe Polynomials package has an implementation, so you can check your answer through convert(Polynomial, ChebyshevT([0,1,2,3])). Similarly, the SpecialPolynomials package has these and many other polynomial bases represented.\nThe ApproxFun package is built on top of polynomials expressed in this basis, as the Chebyshev polynomials have special properties which make them very suitable when approximating functions with polynomials. The ApproxFun package uses easier-to-manipulate polynomials to approximate functions very accurately, thereby being useful for investigating properties of non-linear functions leveraging properties for polynomials."
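The three-term recurrence defining the Chebyshev polynomials is easy to compute with directly. A minimal base-Julia sketch (the `chebT` helper is our own, not the packages `ChebyshevT` type):

```julia
# Evaluate the Chebyshev polynomial T_n at x via the recurrence
# T_0 = 1, T_1 = x, T_{i+1}(x) = 2x·T_i(x) - T_{i-1}(x).
function chebT(n, x)
    n == 0 && return one(x)
    n == 1 && return x
    Tprev, T = one(x), x
    for _ in 2:n
        Tprev, T = T, 2x * T - Tprev
    end
    T
end

chebT(3, 2.0)   # → 26.0, agreeing with 4·2³ − 3·2
```

Summing `c[i] * chebT(i, x)` over the coefficients reproduces the basis expansion asked about in the question above.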
},
{
"objectID": "precalc/rational_functions.html",
"href": "precalc/rational_functions.html",
"title": "14  Rational functions",
"section": "",
"text": "This section uses the following add-on packages:\nThe Polynomials package is “imported” to avoid naming collisions with SymPy; names will need to be qualified.\nA rational expression is the ratio of two polynomial expressions. Such expressions arise in many modeling situations. As many facts are known about polynomial expressions, much can be determined about rational expressions. This section covers some additional details that arise when graphing such expressions."
},
{
"objectID": "precalc/rational_functions.html#rational-functions",
"href": "precalc/rational_functions.html#rational-functions",
"title": "14  Rational functions",
"section": "14.1 Rational functions",
"text": "14.1 Rational functions\nThe rational numbers are simply ratios of integers, of the form \\(p/q\\) for non-zero \\(q\\). A rational function is a ratio of polynomial functions of the form \\(p(x)/q(x)\\), again \\(q\\) is non-zero, but may have zeros.\nWe know that polynomials have nice behaviors due to the following facts:\n\nBehaviors at \\(-\\infty\\), \\(\\infty\\) are known just from the leading term.\nThere are possible wiggles up and down, the exact behavior depends on intermediate terms, but there can be no more than \\(n-1\\) wiggles.\nThe number of real zeros is no more than \\(n\\), the degree of the polynomial.\n\nRational functions are not quite so nice:\n\nbehavior at \\(-\\infty\\) and \\(\\infty\\) can be like a polynomial of any degree, including constants\nbehaviour at any value x can blow up due to division by \\(0\\) - rational functions, unlike polynomials, need not be always defined\nThe function may or may not cross zero, even if the range includes every other point, as the graph of \\(f(x) =1/x\\) will show.\n\nHere, as with our discussion on polynomials, we are interested for now in just a few properties:\n\nWhat happens to \\(f(x)\\) when \\(x\\) gets really big or really small (towards \\(\\infty\\) or \\(-\\infty\\))?\nWhat happens near the values where \\(q(x) = 0\\)?\nWhen is \\(f(x) = 0\\)?\n\nThese questions can often be answered with a graph, but with rational functions we will see that care must be taken to produce a useful graph.\nFor example, consider this graph generated from a simple rational function:\n\\[\nf(x) = \\frac{(x-1)^2 \\cdot (x-2)}{(x+3) \\cdot (x-3)}.\n\\]\n\nf(x) = (x-1)^2 * (x-2) / ((x+3)*(x-3) )\nplot(f, -10, 10)\n\n\n\n\nWe would be hard pressed to answer any of the three questions above from the graph, though, on inspection, we might think the strange spikes have something to do with \\(x\\) values where \\(q(x)=0\\).\nThe question of big or small \\(x\\) is not answered well with this graph, as the 
spikes dominate the scale of the \\(y\\)-axis. Setting a much larger viewing window illuminates this question:\n\nplot(f, -100, 100)\n\n\n\n\nWe can see from this that the function eventually looks like a slanted straight line. The eventual shape of the graph is something that can be determined just from the two leading terms.\nThe spikes havent vanished completely. It is just that with only a few hundred points to make the graph, there arent any values near enough to the problem to make a large spike. The spikes happen because the function has a vertical asymptote at these values. Though not quite right, it is reasonable to think of the graph being made by selecting a few hundred points in the specified domain, computing the corresponding \\(y\\) values, plotting the pairs, and finally connecting the points with straight line segments. Near a vertical asymptote the function values can be arbitrarily large in absolute value, though at the vertical asymptote the function is undefined. This graph doesnt show such detail.\nThe spikes will be related to the points where \\(q(x) = 0\\), though not necessarily all of them; not all such points will produce a vertical asymptote.\nWhere the function crosses \\(0\\) is very hard to tell from these two graphs. As well, other finer features, such as local peaks or valleys, when present, can be hard to identify as the \\(y\\)-scale is set to accommodate the asymptotes. Working around the asymptotes requires some extra effort. Strategies are discussed herein."
},
{
"objectID": "precalc/rational_functions.html#asymptotes",
"href": "precalc/rational_functions.html#asymptotes",
"title": "14  Rational functions",
"section": "14.2 Asymptotes",
"text": "14.2 Asymptotes\nFormally, an asymptote of a curve is a line such that the distance between the curve and the line approaches \\(0\\) as they tend to infinity. Tending to infinity can happen as \\(x \\rightarrow \\pm \\infty\\) or \\(y \\rightarrow \\pm \\infty\\), the former being related to horizontal asymptotes or slant asymptotes, the latter being related to vertical asymptotes.\n\n14.2.1 Behaviour as \\(x \\rightarrow \\infty\\) or \\(x \\rightarrow -\\infty\\).\nLets look more closely at our example rational function using symbolic math.\nIn particular, lets rewrite the expression in terms of its numerator and denominator:\n\n@syms x::real\nnum = (x-1)^2 * (x-2)\nden = (x+3) * (x-3)\n\n \n\\[\n\\left(x - 3\\right) \\left(x + 3\\right)\n\\]\n\n\n\nEuclids division algorithm can be used for polynomials \\(a(x)\\) and \\(b(x)\\) to produce \\(q(x)\\) and \\(r(x)\\) with \\(a = b\\cdot q + r\\) and the degree of \\(r(x)\\) is less than the degree of \\(b(x)\\). This is in direct analogy to the division algorithm of integers, only there the value of the remainder, \\(r(x)\\), satisfies \\(0 \\leq r < b\\). 
Given \\(q(x)\\) and \\(r(x)\\) as above, we can reexpress the rational function\n\\[\n\\frac{a(x)}{b(x)} = q(x) + \\frac{r(x)}{b(x)}.\n\\]\nThe rational expression on the right-hand side has larger degree in the denominator.\nThe division algorithm is implemented in Julia generically through the divrem method:\n\nq, r = divrem(num, den)\n\n(x - 4, 14*x - 38)\n\n\nThis yields the decomposition of num/den:\n\nq + r/den\n\n \n\\[\nx - 4 + \\frac{14 x - 38}{\\left(x - 3\\right) \\left(x + 3\\right)}\n\\]\n\n\n\nA similar result can be found using the apart function, which can be easier to use if the expression is not given in terms of a separate numerator and denominator.\n\ng(x) = (x-1)^2 * (x-2) / ((x+3)*(x-3)) # as a function\nh = g(x) # a symbolic expression\napart(h)\n\n \n\\[\nx - 4 + \\frac{40}{3 \\left(x + 3\\right)} + \\frac{2}{3 \\left(x - 3\\right)}\n\\]\n\n\n\nThis decomposition breaks the rational expression into two pieces: \\(x-4\\) and \\(40/(3x+9) + 2/(3x-9)\\). The first piece would have a graph that is the line with slope \\(1\\) and \\(y\\)-intercept \\(4\\). As \\(x\\) goes to \\(\\infty\\), the second piece will clearly go towards \\(0,\\) as this simple graph shows:\n\nplot(apart(h) - (x - 4), 10, 100)\n\n\n\n\nSimilarly, a plot over \\([-100, -10]\\) would show decay towards \\(0\\), though in that case from below. Combining these two facts then, it is now no surprise that the graph of the rational function \\(f(x)\\) should approach a straight line, in this case \\(y=x-4\\) as \\(x \\rightarrow \\pm \\infty\\).\nWe can easily do most of this analysis without needing a computer or algebra. First, we should know the four eventual shapes of a polynomial, that the graph of \\(y=mx\\) is a line with slope \\(m\\), the graph of \\(y = c\\) is a constant line at height \\(c\\), and the graph of \\(y=c/x^m\\), \\(m > 0\\) will decay towards \\(0\\) as \\(x \\rightarrow \\pm\\infty\\). 
The latter should be clear, as \\(x^m\\) gets big, so its reciprocal goes towards \\(0\\).\nThe factored form, as the expression is presented, is a bit hard to work with; rather, we use the expanded form, which we get through the cancel function:\n\ncancel(h)\n\n \n\\[\n\\frac{x^{3} - 4 x^{2} + 5 x - 2}{x^{2} - 9}\n\\]\n\n\n\nWe can see that the numerator is of degree \\(3\\) and the denominator of degree \\(2\\). The leading terms are \\(x^3\\) and \\(x^2\\), respectively. If we were to pull those out we would get:\n\\[\n\\frac{x^3 \\cdot (1 - 4/x + 5/x^2 - 2/x^3)}{x^2 \\cdot (1 - 9/x^2)}.\n\\]\nThe terms \\((1 - 4/x + 5/x^2 - 2/x^3)\\) and \\((1 - 9/x^2)\\) go towards \\(1\\) as \\(x \\rightarrow \\pm \\infty\\), as each term with \\(x\\) goes towards \\(0\\). So the dominant term comes from the ratio of the leading terms, \\(x^3\\) and \\(x^2\\). This ratio is \\(x\\), so there will be an asymptote around a line with slope \\(1\\). (The fact that the asymptote is \\(y=x-4\\) takes a bit more work, as a division step is needed.)\nJust by looking at the ratio of the two leading terms, the behaviour as \\(x \\rightarrow \\pm \\infty\\) can be discerned. 
If this ratio is of:\n\nthe form \\(c x^m\\) with \\(m > 1\\) then the shape will follow the polynomial growth of the monomial \\(c x^m\\).\nthe form \\(c x^m\\) with \\(m=1\\) then there will be a line with slope \\(c\\) as a slant asymptote.\nthe form \\(cx^0\\) with \\(m=0\\) (or just \\(c\\)) then there will be a horizontal asymptote \\(y=c\\).\nthe form \\(c/x^{m}\\) with \\(m > 0\\) then there will be a horizontal asymptote \\(y=0\\), or the \\(x\\) axis.\n\nTo expand on the first point, where the degree of the numerator is greater than that of the denominator, we have from the division algorithm that if \\(a(x)\\) is the numerator and \\(b(x)\\) the denominator, then \\(a(x)/b(x) = q(x) + r(x)/b(x)\\) where the degree of \\(b(x)\\) is greater than the degree of \\(r(x)\\), so the right-most term will have a horizontal asymptote of \\(0\\). This says that the graph will eventually approach the graph of \\(q(x)\\), giving more detail than just saying it follows the shape of the leading term of \\(q(x)\\), at the expense of the work required to find \\(q(x)\\).\n\n\n14.2.2 Examples\nConsider the rational expression\n\\[\n\\frac{17x^5 - 300x^4 - 1/2}{x^5 - 2x^4 + 3x^3 - 4x^2 + 5}.\n\\]\nThe leading term of the numerator is \\(17x^5\\) and the leading term of the denominator is \\(x^5\\). The ratio is \\(17\\) (or \\(17x^0 = 17x^{5-5}\\)). As such, we would have a horizontal asymptote \\(y=17\\).\n\nIf we consider instead this rational expression:\n\\[\n\\frac{x^5 - 2x^4 + 3x^3 - 4x^2 + 5}{5x^4 + 4x^3 + 3x^2 + 2x + 1}\n\\]\nthen we can see that the ratio of the leading terms is \\(x^5 / (5x^4) = (1/5)x\\). We expect a slant asymptote with slope \\(1/5\\), though we would need to divide to see the exact intercept. 
This is found with, say:\n\np = (x^5 - 2x^4 + 3x^3 - 4x^2 + 5) / (5x^4 + 4x^3 + 3x^2 + 2x + 1)\nquo, rem = divrem(numerator(p), denominator(p)) # or apart(p)\nquo\n\n \n\\[\n\\frac{x}{5} - \\frac{14}{25}\n\\]\n\n\n\n\nThe rational function\n\\[\n\\frac{5x^3 + 6x^2 + 2}{x-1}\n\\]\nhas decomposition \\(5x^2 + 11x + 11 + 13/(x-1)\\):\n\ntop = 5x^3 + 6x^2 +2\nbottom = x-1\nquo, rem = divrem(top, bottom)\n\n(5*x^2 + 11*x + 11, 13)\n\n\nThe graph has nothing in common with the graph of the quotient for small \\(x\\):\n\nplot(top/bottom, -3, 3)\nplot!(quo, -3, 3)\n\n\n\n\nBut the graphs do match for large \\(x\\):\n\nplot(top/bottom, 5, 10)\nplot!(quo, 5, 10)\n\n\n\n\n\nFinally, consider this rational expression in factored form:\n\\[\n\\frac{(x-2)^3\\cdot(x-4)\\cdot(x-3)}{(x-5)^4 \\cdot (x-6)^2}.\n\\]\nBy looking at the powers we can see that the leading term of the numerator will be \\(x^5\\) and the leading term of the denominator \\(x^6\\). The ratio is \\(1/x^1\\). As such, we expect the \\(x\\)-axis (\\(y=0\\)) as a horizontal asymptote:\n\nPartial fractions\nThe apart function was useful to express a rational function in terms of a polynomial plus additional rational functions whose horizontal asymptotes are \\(0\\). This function computes the partial fraction decomposition of a rational function. Outside of the initial polynomial, this decomposition is a reexpression of a rational function into a sum of rational functions, where the denominators are irreducible, or unable to be further factored (non-trivially), and the numerators have lower degree than the denominator. 
Hence each such term has a horizontal asymptote of \\(0\\).\nTo see another example, we have:\n\np = (x-1)*(x-2)\nq = (x-3)^3 * (x^2 - x - 1)\napart(p/q)\n\n \n\\[\n\\frac{2 x - 1}{25 \\left(x^{2} - x - 1\\right)} - \\frac{2}{25 \\left(x - 3\\right)} + \\frac{1}{5 \\left(x - 3\\right)^{2}} + \\frac{2}{5 \\left(x - 3\\right)^{3}}\n\\]\n\n\n\nThe denominator, \\(q\\), has factors \\(x-3\\) and \\(x^2 - x - 1\\), each irreducible. The answer is expressed in terms of a sum of rational functions each with a denominator coming from one of these factors, possibly with a power.\n\n\n\n14.2.3 Vertical asymptotes\nAs just discussed, the graph of \\(1/x\\) will have a horizontal asymptote. However, it will also show a spike at \\(0\\):\n\nplot(1/x, -1, 1)\n\n\n\n\nAgain, this spike is an artifact of the plotting algorithm. The \\(y\\) values for \\(x\\)-values just smaller than \\(0\\) are large negative values and the \\(x\\) values just larger than \\(0\\) produce large, positive \\(y\\) values.\nThe two points with \\(x\\) components closest to \\(0\\) are connected with a line, though that is misleading. Here we deliberately use far fewer points to plot \\(1/x\\) to show how this happens:\n\nf(x) = 1/x\nxs = range(-1, 1, length=12)\nys = f.(xs)\nplot(xs, ys)\nscatter!(xs, ys)\n\n\n\n\nThe line \\(x = 0\\) is a vertical asymptote for the graph of \\(1/x\\). As \\(x\\) values get close to \\(0\\) from the right, the \\(y\\) values go towards \\(\\infty\\) and as the \\(x\\) values get close to \\(0\\) on the left, the \\(y\\) values go towards \\(-\\infty\\).\nThis has everything to do with the fact that \\(0\\) is a root of the denominator.\nFor a rational function \\(p(x)/q(x)\\), the roots of \\(q(x)\\) may or may not lead to vertical asymptotes. For a root \\(c\\), if \\(p(c)\\) is not zero, then the line \\(x=c\\) will be a vertical asymptote. 
If \\(c\\) is a root of both \\(p(x)\\) and \\(q(x)\\), then we can rewrite the expression as:\n\\[\n\\frac{p(x)}{q(x)} = \\frac{(x-c)^m r(x)}{(x-c)^n s(x)},\n\\]\nwhere both \\(r(c)\\) and \\(s(c)\\) are nonzero. Knowing \\(m\\) and \\(n\\) (the multiplicities of the root \\(c\\)) allows the following to be said:\n\nIf \\(m < n\\) then \\(x=c\\) will be a vertical asymptote.\nIf \\(m \\geq n\\) then \\(x=c\\) will not be a vertical asymptote. (The value \\(c\\) is known as a removable singularity.) In this case, the graph of \\(p(x)/q(x)\\) and the graph of \\((x-c)^{m-n}r(x)/s(x)\\) will differ, though very slightly, as the latter will include a value for \\(x=c\\), whereas \\(x=c\\) is not in the domain of \\(p(x)/q(x)\\).\n\nFinding the multiplicity may or may not be hard, but there is a very kludgy quick check that is often correct. With Julia, if you have a rational function for which f(c) evaluates to Inf or -Inf, then there will be a vertical asymptote. If the expression evaluates to NaN, more analysis is needed. (The value of 0/0 is NaN, whereas 1/0 is Inf.)\nFor example, the function \\(f(x) = ((x-1)^2 \\cdot (x-2)) / ((x+3) \\cdot(x-3))\\) has vertical asymptotes at \\(-3\\) and \\(3\\), as its graph illustrates. Without the graph we could see this as well:\n\nf(x) = (x-1)^2 * (x-2) / ((x+3)*(x-3) )\nf(3), f(-3)\n\n(Inf, -Inf)\n\n\n\nGraphing with vertical asymptotes\nAs seen in several graphs, the basic plotting algorithm does a poor job with vertical asymptotes. For example, it may erroneously connect their values with a steep vertical line, or the \\(y\\)-axis scale can get so large as to make reading the rest of the graph impossible. There are some tricks to work around this.\nConsider again the function \\(f(x) = ((x-1)^2 \\cdot (x-2)) / ((x+3) \\cdot(x-3))\\). Without much work, we can see that \\(x=3\\) and \\(x=-3\\) will be vertical asymptotes and there will be a slant asymptote with slope \\(1\\). 
How to graph this?\nWe can avoid the vertical asymptotes in our viewing window. For example, we could look at the area between the vertical asymptotes by plotting over \\((-2.9, 2.9)\\), say:\n\n𝒇(x) = (x-1)^2 * (x-2) / ((x+3)*(x-3) )\nplot(𝒇, -2.9, 2.9)\n\n\n\n\nThis backs off by \\(\\delta = 0.1\\). As \\(3 - 2.9\\) is \\(\\delta\\) and \\(1/\\delta\\) is 10, the \\(y\\) axis won't get too large, and indeed it doesn't.\nThis graph doesn't show well the two zeros at \\(x=1\\) and \\(x=2\\); for that, a narrower viewing window is needed. By successively panning throughout the interesting part of the graph, we can get a view of the function.\nWe can also clip the y axis. The plot function can be passed an argument ylims=(lo, hi) to limit which values are plotted. With this, we can have:\n\nplot(𝒇, -5, 5, ylims=(-20, 20))\n\n\n\n\nThis isn't ideal, as the large values are still computed; just the viewing window is clipped. This leaves the vertical asymptotes still affecting the graph.\nThere is another way: we could ask Julia to not plot \\(y\\) values that get too large. This is not a big request. If, when f(x) is large, we use NaN instead of its value, then the connect-the-dots algorithm will skip those values.\nThis was discussed in an earlier section where the rangeclamp function was introduced to replace large values of f(x) (in absolute value) with NaN.\n\nplot(rangeclamp(𝒇, 30), -25, 25) # rangeclamp is in the CalculusWithJulia package\n\n\n\n\nWe can see the general shape of \\(3\\) curves broken up by the vertical asymptotes: the two on the sides heading off towards the line \\(y = x-4\\) and the one in the middle. We still can't see the precise location of the zeros, but that wouldn't be the case with most graphs that show asymptotic behaviors. However, we can clearly tell where to “zoom in” should those be of interest.\n\n\n\n14.2.4 Sign charts\nWhen sketching graphs of rational functions by hand, it is useful to use sign charts. 
A sign chart of a function indicates when the function is positive, negative, \\(0\\), or undefined. It typically is represented along the lines of this one for \\(f(x) = x^3 - x\\):\n - 0 + 0 - 0 +\n< ----- -1 ----- 0 ----- 1 ----- >\nThe usual recipe for construction follows these steps:\n\nIdentify when the function is \\(0\\) or undefined. Place those values on a number line.\nIdentify “test points” within each implied interval (these are \\((-\\infty, -1)\\), \\((-1,0)\\), \\((0,1)\\), and \\((1, \\infty)\\) in the example) and check for the sign of \\(f(x)\\) at these test points. Write in -, +, 0, or *, as appropriate. The value comes from the fact that “continuous” functions may only change sign when they cross \\(0\\) or are undefined.\n\nWith the computer, where it is convenient to draw a graph, it might be better to emphasize the sign on the graph of the function. The sign_chart function from CalculusWithJulia does this by numerically identifying points where the function is \\(0\\) or \\(\\infty\\) and indicating the sign as \\(x\\) crosses over these points.\n\n\nsign_chart (generic function with 1 method)\n\n\n\nf(x) = x^3 - x\nsign_chart(f, -3/2, 3/2)\n\n3-element Vector{NamedTuple{(:∞0, :sign_change), Tuple{Float64, String}}}:\n (∞0 = -1.0, sign_change = \"- → +\")\n (∞0 = 0.0, sign_change = \"+ → -\")\n (∞0 = 1.0, sign_change = \"- → +\")"
},
{
"objectID": "precalc/rational_functions.html#pade-approximate",
"href": "precalc/rational_functions.html#pade-approximate",
"title": "14  Rational functions",
"section": "14.3 Pade approximate",
    "text": "14.3 Pade approximate\nOne area where rational functions are employed is in approximating functions. Later, the Taylor polynomial will be seen to be a polynomial that approximates a function well (where “well” will be described later). The Pade approximation is similar, though it uses a rational function of the form \\(p(x)/q(x)\\), where \\(q(0)=1\\) is customary.\nSome example approximations are\n\\[\n\\sin(x) \\approx \\frac{x - 7/60 \\cdot x^3}{1 + 1/20 \\cdot x^2}\n\\]\nand\n\\[\n\\tan(x) \\approx \\frac{x - 1/15 \\cdot x^3}{1 - 2/5 \\cdot x^2}\n\\]\nWe can look graphically at these approximations:\n\nsin_p(x) = (x - (7/60)*x^3) / (1 + (1/20)*x^2)\ntan_p(x) = (x - (1/15)*x^3) / (1 - (2/5)*x^2)\nplot(sin, -pi, pi)\nplot!(sin_p, -pi, pi)\n\n\n\n\n\nplot(tan, -pi/2 + 0.2, pi/2 - 0.2)\nplot!(tan_p, -pi/2 + 0.2, pi/2 - 0.2)"
},
{
"objectID": "precalc/rational_functions.html#the-polynomials-package-for-rational-functions",
"href": "precalc/rational_functions.html#the-polynomials-package-for-rational-functions",
"title": "14  Rational functions",
"section": "14.4 The Polynomials package for rational functions",
    "text": "14.4 The Polynomials package for rational functions\nIn the following, we import some functions from the Polynomials package. We avoided loading the entire namespace, as there are conflicts with SymPy. Here we import some useful functions and the Polynomial constructor:\nimport Polynomials: Polynomial, variable, lowest_terms, fromroots, coeffs\nThe Polynomials package has support for rational functions. The // operator can be used to create rational expressions:\n\n𝒙 = variable()\n𝒑 = (𝒙-1)*(𝒙-2)^2\n𝒒 = (𝒙-2)*(𝒙-3)\n𝒑𝒒 = 𝒑 // 𝒒\n\n(-4 + 8*x - 5*x^2 + x^3) // (6 - 5*x + x^2)\n\n\nA rational expression is a formal object; a rational function adds the viewpoint that this object will be evaluated by substituting values for the indeterminate. Rational expressions made within Polynomials are evaluated just like functions:\n\n𝒑𝒒(4) # p(4)/q(4)\n\n6.0\n\n\nThe rational expressions are not in lowest terms unless requested through the lowest_terms method:\n\nlowest_terms(𝒑𝒒)\n\n(1.999999999999998 - 2.9999999999999982*x + 0.9999999999999996*x^2) // (-2.9999999999999973 + 1.0*x)\n\n\nFor polynomials as simple as these, this computation is not a problem, but there is the very real possibility that the lowest-terms computation may be incorrect. Unlike SymPy, which factors symbolically, lowest_terms uses a numeric algorithm and does not, as would be done by hand or with SymPy, factor the polynomial and then cancel common factors.\nThe distinction between the two expressions is sometimes made; the initial expression is not defined at \\(x=2\\); the reduced one is, so the two are not identical when viewed as functions of the variable \\(x\\).\nRational expressions include polynomial expressions, just as the rational numbers include the integers. The identification there is to divide by \\(1\\), thinking of \\(3\\) as \\(3/1\\). 
In Julia, we would just use\n\n3//1\n\n3//1\n\n\nThe integer can be recovered from the rational number using numerator:\n\nnumerator(3//1)\n\n3\n\n\nSimilarly, we can divide a polynomial by the polynomial \\(1\\), which in Julia is returned by one(p), to produce a rational expression:\n\npp = 𝒑 // one(𝒑)\n\n(-4 + 8*x - 5*x^2 + x^3) // (1)\n\n\nAnd as with rational numbers, 𝒑 is recovered by numerator:\n\nnumerator(pp)\n\n-4 + 8∙x - 5∙x2 + x3\n\n\nOne difference is that the rational number 3//1 also represents other expressions, say 6/2 or 12/4, as Julia's rational numbers are presented in lowest terms, unlike the rational expressions in Polynomials.\nRational functions also have a plot recipe defined for them that attempts to ensure the basic features are identifiable. As previously discussed, a plot of a rational function can require some effort to avoid the values associated with vertical asymptotes taking up too many of the available vertical pixels in a graph.\nFor the rational expression 𝒑𝒒 above, we have from observation that \\(1\\) and \\(2\\) will be zeros and \\(x=3\\) a vertical asymptote. We also can identify a slant asymptote with slope \\(1\\). These are hinted at in this graph:\n\nplot(𝒑𝒒)\n\n\n\n\nTo better see the zeros, a plot over a narrower interval, say \\([0,2.5]\\), would be encouraged; to better see the slant asymptote, a plot over a wider interval, say \\([-10,10]\\), would be encouraged.\nFor one more example of the default plot recipe, we redo the graphing of the rational expression we earlier plotted with rangeclamp:\n\np,q = fromroots([1,1,2]), fromroots([-3,3])\nplot(p//q)\n\n\n\n\n\nExample: transformations of polynomials; real roots\nWe have seen some basic transformations of functions such as shifts and scales. 
For a polynomial expression we can implement these as follows, taking advantage of polynomial evaluation:\n\nx = variable()\np = 3 + 4x + 5x^2\na = 2\np(a*x), p(x+a) # scale, shift\n\n(Polynomial(3 + 8*x + 20*x^2), Polynomial(31 + 24*x + 5*x^2))\n\n\nA different polynomial transformation is inversion, or the mapping \\(x^d \\cdot p(1/x)\\) where \\(d\\) is the degree of \\(p\\). This will yield a polynomial, as perhaps this example will convince you:\n\np = Polynomial([1, 2, 3, 4, 5])\nd = Polynomials.degree(p) # degree is in SymPy and Polynomials, indicate which\npp = p // one(p)\nx = variable(pp)\nq = x^d * pp(1/x)\nlowest_terms(q)\n\n(5.0 + 4.0*x + 3.0*x^2 + 2.0*x^3 + 1.0*x^4) // (1.0)\n\n\nWe had to use a rational expression so that division by the variable was possible. The above indicates that the new polynomial, \\(q\\), is constructed from \\(p\\) by reversing the coefficients.\nInversion is like a funhouse mirror, flipping around parts of the polynomial. For example, the interval \\([1/4,1/2]\\) is related to the interval \\([2,4]\\). Of interest here, is that if \\(p(x)\\) had a root, \\(r\\), in \\([1/4,1/2]\\) then \\(q(x) = x^d \\cdot p(1/x)\\) would have a root in \\([2,4]\\) at \\(1/r\\).\nSo these three transformations scale, shift, and inversion can be defined for polynomials.\nCombined, the three can be used to create a Mobius transformation. 
For two values \\(a\\) and \\(b\\), consider the polynomial derived from \\(p\\) (again d=degree(p)) by:\n\\[\nq = (x+1)^d \\cdot p(\\frac{ax + b}{x + 1}).\n\\]\nHere is a non-performant implementation as a Julia function:\n\nfunction mobius_transformation(p, a, b)\n x = variable(p)\n p = p(x + a) # shift\n p = p((b-a)*x) # scale\n p = Polynomial(reverse(coeffs(p))) # invert\n p = p(x + 1) # shift\n p\nend\n\nmobius_transformation (generic function with 1 method)\n\n\nWe can verify this does what we want through example with the previously defined p:\n\n𝐩 = Polynomial([1, 2, 3, 4, 5])\n𝐪 = mobius_transformation(𝐩, 4, 6)\n\n7465 + 20280∙x + 20670∙x2 + 9368∙x3 + 1593∙x4\n\n\nAs contrasted with\n\na, b = 4, 6\n\npq = 𝐩 // one(𝐩)\nx = variable(pq)\nd = Polynomials.degree(𝐩)\nnumerator(lowest_terms( (x + 1)^2 * pq((a*x + b)/(x + 1))))\n\n7465.000000001552 + 20280.00000000283∙x + 20670.00000000175∙x2 + 9368.000000000367∙x3 + 1593.0000000000002∙x4\n\n\n\nNow, why is this of any interest?\nMobius transforms are used to map regions into other regions. In this special case, the transform \\(\\phi(x) = (ax + b)/(x + 1)\\) takes the interval \\([0,\\infty]\\) and sends it to \\([a,b]\\) (\\(0\\) goes to \\((a\\cdot 0 + b)/(0+1) = b\\), whereas \\(\\infty\\) goes to \\(ax/x \\rightarrow a\\)). Using this, if \\(p(u) = 0\\), with \\(q(x) = (x-1)^d p(\\phi(x))\\), then setting \\(u = \\phi(x)\\) we have \\(q(x) = (\\phi^{-1}(u)+1)^d p(\\phi(\\phi^{-1}(u))) = (\\phi^{-1}(u)+1)^d \\cdot p(u) = (\\phi^{-1}(u)+1)^d \\cdot 0 = 0\\). That is, a zero of \\(p\\) in \\([a,b]\\) will appear as a zero of \\(q\\) in \\([0,\\infty)\\) at \\(\\phi^{-1}(u)\\).\nThe Descartes rule of signs applied to \\(q\\) then will give a bound on the number of possible roots of \\(p\\) in the interval \\([a,b]\\). 
In the example we did, the Mobius transform for \\(a=4, b=6\\) is \\(15 - x - 11x^2 - 3x^3\\) with \\(1\\) sign change, so there must be exactly \\(1\\) real root of \\(p=(x-1)(x-3)(x-5)\\) in the interval \\([4,6]\\), as we can observe from the factored form of \\(p\\).\nSimilarly, we can see there are \\(2\\) or \\(0\\) roots for \\(p\\) in the interval \\([2,6]\\) by counting the two sign changes here:\n\nmobius_transformation(𝐩, 2,6)\n\n7465 + 10700∙x + 5790∙x2 + 1404∙x3 + 129∙x4\n\n\nThis observation, along with a detailed analysis provided by Kobel, Rouillier, and Sagraloff provides a means to find intervals that enclose the real roots of a polynomial.\nThe basic algorithm, as presented next, is fairly simple to understand, and hints at the bisection algorithm to come. It is due to Akritas and Collins. Suppose you know the only possible positive real roots are between \\(0\\) and \\(M\\) and no roots are repeated. Find the transformed polynomial over \\([0,M]\\):\n\nIf there are no sign changes, then there are no roots of \\(p\\) in \\([0,M]\\).\nIf there is one sign change, then there is a single root of \\(p\\) in \\([0,M]\\). The interval \\([0,M]\\) is said to isolate the root (and the actual root can then be found by other means)\nIf there is more than one sign change, divide the interval in two (\\([0,M/2]\\) and \\([M/2,M]\\), say) and apply the same consideration to each.\n\nEventually, mathematically this will find isolating intervals for each positive real root. 
(The negative ones can be similarly isolated.)\nApplying these steps to \\(p\\) with an initial interval, say \\([0,9]\\), we would have:\n\np = fromroots([1,3,5]) # (x-1)⋅(x-3)⋅(x-5) = -15 + 23*x - 9*x^2 + x^3\nmobius_transformation(p, 0, 9) # 3\nmobius_transformation(p, 0, 9//2) # 2\nmobius_transformation(p, 9//2, 9) # 1 (and done)\nmobius_transformation(p, 0, 9//4) # 1 (and done)\nmobius_transformation(p, 9//4, 9//2) # 1 (and done)\n\n-21//8 - 225//16∙x + 81//32∙x2 + 165//64∙x3\n\n\nSo the three roots (\\(1\\), \\(3\\), \\(5\\)) are isolated by \\([0, 9/4]\\), \\([9/4, 9/2]\\), and \\([9/2, 9]\\).\n\n\n14.4.1 The RealPolynomialRoots package.\nFor square-free polynomials, the RealPolynomialRoots package implements a basic version of the paper of Kobel, Rouillier, and Sagraloff to identify the real roots of a polynomial using the Descartes rule of signs and the Möbius transformations just described.\nThe ANewDsc function takes a collection of coefficients representing a polynomial and returns isolating intervals for each real root. For example:\n\np₀ = fromroots([1,3,5])\nst = ANewDsc(coeffs(p₀))\n\nThere were 3 isolating intervals found:\n[4.25…, 6.0…]₂₅₆\n[2.62…, 4.25…]₂₅₆\n[-0.5…, 2.62…]₂₅₆\n\n\nThese intervals can be refined to give accurate approximations to the roots:\n\nrefine_roots(st)\n\n3-element Vector{BigFloat}:\n 4.999999999999999999988812639274634601976233507362249045172904058207428900243547\n 3.000000000000000000006252262937500966174570607240785586677609011021020113396673\n 1.000000000000000000018681997535084761434897534348173150562760465656642061560433\n\n\nMore challenging problems can be readily handled by this package. 
The following polynomial\n\n𝒔 = Polynomial([0,1]) # also just variable(Polynomial{Int})\n𝒖 = -1 + 254*𝒔 - 16129*𝒔^2 + 𝒔^15\n\n-1 + 254∙x - 16129∙x2 + x15\n\n\nhas three real roots, two of which are clustered very close to each other:\n\n𝒔𝒕 = ANewDsc(coeffs(𝒖))\n\nThere were 3 isolating intervals found:\n[1.56…, 3.62…]₅₃\n[0.0078740157480314962595…, 0.0078740157480314988277…]₁₃₁\n[0.00787401574803149368282…, 0.0078740157480314962595…]₁₃₁\n\n\nand\n\nrefine_roots(𝒔𝒕)\n\n3-element Vector{BigFloat}:\n 2.10577422917648295433232280879150031024324852056247605949961474907154886438575\n 0.00787401574803149746184924478761025007324703047462819163289637077019245526555863\n 0.007874015748031494730205602008914819785754607338104103490190741092899528046885136\n\n\nThe SymPy package (sympy.real_roots) can accurately identify the three roots but it can take a very long time. The Polynomials.roots function from the Polynomials package identifies the cluster as complex valued. Though the implementation in RealPolynomialRoots doesnt handle such large polynomials, the authors of the algorithm have implementations that can quickly solve polynomials with degrees as high as \\(10,000\\)."
},
{
"objectID": "precalc/rational_functions.html#questions",
"href": "precalc/rational_functions.html#questions",
"title": "14  Rational functions",
"section": "14.5 Questions",
"text": "14.5 Questions\n\nQuestion\nThe rational expression \\((x^3 - 2x + 3) / (x^2 - x + 1)\\) would have\n\n\n\n \n \n \n \n \n \n \n \n \n A horizontal asymptote \\(y=0\\)\n \n \n\n\n \n \n \n \n A horizontal asymptote \\(y=1\\)\n \n \n\n\n \n \n \n \n A slant asymptote with slope \\(m=1\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe rational expression \\((x^2 - x + 1)/ (x^3 - 2x + 3)\\) would have\n\n\n\n \n \n \n \n \n \n \n \n \n A horizontal asymptote \\(y=0\\)\n \n \n\n\n \n \n \n \n A horizontal asymptote \\(y=1\\)\n \n \n\n\n \n \n \n \n A slant asymptote with slope \\(m=1\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe rational expression \\((x^2 - x + 1)/ (x^2 - 3x + 3)\\) would have\n\n\n\n \n \n \n \n \n \n \n \n \n A horizontal asymptote \\(y=0\\)\n \n \n\n\n \n \n \n \n A horizontal asymptote \\(y=1\\)\n \n \n\n\n \n \n \n \n A slant asymptote with slope \\(m=1\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe rational expression\n\\[\n\\frac{(x-1)\\cdot(x-2)\\cdot(x-3)}{(x-4)\\cdot(x-5)\\cdot(x-6)}\n\\]\nwould have\n\n\n\n \n \n \n \n \n \n \n \n \n A horizontal asymptote \\(y=1\\)\n \n \n\n\n \n \n \n \n A slant asymptote with slope \\(m=1\\)\n \n \n\n\n \n \n \n \n A horizontal asymptote \\(y=0\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe rational expression\n\\[\n\\frac{(x-1)\\cdot(x-2)\\cdot(x-3)}{(x-4)\\cdot(x-5)\\cdot(x-6)}\n\\]\nwould have\n\n\n\n \n \n \n \n \n \n \n \n \n A vertical asymptote \\(x=5\\)\n \n \n\n\n \n \n \n \n A slant asymptote with slope \\(m=1\\)\n \n \n\n\n \n \n \n \n A vertical asymptote \\(x=1\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe rational expression\n\\[\n\\frac{x^3 - 3x^2 + 2x}{3x^2 - 6x + 2}\n\\]\nhas a slant asymptote. 
What is the equation of that line?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(y = 3x\\)\n \n \n\n\n \n \n \n \n \\(y = (1/3)x - (1/3)\\)\n \n \n\n\n \n \n \n \n \\(y = (1/3)x\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLook at the graph of the function \\(f(x) = ((x-1)\\cdot(x-2)) / ((x-3)\\cdot(x-4))\\)\n\n\n\n\n\nIs the following common conception true: “The graph of a function never crosses its asymptotes.”\n\n\n\n \n \n \n \n \n \n \n \n \n Yes, this is true\n \n \n\n\n \n \n \n \n No, the graph clearly crosses the drawn asymptote\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n(The wikipedia page indicates that the term “asymptote” was introduced by Apollonius of Perga in his work on conic sections, but in contrast to its modern meaning, he used it to mean any line that does not intersect the given curve. It can sometimes take a while to change perception.)\n\n\nQuestion\nConsider the two graphs of \\(f(x) = 1/x\\) over \\([10,20]\\) and \\([100, 200]\\):\n\n\n\n\n\n\n\n\n\n\nThe two shapes are basically identical and do not look like straight lines. 
How does this reconcile with the fact that \\(f(x)=1/x\\) has a horizontal asymptote \\(y=0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n The graph is always decreasing, hence it will eventually reach \\(-\\infty\\).\n \n \n\n\n \n \n \n \n The \\(y\\)-axis scale shows that indeed the \\(y\\) values are getting close to \\(0\\).\n \n \n\n\n \n \n \n \n The horizontal asymptote is not a straight line.\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe amount of drug in a bloodstream after \\(t\\) hours is modeled by the rational function\n\\[\nr(t) = \\frac{50t^2}{t^3 + 20}, \\quad t \\geq 0.\n\\]\nWhat is the amount of the drug after \\(1\\) hour?\n\n\nr1 (generic function with 1 method)\n\n\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the amount of drug in the bloodstream after 24 hours?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is more accurate: the peak amount is\n\n\n\n \n \n \n \n \n \n \n \n \n between \\(16\\) and \\(24\\) hours\n \n \n\n\n \n \n \n \n between \\(8\\) and \\(16\\) hours\n \n \n\n\n \n \n \n \n after one day\n \n \n\n\n \n \n \n \n between \\(0\\) and \\(8\\) hours\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThis graph has\n\n\n\n \n \n \n \n \n \n \n \n \n a horizontal asymptote \\(y=20\\)\n \n \n\n\n \n \n \n \n a vertical asymptote with \\(x = 20^{1/3}\\)\n \n \n\n\n \n \n \n \n a horizontal asymptote \\(y=0\\)\n \n \n\n\n \n \n \n \n a slant asymptote with slope \\(50\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe (low-order) Pade approximation for \\(\\sin(x)\\) was seen to be \\((x - 7/60 \\cdot x^3)/(1 + 1/20 \\cdot x^2)\\). The graph showed that this approximation was fairly close over \\([-\\pi, \\pi]\\). 
Without graphing would you expect the behaviour of the function and its approximation to be similar for large values of \\(x\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhy?\n\n\n\n \n \n \n \n \n \n \n \n \n The \\(\\sin(x)\\) oscillates, but the rational function has a slant asymptote\n \n \n\n\n \n \n \n \n The \\(\\sin(x)\\) oscillates, but the rational function has a horizontal asymptote of \\(0\\)\n \n \n\n\n \n \n \n \n The \\(\\sin(x)\\) oscillates, but the rational function eventually follows \\(7/60 \\cdot x^3\\)\n \n \n\n\n \n \n \n \n The \\(\\sin(x)\\) oscillates, but the rational function has a non-zero horizontal asymptote"
},
{
"objectID": "precalc/exp_log_functions.html",
"href": "precalc/exp_log_functions.html",
"title": "15  Exponential and logarithmic functions",
"section": "",
    "text": "This section uses the following add-on packages:\nThe family of exponential functions is used to model growth and decay. The family of logarithmic functions is defined here as the inverse of the exponential functions, but their reach extends far outside of that."
},
{
"objectID": "precalc/exp_log_functions.html#exponential-functions",
"href": "precalc/exp_log_functions.html#exponential-functions",
"title": "15  Exponential and logarithmic functions",
"section": "15.1 Exponential functions",
"text": "15.1 Exponential functions\nThe family of exponential functions is defined by \\(f(x) = a^x, -\\infty< x < \\infty\\) and \\(a > 0\\). For \\(0 < a < 1\\) these functions decay or decrease, for \\(a > 1\\) the functions grow or increase, and if \\(a=1\\) the function is constantly \\(1\\).\nFor a given \\(a\\), defining \\(a^n\\) for positive integers is straightforward, as it means multiplying \\(n\\) copies of \\(a.\\) From this, for integer powers, the key properties of exponents: \\(a^x \\cdot a^y = a^{x+y}\\), and \\((a^x)^y = a^{x \\cdot y}\\) are immediate consequences. For example with \\(x=3\\) and \\(y=2\\):\n\\[\n\\begin{align*}\na^3 \\cdot a^2 &= (a\\cdot a \\cdot a) \\cdot (a \\cdot a) \\\\\n &= (a \\cdot a \\cdot a \\cdot a \\cdot a) \\\\\n &= a^5 = a^{3+2},\\\\\n(a^3)^2 &= (a\\cdot a \\cdot a) \\cdot (a\\cdot a \\cdot a)\\\\\n &= (a\\cdot a \\cdot a \\cdot a\\cdot a \\cdot a) \\\\\n &= a^6 = a^{3\\cdot 2}.\n\\end{align*}\n\\]\nFor \\(a \\neq 0\\), \\(a^0\\) is defined to be \\(1\\).\nFor positive, integer values of \\(n\\), we have by definition that \\(a^{-n} = 1/a^n\\).\nFor \\(n\\) a positive integer, we can define \\(a^{1/n}\\) to be the unique positive solution to \\(x^n=a\\).\nUsing the key properties of exponents we can extend this to a definition of \\(a^x\\) for any rational \\(x\\).\nDefining \\(a^x\\) for any real number requires some more sophisticated mathematics.\nOne method is to use a theorem that says a bounded monotonically increasing sequence will converge. (This uses the Completeness Axiom.) Then for \\(a > 1\\) we have if \\(q_n\\) is a sequence of rational numbers increasing to \\(x\\), then \\(a^{q_n}\\) will be a bounded sequence of increasing numbers, so will converge to a number defined to be \\(a^x\\). 
Something similar is possible for the \\(0 < a < 1\\) case.\nThis definition can be done to ensure the rules of exponents hold for \\(a > 0\\):\n\\[\na^{x + y} = a^x \\cdot a^y, \\quad (a^x)^y = a^{x \\cdot y}.\n\\]\nIn Julia these functions are implemented using ^. A special value of the base, \\(e\\), may be defined as well in terms of a limit. The exponential function \\(e^x\\) is implemented in exp.\n\nplot(x -> (1/2)^x, -2, 2, label=\"1/2\")\nplot!(x -> 1^x, label=\"1\")\nplot!(x -> 2^x, label=\"2\")\nplot!(x -> exp(x), label=\"e\")\n\n\n\n\nWe see examples of some general properties:\n\nThe domain is all real \\(x\\) and the range is all positive \\(y\\) (provided \\(a \\neq 1\\)).\nFor \\(0 < a < 1\\) the functions are monotonically decreasing.\nFor \\(a > 1\\) the functions are monotonically increasing.\nIf \\(1 < a < b\\) and \\(x > 0\\) we have \\(a^x < b^x\\).\n\n\nExample\nContinuously compounded interest allows an initial amount \\(P_0\\) to grow over time according to \\(P(t)=P_0e^{rt}\\). Investigate the difference between investing \\(1,000\\) dollars in an account which earns \\(2\\)% as opposed to an account which earns \\(8\\)% over \\(20\\) years.\nThe \\(r\\) in the formula is the interest rate, so \\(r=0.02\\) or \\(r=0.08\\). To compare the differences we have:\n\nr2, r8 = 0.02, 0.08\nP0 = 1000\nt = 20\nP0 * exp(r2*t), P0 * exp(r8*t)\n\n(1491.8246976412704, 4953.0324243951145)\n\n\nAs can be seen, there is quite a bit of difference.\nIn \\(1494\\), Pacioli gave the “Rule of \\(72\\)”, stating that to find the number of years it takes an investment to double when continuously compounded, one should divide the interest rate into \\(72\\).\nThis formula is not quite precise; as a rule of thumb, the number is closer to \\(69\\), but \\(72\\) has many divisors, which makes it an easy-to-compute approximation. 
Let's see how accurate it is:\n\nt2, t8 = 72/2, 72/8\nexp(r2*t2), exp(r8*t8)\n\n(2.0544332106438876, 2.0544332106438876)\n\n\nSo fairly close: after \\(72/r\\) years the amount is \\(2.05...\\) times the initial amount.\n\n\nExample\nBacterial growth (according to Wikipedia) is the asexual reproduction, or cell division, of a bacterium into two daughter cells, in a process called binary fission. During the log phase “the number of new bacteria appearing per unit time is proportional to the present population.” The article states that “Under controlled conditions, cyanobacteria can double their population four times a day…”\nFor an initial population of \\(P_0\\) bacteria, a formula for the number after \\(n\\) hours is \\(P(n) = P_0 2^{n/6}\\) where \\(6 = 24/4\\).\nAfter two days, what multiple of the initial amount is present if conditions are appropriate?\n\nn = 2 * 24\n2^(n/6)\n\n256.0\n\n\nThat would be an enormous growth. Don't worry: “Exponential growth cannot continue indefinitely, however, because the medium is soon depleted of nutrients and enriched with wastes.”\n\n\n\n\n\n\nNote\n\n\n\nThe values of 2^n and 2.0^n are different in Julia. The former remains an integer and is subject to integer overflow for n > 62. As used above, 2^(n/6) will not overflow for larger n, as when the exponent is a floating point value, the base is promoted to a floating point value.\n\n\n\n\nExample\nThe famous Fibonacci numbers are \\(1,1,2,3,5,8,13,\\dots\\), where \\(F_{n+1}=F_n+F_{n-1}\\). These numbers increase. To see how fast, if we guess that the growth is eventually exponential and assume \\(F_n \\approx c \\cdot a^n\\), then our equation is approximately \\(ca^{n+1} = ca^n + ca^{n-1}\\). Factoring out common terms gives \\(ca^{n-1} \\cdot (a^2 - a - 1) = 0\\). The term \\(a^{n-1}\\) is always positive, so any solution would satisfy \\(a^2 - a - 1 = 0\\). 
The positive solution is \\((1 + \\sqrt{5})/2 \\approx 1.618\\).\nThat is evidence that \\(F_n \\approx c\\cdot 1.618^n\\). (See Relation to golden ratio for a related, but more explicit exact formula.)\n\n\nExample\nIn the previous example, the exponential family of functions is used to describe growth. Polynomial functions also increase. Could these be used instead? If so, that would be great, as they are easier to reason about.\nThe key fact is that exponential growth is much greater than polynomial growth. That is, for large enough \\(x\\) and for any fixed \\(a>1\\) and positive integer \\(n\\) it is true that \\(a^x \\gg x^n\\).\nLater we will see an easy way to certify this statement.\n\n\nThe mathematical constant \\(e\\)\nEuler's number, \\(e\\), may be defined several ways. One way is to define \\(e^x\\) as the limit of \\((1+x/n)^n\\) as \\(n\\) goes to \\(\\infty\\). Then \\(e=e^1\\). The value is an irrational number. This number turns out to be the natural base to use for many problems arising in Calculus. In Julia there are a few mathematical constants that get special treatment, so that when needed, extra precision is available. The value e is not immediately assigned to this value, rather ℯ is. This is typed \\euler[tab]. The label e is thought too important for other uses to reserve the name for representing a single number. However, users can issue the command using Base.MathConstants and e will be available to represent this number. When the CalculusWithJulia package is loaded, the value e is defined to be the floating point number returned by exp(1). This loses the feature of arbitrary precision, but has other advantages.\nA cute appearance of \\(e\\) is in this problem: Let \\(a>0\\). Cut \\(a\\) into \\(n\\) equal pieces and then multiply them. What \\(n\\) will produce the largest value? 
Note that the formula is \\((a/n)^n\\) for a given \\(a\\) and \\(n\\).\nSuppose \\(a=5\\); then for \\(n=1,2,3\\) we get:\n\na = 5\n(a/1)^1, (a/2)^2, (a/3)^3\n\n(5.0, 6.25, 4.629629629629631)\n\n\nWe'd need to compare more, but at this point \\(n=2\\) is the winner when \\(a=5\\).\nWith calculus, we will be able to see that the function \\(f(x) = (a/x)^x\\) will be maximized at \\(x = a/e\\), but for now we approach this in an exploratory manner. Suppose \\(a=5\\), then we have:\n\na = 5\nn = 1:10\nf(n) = (a/n)^n\n@. [n f(n) (a/n - e)] # @. just allows broadcasting\n\n10×3 Matrix{Float64}:\n 1.0 5.0 2.28172\n 2.0 6.25 -0.218282\n 3.0 4.62963 -1.05162\n 4.0 2.44141 -1.46828\n 5.0 1.0 -1.71828\n 6.0 0.334898 -1.88495\n 7.0 0.0948645 -2.004\n 8.0 0.0232831 -2.09328\n 9.0 0.00504136 -2.16273\n 10.0 0.000976562 -2.21828\n\n\nWe can see more clearly that \\(n=2\\) gives the largest value of \\(f\\) and that \\(a/2\\) is the closest value to \\(e\\). This would be the case for any \\(a>0\\): pick \\(n\\) so that \\(a/n\\) is closest to \\(e\\).\n\n\nExample: The limits to growth\nThe \\(1972\\) book The limits to growth by Meadows et al. discusses the implications of exponential growth. It begins by stating their conclusion (emphasis added): “If the present growth trends in world population, industrialization, pollution, food production, and resource depletion continue unchanged, the limits to growth on this planet will be reached sometime in the next one hundred years.” They note it is possible to alter these growth trends. We are now halfway into this time period.\nLet's consider one of their examples, the concentration of carbon dioxide in the atmosphere. In their Figure \\(15\\) they show data from \\(1860\\) onward of CO\\(_2\\) concentration extrapolated out to the year \\(2000\\). At climate.gov we can see actual measurements from \\(1960\\) to \\(2020\\). 
Numbers read from each graph are plotted in the code below:\n\nco2_1970 = [(1860, 293), (1870, 293), (1880, 294), (1890, 295), (1900, 297),\n (1910, 298), (1920, 300), (1930, 303), (1940, 305), (1950, 310),\n (1960, 313), (1970, 320), (1980, 330), (1990, 350), (2000, 380)]\nco2_2021 = [(1960, 318), (1970, 325), (1980, 338), (1990, 358), (2000, 370),\n (2010, 390), (2020, 415)]\n\nxs,ys = unzip(co2_1970)\nplot(xs, ys, legend=false)\n\n𝒙s, 𝒚s = unzip(co2_2021)\nplot!(𝒙s, 𝒚s)\n\nr = 0.002\nx₀, P₀ = 1960, 313\nplot!(x -> P₀ * exp(r * (x - x₀)), 1950, 1990, linewidth=5, alpha=0.25)\n\n𝒓 = 0.005\n𝒙₀, 𝑷₀ = 2000, 370\nplot!(x -> 𝑷₀ * exp(𝒓 * (x - 𝒙₀)), 1960, 2020, linewidth=5, alpha=0.25)\n\n\n\n\n(The unzip function is from the CalculusWithJulia package and will be explained in a subsequent section.) We can see that the projections from the year \\(1970\\) hold up fairly well.\nOn this plot we added two exponential models: at \\(1960\\) we added a roughly \\(0.2\\) percent per year growth rate (a rate mentioned in an accompanying caption) and at \\(2000\\) a roughly \\(0.5\\) percent per year growth rate. The former barely keeps up with the data.\nThe word roughly above could be made exact. Suppose we knew that between \\(1960\\) and \\(1970\\) the concentration went from \\(313\\) to \\(320\\). If this followed an exponential model, then \\(r\\) above would satisfy:\n\\[\nP_{1970} = P_{1960} e^{r * (1970 - 1960)}\n\\]\nor on division \\(320/313 = e^{r\\cdot 10}\\). Solving for \\(r\\) can be done as explained next and yields \\(0.002211\\dots\\)."
},
{
"objectID": "precalc/exp_log_functions.html#logarithmic-functions",
"href": "precalc/exp_log_functions.html#logarithmic-functions",
"title": "15  Exponential and logarithmic functions",
"section": "15.2 Logarithmic functions",
"text": "15.2 Logarithmic functions\nAs the exponential functions are strictly decreasing when \\(0 < a < 1\\) and strictly increasing when \\(a>1\\), in both cases an inverse function will exist. (When \\(a=1\\) the function is a constant and is not one-to-one.) The domain of an exponential function is all real \\(x\\) and the range is all positive values, so these are switched around for the inverse function. Explicitly: the inverse function to \\(f(x)=a^x\\) will have domain \\((0,\\infty)\\) and range \\((-\\infty, \\infty)\\) when \\(a > 0, a \\neq 1\\).\nThe inverse function will solve for \\(x\\) in the equation \\(a^x = y\\). The answer, formally, is the logarithm base \\(a\\), written \\(\\log_a(x)\\).\nThat is, \\(a^{\\log_a(x)} = x\\) for \\(x > 0\\) and \\(\\log_a(a^x) = x\\) for all \\(x\\).\nSeeing how a logarithm is mathematically defined will have to wait, though the family of functions - one for each \\(a>0\\), \\(a \\neq 1\\) - is implemented in Julia through the function log(a,x). There are special cases requiring just one argument: log(x) will compute the natural log, base \\(e\\) - the inverse of \\(f(x) = e^x\\); log2(x) will compute the log base \\(2\\) - the inverse of \\(f(x) = 2^x\\); and log10(x) will compute the log base \\(10\\) - the inverse of \\(f(x)=10^x\\). 
(Also log1p computes an accurate value of \\(\\log(1 + p)\\) when \\(p \\approx 0\\).)\nTo see this in an example, we plot for base \\(2\\) the exponential function \\(f(x)=2^x\\), its inverse, and the logarithm function with base \\(2\\):\n\nf(x) = 2^x\nxs = range(-2, stop=2, length=100)\nys = f.(xs)\nplot(xs, ys, color=:blue, label=\"2ˣ\") # plot f\nplot!(ys, xs, color=:red, label=\"f⁻¹\") # plot f^(-1)\nxs = range(1/4, stop=4, length=100)\nplot!(xs, log2.(xs), color=:green, label=\"log₂\") # plot log2\n\n\n\n\nThough we made three graphs, only two are seen, as the graph of log2 matches that of the inverse function.\nNote that we needed a bit of care to plot the inverse function directly, as the domain of \\(f\\) is not the domain of \\(f^{-1}\\). Again, in this case the domain of \\(f\\) is all \\(x\\), but the domain of \\(f^{-1}\\) is only all positive \\(x\\) values.\nKnowing that log2 implements an inverse function allows us to solve many problems involving doubling.\n\nExample\nAn old story about doubling is couched in terms of doubling grains of wheat. To simplify the story, suppose each day an amount of grain is doubled. How many days of doubling will it take \\(1\\) grain to become \\(1\\) million grains?\nThe number of grains after one day is \\(2\\), two days is \\(4\\), three days is \\(8\\) and so after \\(n\\) days the number of grains is \\(2^n\\). To answer the question, we need to solve \\(2^x = 1,000,000\\). The logarithm function yields \\(20\\) days (after rounding up):\n\nlog2(1_000_000)\n\n19.931568569324174\n\n\n\n\nExample\nThe half-life of a radioactive material is the time it takes for half the material to decay. Different materials have quite different half lives with some quite long, and others quite short. See half lives for some details.\nThe carbon \\(14\\) isotope is a naturally occurring isotope on Earth, appearing in trace amounts. 
Unlike Carbon \\(12\\) and \\(13\\) it decays, in this case with a half-life of \\(5730\\) years (plus or minus \\(40\\) years). In a technique due to Libby, measuring the amount of Carbon 14 present in an organic item can indicate the time since death. The amount of Carbon \\(14\\) at death is essentially that of the atmosphere, and this amount decays over time. So, for example, if roughly half the carbon \\(14\\) remains, then the death occurred about \\(5730\\) years ago.\nA formula for the amount of carbon \\(14\\) remaining \\(t\\) years after death would be \\(P(t) = P_0 \\cdot 2^{-t/5730}\\).\nIf \\(1/10\\) of the original carbon \\(14\\) remains, how old is the item? This amounts to solving \\(2^{-t/5730} = 1/10\\). We have: \\(-t/5730 = \\log_2(1/10)\\) or:\n\n-5730 * log2(1/10)\n\n19034.647983704584\n\n\n\n\n\n\n\nNote\n\n\n\n(Historically) Libby and James Arnold proceeded to test the radiocarbon dating theory by analyzing samples with known ages. For example, two samples taken from the tombs of two Egyptian kings, Zoser and Sneferu, independently dated to \\(2625\\) BC plus or minus \\(75\\) years, were dated by radiocarbon measurement to an average of \\(2800\\) BC plus or minus \\(250\\) years. These results were published in Science in \\(1949\\). Within \\(11\\) years of their announcement, more than \\(20\\) radiocarbon dating laboratories had been set up worldwide. Source: Wikipedia.\n\n\n\n\n15.2.1 Properties of logarithms\nThe basic graphs of logarithms (\\(a > 1\\)) are all similar, though, as we see, larger bases lead to slower-growing functions; all satisfy \\(\\log_a(1) = 0\\):\n\nplot(log2, 1/2, 10, label=\"2\") # base 2\nplot!(log, 1/2, 10, label=\"e\") # base e\nplot!(log10, 1/2, 10, label=\"10\") # base 10\n\n\n\n\nNow, what do the properties of exponents imply about logarithms?\nConsider the sum \\(\\log_a(u) + \\log_a(v)\\). 
If we raise \\(a\\) to this power, we have using the powers of exponents and the inverse nature of \\(a^x\\) and \\(\\log_a(x)\\) that:\n\\[\na^{\\log_a(u) + \\log_a(v)} = a^{\\log_a(u)} \\cdot a^{\\log_a(v)} = u \\cdot v.\n\\]\nTaking \\(\\log_a\\) of both sides yields \\(\\log_a(u) + \\log_a(v)=\\log_a(u\\cdot v)\\). That is logarithms turn products into sums (of logs).\nSimilarly, the relation \\((a^{x})^y =a^{x \\cdot y}, a > 0\\) can be used to see that \\(\\log_a(b^x) = x \\cdot\\log_a(b)\\). This follows, as applying \\(a^x\\) to each side yields the same answer.\nDue to inverse relationship between \\(a^x\\) and \\(\\log_a(x)\\) we have:\n\\[\na^{\\log_a(b^x)} = b^x.\n\\]\nDue to the rules of exponents, we have:\n\\[\na^{x \\log_a(b)} = a^{\\log_a(b) \\cdot x} = (a^{\\log_a(b)})^x = b^x.\n\\]\nFinally, since \\(a^x\\) is one-to-one (when \\(a>0\\) and \\(a \\neq 1\\)), if \\(a^{\\log_a(b^x)}=a^{x \\log_a(b)}\\) it must be that \\(\\log_a(b^x) = x \\log_a(b)\\). That is, logarithms turn powers into products.\nFinally, we use the inverse property of logarithms and powers to show that logarithms can be defined for any base. Say \\(a, b > 0\\). Then \\(\\log_a(x) = \\log_b(x)/\\log_b(a)\\). Again, to verify this we apply \\(a^x\\) to both sides to see we get the same answer:\n\\[\na^{\\log_a(x)} = x,\n\\]\nthis by the inverse property. 
Whereas, by expressing \\(a=b^{\\log_b(a)}\\) we have:\n\\[\na^{(\\log_b(x)/\\log_b(a))} = (b^{\\log_b(a)})^{(\\log_b(x)/\\log_b(a))} =\nb^{\\log_b(a) \\cdot \\log_b(x)/\\log_b(a) } = b^{\\log_b(x)} = x.\n\\]\nIn short, we have these three properties of logarithmic functions:\nIf \\(a, b\\) are positive bases; \\(u,v\\) are positive numbers; and \\(x\\) is any real number then:\n\\[\n\\begin{align*}\n\\log_a(uv) &= \\log_a(u) + \\log_a(v), \\\\\n\\log_a(u^x) &= x \\log_a(u), \\text{ and} \\\\\n\\log_a(u) &= \\log_b(u)/\\log_b(a).\n\\end{align*}\n\\]\n\nExample\nBefore the ubiquity of electronic calculating devices, the need to compute was still present. Ancient civilizations had abacuses to make addition easier. For multiplication and powers a slide rule could be used. It is easy to represent addition physically with two straight pieces of wood - just represent a number with a distance and align the two pieces so that the distances are sequentially arranged. Multiplying was then just as easy: represent the logarithm of a number with a distance, then add the logarithms. The sum of the logarithms is the logarithm of the product of the original two values. Converting back to a number answers the question. The conversion back and forth is done by simply labeling the wood using a logarithmic scale. The slide rule was invented soon after Napier's initial publication on the logarithm in 1614.\n\n\nExample\nReturning to the Rule of \\(72\\), what should the exact number be?\nThe amount of time to double an investment that grows according to \\(P_0 e^{rt}\\) solves \\(P_0 e^{rt} = 2P_0\\) or \\(rt = \\log_e(2)\\). So we get \\(t=\\log_e(2)/r\\). As \\(\\log_e(2)\\) is\n\nlog(e, 2)\n\n0.6931471805599453\n\n\nWe see the actual rule should be the “Rule of \\(69.314...\\).”"
},
{
"objectID": "precalc/exp_log_functions.html#questions",
"href": "precalc/exp_log_functions.html#questions",
"title": "15  Exponential and logarithmic functions",
"section": "15.3 Questions",
"text": "15.3 Questions\n\nQuestion\nSuppose every \\(4\\) days, a population doubles. If the population starts with \\(2\\) individuals, what is its size after \\(4\\) weeks?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA bouncing ball rebounds to a height of \\(5/6\\) of the previous peak height. If the ball is dropped from a height of \\(3\\) feet, how high will it bounce after \\(5\\) bounces?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhich is bigger, \\(e^2\\) or \\(2^e\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(e^2\\)\n \n \n\n\n \n \n \n \n \\(2^e\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhich is bigger, \\(\\log_8(9)\\) or \\(\\log_9(10)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\log_8(9)\\)\n \n \n\n\n \n \n \n \n \\(\\log_9(10)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIf \\(x\\), \\(y\\), and \\(z\\) satisfy \\(2^x = 3^y\\) and \\(4^y = 5^z\\), what is the ratio \\(x/z\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\frac{\\log(5)\\log(4)}{\\log(3)\\log(2)}\\)\n \n \n\n\n \n \n \n \n \\(2/5\\)\n \n \n\n\n \n \n \n \n \\(\\frac{\\log(2)\\log(3)}{\\log(5)\\log(4)}\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nDoes \\(12\\) satisfy \\(\\log_2(x) + \\log_3(x) = \\log_4(x)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe Richter magnitude is determined from the logarithm of the amplitude of waves recorded by seismographs (Wikipedia). The formula is \\(M=\\log(A) - \\log(A_0)\\) where \\(A_0\\) depends on the epicenter distance. Suppose an event has \\(A=100\\) and \\(A_0=1/100\\). 
What is \\(M\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIf the magnitude of one earthquake is \\(9\\) and the magnitude of another earthquake is \\(7\\), how many times stronger is \\(A\\) if \\(A_0\\) is the same for each?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(1000\\) times\n \n \n\n\n \n \n \n \n \\(100\\) times\n \n \n\n\n \n \n \n \n \\(10\\) times\n \n \n\n\n \n \n \n \n the same\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe Loudest band can possibly be measured in decibels. In \\(1976\\) the Who recorded \\(126\\) db and in \\(1986\\) Motorhead recorded \\(130\\) db. Suppose both measurements record power through the formula \\(db = 10 \\log_{10}(P)\\). What is the ratio of the Motorhead \\(P\\) to the \\(P\\) for the Who?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nBased on this graph:\n\nplot(log, 1/4, 4, label=\"log\")\nf(x) = x - 1\nplot!(f, 1/4, 4, label=\"x-1\")\n\n\n\n\nWhich statement appears to be true?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(x \\geq 1 + \\log(x)\\)\n \n \n\n\n \n \n \n \n \\(x \\leq 1 + \\log(x)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider this graph:\n\nf(x) = log(1-x)\ng(x) = -x - x^2/2\nplot(f, -3, 3/4, label=\"f\")\nplot!(g, -3, 3/4, label=\"g\")\n\n\n\n\nWhat statement appears to be true?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\log(1-x) \\geq -x - x^2/2\\)\n \n \n\n\n \n \n \n \n \\(\\log(1-x) \\leq -x - x^2/2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose \\(a > 1\\). If \\(\\log_a(x) = y\\) what is \\(\\log_{1/a}(x)\\)? 
(The reciprocal property of exponents, \\(a^{-x} = (1/a)^x\\), is at play here.)\n\n\n\n \n \n \n \n \n \n \n \n \n \\(-y\\)\n \n \n\n\n \n \n \n \n \\(-1/y\\)\n \n \n\n\n \n \n \n \n \\(1/y\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nBased on this, the graph of \\(\\log_{1/a}(x)\\) is the graph of \\(\\log_a(x)\\) under which transformation?\n\n\n\n \n \n \n \n \n \n \n \n \n Flipped over the \\(x\\) axis\n \n \n\n\n \n \n \n \n Flipped over the \\(y\\) axis\n \n \n\n\n \n \n \n \n Flipped over the line \\(y=x\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose \\(x < y\\). Then for \\(a > 0\\), \\(a^y - a^x\\) is equal to:\n\n\n\n \n \n \n \n \n \n \n \n \n \\(a^{y-x} \\cdot (a^x - 1)\\)\n \n \n\n\n \n \n \n \n \\(a^{y-x}\\)\n \n \n\n\n \n \n \n \n \\(a^x \\cdot (a^{y-x} - 1)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nUsing \\(a > 1\\) we have:\n\n\n\n \n \n \n \n \n \n \n \n \n \\(a^{y-x} > 0\\)\n \n \n\n\n \n \n \n \n as \\(a^x > 1\\), \\(a^y > a^x\\)\n \n \n\n\n \n \n \n \n as \\(a^{y-x} > 1\\) and \\(y-x > 0\\), \\(a^y > a^x\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIf \\(a < 1\\) then:\n\n\n\n \n \n \n \n \n \n \n \n \n as \\(a^{y-x} < 1\\) as \\(y-x > 0\\), \\(a^y < a^x\\)\n \n \n\n\n \n \n \n \n \\(a^{y-x} < 0\\)\n \n \n\n\n \n \n \n \n as \\(a^x < 1\\), \\(a^y < a^x\\)"
},
{
"objectID": "precalc/trig_functions.html",
"href": "precalc/trig_functions.html",
"title": "16  Trigonometric functions",
"section": "",
"text": "This section uses the following add-on packages:\nWe have informally used some of the trigonometric functions in examples so far. In this section we quickly review their definitions and some basic properties.\nThe trigonometric functions are used to describe relationships between triangles and circles as well as oscillatory motions. With such a wide range of utility, it is no wonder that they pop up in many places and that their origins date to Hipparchus and Ptolemy over \\(2000\\) years ago."
},
{
"objectID": "precalc/trig_functions.html#the-6-basic-trigonometric-functions",
"href": "precalc/trig_functions.html#the-6-basic-trigonometric-functions",
"title": "16  Trigonometric functions",
"section": "16.1 The 6 basic trigonometric functions",
"text": "16.1 The 6 basic trigonometric functions\nWe measure angles in radians, where \\(360\\) degrees is \\(2\\pi\\) radians. By proportions, \\(180\\) degrees is \\(\\pi\\) radians, \\(90\\) degrees is \\(\\pi/2\\) radians, \\(60\\) degrees is \\(\\pi/3\\) radians, etc. In general, \\(x\\) degrees is \\(2\\pi \\cdot x / 360\\) radians (or, with cancellation, \\(x \\cdot \\frac{\\pi}{180}\\)).\nFor a right triangle with angles \\(\\theta\\), \\(\\pi/2 - \\theta\\), and \\(\\pi/2\\) (\\(0 < \\theta < \\pi/2\\)) we call the side opposite \\(\\theta\\) the “opposite” side, the shorter adjacent side the “adjacent” side, and the longer adjacent side the hypotenuse.\n\n\n\n\n\nWith these, the basic definitions for the primary trigonometric functions are\n\\[\n\\begin{align*}\n\\sin(\\theta) &= \\frac{\\text{opposite}}{\\text{hypotenuse}} &\\quad(\\text{the sine function})\\\\\n\\cos(\\theta) &= \\frac{\\text{adjacent}}{\\text{hypotenuse}} &\\quad(\\text{the cosine function})\\\\\n\\tan(\\theta) &= \\frac{\\text{opposite}}{\\text{adjacent}}. &\\quad(\\text{the tangent function})\n\\end{align*}\n\\]\n\n\n\n\n\n\nNote\n\n\n\nMany students remember these through SOH-CAH-TOA.\n\n\nSome algebra shows that \\(\\tan(\\theta) = \\sin(\\theta)/\\cos(\\theta)\\). There are also \\(3\\) reciprocal functions, the cosecant, secant and cotangent.\nThese definitions in terms of sides only apply for \\(0 \\leq \\theta \\leq \\pi/2\\). More generally, if we relate any angle, measured in the counterclockwise direction from the \\(x\\)-axis, with a point \\((x,y)\\) on the unit circle, then we can extend these definitions - the point \\((x,y)\\) is also \\((\\cos(\\theta), \\sin(\\theta))\\).\n\n\n \n An angle in radian measure corresponds to a point on the unit circle, whose coordinates define the sine and cosine of the angle. 
That is \\((x,y) = (\\cos(\\theta), \\sin(\\theta))\\).\n \n \n\n\n\n\n16.1.1 The trigonometric functions in Julia\nJulia has the \\(6\\) basic trigonometric functions defined through the functions sin, cos, tan, csc, sec, and cot.\nTwo right triangles - the one with equal, \\(\\pi/4\\), angles; and the one with angles \\(\\pi/6\\) and \\(\\pi/3\\) can have the ratio of their sides computed from basic geometry. In particular, this leads to the following values, which are usually committed to memory:\n\\[\n\\begin{align*}\n\\sin(0) &= 0, \\quad \\sin(\\pi/6) = \\frac{1}{2}, \\quad \\sin(\\pi/4) = \\frac{\\sqrt{2}}{2}, \\quad\\sin(\\pi/3) = \\frac{\\sqrt{3}}{2},\\text{ and } \\sin(\\pi/2) = 1\\\\\n\\cos(0) &= 1, \\quad \\cos(\\pi/6) = \\frac{\\sqrt{3}}{2}, \\quad \\cos(\\pi/4) = \\frac{\\sqrt{2}}{2}, \\quad\\cos(\\pi/3) = \\frac{1}{2},\\text{ and } \\cos(\\pi/2) = 0.\n\\end{align*}\n\\]\nUsing the circle definition allows these basic values to inform us of values throughout the unit circle.\nThese all follow from the definition involving the unit circle:\n\nIf the angle \\(\\theta\\) corresponds to a point \\((x,y)\\) on the unit circle, then the angle \\(-\\theta\\) corresponds to \\((x, -y)\\). So \\(\\sin(\\theta) = - \\sin(-\\theta)\\) (an odd function), but \\(\\cos(\\theta) = \\cos(-\\theta)\\) (an even function).\nIf the angle \\(\\theta\\) corresponds to a point \\((x,y)\\) on the unit circle, then rotating by \\(\\pi\\) moves the points to \\((-x, -y)\\). So \\(\\cos(\\theta) = x = - \\cos(\\theta + \\pi)\\), and \\(\\sin(\\theta) = y = -\\sin(\\theta + \\pi)\\).\nIf the angle \\(\\theta\\) corresponds to a point \\((x,y)\\) on the unit circle, then rotating by \\(\\pi/2\\) moves the points to \\((-y, x)\\). 
So \\(\\cos(\\theta) = x = \\sin(\\theta + \\pi/2)\\).\n\nThe fact that \\(x^2 + y^2 = 1\\) for the unit circle leads to the “Pythagorean identity” for trigonometric functions:\n\\[\n\\sin(\\theta)^2 + \\cos(\\theta)^2 = 1.\n\\]\nThis basic fact can be manipulated many ways. For example, dividing through by \\(\\cos(\\theta)^2\\) gives the related identity: \\(\\tan(\\theta)^2 + 1 = \\sec(\\theta)^2\\).\nJulias functions can compute values for any angles, including these fundamental ones:\n\n[cos(theta) for theta in [0, pi/6, pi/4, pi/3, pi/2]]\n\n5-element Vector{Float64}:\n 1.0\n 0.8660254037844387\n 0.7071067811865476\n 0.5000000000000001\n 6.123233995736766e-17\n\n\nThese are floating point approximations, as can be seen clearly in the last value. Symbolic math can be used if exactness matters:\n\ncos.([0, PI/6, PI/4, PI/3, PI/2])\n\n5-element Vector{Sym}:\n 1\n sqrt(3)/2\n sqrt(2)/2\n 1/2\n 0\n\n\nThe sincos function computes both sin and cos simultaneously, which can be more performant when both values are needed.\nsincos(pi/3)\n\n\n\n\n\n\nNote\n\n\n\nFor really large values, round off error can play a big role. For example, the exact value of \\(\\sin(1000000 \\pi)\\) is \\(0\\), but the returned value is not quite \\(0\\) sin(1_000_000 * pi) = -2.231912181360871e-10. For exact multiples of \\(\\pi\\) with large multiples the sinpi and cospi functions are useful.\n(Both functions are computed by first employing periodicity to reduce the problem to a smaller angle. However, for large multiples the floating-point roundoff becomes a problem with the usual functions.)\n\n\n\nExample\nMeasuring the height of a tree may be a real-world task for some, but a typical task for nearly all trigonometry students. How might it be done? If a right triangle can be formed where the angle and adjacent side length are known, then the opposite side (the height of the tree) can be solved for with the tangent function. 
For example, if standing \\(100\\) feet from the base of the tree and the tip makes a \\(15\\) degree angle, the height is given by:\n\ntheta = 15 * pi / 180\nadjacent = 100\nopposite = adjacent * tan(theta)\n\n26.79491924311227\n\n\nHaving a handy means to compute an angle and then the tangent of that angle is not a given, so the linked-to article provides a few other methods taking advantage of similar triangles.\nYou can also measure distance with your thumb or fist. How? The fist takes up about \\(10\\) degrees of view when held straight out. So, pacing off backwards until the fist completely occludes the tree will give the distance of the adjacent side of a right triangle. If that distance is \\(30\\) paces, what is the height of the tree? Well, we need some facts. Suppose your pace is \\(3\\) feet. Then the adjacent length is \\(90\\) feet. The multiplier is the tangent of \\(10\\) degrees, or:\n\ntan(10 * pi/180)\n\n0.17632698070846498\n\n\nWhich for sake of memory we will say is \\(1/6\\) (a \\(5\\) percent error). So the answer is roughly \\(15\\) feet:\n\n30 * 3 / 6\n\n15.0\n\n\nSimilarly, you can use your thumb instead of your fist. To use your fist, multiply the adjacent side by \\(1/6\\); to use your thumb, by about \\(1/30\\), as this approximates the tangent of \\(2\\) degrees:\n\n1/30, tan(2*pi/180)\n\n(0.03333333333333333, 0.03492076949174773)\n\n\nThis could be reversed. If you know the height of something a distance away that is covered by your thumb or fist, then you would multiply that height by the appropriate amount to find your distance.\n\n\n\n16.1.2 Basic properties\nThe sine function is defined for all real \\(\\theta\\) and has a range of \\([-1,1]\\). Clearly, as \\(\\theta\\) winds around the circle, the position of the \\(y\\) coordinate begins to repeat itself. We say the sine function is periodic with period \\(2\\pi\\). A graph will illustrate:\n\nplot(sin, 0, 4pi)\n\n\n\n\nThe graph shows two periods. 
The wavy aspect of the graph is why this function is used to model periodic motions, such as the amount of sunlight in a day, or the alternating current powering a computer.\nFrom this graph - or considering when the \\(y\\) coordinate is \\(0\\) - we see that the sine function has zeros at any integer multiple of \\(\\pi\\), or \\(k\\pi\\), \\(k\\) in \\(\\dots,-2,-1, 0, 1, 2, \\dots\\).\nThe cosine function is similar, in that it has the same domain and range, but is “out of phase” with the sine curve. A graph of both shows the two are related:\n\nplot(sin, 0, 4pi, label=\"sin\")\nplot!(cos, 0, 4pi, label=\"cos\")\n\n\n\n\nThe cosine function is just a shift of the sine function (or vice versa). We see that the zeros of the cosine function happen at points of the form \\(\\pi/2 + k\\pi\\), \\(k\\) in \\(\\dots,-2,-1, 0, 1, 2, \\dots.\\)\nThe tangent function does not have all \\(\\theta\\) in its domain; rather, those points where division by \\(0\\) occurs are excluded. These occur when the cosine is \\(0\\), or, again, at \\(\\pi/2 + k\\pi\\), \\(k\\) in \\(\\dots,-2,-1, 0, 1, 2, \\dots.\\) The range of the tangent function will be all real \\(y\\).\nThe tangent function is also periodic, though with period \\(\\pi\\) rather than \\(2\\pi\\). A graph will show this. Here we avoid the vertical asymptotes using rangeclamp:\n\nplot(rangeclamp(tan), -10, 10, label=\"tan\")\n\n\n\n\n\nExample sums of sines\nFor the function \\(f(x) = \\sin(x)\\) we have an understanding of the related family of functions defined by linear transformations:\n\\[\ng(x) = a + b \\sin((2\\pi n)x)\n\\]\nThat is, \\(g\\) is shifted up by \\(a\\) units, scaled vertically by \\(b\\) units, and has a period of \\(1/n\\). 
We see a simple plot here where we can verify the transformation:\n\ng(x; b=1,n=1) = b*sin(2pi*n*x)\ng1(x) = 1 + g(x, b=2, n=3)\nplot(g1, 0, 1)\n\n\n\n\nWe can consider the sum of such functions, for example:\n\ng2(x) = 1 + g(x, b=2, n=3) + g(x, b=4, n=5)\nplot(g2, 0, 1)\n\n\n\n\nThough still periodic, we can see with this simple example that sums of different sine functions can have somewhat complicated graphs.\nSine functions can be viewed as the x position of a point traveling around a circle, so g(x, b=2, n=3) is the x position of a point traveling around a circle of radius \\(2\\) that completes a circuit in \\(1/3\\) units of time.\nThe superposition of the two sine functions that g2 represents could be viewed as the position of a point moving around a circle whose center is itself moving around another circle. The following graphic, with \\(b_1=1/3, n_1=3, b_2=1/4\\), and \\(n_2=4\\), shows an example that produces the related cosine sum (moving right along the \\(x\\) axis), the sine sum (moving down along the \\(y\\) axis), and the trace of the position of the point generating these two plots.\n\n\n \n Superposition of sines and cosines represented by an epicycle\n \n \n\n\n\nAs can be seen, even a somewhat simple combination can produce complicated graphs (a fact known to Ptolemy). How complicated can such a graph get? This won't be answered here, but for fun enjoy this video produced by the same technique using more moving parts from the Javis.jl package:\n\n\n\n\n \n\n \n \n \n Julia logo animated\n \n \n \n\n\n\n\n\n\n\n16.1.3 Functions using degrees\nTrigonometric functions are functions of angles, which have two common descriptions: in terms of degrees or radians. 
Degrees are common when right triangles are considered; radians are much more common in general, as the relationship with arc length holds in that \\(r\\theta = l\\), where \\(r\\) is the radius of a circle and \\(l\\) the length of the arc formed by angle \\(\\theta\\).\nThe two are related, as a circle has both \\(2\\pi\\) radians and \\(360\\) degrees. So converting from degrees to radians takes multiplying by \\(2\\pi/360\\), and converting from radians to degrees takes multiplying by \\(360/(2\\pi)\\). The deg2rad and rad2deg functions are available for this task.\nIn Julia, the functions sind, cosd, tand, cscd, secd, and cotd are available to simplify the task of composing the two operations (that is, sin(deg2rad(x)) is essentially the same as sind(x))."
},
{
"objectID": "precalc/trig_functions.html#the-sum-and-difference-formulas",
"href": "precalc/trig_functions.html#the-sum-and-difference-formulas",
"title": "16  Trigonometric functions",
"section": "16.2 The sum-and-difference formulas",
"text": "16.2 The sum-and-difference formulas\nConsider the point on the unit circle \\((x,y) = (\\cos(\\theta), \\sin(\\theta))\\). In terms of \\((x,y)\\) (or \\(\\theta\\)) is there a way to represent the angle found by rotating an additional \\(\\theta\\), that is what is \\((\\cos(2\\theta), \\sin(2\\theta))\\)?\nMore generally, suppose we have two angles \\(\\alpha\\) and \\(\\beta\\), can we represent the values of \\((\\cos(\\alpha + \\beta), \\sin(\\alpha + \\beta))\\) using the values just involving \\(\\beta\\) and \\(\\alpha\\) separately?\nAccording to Wikipedia the following figure (from mathalino.com) has ideas that date to Ptolemy:\n\n\n\nRelations between angles\n\n\nTo read this, there are three triangles: the bigger (green with pink part) has hypotenuse \\(1\\) (and adjacent and opposite sides that form the hypotenuses of the other two); the next biggest (yellow) hypotenuse \\(\\cos(\\beta)\\), adjacent side (of angle \\(\\alpha\\)) \\(\\cos(\\beta)\\cdot \\cos(\\alpha)\\), and opposite side \\(\\cos(\\beta)\\cdot\\sin(\\alpha)\\); and the smallest (pink) hypotenuse \\(\\sin(\\beta)\\), adjacent side (of angle \\(\\alpha\\)) \\(\\sin(\\beta)\\cdot \\cos(\\alpha)\\), and opposite side \\(\\sin(\\beta)\\sin(\\alpha)\\).\nThis figure shows the following sum formula for sine and cosine:\n\\[\n\\begin{align*}\n\\sin(\\alpha + \\beta) &= \\sin(\\alpha)\\cos(\\beta) + \\cos(\\alpha)\\sin(\\beta), & (\\overline{CE} + \\overline{DF})\\\\\n\\cos(\\alpha + \\beta) &= \\cos(\\alpha)\\cos(\\beta) - \\sin(\\alpha)\\sin(\\beta). 
& (\\overline{AC} - \\overline{DE})\n\\end{align*}\n\\]\nUsing the fact that \\(\\sin\\) is an odd function and \\(\\cos\\) an even function, related formulas for the difference \\(\\alpha - \\beta\\) can be derived.\nTaking \\(\\alpha = \\beta\\) we immediately get the “double-angle” formulas:\n\\[\n\\begin{align*}\n\\sin(2\\alpha) &= 2\\sin(\\alpha)\\cos(\\alpha)\\\\\n\\cos(2\\alpha) &= \\cos(\\alpha)^2 - \\sin(\\alpha)^2.\n\\end{align*}\n\\]\nThe latter looks like the Pythagorean identity, but has a minus sign. In fact, the Pythagorean identity is often used to rewrite this, for example \\(\\cos(2\\alpha) = 2\\cos(\\alpha)^2 - 1\\) or \\(1 - 2\\sin(\\alpha)^2\\).\nApplying the above with \\(\\alpha = \\beta/2\\), we get that \\(\\cos(\\beta) = 2\\cos(\\beta/2)^2 -1\\), which rearranged yields the “half-angle” formula: \\(\\cos(\\beta/2)^2 = (1 + \\cos(\\beta))/2\\).\n\nExample\nConsider the expressions \\(\\cos((n+1)\\theta)\\) and \\(\\cos((n-1)\\theta)\\). These can be re-expressed as:\n\\[\n\\begin{align*}\n\\cos((n+1)\\theta) &= \\cos(n\\theta + \\theta) = \\cos(n\\theta) \\cos(\\theta) - \\sin(n\\theta)\\sin(\\theta), \\text{ and}\\\\\n\\cos((n-1)\\theta) &= \\cos(n\\theta - \\theta) = \\cos(n\\theta) \\cos(-\\theta) - \\sin(n\\theta)\\sin(-\\theta).\n\\end{align*}\n\\]\nBut \\(\\cos(-\\theta) = \\cos(\\theta)\\), whereas \\(\\sin(-\\theta) = -\\sin(\\theta)\\). Using this, we add the two formulas above to get:\n\\[\n\\cos((n+1)\\theta) = 2\\cos(n\\theta) \\cos(\\theta) - \\cos((n-1)\\theta).\n\\]\nThat is, the cosine of the \\((n+1)\\)-fold angle can be expressed in terms of the cosines of the \\(n\\)-fold and \\((n-1)\\)-fold angles. This can be used recursively to find expressions for \\(\\cos(n\\theta)\\) in terms of polynomials in \\(\\cos(\\theta)\\)."
},
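A numeric spot-check of these identities can be run in Julia; it is a sanity check rather than a proof, and the test angles 0.3 and 1.1 are arbitrary:

```julia
# Spot-check the sum, double-angle, and half-angle formulas numerically.
α, β = 0.3, 1.1   # arbitrary test angles

@assert sin(α + β) ≈ sin(α)*cos(β) + cos(α)*sin(β)   # sum formula for sine
@assert cos(α + β) ≈ cos(α)*cos(β) - sin(α)*sin(β)   # sum formula for cosine
@assert cos(2α) ≈ 2cos(α)^2 - 1                      # double-angle rewrite
@assert cos(β/2)^2 ≈ (1 + cos(β))/2                  # half-angle formula
```

The `≈` operator (`isapprox`) compares floating point values up to a small relative tolerance, which is appropriate here since the two sides differ only by round-off.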
{
"objectID": "precalc/trig_functions.html#inverse-trigonometric-functions",
"href": "precalc/trig_functions.html#inverse-trigonometric-functions",
"title": "16  Trigonometric functions",
"section": "16.3 Inverse trigonometric functions",
"text": "16.3 Inverse trigonometric functions\nThe trigonometric functions are all periodic. In particular they are not monotonic over their entire domain. This means there is no inverse function applicable. However, by restricting the domain to where the functions are monotonic, inverse functions can be defined:\n\nFor \\(\\sin(x)\\), the restricted domain of \\([-\\pi/2, \\pi/2]\\) allows for the arcsine function to be defined. In Julia this is implemented with asin.\nFor \\(\\cos(x)\\), the restricted domain of \\([0,\\pi]\\) allows for the arccosine function to be defined. In Julia this is implemented with acos.\nFor \\(\\tan(x)\\), the restricted domain of \\((-\\pi/2, \\pi/2)\\) allows for the arctangent function to be defined. In Julia this is implemented with atan.\n\nFor example, the arcsine function is defined for \\(-1 \\leq x \\leq 1\\) and has a range of \\(-\\pi/2\\) to \\(\\pi/2\\):\n\nplot(asin, -1, 1)\n\n\n\n\nThe arctangent has domain of all real \\(x\\). It has shape given by:\n\nplot(atan, -10, 10)\n\n\n\n\nThe horizontal asymptotes are \\(y=\\pi/2\\) and \\(y=-\\pi/2\\).\n\n16.3.1 Implications of a restricted domain\nNotice that \\(\\sin(\\arcsin(x)) = x\\) for any \\(x\\) in \\([-1,1]\\), but, of course, not for all \\(x\\), as the output of the sine function cant be arbitrarily large.\nHowever, \\(\\arcsin(\\sin(x))\\) is defined for all \\(x\\), but only equals \\(x\\) when \\(x\\) is in \\([-\\pi/2, \\pi/2]\\). The output, or range, of the \\(\\arcsin\\) function is restricted to that interval.\nThis can be limiting at times. A common case is to find the angle in \\([0, 2\\pi)\\) corresponding to a point \\((x,y)\\). In the simplest case (the first and fourth quadrants) this is just given by \\(\\arctan(y/x)\\). But with some work, the correct angle can be found for any pair \\((x,y)\\). As this is a common desire, the atan function with two arguments, atan(y,x), is available. 
This function returns a value in \\((-\\pi, \\pi]\\).\nFor example, this will not give back \\(\\theta\\) without more work to identify the quadrant:\n\ntheta = 3pi/4 # 2.35619...\nx,y = (cos(theta), sin(theta)) # -0.7071..., 0.7071...\natan(y/x)\n\n-0.7853981633974484\n\n\nBut,\n\natan(y, x)\n\n2.356194490192345\n\n\n\nExample\nA (white) light shining through a prism will be deflected depending on the material of the prism and the angles involved (refer to the link for a figure). The relationship can be analyzed by tracing a ray through the figure and utilizing Snells law. If the prism has index of refraction \\(n\\) then the ray will deflect by an amount \\(\\delta\\) that depends on the angle, \\(\\alpha\\) of the prism and the initial angle (\\(\\theta_0\\)) according to:\n\\[\n\\delta = \\theta_0 - \\alpha + \\arcsin(n \\sin(\\alpha - \\arcsin(\\frac{1}{n}\\sin(\\theta_0)))).\n\\]\nIf \\(n=1.5\\) (glass), \\(\\alpha = \\pi/3\\) and \\(\\theta_0=\\pi/6\\), find the deflection (in radians).\nWe have:\n\nn, alpha, theta0 = 1.5, pi/3, pi/6\ndelta = theta0 - alpha + asin(n * sin(alpha - asin(sin(theta0)/n)))\n\n0.8219769749498015\n\n\nFor small \\(\\theta_0\\) and \\(\\alpha\\) the deviation is approximated by \\((n-1)\\alpha\\). Compare this approximation to the actual value when \\(\\theta_0 = \\pi/10\\) and \\(\\alpha=\\pi/15\\).\nWe have:\n\nn, alpha, theta0 = 1.5, pi/15, pi/10\ndelta = theta0 - alpha + asin(n * sin(alpha - asin(sin(theta0)/n)))\ndelta, (n-1)*alpha\n\n(0.10763338241545499, 0.10471975511965977)\n\n\nThe approximation error is about \\(2.7\\) percent.\n\n\nExample\nThe AMS has an interesting column on rainbows the start of which uses some formulas from the previous example. Click through to see a ray of light passing through a spherical drop of water, as analyzed by Descartes. 
The deflection of the ray occurs when the incident light hits the drop of water, then there is an internal deflection of the light, and finally when the light leaves, there is another deflection. The total deflection (in radians) is \\(D = (i-r) + (\\pi - 2r) + (i-r) = \\pi + 2i - 4r\\). However, the incident angle \\(i\\) and the refracted angle \\(r\\) are related by Snell's law: \\(\\sin(i) = n \\sin(r)\\). The value \\(n\\) is the index of refraction and is \\(4/3\\) for water. (It was \\(3/2\\) for glass in the previous example.) This gives\n\\[\nD = \\pi + 2i - 4 \\arcsin(\\frac{1}{n} \\sin(i)).\n\\]\nGraphing this for incident angles between \\(0\\) and \\(\\pi/2\\) we have:\n\nn = 4/3\nD(i) = pi + 2i - 4 * asin(sin(i)/n)\nplot(D, 0, pi/2)\n\n\n\n\nDescartes was interested in the minimum value of this graph, as it relates to where the light concentrates. This is roughly at \\(1\\) radian or about \\(57\\) degrees:\n\nrad2deg(1.0)\n\n57.29577951308232\n\n\n(Using calculus it can be seen to be \\(\\arccos(((n^2-1)/3)^{1/2})\\).)\n\n\nExample: The Chebyshev Polynomials\nConsider again this equation derived with the sum-and-difference formula:\n\\[\n\\cos((n+1)\\theta) = 2\\cos(n\\theta) \\cos(\\theta) - \\cos((n-1)\\theta).\n\\]\nLet \\(T_n(x) = \\cos(n \\arccos(x))\\). Calling \\(\\theta = \\arccos(x)\\) for \\(-1 \\leq x \\leq 1\\) we get a relation between these functions:\n\\[\nT_{n+1}(x) = 2x T_n(x) - T_{n-1}(x).\n\\]\nWe can simplify a few: For example, when \\(n=0\\) we see immediately that \\(T_0(x) = 1\\), the constant function. Whereas with \\(n=1\\) we get \\(T_1(x) = \\cos(\\arccos(x)) = x\\). Things get more interesting as we get bigger \\(n\\), for example using the equation above we get \\(T_2(x) = 2xT_1(x) - T_0(x) = 2x\\cdot x - 1 = 2x^2 - 1\\). 
Continuing, we'd get \\(T_3(x) = 2 x T_2(x) - T_1(x) = 2x(2x^2 - 1) - x = 4x^3 -3x\\).\nA few things become clear from the above two representations:\n\nStarting from \\(T_0(x) = 1\\) and \\(T_1(x)=x\\) and using the recursive definition of \\(T_{n+1}\\) we get a family of polynomials where \\(T_n(x)\\) is a degree \\(n\\) polynomial. These are defined for all \\(x\\), not just \\(-1 \\leq x \\leq 1\\).\nUsing the initial definition, we see that the zeros of \\(T_n(x)\\) all occur within \\([-1,1]\\) and happen when \\(n\\arccos(x) = k\\pi + \\pi/2\\), or \\(x=\\cos((2k+1)/n \\cdot \\pi/2)\\) for \\(k=0, 1, \\dots, n-1\\).\n\nOther properties of this polynomial family are not at all obvious. One is that amongst all polynomials of degree \\(n\\) with roots in \\([-1,1]\\), \\(T_n(x)\\) will be the smallest in magnitude (after we divide by the leading coefficient to make all polynomials considered to be monic). We check this for one case. Take \\(n=4\\), then we have: \\(T_4(x) = 8x^4 - 8x^2 + 1\\). Compare this with \\(q(x) = (x+3/5)(x+1/5)(x-1/5)(x-3/5)\\) (evenly spaced zeros):\n\nT4(x) = (8x^4 - 8x^2 + 1) / 8\nq(x) = (x+3/5)*(x+1/5)*(x-1/5)*(x-3/5)\nplot(abs ∘ T4, -1,1, label=\"|T₄|\")\nplot!(abs ∘ q, -1,1, label=\"|q|\")"
},
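The recurrence T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x) translates directly into a short loop. This sketch (the function name `chebyshev_T` is our own) evaluates T_n(x) iteratively and checks it against the defining identity T_n(cos θ) = cos(nθ):

```julia
# Evaluate the Chebyshev polynomial Tₙ(x) via the recurrence
# T₀ = 1, T₁ = x, Tₙ₊₁ = 2x⋅Tₙ - Tₙ₋₁  (a sketch; no argument checking).
function chebyshev_T(n, x)
    n == 0 && return one(x)
    Tprev, T = one(x), x          # T₀ and T₁
    for _ in 2:n
        Tprev, T = T, 2x*T - Tprev
    end
    T
end

theta = 0.7
@assert chebyshev_T(3, cos(theta)) ≈ cos(3theta)   # Tₙ(cos θ) = cos(nθ)
@assert chebyshev_T(3, 0.5) ≈ 4*0.5^3 - 3*0.5      # T₃(x) = 4x³ - 3x
```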
{
"objectID": "precalc/trig_functions.html#hyperbolic-trigonometric-functions",
"href": "precalc/trig_functions.html#hyperbolic-trigonometric-functions",
"title": "16  Trigonometric functions",
"section": "16.4 Hyperbolic trigonometric functions",
"text": "16.4 Hyperbolic trigonometric functions\nRelated to the trigonometric functions are the hyperbolic trigonometric functions. Instead of associating a point \\((x,y)\\) on the unit circle with an angle \\(\\theta\\), we associate a point \\((x,y)\\) on the unit hyperbola (\\(x^2 - y^2 = 1\\)). We define the hyperbolic sine (\\(\\sinh\\)) and hyperbolic cosine (\\(\\cosh\\)) through \\((\\cosh(\\theta), \\sinh(\\theta)) = (x,y)\\).\n\n\n\n\n\nThese values are more commonly expressed using the exponential function as:\n\\[\n\\begin{align*}\n\\sinh(x) &= \\frac{e^x - e^{-x}}{2}\\\\\n\\cosh(x) &= \\frac{e^x + e^{-x}}{2}.\n\\end{align*}\n\\]\nThe hyperbolic tangent is then the ratio of \\(\\sinh\\) and \\(\\cosh\\). As well, three inverse hyperbolic functions can be defined.\nThe Julia functions to compute these values are named sinh, cosh, and tanh."
},
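A quick numeric check that the point (cosh(θ), sinh(θ)) really lies on the unit hyperbola, and that the built-in functions agree with the exponential definitions:

```julia
# The hyperbolic analog of the Pythagorean identity: cosh²(x) - sinh²(x) = 1.
x = 1.2345   # an arbitrary test value
@assert cosh(x)^2 - sinh(x)^2 ≈ 1

# Agreement with the exponential definitions
@assert sinh(x) ≈ (exp(x) - exp(-x))/2
@assert cosh(x) ≈ (exp(x) + exp(-x))/2
```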
{
"objectID": "precalc/trig_functions.html#questions",
"href": "precalc/trig_functions.html#questions",
"title": "16  Trigonometric functions",
"section": "16.5 Questions",
"text": "16.5 Questions\n\nQuestion\nWhat is bigger \\(\\sin(1.23456)\\) or \\(\\cos(6.54321)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\sin(1.23456)\\)\n \n \n\n\n \n \n \n \n \\(\\cos(6.54321)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(x=\\pi/4\\). What is bigger \\(\\cos(x)\\) or \\(x\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\cos(x)\\)\n \n \n\n\n \n \n \n \n \\(x\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe cosine function is a simple tranformation of the sine function. Which one?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\cos(x) = \\sin(x - \\pi/2)\\)\n \n \n\n\n \n \n \n \n \\(\\cos(x) = \\sin(x + \\pi/2)\\)\n \n \n\n\n \n \n \n \n \\(\\cos(x) = \\pi/2 \\cdot \\sin(x)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nGraph the secant function. The vertical asymptotes are at?\n\n\n\n \n \n \n \n \n \n \n \n \n The values \\(k\\pi\\) for \\(k\\) in \\(\\dots, -2, -1, 0, 1, 2, \\dots\\)\n \n \n\n\n \n \n \n \n The values \\(\\pi/2 + k\\pi\\) for \\(k\\) in \\(\\dots, -2, -1, 0, 1, 2, \\dots\\)\n \n \n\n\n \n \n \n \n The values \\(2k\\pi\\) for \\(k\\) in \\(\\dots, -2, -1, 0, 1, 2, \\dots\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA formula due to Bhaskara I dates to around 650AD and gives a rational function approximation to the sine function. In degrees, we have\n\\[\n\\sin(x^\\circ) \\approx \\frac{4x(180-x)}{40500 - x(180-x)}, \\quad 0 \\leq x \\leq 180.\n\\]\nPlot both functions over \\([0, 180]\\). What is the maximum difference between the two to two decimal points? (You may need to plot the difference of the functions to read off an approximate answer.)\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSolve the following equation for a value of \\(x\\) using acos:\n\\[\n\\cos(x/3) = 1/3.\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor any postive integer \\(n\\) the equation \\(\\cos(x) - nx = 0\\) has a solution in \\([0, \\pi/2]\\). 
Graphically estimate the value when \\(n=10\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe sine function is an odd function.\n\nThe hyperbolic sine is:\n\n\n\n\n \n \n \n \n \n \n \n \n \n odd\n \n \n\n\n \n \n \n \n even\n \n \n\n\n \n \n \n \n neither\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nThe hyperbolic cosine is:\n\n\n\n\n \n \n \n \n \n \n \n \n \n odd\n \n \n\n\n \n \n \n \n even\n \n \n\n\n \n \n \n \n neither\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nThe hyperbolic tangent is:\n\n\n\n\n \n \n \n \n \n \n \n \n \n odd\n \n \n\n\n \n \n \n \n even\n \n \n\n\n \n \n \n \n neither\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe hyperbolic sine satisfies this formula:\n\\[\n\\sinh(\\theta + \\beta) = \\sinh(\\theta)\\cosh(\\beta) + \\sinh(\\beta)\\cosh(\\theta).\n\\]\nIs this identical to the pattern for the regular sine function?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe hyperbolic cosine satisfies this formula:\n\\[\n\\cosh(\\theta + \\beta) = \\cosh(\\theta)\\cosh(\\beta) + \\sinh(\\beta)\\sinh(\\theta).\n\\]\nIs this identical to the pattern for the regular sine function?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No"
},
{
"objectID": "precalc/julia_overview.html",
"href": "precalc/julia_overview.html",
"title": "17  Overview of Julia commands",
"section": "",
"text": "The Julia programming language is well suited as a computer accompaniment while learning the concepts of calculus. The following overview covers the language-specific aspects of the pre-calculus part of the Calculus with Julia notes."
},
{
"objectID": "precalc/julia_overview.html#installing-julia",
"href": "precalc/julia_overview.html#installing-julia",
"title": "17  Overview of Julia commands",
"section": "17.1 Installing Julia",
"text": "17.1 Installing Julia\nJulia is an open source project which allows anyone with a supported computer to use it. To install locally, the downloads page has several different binaries for installation. Additionally, the downloads page contains a link to a docker image. For Microsoft Windows, the new juliaup installer may be of interest; it is available from the Windows Store. Julia can also be compiled from source.\nJulia can also be run through the web. The https://mybinder.org/ service in particular allows free access, though limited in terms of allotted memory and with a relatively short timeout for inactivity.\nLaunch Binder"
},
{
"objectID": "precalc/julia_overview.html#interacting-with-julia",
"href": "precalc/julia_overview.html#interacting-with-julia",
"title": "17  Overview of Julia commands",
"section": "17.2 Interacting with Julia",
"text": "17.2 Interacting with Julia\nAt a basic level, Julia provides a means to read commands or instructions, evaluate those commands, and then print or return those commands. At a user level, there are many different ways to interact with the reading and printing. For example:\n\nThe REPL. The Julia terminal is the built-in means to interact with Julia. A Julia Terminal has a command prompt, after which commands are typed and then sent to be evaluated by the enter key. The terminal may look something like the following where 2+2 is evaluated:\n\n\n$ julia\n _\n _ _ _(_)_ | Documentation: https://docs.julialang.org\n (_) | (_) (_) |\n _ _ _| |_ __ _ | Type \"?\" for help, \"]?\" for Pkg help.\n | | | | | | |/ _` | |\n | | |_| | | | (_| | | Version 1.7.0 (2021-11-30)\n _/ |\\__'_|_|_|\\__'_| | Official https://julialang.org/ release\n|__/ |\n\njulia> 2 + 2\n4\n\n\nAn IDE. For programmers, an integrated development environment is often used to manage bigger projects. Julia has Juno and VSCode.\nA notebook. The Project Jupyter provides a notebook interface for interacting with Julia and a more IDE style jupyterlab interface. A Jupyter notebook has cells where commands are typed and immediately following is the printed output returned by Julia. The output of a cell depends on the state of the kernel when the cell is computed, not the order of the cells in the notebook. Cells have a number attached, showing the execution order. The Jupyter notebook is used by binder and can be used locally through the IJulia package. This notebook has the ability to display many different types of outputs in addition to plain text, such as images, marked up math text, etc.\nThe Pluto package provides a reactive notebook interface. Reactive means when one “cell” is modified and executed, the new values cascade to all other dependent cells which in turn are updated. This is very useful for exploring a parameter space, say. 
Pluto notebooks can be exported as HTML files, which makes them easy to read online; by clever design, the exported file embeds the .jl file, which can be run through Pluto if it is downloaded.\n\nThe Pluto interface has some idiosyncrasies that need explanation:\n\nCells can only have one command within them. Multiple-command cells must be contained in a begin block or a let block.\nBy default, the cells are reactive. This means when a variable in one cell is changed, then any references to that variable are also updated like a spreadsheet. This is fantastic for updating several computations at once. However, it means variable names cannot be repeated within a page. Pedagogically, it is convenient to use variable names and function names (e.g., x and f) repeatedly, but this is only possible if they are within a let block or a function body.\nTo not repeat names, but to be able to reference a value from cell-to-cell, some Unicode variants are used within a page. Visually these look familiar, but typing the names requires some understanding of Unicode input. The primary usage is bold italic (e.g., \\bix[tab] or \\bif[tab]) or bold face (e.g. \\bfx[tab] or \\bff[tab]).\nThe notebooks snapshot the packages they depend on, which is great for reproducibility, but may mean older versions are silently used."
},
{
"objectID": "precalc/julia_overview.html#augmenting-base-julia",
"href": "precalc/julia_overview.html#augmenting-base-julia",
"title": "17  Overview of Julia commands",
"section": "17.3 Augmenting base Julia",
"text": "17.3 Augmenting base Julia\nThe base Julia installation has many features, but leaves many others to Julia's package ecosystem. These notes use packages to provide plotting, symbolic math, access to special functions, numeric routines, and more.\nWithin Pluto, using add-on packages is very simple, as Pluto downloads and installs packages when they are requested through a using or import directive.\n\nFor other interfaces to Julia, some more detail is needed.\nThe Julia package manager makes add-on packages very easy to install.\nJulia comes with just a few built-in packages, one being Pkg which manages subsequent package installation. To add more packages, we first must load the Pkg package. This is done by issuing the following command:\nusing Pkg\nThe using command loads the specified package and makes all its exported values available for direct use. There is also the import command which allows the user to select which values should be imported from the package, if any, and otherwise gives access to the new functionality through the dot syntax.\nPackages need to be loaded just once per session.\nTo use Pkg to “add” another package, we would have a command like:\nPkg.add(\"CalculusWithJulia\")\nThis command instructs Julia to look at its general registry for the CalculusWithJulia.jl package, download it, then install it. Once installed, a package only needs to be brought into play with the using or import commands.\n\n\n\n\n\n\nNote\n\n\n\nIn a terminal setting, there is a package mode, entered by typing ] as the leading character and exited by entering <delete> at a blank line. This mode allows direct access to Pkg with a simpler syntax. The command above would be just add CalculusWithJulia.\n\n\nPackages can be updated through the command Pkg.up(), and removed with Pkg.rm(pkgname).\nBy default packages are installed in a common area. It may be desirable to keep packages for projects isolated. For this the Pkg.activate command can be used. 
This feature allows a means to have reproducible environments even if Julia or the packages used are upgraded, possibly introducing incompatibilities.\nFor these notes, the following packages, among others, are used:\nPkg.add(\"CalculusWithJulia\") # for some simplifying functions and a few packages (SpecialFunctions, ForwardDiff)\nPkg.add(\"Plots\") # for basic plotting\nPkg.add(\"SymPy\") # for symbolic math\nPkg.add(\"Roots\") # for solving `f(x)=0`\nPkg.add(\"QuadGK\") # for integration\nPkg.add(\"HCubature\") # for higher-dimensional integration"
},
{
"objectID": "precalc/julia_overview.html#julia-commands",
"href": "precalc/julia_overview.html#julia-commands",
"title": "17  Overview of Julia commands",
"section": "17.4 Julia commands",
"text": "17.4 Julia commands\nIn a Jupyter notebook or Pluto notebook, commands are typed into a notebook cell:\n\n2 + 2 # use shift-enter to evaluate\n\n4\n\n\nCommands are executed by using shift-enter or a run button near the cell.\nIn Jupyter, multiple commands per cell are allowed. In Pluto, a begin or let block is used to collect multiple commands into a single call. Commands may be separated by new lines or semicolons.\nOn a given line, anything after a # is a comment and is not processed.\nThe results of the last command executed will be displayed in an output area. Separating values by commas allows more than one value to be displayed. Plots are displayed when the plot object is returned by the last executed command.\nIn Jupyter, the state of the notebook is determined by the cells executed along with their order. The state of a Pluto notebook is a result of all the cells in the notebook being executed. The cell order does not impact this and can be rearranged by the user."
},
{
"objectID": "precalc/julia_overview.html#numbers-variable-types",
"href": "precalc/julia_overview.html#numbers-variable-types",
"title": "17  Overview of Julia commands",
"section": "17.5 Numbers, variable types",
"text": "17.5 Numbers, variable types\nJulia has many different number types beyond the floating point type employed by most calculators. These include\n\nFloating point numbers: 0.5\nIntegers: 2\nRational numbers: 1//2\nComplex numbers: 2 + 0im\n\nJulia's parser finds the appropriate type for the value, when read in. The following all create the number \\(1\\) first as an integer, then a rational, then a floating point number, again as a floating point number, and finally as a complex number:\n\n1, 1//1, 1.0, 1e0, 1 + 0im\n\n(1, 1//1, 1.0, 1.0, 1 + 0im)\n\n\nAs much as possible, operations involving certain types of numbers will produce output of a given type. For example, both of these divisions produce a floating point answer, even though mathematically, they need not:\n\n2/1, 1/2\n\n(2.0, 0.5)\n\n\nSome powers with negative bases, like (-3.0)^(1/3), are not defined. However, Julia provides the special-case function cbrt (and sqrt) for handling these.\nInteger operations may silently overflow, producing odd answers, at first glance:\n\n2^64\n\n0\n\n\n(Though the output is predictable, if overflow is taken into consideration appropriately.)\nWhen different types of numbers are mixed, Julia will usually promote the values to a common type before the operation:\n\n(2 + 1//2) + 0.5\n\n3.0\n\n\nJulia will first add 2 and 1//2, promoting 2 to rational before doing so. It then adds the result, 5//2, to 0.5 by promoting 5//2 to the floating point number 2.5 before proceeding.\nJulia uses a special type to store a handful of irrational constants such as pi. The special type allows these constants to be treated without round off, until they mix with other floating point numbers. There are some functions that require these be explicitly promoted to floating point. This can be done by calling float.\nThe standard mathematical operations are implemented by +, -, *, /, ^. Parentheses are used for grouping.\n\n17.5.1 Vectors\nA vector is an indexed collection of similarly typed values. 
Vectors can be constructed with square brackets (syntax for concatenation):\n\n[1, 1, 2, 3, 5, 8]\n\n6-element Vector{Int64}:\n 1\n 1\n 2\n 3\n 5\n 8\n\n\nValues will be promoted to a common type (or type Any if none exists). For example, this vector will have type Float64 due to the 1/3 computation:\n\n[1, 1//2, 1/3]\n\n3-element Vector{Float64}:\n 1.0\n 0.5\n 0.3333333333333333\n\n\n(Vectors are used as a return type from some functions, as such, some familiarity is needed.)\nRegular arithmetic sequences can be defined by either:\n\nRange operations: a:h:b or a:b which produces a generator of values starting at a separated by h (h is 1 in the last form) until they reach b.\nThe range function: range(a, b, length=n) which produces a generator of n values between a and b;\n\nThese constructs return range objects. A range object compactly stores the values it references. To see all the values, they can be collected with the collect function, though this is rarely needed in practice.\nRandom sequences are formed by rand, among others:\n\nrand(3)\n\n3-element Vector{Float64}:\n 0.10423678334027975\n 0.3613002059686551\n 0.13804328154133638\n\n\nThe call rand() returns a single random number (in \\([0,1)\\).)"
},
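Returning to the overflow example above (2^64 evaluating to 0), arbitrary-precision integers created with big sidestep the wraparound; a small sketch:

```julia
# Int64 arithmetic wraps around silently; BigInt does not.
@assert 2^64 == 0                           # 64-bit overflow wraps to 0
@assert big(2)^64 == 18446744073709551616   # BigInt gives the exact value
@assert typemax(Int64) == 9223372036854775807  # largest representable Int64
```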
{
"objectID": "precalc/julia_overview.html#variables",
"href": "precalc/julia_overview.html#variables",
"title": "17  Overview of Julia commands",
"section": "17.6 Variables",
"text": "17.6 Variables\nValues can be assigned variable names, with =. There are some variants:\n\nu = 2\na_really_long_name = 3\na0, b0 = 1, 2 # multiple assignment\na1 = a2 = 0 # chained assignment, sets a2 and a1 to 0\n\n0\n\n\nThe names can be short, as above, or more verbose. Variable names can't start with a number, but can include numbers. Variables can also include Unicode or even be an emoji.\n\nα, β = π/3, π/4\n\n(1.0471975511965976, 0.7853981633974483)\n\n\nWe can then use the variables to reference the values:\n\nu + a_really_long_name + a0 - b0 + α\n\n5.047197551196597\n\n\nWithin Pluto, names are idiosyncratic: within the global scope, only a single usage is possible per notebook; functions and variables can be freely renamed; structures can be redefined or renamed; …\nOutside of Pluto, names may be repurposed, even with values of different types (Julia is a dynamic language), save for (generic) function names, which have some special rules and can only be redefined as another function. Generic functions are central to Julia's design. Generic functions use a method table to dispatch on, so once a name is assigned to a generic function, it cannot be used as a variable name; the reverse is also true."
},
{
"objectID": "precalc/julia_overview.html#functions",
"href": "precalc/julia_overview.html#functions",
"title": "17  Overview of Julia commands",
"section": "17.7 Functions",
"text": "17.7 Functions\nFunctions in Julia are first-class objects. In these notes, we often pass them as arguments to other functions. There are many built-in functions and it is easy to define new functions.\nWe “call” a function by passing argument(s) to it, grouped by parentheses:\n\nsqrt(10)\nsin(pi/3)\nlog(5, 100) # log base 5 of 100\n\n2.8613531161467867\n\n\nWithout parentheses, the name (usually) refers to a generic name and the output lists the number of available implementations (methods).\n\nlog\n\nlog (generic function with 42 methods)\n\n\n\n17.7.1 Built-in functions\nJulia has numerous built-in mathematical functions; we review a few here:\n\nPowers, logs, and roots\nBesides ^, there are sqrt and cbrt for powers. In addition, there are basic exponential and logarithmic functions:\nsqrt, cbrt\nexp\nlog # base e\nlog10, log2, # also log(b, x)\n\n\nTrigonometric functions\nThe 6 standard trig functions are implemented, as are versions for degree arguments, the inverse functions, and the hyperbolic analogs.\nsin, cos, tan, csc, sec, cot\nasin, acos, atan, acsc, asec, acot\nsinh, cosh, tanh, csch, sech, coth\nasinh, acosh, atanh, acsch, asech, acoth\nIf degrees are preferred, the following are defined to work with arguments in degrees:\nsind, cosd, tand, cscd, secd, cotd\n\n\nUseful functions\nOther useful and familiar functions are defined:\n\nabs: absolute value\nsign: is \\(\\lvert x \\rvert/x\\) except at \\(x=0\\), where it is \\(0\\).\nfloor, ceil: round down (the greatest integer less than or equal) or round up (the least integer greater than or equal)\nmax(a,b), min(a,b): larger (or smaller) of a or b\nmaximum(xs), minimum(xs): largest or smallest of the collection referred to by xs\n\n\nIn a Pluto session, the “Live docs” area shows inline documentation for the current object.\nFor other uses of Julia, the built-in documentation for an object is accessible through a leading ?, say, ?sign. 
There is also the @doc macro, for example:\n@doc sign\n\n\n\n\n17.7.2 User-defined functions\nSimple mathematical functions can be defined using standard mathematical notation:\n\nf(x) = -16x^2 + 100x + 2\n\nf (generic function with 1 method)\n\n\nThe argument x is passed into the body of the function.\nOther values are found from the environment where defined:\n\na = 1\nf(x) = 2*a + x\nf(3) # 2 * 1 + 3\na = 4\nf(3) # now 2 * 4 + 3\n\n11\n\n\nUser-defined functions can have \\(0\\), \\(1\\), or more arguments:\n\narea(w, h) = w*h\n\narea (generic function with 1 method)\n\n\nJulia makes different methods for generic function names, so function definitions whose argument specification is different are for different uses, even if the name is the same. This is polymorphism. The practical use is that it means users need only remember a much smaller set of function names, as attempts are made to give common expectations to the same name. (That is, + should be used only for “add”ing objects, however defined.)\nFunctions can be defined with keyword arguments that may have defaults specified:\n\nf(x; m=1, b=0) = m*x + b # note \";\"\nf(1) # uses m=1, b=0 -> 1 * 1 + 0\nf(1, m=10) # uses m=10, b=0 -> 10 * 1 + 0\nf(1, m=10, b=5) # uses m=10, b=5 -> 10 * 1 + 5\n\n15\n\n\nLonger functions can be defined using the function keyword; the last command executed is returned:\n\nfunction 𝒇(x)\n y = x^2\n z = y - 3\n z\nend\n\n𝒇 (generic function with 1 method)\n\n\nFunctions without names, anonymous functions, are made with the -> syntax as in:\n\nx -> cos(x)^2 - cos(2x)\n\n#13 (generic function with 1 method)\n\n\nThese are useful when passing a function to another function or when writing a function that returns a function."
},
{
"objectID": "precalc/julia_overview.html#conditional-statements",
"href": "precalc/julia_overview.html#conditional-statements",
"title": "17  Overview of Julia commands",
"section": "17.8 Conditional statements",
"text": "17.8 Conditional statements\nJulia provides the traditional if-else-end statements, but more conveniently has a ternary operator for the simplest case:\n\nour_abs(x) = (x < 0) ? -x : x\n\nour_abs (generic function with 1 method)"
},
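When more than two branches are needed, the full if-elseif-else-end form is used. A sketch mirroring the built-in sign function (the name our_sign is made up for illustration):

```julia
# A multi-branch conditional; behaves like the built-in `sign` for reals.
function our_sign(x)
    if x < 0
        -1
    elseif x > 0
        1
    else
        0
    end
end

@assert our_sign(-3.2) == -1
@assert our_sign(0) == 0
@assert our_sign(7) == sign(7)
```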
{
"objectID": "precalc/julia_overview.html#looping",
"href": "precalc/julia_overview.html#looping",
"title": "17  Overview of Julia commands",
"section": "17.9 Looping",
"text": "17.9 Looping\nIterating over a collection can be done with the traditional for loop. However, there are list comprehensions to mimic the definition of a set:\n\n[x^2 for x in 1:10]\n\n10-element Vector{Int64}:\n 1\n 4\n 9\n 16\n 25\n 36\n 49\n 64\n 81\n 100\n\n\nComprehensions can be filtered through the if keyword\n\n[x^2 for x in 1:10 if iseven(x)]\n\n5-element Vector{Int64}:\n 4\n 16\n 36\n 64\n 100\n\n\nThis is more efficient than creating the collection then filtering, as is done with:\n\nfilter(iseven, [x^2 for x in 1:10])\n\n5-element Vector{Int64}:\n 4\n 16\n 36\n 64\n 100"
},
{
"objectID": "precalc/julia_overview.html#broadcasting-mapping",
"href": "precalc/julia_overview.html#broadcasting-mapping",
"title": "17  Overview of Julia commands",
"section": "17.10 Broadcasting, mapping",
"text": "17.10 Broadcasting, mapping\nA function can be applied to each element of a vector through mapping or broadcasting. The latter is implemented in a succinct notation. Calling a function with a “.” before its opening “(” will apply the function to each individual value in the argument:\n\nxs = [1,2,3,4,5]\nsin.(xs) # gives back [sin(1), sin(2), sin(3), sin(4), sin(5)]\n\n5-element Vector{Float64}:\n 0.8414709848078965\n 0.9092974268256817\n 0.1411200080598672\n -0.7568024953079282\n -0.9589242746631385\n\n\nFor “infix” operators, the dot precedes the operator, as in this example instructing pointwise multiplication of each element in xs:\nxs .* xs\nAlternatively, the more traditional map can be used:\n\nmap(sin, xs)\n\n5-element Vector{Float64}:\n 0.8414709848078965\n 0.9092974268256817\n 0.1411200080598672\n -0.7568024953079282\n -0.9589242746631385"
},
{
"objectID": "precalc/julia_overview.html#plotting",
"href": "precalc/julia_overview.html#plotting",
"title": "17  Overview of Julia commands",
"section": "17.11 Plotting",
"text": "17.11 Plotting\nPlotting is not built into Julia; rather, it is added through add-on packages. Julia's Plots package is an interface to several plotting packages. We mention plotly (built-in) for web-based graphics, pyplot, and gr (also built into Plots) for other graphics.\nWe must load Plots before we can plot (and it must be installed before we can load it):\nusing Plots\nWith Plots loaded, we can plot a function by passing the function object by name to plot, specifying the range of x values to show, as follows:\n\nplot(sin, 0, 2pi) # plot a function - by name - over an interval [a,b]\n\n\n\n\nThis is in the form of the basic pattern employed: verb(function_object, arguments...). The verb in this example is plot, the object sin, the arguments 0, 2pi to specify the [a,b] domain to plot over.\nPlotting more than one function over [a,b] is achieved through the plot! function, which modifies the existing plot (plot creates a new one) by adding a new layer:\n\nplot(sin, 0, 2pi)\nplot!(cos, 0, 2pi)\nplot!(zero, 0, 2pi) # add the line y=0\n\n\n\n\nIndividual points are added with scatter or scatter!:\n\nplot(sin, 0, 2pi, legend=false)\nplot!(cos, 0, 2pi)\nscatter!([pi/4, pi+pi/4], [sin(pi/4), sin(pi + pi/4)])\n\n\n\n\n(The extra argument legend=false suppresses the automatic legend drawing. There are many other useful arguments to adjust a graphic. For example, passing markersize=10 to the scatter! command would draw the points larger than the default.)\nPlotting an anonymous function is a bit more immediate than the two-step approach of defining a named function then calling plot with this as an argument:\n\nplot( x -> exp(-x/pi) * sin(x), 0, 2pi)\n\n\n\n\nThe scatter! function used above takes two vectors of values to describe the points to plot, one for the \\(x\\) values and one for the matching \\(y\\) values. The plot function can also produce plots with this interface. 
For example, here we use a comprehension to produce y values from the specified x values:\n\nxs = range(0, 2pi, length=251)\nys = [sin(2x) + sin(3x) + sin(4x) for x in xs]\nplot(xs, ys)"
},
{
"objectID": "precalc/julia_overview.html#equations",
"href": "precalc/julia_overview.html#equations",
"title": "17  Overview of Julia commands",
"section": "17.12 Equations",
"text": "17.12 Equations\nNotation for Julia and math is similar for functions - but not for equations. In math, an equation might look like:\n\\[\nx^2 + y^2 = 3\n\\]\nIn Julia the equals sign is only for assignment and mutation. The left-hand side of an equals sign in Julia is reserved for a) variable assignment; b) function definition (via f(x) = ...); c) indexed mutation of a vector or array; d) mutation of fields in a structure. (Vectors are indexed by a number allowing retrieval and mutation of the stored value in the container. The notation mentioned here would be xs[2] = 3 to mutate the 2nd element of xs to the value 3.)"
},
{
"objectID": "precalc/julia_overview.html#symbolic-math",
"href": "precalc/julia_overview.html#symbolic-math",
"title": "17  Overview of Julia commands",
"section": "17.13 Symbolic math",
"text": "17.13 Symbolic math\nSymbolic math is available through an add-on package SymPy (among others). Once loaded, symbolic variables are created with the macro @syms:\nusing SymPy\n\n@syms x a b c\n\n(x, a, b, c)\n\n\n(A macro rewrites values into other commands before they are interpreted. Macros are prefixed with the @ sign. In this use, the “macro” @syms translates x a b c into a command involving SymPy's symbols function.)\nSymbolic expressions - unlike numeric expressions - are not immediately evaluated, though they may be simplified:\n\np = a*x^2 + b*x + c\n\n \n\\[\na x^{2} + b x + c\n\\]\n\n\n\nTo substitute a value, we can use Julia's pair notation (variable=>value):\n\np(x=>2), p(x=>2, a=>3, b=>4, c=>1)\n\n(4*a + 2*b + c, 21)\n\n\nThis is convenient notation for calling the subs function of SymPy.\nSymPy expressions of a single free variable can be plotted directly:\n\nplot(64 - (1/2)*32 * x^2, 0, 2)\n\n\n\n\n\nSymPy has functions for manipulating expressions: simplify, expand, together, factor, cancel, apart, \\(\\dots\\)\nSymPy has functions for basic math: factor, roots, solve, solveset, \\(\\dots\\)\nSymPy has functions for calculus: limit, diff, integrate, \\(\\dots\\)"
},
{
"objectID": "limits/limits.html",
"href": "limits/limits.html",
"title": "18  Limits",
"section": "",
"text": "This section uses the following add-on packages:\nA historic problem in mathematics was to find the area under the graph of \\(f(x)=x^2\\) over \\([0,1]\\).\nThere wasn't a ready-made formula for the area of this shape, as was known for a triangle or a square. However, Archimedes found a method to compute areas enclosed by a parabola and line segments that cross the parabola.\nThe figure illustrates a means to compute the area bounded by the parabola, the line \\(y=1\\) and the line \\(x=0\\) using triangles. It suggests that this area can be found by adding the following sum\n\\[\nA = 1/2 + 1/8 + 2 \\cdot (1/8)^2 + 4 \\cdot (1/8)^3 + \\cdots\n\\]\nThis value is \\(2/3\\), so the area under the curve would be \\(1/3\\). Forget about this specific value - which through more modern machinery becomes uneventful - and focus for a minute on the method: a problem is solved by a suggestion of an infinite process, in this case the creation of more triangles to approximate the unaccounted-for area. This is the so-called method of exhaustion, known since the 5th century BC.\nArchimedes used this method to solve a wide range of area problems related to basic geometric shapes, including a more general statement of what we described above.\nThe \\(\\cdots\\) in the sum expression are the indication that this process continues and that the answer is at the end of an infinite process. To make this line of reasoning rigorous requires the concept of a limit. The concept of a limit is then an old one, but it wasn't until the age of calculus that it was formalized.\nNext, we illustrate how Archimedes approximated \\(\\pi\\), the ratio of the circumference of a circle to its diameter, using interior and exterior \\(n\\)-gons whose perimeters could be computed.\nHere Archimedes uses bounds to constrain an unknown value. Had he been able to compute these bounds for larger and larger \\(n\\), the value of \\(\\pi\\) could be more accurately determined. 
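Archimedes bounding scheme can be sketched numerically. The sketch below is not his recursive perimeter-doubling computation; it is a modern shortcut that leans on the built-in value of pi to compute the n-gon perimeters directly (the helper names inner and outer are ours), so it only illustrates how the two bounds squeeze together:

```julia
# perimeters of regular n-gons inscribed in and circumscribed about
# a circle of unit diameter, whose circumference is pi
inner(n) = n * sin(pi/n)  # inscribed perimeter: a lower bound for pi
outer(n) = n * tan(pi/n)  # circumscribed perimeter: an upper bound for pi

for n in (6, 12, 24, 48, 96)  # the sizes Archimedes worked up to
    println((n, inner(n), outer(n)))
end
```

By n = 96 the two bounds pin pi down to two decimal places, in line with Archimedes famous estimate placing pi between 3 10/71 and 3 1/7.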
In a “limit” it would be squeezed in to have a specific value, which we now know is an irrational number.\nContinuing these concepts, Fermat in the 1600s essentially took a limit to find the slope of a tangent line to a polynomial curve. Newton, in the late 1600s, exploited the idea in his development of calculus (as did Leibniz). Yet it wasn't until the 1800s that Bolzano, Cauchy and Weierstrass put the idea on a firm footing.\nTo make things more precise, we begin by discussing the limit of a univariate function as \\(x\\) approaches \\(c\\).\nInformally, if a limit exists it is the value that \\(f(x)\\) gets close to as \\(x\\) gets close to - but not equal to - \\(c\\).\nThe modern formulation is due to Weierstrass:\nWe comment on this later.\nCauchy begins his incredibly influential treatise on calculus considering two examples, the limit as \\(x\\) goes to \\(0\\) of\n\\[\n\\frac{\\sin(x)}{x} \\quad\\text{and}\\quad (1 + x)^{1/x}.\n\\]\nThese take the indeterminate forms \\(0/0\\) and \\(1^\\infty\\), which are found by just putting \\(0\\) in for \\(x\\). An expression does not need to be defined at \\(c\\), as these two aren't at \\(c=0\\), to discuss its limit. Cauchy illustrates two methods to approach the questions above. The first is to pull out an inequality:\n\\[\n\\frac{\\sin(x)}{\\sin(x)} > \\frac{\\sin(x)}{x} > \\frac{\\sin(x)}{\\tan(x)}\n\\]\nwhich is equivalent to:\n\\[\n1 > \\frac{\\sin(x)}{x} > \\cos(x)\n\\]\nThis bounds the expression \\(\\sin(x)/x\\) between \\(1\\) and \\(\\cos(x)\\), and as \\(x\\) gets close to \\(0\\), the value of \\(\\cos(x)\\) “clearly” goes to \\(1\\), hence \\(L\\) must be \\(1\\). This is an application of the squeeze theorem, the same idea Archimedes implied when bounding the value for \\(\\pi\\) above and below.\nThe above bound comes from this figure, for small \\(x > 0\\):\nTo discuss the case of \\((1+x)^{1/x}\\) it proved convenient to assume \\(x = 1/m\\) for integer values of \\(m\\). 
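With that substitution, a minimal Julia sketch can generate the progression Cauchy read off from tables, tabulating (1 + 1/m)^m for increasing integer m; the values climb toward 2.71828..., the number now called e:

```julia
# values of (1 + 1/m)^m for increasing integer m; the sequence is
# increasing and approaches e = 2.71828...
for m in (1, 10, 100, 1000, 10_000, 100_000)
    println((m, (1 + 1/m)^m))
end
```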
At the time of Cauchy, log tables were available to identify the approximate value of the limit. Cauchy computed the following value from logarithm tables.\nA table can show the progression to this value:\nThis progression can be seen to be increasing. Cauchy, in his treatise, shows this through the expansion:\n\\[\n\\begin{align*}\n(1 + \\frac{1}{m})^m &= 1 + \\frac{1}{1} + \\frac{1}{1\\cdot 2}(1 - \\frac{1}{m}) + \\\\\n& \\frac{1}{1\\cdot 2\\cdot 3}(1 - \\frac{1}{m})(1 - \\frac{2}{m}) + \\cdots \\\\\n&+\n\\frac{1}{1 \\cdot 2 \\cdot \\cdots \\cdot m}(1 - \\frac{1}{m}) \\cdot \\cdots \\cdot (1 - \\frac{m-1}{m}).\n\\end{align*}\n\\]\nThese values are clearly increasing as \\(m\\) increases. Cauchy showed the value was bounded between \\(2\\) and \\(3\\) and had the approximate value above. Then he showed the restriction to integers was not necessary. Later we will use this definition for the exponential function:\n\\[\ne^x = \\lim_{n \\rightarrow \\infty} (1 + \\frac{x}{n})^n,\n\\]\nwith a suitably defined limit.\nThese two cases illustrate that though the definition of the limit exists, the value of a limit is generally found by other means, and the intuition for the value of the limit can be gained numerically."
},
{
"objectID": "limits/limits.html#graphical-approaches-to-limits",
"href": "limits/limits.html#graphical-approaches-to-limits",
"title": "18  Limits",
"section": "18.1 Graphical approaches to limits",
"text": "18.1 Graphical approaches to limits\nLets return to the function \\(f(x) = \\sin(x)/x\\). This function was studied by Euler as part of his solution to the Basel problem. He knew that near \\(0\\), \\(\\sin(x) \\approx x\\), so the ratio is close to \\(1\\) if \\(x\\) is near \\(0\\). Hence, the intuition is \\(\\lim_{x \\rightarrow 0} \\sin(x)/x = 1\\), as Cauchy wrote. We can verify this limit graphically two ways. First, a simple graph shows no issue at \\(0\\):\n\nf(x) = sin(x)/x\nxs, ys = unzip(f, -pi/2, pi/2) # get points used to plot `f`\nplot(xs, ys)\nscatter!(xs, ys)\n\n\n\n\nThe \\(y\\) values of the graph seem to go to \\(1\\) as the \\(x\\) values get close to \\(0\\). (That the graph looks defined at \\(0\\) is due to the fact that the points sampled to graph do not include \\(0\\), as shown through the scatter! command which can be checked via minimum(abs, xs).)\nWe can also verify Eulers intuition through this graph:\n\nplot(sin, -pi/2, pi/2)\nplot!(identity) # the function y = x, like how zero is y = 0\n\n\n\n\nThat the two are indistinguishable near \\(0\\) makes it easy to see that their ratio should be going towards \\(1\\).\nA parametric plot shows the same, we see below the slope at \\((0,0)\\) is basically \\(1\\), because the two functions are varying at the same rate when they are each near \\(0\\)\n\nplot(sin, identity, -pi/2, pi/2) # parametric plot\n\n\n\n\nThe graphical approach to limits - plotting \\(f(x)\\) around \\(c\\) and observing if the \\(y\\) values seem to converge to an \\(L\\) value when \\(x\\) get close to \\(c\\) - allows us to gather quickly if a function seems to have a limit at \\(c\\), though the precise value of \\(L\\) may be hard to identify.\n\nExample\nThis example illustrates the same limit a different way. 
Sliding the \\(x\\) value towards \\(0\\) shows \\(f(x) = \\sin(x)/x\\) approaches a value of \\(1\\).\n\n\nJXG = require(\"jsxgraph\")\n\nb = JXG.JSXGraph.initBoard('jsxgraph', {\n boundingbox: [-6, 1.2, 6,-1.2], axis:true\n});\n\nf = function(x) {return Math.sin(x) / x;};\ngraph = b.create(\"functiongraph\", [f, -6, 6])\nseg = b.create(\"line\", [[-6,0], [6,0]], {fixed:true});\n\nX = b.create(\"glider\", [2, 0, seg], {name:\"x\", size:4});\nP = b.create(\"point\", [function() {return X.X()}, function() {return f(X.X())}], {name:\"\"});\nQ = b.create(\"point\", [0, function() {return P.Y();}], {name:\"f(x)\"});\n\nsegup = b.create(\"segment\", [P,X], {dash:2});\nsegover = b.create(\"segment\", [P, [0, function() {return P.Y()}]], {dash:2});\n\n\ntxt = b.create('text', [2, 1, function() {\n return \"x = \" + X.X().toFixed(4) + \", f(x) = \" + P.Y().toFixed(4);\n}]);\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nExample\nConsider now the following limit\n\\[\n\\lim_{x \\rightarrow 2} \\frac{x^2 - 5x + 6}{x^2 +x - 6}\n\\]\nNoting that this is a ratio of nice polynomial functions, we first check whether there is anything to do:\n\nf(x) = (x^2 - 5x + 6) / (x^2 + x - 6)\nc = 2\nf(c)\n\nNaN\n\n\nThe NaN indicates that this function is indeterminate at \\(c=2\\). A quick plot gives us an idea that the limit exists and is roughly \\(-0.2\\):\n\nc, delta = 2, 1\nplot(x -> (x^2 - 5x + 6) / (x^2 + x - 6), c - delta, c + delta)\n\n\n\n\nThe graph looks “continuous.” In fact, the value \\(c=2\\) is termed a removable singularity as redefining \\(f(x)\\) to be \\(-0.2\\) when \\(x=2\\) results in a “continuous” function.\nAs an aside, we can redefine f using the “ternary operator”:\nf(x) = x == 2.0 ? 
-0.2 : (x^2 - 5x + 6) / (x^2 + x - 6)\nThis particular case is a textbook example: one can easily factor \\(f(x)\\) to get:\n\\[\nf(x) = \\frac{(x-2)(x-3)}{(x-2)(x+3)}\n\\]\nWritten in this form, we clearly see that this is the same function as \\(g(x) = (x-3)/(x+3)\\) when \\(x \\neq 2\\). The function \\(g(x)\\) is “continuous” at \\(x=2\\). So were one to redefine \\(f(x)\\) at \\(x=2\\) to be \\(g(2) = (2 - 3)/(2 + 3) = -0.2\\) it would be made continuous, hence the term removable singularity."
},
{
"objectID": "limits/limits.html#numerical-approaches-to-limits",
"href": "limits/limits.html#numerical-approaches-to-limits",
"title": "18  Limits",
"section": "18.2 Numerical approaches to limits",
"text": "18.2 Numerical approaches to limits\nThe investigation of \\(\\lim_{x \\rightarrow 0}(1 + x)^{1/x}\\) by evaluating the function at \\(1/10000\\) by Cauchy can be done much more easily nowadays. As does a graphical approach, a numerical approach can give insight into a limit and often a good numeric estimate.\nThe basic idea is to create a sequence of \\(x\\) values going towards \\(c\\) and then investigate if the corresponding \\(y\\) values are eventually near some \\(L\\).\nBest, to see by example. Suppose we are asked to investigate\n\\[\n\\lim_{x \\rightarrow 25} \\frac{\\sqrt{x} - 5}{\\sqrt{x - 16} - 3}.\n\\]\nWe first define a function and check if there are issues at \\(25\\):\n\nf(x) = (sqrt(x) - 5) / (sqrt(x-16) - 3)\n\nf (generic function with 1 method)\n\n\n\nc = 25\nf(c)\n\nNaN\n\n\nSo yes, an issue of the indeterminate form \\(0/0\\). We investigate numerically by making a set of numbers getting close to \\(c\\). This is most easily done making numbers getting close to \\(0\\) and adding them to or subtracting them from \\(c\\). Some natural candidates are negative powers of \\(10\\):\n\nhs = [1/10^i for i in 1:8]\n\n8-element Vector{Float64}:\n 0.1\n 0.01\n 0.001\n 0.0001\n 1.0e-5\n 1.0e-6\n 1.0e-7\n 1.0e-8\n\n\nWe can add these to \\(c\\) and then evaluate:\n\nxs = c .+ hs\nys = f.(xs)\n\n8-element Vector{Float64}:\n 0.6010616008415922\n 0.6001066157341047\n 0.6000106661569936\n 0.6000010666430725\n 0.6000001065281493\n 0.6000000122568625\n 0.5999999946709295\n 0.6\n\n\nTo visualize, we can put in a table using [xs ys] notation:\n\n[xs ys]\n\n8×2 Matrix{Float64}:\n 25.1 0.601062\n 25.01 0.600107\n 25.001 0.600011\n 25.0001 0.600001\n 25.0 0.6\n 25.0 0.6\n 25.0 0.6\n 25.0 0.6\n\n\nThe \\(y\\)-values seem to be getting near \\(0.6\\).\nSince limits are defined by the expression \\(0 < \\lvert x-c\\rvert < \\delta\\), we should also look at values smaller than \\(c\\). 
There isnt much difference (note the .- sign in c .- hs):\n\nxs = c .- hs\nys = f.(xs)\n[xs ys]\n\n8×2 Matrix{Float64}:\n 24.9 0.598928\n 24.99 0.599893\n 24.999 0.599989\n 24.9999 0.599999\n 25.0 0.6\n 25.0 0.6\n 25.0 0.6\n 25.0 0.6\n\n\nSame story. The numeric evidence supports a limit of \\(L=0.6\\).\n\nExample: the secant line\nLet \\(f(x) = x^x\\) and consider the ratio:\n\\[\n\\frac{f(c + h) - f(c)}{h}\n\\]\nAs \\(h\\) goes to \\(0\\), this will take the form \\(0/0\\) in most cases, and in the particular case of \\(f(x) = x^x\\) and \\(c=1\\) it will be. The expression has a geometric interpretation of being the slope of the secant line connecting the two points \\((c,f(c))\\) and \\((c+h, f(c+h))\\).\nTo look at the limit in this example, we have (recycling the values in hs):\n\nc = 1\nf(x) = x^x\nys = [(f(c + h) - f(c)) / h for h in hs]\n[hs ys]\n\n8×2 Matrix{Float64}:\n 0.1 1.10534\n 0.01 1.01005\n 0.001 1.001\n 0.0001 1.0001\n 1.0e-5 1.00001\n 1.0e-6 1.0\n 1.0e-7 1.0\n 1.0e-8 1.0\n\n\nThe limit looks like \\(L=1\\). A similar check on the left will confirm this numerically.\n\n\n18.2.1 Issues with the numeric approach\nThe numeric approach often gives a good intuition as to the existence of a limit and its value. However, it can be misleading. Consider this limit question:\n\\[\n\\lim_{x \\rightarrow 0} \\frac{1 - \\cos(x)}{x^2}.\n\\]\nWe can see that it is indeterminate of the form \\(0/0\\):\n\ng(x) = (1 - cos(x)) / x^2\ng(0)\n\nNaN\n\n\nWhat is the value of \\(L\\), if it exists? A quick attempt numerically yields:\n\n𝒙s = 0 .+ hs\n𝒚s = [g(x) for x in 𝒙s]\n[𝒙s 𝒚s]\n\n8×2 Matrix{Float64}:\n 0.1 0.499583\n 0.01 0.499996\n 0.001 0.5\n 0.0001 0.5\n 1.0e-5 0.5\n 1.0e-6 0.500044\n 1.0e-7 0.4996\n 1.0e-8 0.0\n\n\nHmm, the values in ys appear to be going to \\(0.5\\), but then end up at \\(0\\). Is the limit \\(0\\) or \\(1/2\\)? The answer is \\(1/2\\). 
The last \\(0\\) is an artifact of floating point arithmetic and the last few deviations from 0.5 are due to loss of precision in subtraction. To investigate, we look more carefully at the two ratios:\n\ny1s = [1 - cos(x) for x in 𝒙s]\ny2s = [x^2 for x in 𝒙s]\n[𝒙s y1s y2s]\n\n8×3 Matrix{Float64}:\n 0.1 0.00499583 0.01\n 0.01 4.99996e-5 0.0001\n 0.001 5.0e-7 1.0e-6\n 0.0001 5.0e-9 1.0e-8\n 1.0e-5 5.0e-11 1.0e-10\n 1.0e-6 5.00044e-13 1.0e-12\n 1.0e-7 4.996e-15 1.0e-14\n 1.0e-8 0.0 1.0e-16\n\n\nLooking at the bottom of the second column reveals the error. The value of 1 - cos(1.0e-8) is 0 and not a value around 5e-17, as would be expected from the pattern above it. This is because the smallest floating point value less than 1.0 is more than 5e-17 units away, so cos(1e-8) is evaluated to be 1.0. There just isn't enough granularity to get this close to \\(0\\).\nNot that we needed to. The answer would have been clear if we had stopped with x=1e-6, say.\nIn general, some functions will frustrate the numeric approach. It is best to be wary of results. At a minimum they should confirm what a quick graph shows, though even that isn't enough, as this next example shows.\n\nExample\nLet \\(h(x)\\) be defined by\n\\[\nh(x) = x^2 + 1 + \\log(| 11 \\cdot x - 15 |)/99.\n\\]\nThe question is to investigate\n\\[\n\\lim_{x \\rightarrow 15/11} h(x)\n\\]\nA plot shows the answer appears to be straightforward:\n\n\n\n\n\nTaking values near \\(15/11\\) shows nothing too unusual:\n\nc = 15/11\nhs = [1/10^i for i in 4:3:16]\nxs = c .+ hs\n[xs h.(xs)]\n\n5×2 Matrix{Float64}:\n 1.36374 2.79096\n 1.36364 2.72092\n 1.36364 2.65114\n 1.36364 2.58135\n 1.36364 2.51643\n\n\n(Though both the graph and the table hint at something a bit odd.)\nHowever, the limit in this case is \\(-\\infty\\) (or DNE), as there is an asymptote at \\(c=15/11\\). 
The problem is the asymptote due to the logarithm is extremely narrow and happens between floating point values to the left and right of \\(15/11\\).\n\n\n\n18.2.2 Richardson extrapolation\nThe Richardson package provides a function to extrapolate a function f(x) to f(x0), as the numeric limit does. We illustrate its use by example:\n\nf(x) = sin(x)/x\nextrapolate(f, 1)\n\n(0.9999999999922424, 4.538478481919128e-9)\n\n\nThe answer involves two terms, the second being an estimate for the error in the estimation of f(0).\nThe values the method chooses could be viewed as follows:\n\nextrapolate(1) do x # using `do` notation for the function\n @show x\n sin(x)/x\nend\n\nx = 1.0\nx = 0.125\nx = 0.015625\nx = 0.001953125\nx = 0.000244140625\n\n\n(0.9999999999922424, 4.538478481919128e-9)\n\n\nThe extrapolate function avoids the numeric problems encountered in the following example\n\nf(x) = (1 - cos(x)) / x^2\nextrapolate(f, 1)\n\n(0.5000000007193545, 4.705535960880525e-11)\n\n\nTo find limits at a value of c not equal to 0, we set the x_0 argument. For example,\n\nf(x) = (sqrt(x) - 5) / (sqrt(x-16) - 3)\nc = 25\nextrapolate(f, 26, x0=25)\n\n(0.6000000000015944, 4.5619086286308175e-10)\n\n\nThis value can also be Inf, in anticipation of infinite limits to be discussed in a subsequent section:\n\nf(x) = (x^2 - 2x + 1)/(x^3 - 3x^2 + 2x + 1)\nextrapolate(f, 10, x0=Inf)\n\n(0.0, 0.0)\n\n\n(The starting value should be to the right of any zeros of the denominator.)"
},
{
"objectID": "limits/limits.html#symbolic-approach-to-limits",
"href": "limits/limits.html#symbolic-approach-to-limits",
"title": "18  Limits",
"section": "18.3 Symbolic approach to limits",
"text": "18.3 Symbolic approach to limits\nThe SymPy package provides a limit function for finding the limit of an expression in a given variable. It must be loaded, as was done initially. The limit function requires the expression, the variable, and a value for \\(c\\). (Similar to the three things in the notation \\(\\lim_{x \\rightarrow c}f(x)\\).)\nFor example, the limit at \\(0\\) of \\((1-\\cos(x))/x^2\\) is easily handled:\n\n@syms x::real\nlimit((1 - cos(x)) / x^2, x => 0)\n\n \n\\[\n\\frac{1}{2}\n\\]\n\n\n\nThe pair notation (x => 0) is used to indicate the variable and the value it is going to.\n\nExample\nWe look again at this function, which, despite having a vertical asymptote at \\(x=15/11\\), has the property that it is positive for all floating point values, making both a numeric and graphical approach impossible:\n\\[\nf(x) = x^2 + 1 + \\log(| 11 \\cdot x - 15 |)/99.\n\\]\nWe find the limit symbolically at \\(c=15/11\\) as follows, taking care to use the exact value 15//11 and not the floating point approximation returned by 15/11:\n\nf(x) = x^2 + 1 + log(abs(11x - 15))/99\nlimit(f(x), x => 15 // 11)\n\n \n\\[\n-\\infty\n\\]\n\n\n\n\n\nExample\nFind the limits:\n\\[\n\\lim_{x \\rightarrow 0} \\frac{2\\sin(x) - \\sin(2x)}{x - \\sin(x)}, \\quad\n\\lim_{x \\rightarrow 0} \\frac{e^x - 1 - x}{x^2}, \\quad\n\\lim_{\\rho \\rightarrow 1} \\frac{x^{1-\\rho} - 1}{1 - \\rho}.\n\\]\nWe have for the first:\n\nlimit( (2sin(x) - sin(2x)) / (x - sin(x)), x => 0)\n\n \n\\[\n6\n\\]\n\n\n\nThe second is similarly done, though here we define a function for variety:\n\nf(x) = (exp(x) - 1 - x) / x^2\nlimit(f(x), x => 0)\n\n \n\\[\n\\frac{1}{2}\n\\]\n\n\n\nFinally, for the third we define a new variable and proceed:\n\n@syms rho::real\nlimit( (x^(1-rho) - 1) / (1 - rho), rho => 1)\n\n \n\\[\n\\log{\\left(x \\right)}\n\\]\n\n\n\nThis last limit demonstrates that the limit function of SymPy can readily evaluate limits that involve parameters, though at times some 
assumptions on the parameters may be needed, as was done through rho::real\nHowever, for some cases, the assumptions will not be enough, as they are broad. (E.g., something might be true for some values of the parameter and not others and these values arent captured in the assumptions.) So the user must be mindful that when parameters are involved, the answer may not reflect all possible cases.\n\n\nExample: floating point conversion issues\nThe Gruntz algorithm implemented in SymPy for symbolic limits is quite powerful. However, some care must be exercised to avoid undesirable conversions from exact values to floating point values.\nIn a previous example, we used 15//11 and not 15/11, as the former converts to an exact symbolic value for use in SymPy, but the latter would be approximated in floating point before this conversion so the exactness would be lost.\nTo illustrate further, lets look at the limit as \\(x\\) goes to \\(\\pi/2\\) of \\(j(x) = \\cos(x) / (x - \\pi/2)\\). We follow our past practice:\n\nj(x) = cos(x) / (x - pi/2)\nj(pi/2)\n\nInf\n\n\nThe value is not NaN, rather Inf. This is because cos(pi/2) is not exactly \\(0\\) as it should be mathematically, as pi/2 is rounded to a floating point number. This minor difference is important. If we try and correct for this by using PI we have:\n\nlimit(j(x), x => PI/2)\n\n \n\\[\n0\n\\]\n\n\n\nThe value is not right, as this simple graph suggests the limit is in fact \\(-1\\):\n\nplot(j, pi/4, 3pi/4)\n\n\n\n\nThe difference between pi and PI can be significant, and though usually pi is silently converted to PI, it doesnt happen here as the division by 2 happens first, which turns the symbol into an approximate floating point number. 
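This conversion order is easy to check in plain Julia, without any symbolic machinery:

```julia
# pi itself is stored exactly, as an Irrational
pi isa Irrational # true
# dividing by 2 forces an ordinary floating point value
pi/2 isa Float64 # true
# so cos of the rounded value is tiny, but not exactly zero
cos(pi/2) == 0 # false
```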
Hence, SymPy is giving the correct answer for the problem it is given, it just isnt the problem we wanted to look at.\nTrying again, being more aware of how pi and PI differ, we have:\n\nf(x) = cos(x) / (x - PI/2)\nlimit(f(x), x => PI/2)\n\n \n\\[\n-1\n\\]\n\n\n\n(The value pi is able to be exactly converted to PI when used in SymPy, as it is of type Irrational, and is not a floating point value. However, the expression pi/2 converts pi to a floating point value and then divides by 2, hence the loss of exactness when used symbolically.)\n\n\nExample: left and right limits\nRight and left limits will be discussed in the next section; here we give an example of the idea. The mathematical convention is to say a limit exists if both the left and right limits exist and are equal. Informally a right (left) limit at \\(c\\) only considers values of \\(x\\) less (more) than \\(c\\). The limit function of SymPy finds directional limits by default, a right limit, where \\(x > c\\).\nThe left limit can be found by passing the argument dir=\"-\". Passing dir=\"+-\" (and not \"-+\") will compute the mathematical limit, throwing an error in Python if no limit exists.\n\nlimit(ceil(x), x => 0), limit(ceil(x), x => 0, dir=\"-\")\n\n(1, 0)\n\n\nThis accurately shows the limit does not exist mathematically, but limit(ceil(x), x => 0) does exist (as it finds a right limit)."
},
{
"objectID": "limits/limits.html#rules-for-limits",
"href": "limits/limits.html#rules-for-limits",
"title": "18  Limits",
"section": "18.4 Rules for limits",
"text": "18.4 Rules for limits\nThe limit function doesnt compute limits from the definition, rather it applies some known facts about functions within a set of rules. Some of these rules are the following. Suppose the individual limits of \\(f\\) and \\(g\\) always exist (and are finite) below.\n\\[\n\\begin{align*}\n\\lim_{x \\rightarrow c} (a \\cdot f(x) + b \\cdot g(x)) &= a \\cdot\n \\lim_{x \\rightarrow c} f(x) + b \\cdot \\lim_{x \\rightarrow c} g(x)\n &\\\\\n%%\n\\lim_{x \\rightarrow c} f(x) \\cdot g(x) &= \\lim_{x \\rightarrow c}\n f(x) \\cdot \\lim_{x \\rightarrow c} g(x)\n &\\\\\n%%\n\\lim_{x \\rightarrow c} \\frac{f(x)}{g(x)} &=\n \\frac{\\lim_{x \\rightarrow c} f(x)}{\\lim_{x \\rightarrow c} g(x)}\n &(\\text{provided }\\lim_{x \\rightarrow c} g(x) \\neq 0)\\\\\n\\end{align*}\n\\]\nThese are verbally described as follows, when the individual limits exist and are finite then:\n\nLimits involving sums, differences or scalar multiples of functions exist and can be computed by first doing the individual limits and then combining the answers appropriately.\nLimits of products exist and can be found by computing the limits of the individual factors and then combining.\nLimits of ratios exist and can be found by computing the limit of the individual terms and then dividing provided you dont divide by \\(0\\). 
The last part is really important, as this rule is no help with the common indeterminate form \\(0/0\\).\n\nIn addition, consider the composition:\n\\[\n\\lim_{x \\rightarrow c} f(g(x))\n\\]\nSuppose that\n\nThe outer limit, \\(\\lim_{x \\rightarrow b} f(x) = L\\), exists, and\nthe inner limit, \\(\\lim_{x \\rightarrow c} g(x) = b\\), exists and\nfor some neighborhood around \\(c\\) (not including \\(c\\)) \\(g(x)\\) is not \\(b\\),\n\nThen the limit exists and equals \\(L\\):\n\\(\\lim_{x \\rightarrow c} f(g(x)) = \\lim_{u \\rightarrow b} f(u) = L.\\)\nAn alternative is to assume \\(f(x)\\) is defined at \\(b\\) and equal to \\(L\\) (which is the definition of continuity), but that isn't the assumption above, hence the need to exclude \\(g\\) from taking on a value of \\(b\\) (where \\(f\\) may not be defined) near \\(c\\).\nThese rules, together with the fact that our basic algebraic functions have limits that can be found by simple evaluation, mean that many limits are easy to compute.\n\nExample: composition\nFor example, consider for some non-zero \\(k\\) the following limit:\n\\[\n\\lim_{x \\rightarrow 0} \\frac{\\sin(kx)}{x}.\n\\]\nThis is clearly related to the function \\(f(x) = \\sin(x)/x\\), which has a limit of \\(1\\) as \\(x \\rightarrow 0\\). The function in question can be written as \\(k f(kx)\\). As \\(kx \\rightarrow 0\\), though not taking a value of \\(0\\) except when \\(x=0\\), the limit above is \\(k \\lim_{x \\rightarrow 0} f(kx) = k \\lim_{u \\rightarrow 0} f(u) = k \\cdot 1 = k\\).\nBasically, when taking a limit as \\(x\\) goes to \\(0\\) we can multiply \\(x\\) by any constant and figure out the limit for that. (It is as though we “go to” \\(0\\) faster or slower, but are still going to \\(0\\).)\nSimilarly,\n\\[\n\\lim_{x \\rightarrow 0} \\frac{\\sin(x^2)}{x^2} = 1,\n\\]\nas this is the limit of \\(f(g(x))\\) with \\(f\\) as above and \\(g(x) = x^2\\). 
We need that \\(g(x)\\) is not \\(0\\) for \\(x\\) near, but not equal to, \\(0\\); since \\(g\\) is only \\(0\\) at \\(x=0\\), this is the case.\n\nExample: products\nConsider this complicated limit found on this Wikipedia page.\n\\[\n\\lim_{x \\rightarrow 1/2} \\frac{\\sin(\\pi x)}{\\pi x} \\cdot \\frac{\\cos(\\pi x)}{1 - (2x)^2}.\n\\]\nWe know the first factor has a limit found by evaluation: \\(2/\\pi\\), so it is really just a constant. The second we can compute:\n\nl(x) = cos(PI*x) / (1 - (2x)^2)\nlimit(l, 1//2)\n\n \n\\[\n\\frac{\\pi}{4}\n\\]\n\n\n\nPutting these together, we would get \\(2/\\pi \\cdot \\pi/4 = 1/2\\), which we could have computed directly in this case:\n\nlimit(sin(PI*x)/(PI*x) * l(x), x => 1//2)\n\n \n\\[\n\\frac{1}{2}\n\\]\n\n\n\n\n\nExample: ratios\nConsider again the limit of \\(\\cos(\\pi x) / (1 - (2x)^2)\\) at \\(c=1/2\\). A graph of both the top and bottom functions shows the indeterminate, \\(0/0\\), form:\n\nplot(cos(pi*x), 0.4, 0.6)\nplot!(1 - (2x)^2)\n\n\n\n\nHowever, following Euler's insight that \\(\\sin(x)/x\\) will have a limit at \\(0\\) of \\(1\\) as \\(\\sin(x) \\approx x\\), and \\(x/x\\) has a limit of \\(1\\) at \\(c=0\\), we can see that \\(\\cos(\\pi x)\\) looks like \\(-\\pi\\cdot (x - 1/2)\\) and \\((1 - (2x)^2)\\) looks like \\(-4(x-1/2)\\) around \\(x=1/2\\):\n\nplot(cos(pi*x), 0.4, 0.6)\nplot!(-pi*(x - 1/2))\n\n\n\n\n\nplot(1 - (2x)^2, 0.4, 0.6)\nplot!(-4(x - 1/2))\n\n\n\n\nSo around \\(c=1/2\\) the ratio should look like \\(-\\pi (x-1/2) / ( -4(x - 1/2)) = \\pi/4\\), which indeed it does, as that is the limit.\nThis is the basis of L'Hôpital's rule, which we will return to once the derivative is discussed.\n\nExample: sums\nIf it is known that the following limit exists by some means:\n\\[\nL = 0 = \\lim_{x \\rightarrow 0} \\frac{e^{\\csc(x)}}{e^{\\cot(x)}} - (1 + \\frac{1}{2}x + \\frac{1}{8}x^2)\n\\]\nThen this limit will exist:\n\\[\nM = \\lim_{x \\rightarrow 0} \\frac{e^{\\csc(x)}}{e^{\\cot(x)}}\n\\]\nWhy? 
We can express the function \\(e^{\\csc(x)}/e^{\\cot(x)}\\) as the above function plus the polynomial \\(1 + x/2 + x^2/8\\). The above is then the sum of two functions whose limits exist and are finite; hence, we can conclude that \\(M = 0 + 1 = 1\\).\n\n\n18.4.1 The squeeze theorem\nWe note one more limit law. Suppose we wish to compute \\(\\lim_{x \\rightarrow c}f(x)\\) and we have two other functions, \\(l\\) and \\(u\\), satisfying:\n\nfor all \\(x\\) near \\(c\\) (possibly not including \\(c\\)) \\(l(x) \\leq f(x) \\leq u(x)\\).\nThese limits exist and are equal: \\(L = \\lim_{x \\rightarrow c} l(x) = \\lim_{x \\rightarrow c} u(x)\\).\n\nThen the limit of \\(f\\) must also be \\(L\\).\n\n\n \n As \\(x\\) goes to \\(0\\), the values of \\(\\sin(x)/x\\) are squeezed between \\(\\cos(x)\\) and \\(1\\), which both converge to \\(1\\)."
},
{
"objectID": "limits/limits.html#limits-from-the-definition",
"href": "limits/limits.html#limits-from-the-definition",
"title": "18  Limits",
"section": "18.5 Limits from the definition",
"text": "18.5 Limits from the definition\nThe formal definition of a limit involves clarifying what it means for \\(f(x)\\) to be “close to \\(L\\)” when \\(x\\) is “close to \\(c\\)”. These are quantified by the inequalities \\(0 < \\lvert x-c\\rvert < \\delta\\) and the \\(\\lvert f(x) - L\\rvert < \\epsilon\\). The second does not have the restriction that it is greater than \\(0\\), as indeed \\(f(x)\\) can equal \\(L\\). The order is important: it says for any idea of close for \\(f(x)\\) to \\(L\\), an idea of close must be found for \\(x\\) to \\(c\\).\nThe key is identifying a value for \\(\\delta\\) for a given value of \\(\\epsilon\\).\nA simple case is the linear case. Consider the function \\(f(x) = 3x + 2\\). Verify that the limit at \\(c=1\\) is \\(5\\).\nWe show “numerically” that \\(\\delta = \\epsilon/3\\).\n\nf(x) = 3x + 2\nc, L = 1, 5\nepsilon = rand() # some number in (0,1)\ndelta = epsilon / 3\nxs = c .+ delta * rand(100) # 100 numbers, c < x < c + delta\nas = [abs(f(x) - L) < epsilon for x in xs]\nall(as) # are all the as true?\n\ntrue\n\n\nThese lines produce a random \\(\\epsilon\\), the resulting \\(\\delta\\), and then verify for 100 numbers within \\((c, c+\\delta)\\) that the inequality \\(\\lvert f(x) - L \\rvert < \\epsilon\\) holds for each. Running them again and again should always produce true if \\(L\\) is the limit and \\(\\delta\\) is chosen properly.\n(Of course, we should also verify values to the left of \\(c\\).)\n(The random numbers are technically in \\([0,1)\\), so in theory epsilon could be 0. So the above approach would be more solid if some guard, such as epsilon = max(eps(), rand()), was used. As the formal definition is the domain of paper-and-pencil, we dont fuss.)\nIn this case, \\(\\delta\\) is easy to guess, as the function is linear and has slope \\(3\\). This basically says the \\(y\\) scale is 3 times the \\(x\\) scale. 
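The same random-sampling check can be tried on a non-linear function. Here is a sketch for \\(f(x) = x^3\\) at \\(c = 0\\) with \\(L = 0\\), assuming the choice \\(\\delta = \\epsilon^{1/3}\\) and adding the guard against a zero epsilon mentioned above:

```julia
# epsilon-delta check for f(x) = x^3 at c = 0, L = 0
f(x) = x^3
c, L = 0, 0
epsilon = max(eps(), rand())     # guard: keep epsilon strictly positive
delta = cbrt(epsilon)            # the candidate delta = epsilon^(1/3)
xs = c .+ delta * rand(100)      # 100 numbers with c <= x < c + delta
all(abs(f(x) - L) < epsilon for x in xs) # true
```

Since each sampled \\(x\\) satisfies \\(x < \\delta\\), we get \\(x^3 < \\delta^3 = \\epsilon\\), so the check always passes.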
For non-linear functions, finding \\(\\delta\\) for a given \\(\\epsilon\\) can be a challenge. For the function \\(f(x) = x^3\\), illustrated below, a value of \\(\\delta=\\epsilon^{1/3}\\) is used for \\(c=0\\):\n\n\n \n Demonstration of \\(\\epsilon\\)-\\(\\delta\\) proof of \\(\\lim_{x \\rightarrow 0} x^3 = 0\\). For any \\(\\epsilon>0\\) (the orange lines) there exists a \\(\\delta>0\\) (the red lines of the box) for which the function \\(f(x)\\) does not leave the top or bottom of the box (except possibly at the edges). In this example \\(\\delta^3=\\epsilon\\)."
},
{
"objectID": "limits/limits.html#questions",
"href": "limits/limits.html#questions",
"title": "18  Limits",
"section": "18.6 Questions",
"text": "18.6 Questions\n\nQuestion\nFrom the graph, find the limit:\n\\[\nL = \\lim_{x\\rightarrow 1} \\frac{x^23x+2}{x^26x+5}\n\\]\n\n\n\n\n\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFrom the graph, find the limit \\(L\\):\n\\[\nL = \\lim_{x \\rightarrow -2} \\frac{x}{x+1} \\frac{x^2}{x^2 + 4}\n\\]\n\n\n\n\n\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nGraphically investigate the limit\n\\[\nL = \\lim_{x \\rightarrow 0} \\frac{e^x - 1}{x}.\n\\]\nWhat is the value of \\(L\\)?\n\n\n\n\n\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nGraphically investigate the limit\n\\[\n\\lim_{x \\rightarrow 0} \\frac{\\cos(x) - 1}{x}.\n\\]\nThe limit exists, what is the value?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSelect the graph for which there is no limit at \\(a\\).\n\n\n\n \n \n \n \n \n\n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe following limit is commonly used:\n\\[\n\\lim_{h \\rightarrow 0} \\frac{e^{x + h} - e^x}{h} = L.\n\\]\nFactoring out \\(e^x\\) from the top and using rules of limits this becomes,\n\\[\nL = e^x \\lim_{h \\rightarrow 0} \\frac{e^h - 1}{h}.\n\\]\nWhat is \\(L\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(0\\)\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \\(e^x\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe following limit is commonly used:\n\\[\n\\lim_{h \\rightarrow 0} \\frac{\\sin(x + h) - \\sin(x)}{h} = L.\n\\]\nThe answer should depend on \\(x\\), though it is possible it is a constant. 
Using a double angle formula and the rules of limits, this can be written as:\n\\[\nL = \\cos(x) \\lim_{h \\rightarrow 0}\\frac{\\sin(h)}{h} + \\sin(x) \\lim_{h \\rightarrow 0}\\frac{\\cos(h)-1}{h}.\n\\]\nUsing the last result, what is the value of \\(L\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\sin(x)\\)\n \n \n\n\n \n \n \n \n \\(\\cos(x)\\)\n \n \n\n\n \n \n \n \n \\(\\sin(h)/h\\)\n \n \n\n\n \n \n \n \n \\(0\\)\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the limit as \\(x\\) goes to \\(2\\) of\n\\[\nf(x) = \\frac{3x^2 - x -10}{x^2 - 4}\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the limit as \\(x\\) goes to \\(-2\\) of\n\\[\nf(x) = \\frac{\\frac{1}{x} + \\frac{1}{2}}{x^3 + 8}\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the limit as \\(x\\) goes to \\(27\\) of\n\\[\nf(x) = \\frac{x - 27}{x^{1/3} - 3}\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the limit\n\\[\nL = \\lim_{x \\rightarrow \\pi/2} \\frac{\\tan (2x)}{x - \\pi/2}\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe limit of \\(\\sin(x)/x\\) at \\(0\\) has a numeric value. This depends upon the fact that \\(x\\) is measured in radians. Try to find this limit: limit(sind(x)/x, x => 0). What is the value?\n\n\n\n \n \n \n \n \n \n \n \n \n 0\n \n \n\n\n \n \n \n \n 1\n \n \n\n\n \n \n \n \n 180/pi\n \n \n\n\n \n \n \n \n pi/180\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the limit limit(sinpi(x)/x, x => 0)?\n\n\n\n \n \n \n \n \n \n \n \n \n 1\n \n \n\n\n \n \n \n \n 1/pi\n \n \n\n\n \n \n \n \n 0\n \n \n\n\n \n \n \n \n pi\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion: limit properties\nThere are several properties of limits that allow one to break down more complicated problems into smaller subproblems. 
For example,\n\\[\n\\lim (f(x) + g(x)) = \\lim f(x) + \\lim g(x)\n\\]\nis notation to indicate that one can take a limit of the sum of two function or take the limit of each first, then add and the answer will be unchanged, provided all the limits in question exist.\nUse one or the either to find the limit of \\(f(x) = \\sin(x) + \\tan(x) + \\cos(x)\\) as \\(x\\) goes to \\(0\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe key assumption made above in being able to write\n\\[\n\\lim_{x\\rightarrow c} f(g(x)) = L,\n\\]\nwhen \\(\\lim_{x\\rightarrow b} f(x) = L\\) and \\(\\lim_{x\\rightarrow c}g(x) = b\\) is continuity.\nThis example shows why it is important.\nTake\n\\[\nf(x) = \\begin{cases}\n0 & x \\neq 0\\\\\n1 & x = 0\n\\end{cases}\n\\]\nWe have \\(\\lim_{x\\rightarrow 0}f(x) = 0\\), as \\(0\\) is clearly a removable discontinuity. So were the above applicable we would have \\(\\lim_{x \\rightarrow 0}f(f(x)) = 0\\). But this is not true. What is the limit at \\(0\\) of \\(f(f(x))\\)?\n\nnumericq(1)\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nDoes this function have a limit as \\(h\\) goes to \\(0\\) from the right (that is, assume \\(h>0\\))?\n\\[\n\\frac{h^h - 1}{h}\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n Yes, the value is -11.5123\n \n \n\n\n \n \n \n \n Yes, the value is -9.2061\n \n \n\n\n \n \n \n \n No, the value heads to negative infinity\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute the limit\n\\[\n\\lim_{x \\rightarrow 1} \\frac{x}{x-1} - \\frac{1}{\\ln(x)}.\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute the limit\n\\[\n\\lim_{x \\rightarrow 1/2} \\frac{1}{\\pi} \\frac{\\cos(\\pi x)}{1 - (2x)^2}.\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSome limits involve parameters. 
For example, suppose we define ex as follows:\n\n@syms m::real k::real\nex = (1 + k*x)^(m/x)\n\n \n\\[\n\\left(k x + 1\\right)^{\\frac{m}{x}}\n\\]\n\n\n\nWhat is limit(ex, x => 0)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(k/m\\)\n \n \n\n\n \n \n \n \n \\(e^{km}\\)\n \n \n\n\n \n \n \n \n \\(e^{k/m}\\)\n \n \n\n\n \n \n \n \n \\(0\\)\n \n \n\n\n \n \n \n \n \\(m/k\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor a given \\(a\\), what is\n\\[\nL = \\lim_{x \\rightarrow 0+} (1 + a\\cdot (e^{-x} -1))^{(1/x)}\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n \\(L\\) does not exist\n \n \n\n\n \n \n \n \n \\(e^a\\)\n \n \n\n\n \n \n \n \n \\(e^{-a}\\)\n \n \n\n\n \n \n \n \n \\(a\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor positive integers \\(m\\) and \\(n\\) what is\n\\[\n\\lim_{x \\rightarrow 1} \\frac{x^{1/m}-1}{x^{1/n}-1}?\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n \\(n/m\\)\n \n \n\n\n \n \n \n \n \\(m/n\\)\n \n \n\n\n \n \n \n \n \\(mn\\)\n \n \n\n\n \n \n \n \n The limit does not exist\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat does SymPy find for the limit of ex (limit(ex, x => 0)), as defined here:\n\n@syms x a\nex = (a^x - 1)/x\n\n \n\\[\n\\frac{a^{x} - 1}{x}\n\\]\n\n\n\n\n\n\n \n \n \n \n \n \n \n \n \n \\(e^{-a}\\)\n \n \n\n\n \n \n \n \n \\(\\log(a)\\)\n \n \n\n\n \n \n \n \n \\(e^a\\)\n \n \n\n\n \n \n \n \n \\(a\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nShould SymPy have needed an assumption like\n\n@syms a::postive\n\n(a,)\n\n\n\nyesnoq(\"yes\")\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion: The squeeze theorem\nLets look at the function \\(f(x) = x \\sin(1/x)\\). A graph around \\(0\\) can be made with:\n\nf(x) = x == 0 ? NaN : x * sin(1/x)\nc, delta = 0, 1/4\nplot(f, c - delta, c + delta)\nplot!(abs)\nplot!(x -> -abs(x))\n\n\n\n\nThis graph clearly oscillates near \\(0\\). 
To the graph of \\(f\\), we added graphs of both \\(g(x) = \\lvert x\\rvert\\) and \\(h(x) = - \\lvert x\\rvert\\). From this graph it is easy to see by the “squeeze theorem” that the limit at \\(x=0\\) is \\(0\\). Why?\n\n\n\n \n \n \n \n \n \n \n \n \n The functions \\(g\\) and \\(h\\) squeeze each other as \\(g(x) > h(x)\\)\n \n \n\n\n \n \n \n \n The function \\(f\\) has no limit - it oscillates too much near \\(0\\)\n \n \n\n\n \n \n \n \n The functions \\(g\\) and \\(h\\) both have a limit of \\(0\\) at \\(x=0\\) and the function \\(f\\) is in between both \\(g\\) and \\(h\\), so must to have a limit of \\(0\\).\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n(The Wikipedia entry for the squeeze theorem has this unverified, but colorful detail:\n\nIn many languages (e.g. French, German, Italian, Hungarian and Russian), the squeeze theorem is also known as the two policemen (and a drunk) theorem, or some variation thereof. The story is that if two policemen are escorting a drunk prisoner between them, and both officers go to a cell, then (regardless of the path taken, and the fact that the prisoner may be wobbling about between the policemen) the prisoner must also end up in the cell.\n\n\n\nQuestion\nArchimedes, in finding bounds on the value of \\(\\pi\\) used \\(n\\)-gons with sides \\(12, 24, 48,\\) and \\(96\\). This was so the trigonometry involved could be solved exactly for the interior angles (e.g. \\(n=12\\) is an interior angle of \\(\\pi/6\\) which has sin and cos computable by simple geometry. See Damini and Abhishek) These exact solutions led to subsequent bounds. A more modern approach to bound the circumference of a circle of radius \\(r\\) using a \\(n\\)-gon with interior angle \\(\\theta\\) would be to use the trigonometric functions. 
An upper bound would be found with (using the triangle with angle \\(\\theta/2\\), opposite side \\(x\\) and adjacent side \\(r\\):\n\n@syms theta::real r::real\n\n(theta, r)\n\n\n\nx = r * tan(theta/2)\nn = 2PI/theta # using PI to avoid floaing point roundoff in 2pi\n# C < n * 2x\nupper = n*2x\n\n \n\\[\n\\frac{4 \\pi r \\tan{\\left(\\frac{\\theta}{2} \\right)}}{\\theta}\n\\]\n\n\n\nA lower bound would use the triangle with angle \\(\\theta/2\\), hypotenuse \\(r\\) and opposite side \\(x\\):\n\nx = r*sin(theta/2)\nn = 2PI/theta\n# C > n * 2x\nlower = n*2x\n\n \n\\[\n\\frac{4 \\pi r \\sin{\\left(\\frac{\\theta}{2} \\right)}}{\\theta}\n\\]\n\n\n\nUsing the above, find the limit of upper and lower. Are the two equal and equal to a familiar value?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n(If so, then the squeeze theorem would say that \\(\\pi\\) is the common limit.)"
},
{
"objectID": "limits/limits_extensions.html",
"href": "limits/limits_extensions.html",
"title": "19  Limits, issues, extensions of the concept",
"section": "",
"text": "This section uses the following add-on packages:\nThe limit of a function at \\(c\\) need not exist for one of many different reasons. Some of these reasons can be handled with extensions to the concept of the limit, others are just problematic in terms of limits. This section covers examples of each.\nLets begin with a function that is just problematic. Consider\n\\[\nf(x) = \\sin(1/x)\n\\]\nAs this is a composition of nice functions it will have a limit everywhere except possibly when \\(x=0\\), as then \\(1/x\\) may not have a limit. So rather than talk about where it is nice, lets consider the question of whether a limit exists at \\(c=0\\).\nA graph shows the issue:\nThe graph oscillates between \\(-1\\) and \\(1\\) infinitely many times on this interval - so many times, that no matter how close one zooms in, the graph on the screen will fail to capture them all. Graphically, there is no single value of \\(L\\) that the function gets close to, as it varies between all the values in \\([-1,1]\\) as \\(x\\) gets close to \\(0\\). A simple proof that there is no limit, is to take any \\(\\epsilon\\) less than \\(1\\), then with any \\(\\delta > 0\\), there are infinitely many \\(x\\) values where \\(f(x)=1\\) and infinitely many where \\(f(x) = -1\\). That is, there is no \\(L\\) with \\(|f(x) - L| < \\epsilon\\) when \\(\\epsilon\\) is less than \\(1\\) for all \\(x\\) near \\(0\\).\nThis function basically has too many values it gets close to. Another favorite example of such a function is the function that is \\(0\\) if \\(x\\) is rational and \\(1\\) if not. This function will have no limit anywhere, not just at \\(0\\), and for basically the same reason as above.\nThe issue isnt oscillation though. Take, for example, the function \\(f(x) = x \\cdot \\sin(1/x)\\). This function again has a limit everywhere save possibly \\(0\\). But in this case, there is a limit at \\(0\\) of \\(0\\). 
This is because the following is true:\n\\[\n-|x| \\leq x \\sin(1/x) \\leq |x|.\n\\]\nThe following figure illustrates:\nThe squeeze theorem of calculus is the formal reason \\(f\\) has a limit at \\(0\\), as both the upper function, \\(|x|\\), and the lower function, \\(-|x|\\), have a limit of \\(0\\) at \\(0\\)."
},
{
"objectID": "limits/limits_extensions.html#right-and-left-limits",
"href": "limits/limits_extensions.html#right-and-left-limits",
"title": "19  Limits, issues, extensions of the concept",
"section": "19.1 Right and left limits",
"text": "19.1 Right and left limits\nAnother example where \\(f(x)\\) has no limit is the function \\(f(x) = x /|x|, x \\neq 0\\). This function is \\(-1\\) for negative \\(x\\) and \\(1\\) for positive \\(x\\). Again, this function will have a limit everywhere except possibly at \\(x=0\\), where division by \\(0\\) is possible.\nIts graph is\n\nf(x) = abs(x)/x\nplot(f, -2, 2)\n\n\n\n\nThe sharp jump at \\(0\\) is misleading - again, the plotting algorithm just connects the points; it doesn't handle what is a fundamental discontinuity well - the function is not defined at \\(0\\) and jumps from \\(-1\\) to \\(1\\) there. Similarly to our example of \\(\\sin(1/x)\\), near \\(0\\) the function gets close to both \\(1\\) and \\(-1\\), so will have no limit. (Again, just take \\(\\epsilon\\) smaller than \\(1\\).)\nBut unlike the previous example, this function would have a limit if the definition didn't consider values of \\(x\\) on both sides of \\(c\\). The limit on the right side would be \\(1\\), the limit on the left side would be \\(-1\\). This distinction is useful, so there is an extension of the idea of a limit to one-sided limits.\nLet's loosen up the language in the definition of a limit to read:\n\nThe limit of \\(f(x)\\) as \\(x\\) approaches \\(c\\) is \\(L\\) if for every neighborhood, \\(V\\), of \\(L\\) there is a neighborhood, \\(U\\), of \\(c\\) for which \\(f(x)\\) is in \\(V\\) for every \\(x\\) in \\(U\\), except possibly \\(x=c\\).\n\nThe \\(\\epsilon-\\delta\\) definition has \\(V = (L-\\epsilon, L + \\epsilon)\\) and \\(U=(c-\\delta, c+\\delta)\\). This is a rewriting of \\(L-\\epsilon < f(x) < L + \\epsilon\\) as \\(|f(x) - L| < \\epsilon\\).\nNow for the definition:\n\nA function \\(f(x)\\) has a limit on the right of \\(c\\), written \\(\\lim_{x \\rightarrow c+}f(x) = L\\) if for every \\(\\epsilon > 0\\), there exists a \\(\\delta > 0\\) such that whenever \\(0 < x - c < \\delta\\) it holds that \\(|f(x) - L| < \\epsilon\\). 
That is, \\(U\\) is \\((c, c+\\delta)\\)\n\nSimilarly, a limit on the left is defined where \\(U=(c-\\delta, c)\\).\nThe SymPy function limit has a keyword argument dir=\"+\" or dir=\"-\" to request that a one-sided limit be formed. The default is dir=\"+\". Passing dir=\"+-\" will compute both one-sided limits, and throw an error if the two are not equal, in agreement with no limit existing.\n\n@syms x\n\n(x,)\n\n\n\nf(x) = abs(x)/x\nlimit(f(x), x=>0, dir=\"+\"), limit(f(x), x=>0, dir=\"-\")\n\n(1, -1)\n\n\n\n\n\n\n\n\nWarning\n\n\n\nThat means the mathematical limit need not exist when SymPy's limit returns an answer, as SymPy is only carrying out a one-sided limit. Explicitly passing dir=\"+-\" or checking that both limit(ex, x=>c) and limit(ex, x=>c, dir=\"-\") are equal would be needed to confirm a limit exists mathematically.\n\n\nThe relation between the two concepts is that a function has a limit at \\(c\\) if and only if the left and right limits exist and are equal. This function \\(f\\) has both existing, but the two limits are not equal.\nThere are other such functions that jump. Another useful one is the floor function, which just rounds down to the nearest integer. A graph shows the basic shape:\n\nplot(floor, -5,5)\n\n\n\n\nAgain, the (nearly) vertical lines are an artifact of the graphing algorithm and not actual points that solve \\(y=f(x)\\). The floor function has limits except at the integers. There the left and right limits differ.\nConsider the limit at \\(c=0\\). If \\(0 < x < 1/2\\), say, then \\(f(x) = 0\\) as we round down, so the right limit will be \\(0\\). However, if \\(-1/2 < x < 0\\), then \\(f(x) = -1\\), again as we round down, so the left limit will be \\(-1\\). Again, with this example both the left and right limits exist, but at the integer values they are not equal, as they differ by 1.\nSome functions only have one-sided limits as they are not defined in an interval around \\(c\\). 
There are many examples, but we will take \\(f(x) = x^x\\) and consider \\(c=0\\). This function is not well defined for all \\(x < 0\\), so it is typical to just take the domain to be \\(x > 0\\). Still it has a right limit \\(\\lim_{x \\rightarrow 0+} x^x = 1\\). SymPy can verify:\n\nlimit(x^x, x, 0, dir=\"+\")\n\n \n\\[\n1\n\\]\n\n\n\nThis agrees with the IEEE convention of assigning 0^0 to be 1.\nHowever, not all such functions with indeterminate forms of \\(0^0\\) will have a limit of \\(1\\).\n\nExample\nConsider this funny graph:\n\n\n\n\n\nDescribe the limits at \\(-1\\), \\(0\\), and \\(1\\).\n\nAt \\(-1\\) we see a jump, there is no limit but instead a left limit of 1 and a right limit appearing to be \\(1/2\\).\nAt \\(0\\) we see a limit of \\(1\\).\nFinally, at \\(1\\) again there is a jump, so no limit. Instead the left limit is about \\(-1\\) and the right limit \\(1\\)."
},
{
"objectID": "limits/limits_extensions.html#limits-at-infinity",
"href": "limits/limits_extensions.html#limits-at-infinity",
"title": "19  Limits, issues, extensions of the concept",
"section": "19.2 Limits at infinity",
"text": "19.2 Limits at infinity\nThe loose definition of a horizontal asymptote is “a line such that the distance between the curve and the line approaches \\(0\\) as they tend to infinity.” This sounds like it should be defined by a limit. The issue is that the limit would be at \\(\\pm\\infty\\) and not some finite \\(c\\). This requires the idea of a neighborhood of \\(c\\), \\(0 < |x-c| < \\delta\\), to be reworked.\nThe basic idea for a limit at \\(+\\infty\\) is that for any \\(\\epsilon\\), there exists an \\(M\\) such that when \\(x > M\\) it must be that \\(|f(x) - L| < \\epsilon\\). For a horizontal asymptote, the line would be \\(y=L\\). Similarly, a limit at \\(-\\infty\\) can be defined with \\(x < M\\) being the condition.\nLet's consider some cases.\nThe function \\(f(x) = \\sin(x)\\) will not have a limit at \\(+\\infty\\) for exactly the same reason that \\(f(x) = \\sin(1/x)\\) does not have a limit at \\(c=0\\) - it just oscillates between \\(-1\\) and \\(1\\) so never eventually gets close to a single value.\nSymPy gives an odd answer here indicating the range of values:\n\nlimit(sin(x), x => oo)\n\n \n\\[\n\\left\\langle -1, 1\\right\\rangle\n\\]\n\n\n\n(We used SymPy's oo for \\(\\infty\\) and not Inf.)\n\nHowever, a damped oscillation, such as \\(f(x) = e^{-x} \\sin(x)\\), will have a limit:\n\nlimit(exp(-x)*sin(x), x => oo)\n\n \n\\[\n0\n\\]\n\n\n\n\nRational functions will have the expected limit. In this example \\(m = n\\), so we get a horizontal asymptote that is not \\(y=0\\):\n\nlimit((x^2 - 2x +2)/(4x^2 + 3x - 2), x=>oo)\n\n \n\\[\n\\frac{1}{4}\n\\]\n\n\n\n\nThough rational functions can have at most one horizontal asymptote, this isn't true for all functions. Consider the function \\(f(x) = x / \\sqrt{x^2 + 4}\\). 
It has different limits depending if \\(x\\) goes to \\(\\infty\\) or negative \\(\\infty\\):\n\nf(x) = x / sqrt(x^2 + 4)\nlimit(f(x), x=>oo), limit(f(x), x=>-oo)\n\n(1, -1)\n\n\n(A simpler example showing this behavior is just the function \\(x/|x|\\) considered earlier.)\n\nExample: Limits at infinity and right limits at \\(0\\)\nGiven a function \\(f\\) the question of whether this exists:\n\\[\n\\lim_{x \\rightarrow \\infty} f(x)\n\\]\ncan be reduced to the question of whether this limit exists:\n\\[\n\\lim_{x \\rightarrow 0+} f(1/x)\n\\]\nSo whether \\(\\lim_{x \\rightarrow 0+} \\sin(1/x)\\) exists is equivalent to whether \\(\\lim_{x\\rightarrow \\infty} \\sin(x)\\) exists, which clearly does not due to the oscillatory nature of \\(\\sin(x)\\).\nSimilarly, one can make this reduction\n\\[\n\\lim_{x \\rightarrow c+} f(x) =\n\\lim_{x \\rightarrow 0+} f(c + x) =\n\\lim_{x \\rightarrow \\infty} f(c + \\frac{1}{x}).\n\\]\nThat is, right limits can be analyzed as limits at \\(\\infty\\) or right limits at \\(0\\), should that prove more convenient."
},
{
"objectID": "limits/limits_extensions.html#limits-of-infinity",
"href": "limits/limits_extensions.html#limits-of-infinity",
"title": "19  Limits, issues, extensions of the concept",
"section": "19.3 Limits of infinity",
"text": "19.3 Limits of infinity\nVertical asymptotes, like horizontal asymptotes, are loosely defined by the graph getting close to some line. However, the formal definition of a limit won't be the same. For a vertical asymptote, the value of \\(f(x)\\) heads towards positive or negative infinity, not some finite \\(L\\). As such, a neighborhood like \\((L-\\epsilon, L+\\epsilon)\\) will no longer make sense; rather, we replace it with an expression like \\((M, \\infty)\\) or \\((-\\infty, M)\\). As in: the limit of \\(f(x)\\) as \\(x\\) approaches \\(c\\) is infinity if for every \\(M > 0\\) there exists a \\(\\delta>0\\) such that if \\(0 < |x-c| < \\delta\\) then \\(f(x) > M\\). Approaching \\(-\\infty\\) would conclude with \\(f(x) < -M\\) for all \\(M>0\\).\n\nExamples\nConsider the function \\(f(x) = 1/x^2\\). This will have a limit at every point except possibly \\(0\\), where division by \\(0\\) is possible. In this case, there is a vertical asymptote, as seen in the following graph. The limit at \\(0\\) is \\(\\infty\\), in the extended sense above. For \\(M>0\\), we can take any \\(0 < \\delta < 1/\\sqrt{M}\\). The following graph shows \\(M=25\\) where the function values are outside of the box, as \\(f(x) > M\\) for those \\(x\\) values with \\(0 < |x-0| < 1/\\sqrt{M}\\).\n\n\n\n\n\n\nThe function \\(f(x)=1/x\\) requires us to talk about left and right limits of infinity, with the natural generalization. We can see that the left limit at \\(0\\) is \\(-\\infty\\) and the right limit \\(\\infty\\):\n\n\n\n\n\nSymPy agrees:\n\nf(x) = 1/x\nlimit(f(x), x=>0, dir=\"-\"), limit(f(x), x=>0, dir=\"+\")\n\n(-oo, oo)\n\n\n\nConsider the function \\(g(x) = x^x(1 + \\log(x)), x > 0\\). Does this have a right limit at \\(0\\)?\nA quick graph shows that a limit may be \\(-\\infty\\):\n\ng(x) = x^x * (1 + log(x))\nplot(g, 1/100, 1)\n\n\n\n\nWe can check with SymPy:\n\nlimit(g(x), x=>0, dir=\"+\")\n\n \n\\[\n-\\infty\n\\]"
},
{
"objectID": "limits/limits_extensions.html#limits-of-sequences",
"href": "limits/limits_extensions.html#limits-of-sequences",
"title": "19  Limits, issues, extensions of the concept",
"section": "19.4 Limits of sequences",
"text": "19.4 Limits of sequences\nAfter all this, we still can't formalize the basic question asked in the introduction to limits: what is the area contained in a parabola? For that we developed a sequence of sums: \\(s_n = 1/2 \\cdot ((1/4)^0 + (1/4)^1 + (1/4)^2 + \\cdots + (1/4)^n)\\). This isn't a function of \\(x\\), but rather depends only on non-negative integer values of \\(n\\). However, the same idea as a limit at infinity can be used to define a limit.\n\nLet \\(a_0,a_1, a_2, \\dots, a_n, \\dots\\) be a sequence of values indexed by \\(n\\). We have \\(\\lim_{n \\rightarrow \\infty} a_n = L\\) if for every \\(\\epsilon > 0\\) there exists an \\(M>0\\) where if \\(n > M\\) then \\(|a_n - L| < \\epsilon\\).\n\nCommon language is the sequence converges when the limit exists and otherwise diverges.\nThe above is essentially the same as a limit at infinity for a function, but in this case the function's domain is only the non-negative integers.\nSymPy is happy to compute limits of sequences. Defining this one involving a sum is best done with the summation function:\n\n@syms i::integer n::(integer, positive)\ns(n) = 1//2 * summation((1//4)^i, (i, 0, n)) # rationals make for an exact answer\nlimit(s(n), n=>oo)\n\n \n\\[\n\\frac{2}{3}\n\\]\n\n\n\n\nExample\nThe limit\n\\[\n\\lim_{x \\rightarrow 0} \\frac{e^x - 1}{x} = 1,\n\\]\nis an important limit. Using the definition of \\(e^x\\) by an infinite sequence:\n\\[\ne^x = \\lim_{n \\rightarrow \\infty} (1 + \\frac{x}{n})^n,\n\\]\nwe can establish the limit using the squeeze theorem. First,\n\\[\nA = |(1 + \\frac{x}{n})^n - 1 - x| = |\\Sigma_{k=0}^n {n \\choose k}(\\frac{x}{n})^k - 1 - x| = |\\Sigma_{k=2}^n {n \\choose k}(\\frac{x}{n})^k|,\n\\]\nthe first two terms cancelling off. The above comes from the binomial expansion theorem for a polynomial. 
Now \\({n \\choose k} \\leq n^k\\), so we have\n\\[\nA \\leq \\Sigma_{k=2}^n |x|^k = |x|^2 \\frac{1 - |x|^{n+1}}{1 - |x|} \\leq\n\\frac{|x|^2}{1 - |x|},\n\\]\nusing the geometric sum formula with \\(x \\approx 0\\) (and not \\(1\\)):\n\n@syms x n i\nsummation(x^i, (i,0,n))\n\n \n\\[\n\\begin{cases} n + 1 & \\text{for}\\: x = 1 \\\\\\frac{1 - x^{n + 1}}{1 - x} & \\text{otherwise} \\end{cases}\n\\]\n\n\n\nAs this holds for all \\(n\\), letting \\(n\\) go to \\(\\infty\\) we have:\n\\[\n|e^x - 1 - x| \\leq \\frac{|x|^2}{1 - |x|}\n\\]\nDividing both sides by \\(x\\) and noting that as \\(x \\rightarrow 0\\), \\(|x|/(1-|x|)\\) goes to \\(0\\) by continuity, the squeeze theorem gives the limit:\n\\[\n\\lim_{x \\rightarrow 0} \\frac{e^x -1}{x} - 1 = 0.\n\\]\nThat \\({n \\choose k} \\leq n^k\\) can be seen as follows: the left side counts the number of combinations of \\(k\\) choices from \\(n\\) distinct items, which is less than the number of permutations of \\(k\\) choices without replacement, which in turn is less than the number of choices of \\(k\\) items from \\(n\\) distinct ones with replacement - which is what \\(n^k\\) counts.\n\n\n19.4.1 Some limit theorems for sequences\nThe limit discussion first defined limits of scalar univariate functions at a point \\(c\\) and then added generalizations. The pedagogical approach can be reversed by starting the discussion with limits of sequences and then generalizing from there. This approach relies on a few theorems to be gathered along the way that are mentioned here for the curious reader:\n\nConvergent sequences are bounded.\nAll bounded monotone sequences converge.\nEvery bounded sequence has a convergent subsequence. (Bolzano-Weierstrass)\nThe limit of \\(f\\) at \\(c\\) exists and equals \\(L\\) if and only if for every sequence \\(x_n\\) in the domain of \\(f\\) converging to \\(c\\) the sequence \\(s_n = f(x_n)\\) converges to \\(L\\)."
},
{
"objectID": "limits/limits_extensions.html#summary",
"href": "limits/limits_extensions.html#summary",
"title": "19  Limits, issues, extensions of the concept",
"section": "19.5 Summary",
"text": "19.5 Summary\nThe following table captures the various changes to the definition of the limit to accommodate some of the possible behaviors.\n\n\n\n\n\n\nTypeNotationVU\n\nlimit\n\\(\\lim_{x\\rightarrow c}f(x) = L\\)\n\\((L-\\epsilon, L+\\epsilon)\\)\n\\((c - \\delta, c+\\delta)\\)\n\nright limit\n\\(\\lim_{x\\rightarrow c+}f(x) = L\\)\n\\((L-\\epsilon, L+\\epsilon)\\)\n\\((c, c+\\delta)\\)\n\nleft limit\n\\(\\lim_{x\\rightarrow c-}f(x) = L\\)\n\\((L-\\epsilon, L+\\epsilon)\\)\n\\((c - \\delta, c)\\)\n\nlimit at \\(\\infty\\)\n\\(\\lim_{x\\rightarrow \\infty}f(x) = L\\)\n\\((L-\\epsilon, L+\\epsilon)\\)\n\\((M, \\infty)\\)\n\nlimit at \\(-\\infty\\)\n\\(\\lim_{x\\rightarrow -\\infty}f(x) = L\\)\n\\((L-\\epsilon, L+\\epsilon)\\)\n\\((-\\infty, M)\\)\n\nlimit of \\(\\infty\\)\n\\(\\lim_{x\\rightarrow c}f(x) = \\infty\\)\n\\((M, \\infty)\\)\n\\((c - \\delta, c+\\delta)\\)\n\nlimit of \\(-\\infty\\)\n\\(\\lim_{x\\rightarrow c}f(x) = -\\infty\\)\n\\((-\\infty, M)\\)\n\\((c - \\delta, c+\\delta)\\)\n\nlimit of a sequence\n\\(\\lim_{n \\rightarrow \\infty} a_n = L\\)\n\\((L-\\epsilon, L+\\epsilon)\\)\n\\((M, \\infty)\\)\n\n\n\n\n\n\n\nRoss summarizes this by enumerating the 15 different related definitions for \\(\\lim_{x \\rightarrow a} f(x) = L\\) that arise from \\(L\\) being either finite, \\(-\\infty\\), or \\(+\\infty\\) and \\(a\\) being any of \\(c\\), \\(c-\\), \\(c+\\), \\(-\\infty\\), or \\(+\\infty\\)."
},
{
"objectID": "limits/limits_extensions.html#rates-of-growth",
"href": "limits/limits_extensions.html#rates-of-growth",
"title": "19  Limits, issues, extensions of the concept",
"section": "19.6 Rates of growth",
"text": "19.6 Rates of growth\nConsider two functions \\(f\\) and \\(g\\) to be comparable if there are positive integers \\(m\\) and \\(n\\) with both\n\\[\n\\lim_{x \\rightarrow \\infty} \\frac{f(x)^m}{g(x)} = \\infty \\quad\\text{and }\n\\lim_{x \\rightarrow \\infty} \\frac{g(x)^n}{f(x)} = \\infty.\n\\]\nThe first says \\(g\\) is eventually bounded by a power of \\(f\\), the second that \\(f\\) is eventually bounded by a power of \\(g\\).\nHere we consider which families of functions are comparable.\nFirst consider \\(f(x) = x^3\\) and \\(g(x) = x^4\\). We can take \\(m=2\\) and \\(n=1\\) to verify \\(f\\) and \\(g\\) are comparable:\n\nfx, gx = x^3, x^4\nlimit(fx^2/gx, x=>oo), limit(gx^1 / fx, x=>oo)\n\n(oo, oo)\n\n\nSimilarly for any pairs of powers, so we could conclude \\(f(x) = x^n\\) and \\(g(x) = x^m\\) are comparable. (However, as is easily observed, for \\(m\\) and \\(n\\) both positive integers \\(\\lim_{x \\rightarrow \\infty} x^{m+n}/x^m = \\infty\\) and \\(\\lim_{x \\rightarrow \\infty} x^{m}/x^{m+n} = 0\\), consistent with our discussion on rational functions that higher-order polynomials dominate lower-order polynomials.)\nNow consider \\(f(x) = x\\) and \\(g(x) = \\log(x)\\). These are not comparable, as there will be no \\(n\\) large enough. We might say \\(x\\) dominates \\(\\log(x)\\).\n\nlimit(log(x)^n / x, x => oo)\n\n \n\\[\n0\n\\]\n\n\n\nAs \\(x\\) could be replaced by any monomial \\(x^k\\), we can say “powers” grow faster than “logarithms”.\nNow consider \\(f(x)=x\\) and \\(g(x) = e^x\\). These are not comparable, as there will be no \\(m\\) large enough:\n\n@syms m::(positive, integer)\nlimit(x^m / exp(x), x => oo)\n\n \n\\[\n0\n\\]\n\n\n\nThat is, \\(e^x\\) grows faster than any power of \\(x\\).\nNow, if \\(a, b > 1\\) then \\(f(x) = a^x\\) and \\(g(x) = b^x\\) will be comparable. 
Take \\(m\\) so that \\(a^m > b\\) and \\(n\\) so that \\(b^n > a\\), as then, say,\n\\[\n\\frac{(a^x)^m}{b^x} = \\frac{a^{xm}}{b^x} = \\frac{(a^m)^x}{b^x} = (\\frac{a^m}{b})^x,\n\\]\nwhich will go to \\(\\infty\\) as \\(x \\rightarrow \\infty\\) as \\(a^m/b > 1\\).\nFinally, consider \\(f(x) = \\exp(x^2)\\) and \\(g(x) = \\exp(x)^2\\). Are these comparable? No, as no \\(n\\) is large enough:\n\n@syms x n::(positive, integer)\nfx, gx = exp(x^2), exp(x)^2\nlimit(gx^n / fx, x => oo)\n\n \n\\[\n0\n\\]\n\n\n\nA negative test for comparability is the following: if\n\\[\n\\lim_{x \\rightarrow \\infty} \\frac{\\log(|f(x)|)}{\\log(|g(x)|)} = 0,\n\\]\nthen \\(f\\) and \\(g\\) are not comparable (and \\(g\\) grows faster than \\(f\\)). Applying this to the last two values of \\(f\\) and \\(g\\), we have\n\\[\n\\lim_{x \\rightarrow \\infty}\\frac{\\log(\\exp(x)^2)}{\\log(\\exp(x^2))} =\n\\lim_{x \\rightarrow \\infty}\\frac{2\\log(\\exp(x))}{x^2} =\n\\lim_{x \\rightarrow \\infty}\\frac{2x}{x^2} = 0,\n\\]\nso \\(f(x) = \\exp(x^2)\\) grows faster than \\(g(x) = \\exp(x)^2\\).\n\nKeeping in mind that logarithms grow slower than powers, which grow slower than exponentials (\\(a > 1\\)), can help understand growth at \\(\\infty\\) as a comparison of leading terms does for rational functions.\nWe can immediately put this to use to compute \\(\\lim_{x\\rightarrow 0+} x^x\\). We first express this problem using \\(x^x = (\\exp(\\ln(x)))^x = e^{x\\ln(x)}\\). Rewriting \\(u(x) = \\exp(\\ln(u(x)))\\), which only uses the basic inverse relation between the two functions, can often be a useful step.\nAs \\(f(x) = e^x\\) is a suitably nice function (continuous) so that the limit of a composition can be computed through the limit of the inside function, \\(x\\ln(x)\\), it is enough to see what \\(\\lim_{x\\rightarrow 0+} x\\ln(x)\\) is. 
We re-express this as a limit at \\(\\infty\\)\n\\[\n\\lim_{x\\rightarrow 0+} x\\ln(x) = \\lim_{x \\rightarrow \\infty} (1/x)\\ln(1/x) =\n\\lim_{x \\rightarrow \\infty} \\frac{-\\ln(x)}{x} = 0\n\\]\nThe last equality follows, as the function \\(x\\) dominates the function \\(\\ln(x)\\). So by the limit rule involving compositions we have: \\(\\lim_{x\\rightarrow 0+} x^x = e^0 = 1\\)."
},
{
"objectID": "limits/limits_extensions.html#questions",
"href": "limits/limits_extensions.html#questions",
"title": "19  Limits, issues, extensions of the concept",
"section": "19.7 Questions",
"text": "19.7 Questions\n\nQuestion\nSelect the graph for which the limit at \\(a\\) is infinite.\n\n\n\n \n \n \n \n \n\n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSelect the graph for which the limit at \\(\\infty\\) appears to be defined.\n\n\n\n \n \n \n \n \n\n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the function \\(f(x) = \\sqrt{x}\\).\nDoes this function have a limit at every \\(c > 0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nDoes this function have a limit at \\(c=0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nDoes this function have a right limit at \\(c=0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nDoes this function have a left limit at \\(c=0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind \\(\\lim_{x \\rightarrow \\infty} \\sin(x)/x\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nQuestion\nFind \\(\\lim_{x \\rightarrow \\infty} (1-\\cos(x))/x^2\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind \\(\\lim_{x \\rightarrow \\infty} \\log(x)/x\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind \\(\\lim_{x \\rightarrow 2+} (x-3)/(x-2)\\).\n\n\n\n \n \n \n \n \n \n \n \n \n \\(L=-1\\)\n \n \n\n\n \n \n \n \n \\(L=\\infty\\)\n \n \n\n\n \n \n \n \n \\(L=-\\infty\\)\n \n \n\n\n \n \n \n \n \\(L=0\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nFind \\(\\lim_{x \\rightarrow -3-} (x-3)/(x+3)\\).\n\n\n\n \n \n \n \n \n \n \n \n \n \\(L=-\\infty\\)\n \n \n\n\n \n \n \n \n \\(L=\\infty\\)\n \n \n\n\n \n \n \n \n \\(L=0\\)\n \n \n\n\n \n \n \n \n \\(L=-1\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x) = \\exp(x + \\exp(-x^2))\\) and \\(g(x) = \\exp(-x^2)\\). 
Compute:\n\\[\n\\lim_{x \\rightarrow \\infty} \\frac{\\ln(f(x))}{\\ln(g(x))}.\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the following expression:\n\nex = 1/(exp(-x + exp(-x))) - exp(x)\n\n \n\\[\n- e^{x} + e^{x - e^{- x}}\n\\]\n\n\n\nWe want to find the limit, \\(L\\), as \\(x \\rightarrow \\infty\\), which we assume exists below.\nWe first rewrite ex using w as exp(-x):\n\n@syms w\nex1 = ex(exp(-x) => w)\n\n \n\\[\n- \\frac{1}{w} + \\frac{e^{- w}}{w}\n\\]\n\n\n\nAs \\(x \\rightarrow \\infty\\), \\(w \\rightarrow 0+\\), so the limit at \\(0+\\) of ex1 is of interest.\nUse this fact, to find \\(L\\)\n\nlimit(ex1 - (w/2 - 1), w=>0)\n\n \n\\[\n0\n\\]\n\n\n\n\\(L\\) is:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n(This awkward approach is generalizable: replacing the limit as \\(w \\rightarrow 0\\) of an expression with the limit of a polynomial in w that is easy to identify.)\n\n\nQuestion\nAs mentioned, for limits that depend on specific values of parameters SymPy may have issues. As an example, SymPy has an issue with this limit, whose answer depends on the value of \\(k\\)”\n\\[\n\\lim_{x \\rightarrow 0+} \\frac{\\sin(\\sin(x^2))}{x^k}.\n\\]\nNote, regardless of \\(k\\) you find:\n\n@syms x::real k::integer\nlimit(sin(sin(x^2))/x^k, x=>0)\n\n \n\\[\n0\n\\]\n\n\n\nFor which value(s) of \\(k\\) in \\(1,2,3\\) is this actually the correct answer? (Do the above \\(3\\) times using a specific value of k, not a numeric one.\n\nchoices = [\"``1``\", \"``2``\", \"``3``\", \"``1,2``\", \"``1,3``\", \"``2,3``\", \"``1,2,3``\"]\nradioq(choices, 1, keep_order=true)\n\n\n \n \n \n \n \n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \\(2\\)\n \n \n\n\n \n \n \n \n \\(3\\)\n \n \n\n\n \n \n \n \n \\(1,2\\)\n \n \n\n\n \n \n \n \n \\(1,3\\)\n \n \n\n\n \n \n \n \n \\(2,3\\)\n \n \n\n\n \n \n \n \n \\(1,2,3\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion: No limit\nSome functions do not have a limit. 
Make a graph of \\(\\sin(1/x)\\) from \\(0.0001\\) to \\(1\\) and look at the output. Why does a limit not exist?\n\n\n\n \n \n \n \n \n \n \n \n \n The function oscillates too much and its y values do not get close to any one value\n \n \n\n\n \n \n \n \n Any function that oscillates does not have a limit.\n \n \n\n\n \n \n \n \n Err, the limit does exists and is 1\n \n \n\n\n \n \n \n \n The limit does exist - it is any number from -1 to 1\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion \\(0^0\\) is not always \\(1\\)\nIs the form \\(0^0\\) really indeterminate? As mentioned 0^0 evaluates to 1.\nConsider this limit:\n\\[\n\\lim_{x \\rightarrow 0+} x^{k\\cdot x} = L.\n\\]\nConsider different values of \\(k\\) to see if this limit depends on \\(k\\) or not. What is \\(L\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\log(k)\\)\n \n \n\n\n \n \n \n \n The limit does not exist\n \n \n\n\n \n \n \n \n \\(k\\)\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nNow, consider this limit:\n\\[\n\\lim_{x \\rightarrow 0+} x^{1/\\log_k(x)} = L.\n\\]\nIn julia, \\(\\log_k(x)\\) is found with log(k,x). The default, log(x) takes \\(k=e\\) so gives the natural log. So, we would define h, for a given k, with\n\n\nh (generic function with 1 method)\n\n\nConsider different values of \\(k\\) to see if the limit depends on \\(k\\) or not. What is \\(L\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n The limit does not exist\n \n \n\n\n \n \n \n \n \\(k\\)\n \n \n\n\n \n \n \n \n \\(\\log(k)\\)\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLimits of infinity at infinity. We could define this concept quite easily mashing together the two definitions. Suppose we did. 
Which of these ratios would have a limit of infinity at infinity:\n\\[\nx^4/x^3,\\quad x^{100+1}/x^{100}, \\quad x/\\log(x), \\quad 3^x / 2^x, \\quad e^x/x^{100}\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n the first one\n \n \n\n\n \n \n \n \n the first and second ones\n \n \n\n\n \n \n \n \n the first, second and third ones\n \n \n\n\n \n \n \n \n the first, second, third, and fourth ones\n \n \n\n\n \n \n \n \n all of them\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA slant asymptote is a line \\(mx + b\\) for which the graph of \\(f(x)\\) gets close to as \\(x\\) gets large. We cant express this directly as a limit, as “\\(L\\)” is not a number. How might we?\n\n\n\n \n \n \n \n \n \n \n \n \n We can talk about the limit at \\(\\infty\\) of \\(f(x) - (mx + b)\\) being \\(0\\)\n \n \n\n\n \n \n \n \n We can talk about the limit at \\(\\infty\\) of \\(f(x) - mx\\) being \\(b\\)\n \n \n\n\n \n \n \n \n We can say \\(f(x) - (mx+b)\\) has a horizontal asymptote \\(y=0\\)\n \n \n\n\n \n \n \n \n We can say \\(f(x) - mx\\) has a horizontal asymptote \\(y=b\\)\n \n \n\n\n \n \n \n \n Any of the above\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose a sequence of points \\(x_n\\) converges to \\(a\\) in the limiting sense. For a function \\(f(x)\\), the sequence of points \\(f(x_n)\\) may or may not converge. 
One alternative definition of a limit due to Heine is that \\(\\lim_{x \\rightarrow a}f(x) = L\\) if and only if all sequences \\(x_n \\rightarrow a\\) have \\(f(x_n) \\rightarrow L\\).\nConsider the function \\(f(x) = \\sin(1/x)\\), \\(a=0\\), and the two sequences implicitly defined by \\(1/x_n = \\pi/2 + n \\cdot (2\\pi)\\) and \\(y_n = 3\\pi/2 + n \\cdot(2\\pi)\\), \\(n = 0, 1, 2, \\dots\\).\nWhat is \\(\\lim_{x_n \\rightarrow 0} f(x_n)\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is \\(\\lim_{y_n \\rightarrow 0} f(y_n)\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThis shows that\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f(x)\\) has a limit of \\(-1\\) as \\(x \\rightarrow 0\\)\n \n \n\n\n \n \n \n \n \\(f(x)\\) has a limit of \\(1\\) as \\(x \\rightarrow 0\\)\n \n \n\n\n \n \n \n \n \\(f(x)\\) does not have a limit as \\(x \\rightarrow 0\\)"
},
{
"objectID": "limits/continuity.html",
"href": "limits/continuity.html",
"title": "20  Continuity",
"section": "",
"text": "This section uses these add-on packages:\nThe definition Google finds for continuous is forming an unbroken whole; without interruption.\nThe concept in calculus, as transferred to functions, is similar. Roughly speaking, a continuous function is one whose graph could be drawn without having to lift (or interrupt) the pencil drawing it.\nConsider these two graphs:\nand\nThough similar at some level - they agree at nearly every value of \\(x\\) - the first has a “jump” from \\(-1\\) to \\(1\\) instead of the transition in the second one. The first is not continuous at \\(0\\) - a break is needed to draw it - whereas the second is continuous.\nA formal definition of continuity was a bit harder to come about. At first the concept was that for any \\(y\\) between any two values in the range of \\(f(x)\\), the function should take on the value \\(y\\) for some \\(x\\). Clearly this could distinguish the two graphs above, as one takes no values in \\((-1,1)\\), whereas the other - the continuous one - takes on all values in that range.\nHowever, Cauchy defined continuity by \\(f(x + \\alpha) - f(x)\\) being small whenever \\(\\alpha\\) was small. This basically rules out “jumps” and proves more useful as a tool to describe continuity.\nThe modern definition simply pushes the details to the definition of the limit:\nThis says three things\nThis speaks to continuity at a point; we can extend this to continuity over an interval \\((a,b)\\) by saying:\nFinally, as with limits, it can be convenient to speak of right continuity and left continuity at a point, where the limit in the definition is replaced by a right or left limit, as appropriate."
},
{
"objectID": "limits/continuity.html#rules-for-continuity",
"href": "limits/continuity.html#rules-for-continuity",
"title": "20  Continuity",
"section": "20.1 Rules for continuity",
"text": "20.1 Rules for continuity\nAs weve seen, functions can be combined in several ways. How do these relate with continuity?\nSuppose \\(f(x)\\) and \\(g(x)\\) are both continuous on \\(I\\). Then\n\nThe function \\(h(x) = a f(x) + b g(x)\\) is continuous on \\(I\\) for any real numbers \\(a\\) and \\(b\\);\nThe function \\(h(x) = f(x) \\cdot g(x)\\) is continuous on \\(I\\); and\nThe function \\(h(x) = f(x) / g(x)\\) is continuous at all points \\(c\\) in \\(I\\) where \\(g(c) \\neq 0\\).\nThe function \\(h(x) = f(g(x))\\) is continuous at \\(x=c\\) if \\(g(x)\\) is continuous at \\(c\\) and \\(f(x)\\) is continuous at \\(g(c)\\).\n\nSo, continuity is preserved for all of the basic operations except when dividing by \\(0\\).\n\nExamples\n\nSince a monomial \\(f(x) = ax^n\\) (\\(n\\) a non-negative integer) is continuous, by the first rule, any polynomial will be continuous.\nSince both \\(f(x) = e^x\\) and \\(g(x)=\\sin(x)\\) are continuous everywhere, so will be \\(h(x) = e^x \\cdot \\sin(x)\\).\nSince \\(f(x) = e^x\\) is continuous everywhere and \\(g(x) = -x\\) is continuous everywhere, the composition \\(h(x) = e^{-x}\\) will be continuous everywhere.\nSince \\(f(x) = x\\) is continuous everywhere, the function \\(h(x) = 1/x\\) - a ratio of continuous functions - will be continuous everywhere except possibly at \\(x=0\\) (where it is not continuous).\nThe function \\(h(x) = e^{x\\log(x)}\\) will be continuous on \\((0,\\infty)\\), the same domain that \\(g(x) = x\\log(x)\\) is continuous. This function (also written as \\(x^x\\)) has a right limit at \\(0\\) (of \\(1\\)), but is not right continuous, as \\(h(0)\\) is not defined."
},
{
"objectID": "limits/continuity.html#questions",
"href": "limits/continuity.html#questions",
"title": "20  Continuity",
"section": "20.2 Questions",
"text": "20.2 Questions\n\nQuestion\nLet \\(f(x) = \\sin(x)\\) and \\(g(x) = \\cos(x)\\). Which of these is not continuous everywhere?\n\\[\nf+g,~ f-g,~ f\\cdot g,~ f\\circ g,~ f/g\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f\\circ g\\)\n \n \n\n\n \n \n \n \n \\(f\\cdot g\\)\n \n \n\n\n \n \n \n \n \\(f+g\\)\n \n \n\n\n \n \n \n \n \\(f/g\\)\n \n \n\n\n \n \n \n \n \\(f-g\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x) = \\sin(x)\\), \\(g(x) = \\sqrt{x}\\).\nWhen will \\(f\\circ g\\) be continuous?\n\n\n\n \n \n \n \n \n \n \n \n \n For all \\(x\\)\n \n \n\n\n \n \n \n \n For all \\(x > 0\\)\n \n \n\n\n \n \n \n \n For all \\(x\\) where \\(\\sin(x) > 0\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhen will \\(g \\circ f\\) be continuous?\n\n\n\n \n \n \n \n \n \n \n \n \n For all \\(x\\)\n \n \n\n\n \n \n \n \n For all \\(x > 0\\)\n \n \n\n\n \n \n \n \n For all \\(x\\) where \\(\\sin(x) > 0\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe composition \\(f\\circ g\\) will be continuous everywhere provided:\n\n\n\n \n \n \n \n \n \n \n \n \n The function \\(g\\) is continuous everywhere\n \n \n\n\n \n \n \n \n The function \\(f\\) is continuous everywhere\n \n \n\n\n \n \n \n \n The function \\(g\\) is continuous everywhere and \\(f\\) is continuous on the range of \\(g\\)\n \n \n\n\n \n \n \n \n The function \\(f\\) is continuous everywhere and \\(g\\) is continuous on the range of \\(f\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nAt which values is \\(f(x) = 1/\\sqrt{x-2}\\) not continuous?\n\n\n\n \n \n \n \n \n \n \n \n \n For \\(x \\geq 0\\)\n \n \n\n\n \n \n \n \n When \\(x > 2\\)\n \n \n\n\n \n \n \n \n When \\(x \\geq 2\\)\n \n \n\n\n \n \n \n \n When \\(x \\leq 2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA value \\(x=c\\) is a removable singularity for \\(f(x)\\) if \\(f(x)\\) is not continuous at \\(c\\) but will be if \\(f(c)\\) is redefined to be \\(\\lim_{x \\rightarrow c} f(x)\\).\nThe function 
\\(f(x) = (x^2 - 4)/(x-2)\\) has a removable singularity at \\(x=2\\). What value would we redefine \\(f(2)\\) to be, to make \\(f\\) a continuous function?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe highly oscillatory function\n\\[\nf(x) = x^2 (\\cos(1/x) - 1)\n\\]\nhas a removable singularity at \\(x=0\\). What value would we redefine \\(f(0)\\) to be, to make \\(f\\) a continuous function?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x)\\) be defined by\n\\[\nf(x) = \\begin{cases}\nc + \\sin(2x - \\pi/2) & x > 0\\\\\n3x - 4 & x \\leq 0.\n\\end{cases}\n\\]\nWhat value of \\(c\\) will make \\(f(x)\\) continuous?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose \\(f(x)\\), \\(g(x)\\), and \\(h(x)\\) are continuous functions on \\((a,b)\\). If \\(a < c < b\\), are you sure that \\(lim_{x \\rightarrow c} f(g(x))\\) is \\(f(g(c))\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n No, as \\(g(c)\\) may not be in the interval \\((a,b)\\)\n \n \n\n\n \n \n \n \n Yes, composition of continuous functions results in a continuous function, so the limit is just the function value.\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the function \\(f(x)\\) given by the following graph\n\n\n\n\n\nThe function \\(f(x)\\) is continuous at \\(x=1\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe function \\(f(x)\\) is continuous at \\(x=2\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe function \\(f(x)\\) is right continuous at \\(x=3\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe function \\(f(x)\\) is left continuous at \\(x=4\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x)\\) and \\(g(x)\\) be 
continuous functions whose graph of \\([0,1]\\) is given by:\n\n\n\n\n\nWhat is \\(\\lim_{x \\rightarrow 0.25} f(g(x))\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is \\(\\lim{x \\rightarrow 0.25} g(f(x))\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is \\(\\lim_{x \\rightarrow 0.5} f(g(x))\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(0.0\\)\n \n \n\n\n \n \n \n \n \\(-1.0\\)\n \n \n\n\n \n \n \n \n Can't tell"
},
{
"objectID": "limits/intermediate_value_theorem.html",
"href": "limits/intermediate_value_theorem.html",
"title": "21  Implications of continuity",
"section": "",
"text": "This section uses these add-on packages:\nContinuity for functions is a valued property which carries implications. In this section we discuss two: the intermediate value theorem and the extreme value theorem. These two theorems speak to some fundamental applications of calculus: finding zeros of a function and finding extrema of a function."
},
{
"objectID": "limits/intermediate_value_theorem.html#intermediate-value-theorem",
"href": "limits/intermediate_value_theorem.html#intermediate-value-theorem",
"title": "21  Implications of continuity",
"section": "21.1 Intermediate Value Theorem",
"text": "21.1 Intermediate Value Theorem\n\nThe intermediate value theorem: If \\(f\\) is continuous on \\([a,b]\\) with, say, \\(f(a) < f(b)\\), then for any \\(y\\) with \\(f(a) \\leq y \\leq f(b)\\) there exists a \\(c\\) in \\([a,b]\\) with \\(f(c) = y\\).\n\n\n\n \n Illustration of intermediate value theorem. The theorem implies that any randomly chosen \\(y\\) value between \\(f(a)\\) and \\(f(b)\\) will have at least one \\(x\\) in \\([a,b]\\) with \\(f(x)=y\\).\n \n \n\n\n\nIn the early years of calculus, the intermediate value theorem was intricately connected with the definition of continuity; now it is a consequence.\nThe basic proof starts with a set of points in \\([a,b]\\): \\(C = \\{x \\text{ in } [a,b] \\text{ with } f(x) \\leq y\\}\\). The set is not empty (as \\(a\\) is in \\(C\\)) so it must have a least upper bound, call it \\(c\\) (this requires the completeness property of the real numbers). By continuity of \\(f\\), it can be shown that \\(\\lim_{x \\rightarrow c-} f(x) = f(c) \\leq y\\) and \\(\\lim_{x \\rightarrow c+} f(x) = f(c) \\geq y\\), which forces \\(f(c) = y\\).\n\n21.1.1 Bolzano and the bisection method\nSuppose we have a continuous function \\(f(x)\\) on \\([a,b]\\) with \\(f(a) < 0\\) and \\(f(b) > 0\\). Then as \\(f(a) < 0 < f(b)\\), the intermediate value theorem guarantees the existence of a \\(c\\) in \\([a,b]\\) with \\(f(c) = 0\\). This special case of the intermediate value theorem was first proved by Bolzano. Such \\(c\\) are called zeros of the function \\(f\\).\nWe use this fact when building a “sign chart” of a polynomial function. Between any two consecutive real zeros the polynomial cannot change sign. (Why?) 
So a “test point” can be used to determine the sign of the function over an entire interval.\nHere, we use the Bolzano theorem to give an algorithm - the bisection method - to locate the value \\(c\\) under the assumption \\(f\\) is continuous on \\([a,b]\\) and changes sign between \\(a\\) and \\(b\\).\n\n\n \n Illustration of the bisection method to find a zero of a function. At each step the interval has \\(f(a)\\) and \\(f(b)\\) having opposite signs so that the intermediate value theorem guarantees a zero.\n \n \n\n\n\nCall \\([a,b]\\) a bracketing interval if \\(f(a)\\) and \\(f(b)\\) have different signs. We remark that having different signs can be expressed mathematically as \\(f(a) \\cdot f(b) < 0\\).\nWe can narrow down where a zero is in \\([a,b]\\) by following this recipe:\n\nPick a midpoint of the interval, for concreteness \\(c = (a+b)/2\\).\nIf \\(f(c) = 0\\) we are done, having found a zero in \\([a,b]\\).\nOtherwise it must be that either \\(f(a)\\cdot f(c) < 0\\) or \\(f(c) \\cdot f(b) < 0\\). If \\(f(a) \\cdot f(c) < 0\\), then let \\(b=c\\) and repeat the above. Otherwise, let \\(a=c\\) and repeat the above.\n\nAt each step either a zero is found or the bracketing interval is narrowed - indeed, split in half.\nFor the real numbers this algorithm never stops unless a zero is found. A “limiting” process is used to say that if it doesn't stop, it will converge to some value.\nHowever, using floating point numbers leads to differences from the real-number situation. In this case, due to the ultimate granularity of the approximation of floating point values to the real numbers, the bracketing interval eventually can't be subdivided; that is, no \\(c\\) is found over the floating point numbers with \\(a < c < b\\). 
So there are natural stopping criteria: stop when there is an exact zero, when the bracketing interval gets too small to subdivide, or when the interval is as small as desired.\nWe can write a relatively simple program to implement this algorithm:\n\nfunction simple_bisection(f, a, b)\n if f(a) == 0 return(a) end\n if f(b) == 0 return(b) end\n if f(a) * f(b) > 0 error(\"[a,b] is not a bracketing interval\") end\n\n tol = 1e-14 # small number (but should depend on size of a, b)\n c = a/2 + b/2\n\n while abs(b-a) > tol\n if f(c) == 0 return(c) end\n\n if f(a) * f(c) < 0\n a, b = a, c\n else\n a, b = c, b\n end\n\n c = a/2 + b/2\n\n end\n c\nend\n\nsimple_bisection (generic function with 1 method)\n\n\nThis function uses a while loop to repeat the process of subdividing \\([a,b]\\). A while loop will repeat until the condition is no longer true. The above will stop for reasonably sized floating point values (within \\((-100, 100)\\), say), but, as written, ignores the fact that the gap between floating point values depends on their magnitude.\nThe value \\(c\\) returned need not be an exact zero. Let's see:\n\nc = simple_bisection(sin, 3, 4)\n\n3.141592653589793\n\n\nThis value of \\(c\\) is a floating-point approximation to \\(\\pi\\), but is not quite a zero:\n\nsin(c)\n\n1.2246467991473532e-16\n\n\n(Even pi itself is not a “zero” due to floating point issues.)\n\n\n21.1.2 The find_zero function.\nThe Roots package has a function find_zero that implements the bisection method when called as find_zero(f, (a,b)) where \\([a,b]\\) is a bracket. Its use is similar to simple_bisection above. This package is loaded when CalculusWithJulia is. We illustrate the usage of find_zero in the following:\n\nxstar = find_zero(sin, (3, 4))\n\n3.141592653589793\n\n\n\n\n\n\n\n\nWarning\n\n\n\nNotice, the call find_zero(sin, (3, 4)) again fits the template action(function, args...) that we see repeatedly. The find_zero function can also be called through fzero. 
The use of (3, 4) to specify the interval is not necessary. For example [3,4] would work equally well. (Anything where extrema is defined works.)\n\n\nThis function utilizes some facts about floating point values to guarantee that the answer will be either an exact zero or a value where the function changes sign between the adjacent floating point values, which means the signs at the next and previous floating point values differ:\n\nsin(xstar), sign(sin(prevfloat(xstar))), sign(sin(nextfloat(xstar)))\n\n(1.2246467991473532e-16, 1.0, -1.0)\n\n\n\nExample\nThe polynomial \\(p(x) = x^5 - x + 1\\) has a zero between \\(-2\\) and \\(-1\\). Find it.\n\np(x) = x^5 - x + 1\nc₀ = find_zero(p, (-2, -1))\n(c₀, p(c₀))\n\n(-1.1673039782614187, -6.661338147750939e-16)\n\n\nWe see, as before, that \\(p(c)\\) is not quite \\(0\\). But it can be easily checked that p is negative at the returned value and the previous floating point number, while positive at the next:\n\np(c₀), sign(p(prevfloat(c₀))), sign(p(nextfloat(c₀)))\n\n(-6.661338147750939e-16, -1.0, 1.0)\n\n\n\n\nExample\nThe function \\(q(x) = e^x - x^4\\) has a zero between \\(5\\) and \\(10\\), as this graph shows:\n\nq(x) = exp(x) - x^4\nplot(q, 5, 10)\n\n\n\n\nFind the zero numerically. The plot shows \\(q(5) < 0 < q(10)\\), so \\([5,10]\\) is a bracket. We thus have:\n\nfind_zero(q, (5, 10))\n\n8.6131694564414\n\n\n\n\nExample\nFind all real zeros of \\(f(x) = x^3 -x + 1\\) using the bisection method.\nWe show next that symbolic values can be used with find_zero, should that be useful.\nFirst, we produce a plot to identify a bracketing interval:\n\n@syms x\nplot(x^3 - x + 1, -3, 3)\n\n\n\n\nIt appears (and a plot over \\([0,1]\\) verifies) that there is one zero between \\(-2\\) and \\(-1\\). 
It is found with:\n\nfind_zero(x^3 - x + 1, (-2, -1))\n\n-1.324717957244746\n\n\n\n\nExample\nThe equation \\(\\cos(x) = x\\) has just one solution, as can be seen in this plot:\n\n𝒇(x) = cos(x)\n𝒈(x) = x\nplot(𝒇, -pi, pi)\nplot!(𝒈)\n\n\n\n\nFind it.\nWe see from the graph that it is clearly between \\(0\\) and \\(2\\), so all we need is a function. (We have two.) The trick is to observe that solving \\(f(x) = g(x)\\) is the same problem as solving for \\(x\\) where \\(f(x) - g(x) = 0\\). So we define the difference and use that:\n\n𝒉(x) = 𝒇(x) - 𝒈(x)\nfind_zero(𝒉, (0, 2))\n\n0.7390851332151607\n\n\n\n\nUsing parameterized functions (f(x,p)) with find_zero\nGeometry will tell us that \\(\\cos(x) = x/p\\) for one \\(x\\) in \\([0, \\pi/2]\\) whenever \\(p>0\\). We could set up finding this value for a given \\(p\\) by making \\(p\\) part of the function definition, but as an illustration of passing parameters, we leave p as a parameter (in this case, as a second value with default of \\(1\\)):\n\nf(x, p=1) = cos(x) - x/p\nI = (0, pi/2)\nfind_zero(f, I), find_zero(f, I, p=2)\n\n(0.7390851332151607, 1.0298665293222589)\n\n\nThe second number is the solution when p=2.\n\nExample\nWe wish to compare two trash collection plans\n\nPlan 1: You pay \\(47.49\\) plus \\(0.77\\) per bag.\nPlan 2: You pay \\(30.00\\) plus \\(2.00\\) per bag.\n\nThere are some cases where plan 1 is cheaper and some where plan 2 is. Categorize them.\nBoth plans are linear models and may be written in slope-intercept form:\n\nplan1(x) = 47.49 + 0.77x\nplan2(x) = 30.00 + 2.00x\n\nplan2 (generic function with 1 method)\n\n\nAssuming this is a realistic problem and an average American household might produce \\(10\\)-\\(20\\) bags of trash a month (yes, that seems too much!) 
we plot in that range:\n\nplot(plan1, 10, 20)\nplot!(plan2)\n\n\n\n\nWe can see the intersection point is around \\(14\\) and that if a family generates between \\(0\\)-\\(14\\) bags of trash per month that plan \\(2\\) would be cheaper.\nLets get a numeric value, using a simple bracket and an anonymous function:\n\nfind_zero(x -> plan1(x) - plan2(x), (10, 20))\n\n14.21951219512195\n\n\n\n\nExample, the flight of an arrow\nThe flight of an arrow can be modeled using various functions, depending on assumptions. Suppose an arrow is launched in the air from a height of \\(0\\) feet above the ground at an angle of \\(\\theta = \\pi/4\\). With a suitable choice for the initial velocity, a model without wind resistance for the height of the arrow at a distance \\(x\\) units away may be:\n\\[\nj(x) = \\tan(\\theta) x - (1/2) \\cdot g(\\frac{x}{v_0 \\cos\\theta})^2.\n\\]\nIn julia we have, taking \\(v_0=200\\):\n\nj(x; theta=pi/4, g=32, v0=200) = tan(theta)*x - (1/2)*g*(x/(v0*cos(theta)))^2\n\nj (generic function with 1 method)\n\n\nWith a velocity-dependent wind resistance given by \\(\\gamma\\), again with some units, a similar equation can be constructed. It takes a different form:\n\\[\nd(x) = (\\frac{g}{\\gamma v_0 \\cos(\\theta)} + \\tan(\\theta)) \\cdot x +\n \\frac{g}{\\gamma^2}\\log(\\frac{v_0\\cos(\\theta) - \\gamma x}{v_0\\cos(\\theta)})\n\\]\nAgain, \\(v_0\\) is the initial velocity and is taken to be \\(200\\) and \\(\\gamma\\) a resistance, which we take to be \\(1\\). With this, we have the following julia definition (with a slight reworking of \\(\\gamma\\)):\n\nfunction d(x; theta=pi/4, g=32, v0=200, gamma=1)\n a = gamma * v0 * cos(theta)\n (g/a + tan(theta)) * x + g/gamma^2 * log((a-gamma^2 * x)/a)\nend\n\nd (generic function with 1 method)\n\n\nFor each model, we wish to find the value of \\(x\\) after launching where the height is modeled to be \\(0\\). 
That is, how far will the arrow travel before touching the ground?\nFor the model without wind resistance, we can graph the function easily enough. Let's guess the distance is no more than \\(500\\) feet:\n\nplot(j, 0, 500)\n\n\n\n\nWell, we haven't even seen the peak yet. Better to do a little spade work first. This is a quadratic function, so we can use roots from SymPy to find the roots:\n\nroots(j(x))\n\nDict{Any, Any} with 2 entries:\n 1250.00000000000 => 1\n 0 => 1\n\n\nWe see that \\(1250\\) is the largest root. So we plot over this domain to visualize the flight:\n\nplot(j, 0, 1250)\n\n\n\n\nAs for the model with wind resistance, a quick plot over the same interval, \\([0, 1250]\\) yields:\n\nplot(d, 0, 1250)\n\n\n\n\nThis graph eventually goes negative and then stops. This is due to the asymptote in the model when (a - gamma^2*x)/a is zero. To plot the trajectory until it returns to \\(0\\), we need to identify the value of the zero. This model is non-linear and we don't have the simplicity of using roots to find out the answer, so we solve for when \\(a-\\gamma^2 x\\) is \\(0\\):\n\ngamma = 1\na = 200 * cos(pi/4)\nb = a/gamma^2\n\n141.4213562373095\n\n\nNote that the function is infinite at b:\n\nd(b)\n\n-Inf\n\n\nFrom the graph, we can see the zero is around b. As d(b) is -Inf, we can use the bracket (b/2,b)\n\nx1 = find_zero(d, (b/2, b))\n\n140.7792933802306\n\n\nThe answer is approximately \\(140.8\\).\n(The bisection method only needs to know the sign of the function. Other bracketing methods would have issues with an endpoint with an infinite function value. 
To use them, some value between the zero and b would be needed.)\nFinally, we plot both graphs at once to see that it was a very windy day indeed.\n\nplot(j, 0, 1250, label=\"no wind\")\nplot!(d, 0, x1, label=\"windy day\")\n\n\n\n\n\n\nExample: bisection and non-continuity\nThe Bolzano theorem assumes a continuous function \\(f\\), and when applicable, yields an algorithm to find a guaranteed zero.\nHowever, the algorithm itself does not know that the function is continuous or not, only that the function changes sign. As such, it can produce answers that are not “zeros” when used with discontinuous functions.\nIn general, a function over floating point values could be considered as a large table of mappings: each of the \\(2^{64}\\) floating point values gets assigned a value. This is a discrete mapping; there is nothing the computer sees related to continuity.\n\nThe concept of continuity, if needed, must be verified by the user of the algorithm.\n\nWe have seen this when plotting rational functions or functions with vertical asymptotes. The default algorithms just connect points with lines. The user must manage the discontinuity (by assigning some values NaN, say); the algorithms used do not.\nIn this particular case, the bisection algorithm can still be fruitful even when the function is not continuous, as the algorithm will yield information about crossing values of \\(0\\), possibly at discontinuities. But the user of the algorithm must be aware that the answers are only guaranteed to be zeros of the function if the function is continuous - the algorithm does not check that assumption.\nAs an example, let \\(f(x) = 1/x\\). Clearly the interval \\([-1,1]\\) is a “bracketing” interval as \\(f(x)\\) changes sign between \\(a\\) and \\(b\\). 
What does the algorithm yield?\n\nfᵢ(x) = 1/x\nx0 = find_zero(fᵢ, (-1, 1))\n\n0.0\n\n\nThe function is not defined at the answer, but we do have the fact that just to the left of the answer (prevfloat) and just to the right of the answer (nextfloat) the function changes sign:\n\nsign(fᵢ(prevfloat(x0))), sign(fᵢ(nextfloat(x0)))\n\n(-1.0, 1.0)\n\n\nSo, the “bisection method” applied here finds a point where the function crosses \\(0\\), either by continuity or by jumping over the \\(0\\). (A jump discontinuity at \\(x=c\\) is defined by the left and right limits of \\(f\\) at \\(c\\) existing but being unequal. The algorithm can find \\(c\\) when this type of function jumps over \\(0\\).)\n\n\n\n\n21.1.3 The find_zeros function\nThe bisection method suggests a naive means to search for all zeros within an interval \\((a, b)\\): split the interval into many small intervals and for each that is a bracketing interval find a zero. This simple description has three flaws: it might miss values where the function doesn't actually cross the \\(x\\) axis; it might miss values where the function just dips to the other side; and it might miss multiple values in the same small interval.\nStill, with some engineering, this can be a useful approach, the caveats notwithstanding. This idea is implemented in the find_zeros function of the Roots package. The function is called via find_zeros(f, (a, b)) but here the interval \\([a,b]\\) is not necessarily a bracketing interval.\nTo see how it works, we have:\n\nf(x) = cos(10*pi*x)\nfind_zeros(f, (0, 1))\n\n10-element Vector{Float64}:\n 0.05\n 0.15\n 0.25\n 0.35\n 0.45\n 0.5499999999999999\n 0.6499999999999999\n 0.75\n 0.85\n 0.95\n\n\nOr for a polynomial:\n\nf(x) = x^5 - x^4 + x^3 - x^2 + 1\nfind_zeros(f, (-10, 10))\n\n1-element Vector{Float64}:\n -0.6518234538234416\n\n\n(Here \\(-10\\) and \\(10\\) were arbitrarily chosen. 
Cauchys bound on the size of polynomial roots could be used to be more systematic.)\n\nExample: Solving f(x) = g(x)\nUse find_zeros to find when \\(e^x = x^5\\) in the interval \\([-20, 20]\\). Verify the answers.\nTo proceed with find_zeros, we define \\(f(x) = e^x - x^5\\), as \\(f(x) = 0\\) precisely when \\(e^x = x^5\\). The zeros are then found with:\n\nf₁(x) = exp(x) - x^5\nzs = find_zeros(f₁, (-20,20))\n\n2-element Vector{Float64}:\n  1.2958555090953687\n 12.713206788867632\n\n\nThe output of find_zeros is a vector of values. Checking that each value is an approximate zero can be done with the “.” (broadcast) syntax:\n\nf₁.(zs)\n\n2-element Vector{Float64}:\n 0.0\n 0.0\n\n\n(For a continuous function, it should be the case that the values returned by find_zeros are approximate zeros. Bear in mind that if \\(f\\) is not continuous, the algorithm might find points where the function jumps over zero; these are not zeros and may not even be in the domain of the function.)\n\n\n\n21.1.4 An alternate interface to find_zero\nThe find_zero function in the Roots package is an interface to one of several methods. For now we focus on the bracketing methods; later we will see others. Bracketing methods, among others, include Roots.Bisection(), the basic bisection method, though with a different sense of “middle” than \\((a+b)/2\\), used by default above; Roots.A42(), which will typically converge much faster than simple bisection; Roots.Brent() for the classic method of Brent, and FalsePosition() for a family of regula falsi methods. These can all be used by specifying the method in a call to find_zero.\nAlternatively, Roots implements the CommonSolve interface popularized by its use in the DifferentialEquations.jl ecosystem, a wildly successful area for Julia. 
The basic setup has two steps: set up a “problem,” then solve it.\nTo set up a problem, we call ZeroProblem with the function and an initial interval, as in:\n\nf₅(x) = x^5 - x - 1\nprob = ZeroProblem(f₅, (1,2))\n\nZeroProblem{typeof(f₅), Tuple{Int64, Int64}}(f₅, (1, 2))\n\n\nThen we can “solve” this problem with solve. For example:\n\nsolve(prob), solve(prob, Roots.Brent()), solve(prob, Roots.A42())\n\n(1.1673039782614187, 1.1673039782614187, 1.1673039782614187)\n\n\nThough the answers are identical, the methods employed were not. The first call, with an unspecified method, defaults to bisection."
},
{
"objectID": "limits/intermediate_value_theorem.html#extreme-value-theorem",
"href": "limits/intermediate_value_theorem.html#extreme-value-theorem",
"title": "21  Implications of continuity",
"section": "21.2 Extreme value theorem",
"text": "21.2 Extreme value theorem\nThe Extreme Value Theorem is another consequence of continuity.\nTo discuss the extreme value theorem, we define an absolute maximum.\n\nThe absolute maximum of \\(f(x)\\) over an interval \\(I\\), when it exists, is the value \\(f(c)\\), \\(c\\) in \\(I\\), where \\(f(x) \\leq f(c)\\) for any \\(x\\) in \\(I\\).\nSimilarly, an absolute minimum of \\(f(x)\\) over an interval \\(I\\) can be defined, when it exists, by a value \\(f(c)\\) where \\(c\\) is in \\(I\\) and \\(f(c) \\leq f(x)\\) for any \\(x\\) in \\(I\\).\n\nRelated but different is the concept of a relative or local extremum:\n\nA local maximum for \\(f\\) is a value \\(f(c)\\) where \\(c\\) is in some open interval \\(I=(a,b)\\), \\(I\\) in the domain of \\(f\\), and \\(f(c)\\) is an absolute maximum for \\(f\\) over \\(I\\). Similarly, a local minimum for \\(f\\) is a value \\(f(c)\\) where \\(c\\) is in some open interval \\(I=(a,b)\\), \\(I\\) in the domain of \\(f\\), and \\(f(c)\\) is an absolute minimum for \\(f\\) over \\(I\\).\n\nThe term local extremum is used to describe either a local maximum or local minimum.\nThe key point is that the extrema are values in the range that are realized by some value in the domain (possibly more than one).\nThis chart of the Hardrock 100 illustrates the two concepts.\n\n\n\nElevation profile of the Hardrock 100 ultramarathon. Treating the elevation profile as a function, the absolute maximum is just about 14,000 feet and the absolute minimum about 7600 feet. These are of interest to the runner for different reasons. 
Also of interest would be each local maximum and local minimum - the peaks and valleys of the graph - and the total elevation climbed - the latter so important/unforgettable its value makes it into the chart's title.\n\n\nThe extreme value theorem discusses an assumption that ensures absolute maximum and absolute minimum values exist.\n\nThe extreme value theorem: If \\(f(x)\\) is continuous over a closed interval \\([a,b]\\) then \\(f\\) has an absolute maximum and an absolute minimum over \\([a,b]\\).\n\n(By continuous over \\([a,b]\\) we mean continuous on \\((a,b)\\) and right continuous at \\(a\\) and left continuous at \\(b\\).)\nThe assumption that \\([a,b]\\) includes its endpoints (it is closed) is crucial to make a guarantee. There are functions which are continuous on open intervals for which this result is not true. For example, \\(f(x) = 1/x\\) on \\((0,1)\\). This function will have no smallest value or largest value, as defined above.\nThe extreme value theorem is an important theoretical tool for investigating maxima and minima of functions.\n\nExample\nThe function \\(f(x) = \\sqrt{1-x^2}\\) is continuous on the interval \\([-1,1]\\) (in the sense above). It then has an absolute maximum, which we can see to be \\(1\\), occurring at the interior point \\(0\\). The absolute minimum is \\(0\\); it occurs at each endpoint.\n\n\nExample\nThe function \\(f(x) = x \\cdot e^{-x}\\) on the closed interval \\([0, 5]\\) is continuous. Hence it has an absolute maximum, which a graph shows to be about \\(0.37\\) (the value \\(1/e\\), occurring at \\(x=1\\)). It has an absolute minimum, clearly the value \\(0\\) occurring at the endpoint.\n\nplot(x -> x * exp(-x), 0, 5)\n\n\n\n\n\n\nExample\nThe tangent function does not have a guarantee of absolute maximum or minimum over \\((-\\pi/2, \\pi/2),\\) as it is not continuous at the endpoints. 
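(A crude numerical illustration of the x * exp(-x) example just above: sample the closed interval on a fine grid and take the extreme sampled values. This is a sketch only; grid sampling can miss an extremum that falls between grid points.)

```julia
# Illustrating the extreme value theorem for f(x) = x * exp(-x) on [0, 5]:
# evaluate on a fine grid and report the extreme sampled values.
f(x) = x * exp(-x)
xs = range(0, 5, length=100_001)
ys = f.(xs)
ymax, imax = findmax(ys)   # largest sampled value and its index
ymin, imin = findmin(ys)   # smallest sampled value and its index
(xs[imax], ymax)           # near (1, 1/e), the absolute maximum
(xs[imin], ymin)           # (0.0, 0.0), the minimum at the endpoint
```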
In fact, it doesn't have either extremum - it has vertical asymptotes at each endpoint of this interval.\n\n\nExample\nThe function \\(f(x) = x^{2/3}\\) over the interval \\([-2,2]\\) has a cusp at \\(0\\). However, it is continuous on this closed interval, so it must have an absolute maximum and absolute minimum. They can be seen from the graph to occur at the endpoints and the cusp at \\(x=0\\), respectively:\n\nplot(x -> (x^2)^(1/3), -2, 2)\n\n\n\n\n(The use of just x^(2/3) would fail; can you guess why?)\n\n\nExample\nA New York Times article discusses an idea of Norway moving its border some 490 feet north and 650 feet east in order to have the peak of Mount Halti be the highest point in Finland, as currently it would be on the boundary. Mathematically this hints at a higher dimensional version of the extreme value theorem."
},
{
"objectID": "limits/intermediate_value_theorem.html#continuity-and-closed-and-open-sets",
"href": "limits/intermediate_value_theorem.html#continuity-and-closed-and-open-sets",
"title": "21  Implications of continuity",
"section": "21.3 Continuity and closed and open sets",
"text": "21.3 Continuity and closed and open sets\nWe comment on two implications of continuity that can be generalized to more general settings.\nThe two intervals \\((a,b)\\) and \\([a,b]\\) differ as the latter includes the endpoints. The extreme value theorem shows this distinction can make a big difference in what can be said regarding images of such intervals.\nIn particular, if \\(f\\) is continuous and \\(I = [a,b]\\) with \\(a\\) and \\(b\\) finite (\\(I\\) is closed and bounded) then the image of \\(I\\), sometimes denoted \\(f(I) = \\{y: y=f(x) \\text{ for } x \\in I\\}\\), has the property that it will be an interval and will include its endpoints (also closed and bounded).\nThat \\(f(I)\\) is an interval is a consequence of the intermediate value theorem. That \\(f(I)\\) contains its endpoints is the extreme value theorem.\nOn the real line, sets that are closed and bounded are “compact,” a term that generalizes to other settings.\n\nContinuity implies that the image of a compact set is compact.\n\nNow let \\((c,d)\\) be an open interval in the range of \\(f\\). An open interval is an open set. On the real line, an open set is one where each point in the set, \\(a\\), has some \\(\\delta\\) such that if \\(|b-a| < \\delta\\) then \\(b\\) is also in the set.\n\nContinuity implies that the preimage of an open set is an open set.\n\nThe preimage of an open set, \\(I\\), is \\(\\{a: f(a) \\in I\\}\\). (All \\(a\\) with an image in \\(I\\).) Take some pair \\((a,y)\\) with \\(y\\) in \\(I\\) and \\(a\\) in the preimage, so \\(f(a)=y\\). Let \\(\\epsilon\\) be such that \\(|x-y| < \\epsilon\\) implies \\(x\\) is in \\(I\\). Then as \\(f\\) is continuous at \\(a\\), given \\(\\epsilon\\) there is a \\(\\delta\\) such that \\(|b-a| <\\delta\\) implies \\(|f(b) - f(a)| < \\epsilon\\), or \\(|f(b)-y| < \\epsilon\\), which means that \\(f(b)\\) is in \\(I\\), so \\(b\\) is in the preimage, implying the preimage is an open set."
},
{
"objectID": "limits/intermediate_value_theorem.html#questions",
"href": "limits/intermediate_value_theorem.html#questions",
"title": "21  Implications of continuity",
"section": "21.4 Questions",
"text": "21.4 Questions\n\nQuestion\nThere is a negative zero in the interval \\([-10, 0]\\) for the function \\(f(x) = e^x - x^4\\). Find its value numerically:\n\n\nQuestion\nThere is a zero in the interval \\([0, 5]\\) for the function \\(f(x) = e^x - x^4\\). Find its value numerically:\n\n\nQuestion\nLet \\(f(x) = x^2 - 10 \\cdot x \\cdot \\log(x)\\). This function has two zeros on the positive \\(x\\) axis. You are asked to find the largest (graph and bracket…).\n\n\nQuestion\nThe airyai function has infinitely many negative roots, as the function oscillates when \\(x < 0\\); it has no positive roots. Find the second largest root using the graph to bracket the answer, and then solve.\n\nplot(airyai, -10, 10) # `airyai` loaded in `SpecialFunctions` by `CalculusWithJulia`\n\n\n\n\nThe second largest root is:\n\n\nQuestion\n(From Strang, p. 37)\nCertainly \\(x^3\\) equals \\(3^x\\) at \\(x=3\\). Find the largest value for which \\(x^3 = 3^x\\).\n\n\nCompare \\(x^2\\) and \\(2^x\\). 
They meet at \\(2\\); where do they meet again?\n\nBefore and after 2\nOnly after 2\nOnly before 2\n\n\nJust by graphing, find a number \\(b\\) with \\(2 < b < 3\\) where for values less than \\(b\\) there is a zero of \\(b^x - x^b\\) beyond \\(b\\), and for values more than \\(b\\) there isn't.\n\n\\(b \\approx 2.2\\)\n\\(b \\approx 2.9\\)\n\\(b \\approx 2.5\\)\n\\(b \\approx 2.7\\)\n\n\nQuestion: What goes up must come down…\n\n\n\nTrajectories of potential cannonball fires with air-resistance included. (http://ej.iop.org/images/0143-0807/33/1/149/Full/ejp405251f1_online.jpg)\n\n\nIn 1638, according to Amir D. Aczel, an experiment was performed in the French countryside. A monk, Marin Mersenne, launched a cannonball straight up into the air in an attempt to help Descartes prove facts about the rotation of the earth. Though the experiment was not successful, Mersenne later observed that the time for the cannonball to go up was greater than the time to come down. “Vertical Projection in a Resisting Medium: Reflections on Observations of Mersenne”.\nThis isn't the case for simple ballistic motion where the time to go up is equal to the time to come down. We can “prove” this numerically. For simple ballistic motion:\n\\[\nf(t) = -\\frac{1}{2} \\cdot 32 t^2 + v_0t.\n\\]\nThe times to go up and come down are found from the two zeros of this function. The peak time is related to a zero of a function given by f', which for now we'll take as a mystery operation, but later will be known as the derivative. (The notation assumes CalculusWithJulia has been loaded.)\nLet \\(v_0= 390\\). The three times in question can be found from the zeros of f and f'. 
What are they?\n\n\n\n \n \n \n \n \n \n \n \n \n \\((-4.9731, 0.0, 4.9731)\\)\n \n \n\n\n \n \n \n \n \\((0.0, 625.0, 1250.0)\\)\n \n \n\n\n \n \n \n \n \\((0.0, 12.1875, 24.375)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion What goes up must come down… (again)\nFor simple ballistic motion you find that the time to go up is the time to come down. For motion within a resistant medium, such as air, this isnt the case. Suppose a model for the height as a function of time is given by\n\\[\nh(t) = (\\frac{g}{\\gamma^2} + \\frac{v_0}{\\gamma})(1 - e^{-\\gamma t}) - \\frac{gt}{\\gamma}\n\\]\n(From “On the trajectories of projectiles depicted in early ballistic Woodcuts”)\nHere \\(g=32\\), again we take \\(v_0=390\\), and \\(\\gamma\\) is a drag coefficient that we will take to be \\(1\\). This is valid when \\(h(t) \\geq 0\\). In Julia, rather than hard-code the parameter values, for added flexibility we can pass them in as keyword arguments:\n\nh(t; g=32, v0=390, gamma=1) = (g/gamma^2 + v0/gamma)*(1 - exp(-gamma*t)) - g*t/gamma\n\nh (generic function with 1 method)\n\n\nNow find the three times: \\(t_0\\), the starting time; \\(t_a\\), the time at the apex of the flight; and \\(t_f\\), the time the object returns to the ground.\n\n\n\n \n \n \n \n \n \n \n \n \n \\((0, 13.187, 30.0)\\)\n \n \n\n\n \n \n \n \n \\((0, 32.0, 390.0)\\)\n \n \n\n\n \n \n \n \n \\((0, 2.579, 13.187)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nPart of the proof of the intermediate value theorem rests on knowing what the limit is of \\(f(x)\\) when \\(f(x) > y\\) for all \\(x\\). 
What can we say about \\(L\\) supposing \\(L = \\lim_{x \\rightarrow c+}f(x)\\) under this assumption on \\(f\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n It must be that \\(L > y\\) as each \\(f(x)\\) is.\n \n \n\n\n \n \n \n \n It must be that \\(L \\geq y\\)\n \n \n\n\n \n \n \n \n It can happen that \\(L < y\\), \\(L=y\\), or \\(L>y\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe extreme value theorem has two assumptions: a continuous function and a closed interval. Which of the following examples fails to satisfy the consequence of the extreme value theorem because the interval is not closed? (The consequence - the existence of an absolute maximum and minimum - can happen even if the theorem does not apply.)\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f(x) = \\sin(x),~ I=(-2\\pi, 2\\pi)\\)\n \n \n\n\n \n \n \n \n \\(f(x) = \\sin(x),~ I=(-\\pi, \\pi)\\)\n \n \n\n\n \n \n \n \n \\(f(x) = \\sin(x),~ I=(-\\pi/2, \\pi/2)\\)\n \n \n\n\n \n \n \n \n None of the above\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe extreme value theorem has two assumptions: a continuous function and a closed interval. Which of the following examples fails to satisfy the consequence of the extreme value theorem because the function is not continuous?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f(x) = 1/x,~ I=[1,2]\\)\n \n \n\n\n \n \n \n \n \\(f(x) = 1/x,~ I=[-2, -1]\\)\n \n \n\n\n \n \n \n \n \\(f(x) = 1/x,~ I=[-1, 1]\\)\n \n \n\n\n \n \n \n \n none of the above\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe extreme value theorem has two assumptions: a continuous function and a closed interval. 
Which of the following examples fails to satisfy the consequence of the extreme value theorem because the function is not continuous?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f(x) = \\text{sign}(x),~ I=[-1, 1]\\)\n \n \n\n\n \n \n \n \n \\(f(x) = 1/x,~ I=[-4, -1]\\)\n \n \n\n\n \n \n \n \n \\(f(x) = \\text{floor}(x),~ I=[-1/2, 1/2]\\)\n \n \n\n\n \n \n \n \n none of the above\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe function \\(f(x) = x^3 - x\\) is continuous over the interval \\(I=[-2,2]\\). Find a value \\(c\\) for which \\(M=f(c)\\) is an absolute maximum over \\(I\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe function \\(f(x) = x^3 - x\\) is continuous over the interval \\(I=[-1,1]\\). Find a value \\(c\\) for which \\(M=f(c)\\) is an absolute maximum over \\(I\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the continuous function \\(f(x) = \\sin(x)\\) over the closed interval \\(I=[0, 10\\pi]\\). Which of these is true?\n\n\n\n \n \n \n \n \n \n \n \n \n There is no value \\(c\\) for which \\(f(c)\\) is an absolute maximum over \\(I\\).\n \n \n\n\n \n \n \n \n There is just one value of \\(c\\) for which \\(f(c)\\) is an absolute maximum over \\(I\\).\n \n \n\n\n \n \n \n \n There are many values of \\(c\\) for which \\(f(c)\\) is an absolute maximum over \\(I\\).\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the continuous function \\(f(x) = \\sin(x)\\) over the closed interval \\(I=[0, 10\\pi]\\). 
Which of these is true?\n\n\n\n \n \n \n \n \n \n \n \n \n There is no value \\(M\\) for which \\(M=f(c)\\), \\(c\\) in \\(I\\) for which \\(M\\) is an absolute maximum over \\(I\\).\n \n \n\n\n \n \n \n \n There is just one value \\(M\\) for which \\(M=f(c)\\), \\(c\\) in \\(I\\) for which \\(M\\) is an absolute maximum over \\(I\\).\n \n \n\n\n \n \n \n \n There are many values \\(M\\) for which \\(M=f(c)\\), \\(c\\) in \\(I\\) for which \\(M\\) is an absolute maximum over \\(I\\).\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe extreme value theorem says that on a closed interval a continuous function has an extreme value \\(M=f(c)\\) for some \\(c\\). Does it also say that \\(c\\) is unique? Which of these examples might help you answer this?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f(x) = \\sin(x),\\quad I=[0, 2\\pi]\\)\n \n \n\n\n \n \n \n \n \\(f(x) = \\sin(x),\\quad I=[-\\pi/2, \\pi/2]\\)\n \n \n\n\n \n \n \n \n \\(f(x) = \\sin(x),\\quad I=[-2\\pi, 2\\pi]\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe zeros of the equation \\(\\cos(x) \\cdot \\cosh(x) = 1\\) are related to vibrations of rods. Using find_zeros, what is the largest zero in the interval \\([0, 6\\pi]\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA parametric equation is specified by a parameterization \\((f(t), g(t)), a \\leq t \\leq b\\). The parameterization will be continuous if and only if each function is continuous.\nSuppose \\(k_x\\) and \\(k_y\\) are positive integers and \\(a, b\\) are positive numbers, will the Lissajous curve given by \\((a\\cos(k_x t), b\\sin(k_y t))\\) be continuous?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nHere is a sample graph for \\(a=1, b=2, k_x=3, k_y=4\\):\n\na,b = 1, 2\nk_x, k_y = 3, 4\nplot(t -> a * cos(k_x *t), t-> b * sin(k_y * t), 0, 4pi)"
},
{
"objectID": "derivatives/derivatives.html",
"href": "derivatives/derivatives.html",
"title": "22  Derivatives",
"section": "",
"text": "This section uses these add-on packages:\nBefore defining the derivative of a function, lets begin with two motivating examples."
},
{
"objectID": "derivatives/derivatives.html#the-slope-of-the-secant-line",
"href": "derivatives/derivatives.html#the-slope-of-the-secant-line",
"title": "22  Derivatives",
"section": "22.1 The slope of the secant line",
"text": "22.1 The slope of the secant line\nIn the above examples, we see that the average speed is computed using the slope formula. This can be generalized for any univariate function \\(f(x)\\):\n\nThe average rate of change between \\(a\\) and \\(b\\) is \\((f(b) - f(a)) / (b - a)\\). It is typical to express this as \\(\\Delta y/ \\Delta x\\), where \\(\\Delta\\) means “change”.\n\nGeometrically, this is the slope of the line connecting the points \\((a, f(a))\\) and \\((b, f(b))\\). This line is called a secant line, which is just a line intersecting two specified points on a curve.\nRather than parameterize this problem using \\(a\\) and \\(b\\), we let \\(c\\) and \\(c+h\\) represent the two values for \\(x\\), then the secant-line-slope formula becomes\n\\[\nm = \\frac{f(c+h) - f(c)}{h}.\n\\]"
},
{
"objectID": "derivatives/derivatives.html#the-slope-of-the-tangent-line",
"href": "derivatives/derivatives.html#the-slope-of-the-tangent-line",
"title": "22  Derivatives",
"section": "22.2 The slope of the tangent line",
"text": "22.2 The slope of the tangent line\nThe slope of the secant line represents the average rate of change over a given period, \\(h\\). What if this rate is so variable that it makes sense to take smaller and smaller periods \\(h\\)? In fact, what if \\(h\\) goes to \\(0\\)?\n\n\n \n The slope of each secant line represents the average rate of change between \\(c\\) and \\(c+h\\). As \\(h\\) goes towards \\(0\\), we recover the slope of the tangent line, which represents the instantaneous rate of change.\n \n \n\n\n\nThe graphic suggests that the slopes of the secant line converge to the slope of a “tangent” line. That is, for a given \\(c\\), this limit exists:\n\\[\n\\lim_{h \\rightarrow 0} \\frac{f(c+h) - f(c)}{h}.\n\\]\nWe will define the tangent line at \\((c, f(c))\\) to be the line through the point with the slope from the limit above - provided that limit exists. Informally, the tangent line is the line through the point that best approximates the function.\n\n\n \n The tangent line is the best linear approximation to the function at the point \\((c, f(c))\\). As the viewing window zooms in on \\((c,f(c))\\) we can see how the graph and its tangent line get more similar.\n \n \n\n\n\nThe tangent line is not just any line that intersects the graph in one point; nor need it intersect the graph in only one point.\n\n\n\n\n\n\nNote\n\n\n\nThis last point was certainly not obvious at first. Barrow, who had Newton as a pupil, and was the first to sketch a proof of part of the Fundamental Theorem of Calculus, understood a tangent line to be a line that intersects a curve at only one point.\n\n\n\nExample\nWhat is the slope of the tangent line to \\(f(x) = \\sin(x)\\) at \\(c=0\\)?\nWe need to compute the limit \\((\\sin(c+h) - \\sin(c))/h\\) which is the limit as \\(h\\) goes to \\(0\\) of \\(\\sin(h)/h.\\) We know this to be \\(1.\\)\n\nf(x) = sin(x)\nc = 0\ntl(x) = f(c) + 1 * (x - c)\nplot(f, -pi/2, pi/2)\nplot!(tl, -pi/2, pi/2)"
},
{
"objectID": "derivatives/derivatives.html#the-derivative",
"href": "derivatives/derivatives.html#the-derivative",
"title": "22  Derivatives",
"section": "22.3 The derivative",
"text": "22.3 The derivative\nThe limit of the slope of the secant line gives an operation: for each \\(c\\) in the domain of \\(f\\) there is a number (the slope of the tangent line) or it does not exist. That is, there is a derived function from \\(f\\). Call this function the derivative of \\(f\\).\nThere are many notations for the derivative; mostly we use the “prime” notation:\n\\[\nf'(x) = \\lim_{h \\rightarrow 0} \\frac{f(x+h) - f(x)}{h}.\n\\]\nThe limit above is identical, only it uses \\(x\\) instead of \\(c\\) to emphasize that we are thinking of a function now, and not just a value at a point.\nThe derivative is related to a function, but at times it is more convenient to write only the expression defining the rule of the function. In that case, we use this notation for the derivative \\([\\text{expression}]'\\).\n\n22.3.1 Some basic derivatives\n\nThe power rule. What is the derivative of the monomial \\(f(x) = x^n\\)? We need to look at \\((x+h)^n - x^n\\) for positive, integer-valued \\(n\\). Let's look at a case, \\(n=5\\)\n\n\n@syms x::real h::real\nn = 5\nex = expand((x+h)^n - x^n)\n\n \n\\[\nh^{5} + 5 h^{4} x + 10 h^{3} x^{2} + 10 h^{2} x^{3} + 5 h x^{4}\n\\]\n\n\n\nAll terms have an h in them, so we cancel it out:\n\ncancel(ex/h, h)\n\n \n\\[\nh^{4} + 5 h^{3} x + 10 h^{2} x^{2} + 10 h x^{3} + 5 x^{4}\n\\]\n\n\n\nWe see the lone term 5x^4 without an \\(h\\), so as we let \\(h\\) go to \\(0\\), this will be the limit. That is, \\(f'(x) = 5x^4\\).\nFor positive, integer-valued \\(n\\), the binomial theorem gives an expansion \\((x+h)^n = x^n + nx^{n-1}\\cdot h^1 + \\frac{n(n-1)}{2}x^{n-2}\\cdot h^2 + \\cdots\\). Subtracting \\(x^n\\) then dividing by \\(h\\) leaves just the term \\(nx^{n-1}\\) without a power of \\(h\\), so the limit, in general, is just this term. 
That is:\n\\[\n[x^n]' = nx^{n-1}.\n\\]\nWhen \\(n=0\\) the formula also applies, as \\(x^0\\) is the constant \\(1\\), and all constant functions will have a derivative of \\(0\\) at all \\(x\\). We will see that, in general, the power rule applies for any \\(n\\) where \\(x^n\\) is defined.\n\nWhat is the derivative of \\(f(x) = \\sin(x)\\)? We know that \\(f'(0)= 1\\) by the earlier example with \\((\\sin(0+h)-\\sin(0))/h = \\sin(h)/h\\); here we solve in general.\n\nWe need to consider the difference \\(\\sin(x+h) - \\sin(x)\\):\n\nsympy.expand_trig(sin(x+h) - sin(x)) # expand_trig is not exposed in `SymPy`\n\n \n\\[\n\\sin{\\left(h \\right)} \\cos{\\left(x \\right)} + \\sin{\\left(x \\right)} \\cos{\\left(h \\right)} - \\sin{\\left(x \\right)}\n\\]\n\n\n\nThat used the formula \\(\\sin(x+h) = \\sin(x)\\cos(h) + \\sin(h)\\cos(x)\\).\nWe could then rearrange the secant line slope formula to become:\n\\[\n\\cos(x) \\cdot \\frac{\\sin(h)}{h} + \\sin(x) \\cdot \\frac{\\cos(h) - 1}{h}\n\\]\nand take a limit. If the answer isn't clear, we can let SymPy do this work:\n\nlimit((sin(x+h) - sin(x))/ h, h => 0)\n\n \n\\[\n\\cos{\\left(x \\right)}\n\\]\n\n\n\nFrom the formula \\([\\sin(x)]' = \\cos(x)\\) we can easily get the slope of the tangent line to \\(f(x) = \\sin(x)\\) at \\(x=0\\) by simply evaluating \\(\\cos(0) = 1\\).\n\nLet's see what the derivative of \\(\\ln(x) = \\log(x)\\) is (using base \\(e\\) for \\(\\log\\) unless otherwise indicated). We have\n\n\\[\n\\frac{\\log(x+h) - \\log(x)}{h} = \\frac{1}{h}\\log(\\frac{x+h}{x}) = \\log((1+h/x)^{1/h}).\n\\]\nAs noted earlier, Cauchy saw the limit as \\(u\\) goes to \\(0\\) of \\(f(u) = (1 + u)^{1/u}\\) is \\(e\\). Since \\((1+h/x)^{1/h} = ((1+h/x)^{x/h})^{1/x} = f(h/x)^{1/x}\\), the above can be re-expressed as \\(\\log(f(h/x)^{1/x})\\). 
The limit as \\(h\\) goes to \\(0\\) of this is found from the composition rules for limits: as \\(\\lim_{h \\rightarrow 0} f(h/x)^{1/x} = e^{1/x}\\), and since \\(\\log(x)\\) is continuous at \\(e^{1/x}\\), we get this expression has the limit \\(\\log(e^{1/x}) = 1/x\\).\nWe verify through:\n\nlimit((log(x+h) - log(x))/h, h => 0)\n\n \n\\[\n\\frac{1}{x}\n\\]\n\n\n\n\nThe derivative of \\(f(x) = e^x\\) can also be done from a limit. We have\n\n\\[\n\\frac{e^{x+h} - e^x}{h} = \\frac{e^x \\cdot(e^h -1)}{h}.\n\\]\nEarlier, we saw that \\(\\lim_{h \\rightarrow 0}(e^h - 1)/h = 1\\). With this, we get \\([e^x]' = e^x\\); that is, it is a function satisfying \\(f'=f\\).\n\nThere are several different notations for derivatives. Some are historical, some just add flexibility. We use the prime notation of Lagrange: \\(f'(x)\\), \\(u'\\) and \\([\\text{expr}]'\\), where the first emphasizes that the derivative is a function with a value at \\(x\\), the second emphasizes the derivative operates on functions, the last emphasizes that we are taking the derivative of some expression.\nThere are many other notations:\n\nThe Leibniz notation uses the infinitesimals: \\(dy/dx\\) to relate to \\(\\Delta y/\\Delta x\\). This notation is very common, and especially useful when more than one variable is involved. SymPy uses Leibniz notation in some of its output, expressing something such as:\n\n\\[\nf'(x) = \\frac{d}{d\\xi}(f(\\xi)) \\big|_{\\xi=x}.\n\\]\nThe notation - \\(\\big|\\) - on the right-hand side separates the tasks of finding the derivative and evaluating the derivative at a specific value.\n\nEuler used D for the operator D(f). This was initially used by Arbogast. The notation D(f)(c) would be needed to evaluate the derivative at a point.\nNewton used a “dot” above the variable, \\(\\dot{x}(t)\\), which is still widely used in physics to indicate a derivative in time. 
The dot notation indicates: take the derivative in the time variable, then plug in \\(t\\).\nThe notation \\([expr]'(c)\\) or \\([expr]'\\big|_{x=c}\\) would similarly mean: take the derivative of the expression and then evaluate at \\(c\\)."
},
{
"objectID": "derivatives/derivatives.html#rules-of-derivatives",
"href": "derivatives/derivatives.html#rules-of-derivatives",
"title": "22  Derivatives",
"section": "22.4 Rules of derivatives",
"text": "22.4 Rules of derivatives\nWe could proceed in a similar manner using limits to find other derivatives, but let's not. If we have a function \\(f(x) = x^5 \\sin(x)\\), it would be nice to leverage our previous work on the derivatives of \\(f(x) =x^5\\) and \\(g(x) = \\sin(x)\\), rather than derive an answer from scratch.\nAs with limits and continuity, it proves very useful to consider rules that make the process of finding derivatives of combinations of functions a matter of combining derivatives of the individual functions in some manner.\nWe already have one such rule:\n\n22.4.1 Power rule\nWe have seen for integer \\(n \\geq 0\\) the formula:\n\\[\n[x^n]' = n x^{n-1}.\n\\]\nThis will be shown true for all real exponents.\n\n\n22.4.2 Sum rule\nLet's consider \\(k(x) = a\\cdot f(x) + b\\cdot g(x)\\); what is its derivative? That is, in terms of \\(f\\), \\(g\\) and their derivatives, can we express \\(k'(x)\\)?\nWe can rearrange \\((k(x+h) - k(x))\\) as follows:\n\\[\n(a\\cdot f(x+h) + b\\cdot g(x+h)) - (a\\cdot f(x) + b \\cdot g(x)) =\na\\cdot (f(x+h) - f(x)) + b \\cdot (g(x+h) - g(x)).\n\\]\nDividing by \\(h\\), we see that this becomes\n\\[\na\\cdot \\frac{f(x+h) - f(x)}{h} + b \\cdot \\frac{g(x+h) - g(x)}{h} \\rightarrow a\\cdot f'(x) + b\\cdot g'(x).\n\\]\nThat is \\([a\\cdot f(x) + b \\cdot g(x)]' = a\\cdot f'(x) + b\\cdot g'(x)\\).\nThis holds two rules: the derivative of a constant times a function is the constant times the derivative of the function; and the derivative of a sum of functions is the sum of the derivatives of the functions.\nThis example shows a useful template:\n\\[\n\\begin{align*}\n[2x^2 - \\frac{x}{3} + 3e^x]' & = 2[\\square]' - \\frac{[\\square]'}{3} + 3[\\square]'\\\\\n&= 2[x^2]' - \\frac{[x]'}{3} + 3[e^x]'\\\\\n&= 2(2x) - \\frac{1}{3} + 3e^x\\\\\n&= 4x - \\frac{1}{3} + 3e^x\n\\end{align*}\n\\]\n\n\n22.4.3 Product rule\nOther rules can be similarly derived. SymPy can give us them as well. 
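Before the symbolic derivation, the product rule can be sanity-checked numerically with a central difference quotient. This is a sketch, not a proof; the functions p and q, the point c, and the step h below are arbitrary, made-up choices:

```julia
# Numeric sanity check of the product rule [p*q]' = p'q + pq' at a point,
# using a central difference quotient.
p(x) = x^5;      dp(x) = 5x^4        # p and its known derivative
q(x) = sin(x);   dq(x) = cos(x)      # q and its known derivative
w(x) = p(x) * q(x)

c, h = 1.2, 1e-6
numeric = (w(c + h) - w(c - h)) / (2h)
exact   = dp(c) * q(c) + p(c) * dq(c)
abs(numeric - exact)                 # small; the difference vanishes as h does
```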
Here we define two symbolic functions u and v and let SymPy derive a formula for the derivative of a product of functions:\n\n@syms u() v()\nf(x) = u(x) * v(x)\nlimit((f(x+h) - f(x))/h, h => 0)\n\n \n\\[\nu{\\left(x \\right)} \\left. \\frac{d}{d \\xi_{1}} v{\\left(\\xi_{1} \\right)} \\right|_{\\substack{ \\xi_{1}=x }} + v{\\left(x \\right)} \\left. \\frac{d}{d \\xi_{1}} u{\\left(\\xi_{1} \\right)} \\right|_{\\substack{ \\xi_{1}=x }}\n\\]\n\n\n\nThe output uses the Leibniz notation to represent that the derivative of \\(u(x) \\cdot v(x)\\) is \\(u\\) times the derivative of \\(v\\) evaluated at \\(x\\) plus \\(v\\) times the derivative of \\(u\\) evaluated at \\(x\\). A common shorthand is \\([uv]' = u'v + uv'\\).\nThis example shows a useful template for the product rule:\n\\[\n\\begin{align*}\n[(x^2+1)\\cdot e^x]' &= [\\square]' \\cdot (\\square) + (\\square) \\cdot [\\square]'\\\\\n&= [x^2 + 1]' \\cdot (e^x) + (x^2+1) \\cdot [e^x]'\\\\\n&= (2x)\\cdot e^x + (x^2+1)\\cdot e^x\n\\end{align*}\n\\]\n\n\n22.4.4 Quotient rule\nThe derivative of \\(f(x) = u(x)/v(x)\\) - a ratio of functions - can be similarly computed. The result will be \\([u/v]' = (u'v - uv')/v^2\\):\n\n@syms u() v()\nf(x) = u(x) / v(x)\nlimit((f(x+h) - f(x))/h, h => 0)\n\n \n\\[\n\\frac{- u{\\left(x \\right)} \\left. \\frac{d}{d \\xi_{1}} v{\\left(\\xi_{1} \\right)} \\right|_{\\substack{ \\xi_{1}=x }} + v{\\left(x \\right)} \\left. 
\\frac{d}{d \\xi_{1}} u{\\left(\\xi_{1} \\right)} \\right|_{\\substack{ \\xi_{1}=x }}}{v^{2}{\\left(x \\right)}}\n\\]\n\n\n\nThis example shows a useful template for the quotient rule:\n\\[\n\\begin{align*}\n[\\frac{x^2+1}{e^x}]' &= \\frac{[\\square]' \\cdot (\\square) - (\\square) \\cdot [\\square]'}{(\\square)^2}\\\\\n&= \\frac{[x^2 + 1]' \\cdot (e^x) - (x^2+1) \\cdot [e^x]'}{(e^x)^2}\\\\\n&= \\frac{(2x)\\cdot e^x - (x^2+1)\\cdot e^x}{e^{2x}}\n\\end{align*}\n\\]\n\nExamples\nCompute the derivative of \\(f(x) = (1 + \\sin(x)) + (1 + x^2)\\).\nAs written we can identify \\(f(x) = u(x) + v(x)\\) with \\(u=(1 + \\sin(x))\\), \\(v=(1 + x^2)\\). The sum rule immediately applies to give:\n\\[\nf'(x) = (\\cos(x)) + (2x).\n\\]\n\nCompute the derivative of \\(f(x) = (1 + \\sin(x)) \\cdot (1 + x^2)\\).\nThe same \\(u\\) and \\(v\\) may be identified. The product rule readily applies to yield:\n\\[\nf'(x) = u'v + uv' = \\cos(x) \\cdot (1 + x^2) + (1 + \\sin(x)) \\cdot (2x).\n\\]\n\nCompute the derivative of \\(f(x) = (1 + \\sin(x)) / (1 + x^2)\\).\nThe same \\(u\\) and \\(v\\) may be identified. The quotient rule readily applies to yield:\n\\[\nf'(x) = \\frac{u'v - uv'}{v^2} = \\frac{\\cos(x) \\cdot (1 + x^2) - (1 + \\sin(x)) \\cdot (2x)}{(1+x^2)^2}.\n\\]\n\nCompute the derivative of \\(f(x) = (x-1) \\cdot (x-2)\\).\nThis can be done using the product rule or by expanding the polynomial and using the power and sum rule. As this polynomial is easy to expand, we do both and compare:\n\\[\n[(x-1)(x-2)]' = [x^2 - 3x + 2]' = 2x -3.\n\\]\nWhereas the product rule gives:\n\\[\n[(x-1)(x-2)]' = 1\\cdot (x-2) + (x-1)\\cdot 1 = 2x - 3.\n\\]\n\nFind the derivative of \\(f(x) = (x-1)(x-2)(x-3)(x-4)(x-5)\\).\nWe could expand this, as above, but without computer assistance the potential for error is high. 
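(That said, with computer assistance the expansion is safe. A sketch using SymPy, assuming x is a symbolic variable, for comparison with the hand computation that follows:)

```julia
using SymPy
@syms x
ex = (x-1)*(x-2)*(x-3)*(x-4)*(x-5)
diff(expand(ex), x)     # derivative of the expanded, degree-5 polynomial
```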
Instead we will use the product rule on the product of \\(5\\) terms.\nLet's first treat the case of \\(3\\) products:\n\\[\n[u\\cdot v\\cdot w]' =[ u \\cdot (vw)]' = u' (vw) + u [vw]' = u'(vw) + u[v' w + v w'] =\nu' vw + u v' w + uvw'.\n\\]\nThis pattern generalizes, clearly, to:\n\\[\n[f_1\\cdot f_2 \\cdots f_n]' = f_1' f_2 \\cdots f_n + f_1 \\cdot f_2' \\cdot (f_3 \\cdots f_n) + \\dots +\nf_1 \\cdots f_{n-1} \\cdot f_n'.\n\\]\nThere are \\(n\\) terms, in each of which exactly one of the \\(f_i\\)s has a derivative. Were we to multiply top and bottom by \\(f_i\\), each term would look like: \\(f \\cdot f_i'/f_i\\).\nWith this, we can proceed. Each term \\(x-i\\) has derivative \\(1\\), so the answer to \\(f'(x)\\), with \\(f\\) as above, is \\(f'(x) = f(x)/(x-1) + f(x)/(x-2) + f(x)/(x-3) + f(x)/(x-4) + f(x)/(x-5)\\), that is:\n\\[\nf'(x) = (x-2)(x-3)(x-4)(x-5) + (x-1)(x-3)(x-4)(x-5) + (x-1)(x-2)(x-4)(x-5) + (x-1)(x-2)(x-3)(x-5) + (x-1)(x-2)(x-3)(x-4).\n\\]\n\nFind the derivative of \\(x\\sin(x)\\) evaluated at \\(\\pi\\).\n\\[\n[x\\sin(x)]'\\big|_{x=\\pi} = (1\\sin(x) + x\\cos(x))\\big|_{x=\\pi} = (\\sin(\\pi) + \\pi \\cdot \\cos(\\pi)) = -\\pi.\n\\]\n\n\n\n22.4.5 Chain rule\nFinally, the derivative of a composition of functions can be computed using pieces of each function. This gives a rule called the chain rule. Before deriving, let's give a slight motivation.\nConsider the output of a factory for some widget. It depends on two steps: an initial manufacturing step and a finishing step. The number of employees is important in how much is initially manufactured. Suppose \\(x\\) is the number of employees and \\(g(x)\\) is the amount initially manufactured. Adding more employees increases the amount made by the made-up rule \\(g(x) = \\sqrt{x}\\). The finishing step depends on how much is made by the employees. If \\(y\\) is the amount made, then \\(f(y)\\) is the number of widgets finished. 
Suppose for some reason that \\(f(y) = y^2.\\)\nHow many widgets are made as a function of employees? The composition \\(u(x) = f(g(x))\\) would provide that. Changes in the initial manufacturing step lead to changes in how much is initially made; changes in the initial amount made leads to changes in the finished products. Each change contributes to the overall change.\nWhat is the effect of adding employees on the rate of output of widgets? In this specific case we know the answer, as \\((f \\circ g)(x) = x\\), so the answer is just the rate is \\(1\\).\nIn general, we want to express \\(\\Delta f / \\Delta x\\) in a form so that we can take a limit.\nBut what do we know? We know \\(\\Delta g / \\Delta x\\) and \\(\\Delta f/\\Delta y\\). Using \\(y=g(x)\\), this suggests that we might have luck with the right side of this equation:\n\\[\n\\frac{\\Delta f}{\\Delta x} = \\frac{\\Delta f}{\\Delta y} \\cdot \\frac{\\Delta y}{\\Delta x}.\n\\]\nInterpreting this, we get the average rate of change in the composition can be thought of as a product: The average rate of change of the initial step (\\(\\Delta y/ \\Delta x\\)) times the average rate of the change of the second step evaluated not at \\(x\\), but at \\(y\\), \\(\\Delta f/ \\Delta y\\).\nRe-expressing using derivative notation with \\(h\\) would be:\n\\[\n\\frac{f(g(x+h)) - f(g(x))}{h} = \\frac{f(g(x+h)) - f(g(x))}{g(x+h) - g(x)} \\cdot \\frac{g(x+h) - g(x)}{h}.\n\\]\nThe left hand side will converge to the derivative of \\(u(x)\\) or \\([f(g(x))]'\\).\nThe right most part of the right side would have a limit \\(g'(x)\\), were we to let \\(h\\) go to \\(0\\).\nIt isnt obvious, but the left part of the right side has the limit \\(f'(g(x))\\). This would be clear if only \\(g(x+h) = g(x) + h\\), for then the expression would be exactly the limit expression with \\(c=g(x)\\). 
But, alas, except to some hopeful students and some special cases, it is definitely not the case in general that \\(g(x+h) = g(x) + h\\) - that right parentheses actually means something. However, it is nearly the case that \\(g(x+h) = g(x) + kh\\) for some \\(k\\) and this can be used to formulate a proof (one of the two detailed here and here).\nCombined, we would end up with:\n\nThe chain rule: \\([f(g(x))]' = f'(g(x)) \\cdot g'(x)\\). That is the derivative of the outer function evaluated at the inner function times the derivative of the inner function.\n\nTo see that this works in our specific case, we assume the general power rule that \\([x^n]' = n x^{n-1}\\) to get:\n\\[\n\\begin{align*}\nf(x) &= x^2 & g(x) &= \\sqrt{x}\\\\\nf'(\\square) &= 2(\\square) & g'(x) &= \\frac{1}{2}x^{-1/2}\n\\end{align*}\n\\]\nWe use \\(\\square\\) for the argument of f' to emphasize that \\(g(x)\\) is the needed value, not just \\(x\\):\n\\[\n\\begin{align*}\n[(\\sqrt{x})^2]' &= [f(g(x)]'\\\\\n&= f'(g(x)) \\cdot g'(x) \\\\\n&= 2(\\sqrt{x}) \\cdot \\frac{1}{2}x^{-1/2}\\\\\n&= \\frac{2\\sqrt{x}}{2\\sqrt{x}}\\\\\n&=1\n\\end{align*}\n\\]\nThis is the same as the derivative of \\(x\\) found by first evaluating the composition. For this problem, the chain rule is not necessary, but typically it is a needed rule to fully differentiate a function.\n\nExamples\nFind the derivative of \\(f(x) = \\sqrt{1 - x^2}\\). We identify the composition of \\(\\sqrt{x}\\) and \\((1-x^2)\\). We set the functions and their derivatives into a pattern to emphasize the pieces in the chain-rule formula:\n\\[\n\\begin{align*}\nf(x) &=\\sqrt{x} = x^{1/2} & g(x) &= 1 - x^2 \\\\\nf'(\\square) &=(1/2)(\\square)^{-1/2} & g'(x) &= -2x\n\\end{align*}\n\\]\nThen:\n\\[\n[f(g(x))]' = (1/2)(1-x^2)^{-1/2} \\cdot (-2x).\n\\]\n\nFind the derivative of \\(\\log(2 + \\sin(x))\\). This is a composition \\(\\log(x)\\) with derivative \\(1/x\\) and \\(2 + \\sin(x)\\) with derivative \\(\\cos(x)\\). 
We get \\(\\cos(x)/(2 + \\sin(x))\\).\nIn general,\n\\[\n[\\log(f(x))]' = \\frac{f'(x)}{f(x)}.\n\\]\n\nFind the derivative of \\(e^{f(x)}\\). The inner function has derivative \\(f'(x)\\), the outer function has derivative \\(e^x\\) (the same as the outer function itself). We get for a derivative\n\\[\n[e^{f(x)}]' = e^{f(x)} \\cdot f'(x).\n\\]\nThis is a useful rule to remember for expressions involving exponentials.\n\nFind the derivative of \\(\\sin(x)\\cos(2x)\\) at \\(x=\\pi\\).\n\\[\n[\\sin(x)\\cos(2x)]'\\big|_{x=\\pi} =\n(\\cos(x)\\cos(2x) + \\sin(x)(-\\sin(2x)\\cdot 2))\\big|_{x=\\pi} =\n((-1)(1) + (0)(-0)(2)) = -1.\n\\]\n\n\nProof of the Chain Rule\nA function is differentiable at \\(a\\) if the following limit exists: \\(\\lim_{h \\rightarrow 0}(f(a+h)-f(a))/h\\). Re-expressing this as: \\(f(a+h) - f(a) - f'(a)h = \\epsilon_f(h) h\\) where, as \\(h\\rightarrow 0\\), \\(\\epsilon_f(h) \\rightarrow 0\\). Then, we have:\n\\[\ng(a+h) = g(a) + g'(a)h + \\epsilon_g(h) h = g(a) + h',\n\\]\nwhere \\(h' = (g'(a) + \\epsilon_g(h))h \\rightarrow 0\\) as \\(h \\rightarrow 0\\) will be used to simplify the following:\n\\[\n\\begin{align}\nf(g(a+h)) - f(g(a)) &=\nf(g(a) + g'(a)h + \\epsilon_g(h)h) - f(g(a)) \\\\\n&= f(g(a)) + f'(g(a)) (g'(a)h + \\epsilon_g(h)h) + \\epsilon_f(h')(h') - f(g(a))\\\\\n&= f'(g(a)) g'(a)h + f'(g(a))(\\epsilon_g(h)h) + \\epsilon_f(h')(h').\n\\end{align}\n\\]\nRearranging:\n\\[\nf(g(a+h)) - f(g(a)) - f'(g(a)) g'(a) h = f'(g(a))\\epsilon_g(h)h + \\epsilon_f(h')(h') =\n(f'(g(a)) \\epsilon_g(h) + \\epsilon_f(h')(g'(a) + \\epsilon_g(h)))h =\n\\epsilon(h)h,\n\\]\nwhere \\(\\epsilon(h)\\) combines the above terms which go to zero as \\(h\\rightarrow 0\\) into one. 
This is the alternative definition of the derivative, showing \\((f\\circ g)'(a) = f'(g(a)) g'(a)\\) when \\(g\\) is differentiable at \\(a\\) and \\(f\\) is differentiable at \\(g(a)\\).\n\n\nThe “chain” rule\nThe chain rule name could also be simply the “composition rule,” as that is the operation the rule works for. However, in practice, there are usually multiple compositions, and the “chain” rule is used to chain together the different pieces. To get a sense, consider a triple composition \\(u(v(w(x)))\\). This will have derivative:\n\\[\n\\begin{align*}\n[u(v(w(x)))]' &= u'(v(w(x))) \\cdot [v(w(x))]' \\\\\n&= u'(v(w(x))) \\cdot v'(w(x)) \\cdot w'(x)\n\\end{align*}\n\\]\nThe answer can be viewed as a repeated peeling off of the outer function, a view with immediate application to many compositions. To see that in action with an expression, consider this derivative problem, shown in steps:\n\\[\n\\begin{align*}\n[\\sin(e^{\\cos(x^2-x)})]'\n&= \\cos(e^{\\cos(x^2-x)}) \\cdot [e^{\\cos(x^2-x)}]'\\\\\n&= \\cos(e^{\\cos(x^2-x)}) \\cdot e^{\\cos(x^2-x)} \\cdot [\\cos(x^2-x)]'\\\\\n&= \\cos(e^{\\cos(x^2-x)}) \\cdot e^{\\cos(x^2-x)} \\cdot (-\\sin(x^2-x)) \\cdot [x^2-x]'\\\\\n&= \\cos(e^{\\cos(x^2-x)}) \\cdot e^{\\cos(x^2-x)} \\cdot (-\\sin(x^2-x)) \\cdot (2x-1)\\\\\n\\end{align*}\n\\]\n\n\nMore examples of differentiation\nFind the derivative of \\(x^5 \\cdot \\sin(x)\\).\nThis is a product of functions, using \\([u\\cdot v]' = u'v + uv'\\) we get:\n\\[\n5x^4 \\cdot \\sin(x) + x^5 \\cdot \\cos(x)\n\\]\n\nFind the derivative of \\(x^5 / \\sin(x)\\).\nThis is a quotient of functions. Using \\([u/v]' = (u'v - uv')/v^2\\) we get\n\\[\n(5x^4 \\cdot \\sin(x) - x^5 \\cdot \\cos(x)) / (\\sin(x))^2.\n\\]\n\nFind the derivative of \\(\\sin(x^5)\\). This is a composition of functions \\(u(v(x))\\) with \\(v(x) = x^5\\). 
The chain rule says find the derivative of \\(u\\) (\\(\\cos(x)\\)) and evaluate at \\(v(x)\\) (\\(\\cos(x^5)\\)) then multiply by the derivative of \\(v\\):\n\\[\n\\cos(x^5) \\cdot 5x^4.\n\\]\n\nSimilarly, but differently, find the derivative of \\(\\sin(x)^5\\). Now \\(v(x) = \\sin(x)\\), so the derivative of \\(u(x)\\) (\\(5x^4\\)) evaluated at \\(v(x)\\) is \\(5(\\sin(x))^4\\) so multiplying by \\(v'\\) gives:\n\\[\n5(\\sin(x))^4 \\cdot \\cos(x)\n\\]\n\nWe can verify these with SymPy. Rather than take a limit, we will use SymPy's diff function to compute derivatives.\n\ndiff(x^5 * sin(x))\n\n \n\\[\nx^{5} \\cos{\\left(x \\right)} + 5 x^{4} \\sin{\\left(x \\right)}\n\\]\n\n\n\n\ndiff(x^5/sin(x))\n\n \n\\[\n- \\frac{x^{5} \\cos{\\left(x \\right)}}{\\sin^{2}{\\left(x \\right)}} + \\frac{5 x^{4}}{\\sin{\\left(x \\right)}}\n\\]\n\n\n\n\ndiff(sin(x^5))\n\n \n\\[\n5 x^{4} \\cos{\\left(x^{5} \\right)}\n\\]\n\n\n\nand finally,\n\ndiff(sin(x)^5)\n\n \n\\[\n5 \\sin^{4}{\\left(x \\right)} \\cos{\\left(x \\right)}\n\\]\n\n\n\n\n\n\n\n\n\nNote\n\n\n\nThe diff function can be called as diff(ex) when there is just one free variable, as in the above examples; as diff(ex, var) when there are parameters in the expression.\n\n\n\nThe general power rule: For any \\(n\\) - not just integer values - we can re-express \\(x^n\\) using \\(e\\): \\(x^n = e^{n \\log(x)}\\). Now the chain rule can be applied:\n\\[\n[x^n]' = [e^{n\\log(x)}]' = e^{n\\log(x)} \\cdot (n \\frac{1}{x}) = n x^n \\cdot \\frac{1}{x} = n x^{n-1}.\n\\]\n\nFind the derivative of \\(f(x) = x^3 (1-x)^2\\) using either the product rule or the sum rule.\nThe product rule expresses \\(f=u\\cdot v\\). With \\(u(x)=x^3\\) and \\(v(x)=(1-x)^2\\) we get:\n\\[\nu'(x) = 3x^2, \\quad v'(x) = 2 \\cdot (1-x)^1 \\cdot (-1),\n\\]\nthe last by the chain rule. 
Combining with \\(u' v + u v'\\) we get: \\(f'(x) = (3x^2)\\cdot (1-x)^2 + x^3 \\cdot (-2) \\cdot (1-x)\\).\nOtherwise, the polynomial can be expanded to give \\(f(x)=x^5-2x^4+x^3\\) which has derivative \\(f'(x) = 5x^4 - 8x^3 + 3x^2\\).\n\nFind the derivative of \\(f(x) = x \\cdot e^{-x^2}\\).\nUsing the product rule and then the chain rule, we have:\n\\[\n\\begin{align}\nf'(x) &= [x \\cdot e^{-x^2}]'\\\\\n&= [x]' \\cdot e^{-x^2} + x \\cdot [e^{-x^2}]'\\\\\n&= 1 \\cdot e^{-x^2} + x \\cdot (e^{-x^2}) \\cdot [-x^2]'\\\\\n&= e^{-x^2} + x \\cdot e^{-x^2} \\cdot (-2x)\\\\\n&= e^{-x^2} (1 - 2x^2).\n\\end{align}\n\\]\n\nFind the derivative of \\(f(x) = e^{-ax} \\cdot \\sin(x)\\).\nUsing the product rule and then the chain rule, we have:\n\\[\n\\begin{align}\nf'(x) &= [e^{-ax} \\cdot \\sin(x)]'\\\\\n&= [e^{-ax}]' \\cdot \\sin(x) + e^{-ax} \\cdot [\\sin(x)]'\\\\\n&= e^{-ax} \\cdot [-ax]' \\cdot \\sin(x) + e^{-ax} \\cdot \\cos(x)\\\\\n&= e^{-ax} \\cdot (-a) \\cdot \\sin(x) + e^{-ax} \\cos(x)\\\\\n&= e^{-ax}(\\cos(x) - a\\sin(x)).\n\\end{align}\n\\]\n\nFind the derivative of \\(e^{-x^2/2}\\) at \\(x=1\\).\n\\[\n[e^{-x^2/2}]'\\big|_{x=1} =\n(e^{-x^2/2} \\cdot \\frac{-2x}{2}) \\big|_{x=1} =\ne^{-1/2} \\cdot (-1) = -e^{-1/2}.\n\\]\n\n\nExample: derivative of inverse functions\nSuppose we knew that \\(\\log(x)\\) had derivative of \\(1/x\\), but didnt know the derivative of \\(e^x\\). From their inverse relation, we have: \\(x=\\log(e^x)\\), so taking derivatives of both sides would yield:\n\\[\n1 = (\\frac{1}{e^x}) \\cdot [e^x]'.\n\\]\nOr solving, \\([e^x]' = e^x\\). This is a general strategy to find the derivative of an inverse function.\nThe graph of an inverse function is related to the graph of the function through the symmetry \\(y=x\\).\nFor example, the graph of \\(e^x\\) and \\(\\log(x)\\) have this symmetry, emphasized below:\n\n\n\n\n\nThe point \\((1, e)\\) on the graph of \\(e^x\\) matches the point \\((e, 1)\\) on the graph of the inverse function, \\(\\log(x)\\). 
The slope of the tangent line at \\(x=1\\) to \\(e^x\\) is given by \\(e\\) as well. What is the slope of the tangent line to \\(\\log(x)\\) at \\(x=e\\)?\nAs seen, the value can be computed, but how?\nFinding the derivative of the inverse function can be achieved from the chain rule using the identity \\(f^{-1}(f(x)) = x\\) for all \\(x\\) in the domain of \\(f\\).\nThe chain rule applied to both sides yields:\n\\[\n1 = [f^{-1}]'(f(x)) \\cdot f'(x)\n\\]\nSolving, we see that \\([f^{-1}]'(f(x)) = 1/f'(x)\\). To emphasize the evaluation of the derivative of the inverse function at \\(f(x)\\) we might write:\n\\[\n\\frac{d}{du} (f^{-1}(u)) \\big|_{u=f(x)} = \\frac{1}{f'(x)}\n\\]\nSo the derivative of the inverse function at \\(f(x)\\) is the reciprocal of the slope of the tangent line of \\(f\\) at the mirror image point. In the above, we see if the slope of the tangent line at \\((1,e)\\) to \\(f\\) is \\(e\\), then the slope of the tangent line to \\(f^{-1}(x)\\) at \\((e,1)\\) would be \\(1/e\\).\n\n\nRules of derivatives and some sample functions\nThis table summarizes the rules of derivatives that allow derivatives of more complicated expressions to be computed with the derivatives of their pieces.\n\n\n\n\n\n\nName / Rule\n\nPower rule\n\\([x^n]' = n\\cdot x^{n-1}\\)\n\nconstant\n\\([cf(x)]' = c \\cdot f'(x)\\)\n\nsum/difference\n\\([f(x) \\pm g(x)]' = f'(x) \\pm g'(x)\\)\n\nproduct\n\\([f(x) \\cdot g(x)]' = f'(x)\\cdot g(x) + f(x) \\cdot g'(x)\\)\n\nquotient\n\\([f(x)/g(x)]' = (f'(x) \\cdot g(x) - f(x) \\cdot g'(x)) / g(x)^2\\)\n\nchain\n\\([f(g(x))]' = f'(g(x)) \\cdot g'(x)\\)\n\n\n\n\n\n\n\nThis table gives some useful derivatives:\n\n\n\n\n\n\nFunction / Derivative\n\n\\(x^n (\\text{ all } n)\\)\n\\(nx^{n-1}\\)\n\n\\(e^x\\)\n\\(e^x\\)\n\n\\(\\log(x)\\)\n\\(1/x\\)\n\n\\(\\sin(x)\\)\n\\(\\cos(x)\\)\n\n\\(\\cos(x)\\)\n\\(-\\sin(x)\\)"
},
{
"objectID": "derivatives/derivatives.html#higher-order-derivatives",
"href": "derivatives/derivatives.html#higher-order-derivatives",
"title": "22  Derivatives",
"section": "22.5 Higher-order derivatives",
"text": "22.5 Higher-order derivatives\nThe derivative of a function is an operator, it takes a function and returns a new, derived, function. We could repeat this operation. The result is called a higher-order derivative. The Lagrange notation uses additional “primes” to indicate how many. So \\(f''(x)\\) is the second derivative and \\(f'''(x)\\) the third. For even higher orders, sometimes the notation is \\(f^{(n)}(x)\\) to indicate an \\(n\\)th derivative.\n\nExamples\nFind the first \\(3\\) derivatives of \\(f(x) = ax^3 + bx^2 + cx + d\\).\nDifferentiating a polynomial is done with the sum rule, here we repeat three times:\n\\[\n\\begin{align}\nf(x) &= ax^3 + bx^2 + cx + d\\\\\nf'(x) &= 3ax^2 + 2bx + c \\\\\nf''(x) &= 3\\cdot 2 a x + 2b \\\\\nf'''(x) &= 6a\n\\end{align}\n\\]\nWe can see, the fourth derivative and all higher order ones would be identically \\(0\\). This is part of a general phenomenon: an \\(n\\)th degree polynomial has only \\(n\\) non-zero derivatives.\n\nFind the first \\(5\\) derivatives of \\(\\sin(x)\\).\n\\[\n\\begin{align}\nf(x) &= \\sin(x) \\\\\nf'(x) &= \\cos(x) \\\\\nf''(x) &= -\\sin(x) \\\\\nf'''(x) &= -\\cos(x) \\\\\nf^{(4)} &= \\sin(x) \\\\\nf^{(5)} &= \\cos(x)\n\\end{align}\n\\]\nWe see the derivatives repeat themselves. (We also see alternative notation for higher order derivatives.)\n\nFind the second derivative of \\(e^{-x^2}\\).\nWe need the chain rule and the product rule:\n\\[\n[e^{-x^2}]'' = [e^{-x^2} \\cdot (-2x)]' = \\left(e^{-x^2} \\cdot (-2x)\\right) \\cdot(-2x) + e^{-x^2} \\cdot (-2) =\ne^{-x^2}(4x^2 - 2).\n\\]\nThis can be verified:\n\ndiff(diff(exp(-x^2))) |> simplify\n\n \n\\[\n2 \\cdot \\left(2 x^{2} - 1\\right) e^{- x^{2}}\n\\]\n\n\n\nHaving to iterate the use of diff is cumbersome. 
An alternate notation is either specifying the variable twice: diff(ex, x, x) or using a number after the variable: diff(ex, x, 2):\n\ndiff(exp(-x^2), x, x) |> simplify\n\n \n\\[\n2 \\cdot \\left(2 x^{2} - 1\\right) e^{- x^{2}}\n\\]\n\n\n\nHigher-order derivatives can become involved when the product or quotient rules becomes involved."
},
{
"objectID": "derivatives/derivatives.html#questions",
"href": "derivatives/derivatives.html#questions",
"title": "22  Derivatives",
"section": "22.6 Questions",
"text": "22.6 Questions\n\nQuestion\nThe derivative at \\(c\\) is the slope of the tangent line at \\(x=c\\). Answer the following based on this graph:\n\nfn = x -> -x*exp(x)*sin(pi*x)\nplot(fn, 0, 2)\n\n\n\n\nAt which of these points \\(c= 1/2, 1, 3/2\\) is the derivative negative?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(1/2\\)\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \\(3/2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhich value looks bigger from reading the graph:\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f(1)\\)\n \n \n\n\n \n \n \n \n \\(f(3/2)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nAt \\(0.708 \\dots\\) and \\(1.65\\dots\\) the derivative has a common value. What is it?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the graph of the airyai function (from SpecialFunctions) over \\([-5, 5]\\).\n\n\n\n\n\nAt \\(x = -2.5\\) the derivative is postive or negative?\n\n\n\n \n \n \n \n \n \n \n \n \n positive\n \n \n\n\n \n \n \n \n negative\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nAt \\(x=0\\) the derivative is postive or negative?\n\n\n\n \n \n \n \n \n \n \n \n \n positive\n \n \n\n\n \n \n \n \n negative\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nAt \\(x = 2.5\\) the derivative is postive or negative?\n\n\n\n \n \n \n \n \n \n \n \n \n positive\n \n \n\n\n \n \n \n \n negative\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute the derivative of \\(e^x\\) using limit. What do you get?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(e^x\\)\n \n \n\n\n \n \n \n \n \\(x^e\\)\n \n \n\n\n \n \n \n \n \\((e-1)x^e\\)\n \n \n\n\n \n \n \n \n \\(e x^{(e-1)}\\)\n \n \n\n\n \n \n \n \n something else\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute the derivative of \\(x^e\\) using limit. 
What do you get?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(e^x\\)\n \n \n\n\n \n \n \n \n \\(x^e\\)\n \n \n\n\n \n \n \n \n \\((e-1)x^e\\)\n \n \n\n\n \n \n \n \n \\(e x^{(e-1)}\\)\n \n \n\n\n \n \n \n \n something else\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute the derivative of \\(e^{e\\cdot x}\\) using limit. What do you get?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(e^x\\)\n \n \n\n\n \n \n \n \n \\(x^e\\)\n \n \n\n\n \n \n \n \n \\((e-1)x^e\\)\n \n \n\n\n \n \n \n \n \\(e x^{(e-1)}\\)\n \n \n\n\n \n \n \n \n \\(e \\cdot e^{e\\cdot x}\\)\n \n \n\n\n \n \n \n \n something else\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIn the derivation of the derivative of \\(\\sin(x)\\), the following limit is needed:\n\\[\nL = \\lim_{h \\rightarrow 0} \\frac{\\cos(h) - 1}{h}.\n\\]\nThis is\n\n\n\n \n \n \n \n \n \n \n \n \n Does not exist. The answer is \\(0/0\\) which is undefined\n \n \n\n\n \n \n \n \n \\(0\\), as this expression is the derivative of cosine at \\(0\\). The answer follows, as cosine clearly has a tangent line with slope \\(0\\) at \\(x=0\\).\n \n \n\n\n \n \n \n \n \\(1\\), as this is clearly the analog of the limit of \\(\\sin(h)/h\\).\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x) = (e^x + e^{-x})/2\\) and \\(g(x) = (e^x - e^{-x})/2\\). Which is true?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f'(x) = -g(x)\\)\n \n \n\n\n \n \n \n \n \\(f'(x) = -f(x)\\)\n \n \n\n\n \n \n \n \n \\(f'(x) = g(x)\\)\n \n \n\n\n \n \n \n \n \\(f'(x) = f(x)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x) = (e^x + e^{-x})/2\\) and \\(g(x) = (e^x - e^{-x})/2\\). Which is true?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f''(x) = f(x)\\)\n \n \n\n\n \n \n \n \n \\(f''(x) = -f(x)\\)\n \n \n\n\n \n \n \n \n \\(f''(x) = -g(x)\\)\n \n \n\n\n \n \n \n \n \\(f''(x) = g(x)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the function \\(f\\) and its transformation \\(g(x) = a + f(x)\\) (shift up by \\(a\\)). 
Do \\(f\\) and \\(g\\) have the same derivative?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nConsider the function \\(f\\) and its transformation \\(g(x) = f(x - a)\\) (shift right by \\(a\\)). Do \\(f\\) and \\(g\\) have the same derivative?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nConsider the function \\(f\\) and its transformation \\(g(x) = f(x - a)\\) (shift right by \\(a\\)). Is \\(f'\\) at \\(x\\) equal to \\(g'\\) at \\(x-a\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nConsider the function \\(f\\) and its transformation \\(g(x) = c f(x)\\), \\(c > 1\\). Do \\(f\\) and \\(g\\) have the same derivative?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nConsider the function \\(f\\) and its transformation \\(g(x) = f(x/c)\\), \\(c > 1\\). Do \\(f\\) and \\(g\\) have the same derivative?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhich of the following is true?\n\n\n\n \n \n \n \n \n \n \n \n \n If the graphs of \\(f\\) and \\(g\\) are rescalings of each other through \\(g(x)=cf(x)\\), \\(c > 1\\). Then the tangent line for corresponding points is the same.\n \n \n\n\n \n \n \n \n If the graphs of \\(f\\) and \\(g\\) are translations up and down, the tangent line at corresponding points is unchanged.\n \n \n\n\n \n \n \n \n If the graphs of \\(f\\) and \\(g\\) are rescalings of each other through \\(g(x)=f(x/c)\\), \\(c > 1\\). Then the tangent line for corresponding points is the same.\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe rate of change of volume with respect to height is \\(3h\\). The rate of change of height with respect to time is \\(2t\\). 
At \\(t=3\\) the height is \\(h=14\\); what is the rate of change of volume with respect to time when \\(t=3\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhich equation below is \\(f(x) = \\sin(k\\cdot x)\\) a solution of (\\(k > 1\\))?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f'(x) = k^2 \\cdot f(x)\\)\n \n \n\n\n \n \n \n \n \\(f'(x) = -k^2 \\cdot f(x)\\)\n \n \n\n\n \n \n \n \n \\(f''(x) = -k^2 \\cdot f(x)\\)\n \n \n\n\n \n \n \n \n \\(f''(x) = k^2 \\cdot f(x)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x) = e^{k\\cdot x}\\), \\(k > 1\\). Which equation below is \\(f(x)\\) a solution of?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f''(x) = -k^2 \\cdot f(x)\\)\n \n \n\n\n \n \n \n \n \\(f'(x) = k^2 \\cdot f(x)\\)\n \n \n\n\n \n \n \n \n \\(f'(x) = -k^2 \\cdot f(x)\\)\n \n \n\n\n \n \n \n \n \\(f''(x) = k^2 \\cdot f(x)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThere are \\(6\\) trig functions. The derivatives of \\(\\sin(x)\\) and \\(\\cos(x)\\) should be memorized. The others can be derived if not memorized using the quotient rule or chain rule.\nWhat is \\([\\tan(x)]'\\)? (Use \\(\\tan(x) = \\sin(x)/\\cos(x)\\).)\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\sec^2(x)\\)\n \n \n\n\n \n \n \n \n \\(-\\csc(x)\\cot(x)\\)\n \n \n\n\n \n \n \n \n \\(-\\csc^2(x)\\)\n \n \n\n\n \n \n \n \n \\(\\sec(x)\\tan(x)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is \\([\\cot(x)]'\\)? (Use \\(\\cot(x) = \\cos(x)/\\sin(x)\\).)\n\n\n\n \n \n \n \n \n \n \n \n \n \\(-\\csc(x)\\cot(x)\\)\n \n \n\n\n \n \n \n \n \\(\\sec(x)\\tan(x)\\)\n \n \n\n\n \n \n \n \n \\(-\\csc^2(x)\\)\n \n \n\n\n \n \n \n \n \\(\\sec^2(x)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is \\([\\sec(x)]'\\)? 
(Use \\(\\sec(x) = 1/\\cos(x)\\).)\n\n\n\n \n \n \n \n \n \n \n \n \n \\(-\\csc^2(x)\\)\n \n \n\n\n \n \n \n \n \\(\\sec(x)\\tan(x)\\)\n \n \n\n\n \n \n \n \n \\(\\sec^2(x)\\)\n \n \n\n\n \n \n \n \n \\(-\\csc(x)\\cot(x)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is \\([\\csc(x)]'\\)? (Use \\(\\csc(x) = 1/\\sin(x)\\).)\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\sec^2(x)\\)\n \n \n\n\n \n \n \n \n \\(\\sec(x)\\tan(x)\\)\n \n \n\n\n \n \n \n \n \\(-\\csc(x)\\cot(x)\\)\n \n \n\n\n \n \n \n \n \\(-\\csc^2(x)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider this picture of composition:\n\n\n\n\n\nThe right graph is of \\(g(x) = \\exp(x)\\) at \\(x=1\\), the left graph of \\(f(x) = \\sin(x)\\) rotated \\(90\\) degrees counter-clockwise. Chasing the arrows shows graphically how \\(f(g(1))\\) can be computed. The nearby values \\(f(g(1+h))\\) are using the tangent line of \\(g\\) at \\(x-1\\) approximated by \\(f(g(1) + g'(1)\\cdot h)\\), as shown in the graph segment on the left.\nAssuming the approximation gets better for \\(h\\) close to \\(0\\), as it visually does, the derivative at \\(1\\) for \\(f(g(x))\\) should be given by this limit:\n\\[\n\\begin{align*}\n\\frac{d(f\\circ g)}{dx}\\mid_{x=1}\n&= \\lim_{h\\rightarrow 0} \\frac{f(g(1) + g'(1)h)-f(g(1))}{h}\\\\\n&= \\lim_{h\\rightarrow 0} \\frac{f(g(1) + g'(1)h)-f(g(1))}{h}\\\\\n&= \\lim_{h\\rightarrow 0} \\frac{f(g(1) + g'(1)h)-f(g(1))}{g'(1)h} \\cdot g'(1)\\\\\n&= \\lim_{h\\rightarrow 0} (f\\circ g)'(1) \\cdot g'(1).\n\\end{align*}\n\\]\nWhat limit law, described below assuming all limits exist. 
allows the last equals sign?\n\n\n\n \n \n \n \n \n \n \n \n \n The limit of a sum is the sum of the limits: \\(\\lim_{x\\rightarrow c}(au(x)+bv(x)) = a\\lim_{x\\rightarrow c}u(x) + b\\lim_{x\\rightarrow c}v(x)\\)\n \n \n\n\n \n \n \n \n The limit of a product is the product of the limits: \\(\\lim_{x\\rightarrow c}(u(x)\\cdot v(x)) = \\lim_{x\\rightarrow c}u(x) \\cdot \\lim_{x\\rightarrow c}v(x)\\)\n \n \n\n\n \n \n \n \n The limit of a composition (under assumptions on \\(v\\)): \\(\\lim_{x \\rightarrow c}u(v(x)) = \\lim_{w \\rightarrow \\lim_{x \\rightarrow c}v(x)} u(w)\\)."
},
{
"objectID": "derivatives/numeric_derivatives.html",
"href": "derivatives/numeric_derivatives.html",
"title": "23  Numeric derivatives",
"section": "",
"text": "This section uses these add-on packages:\nSymPy returns symbolic derivatives. Up to choices of simplification, these answers match those that would be derived by hand. This is useful when comparing with known answers and for seeing the structure of the answer. However, there are times we just want to work with the answer numerically. For that we have other options within Julia. We discuss approximate derivatives and automatic derivatives. The latter will find wide usage in these notes."
},
{
"objectID": "derivatives/numeric_derivatives.html#recap-on-derivatives-in-julia",
"href": "derivatives/numeric_derivatives.html#recap-on-derivatives-in-julia",
"title": "23  Numeric derivatives",
"section": "23.1 Recap on derivatives in Julia",
"text": "23.1 Recap on derivatives in Julia\nA quick summary for finding derivatives in Julia, as there are \\(3\\) different manners:\n\nSymbolic derivatives are found using diff from SymPy\nAutomatic derivatives are found using the notation f' using ForwardDiff.derivative\napproximate derivatives at a point, c, for a given h are found with (f(c+h)-f(c))/h.\n\nFor example, here all three are computed and compared:\n\nf(x) = exp(-x)*sin(x)\n\nc = pi\nh = 1e-8\n\nfp = diff(f(x),x)\n\nfp, fp(c), f'(c), (f(c+h) - f(c))/h\n\n(-exp(-x)*sin(x) + exp(-x)*cos(x), -exp(-pi), -0.043213918263772265, -0.04321391756900175)\n\n\n\n\n\n\n\n\nNote\n\n\n\nThe use of ' to find derivatives provided by CalculusWithJulia is convenient, and used extensively in these notes, but it needs to be noted that it does not conform with the generic meaning of ' within Julias wider package ecosystem and may cause issue with linear algebra operations; the symbol is meant for the adjoint of a matrix."
},
{
"objectID": "derivatives/numeric_derivatives.html#questions",
"href": "derivatives/numeric_derivatives.html#questions",
"title": "23  Numeric derivatives",
"section": "23.2 Questions",
"text": "23.2 Questions\n\nQuestion\nFind the derivative using a forward difference approximation of \\(f(x) = x^x\\) at the point \\(x=2\\) using h=0.1:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nUsing D or f' find the value using automatic differentiation\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nMathematically, as the value of h in the forward difference gets smaller the forward difference approximation gets better. On the computer, this is thwarted by floating point representation issues (in particular the error in subtracting two like-sized numbers in forming \\(f(x+h)-f(x)\\).)\nFor 1e-16 what is the error (in absolute value) in finding the forward difference approximation for the derivative of \\(\\sin(x)\\) at \\(x=0\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nRepeat for \\(x=\\pi/4\\):\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nQuestion\nLet \\(f(x) = x^x\\). Using D, find \\(f'(3)\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x) = \\lvert 1 - \\sqrt{1 + x}\\rvert\\). Using D, find \\(f'(3)\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x) = e^{\\sin(x)}\\). Using D, find \\(f'(3)\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor Julias airyai function find a numeric derivative using the forward difference. For \\(c=3\\) and \\(h=10^{-8}\\) find the forward difference approximation to \\(f'(3)\\) for the airyai function.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the rate of change with respect to time of the function \\(f(t)= 64 - 16t^2\\) at \\(t=1\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the rate of change with respect to height, \\(h\\), of \\(f(h) = 32h^3 - 62 h + 12\\) at \\(h=2\\)."
},
{
"objectID": "derivatives/symbolic_derivatives.html",
"href": "derivatives/symbolic_derivatives.html",
"title": "24  Symbolic derivatives",
"section": "",
"text": "using TermInterface\n\nThe ability to breakdown an expression into operations and their arguments is necessary when trying to apply the differentiation rules. Such rules are applied from the outside in. Identifying the proper “outside” function is usually most of the battle when finding derivatives.\nIn the following example, we provide a sketch of a framework to differentiate expressions by a chosen symbol to illustrate how the outer function drives the task of differentiation.\nThe Symbolics package provides native symbolic manipulation abilities for Julia, similar to SymPy, though without the dependence on Python. The TermInterface package, used by Symbolics, provides a generic interface for expression manipulation for this package that also is implemented for Julias expressions and symbols.\nAn expression is an unevaluated portion of code that for our purposes below contains other expressions, symbols, and numeric literals. They are held in the Expr type. A symbol, such as :x, is distinct from a string (e.g. \"x\") and is useful to the programmer to distinguish between the contents a variable points to from the name of the variable. Symbols are fundamental to metaprogramming in Julia. An expression is a specification of some set of statements to execute. A numeric literal is just a number.\nThe three main functions from TermInterface we leverage are istree, operation, and arguments. The operation function returns the “outside” function of an expression. For example:\n\noperation(:(sin(x)))\n\n:sin\n\n\nWe see the sin function, referred to by a symbol (:sin). The :(...) above quotes the argument, and does not evaluate it, hence x need not be defined above. (The : notation is used to create both symbols and expressions.)\nThe arguments are the terms that the outside function is called on. For our purposes there may be \\(1\\) (unary), \\(2\\) (binary), or more than \\(2\\) (nary) arguments. (We ignore zero-argument functions.) 
For example:\n\narguments(:(-x)), arguments(:(pi^2)), arguments(:(1 + x + x^2))\n\n(Any[:x], Any[:pi, 2], Any[1, :x, :(x ^ 2)])\n\n\n(The last one may be surprising, but all three arguments are passed to the + function.)\nHere we define a function to decide the arity of an expression based on the number of arguments it is called with:\n\nfunction arity(ex)\n n = length(arguments(ex))\n n == 1 ? Val(:unary) :\n n == 2 ? Val(:binary) : Val(:nary)\nend\n\narity (generic function with 1 method)\n\n\nDifferentiation must distinguish between expressions, variables, and numbers. Mathematically expressions have an “outer” function, whereas variables and numbers can be directly differentiated. The istree function in TermInterface returns true when passed an expression, and false when passed a symbol or numeric literal. The latter two may be distinguished by isa(..., Symbol).\nHere we create a function, D, that, when it encounters an expression, dispatches to a specific method of D based on the outer operation and arity; otherwise, if it encounters a symbol or a numeric literal, it does the differentiation directly:\n\nfunction D(ex, var=:x)\n if istree(ex)\n op, args = operation(ex), arguments(ex)\n D(Val(op), arity(ex), args, var)\n elseif isa(ex, Symbol) && ex == var\n 1\n else\n 0\n end\nend\n\nD (generic function with 2 methods)\n\n\nNow to develop methods for D for different “outside” functions and arities.\nAddition can be unary (:(+x) is a valid quoting, even if it might simplify to the symbol :x when evaluated), binary, or nary. Here we implement the sum rule:\n\nD(::Val{:+}, ::Val{:unary}, args, var) = D(first(args), var)\n\nfunction D(::Val{:+}, ::Val{:binary}, args, var)\n a′, b′ = D.(args, var)\n :($a′ + $b′)\nend\n\nfunction D(::Val{:+}, ::Val{:nary}, args, var)\n a′s = D.(args, var)\n :(+($a′s...))\nend\n\nD (generic function with 5 methods)\n\n\nThe args are always held in a container, so the unary method must pull out the first one. 
The binary case should read as: apply D to each of the two arguments, and then create a quoted expression containing the sum of the results. The dollar signs interpolate into the quoting. (The “primes” are unicode notation achieved through \\prime[tab] and not operations.) The nary case does something similar, only it uses splatting to produce the sum.\nSubtraction must also be implemented in a similar manner, but not for the nary case:\n\nfunction D(::Val{:-}, ::Val{:unary}, args, var)\n a′ = D(first(args), var)\n :(-$a′)\nend\nfunction D(::Val{:-}, ::Val{:binary}, args, var)\n a′, b′ = D.(args, var)\n :($a′ - $b′)\nend\n\nD (generic function with 7 methods)\n\n\nThe product rule is similar to addition, in that \\(3\\) cases are considered:\n\nD(op::Val{:*}, ::Val{:unary}, args, var) = D(first(args), var)\n\nfunction D(::Val{:*}, ::Val{:binary}, args, var)\n a, b = args\n a′, b′ = D.(args, var)\n :($a′ * $b + $a * $b′)\nend\n\nfunction D(op::Val{:*}, ::Val{:nary}, args, var)\n a, bs... = args\n b = :(*($(bs...)))\n a′ = D(a, var)\n b′ = D(b, var)\n :($a′ * $b + $a * $b′)\nend\n\nD (generic function with 10 methods)\n\n\nThe nary case above just peels off the first factor and then uses the binary product rule.\nDivision is only a binary operation, so here we have the quotient rule:\n\nfunction D(::Val{:/}, ::Val{:binary}, args, var)\n u,v = args\n u′, v′ = D(u, var), D(v, var)\n :( ($u′*$v - $u*$v′)/$v^2 )\nend\n\nD (generic function with 11 methods)\n\n\nPowers are handled a bit differently. The power rule would require checking that the exponent does not contain the variable of differentiation; exponential derivatives would require checking that the base does not contain the variable of differentiation. 
Trying to implement both would be tedious, so we use the fact that \\(x = \\exp(\\log(x))\\) (for x in the domain of log, more care is necessary if x is negative) to differentiate:\n\nfunction D(::Val{:^}, ::Val{:binary}, args, var)\n a, b = args\n D(:(exp($b*log($a))), var) # a > 0 assumed here\nend\n\nD (generic function with 12 methods)\n\n\nThat leaves the task of defining a rule to differentiate both exp and log. We do so with unary definitions. In the following we also implement sin and cos rules:\n\nfunction D(::Val{:exp}, ::Val{:unary}, args, var)\n a = first(args)\n a′ = D(a, var)\n :(exp($a) * $a′)\nend\n\nfunction D(::Val{:log}, ::Val{:unary}, args, var)\n a = first(args)\n a′ = D(a, var)\n :(1/$a * $a′)\nend\n\nfunction D(::Val{:sin}, ::Val{:unary}, args, var)\n a = first(args)\n a′ = D(a, var)\n :(cos($a) * $a′)\nend\n\nfunction D(::Val{:cos}, ::Val{:unary}, args, var)\n a = first(args)\n a′ = D(a, var)\n :(-sin($a) * $a′)\nend\n\nD (generic function with 16 methods)\n\n\nThe pattern is similar for each. The $a′ factor is needed due to the chain rule. The above illustrates the simple pattern necessary to add a derivative rule for a function. 
More rules could be added, but for this example the above will suffice, as now the system is ready to be put to work.\n\nex₁ = :(x + 2/x)\nD(ex₁, :x)\n\n:(1 + (0 * x - 2 * 1) / x ^ 2)\n\n\nThe output does not simplify, so some work is needed to identify 1 - 2/x^2 as the answer.\n\nex₂ = :( (x + sin(x))/sin(x))\nD(ex₂, :x)\n\n:(((1 + cos(x) * 1) * sin(x) - (x + sin(x)) * (cos(x) * 1)) / sin(x) ^ 2)\n\n\nAgain, simplification is not performed.\nFinally, we have a second derivative taken below:\n\nex₃ = :(sin(x) - x - x^3/6)\nD(D(ex₃, :x), :x)\n\n:((((-(sin(x)) * 1) * 1 + cos(x) * 0) - 0) - (((((exp(3 * log(x)) * (0 * log(x) + 3 * ((1 / x) * 1))) * (0 * log(x) + 3 * ((1 / x) * 1)) + exp(3 * log(x)) * ((0 * log(x) + 0 * ((1 / x) * 1)) + (0 * ((1 / x) * 1) + 3 * (((0 * x - 1 * 1) / x ^ 2) * 1 + (1 / x) * 0)))) * 6 + (exp(3 * log(x)) * (0 * log(x) + 3 * ((1 / x) * 1))) * 0) - ((exp(3 * log(x)) * (0 * log(x) + 3 * ((1 / x) * 1))) * 0 + x ^ 3 * 0)) * 6 ^ 2 - ((exp(3 * log(x)) * (0 * log(x) + 3 * ((1 / x) * 1))) * 6 - x ^ 3 * 0) * (exp(2 * log(6)) * (0 * log(6) + 2 * ((1 / 6) * 0)))) / (6 ^ 2) ^ 2)\n\n\nThe length of the expression should lead to further appreciation for the simplification steps taken when doing such a computation by hand."
},
{
"objectID": "derivatives/mean_value_theorem.html",
"href": "derivatives/mean_value_theorem.html",
"title": "25  The mean value theorem for differentiable functions.",
"section": "",
"text": "This section uses these add-on packages:\nA function is continuous at \\(c\\) if \\(f(c+h) - f(c) \\rightarrow 0\\) as \\(h\\) goes to \\(0\\). We can write that as \\(f(c+h) - f(x) = \\epsilon_h\\), with \\(\\epsilon_h\\) denoting a function going to \\(0\\) as \\(h \\rightarrow 0\\). With this notion, differentiability could be written as \\(f(c+h) - f(c) - f'(c)h = \\epsilon_h \\cdot h\\). This is clearly a more demanding requirement that mere continuity at \\(c\\).\nWe defined a function to be continuous on an interval \\(I=(a,b)\\) if it was continuous at each point \\(c\\) in \\(I\\). Similarly, we define a function to be differentiable on the interval \\(I\\) it it is differentiable at each point \\(c\\) in \\(I\\).\nThis section looks at properties of differentiable functions. As there is a more stringent definition, perhaps more properties are a consequence of the definition."
},
{
"objectID": "derivatives/mean_value_theorem.html#differentiable-is-more-restrictive-than-continuous.",
"href": "derivatives/mean_value_theorem.html#differentiable-is-more-restrictive-than-continuous.",
"title": "25  The mean value theorem for differentiable functions.",
"section": "25.1 Differentiable is more restrictive than continuous.",
"text": "25.1 Differentiable is more restrictive than continuous.\nLet \\(f\\) be a differentiable function on \\(I=(a,b)\\). We see that \\(f(c+h) - f(c) = f'(c)h + \\epsilon_h\\cdot h = h(f'(c) + \\epsilon_h)\\). The right hand side will clearly go to \\(0\\) as \\(h\\rightarrow 0\\), so \\(f\\) will be continuous. In short:\n\nA differentiable function on \\(I=(a,b)\\) is continuous on \\(I\\).\n\nIs it possible that all continuous functions are differentiable?\nThe fact that the derivative is related to the tangent lines slope might give an indication that this wont be the case - we just need a function which is continuous but has a point with no tangent line. The usual suspect is \\(f(x) = \\lvert x\\rvert\\) at \\(0\\).\n\nf(x) = abs(x)\nplot(f, -1,1)\n\n\n\n\nWe can see formally that the secant line expression will not have a limit when \\(c=0\\) (the left limit is \\(-1\\), the right limit \\(1\\)). But more insight is gained by looking a the shape of the graph. At the origin, the graph always is vee-shaped. There is no linear function that approximates this function well. The function is just not smooth enough, as it has a kink.\nThere are other functions that have kinks. These are often associated with powers. For example, at \\(x=0\\) this function will not have a derivative:\n\nf(x) = (x^2)^(1/3)\nplot(f, -1, 1)\n\n\n\n\nOther functions have tangent lines that become vertical. The natural slope would be \\(\\infty\\), but this isnt a limiting answer (except in the extended sense we dont apply to the definition of derivatives). A candidate for this case is the cube root function:\n\nplot(cbrt, -1, 1)\n\n\n\n\nThe derivative at \\(0\\) would need to be \\(+\\infty\\) to match the graph. 
This is implied by the formula for the derivative from the power rule: \\(f'(x) = 1/3 \\cdot x^{-2/3}\\), which has a vertical asymptote at \\(x=0\\).\n\n\n\n\n\n\nNote\n\n\n\nThe cbrt function is used above, instead of f(x) = x^(1/3), as the latter is not defined for negative x. Though it can be for the exact power 1/3, it can't be for an exact power like 1/2. This means the value of the argument is important in determining the type of the output - and not just the type of the argument. Having type-stable functions is part of the magic of making Julia run fast, so x^c is not defined for negative x and most floating point exponents.\n\n\nLest you think that continuous functions always have derivatives except perhaps at exceptional points, this isn't the case. The functions used to model the stock market are continuous but have no points where they are differentiable."
},
{
"objectID": "derivatives/mean_value_theorem.html#derivatives-and-maxima.",
"href": "derivatives/mean_value_theorem.html#derivatives-and-maxima.",
"title": "25  The mean value theorem for differentiable functions.",
"section": "25.2 Derivatives and maxima.",
"text": "25.2 Derivatives and maxima.\nWe have defined an absolute maximum of \\(f(x)\\) over an interval to be a value \\(f(c)\\) for a point \\(c\\) in the interval that is as large as any other value in the interval. Just specifying a function and an interval does not guarantee an absolute maximum, but specifying a continuous function and a closed interval does, by the extreme value theorem.\n\nA relative maximum: We say \\(f(x)\\) has a relative maximum at \\(c\\) if there exists some interval \\(I=(a,b)\\) with \\(a < c < b\\) for which \\(f(c)\\) is an absolute maximum for \\(f\\) and \\(I\\).\n\nThe difference is a bit subtle, for an absolute maximum the interval must also be specified, for a relative maximum there just needs to exist some interval, possibly really small, though it must be bigger than a point.\n\n\n\n\n\n\nNote\n\n\n\nA hiker can appreciate the difference. A relative maximum would be the crest of any hill, but an absolute maximum would be the summit.\n\n\nWhat does this have to do with derivatives?\nFermat, perhaps with insight from Kepler, was interested in maxima of polynomial functions. As a warm up, he considered a line segment \\(AC\\) and a point \\(E\\) with the task of choosing \\(E\\) so that \\((E-A) \\times (C-A)\\) being a maximum. We might recognize this as finding the maximum of \\(f(x) = (x-A)\\cdot(C-x)\\) for some \\(A < C\\). Geometrically, we know this to be at the midpoint, as the equation is a parabola, but Fermat was interested in an algebraic solution that led to more generality.\nHe takes \\(b=AC\\) and \\(a=AE\\). Then the product is \\(a \\cdot (b-a) = ab - a^2\\). He then perturbs this writing \\(AE=a+e\\), then this new product is \\((a+e) \\cdot (b - a - e)\\). Equating the two, and canceling like terms gives \\(be = 2ae + e^2\\). 
He cancels the \\(e\\) and basically comments that this must be true for all \\(e\\) even as \\(e\\) goes to \\(0\\), so \\(b = 2a\\) and the value is at the midpoint.\nIn a more modern approach, this would be the same as looking at this expression:\n\\[\n\\frac{f(x+e) - f(x)}{e} = 0.\n\\]\nWorking on the left hand side, for non-zero \\(e\\) we can cancel the common \\(e\\) terms, and then let \\(e\\) become \\(0\\). This becomes a problem in solving \\(f'(x)=0\\). Fermat could compute the derivative for any polynomial by taking a limit, a task we would do now by the power rule and the sum and difference of function rules.\nThis insight holds for other types of functions:\n\nIf \\(f(c)\\) is a relative maximum then either \\(f'(c) = 0\\) or the derivative at \\(c\\) does not exist.\n\nWhen the derivative exists, this says the tangent line is flat. (If it had a slope, then the function would increase by moving left or right, as appropriate, a point we pursue later.)\nFor a continuous function \\(f(x)\\), call a point \\(c\\) in the domain of \\(f\\) where either \\(f'(c)=0\\) or the derivative does not exist a critical point.\nWe can combine Bolzano's extreme value theorem with Fermat's insight to get the following:\n\nA continuous function on \\([a,b]\\) has an absolute maximum that occurs at a critical point \\(c\\), \\(a < c < b\\), or an endpoint, \\(a\\) or \\(b\\).\n\nA similar statement holds for an absolute minimum. This gives a restricted set of places to look for absolute maximum and minimum values - all the critical points and the endpoints.\nIt is also the case that all relative extrema occur at a critical point; however, not all critical points correspond to relative extrema. We will see derivative tests that help characterize when that occurs.\n\n\n\nImage number 32 from L'Hopital's calculus book (the first) showing that at a relative minimum, the tangent line is parallel to the \\(x\\)-axis. 
This of course is true, by Fermat's observation, when the tangent line is well defined.\n\n\n\n25.2.1 Numeric derivatives\nThe ForwardDiff package provides a means to numerically compute derivatives without approximations at a point. In CalculusWithJulia this is extended to find derivatives of functions, and the ' notation is overloaded for function objects. Hence these two give nearly identical answers, the difference being only the type of number used:\n\nf(x) = 3x^3 - 2x\nfp(x) = 9x^2 - 2\nf'(3), fp(3)\n\n(79.0, 79)\n\n\n\nExample\nFor the function \\(f(x) = x^2 \\cdot e^{-x}\\) find the absolute maximum over the interval \\([0, 5]\\).\nWe have that \\(f(x)\\) is continuous on the closed interval of the question, and in fact differentiable on \\((0,5)\\), so any critical point will be a zero of the derivative. We can check for these with:\n\nf(x) = x^2 * exp(-x)\ncps = find_zeros(f', -1, 6) # find_zeros in `Roots`\n\n2-element Vector{Float64}:\n 0.0\n 1.9999999999999998\n\n\nWe get that \\(0\\) and \\(2\\) are critical points. The endpoints are \\(0\\) and \\(5\\). So the absolute maximum over this interval is either at \\(0\\), \\(2\\), or \\(5\\):\n\nf(0), f(2), f(5)\n\n(0.0, 0.5413411329464508, 0.16844867497713667)\n\n\nWe see that \\(f(2)\\) is then the maximum.\nA few things. First, find_zeros can miss some roots, in particular endpoints and roots that just touch \\(0\\). We should graph to verify it didn't. Second, it can sometimes be easier to check the values using the “dot” notation. If f, a, b are the function and the interval endpoints, then this would typically follow this pattern:\n\na, b = 0, 5\ncritical_pts = find_zeros(f', a, b)\nf.(critical_pts), f(a), f(b)\n\n([0.0, 0.5413411329464508], 0.0, 0.16844867497713667)\n\n\nFor this problem, we have the left endpoint repeated, but in general this won't be a point where the derivative is zero.\nAs an aside, the output above is not a single container. 
To achieve that, the values can be combined before the broadcasting:\n\nf.(vcat(a, critical_pts, b))\n\n4-element Vector{Float64}:\n 0.0\n 0.0\n 0.5413411329464508\n 0.16844867497713667\n\n\n\n\nExample\nFor the function \\(g(x) = e^x\\cdot(x^3 - x)\\) find the absolute maximum over the interval \\([0, 2]\\).\nWe follow the same pattern. Since \\(g(x)\\) is continuous on the closed interval and differentiable on the open interval we know that the absolute maximum must occur at an endpoint (\\(0\\) or \\(2\\)) or a critical point where \\(g'(c)=0\\). To solve for these, we have again:\n\ng(x) = exp(x) * (x^3 - x)\ngcps = find_zeros(g', 0, 2)\n\n1-element Vector{Float64}:\n 0.675130870566646\n\n\nAnd checking values gives:\n\ng.(vcat(0, gcps, 2))\n\n3-element Vector{Float64}:\n 0.0\n -0.7216901289290208\n 44.3343365935839\n\n\nHere the maximum occurs at an endpoint. The critical point \\(c=0.67\\dots\\) does not produce a maximum value. Rather \\(g(0.67\\dots)\\) is an absolute minimum.\n\n\n\n\n\n\nNote\n\n\n\n\n\n\nAbsolute minimum We haven't discussed the parallel problem of absolute minima over a closed interval. By considering the function \\(h(x) = - f(x)\\), we see that anything true for an absolute maximum should hold in a related manner for an absolute minimum; in particular, an absolute minimum on a closed interval will only occur at a critical point or an endpoint."
},
{
"objectID": "derivatives/mean_value_theorem.html#rolles-theorem",
"href": "derivatives/mean_value_theorem.html#rolles-theorem",
"title": "25  The mean value theorem for differentiable functions.",
"section": "25.3 Rolles theorem",
"text": "25.3 Rolles theorem\nLet \\(f(x)\\) be differentiable on \\((a,b)\\) and continuous on \\([a,b]\\). Then the absolute maximum occurs at an endpoint or where the derivative is \\(0\\) (as the derivative is always defined). This gives rise to:\n\nRolles theorem: For \\(f\\) differentiable on \\((a,b)\\) and continuous on \\([a,b]\\), if \\(f(a)=f(b)\\), then there exists some \\(c\\) in \\((a,b)\\) with \\(f'(c) = 0\\).\n\nThis modest observation opens the door to many relationships between a function and its derivative, as it ties the two together in one statement.\nTo see why Rolles theorem is true, we assume that \\(f(a)=0\\), otherwise consider \\(g(x)=f(x)-f(a)\\). By the extreme value theorem, there must be an absolute maximum and minimum. If \\(f(x)\\) is ever positive, then the absolute maximum occurs in \\((a,b)\\) - not at an endpoint - so at a critical point where the derivative is \\(0\\). Similarly if \\(f(x)\\) is ever negative. Finally, if \\(f(x)\\) is just \\(0\\), then take any \\(c\\) in \\((a,b)\\).\nThe statement in Rolles theorem speaks to existence. It doesnt give a recipe to find \\(c\\). It just guarantees that there is one or more values in the interval \\((a,b)\\) where the derivative is \\(0\\) if we assume differentiability on \\((a,b)\\) and continuity on \\([a,b]\\).\n\nExample\nLet \\(j(x) = e^x \\cdot x \\cdot (x-1)\\). We know \\(j(0)=0\\) and \\(j(1)=0\\), so on \\([0,1]\\). Rolles theorem guarantees that we can find at least one answer (unless numeric issues arise):\n\nj(x) = exp(x) * x * (x-1)\nfind_zeros(j', 0, 1)\n\n1-element Vector{Float64}:\n 0.6180339887498948\n\n\nThis graph illustrates the lone value for \\(c\\) for this problem"
},
{
"objectID": "derivatives/mean_value_theorem.html#the-mean-value-theorem",
"href": "derivatives/mean_value_theorem.html#the-mean-value-theorem",
"title": "25  The mean value theorem for differentiable functions.",
"section": "25.4 The mean value theorem",
"text": "25.4 The mean value theorem\nWe are driving south and in one hour cover 70 miles. If the speed limit is 65 miles per hour, were we ever speeding? Well we averaged more than the speed limit so we know the answer is yes, but why? Speeding would mean our instantaneous speed was more than the speed limit, yet we only know for sure our average speed was more than the speed limit. The mean value tells us that if some conditions are met, then at some point (possibly more than one) we must have that our instantaneous speed is equal to our average speed.\nThe mean value theorem is a direct generalization of Rolles theorem.\n\nMean value theorem: Let \\(f(x)\\) be differentiable on \\((a,b)\\) and continuous on \\([a,b]\\). Then there exists a value \\(c\\) in \\((a,b)\\) where \\(f'(c) = (f(b) - f(a)) / (b - a)\\).\n\nThis says for any secant line between \\(a < b\\) there will be a parallel tangent line at some \\(c\\) with \\(a < c < b\\) (all provided \\(f\\) is differentiable on \\((a,b)\\) and continuous on \\([a,b]\\)).\nThis graph illustrates the theorem. The orange line is the secant line. A parallel line tangent to the graph is guaranteed by the mean value theorem. In this figure, there are two such lines, rendered using red.\n\n\n\n\n\nLike Rolles theorem this is a guarantee that something exists, not a recipe to find it. 
In fact, the mean value theorem is just Rolle's theorem applied to:\n\\[\ng(x) = f(x) - (f(a) + (f(b) - f(a)) / (b-a) \\cdot (x-a))\n\\]\nThat is, the function \\(f(x)\\), minus the secant line between \\((a,f(a))\\) and \\((b, f(b))\\).\n\n\nJXG = require(\"jsxgraph\");\n\nboard = JXG.JSXGraph.initBoard('jsxgraph', {boundingbox: [-5, 10, 7, -6], axis:true});\np = [\n board.create('point', [-1,-2], {size:2}),\n board.create('point', [6,5], {size:2}),\n board.create('point', [-0.5,1], {size:2}),\n board.create('point', [3,3], {size:2})\n];\nf = JXG.Math.Numerics.lagrangePolynomial(p);\ngraph = board.create('functiongraph', [f,-10, 10]);\n\ng = function(x) {\n return JXG.Math.Numerics.D(f)(x)-(p[1].Y()-p[0].Y())/(p[1].X()-p[0].X());\n};\n\nr = board.create('glider', [\n function() { return JXG.Math.Numerics.root(g,(p[0].X()+p[1].X())*0.5); },\n function() { return f(JXG.Math.Numerics.root(g,(p[0].X()+p[1].X())*0.5)); },\n graph], {name:' ',size:4,fixed:true});\nboard.create('tangent', [r], {strokeColor:'#ff0000'});\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nline = board.create('line',[p[0],p[1]],{strokeColor:'#ff0000',dash:1});\n\n\n\n\n\n\n\nThis interactive example can also be found at jsxgraph. It shows a cubic polynomial fit to the \\(4\\) adjustable points labeled A through D. The secant line is drawn between points A and B with a dashed line. A tangent line with the same slope as the secant line is identified at a point \\((\\alpha, f(\\alpha))\\) where \\(\\alpha\\) is between the points A and B. That this can always be done is a consequence of the mean value theorem.\n\nExample\nThe mean value theorem is an extremely useful tool to relate properties of a function with properties of its derivative, as, like Rolle's theorem, it includes both \\(f\\) and \\(f'\\) in its statement.\nFor example, suppose we have a function \\(f(x)\\) and we know that the derivative is always \\(0\\). 
What can we say about the function?\nWell, constant functions have derivatives that are constantly \\(0\\). But do others? We will see the answer is no: if a function has a zero derivative in \\((a,b)\\) it must be a constant. We can readily see that if \\(f\\) is a polynomial function this is the case, as differentiating a polynomial function gives zero only if all its non-constant terms have coefficient \\(0\\), which would mean the polynomial is a constant. But polynomials are not representative of all functions, and so a proof requires a bit more effort.\nSuppose it is known that \\(f'(x)=0\\) on some interval \\(I\\) and we take any \\(a < b\\) in \\(I\\). Since \\(f'(x)\\) always exists, \\(f(x)\\) is always differentiable, and hence always continuous. So on \\([a,b]\\) the conditions of the mean value theorem apply. That is, there is a \\(c\\) in \\((a,b)\\) with \\((f(b) - f(a)) / (b-a) = f'(c) = 0\\). But this would imply \\(f(b) - f(a)=0\\). That is, \\(f(x)\\) is a constant, as for any \\(a\\) and \\(b\\), we see \\(f(a)=f(b)\\).\n\n\n25.4.1 The Cauchy mean value theorem\nCauchy offered an extension to the mean value theorem above. Suppose both \\(f\\) and \\(g\\) satisfy the conditions of the mean value theorem on \\([a,b]\\) with \\(g(b)-g(a) \\neq 0\\). Then there exists at least one \\(c\\) with \\(a < c < b\\) such that\n\\[\nf'(c) = g'(c) \\cdot \\frac{f(b) - f(a)}{g(b) - g(a)}.\n\\]\nThe proof follows by considering \\(h(x) = f(x) - r\\cdot g(x)\\), with \\(r\\) chosen so that \\(h(a)=h(b)\\). Then Rolle's theorem applies, so there is a \\(c\\) with \\(h'(c)=0\\), so \\(f'(c) = r g'(c)\\), but \\(r\\) can be seen to be \\((f(b)-f(a))/(g(b)-g(a))\\), which proves the theorem.\nLetting \\(g(x) = x\\) demonstrates that the mean value theorem is a special case.\n\nExample\nSuppose \\(f(x)\\) and \\(g(x)\\) satisfy the Cauchy mean value theorem on \\([0,x]\\), \\(g'(x)\\) is non-zero on \\((0,x)\\), and \\(f(0)=g(0)=0\\). 
Then we have:\n\\[\n\\frac{f(x) - f(0)}{g(x) - g(0)} = \\frac{f(x)}{g(x)} = \\frac{f'(c)}{g'(c)},\n\\]\nfor some \\(c\\) in \\([0,x]\\). If \\(\\lim_{x \\rightarrow 0} f'(x)/g'(x) = L\\), then the right hand side will have a limit of \\(L\\), and hence the left hand side will too. That is, when the limit exists, we have under these conditions that \\(\\lim_{x\\rightarrow 0}f(x)/g(x) = \\lim_{x\\rightarrow 0}f'(x)/g'(x)\\).\nThis could be used to prove the limit of \\(\\sin(x)/x\\) as \\(x\\) goes to \\(0\\) just by showing the limit of \\(\\cos(x)/1\\) is \\(1\\), as is known by continuity.\n\n\n\n25.4.2 Visualizing the Cauchy mean value theorem\nThe Cauchy mean value theorem can be visualized in terms of a tangent line and a parallel secant line in a similar manner as the mean value theorem as long as a parametric graph is used. A parametric graph plots the points \\((g(t), f(t))\\) for some range of \\(t\\). That is, it graphs both functions at the same time. The following illustrates the construction of such a graph:\n\n\n \n Illustration of parametric graph of \\((g(t), f(t))\\) for \\(-\\pi/2 \\leq t \\leq \\pi/2\\) with \\(g(x) = \\sin(x)\\) and \\(f(x) = x\\). Each point on the graph is from some value \\(t\\) in the interval. We can see that the graph goes through \\((0,0)\\) as that is when \\(t=0\\). As well, it must go through \\((1, \\pi/2)\\) as that is when \\(t=\\pi/2\\)\n \n \n\n\n\nWith \\(g(x) = \\sin(x)\\) and \\(f(x) = x\\), we can take \\(I=[a,b] = [0, \\pi/2]\\). In the figure below, the secant line is drawn in red which connects \\((g(a), f(a))\\) with the point \\((g(b), f(b))\\), and hence has slope \\(\\Delta f/\\Delta g\\). The parallel lines drawn show the tangent lines with slope \\(f'(c)/g'(c)\\). Two exist for this problem; the mean value theorem guarantees at least one will."
},
{
"objectID": "derivatives/mean_value_theorem.html#questions",
"href": "derivatives/mean_value_theorem.html#questions",
"title": "25  The mean value theorem for differentiable functions.",
"section": "25.5 Questions",
"text": "25.5 Questions\n\nQuestion\nRolles theorem is a guarantee of a value, but does not provide a recipe to find it. For the function \\(1 - x^2\\) over the interval \\([-5,5]\\), find a value \\(c\\) that satisfies the result.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe extreme value theorem is a guarantee of a value, but does not provide a recipe to find it. For the function \\(f(x) = \\sin(x)\\) on \\(I=[0, \\pi]\\) find a value \\(c\\) satisfying the theorem for an absolute maximum.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe extreme value theorem is a guarantee of a value, but does not provide a recipe to find it. For the function \\(f(x) = \\sin(x)\\) on \\(I=[\\pi, 3\\pi/2]\\) find a value \\(c\\) satisfying the theorem for an absolute maximum.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe mean value theorem is a guarantee of a value, but does not provide a recipe to find it. For \\(f(x) = x^2\\) on \\([0,2]\\) find a value of \\(c\\) satisfying the theorem.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe Cauchy mean value theorem is a guarantee of a value, but does not provide a recipe to find it. For \\(f(x) = x^3\\) and \\(g(x) = x^2\\) find a value \\(c\\) in the interval \\([1, 2]\\)\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWill the function \\(f(x) = x + 1/x\\) satisfy the conditions of the mean value theorem over \\([-1/2, 1/2]\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nJust as it is a fact that \\(f'(x) = 0\\) (for all \\(x\\) in \\(I\\)) implies \\(f(x)\\) is a constant, so too is it a fact that if \\(f'(x) = g'(x)\\) that \\(f(x) - g(x)\\) is a constant. 
What function would you consider, if you wanted to prove this with the mean value theorem?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(h(x) = f'(x) - g'(x)\\)\n \n \n\n\n \n \n \n \n \\(h(x) = f(x) - (f(b) - f(a)) / (b - a) \\cdot g(x)\\)\n \n \n\n\n \n \n \n \n \\(h(x) = f(x) - g(x)\\)\n \n \n\n\n \n \n \n \n \\(h(x) = f(x) - (f(b) - f(a)) / (b - a)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose \\(f''(x) > 0\\) on \\(I\\). Why is it impossible that \\(f'(x) = 0\\) at more than one value in \\(I\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n It isn't. The function \\(f(x) = x^2\\) has two zeros and \\(f''(x) = 2 > 0\\)\n \n \n\n\n \n \n \n \n By the mean value theorem, we must have \\(f'(b) - f'(a) > 0\\) when ever \\(b > a\\). This means \\(f'(x)\\) is increasing and can't double back to have more than one zero.\n \n \n\n\n \n \n \n \n By the Rolle's theorem, there is at least one, and perhaps more\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x) = 1/x\\). For \\(0 < a < b\\), find \\(c\\) so that \\(f'(c) = (f(b) - f(a)) / (b-a)\\).\n\n\n\n \n \n \n \n \n \n \n \n \n \\(c = \\sqrt{ab}\\)\n \n \n\n\n \n \n \n \n \\(c = 1 / (1/a + 1/b)\\)\n \n \n\n\n \n \n \n \n \\(c = (a+b)/2\\)\n \n \n\n\n \n \n \n \n \\(c = a + (\\sqrt{5} - 1)/2 \\cdot (b-a)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x) = x^2\\). For \\(0 < a < b\\), find \\(c\\) so that \\(f'(c) = (f(b) - f(a)) / (b-a)\\).\n\n\n\n \n \n \n \n \n \n \n \n \n \\(c = (a+b)/2\\)\n \n \n\n\n \n \n \n \n \\(c = \\sqrt{ab}\\)\n \n \n\n\n \n \n \n \n \\(c = 1 / (1/a + 1/b)\\)\n \n \n\n\n \n \n \n \n \\(c = a + (\\sqrt{5} - 1)/2 \\cdot (b-a)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIn an example, we used the fact that if \\(0 < c < x\\), for some \\(c\\) given by the mean value theorem and \\(f(x)\\) goes to \\(0\\) as \\(x\\) goes to zero then \\(f(c)\\) will also go to zero. 
Suppose we say that \\(c=g(x)\\) for some function \\(c\\).\nWhy is it known that \\(g(x)\\) goes to \\(0\\) as \\(x\\) goes to zero (from the right)?\n\n\n\n \n \n \n \n \n \n \n \n \n As \\(f(x)\\) goes to zero by Rolle's theorem it must be that \\(g(x)\\) goes to \\(0\\).\n \n \n\n\n \n \n \n \n The squeeze theorem applies, as \\(0 < g(x) < x\\).\n \n \n\n\n \n \n \n \n This follows by the extreme value theorem, as there must be some \\(c\\) in \\([0,x]\\).\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nSince \\(g(x)\\) goes to zero, why is it true that if \\(f(x)\\) goes to \\(L\\) as \\(x\\) goes to zero that \\(f(g(x))\\) must also have a limit \\(L\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n The squeeze theorem applies, as \\(0 < g(x) < x\\)\n \n \n\n\n \n \n \n \n It isn't true. The limit must be 0\n \n \n\n\n \n \n \n \n This follows from the limit rules for composition of functions"
},
{
"objectID": "derivatives/optimization.html",
"href": "derivatives/optimization.html",
"title": "26  Optimization",
"section": "",
"text": "This section uses these add-on packages:\nA basic application of calculus is to answer questions which relate to the largest or smallest a function can be given some constraints.\nFor example,\nThe main tool is the extreme value theorem of Bolzano and Fermats theorem about critical points, which combined say:\nThough not all of our problems lend themselves to a description of a continuous function on a closed interval, if they do, we have an algorithmic prescription to find the absolute extrema of a function:\nWith the computer we can take some shortcuts, as we will be able to graph our function to see where the extreme values will be, and in particular if they occur at end points or critical points."
},
{
"objectID": "derivatives/optimization.html#fixed-perimeter-and-area",
"href": "derivatives/optimization.html#fixed-perimeter-and-area",
"title": "26  Optimization",
"section": "26.1 Fixed perimeter and area",
"text": "26.1 Fixed perimeter and area\nThe simplest way to investigate the maximum or minimum value of a function over a closed interval is to just graph it and look.\nWe began with the question of which rectangles of perimeter \\(20\\) have the largest area? The figure shows a few different rectangles with this perimeter and their respective areas.\n\n\n \n Some possible rectangles that satisfy the constraint on the perimeter and their area.\n \n \n\n\n\nThe basic mathematical approach is to find a function of a single variable to maximize or minimize. In this case we have two variables describing a rectangle: a base \\(b\\) and height \\(h\\). Our formulas are the area of a rectangle:\n\\[\nA = bh,\n\\]\nand the formula for the perimeter of a rectangle:\n\\[\nP = 2b + 2h = 20.\n\\]\nFrom this last one, we see that \\(b\\) can be no bigger than \\(10\\) and no smaller than \\(0\\) from the restriction put in place through the perimeter. Solving for \\(h\\) in terms of \\(b\\) then yields this restatement of the problem:\nMaximize \\(A(b) = b \\cdot (10 - b)\\) over the interval \\([0,10]\\).\nThis is exactly the form needed to apply our theorem about the existence of extrema (a continuous function on a closed interval). Rather than solve analytically by taking a derivative, we simply graph to find the value:\n\nArea(b) = b * (10 - b)\nplot(Area, 0, 10)\n\n\n\n\nYou should see the maximum occurs at \\(b=5\\) by symmetry, so \\(h=5\\) as well, and the maximum area is then \\(25\\). This gives the satisfying answer that among all rectangles of fixed perimeter, that with the largest area is a square. As well, this indicates a common result: there is often some underlying symmetry in the answer.\n\n26.1.1 Exploiting polymorphism\nBefore moving on, lets see a slightly different way to do this problem with Julia, where we trade off some algebra for a bit of abstraction. 
This technique was discussed in the section on functions.\nLet's first write area as a function of both base and height:\n\nA(b, h) = b * h\n\nA (generic function with 1 method)\n\n\nFrom the constraint given by the perimeter being a fixed value we can solve for h in terms of b. We write this as a function:\n\nh(b) = (20 - 2b) / 2\n\nh (generic function with 1 method)\n\n\nTo get A(b) we simply need to substitute h(b) into our formula for the area, A. However, instead of doing the substitution ourselves using algebra we let Julia do it through composition of functions:\n\nA(b) = A(b, h(b))\n\nA (generic function with 2 methods)\n\n\nNow we can solve graphically as before, or numerically, such as here where we search for zeros of the derivative:\n\nfind_zeros(A', 0, 10) # find_zeros in `Roots`,\n\n1-element Vector{Float64}:\n 5.0\n\n\n(As a reminder, the notation A' is defined in CalculusWithJulia using the derivative function from the ForwardDiff package.)\n\n\n\n\n\n\nNote\n\n\n\nLook at the last definition of A. The function A appears on both sides, though on the left side with one argument and on the right with two. These are two “methods” of a generic function, A. Julia allows multiple definitions for the same name as long as the arguments (their number and type) can disambiguate which to use. In this instance, when one argument is passed in, then the last definition is used (A(b,h(b))), whereas if two are passed in, then the method that multiplies both arguments is used. 
The advantage of multiple dispatch is illustrated: the same concept - area - has one function name, though there may be different ways to compute the area, so there is more than one implementation.\n\n\n\nExample: Norman windows\nHere is a similar, though more complicated, example where the analytic approach can be a bit more tedious, but the graphical one mostly satisfying, though we do use a numerical algorithm to find an exact final answer.\nLet a “Norman” window consist of a rectangular window of top length \\(x\\) and side length \\(y\\) and a half circle on top. The goal is to maximize the area for a fixed value of the perimeter. Again, assume this perimeter is \\(20\\) units.\nThis figure shows two such windows, one with base length given by \\(x=3\\), the other with base length given by \\(x=4\\). The one with base length \\(4\\) seems to have much bigger area, what value of \\(x\\) will lead to the largest area?\n\n\n\n\n\nFor this problem, we have two equations.\nThe area is the area of the rectangle plus the area of the half circle (\\(\\pi r^2/2\\) with \\(r=x/2\\)).\n\\[\nA = xy + \\pi(x/2)^2/2\n\\]\nIn Julia this is\n\nAᵣ(x, y) = x*y + pi*(x/2)^2 / 2\n\nAᵣ (generic function with 1 method)\n\n\nThe perimeter consists of \\(3\\) sides of the rectangle and the perimeter of half a circle (\\(\\pi r\\), with \\(r=x/2\\)):\n\\[\nP = 2y + x + \\pi(x/2) = 20\n\\]\nWe solve for \\(y\\) in the first with \\(y = (20 - x - \\pi(x/2))/2\\) so that in Julia we have:\n\ny(x) = (20 - x - pi * x/2) / 2\n\ny (generic function with 1 method)\n\n\nAnd then we substitute in y(x) for y in the area formula through:\n\nAᵣ(x) = Aᵣ(x, y(x))\n\nAᵣ (generic function with 2 methods)\n\n\nOf course both \\(x\\) and \\(y\\) are non-negative. The latter forces \\(x\\) to be no more than \\(x=20/(1+\\pi/2)\\).\nThis leaves us the calculus problem of finding an absolute maximum of a continuous function over the closed interval \\([0, 20/(1+\\pi/2)]\\). 
Our theorem tells us this maximum must occur, we now proceed to find it.\nWe begin by simply graphing and estimating the values of the maximum and where it occurs.\n\nplot(Aᵣ, 0, 20/(1+pi/2))\n\n\n\n\nThe naked eye sees that maximum value is somewhere around \\(27\\) and occurs at \\(x\\approx 5.6\\). Clearly from the graph, we know the maximum value happens at the critical point and there is only one such critical point.\nAs reading the maximum from the graph is more difficult than reading a \\(0\\) of a function, we plot the derivative using our approximate derivative.\n\nplot(Aᵣ', 5.5, 5.7)\n\n\n\n\nWe confirm that the critical point is around \\(5.6\\).\n\n\nUsing find_zero to locate critical points.\nRather than zoom in graphically, we now use a root-finding algorithm, to find a more precise value of the zero of \\(A'\\). We know that the maximum will occur at a critical point, a zero of the derivative. The find_zero function from the Roots package provides a non-linear root-finding algorithm based on the bisection method. The only thing to keep track of is that solving \\(f'(x) = 0\\) means we use the derivative and not the original function.\nWe see from the graph that \\([0, 20/(1+\\pi/2)]\\) will provide a bracket, as there is only one relative maximum:\n\nx = find_zero(Aᵣ', (0, 20/(1+pi/2)))\n\n5.600991535115574\n\n\nThis value is the lone critical point, and in this case gives the position of the value that will maximize the function. The value and maximum area are then given by:\n\n(x, Aᵣ(x))\n\n(5.600991535115574, 28.004957675577867)\n\n\n(Compare this answer to the previous, is the square the figure of greatest area for a fixed perimeter, or just the figure amongst all rectangles? See Isoperimetric inequality for an answer.)\n\n\n\n26.1.2 Using argmax to identify where a function is maximized\nThis value that maximizes a function is sometimes referred to as the argmax, or argument which maximizes the function. 
In Julia the argmax(f,domain) function is defined to “Return a value \(x\) in the domain of \(f\) for which \(f(x)\) is maximized. If there are multiple maximal values for \(f(x)\) then the first one will be found.” The domain is some iterable collection. In the mathematical world this would be an interval \([a,b]\), but on the computer it is an approximation, such as is returned by range below. Without having to take a derivative, as above, but sacrificing some accuracy, the task of identifying the x for which Aᵣ is maximal could be done with\n\nargmax(Aᵣ, range(0, 20/(1+pi/2), length=10000))\n\n5.6011593738142205\n\n\n\nA symbolic approach\nWe could also do the above problem symbolically with the aid of SymPy. Here are the steps:\n\n@syms 𝒘::real 𝒉::real\n\n𝑨₀ = 𝒘 * 𝒉 + pi * (𝒘/2)^2 / 2\n𝑷erim = 2*𝒉 + 𝒘 + pi * 𝒘/2\n𝒉₀ = solve(𝑷erim - 20, 𝒉)[1]\n𝑨₁ = 𝑨₀(𝒉 => 𝒉₀)\n𝒘₀ = solve(diff(𝑨₁,𝒘), 𝒘)[1]\n\n \n\\[\n\\frac{40}{\\pi + 4}\n\\]\n\n\n\nWe know that 𝒘₀ is the maximum in this example from our previous work. We shall see soon, that just knowing that the second derivative is negative at 𝒘₀ would suffice to know this. Here we check that condition:\n\ndiff(𝑨₁, 𝒘, 𝒘)(𝒘 => 𝒘₀)\n\n \n\\[\n- (\\frac{\\pi}{4} + 1)\n\\]\n\n\n\nAs an aside, compare the steps involved above for a symbolic solution to those of previous work for a numeric solution:\n\nAᵣ(w, h) = w*h + pi*(w/2)^2 / 2\nh(w) = (20 - w - pi * w/2) / 2\nAᵣ(w) = Aᵣ(w, h(w))\nfind_zero(Aᵣ', (0, 20/(1+pi/2))) # 40 / (pi + 4)\n\n5.600991535115574\n\n\nThey are similar, except we solved for 𝒉₀ symbolically, rather than by hand as we did when defining h(w).\n\nExample\n\n\n\n\n\nA trapezoid is inscribed in the upper-half circle of radius \(r\). The trapezoid is found by connecting the points \((x,y)\) (in the first quadrant) with \((r, 0)\), \((-r,0)\), and \((-x, y)\). Find the maximum area. 
(The above figure has \\(x=0.75\\) and \\(r=1\\).)\nHere the constraint is simply \\(r^2 = x^2 + y^2\\) with \\(x\\) and \\(y\\) being non-negative. The area is then found through the average of the two lengths times the height. Using height for y, we have:\n\n@syms x::positive r::positive\nhₜ = sqrt(r^2 - x^2)\naₜ = (2x + 2r)/2 * hₜ\npossible_sols = solve(diff(aₜ, x) ~ 0, x) # possibly many solutions\nx0 = first(possible_sols) # only solution is also found from first or [1] indexing\n\n \n\\[\n\\frac{r}{2}\n\\]\n\n\n\nThe other values of interest can be found through substitution. For example:\n\nhₜ(x => x0)\n\n \n\\[\n\\frac{\\sqrt{3} r}{2}\n\\]"
},
{
"objectID": "derivatives/optimization.html#trigonometry-problems",
"href": "derivatives/optimization.html#trigonometry-problems",
"title": "26  Optimization",
"section": "26.2 Trigonometry problems",
"text": "26.2 Trigonometry problems\nMany maximization and minimization problems involve triangles, which in turn use trigonometry in their description. Here is an example, the “ladder corner problem.” (There are many other ladder problems.)\nA ladder is to be moved through a two-dimensional hallway which has a bend and gets narrower after the bend. The hallway is \\(8\\) feet wide then \\(5\\) feet wide. What is the longest such ladder that can be navigated around the corner?\nThe figure shows a ladder of length \\(l_1 + l_2\\) that got stuck - it was too long.\n\n\n\n\n\nWe approach this problem in reverse. It is easy to see when a ladder is too long. It gets stuck at some angle \\(\\theta\\). So for each \\(\\theta\\) we find that ladder length that is just too long. Then we find the minimum length of all these ladders that are too long. If a ladder is this length or more it will get stuck for some angle. However, if it is less than this length it will not get stuck. So to maximize a ladder length, we minimize a different function. Neat.\nNow, to find the length \\(l = l_1 + l_2\\) as a function of \\(\\theta\\).\nWe need to brush off our trigonometry, in particular right triangle trigonometry. We see from the figure that \\(l_1\\) is the hypotenuse of a right triangle with opposite side \\(8\\) and \\(l_2\\) is the hypotenuse of a right triangle with adjacent side \\(5\\). So, \\(8/l_1 = \\sin\\theta\\) and \\(5/l_2 = \\cos\\theta\\).\nThat is, we have\n\nl(l1, l2) = l1 + l2\nl1(t) = 8/sin(t)\nl2(t) = 5/cos(t)\n\nl(t) = l(l1(t), l2(t)) # or simply l(t) = 8/sin(t) + 5/cos(t)\n\nl (generic function with 2 methods)\n\n\nOur goal is to minimize this function for all angles between \\(0\\) and \\(90\\) degrees, or \\(0\\) and \\(\\pi/2\\) radians.\nThis is not a continuous function on a closed interval - it is undefined at the endpoints. 
That being said, a quick plot will convince us that the minimum occurs at a critical point and there is only one critical point in \\((0, \\pi/2)\\).\n\ndelta = 0.2\nplot(l, delta, pi/2 - delta)\n\n\n\n\nThe graph shows the minimum occurs between \\(0.50\\) and \\(1.00\\) radians, a bracket for the derivative. Here we find \\(x\\) and the minimum value:\n\nx = find_zero(l', (0.5, 1.0))\nx, l(x)\n\n(0.8634136052517809, 18.219533699708656)\n\n\nThat is, any ladder less than this length can get around the hallway."
},
{
"objectID": "derivatives/optimization.html#rate-times-time-problems",
"href": "derivatives/optimization.html#rate-times-time-problems",
"title": "26  Optimization",
"section": "26.3 Rate times time problems",
"text": "26.3 Rate times time problems\nEthan Hunt, a top secret spy, has a mission to chase a bad guy. Here is what we know:\n\nEthan likes to run. He can run at \\(10\\) miles per hour.\nHe can drive a car - usually some concept car by BMW - at \\(30\\) miles per hour, but only on the road.\n\nFor his mission, he needs to go \\(10\\) miles west and \\(5\\) `miles north. He can do this by:\n\njust driving \\(8.310\\) miles west then \\(5\\) miles north, or\njust running the diagonal distance, or\ndriving \\(0 < x < 10\\) miles west, then running on the diagonal\n\nA quick analysis says:\n\nIt would take \\((10+5)/30\\) hours to just drive\nIt would take \\(\\sqrt{10^2 + 5^2}/10\\) hours to just run\n\nNow, if he drives \\(x\\) miles west (\\(0 < x < 10\\)) he would run an amount given by the hypotenuse of a triangle with lengths \\(5\\) and \\(10-x\\). His time driving would be \\(x/30\\) and his time running would be \\(\\sqrt{5^2+(10-x)^2}/10\\) for a total of:\n\\[\nT(x) = x/30 + \\sqrt{5^2 + (10-x)^2}/10, \\quad 0 < x < 10\n\\]\nWith the endpoints given by \\(T(0) = \\sqrt{10^2 + 5^2}/10\\) and \\(T(10) = (10 + 5)/30\\).\nLets plot \\(T(x)\\) over the interval \\((0,10)\\) and look:\n\nT(x) = x/30 + sqrt(5^2 + (10-x)^2)/10\n\nT (generic function with 1 method)\n\n\n\nplot(T, 0, 10)\n\n\n\n\nThe minimum happens way out near 8. We zoom in a bit:\n\nplot(T, 7, 9)\n\n\n\n\nIt appears to be around \\(8.3\\). We now use find_zero to refine our guess at the critical point using \\([7,9]\\):\n\nα = find_zero(T', (7, 9))\n\n8.232233047033631\n\n\nOkay, got it. Around\\(8.23\\). So is our minimum time\n\nT(α)\n\n0.804737854124365\n\n\nWe know this is a relative minimum, but not that it is the global minimum over the closed time interlal. 
For that we must also check the endpoints:\n\nsqrt(10^2 + 5^2)/10, T(α), (10+5)/30\n\n(1.118033988749895, 0.804737854124365, 0.5)\n\n\nAhh, we see that \(T(x)\) is not continuous on \([0, 10]\), as it jumps at \(x=10\) down to an even smaller amount of \(1/2\). It may not look as impressive as a miles-long sprint, but Mr. Hunt is advised by Benji to drive the whole way.\n\n26.3.1 Rate times time … the origin story\n\n\n\nImage number \(43\) from l'Hospital's calculus book (the first). A traveler leaving location \(C\) to go to location \(F\) must cross two regions separated by the straight line \(AEB\). We suppose that in the region on the side of \(C\), he covers distance \(a\) in time \(c\), and that on the other, on the side of \(F\), distance \(b\) in the same time \(c\). We ask through which point \(E\) on the line \(AEB\) he should pass, so as to take the least possible time to get from \(C\) to \(F\)? (From http://www.ams.org/samplings/feature-column/fc-2016-05.)\n\n\nThe last example is a modern-day illustration of a problem of calculus dating back to l'Hospital. His parameterization is a bit different. Let's change his setup by taking two points \((0, a)\) and \((L,-b)\), with \(a,b,L\) positive values. Above the \(x\) axis travel happens at rate \(r_0\), and below, travel happens at rate \(r_1\), again, both positive. 
What value \\(x\\) in \\([0,L]\\) will minimize the total travel time?\nWe approach this symbolically with SymPy:\n\n@syms x::positive a::positive b::positive L::positive r0::positive r1::positive\n\nd0 = sqrt(x^2 + a^2)\nd1 = sqrt((L-x)^2 + b^2)\n\nt = d0/r0 + d1/r1 # time = distance/rate\ndt = diff(t, x) # look for critical points\n\n \n\\[\n\\frac{- L + x}{r_{1} \\sqrt{b^{2} + \\left(L - x\\right)^{2}}} + \\frac{x}{r_{0} \\sqrt{a^{2} + x^{2}}}\n\\]\n\n\n\nThe answer will occur at a critical point or an endpoint, either \\(x=0\\) or \\(x=L\\).\nThe structure of dt is too complicated for simply calling solve to find the critical points. Instead we help SymPy out a bit. We are solving an equation of the form \\(a/b + c/d = 0\\). These solutions will also be solutions of \\((a/b)^2 - (c/d)^2=0\\) or even \\(a^2d^2 - c^2b^2 = 0\\). This follows as solutions to \\(u+v=0\\), also solve \\((u+v)\\cdot(u-v)=0\\), or \\(u^2 - v^2=0\\). Setting \\(u=a/b\\) and \\(v=c/d\\) completes the comparison.\nWe can get these terms - \\(a\\), \\(b\\), \\(c\\), and \\(d\\) - as follows:\n\nt1, t2 = dt.args # the `args` property returns the arguments to the outer function (+ in this case)\n\n(x/(r0*sqrt(a^2 + x^2)), (-L + x)/(r1*sqrt(b^2 + (L - x)^2)))\n\n\nThe equivalent of \\(a^2d^2 - c^2 b^2\\) is found using the generic functions numerator and denominator to access the numerator and denominator of the fractions:\n\nex = numerator(t1^2)*denominator(t2^2) - denominator(t1^2)*numerator(t2^2)\n\n \n\\[\n- r_{0}^{2} \\left(- L + x\\right)^{2} \\left(a^{2} + x^{2}\\right) + r_{1}^{2} x^{2} \\left(b^{2} + \\left(L - x\\right)^{2}\\right)\n\\]\n\n\n\nThis is a polynomial in the x variable of degree \\(4\\), as seen here where the sympy.Poly function is used to identify the symbols of the polynomial from the parameters:\n\np = sympy.Poly(ex, x) # a0 + a1⋅x + a2⋅x^2 + a3⋅x^3 + a4⋅x^4\np.coeffs()\n\n5-element Vector{Sym}:\n -r0^2 + r1^2\n 2*L*r0^2 - 2*L*r1^2\n -L^2*r0^2 + L^2*r1^2 - a^2*r0^2 
+ b^2*r1^2\n 2*L*a^2*r0^2\n -L^2*a^2*r0^2\n\n\nFourth degree polynomials can be solved. The critical points of the original equation will be among the \(4\) solutions given. However, the result is complicated. The article from which the figure came states that “In today's textbooks the problem, usually involving a river, involves walking along one bank and then swimming across; this corresponds to setting \(g=0\) in l'Hospital's example, and leads to a quadratic equation.” Let's see that case, which we can get in our notation by taking \(b=0\):\n\nq = ex(b=>0)\nfactor(q)\n\n \n\[\n- \left(- L + x\right)^{2} \left(a^{2} r_{0}^{2} + r_{0}^{2} x^{2} - r_{1}^{2} x^{2}\right)\n\]\n\n\n\nWe see two terms: one with \(x=L\) and another quadratic. For the simple case \(r_0=r_1\), a straight line is the best solution, and this corresponds to \(x=L\), which is clear from the formula above, as we only have one solution to the following:\n\nsolve(q(r1=>r0), x)\n\n1-element Vector{Sym}:\n L\n\n\nWell, not so fast. We need to check the other endpoint, \(x=0\):\n\nta = t(b=>0, r1=>r0)\nta(x=>0), ta(x=>L)\n\n(L/r0 + a/r0, sqrt(L^2 + a^2)/r0)\n\n\nThe value at \(x=L\) is smaller, as \(L^2 + a^2 \leq (L+a)^2\). (Well, that was a bit pedantic. The travel rates being identical means the fastest path will also be the shortest path and that is clearly \(x=L\) and not \(x=0\).)\nNow, if, say, travel above the line is half as fast as travel below it, then \(2r_0 = r_1\), and the critical points will be:\n\nout = solve(q(r1 => 2r0), x)\n\n2-element Vector{Sym}:\n L\n sqrt(3)*a/3\n\n\nIt is hard to tell which would minimize time without more work. 
To check a case (\\(a=1, L=2, r_0=1\\)) we might have\n\nx_straight = t(r1 =>2r0, b=>0, x=>out[1], a=>1, L=>2, r0 => 1) # for x=L\n\n \n\\[\n\\sqrt{5}\n\\]\n\n\n\nCompared to the smaller (\\(x=\\sqrt{3}a/3\\)):\n\nx_angle = t(r1 =>2r0, b=>0, x=>out[2], a=>1, L=>2, r0 => 1)\n\n \n\\[\n\\frac{\\sqrt{3}}{2} + 1\n\\]\n\n\n\nWhat about \\(x=0\\)?\n\nx_bent = t(r1 =>2r0, b=>0, x=>0, a=>1, L=>2, r0 => 1)\n\n \n\\[\n2\n\\]\n\n\n\nThe value of \\(x=\\sqrt{3}a/3\\) minimizes time:\n\nmin(x_straight, x_angle, x_bent)\n\n \n\\[\n\\frac{\\sqrt{3}}{2} + 1\n\\]\n\n\n\nThe traveler in this case is advised to head to the \\(x\\) axis at \\(x=\\sqrt{3}a/3\\) and then travel along the \\(x\\) axis.\nWill this approach always be true? Consider different parameters, say we switch the values of \\(a\\) and \\(L\\) so \\(a > L\\):\n\npts = [0, out...]\nm,i = findmin([t(r1 =>2r0, b=>0, x=>u, a=>2, L=>1, r0 => 1) for u in pts]) # min, index\nm, pts[i]\n\n(sqrt(5), L)\n\n\nHere traveling directly to the point \\((L,0)\\) is fastest. Though travel is slower, the route is more direct and there is no time saved by taking the longer route with faster travel for part of it."
},
{
"objectID": "derivatives/optimization.html#unbounded-domains",
"href": "derivatives/optimization.html#unbounded-domains",
"title": "26  Optimization",
"section": "26.4 Unbounded domains",
"text": "26.4 Unbounded domains\nMaximize the function \\(xe^{-(1/2) x^2}\\) over the interval \\([0, \\infty)\\).\nHere the extreme value theorem doesnt technically apply, as we dont have a closed interval. However, if we can eliminate the endpoints as candidates, then we should be able to convince ourselves the maximum must occur at a critical point of \\(f(x)\\). (If not, then convince yourself for all sufficiently large \\(M\\) the maximum over \\([0,M]\\) occurs at a critical point, not an endpoint. Then let \\(M\\) go to infinity. In general, for an optimization problem of a continuous function on the interval \\((a,b)\\) if the right limit at \\(a\\) and left limit at \\(b\\) can be ruled out as candidates, the optimal value must occur at a critical point.)\nSo to approach this problem we first graph it over a wide interval.\n\nf(x) = x * exp(-x^2)\nplot(f, 0, 100)\n\n\n\n\nClearly the action is nearer to \\(1\\) than \\(100\\). We try graphing the derivative near that area:\n\nplot(f', 0, 5)\n\n\n\n\nThis shows the value of interest near \\(0.7\\) for a critical point. We use find_zero with \\([0,1]\\) as a bracket\n\nc = find_zero(f', (0, 1))\n\n0.7071067811865476\n\n\nThe maximum is then at\n\nf(c)\n\n0.42888194248035333\n\n\n\nExample: Minimize the surface area of a can\nFor a more applied problem of this type (infinite domain), consider a can of some soft drink that is to contain \\(355\\)ml which is \\(355\\) cubic centimeters. We use metric units, as the relationship between volume (cubic centimeters) and fluid amount (ml) is clear. A can to hold this amount is produced in the shape of cylinder with radius \\(r\\) and height \\(h\\). 
The materials involved give the surface area, which would be:\n\\[\nSA = h \\cdot 2\\pi r + 2 \\cdot \\pi r^2.\n\\]\nThe volume satisfies:\n\\[\nV = 355 = h \\cdot \\pi r^2.\n\\]\nFind the values of \\(r\\) and \\(h\\) which minimize the surface area.\nFirst the surface area in both variables is given by\n\nSA(h, r) = h * 2pi * r + 2pi * r^2\n\nSA (generic function with 1 method)\n\n\nSolving from the constraint on the volume for h in terms of r yields:\n\ncanheight(r) = 355 / (pi * r^2)\n\ncanheight (generic function with 1 method)\n\n\nComposing gives a function of r alone:\n\nSA(r) = SA(canheight(r), r)\n\nSA (generic function with 2 methods)\n\n\nThis is minimized subject to the constraint that \\(r \\geq 0\\). A quick glance shows that as \\(r\\) gets close to \\(0\\), the can must get infinitely tall to contain that fixed volume, and would have infinite surface area as the \\(1/r^2\\) in the first term implies. On the other hand, as \\(r\\) goes to infinity, the height must go to \\(0\\) to make a really flat can. Again, we would have infinite surface area, as the \\(r^2\\) term at the end indicates. With this observation, we can rule out the endpoints as possible minima, so any minima must occur at a critical point.\nWe start by making a graph, making an educated guess that the answer is somewhere near a real life answer, or around \\(3\\)-\\(5\\) cms in radius:\n\nplot(SA, 2, 10)\n\n\n\n\nThe minimum looks to be around \\(4\\)cm and is clearly between \\(2\\)cm and \\(6\\)cm. We can use find_zero to zero in on the value of the critical point:\n\nrₛₐ = find_zero(SA', (2, 6))\n\n3.8372152480156734\n\n\nOkay, \\(3.837...\\) is our answer for \\(r\\). Use this to get \\(h\\):\n\ncanheight(rₛₐ)\n\n7.674430496031345\n\n\nThis produces a can which is about square in profile. This is not how most cans look though. Perhaps our model is too simple, or the cans are optimized for some other purpose than minimizing materials."
},
{
"objectID": "derivatives/optimization.html#questions",
"href": "derivatives/optimization.html#questions",
"title": "26  Optimization",
"section": "26.5 Questions",
"text": "26.5 Questions\n\nQuestion\nA geometric figure has area given in terms of two measurements by \\(A=\\pi a b\\) and perimeter \\(P = \\pi (a + b)\\). If the perimeter is fixed to be 20 units long, what is the maximal area the figure can be?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA geometric figure has area given in terms of two measurements by \\(A=\\pi a b\\) and perimeter \\(P=\\pi \\cdot \\sqrt{a^2 + b^2}/2\\). If the perimeter is 20 units long, what is the maximal area?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA rancher with \\(10\\) meters of fence wishes to make a pen adjacent to an existing fence. The pen will be a rectangle with one edge using the existing fence. Say that has length \\(x\\), then \\(10 = 2y + x\\), with \\(y\\) the other dimension of the pen. What is the maximum area that can be made?\n\n\n\n\n\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIs there “symmetry” in the answer between \\(x\\) and \\(y\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is you were do do two pens like this back to back, then the answer would involve a rectangle. Is there symmetry in the answer now?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA rectangle of sides \\(w\\) and \\(h\\) has fixed area \\(20\\). What is the smallest perimeter it can have?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA rectangle of sides \\(w\\) and \\(h\\) has fixed area \\(20\\). What is the largest perimeter it can have?\n\n\n\n \n \n \n \n \n \n \n \n \n It can be infinite\n \n \n\n\n \n \n \n \n It is also 20\n \n \n\n\n \n \n \n \n \\(17.888\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA cardboard box is to be designed with a square base and an open top holding a fixed volume \\(V\\). 
What dimensions yield the minimal surface area?\nIf this problem were approached symbolically, we might see the following code. First:\n@syms V::positive x::positive z::positive\nSA = 1 * x * x + 4 * x * z\nWhat does this express?\n\n\n\n \n \n \n \n \n \n \n \n \n The box has a square base with open top, so x*x is the amount of material in the base; the 4 sides each have x*z area.\n \n \n\n\n \n \n \n \n The surface area of a box is 6x*x, so this is wrong.\n \n \n\n\n \n \n \n \n The volume is a fixed amount, so is x*x*z, with sides suitably labeled\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat does this command express?\nSA = subs(SA, z => V / x^2)\n\n\n\n \n \n \n \n \n \n \n \n \n This command replaces z, reparameterizing in V instead.\n \n \n\n\n \n \n \n \n This command replaces z with an expression in x using the constraint of fixed volume V\n \n \n\n\n \n \n \n \n This command is merely algebraic simplification\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat does this command find?\nsolve(diff(SA, x) ~ 0, x)\n\n\n\n \n \n \n \n \n \n \n \n \n This solves \\(SA'=0\\), that is it find critical points of a continuously differentiable function\n \n \n\n\n \n \n \n \n This solves for \\(V\\) the fixed, but unknown volume\n \n \n\n\n \n \n \n \n This checks the values of SA at the end points of the domain\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat do these commands do?\ncps = solve(diff(SA, x) ~ 0, x)\nxx = filter(isreal, cps)[1]\ndiff(SA, x, x)(xx) > 0\n\n\n\n \n \n \n \n \n \n \n \n \n This applies the first derivative test to the lone real critical point showing there is a local minimum at that point.\n \n \n\n\n \n \n \n \n This finds the `4th derivative of SA\n \n \n\n\n \n \n \n \n This applies the second derivative test to the lone real critical point showing there is a local minimum at that point.\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA rain gutter is constructed from a 30” wide sheet of tin by bending it into thirds. 
If the sides are bent 90 degrees, then the cross-sectional area would be \(100 = 10^2\). This is not the largest possible amount. For example, if the sides are bent by 45 degrees, the cross sectional area is:\n\n\n120.71067811865474\n\n\nFind a value in degrees that gives the maximum. (The first task is to write the area in terms of \(\theta\).)\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion Non-Norman windows\nSuppose our new “Norman” window has half-circular tops at the top and bottom, the perimeter is fixed at \(20\), and the dimensions of the rectangle are \(x\) for the width and \(y\) for the height.\nWhat is the value of \(y\) that maximizes the area?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion (Thanks https://www.math.ucdavis.edu/~kouba)\nA movie screen projects on a wall 20 feet high beginning 10 feet above the floor. This figure shows \(\theta\) for \(x=30\):\n\n\n\n\n\nWhat value of \(x\) gives the largest angle \(\theta\)? (In degrees.)\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA maximum likelihood estimator is a value derived by maximizing a function. For example, if\n\nLikhood(t) = t^3 * exp(-3t) * exp(-2t) * exp(-4t) ## 0 <= t <= 10\n\nLikhood (generic function with 1 method)\n\n\nThen Likhood(t) is continuous and has a single peak, so the maximum occurs at the lone critical point. 
It turns out that this problem is a bit sensitive to an initial condition, so we bracket\n\nfind_zero(Likhood', (0.1, 0.5))\n\n0.3333333333333333\n\n\nNow if \(Likhood(t) = \exp(-3t) \cdot \exp(-2t) \cdot \exp(-4t), \quad 0 \leq t \leq 10\), by graphing, explain why the same approach won't work:\n\n\n\n \n \n \n \n \n \n \n \n \n \(Likhood(t)\) is not continuous on \(0\) to \(10\)\n \n \n\n\n \n \n \n \n It does work and the answer is x = 2.27...\n \n \n\n\n \n \n \n \n \(Likhood(t)\) takes its maximum at a boundary point - not a critical point\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \(x_1, x_2, \dots, x_n\) be a set of unspecified numbers in a data set. Form the expression \(s(x) = (x-x_1)^2 + \cdots + (x-x_n)^2\). What is the smallest this can be (in \(x\))?\nWe approach this using SymPy and \(n=10\)\n@syms s xs[1:10]\ns(x) = sum((x-xi)^2 for xi in xs)\ncps = solve(diff(s(x), x), x)\nRun the above code. Based on the critical points found, what do you guess will be the minimum value in terms of the values \(x_1\), \(x_2, \dots\)?\n\n\n\n \n \n \n \n \n \n \n \n \n The square roots of the values squared, \((x_1^2 + \cdots x_n^2)^2\)\n \n \n\n\n \n \n \n \n The median, or middle number, of the values\n \n \n\n\n \n \n \n \n The mean, or average, of the values\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nQuestion\nMinimize the function \(f(x) = 2x + 3/x\) over \((0, \infty)\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nOf all rectangles of area 4, find the one with smallest perimeter. What is the perimeter?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA running track is in the shape of two straightaways and two half circles. The total distance (perimeter) is 400 meters. Suppose \(w\) is the width (twice the radius of the circles) and \(h\) is the height. 
What dimensions minimize the sum \\(w + h\\)?\nYou have \\(P(w, h) = 2\\pi \\cdot (w/2) + 2\\cdot(h-w)\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA cell phone manufacturer wishes to make a rectangular phone with total surface area of 12,000 \\(mm^2\\) and maximal screen area. The screen is surrounded by bezels with sizes of 8\\(mm\\) on the long sides and 32\\(mm\\) on the short sides. (So, for example, the screen width is shorter by \\(2\\cdot 8\\) mm than the phone width.)\nWhat are the dimensions (width and height) that allow the maximum screen area?\nThe width is:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe height is?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the value \\(x > 0\\) which minimizes the distance from the graph of \\(f(x) = \\log_e(x) - x\\) to the origin \\((0,0)\\).\n\n\n\n\n\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\n\n\n\nImage number \\(40\\) from lHospitals calculus book (the first calculus book). Among all the cones that can be inscribed in a sphere, determine which one has the largest lateral area. (From http://www.ams.org/samplings/feature-column/fc-2016-05)\n\n\nThe figure above poses a problem about cones in spheres, which can be reduced to a two-dimensional problem. Take a sphere of radius \\(r=1\\), and imagine a secant line of length \\(l\\) connecting \\((-r, 0)\\) to another point \\((x,y)\\) with \\(y>0\\). Rotating that line around the \\(x\\) axis produces a cone and its lateral surface is given by \\(SA=\\pi \\cdot y \\cdot l\\). Write \\(SA\\) as a function of \\(x\\) and solve.\nThe largest lateral surface area is:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe surface area of a sphere of radius \\(1\\) is \\(4\\pi r^2 = 4 \\pi\\). 
This is how many times greater than that of the largest cone?\n\n\n\n \n \n \n \n \n \n \n \n \n exactly four times\n \n \n\n\n \n \n \n \n about \\(2.6\\) times as big\n \n \n\n\n \n \n \n \n exactly \\(\\pi\\) times\n \n \n\n\n \n \n \n \n about the same\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIn the examples, the functions argmax(f, itr) and findmin(collection) were used. These have mathematical analogs. What is argmax(f,itr) in terms of math notation, where \\(vs\\) is the iterable collection of values:\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\{i \\mid v_i \\text{ in } vs, f(v_i) = \\max(f(vs))\\}\\)\n \n \n\n\n \n \n \n \n \\(\\{f(v) \\mid v \\text{ in } vs, f(v) = \\max(f(vs))\\}\\)\n \n \n\n\n \n \n \n \n \\(\\{v \\mid v \\text{ in } vs, f(v) = \\max(f(vs))\\}\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe functions are related: findmax returns the maximum value and an index in the collection for which the value will be largest; argmax returns an element of the set for which the function is largest, so argmax(identity, itr) should correspond to the index found by findmax (through itr[findmax(itr)[2]])\n\n\nQuestion\nLet \\(f(x) = (a/x)^x\\) for \\(a,x > 0\\). When is this maximized? The following might be helpful:\n\n@syms x::positive a::positive\ndiff((a/x)^x, x)\n\n \n\\[\n\\left(\\frac{a}{x}\\right)^{x} \\left(\\log{\\left(\\frac{a}{x} \\right)} - 1\\right)\n\\]\n\n\n\nThis can be solved to discover the answer.\n\n\n\n \n \n \n \n \n \n \n \n \n \\(a/e\\)\n \n \n\n\n \n \n \n \n \\(e/a\\)\n \n \n\n\n \n \n \n \n \\(e\\)\n \n \n\n\n \n \n \n \n \\(a \\cdot e\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe ladder problem has a trigonometry-free solution. 
We show one attributed to Asma.\n\n\n\n\n\nIntroducing a variable \\(p\\), we get, following the above figure, the ladder of length \\(c\\) touching the wall at \\(b+bp\\) and the floor at \\(a + x\\).\nUsing similar triangles, we have:\n\n@syms a::positive b::positive p::positive x::positive\nsolve(x/b ~ (x+a)/(b + b*p), x)\n\n1-element Vector{Sym}:\n a/p\n\n\nWith \\(x = a/p\\) we get by the Pythagorean theorem that\n\\[\n\\begin{align*}\nc^2 &= (a + a/p)^2 + (b + bp)^2 \\\\\n &= a^2(1 + \\frac{1}{p})^2 + b^2(1+p)^2.\n\\end{align*}\n\\]\nThe ladder problem minimizes \\(c\\) or equivalently \\(c^2\\).\nWhy is the following set of commands useful in this task:\nc2 = a^2*(1 + 1/p)^2 + b^2*(1+p)^2\nc2p = diff(c2, p)\neq = numer(together(c2p))\nsolve(eq ~ 0, p)\n\n\n\n \n \n \n \n \n \n \n \n \n It finds the critical points\n \n \n\n\n \n \n \n \n It finds the minimal value of p\n \n \n\n\n \n \n \n \n It finds the minimal value of c\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe polynomial eq is what degree in p?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe only positive real solution for \\(p\\) from \\(eq\\) is\n\n\n\n \n \n \n \n \n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \\((a/b)^{2/3}\\)\n \n \n\n\n \n \n \n \n \\(\\sqrt{3}/2 \\cdot (a/b)^{2/3}\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIn Hall we can find a dozen optimization problems related to the following figure of the parabola \\(y=x^2\\), a point \\(P=(a,a^2)\\), \\(a > 0\\), and its normal line. 
We will do two.\n\np = plot(; legend=false, aspect_ratio=:equal, axis=nothing, border=:none)\n b = 2.\n plot!(p, x -> x^2, -b, b)\n plot!(p, [-b,b], [0,0])\n plot!(p, [0,0], [0, b^2])\n a = 1\n scatter!(p, [a],[a^2])\n m = 2a\n plot!(p, x -> a^2 + m*(x-a), 1/2, b)\n mₚ = -1/m\n plot!(p, x -> a^2 + mₚ*(x-a))\n scatter!(p, [-3/2], [(3/2)^2])\n annotate!(p, [(1+1/4, 1+1/8, \"P\"), (-3/2-1/4, (-3/2)^2-1/4, \"Q\")])\np\n\n\n\n\nWhat do these commands do?\n\n@syms x::real, a::real\nmₚ = - 1 / diff(x^2, x)(a)\nsolve(x^2 - (a^2 + mₚ*(x-a)) ~ 0, x)\n\n2-element Vector{Sym}:\n a\n -a - 1/(2*a)\n\n\n\n\n\n \n \n \n \n \n \n \n \n \n It finds the tangent line\n \n \n\n\n \n \n \n \n It finds the point \\(P\\)\n \n \n\n\n \n \n \n \n It finds the \\(x\\) value of the intersection points of the normal line and the parabola\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nNumerically, find the value of \\(a\\) that minimizes the \\(y\\) coordinate of \\(Q\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nNumerically find the value of \\(a\\) that minimizes the length of the line segment \\(PQ\\)."
},
{
"objectID": "derivatives/first_second_derivatives.html",
"href": "derivatives/first_second_derivatives.html",
"title": "27  The first and second derivatives",
"section": "",
"text": "This section uses these add-on packages:\nThis section explores properties of a function, \\(f(x)\\), that are described by properties of its first and second derivatives, \\(f'(x)\\) and \\(f''(x)\\). As part of the conversation two tests are discussed that characterize when a critical point is a relative maximum or minimum. (We know that any relative maximum or minimum occurs at a critical point, but it is not true that any critical point will be a relative maximum or minimum.)"
},
{
"objectID": "derivatives/first_second_derivatives.html#positive-or-increasing-on-an-interval",
"href": "derivatives/first_second_derivatives.html#positive-or-increasing-on-an-interval",
"title": "27  The first and second derivatives",
"section": "27.1 Positive or increasing on an interval",
"text": "27.1 Positive or increasing on an interval\nWe start with some vocabulary:\n\nA function \\(f\\) is positive on an interval \\(I\\) if for any \\(a\\) in \\(I\\) it must be that \\(f(a) > 0\\).\n\nOf course, we define negative in a parallel manner. The intermediate value theorem says a continuous function cannot change from positive to negative without crossing \\(0\\). This is not the case for functions with jumps, of course.\nNext,\n\nA function, \\(f\\), is (strictly) increasing on an interval \\(I\\) if for any \\(a < b\\) it must be that \\(f(a) < f(b)\\).\n\nThe word strictly is related to the inclusion of the \\(<\\), precluding the possibility of a function being flat over an interval that the \\(\\leq\\) inequality would allow.\nA parallel definition with \\(a < b\\) implying \\(f(a) > f(b)\\) would be used for a strictly decreasing function.\nWe can try to prove these properties for a function algebraically - we'll see both are related to the zeros of some function. However, before proceeding to that it is usually helpful to get an idea of where the answer is using exploratory graphs.\nWe will use a helper function, plotif(f, g, a, b), that plots the function f over [a,b] coloring it red when g is positive (and blue otherwise). Such a function is defined for us in the accompanying CalculusWithJulia package, which has already been loaded.\nTo see where a function is positive, we simply pass the function object in for both f and g above. For example, let's look at where \\(f(x) = \\sin(x)\\) is positive:\n\nf(x) = sin(x)\nplotif(f, f, -2pi, 2pi)\n\n\n\n\nLet's graph with cos in the masking spot and see what happens:\n\nplotif(sin, cos, -2pi, 2pi)\n\n\n\n\nMaybe surprisingly, we see that the increasing parts of the sine curve are now highlighted. 
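For readers curious how such a masking plot can be produced, here is a minimal sketch of the idea (this is not the package's actual implementation; the NaN trick simply breaks the red overlay wherever the masking function is not positive):

```julia
using Plots

# a simplified stand-in for CalculusWithJulia's plotif
function plotif_sketch(f, g, a, b)
    xs = range(a, b, length=251)
    plot(xs, f.(xs); color=:blue, legend=false)
    # overlay f in red only where the masking function g is positive
    ys = [g(x) > 0 ? f(x) : NaN for x in xs]
    plot!(xs, ys; color=:red, linewidth=3)
end

plotif_sketch(sin, cos, -2pi, 2pi)  # highlights where sine is increasing
```

The sketch draws the whole curve once in blue, then re-draws only the masked portion in red.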
Of course, the cosine is the derivative of the sine function; as we now discuss, this is no coincidence.\nFor the sequel, we will use f' notation to find numeric derivatives, with the notation being defined in the CalculusWithJulia package using the ForwardDiff package."
},
{
"objectID": "derivatives/first_second_derivatives.html#the-relationship-of-the-derivative-and-increasing",
"href": "derivatives/first_second_derivatives.html#the-relationship-of-the-derivative-and-increasing",
"title": "27  The first and second derivatives",
"section": "27.2 The relationship of the derivative and increasing",
"text": "27.2 The relationship of the derivative and increasing\nThe derivative, \\(f'(x)\\), computes the slope of the tangent line to the graph of \\(f(x)\\) at the point \\((x,f(x))\\). If the derivative is positive, the tangent line will have a positive slope. Clearly if we see an increasing function and mentally layer on a tangent line, it will have a positive slope. Intuitively then, increasing functions and positive derivatives are related concepts. But there are some technicalities.\nSuppose \\(f(x)\\) has a derivative on \\(I\\). Then\n\nIf \\(f'(x)\\) is positive on an interval \\(I=(a,b)\\), then \\(f(x)\\) is strictly increasing on \\(I\\).\n\nMeanwhile,\n\nIf a function \\(f(x)\\) is increasing on \\(I\\), then \\(f'(x) \\geq 0\\).\n\nThe technicality being the equality parts. In the second statement, we have the derivative is non-negative, as we can't guarantee it is positive, even if we considered just strictly increasing functions.\nWe can see by the example of \\(f(x) = x^3\\) that strictly increasing functions can have a zero derivative, at a point.\nThe mean value theorem provides the reasoning behind the first statement: on \\(I\\), the slope of any secant line between \\(d < e\\) (both in \\(I\\)) is matched by the slope of some tangent line, which by assumption will always be positive. If the secant line slope is written as \\((f(e) - f(d))/(e - d)\\) with \\(d < e\\), then it is clear that \\(f(e) - f(d) > 0\\), or \\(d < e\\) implies \\(f(d) < f(e)\\).\nThe second part follows from the secant line equation. The derivative can be written as a limit of secant-line slopes, each of which is positive. 
The limit of positive things can only be non-negative, though there is no guarantee the limit will be positive.\nSo, to visualize where a function is increasing, we can just pass in the derivative as the masking function in our plotif function, as long as we are wary about places with \\(0\\) derivative (flat spots).\nFor example, here, with a more complicated function, the intervals where the function is increasing are highlighted by passing in the function's derivative to plotif:\n\nf(x) = sin(pi*x) * (x^3 - 4x^2 + 2)\nplotif(f, f', -2, 2)\n\n\n\n\n\n27.2.1 First derivative test\nWhen a function changes from increasing to decreasing, or decreasing to increasing, it will have a peak or a valley. More formally, such points are relative extrema.\nWhen discussing the mean value theorem, we defined relative extrema:\n\n\nThe function \\(f(x)\\) has a relative maximum at \\(c\\) if the value \\(f(c)\\) is an absolute maximum for some open interval containing \\(c\\).\nSimilarly, \\(f(x)\\) has a relative minimum at \\(c\\) if the value \\(f(c)\\) is an absolute minimum for some open interval about \\(c\\).\n\n\nWe know since Fermat that:\n\nRelative maxima and minima must occur at critical points.\n\nFermat says that critical points - points where the function is defined, but its derivative is either \\(0\\) or undefined - are interesting points; however:\n\nA critical point need not indicate a relative maximum or minimum.\n\nAgain, \\(f(x)=x^3\\) provides the example at \\(x=0\\). 
This is a critical point, but clearly not a relative maximum or minimum - it is just a slight pause for a strictly increasing function.\nThis leaves the question:\n\nWhen will a critical point correspond to a relative maximum or minimum?\n\nThis question can be answered by considering the first derivative.\n\nThe first derivative test: If \\(c\\) is a critical point for \\(f(x)\\) and if \\(f'(x)\\) changes sign at \\(x=c\\), then \\(f(c)\\) will be either a relative maximum or a relative minimum.\n\nIt will be a relative maximum if the derivative changes sign from \\(+\\) to \\(-\\).\nIt will be a relative minimum if the derivative changes sign from \\(-\\) to \\(+\\).\nIf \\(f'(x)\\) does not change sign at \\(c\\), then \\(f(c)\\) is not a relative maximum or minimum.\n\n\nThe classification part should be clear: e.g., if the derivative is positive then negative, the function \\(f\\) will increase to \\((c,f(c))\\) then decrease from \\((c,f(c))\\) so \\(f\\) will have a local maximum at \\(c\\).\nOur definition of critical point assumes \\(f(c)\\) exists, as \\(c\\) is in the domain of \\(f\\). With this assumption, vertical asymptotes are avoided. However, it need not be that \\(f'(c)\\) exists. The absolute value function at \\(x=0\\) provides an example: this point is a critical point where the derivative changes sign, but \\(f'(x)\\) is not defined at exactly \\(x=0\\). Regardless, it is guaranteed that \\(f(c)\\) will be a relative minimum by the first derivative test.\n\nExample\nConsider the function \\(f(x) = e^{-\\lvert x\\rvert} \\cos(\\pi x)\\) over \\([-3,3]\\):\n\n𝐟(x) = exp(-abs(x)) * cos(pi * x)\nplotif(𝐟, 𝐟', -3, 3)\n\n\n\n\nWe can see the first derivative test in action: at the peaks and valleys - the relative extrema - the color changes. This is because \\(f'\\) is changing sign as the function changes from increasing to decreasing or vice versa.\nThis function has a critical point at \\(0\\), as can be seen. 
It corresponds to a point where the derivative does not exist. It is still identified through find_zeros, which picks up zeros and, in the case of discontinuous functions like f', zero crossings:\n\nfind_zeros(𝐟', -3, 3)\n\n7-element Vector{Float64}:\n -2.9019067380477064\n -1.9019067380477062\n -0.9019067380477064\n -0.0\n 0.9019067380477064\n 1.9019067380477062\n 2.9019067380477064\n\n\n\n\nExample\nFind all the relative maxima and minima of the function \\(f(x) = \\sin(\\pi \\cdot x) \\cdot (x^3 - 4x^2 + 2)\\) over the interval \\([-2, 2]\\).\nWe will do so numerically. For this task we first need to gather the critical points. As each of the pieces of \\(f\\) are everywhere differentiable and no quotients are involved, the function \\(f\\) will be everywhere differentiable. As such, only zeros of \\(f'(x)\\) can be critical points. We find these with\n\n𝒇(x) = sin(pi*x) * (x^3 - 4x^2 + 2)\n𝒇cps = find_zeros(𝒇', -2, 2)\n\n6-element Vector{Float64}:\n -1.6497065688663188\n -0.8472574303034358\n -0.3264827018362353\n 0.35183579595045034\n 0.8981933926622453\n 1.6165308438251855\n\n\nWe should be careful though, as find_zeros may miss zeros that are not simple or too close together. A critical point will correspond to a relative maximum or minimum only if the derivative crosses the \\(x\\) axis, so these cannot be “pauses.” As this is exactly the case we are screening for, we double check that all the critical points are accounted for by graphing the derivative:\n\nplot(𝒇', -2, 2, legend=false)\nplot!(zero)\nscatter!(𝒇cps, 0*𝒇cps)\n\n\n\n\nWe see the six zeros stored in 𝒇cps and note that at each the derivative clearly crosses the \\(x\\) axis.\nFrom this last graph of the derivative we can also characterize the graph of \\(f\\): The left-most critical point coincides with a relative minimum of \\(f\\), as the derivative changes sign from negative to positive. 
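This classification of each critical point can be automated with a short helper; a sketch (redefining the function with ASCII names, writing the derivative out by hand, and using an ad hoc step of 0.01 to sample the derivative on either side):

```julia
using Roots

fn(x)  = sin(pi*x) * (x^3 - 4x^2 + 2)
# derivative by the product rule, computed by hand
fnp(x) = pi*cos(pi*x)*(x^3 - 4x^2 + 2) + sin(pi*x)*(3x^2 - 8x)

# first derivative test: compare the sign of fnp just left and right of c
function classify(c)
    l, r = sign(fnp(c - 0.01)), sign(fnp(c + 0.01))
    l < 0 < r ? :relative_minimum : l > 0 > r ? :relative_maximum : :neither
end

[(c, classify(c)) for c in find_zeros(fnp, -2, 2)]
```

The step 0.01 is an arbitrary choice that works here because the critical points are well separated; it is not a general-purpose tolerance.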
The critical points then alternate relative maximum, relative minimum, relative maximum, relative minimum, and finally relative maximum.\n\n\nExample\nConsider the function \\(g(x) = \\sqrt{\\lvert x^2 - 1\\rvert}\\). Find the critical points and characterize them as relative extrema or not.\nWe will apply the same approach, but need to get a handle on how large the values can be. The function is a composition of three functions. We should expect that the only critical points will occur when the interior polynomial, \\(x^2-1\\), has values of interest, which is around the interval \\((-1, 1)\\). So we look to the slightly wider interval \\([-2, 2]\\):\n\ng(x) = sqrt(abs(x^2 - 1))\ngcps = find_zeros(g', -2, 2)\n\n3-element Vector{Float64}:\n -0.9999999999999999\n 0.0\n 0.9999999999999999\n\n\nWe see the three values \\(-1\\), \\(0\\), \\(1\\) that correspond to the two zeros and the relative minimum of \\(x^2 - 1\\). We could graph things, but instead we characterize these values using a sign chart. A piecewise continuous function can only change sign when it crosses \\(0\\) or jumps over \\(0\\). The derivative will be continuous, except possibly at the three values above, so is piecewise continuous.\nA sign chart picks convenient values between crossing points to test if the function is positive or negative over those intervals. When computing by hand, these would ideally be values for which the function is easily computed. 
On the computer, this isn't a concern; below the midpoint is chosen:\n\npts = sort(union(-2, gcps, 2)) # this includes the endpoints (a, b) and the critical points\ntest_pts = pts[1:end-1] + diff(pts)/2 # midpoints of intervals between pts\n[test_pts sign.(g'.(test_pts))]\n\n4×2 Matrix{Float64}:\n -1.5 -1.0\n -0.5 1.0\n 0.5 -1.0\n 1.5 1.0\n\n\nSuch values are often summarized graphically on a number line using a sign chart:\n - ∞ + 0 - ∞ + g'\n<---- -1 ----- 0 ----- 1 ---->\n(The values where the function is \\(0\\) or could jump over \\(0\\) are shown on the number line, and the sign between these points is indicated. So the first minus sign shows \\(g'(x)\\) is negative on \\((-\\infty, -1)\\), the second minus sign shows \\(g'(x)\\) is negative on \\((0,1)\\).)\nReading this we have:\n\nthe derivative changes sign from negative to positive at \\(x=-1\\), so \\(g(x)\\) will have a relative minimum.\nthe derivative changes sign from positive to negative at \\(x=0\\), so \\(g(x)\\) will have a relative maximum.\nthe derivative changes sign from negative to positive at \\(x=1\\), so \\(g(x)\\) will have a relative minimum.\n\nIn the CalculusWithJulia package there is a sign_chart function that will do such work for us, though with a different display:\n\nsign_chart(g', -2, 2)\n\n3-element Vector{NamedTuple{(:DNE_0_∞, :sign_change), Tuple{Float64, String}}}:\n (DNE_0_∞ = -0.9999999999999999, sign_change = \"- → +\")\n (DNE_0_∞ = 0.0, sign_change = \"+ → -\")\n (DNE_0_∞ = 0.9999999999999999, sign_change = \"- → +\")\n\n\n(This function numerically identifies \\(x\\)-values for the specified function which are zeros, infinities, or points where the function jumps over \\(0\\). It then shows the resulting sign pattern of the function from left to right.)\nWe did this all without graphs. But, let's look at the graph of the derivative:\n\nplot(g', -2, 2)\n\n\n\n\nWe see asymptotes at \\(x=-1\\) and \\(x=1\\)! 
These aren't zeros of \\(g'(x)\\), but rather points where \\(g'(x)\\) does not exist. The conclusion is correct - each of \\(-1\\), \\(0\\) and \\(1\\) are critical points with the identified characterization - but not for the reason that they are all zeros.\n\nplot(g, -2, 2)\n\n\n\n\nFinally, why does find_zeros find these values that are not zeros of \\(g'(x)\\)? As discussed briefly above, it uses the bisection algorithm on bracketing intervals to find zeros which are guaranteed by the intermediate value theorem, but, when applied to discontinuous functions such as g', it will also identify values where the function jumps over \\(0\\).\n\n\nExample\nConsider the function \\(f(x) = \\sin(x) - x\\). Characterize the critical points.\nWe will work symbolically for this example.\n\n@syms x\nfx = sin(x) - x\nfp = diff(fx, x)\nsolve(fp)\n\n2-element Vector{Sym}:\n 0\n 2⋅π\n\n\nWe get values of \\(0\\) and \\(2\\pi\\). Let's look at the derivative at these points:\nAt \\(x=0\\), the signs to the left and right are found by\n\nfp(-pi/2), fp(pi/2)\n\n(-1.00000000000000, -1.00000000000000)\n\n\nBoth are negative. The derivative does not change sign at \\(0\\), so the critical point is neither a relative minimum nor maximum.\nWhat about at \\(2\\pi\\)? We do something similar:\n\nfp(2pi - pi/2), fp(2pi + pi/2)\n\n(-1.00000000000000, -1.00000000000000)\n\n\nAgain, both negative. The function \\(f(x)\\) is just decreasing near \\(2\\pi\\), so again the critical point is neither a relative minimum nor maximum.\nA graph verifies this:\n\nplot(fx, -3pi, 3pi)\n\n\n\n\nWe see that at \\(0\\) and \\(2\\pi\\) there are “pauses” as the function decreases. We should also see that this pattern repeats. The critical points found by solve are only those within a certain domain. 
Any value that satisfies \\(\\cos(x) - 1 = 0\\) will be a critical point, and there are infinitely many of these of the form \\(n \\cdot 2\\pi\\) for \\(n\\) an integer.\nAs a comment, the solveset function, which is replacing solve, returns the entire collection of zeros:\n\nsolveset(fp)\n\n \n\\[\n\\left\\{2 n \\pi\\; \\middle|\\; n \\in \\mathbb{Z}\\right\\}\n\\]\n\n\n\n\nOf course, sign_chart also does this, only numerically. We just need to pick an interval wide enough to contain \\([0,2\\pi]\\):\n\nsign_chart((x -> sin(x)-x)', -3pi, 3pi)\n\n3-element Vector{NamedTuple{(:DNE_0_∞, :sign_change), Tuple{Float64, String}}}:\n (DNE_0_∞ = -6.283185297666141, sign_change = \"- → -\")\n (DNE_0_∞ = 0.0, sign_change = \"- → -\")\n (DNE_0_∞ = 6.283185307351308, sign_change = \"- → -\")\n\n\n\n\nExample\nSuppose you know \\(f'(x) = (x-1)\\cdot(x-2)\\cdot (x-3) = x^3 - 6x^2 + 11x - 6\\) and \\(g'(x) = (x-1)\\cdot(x-2)^2\\cdot(x-3)^3 = x^6 -14x^5 +80x^4-238x^3+387x^2-324x+108\\).\nHow would the graphs of \\(f(x)\\) and \\(g(x)\\) differ, as they share identical critical points?\nThe graph of \\(f(x)\\) - a function we do not have a formula for - can have its critical points characterized by the first derivative test. As the derivative changes sign at each, all critical points correspond to relative extrema. The sign pattern is negative/positive/negative/positive so we have from left to right a relative minimum, a relative maximum, and then a relative minimum. This is consistent with a \\(4\\)th degree polynomial with \\(3\\) relative extrema.\nFor the graph of \\(g(x)\\) we can apply the same analysis. Thinking for a moment, we see that, as the factor \\((x-2)^2\\) comes as a power of \\(2\\), the derivative of \\(g(x)\\) will not change sign at \\(x=2\\), so there is no relative extreme value there. However, at \\(x=3\\) the factor has an odd power, so the derivative will change sign at \\(x=3\\). 
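The reasoning about these sign changes can be confirmed numerically; a sketch that evaluates the factored form of \\(g'\\) just to either side of each root (0.1 is an ad hoc step):

```julia
# the given g′ in factored form
gp(x) = (x - 1) * (x - 2)^2 * (x - 3)^3

# sign of g′ just left and right of each root
[(z, sign(gp(z - 0.1)), sign(gp(z + 0.1))) for z in (1, 2, 3)]
# → [(1, 1.0, -1.0), (2, -1.0, -1.0), (3, -1.0, 1.0)]
```

The sign flips at \\(1\\) and \\(3\\) (odd powers) but not at \\(2\\) (even power), matching the analysis.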
So, as \\(g'(x)\\) is positive for large negative values, there will be a relative maximum at \\(x=1\\) and, as \\(g'(x)\\) is positive for large positive values, a relative minimum at \\(x=3\\).\nThe latter is consistent with a \\(7\\)th degree polynomial with positive leading coefficient. It is intuitive that since \\(g'(x)\\) is a \\(6\\)th degree polynomial, \\(g(x)\\) will be a \\(7\\)th degree one, as the power rule applied to a polynomial results in a polynomial of lesser degree by one.\nHere is a simple schematic that illustrates the above considerations.\nf' - 0 + 0 - 0 + f'-sign\n ↘ ↗ ↘ ↗ f-direction\n f-shape\n\ng' + 0 - 0 - 0 + g'-sign\n ↗ ↘ ↘ ↗ g-direction\n ∩ ~ g-shape\n<------ 1 ----- 2 ----- 3 ------>"
},
{
"objectID": "derivatives/first_second_derivatives.html#concavity",
"href": "derivatives/first_second_derivatives.html#concavity",
"title": "27  The first and second derivatives",
"section": "27.3 Concavity",
"text": "27.3 Concavity\nConsider the function \\(f(x) = x^2\\). Over this function we draw some secant lines for a few pairs of \\(x\\) values:\n\n\n\n\n\nThe graph attempts to illustrate that for this function the secant line between any two points \\(a < b\\) will lie above the graph over \\([a,b]\\).\nThis is a special property not shared by all functions. Let \\(I\\) be an open interval.\n\nConcave up: A function \\(f(x)\\) is concave up on \\(I\\) if for any \\(a < b\\) in \\(I\\), the secant line between \\(a\\) and \\(b\\) lies above the graph of \\(f(x)\\) over \\([a,b]\\).\n\nA similar definition exists for concave down where the secant lines lie below the graph. Notationally, concave up says for any \\(x\\) in \\([a,b]\\):\n\\[\nf(a) + \\frac{f(b) - f(a)}{b-a} \\cdot (x-a) \\geq f(x) \\quad\\text{ (concave up) }\n\\]\nReplacing \\(\\geq\\) with \\(\\leq\\) defines concave down, and with either \\(>\\) or \\(<\\) will add the prefix “strictly.” These definitions are useful for a general definition of convex functions.\nWe won't work with these definitions in this section; rather, we will characterize concavity for functions which have either a first or second derivative:\n\n\nIf \\(f'(x)\\) exists and is increasing on \\((a,b)\\), then \\(f(x)\\) is concave up on \\((a,b)\\).\nIf \\(f'(x)\\) is decreasing on \\((a,b)\\), then \\(f(x)\\) is concave down.\n\n\nA proof of this makes use of the same trick used to establish the mean value theorem from Rolle's theorem. Assume \\(f'\\) is increasing and let \\(g(x) = f(x) - (f(a) + M \\cdot (x-a))\\), where \\(M\\) is the slope of the secant line between \\(a\\) and \\(b\\). By construction \\(g(a) = g(b) = 0\\). If \\(f'(x)\\) is increasing, then so is \\(g'(x) = f'(x) - M\\). By its definition above, showing \\(f\\) is concave up is the same as showing \\(g(x) \\leq 0\\). Suppose to the contrary that there is a value where \\(g(x) > 0\\) in \\([a,b]\\). We show this can't be. 
Assuming \\(g'(x)\\) always exists, after some work, Rolle's theorem will ensure there is a value where \\(g'(c) = 0\\) and \\((c,g(c))\\) is a relative maximum, and as we know there is at least one positive value, it must be \\(g(c) > 0\\). The first derivative test then ensures that \\(g'(x)\\) will be positive to the left of \\(c\\) and negative to the right of \\(c\\), since \\(c\\) is at a critical point and not an endpoint. But this can't happen as \\(g'(x)\\) is assumed to be increasing on the interval.\nThe relationship between increasing functions and their derivatives - if \\(f'(x) > 0\\) on \\(I\\), then \\(f\\) is increasing on \\(I\\) - gives this second characterization of concavity when the second derivative exists:\n\n\nIf \\(f''(x)\\) exists and is positive on \\(I\\), then \\(f(x)\\) is concave up on \\(I\\).\nIf \\(f''(x)\\) exists and is negative on \\(I\\), then \\(f(x)\\) is concave down on \\(I\\).\n\n\nThis follows, as we can think of \\(f''(x)\\) as just the first derivative of the function \\(f'(x)\\), so the assumption will force \\(f'(x)\\) to exist and be increasing, and hence \\(f(x)\\) to be concave up.\n\nExample\nLet's look at the function \\(x^2 \\cdot e^{-x}\\) for positive \\(x\\). 
A quick graph shows the function is concave up, then down, then up in the region plotted:\n\nh(x) = x^2 * exp(-x)\nplotif(h, h'', 0, 8)\n\n\n\n\nFrom the graph, we would expect that the second derivative - which is continuous - would have two zeros on \\([0,8]\\):\n\nips = find_zeros(h'', 0, 8)\n\n2-element Vector{Float64}:\n 0.5857864376269049\n 3.414213562373095\n\n\nAs well, between the zeros we should have the sign pattern +, -, and +, as we verify:\n\nsign_chart(h'', 0, 8)\n\n2-element Vector{NamedTuple{(:DNE_0_∞, :sign_change), Tuple{Float64, String}}}:\n (DNE_0_∞ = 0.5857864376269049, sign_change = \"+ → -\")\n (DNE_0_∞ = 3.414213562373095, sign_change = \"- → +\")\n\n\n\n\n27.3.1 Second derivative test\nConcave up functions are “opening” up, and often clearly \\(U\\)-shaped, though that is not necessary. At a relative minimum, where there is a \\(U\\)-shape, the graph will be concave up; conversely, at a relative maximum, where the graph has a downward \\(\\cap\\)-shape, the function will be concave down. This observation becomes:\n\nThe second derivative test: If \\(c\\) is a critical point of \\(f(x)\\) with \\(f''(c)\\) existing in a neighborhood of \\(c\\), then\n\nThe value \\(f(c)\\) will be a relative minimum if \\(f''(c) > 0\\),\nThe value \\(f(c)\\) will be a relative maximum if \\(f''(c) < 0\\), and\nif \\(f''(c) = 0\\) the test is inconclusive.\n\n\nIf \\(f''(c)\\) is positive in an interval about \\(c\\), then \\(f''(c) > 0\\) implies the function is concave up at \\(x=c\\). In turn, concave up implies the derivative is increasing so must go from negative to positive at the critical point.\nThe second derivative test is inconclusive when \\(f''(c)=0\\). No such general statement exists, as there isn't enough information. For example, the function \\(f(x) = x^3\\) has \\(0\\) as a critical point, \\(f''(0)=0\\) and the value does not correspond to a relative maximum or minimum. 
On the other hand \\(f(x)=x^4\\) has \\(0\\) as a critical point, \\(f''(0)=0\\), yet \\(x=0\\) is a relative minimum.\n\nExample\nUse the second derivative test to characterize the critical points of \\(j(x) = x^5 - 2x^4 + x^3\\).\n\nj(x) = x^5 - 2x^4 + x^3\njcps = find_zeros(j', -3, 3)\n\n3-element Vector{Float64}:\n 0.0\n 0.6000000000000001\n 1.0\n\n\nWe can check the sign of the second derivative for each critical point:\n\n[jcps j''.(jcps)]\n\n3×2 Matrix{Float64}:\n 0.0 0.0\n 0.6 -0.72\n 1.0 2.0\n\n\nThat \\(j''(0.6) < 0\\) implies that at \\(0.6\\), \\(j(x)\\) will have a relative maximum. As \\(j''(1) > 0\\), the second derivative test says at \\(x=1\\) there will be a relative minimum. That \\(j''(0) = 0\\) says only that there may be a relative maximum or minimum at \\(x=0\\), as the second derivative test does not speak to this situation. (This last check, requiring a function evaluation to be 0, is susceptible to floating point errors, so isn't very robust as a general tool.)\nThis should be consistent with this graph, where \\(-0.25\\) and \\(1.25\\) are chosen to capture the zero at \\(0\\) and the two relative extrema:\n\nplotif(j, j'', -0.25, 1.25)\n\n\n\n\nFrom the graph we see that \\(0\\) is not a relative maximum or minimum. We could have seen this numerically by checking the first derivative test, and noting there is no sign change:\n\nsign_chart(j', -3, 3)\n\n3-element Vector{NamedTuple{(:DNE_0_∞, :sign_change), Tuple{Float64, String}}}:\n (DNE_0_∞ = 0.0, sign_change = \"+ → +\")\n (DNE_0_∞ = 0.6000000000000001, sign_change = \"+ → -\")\n (DNE_0_∞ = 1.0, sign_change = \"- → +\")\n\n\n\n\nExample\nOne way to visualize the second derivative test is to locally overlay on a critical point a parabola. For example, consider \\(f(x) = \\sin(x) + \\sin(2x) + \\sin(3x)\\) over \\([0,2\\pi]\\). It has \\(6\\) critical points over \\([0,2\\pi]\\). 
In this graphic, we locally layer on \\(6\\) parabolas:\n\nf(x) = sin(x) + sin(2x) + sin(3x)\np = plot(f, 0, 2pi, legend=false, color=:blue, linewidth=3)\ncps = find_zeros(f', (0, 2pi))\nΔ = 0.5\nfor c in cps\n parabola(x) = f(c) + (f''(c)/2) * (x-c)^2\n plot!(parabola, c - Δ, c + Δ, color=:red, linewidth=5, alpha=0.6)\nend\np\n\n\n\n\nThe graphic shows that for this function near the relative extrema the parabolas approximate the function well, so that the relative extrema are characterized by the relative extrema of the parabolas.\nAt each critical point \\(c\\), the parabolas have the form\n\\[\nf(c) + \\frac{f''(c)}{2}(x-c)^2.\n\\]\nThe \\(2\\) is a mystery to be answered in the section on Taylor series, the focus here is on the sign of \\(f''(c)\\):\n\nif \\(f''(c) > 0\\) then the approximating parabola opens upward and the critical point is a point of relative minimum for \\(f\\),\nif \\(f''(c) < 0\\) then the approximating parabola opens downward and the critical point is a point of relative maximum for \\(f\\), and\nwere \\(f''(c) = 0\\) then the approximating parabola is just a line the tangent line at a critical point and is non-informative about extrema.\n\nThat is, the parabola picture is just the second derivative test in this light.\n\n\n\n27.3.2 Inflection points\nAn inflection point is a value where the second derivative of \\(f\\) changes sign. At an inflection point the derivative will change from increasing to decreasing (or vice versa) and the function will change from concave up to down (or vice versa).\nWe can use the find_zeros function to identify potential inflection points by passing in the second derivative function. 
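Filtering the candidates returned by find_zeros down to genuine inflection points amounts to checking for an actual sign change of the second derivative; a sketch using a made-up example \\(f(x) = x e^{-x}\\), whose second derivative, computed by hand, is \\((x-2)e^{-x}\\):

```julia
using Roots

# second derivative of the made-up example f(x) = x*exp(-x), by hand
fpp(x) = (x - 2) * exp(-x)

candidates = find_zeros(fpp, -1, 6)
# keep only candidates where f'' genuinely changes sign (0.01 is an ad hoc step)
inflections = [c for c in candidates if sign(fpp(c - 0.01)) != sign(fpp(c + 0.01))]
# inflections ≈ [2.0]
```

A candidate where \\(f''\\) merely touches \\(0\\) without crossing would be filtered out by the sign comparison.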
For example, consider the bell-shaped function\n\\[\nk(x) = e^{-x^2/2}.\n\\]\nA graph suggests a relative maximum at \\(x=0\\), a horizontal asymptote of \\(y=0\\), and two inflection points:\n\nk(x) = exp(-x^2/2)\nplotif(k, k'', -3, 3)\n\n\n\n\nThe inflection points can be found directly, if desired, or numerically with:\n\nfind_zeros(k'', -3, 3)\n\n2-element Vector{Float64}:\n -1.0\n 1.0\n\n\n(The find_zeros function may return points which are not inflection points. It primarily returns points where \\(k''(x)\\) changes sign, but may also find points where \\(k''(x)\\) is \\(0\\) yet does not change sign at \\(x\\).)\n\nExample\nA car travels from a stop for 1 mile in 2 minutes. A graph of its position as a function of time might look like any of these graphs:\n\n\n\n\n\nAll three graphs have the same average velocity, which is just \\(1/2\\) mile per minute (\\(30\\) miles an hour). But the instantaneous velocity (which is given by the derivative of the position function) varies.\nThe graph f1 has constant velocity, so the position is a straight line with slope \\(v_0\\). The graph f2 is similar, though for the first and last 30 seconds, the car does not move, so it must move faster during the time it does move. A more realistic graph would be f3. The position increases continuously, as do the others, but the velocity changes more gradually. The initial velocity is less than \\(v_0\\), but eventually gets to be more than \\(v_0\\), then the velocity starts to increase less. At no point does the velocity change abruptly for f3, the way it does for f2 after a minute and a half.\nThe rate of change of the velocity is the acceleration. For f1 this is zero, and for f2 it is zero as well, when it is defined. However, for f3 we see the increase in velocity is positive in the first minute, but negative in the second minute. This fact relates to the concavity of the graph. As acceleration is the derivative of velocity, it is the second derivative of position - the graph we see. 
Where the acceleration is positive, the position graph will be concave up; where the acceleration is negative, the graph will be concave down. The point \\(t=1\\) is an inflection point, and would be felt by most riders."
},
{
"objectID": "derivatives/first_second_derivatives.html#questions",
"href": "derivatives/first_second_derivatives.html#questions",
"title": "27  The first and second derivatives",
"section": "27.4 Questions",
"text": "27.4 Questions\n\nQuestion\nConsider this graph:\n\nplot(airyai, -5, 0) # airyai in `SpecialFunctions` loaded with `CalculusWithJulia`\n\n\n\n\nOn what intervals (roughly) is the function positive?\n\n\n\n \n \n \n \n \n \n \n \n \n \\((-3.2,-1)\\)\n \n \n\n\n \n \n \n \n \\((-5, -4.2)\\)\n \n \n\n\n \n \n \n \n \\((-5, -4.2)\\) and \\((-2.5, 0)\\)\n \n \n\n\n \n \n \n \n \\((-4.2, -2.5)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider this graph:\n\n\n\n\n\nOn what intervals (roughly) is the function negative?\n\n\n\n \n \n \n \n \n \n \n \n \n \\((-25.0, 0.0)\\)\n \n \n\n\n \n \n \n \n \\((-4.0, -3.0)\\)\n \n \n\n\n \n \n \n \n \\((-5.0, -4.0)\\)\n \n \n\n\n \n \n \n \n \\((-5.0, -4.0)\\) and \\((-4, -3)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider this graph\n\n\n\n\n\nOn what interval(s) is this function increasing?\n\n\n\n \n \n \n \n \n \n \n \n \n \\((-3.8, -3.0)\\)\n \n \n\n\n \n \n \n \n \\((-4.7, -3.0)\\)\n \n \n\n\n \n \n \n \n \\((-0.17, 0.17)\\)\n \n \n\n\n \n \n \n \n \\((-5.0, -3.8)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider this graph\n\n\n\n\n\nOn what interval(s) is this function concave up?\n\n\n\n \n \n \n \n \n \n \n \n \n \\((-3.0, 3.0)\\)\n \n \n\n\n \n \n \n \n \\((-0.6, 0.6)\\)\n \n \n\n\n \n \n \n \n \\((-3.0, -0.6)\\) and \\((0.6, 3.0)\\)\n \n \n\n\n \n \n \n \n \\((0.1, 1.0)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIf it is known that:\n\nA function \\(f(x)\\) has critical points at \\(x=-1, 0, 1\\)\nat \\(-2\\) an \\(-1/2\\) the values are: \\(f'(-2) = 1\\) and \\(f'(-1/2) = -1\\).\n\nWhat can be concluded?\n\n\n\n \n \n \n \n \n \n \n \n \n Nothing\n \n \n\n\n \n \n \n \n That the critical point at \\(-1\\) is a relative maximum\n \n \n\n\n \n \n \n \n That the critical point at \\(-1\\) is a relative minimum\n \n \n\n\n \n \n \n \n That the critical point at \\(0\\) is a relative maximum\n \n \n\n\n \n \n \n \n That the critical point at \\(0\\) is a 
relative minimum\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nMystery function \\(f(x)\\) has \\(f'(2) = 0\\) and \\(f''(0) = 2\\). What is the most you can say about \\(x=2\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f(x)\\) is continuous at \\(2\\)\n \n \n\n\n \n \n \n \n \\(f(x)\\) is continuous and differentiable at \\(2\\)\n \n \n\n\n \n \n \n \n \\(f(x)\\) is continuous and differentiable at \\(2\\) and has a critical point\n \n \n\n\n \n \n \n \n \\(f(x)\\) is continuous and differentiable at \\(2\\) and has a critical point that is a relative minimum by the second derivative test\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the smallest critical point of \\(f(x) = x^3 e^{-x}\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nHow many critical points does \\(f(x) = x^5 - x + 1\\) have?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nHow many inflection points does \\(f(x) = x^5 - x + 1\\) have?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nAt \\(c\\), \\(f'(c) = 0\\) and \\(f''(c) = 1 + c^2\\). Is \\((c,f(c))\\) a relative maximum? (\\(f\\) is a “nice” function.)\n\n\n\n \n \n \n \n \n \n \n \n \n No, it is a relative minimum\n \n \n\n\n \n \n \n \n No, the second derivative test is possibly inconclusive\n \n \n\n\n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nAt \\(c\\), \\(f'(c) = 0\\) and \\(f''(c) = c^2\\). Is \\((c,f(c))\\) a relative minimum? (\\(f\\) is a “nice” function.)\n\n\n\n \n \n \n \n \n \n \n \n \n No, it is a relative maximum\n \n \n\n\n \n \n \n \n No, the second derivative test is possibly inconclusive if \\(c=0\\), but otherwise yes\n \n \n\n\n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\n\n\n\n\n\nThe graph shows \\(f'(x)\\). 
Is it possible that \\(f(x) = e^{-x} \\sin(\\pi x)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n(Plot \\(f(x)\\) and compare features like critical points, increasing decreasing to that indicated by \\(f'\\) through the graph.)\n\n\nQuestion\n\n\n\n\n\nThe graph shows \\(f'(x)\\). Is it possible that \\(f(x) = x^4 - 3x^3 - 2x + 4\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\n\n\n\n\n\nThe graph shows \\(f''(x)\\). Is it possible that \\(f(x) = (1+x)^{-2}\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\n\n\n\n\n\nThis plot shows the graph of \\(f'(x)\\). What is true about the critical points and their characterization?\n\n\n\n \n \n \n \n \n \n \n \n \n The critical points are at \\(x=1\\) (a relative minimum), \\(x=2\\) (not a relative extrema), and \\(x=3\\) (a relative minimum).\n \n \n\n\n \n \n \n \n The critical points are at \\(x=1\\) (a relative minimum), \\(x=2\\) (a relative minimum), and \\(x=3\\) (a relative minimum).\n \n \n\n\n \n \n \n \n The critical points are at \\(x=1\\) (a relative maximum), \\(x=2\\) (not a relative extrema), and \\(x=3\\) (not a relative extrema).\n \n \n\n\n \n \n \n \n The critical points are at \\(x=1\\) (a relative minimum), \\(x=2\\) (not a relative extrema), and \\(x=3\\) (not a relative extrema).\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nYou know \\(f''(x) = (x-1)^3\\). 
What do you know about \\(f(x)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n The function is decreasing over \\((-\\infty, 1)\\) and increasing over \\((1, \\infty)\\)\n \n \n\n\n \n \n \n \n The function is negative over \\((-\\infty, 1)\\) and positive over \\((1, \\infty)\\)\n \n \n\n\n \n \n \n \n The function is concave down over \\((-\\infty, 1)\\) and concave up over \\((1, \\infty)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhile driving we accelerate to get through a light before it turns red. However, at time \\(t_0\\) a car cuts in front of us and we are forced to break. If \\(s(t)\\) represents position, what is \\(t_0\\):\n\n\n\n \n \n \n \n \n \n \n \n \n A zero of the function\n \n \n\n\n \n \n \n \n A critical point for the function\n \n \n\n\n \n \n \n \n An inflection point for the function\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nQuestion\nThe investopedia website describes:\n“An inflection point is an event that results in a significant change in the progress of a company, industry, sector, economy, or geopolitical situation and can be considered a turning point after which a dramatic change, with either positive or negative results, is expected to result.”\nThis accurately summarizes how the term is used outside of math books. Does it also describe how the term is used inside math books?\n\nchoices = [\"Yes. Same words, same meaning\",\n \"\"\"No, but it is close. An inflection point is when the *acceleration* changes from positive to negative, so if \"results\" are about how a company's rate of change is changing, then it is in the ballpark.\"\"\"]\nradioq(choices, 2)\n\n\n \n \n \n \n \n \n \n \n \n No, but it is close. An inflection point is when the acceleration changes from positive to negative, so if \"results\" are about how a company's rate of change is changing, then it is in the ballpark.\n \n \n\n\n \n \n \n \n Yes. 
Same words, same meaning\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe function \\(f(x) = x^3 + x^4\\) has a critical point at \\(0\\) and a second derivative of \\(0\\) at \\(x=0\\). Without resorting to the first derivative test, and only considering that near \\(x=0\\) the function \\(f(x)\\) is essentially \\(x^3\\), as \\(f(x) = x^3(1+x)\\), what can you say about whether the critical point is a relative extrema?\n\n\n\n \n \n \n \n \n \n \n \n \n As \\(x^3\\) has no extrema at \\(x=0\\), neither will \\(f\\)\n \n \n\n\n \n \n \n \n As \\(x^4\\) is of higher degree than \\(x^3\\), \\(f\\) will be \\(U\\)-shaped, as \\(x^4\\) is."
},
{
"objectID": "derivatives/curve_sketching.html",
"href": "derivatives/curve_sketching.html",
"title": "28  Curve Sketching",
"section": "",
"text": "This section uses the following add-on packages:\nThe figure illustrates a means to sketch a sine curve - identify as many of the following values as you can:\nWith these, a sketch fills in between the points/lines associated with these values.\nThough this approach is most useful for hand-sketches, the underlying concepts are important for properly framing graphs made with the computer.\nWe can easily make a graph of a function over a specified interval. What is not always so easy is to pick an interval that shows off the features of interest. In the section on rational functions there was a discussion about how to draw graphs for rational functions so that horizontal and vertical asymptotes can be seen. These are properties of the “large.” In this section, we build on this, but concentrate now on more local properties of a function."
},
{
"objectID": "derivatives/curve_sketching.html#questions",
"href": "derivatives/curve_sketching.html#questions",
"title": "28  Curve Sketching",
"section": "28.1 Questions",
"text": "28.1 Questions\n\nQuestion\nConsider this graph\n\n\n\n\n\nWhat kind of asymptotes does it appear to have?\n\n\n\n \n \n \n \n \n \n \n \n \n Just a horizontal asymptote, \\(y=0\\)\n \n \n\n\n \n \n \n \n Just vertical asymptotes at \\(x=-1\\) and \\(x=1\\)\n \n \n\n\n \n \n \n \n Vertical asymptotes at \\(x=-1\\) and \\(x=1\\) and a horizontal asymptote \\(y=1\\)\n \n \n\n\n \n \n \n \n Vertical asymptotes at \\(x=-1\\) and \\(x=1\\) and a slant asymptote\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the function \\(p(x) = x + 2x^3 + 3x^3 + 4x^4 + 5x^5 +6x^6\\). Which interval shows more than a \\(U\\)-shaped graph that dominates for large \\(x\\) due to the leading term being \\(6x^6\\)?\n(Find an interval that contains the zeros, critical points, and inflection points.)\n\n\n\n \n \n \n \n \n \n \n \n \n \\((-5,5)\\), the default bounds of a calculator\n \n \n\n\n \n \n \n \n \\((-3.5, 3.5)\\), the bounds given by Cauchy for the real roots of \\(p\\)\n \n \n\n\n \n \n \n \n \\((-1, 1)\\), as many special polynomials have their roots in this interval\n \n \n\n\n \n \n \n \n \\((-1.1, .25)\\), as this constains all the roots, the critical points, and inflection points and just a bit more\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x) = x^3/(9-x^2)\\).\nWhat points are not in the domain of \\(f\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n The values of find_zeros(f, -10, 10): [-3, 0, 3]\n \n \n\n\n \n \n \n \n The values of find_zeros(f', -10, 10): [-5.19615, 0, 5.19615]\n \n \n\n\n \n \n \n \n The values of find_zeros(f'', -10, 10): [-3, 0, 3]\n \n \n\n\n \n \n \n \n The zeros of the numerator: [0]\n \n \n\n\n \n \n \n \n The zeros of the denominator: [-3, 3]\n \n \n\n\n \n \n \n \n The value of f(0): 0\n \n \n\n\n \n \n \n \n None of these choices\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe \\(x\\)-intercepts are:\n\n\n\n \n \n \n \n \n \n \n \n \n The values of find_zeros(f, -10, 10): [-3, 0, 3]\n \n \n\n\n \n \n \n \n The 
values of find_zeros(f', -10, 10): [-5.19615, 0, 5.19615]\n \n \n\n\n \n \n \n \n The values of find_zeros(f'', -10, 10): [-3, 0, 3]\n \n \n\n\n \n \n \n \n The zeros of the numerator: [0]\n \n \n\n\n \n \n \n \n The zeros of the denominator: [-3, 3]\n \n \n\n\n \n \n \n \n The value of f(0): 0\n \n \n\n\n \n \n \n \n None of these choices\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe \\(y\\)-intercept is:\n\n\n\n \n \n \n \n \n \n \n \n \n The values of find_zeros(f, -10, 10): [-3, 0, 3]\n \n \n\n\n \n \n \n \n The values of find_zeros(f', -10, 10): [-5.19615, 0, 5.19615]\n \n \n\n\n \n \n \n \n The values of find_zeros(f'', -10, 10): [-3, 0, 3]\n \n \n\n\n \n \n \n \n The zeros of the numerator: [0]\n \n \n\n\n \n \n \n \n The zeros of the denominator: [-3, 3]\n \n \n\n\n \n \n \n \n The value of f(0): 0\n \n \n\n\n \n \n \n \n None of these choices\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThere are vertical asymptotes at \\(x=\\dots\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n The value of f(0): 0\n \n \n\n\n \n \n \n \n The values of find_zeros(f, -10, 10): [-3, 0, 3]\n \n \n\n\n \n \n \n \n None of these choices\n \n \n\n\n \n \n \n \n The values of find_zeros(f', -10, 10): [-5.19615, 0, 5.19615]\n \n \n\n\n \n \n \n \n The zeros of the denominator: [-3, 3]\n \n \n\n\n \n \n \n \n The zeros of the numerator: [0]\n \n \n\n\n \n \n \n \n The values of find_zeros(f'', -10, 10): [-3, 0, 3]\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe slant asymptote has slope?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe function has critical points at\n\nradioq(qchoices, 2, keep_order=true)\n\n\n \n \n \n \n \n \n \n \n \n The values of find_zeros(f, -10, 10): [-3, 0, 3]\n \n \n\n\n \n \n \n \n The values of find_zeros(f', -10, 10): [-5.19615, 0, 5.19615]\n \n \n\n\n \n \n \n \n The values of find_zeros(f'', -10, 10): [-3, 0, 3]\n \n \n\n\n \n \n \n \n The zeros of the numerator: [0]\n \n \n\n\n \n \n \n \n The zeros of the denominator: [-3, 3]\n \n \n\n\n \n \n \n \n The 
value of f(0): 0\n \n \n\n\n \n \n \n \n None of these choices\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe function has relative extrema at\n\n\n\n \n \n \n \n \n \n \n \n \n The values of find_zeros(f, -10, 10): [-3, 0, 3]\n \n \n\n\n \n \n \n \n The values of find_zeros(f', -10, 10): [-5.19615, 0, 5.19615]\n \n \n\n\n \n \n \n \n The values of find_zeros(f'', -10, 10): [-3, 0, 3]\n \n \n\n\n \n \n \n \n The zeros of the numerator: [0]\n \n \n\n\n \n \n \n \n The zeros of the denominator: [-3, 3]\n \n \n\n\n \n \n \n \n The value of f(0): 0\n \n \n\n\n \n \n \n \n None of these choices\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe function has inflection points at\n\n\n\n \n \n \n \n \n \n \n \n \n The values of find_zeros(f, -10, 10): [-3, 0, 3]\n \n \n\n\n \n \n \n \n The values of find_zeros(f', -10, 10): [-5.19615, 0, 5.19615]\n \n \n\n\n \n \n \n \n The values of find_zeros(f'', -10, 10): [-3, 0, 3]\n \n \n\n\n \n \n \n \n The zeros of the numerator: [0]\n \n \n\n\n \n \n \n \n The zeros of the denominator: [-3, 3]\n \n \n\n\n \n \n \n \n The value of f(0): 0\n \n \n\n\n \n \n \n \n None of these choices\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA function \\(f\\) has\n\nzeros of \\(\\{-0.7548\\dots, 2.0\\}\\),\ncritical points at \\(\\{-0.17539\\dots, 1.0, 1.42539\\dots\\}\\),\ninflection points at \\(\\{0.2712\\dots,1.2287\\}\\).\n\nIs this a possible graph of \\(f\\)?\n\n\n\n\n\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nTwo models for population growth are exponential growth: \\(P(t) = P_0 a^t\\) and logistic growth: \\(P(t) = K P_0 a^t / (K + P_0(a^t - 1))\\). The exponential growth model has growth rate proportional to the current population. The logistic model has growth rate depending on the current population and the available resources (which can limit growth).\nLet \\(K=50\\), \\(P_0=5\\), and \\(a= e^{1/4}\\). 
A plot over \\([0,5]\\) shows somewhat similar behaviour:\n\nK, P0, a = 50, 5, exp(1/4)\nexponential_growth(t) = P0 * a^t\nlogistic_growth(t) = K * P0 * a^t / (K + P0*(a^t-1))\n\nplot(exponential_growth, 0, 5)\nplot!(logistic_growth)\n\n\n\n\nDoes a plot over \\([0,50]\\) show qualitatively similar behaviour?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nExponential growth has \\(P''(t) = P_0 a^t \\log(a)^2 > 0\\), so has no inflection point. By plotting over a sufficiently wide interval, can you answer: does the logistic growth model have an inflection point?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIf yes, find it numerically:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe available resources are quantified by \\(K\\). As \\(K \\rightarrow \\infty\\) what is the limit of the logistic growth model:\n\n\n\n \n \n \n \n \n \n \n \n \n The limit is \\(P_0\\)\n \n \n\n\n \n \n \n \n The exponential growth model\n \n \n\n\n \n \n \n \n The limit does not exist\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe plotting algorithm for plotting functions starts with a small initial set of points over the specified interval (\\(21\\)) and then refines those sub-intervals where the second derivative is determined to be large.\nWhy are sub-intervals where the second derivative is large different than those where the second derivative is small?\n\n\n\n \n \n \n \n \n \n \n \n \n The function will be much larger (in absolute value) when the second derivative is large, so there needs to be more points to capture the shape\n \n \n\n\n \n \n \n \n The function will increase (or decrease) rapidly when the second derivative is large, so there needs to be more points to capture the shape\n \n \n\n\n \n \n \n \n The function will have more curvature when the second derivative is large, so there needs to be more points to capture the shape\n 
\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIs there a nice algorithm to identify what domain a function should be plotted over to produce an informative graph? Wilkinson has some suggestions. (Wilkinson is well known to the R community as the specifier of the grammar of graphics.) It is mentioned that “finding an informative domain for a given function depends on at least three features: periodicity, asymptotics, and monotonicity.”\nWhy would periodicity matter?\n\n\n\n \n \n \n \n \n \n \n \n \n An informative graph only needs to show one or two periods, as others can be inferred.\n \n \n\n\n \n \n \n \n An informative graph need only show a part of the period, as the rest can be inferred.\n \n \n\n\n \n \n \n \n An informative graph needs to show several periods, as that will allow proper computation for the \\(y\\) axis range.\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhy should asymptotics matter?\n\n\n\n \n \n \n \n \n \n \n \n \n A horizontal asymptote must be plotted from \\(-\\infty\\) to \\(\\infty\\)\n \n \n\n\n \n \n \n \n A slant asymptote must be plotted over a very wide domain so that it can be identified.\n \n \n\n\n \n \n \n \n A vertical asymptote can distory the \\(y\\) range, so it is important to avoid too-large values\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nMonotonicity means increasing or decreasing. This is important for what reason?\n\n\n\n \n \n \n \n \n \n \n \n \n For monotonic regions, the function will have a vertical asymptote, so the region should not be plotted\n \n \n\n\n \n \n \n \n For monotonic regions, a large slope or very concave function might require more care to plot\n \n \n\n\n \n \n \n \n For monotonic regions, a function is basically a straight line"
},
{
"objectID": "derivatives/linearization.html",
"href": "derivatives/linearization.html",
"title": "29  Linearization",
"section": "",
"text": "This section uses these add-on packages:\nThe derivative of \\(f(x)\\) has the interpretation as the slope of the tangent line. The tangent line is the line that best approximates the function at the point.\nUsing the point-slope form of a line, we see that the tangent line to the graph of \\(f(x)\\) at \\((c,f(c))\\) is given by:\n\\[\ny = f(c) + f'(c) \\cdot (x - c).\n\\]\nThis is written as an equation, though we prefer to work with functions within Julia. Here we write such a function as an operator - it takes a function f and returns a function representing the tangent line.\n(Recall, the -> indicates that an anonymous function is being generated.)\nThis function along with the f' notation for automatic derivatives is defined in the CalculusWithJulia package.\nWe make some graphs with tangent lines:\nThe graph shows that near the point, the line and function are close, but this need not be the case away from the point. We can express this informally as\n\\[\nf(x) \\approx f(c) + f'(c) \\cdot (x-c)\n\\]\nwith the understanding this applies for \\(x\\) “close” to \\(c\\).\nUsually for the applications herein, instead of \\(x\\) and \\(c\\) the two points are \\(x+\\Delta_x\\) and \\(x\\). This gives:\nThis section gives some implications of this fact and quantifies what “close” can mean."
},
{
"objectID": "derivatives/linearization.html#numeric-approximations",
"href": "derivatives/linearization.html#numeric-approximations",
"title": "29  Linearization",
"section": "29.1 Numeric approximations",
"text": "29.1 Numeric approximations\n\n\n\n\n\nThe plot shows the tangent line with slope \\(dy/dx\\) and the actual change in \\(y\\), \\(\\Delta y\\), for some specified \\(\\Delta x\\). The small gap above the sine curve is the error if the value of the sine were approximated using the drawn tangent line. We can see that approximating the value of \\(\\Delta y = \\sin(c+\\Delta x) - \\sin(c)\\) with the often easier to compute \\((dy/dx) \\cdot \\Delta x = f'(c)\\Delta x\\), for small enough values of \\(\\Delta x\\), is not going to be too far off provided \\(\\Delta x\\) is not too large.\nThis approximation is known as linearization. It can be used both in theoretical computations and in practical applications. To see how effective it is, we look at some examples.\n\nExample\nIf \\(f(x) = \\sin(x)\\), \\(c=0\\), and \\(\\Delta x= 0.1\\), then the values for the actual change in the function values and the value of \\(\\Delta y\\) are:\n\nf(x) = sin(x)\nc, deltax = 0, 0.1\nf(c + deltax) - f(c), f'(c) * deltax\n\n(0.09983341664682815, 0.1)\n\n\nThe values are pretty close. But what is \\(0.1\\) radians? Let's use degrees. Suppose we have \\(\\Delta x = 10^\\circ\\):\n\ndeltax⁰ = 10*pi/180\nactual = f(c + deltax⁰) - f(c)\napprox = f'(c) * deltax⁰\nactual, approx\n\n(0.17364817766693033, 0.17453292519943295)\n\n\nThey agree until the third decimal value. The percentage error is just \\(1/2\\) a percent:\n\n(approx - actual) / actual * 100\n\n0.5095057975210231\n\n\n\n\n29.1.1 Relative error or relative change\nThe relative error is defined by\n\\[\n\\big| \\frac{\\text{actual} - \\text{approximate}}{\\text{actual}} \\big|.\n\\]\nHowever, typically with linearization, we talk about the relative change, not relative error, as the denominator is easier to compute. 
This is\n\\[\n\\frac{f(x + \\Delta_x) - f(x)}{f(x)} = \\frac{\\Delta_y}{f(x)} \\approx\n\\frac{f'(x) \\cdot \\Delta_x}{f(x)}\n\\]\nThe percentage change multiplies by \\(100\\).\n\nExample\nWhat is the relative change in surface area of a sphere if the radius changes from \\(r\\) to \\(r + dr\\)?\nWe have \\(S = 4\\pi r^2\\), so the approximate relative change, \\(dS/S\\), is given, using the derivative \\(dS/dr = 8\\pi r\\), by\n\\[\n\\frac{8\\pi\\cdot r\\cdot dr}{4\\pi r^2} = \\frac{2dr}{r}.\n\\]\n\nExample\nWe are traveling \\(60\\) miles. At \\(60\\) miles an hour, we will take \\(60\\) minutes (or one hour). How long will it take at \\(70\\) miles an hour? (Assume you can't divide, but, instead, can only multiply!)\nWell, the answer is \\(60/70\\) hours or \\(60/70 \\cdot 60\\) minutes. But we can't divide, so we turn this into a multiplication problem via some algebra:\n\\[\n\\frac{60}{70} = \\frac{60}{60 + 10} = \\frac{1}{1 + 10/60} = \\frac{1}{1 + 1/6}.\n\\]\nOkay, so far no calculator was needed. We wrote \\(70 = 60 + 10\\), as we know that \\(60/60\\) is just \\(1\\). This almost gets us there. If we really don't want to divide, we can get an answer by using the tangent line approximation for \\(1/(1+x)\\) around \\(x=0\\). This is \\(1/(1+x) \\approx 1 - x\\). (You can check by finding that \\(f'(0) = -1\\).) Thus, our answer is approximately \\(5/6\\) of an hour, or 50 minutes.\nHow much in error are we?\n\nabs(50 - 60/70*60) / (60/70*60) * 100\n\n2.7777777777777684\n\n\nThat's about \\(3\\) percent. Not bad considering we could have done all the above in our head while driving without taking our eyes off the road to use the calculator on our phone for a division.\n\nExample\nA \\(10\\)cm by \\(10\\)cm by \\(10\\)cm cube will contain \\(1\\) liter (\\(1000\\)cm\\(^3\\)). In manufacturing such a cube, the side lengths are actually \\(10.1\\) cm. What will be the volume in liters? 
Compute this with a linear approximation to \\((10.1)^3\\).\nHere \\(f(x) = x^3\\) and we are asked to approximate \\(f(10.1)\\). Letting \\(c=10\\), we have:\n\\[\nf(c + \\Delta) \\approx f(c) + f'(c) \\cdot \\Delta = 1000 + f'(c) \\cdot (0.1)\n\\]\nComputing the derivative can be done easily; we get for our answer:\n\nfp(x) = 3*x^2\nc₀, Delta = 10, 0.1\napprox₀ = 1000 + fp(c₀) * Delta\n\n1030.0\n\n\nThe relative error, as a percent, is:\n\nactual₀ = 10.1^3\n(actual₀ - approx₀)/actual₀ * 100\n\n0.029214763452615394\n\n\nThe manufacturer may be interested instead in comparing the volume of the actual object to the \\(1\\) liter target. They might use the approximate value for this comparison, which would yield:\n\n(1000 - approx₀)/approx₀ * 100\n\n-2.912621359223301\n\n\nThis is off by about \\(3\\) percent. Not so bad for some applications, devastating for others.\n\nExample: Eratosthenes and the circumference of the earth\nEratosthenes is said to have been the first person to estimate the radius (or, by relation, the circumference) of the earth. The basic idea is based on the difference of shadows cast by the sun. Suppose Eratosthenes sized the circumference as \\(252,000\\) stadia. Taking \\(1\\) stadia as \\(160\\) meters and the actual radius of the earth as \\(6378.137\\) kilometers, we can convert to see that Eratosthenes estimated the radius as \\(6417\\) kilometers.\nIf Eratosthenes were to have estimated the volume of a spherical earth, what would be his approximate percentage change between his estimate and the actual?\nUsing \\(V = 4/3 \\pi r^3\\) we get \\(V' = 4\\pi r^2\\):\n\nrₑ = 6417\nrₐ = 6378.137\nΔᵣ = rₑ - rₐ\nVₛ(r) = 4/3 * pi * r^3\nΔᵥ = Vₛ'(rₑ) * Δᵣ\nΔᵥ / Vₛ(rₑ) * 100\n\n1.8168770453483072\n\n\n\n\nExample: a simple pendulum\nA simple pendulum is comprised of a massless “bob” on a rigid “rod” of length \\(l\\). The rod swings back and forth making an angle \\(\\theta\\) with the perpendicular. 
At rest \\(\\theta=0\\); here we have \\(\\theta\\) swinging with \\(\\lvert\\theta\\rvert \\leq \\theta_0\\) for some \\(\\theta_0\\).\nAccording to Wikipedia - and many introductory physics books - while swinging, the angle \\(\\theta\\) varies with time following this equation:\n\\[\n\\theta''(t) + \\frac{g}{l} \\sin(\\theta(t)) = 0.\n\\]\nThat is, the second derivative of \\(\\theta\\) is proportional to the sine of \\(\\theta\\), where the proportionality constant involves \\(g\\) from gravity and the length of the “rod.”\nThis would be much easier if the second derivative were proportional to the angle \\(\\theta\\) and not its sine.\nHuygens used the approximation of \\(\\sin(x) \\approx x\\), noted above, to say that when the angle is not too big, the pendulum's swing obeys \\(\\theta''(t) = -g/l \\cdot \\theta(t)\\). Without getting too involved in why, we can verify by taking two derivatives that \\(\\theta_0\\sin(\\sqrt{g/l}\\cdot t)\\) will be a solution to this modified equation.\nWith this solution, the motion is periodic with constant amplitude (assuming frictionless behaviour), as the sine function is. More surprisingly, the period is found from \\(T = 2\\pi/(\\sqrt{g/l}) = 2\\pi \\sqrt{l/g}\\). It depends on \\(l\\) - longer “rods” take more time to swing back and forth - but does not depend on how wide the pendulum swings (provided \\(\\theta_0\\) is not so big that the approximation of \\(\\sin(x) \\approx x\\) fails). This latter fact may be surprising, though not to Galileo, who discovered it."
},
{
"objectID": "derivatives/linearization.html#differentials",
"href": "derivatives/linearization.html#differentials",
"title": "29  Linearization",
"section": "29.2 Differentials",
"text": "29.2 Differentials\nThe Leibniz notation for a derivative is \\(dy/dx\\), indicating the change in \\(y\\) as \\(x\\) changes. It proves convenient to decouple this using differentials \\(dx\\) and \\(dy\\). What do these notations mean? They measure change along the tangent line in the same way \\(\\Delta_x\\) and \\(\\Delta_y\\) measure change for the function. The differential \\(dy\\) depends on both \\(x\\) and \\(dx\\), it being defined by \\(dy=f'(x)dx\\). As tangent lines locally represent a function, \\(dy\\) and \\(dx\\) are often associated with an infinitesimal difference.\nTaking \\(dx = \\Delta_x\\), as in the previous graphic, we can compare \\(dy\\), the change along the tangent line given by \\(dy/dx \\cdot dx\\), and \\(\\Delta_y\\), the change along the function given by \\(f(x + \\Delta_x) - f(x)\\). The linear approximation, \\(f(x + \\Delta_x) - f(x)\\approx f'(x)dx\\), says that\n\\[\n\\Delta_y \\approx dy; \\quad \\text{ when } \\Delta_x = dx\n\\]"
},
{
"objectID": "derivatives/linearization.html#the-error-in-approximation",
"href": "derivatives/linearization.html#the-error-in-approximation",
"title": "29  Linearization",
"section": "29.3 The error in approximation",
"text": "29.3 The error in approximation\nHow good is the approximation? Graphically we can see it is pretty good for the graphs we choose, but are there graphs out there for which the approximation is not so good? Of course. However, we can say this (the Lagrange form of a more general Taylor remainder theorem):\n\nLet \\(f(x)\\) be twice differentiable on \\(I=(a,b)\\) and continuous on \\([a,b]\\), with \\(a < c < b\\). Then for any \\(x\\) in \\(I\\), there exists some value \\(\\xi\\) between \\(c\\) and \\(x\\) such that \\(f(x) = f(c) + f'(c)(x-c) + (f''(\\xi)/2)\\cdot(x-c)^2\\).\n\nThat is, the error is basically a constant depending on the concavity of \\(f\\) times a quadratic function centered at \\(c\\).\nFor \\(\\sin(x)\\) at \\(c=0\\) we get \\(\\lvert\\sin(x) - x\\rvert = \\lvert-\\sin(\\xi)\\cdot x^2/2\\rvert\\). Since \\(\\lvert\\sin(\\xi)\\rvert \\leq 1\\), we must have this bound: \\(\\lvert\\sin(x) - x\\rvert \\leq x^2/2\\).\nCan we verify? Let's do so graphically:\n\nh(x) = abs(sin(x) - x)\ng(x) = x^2/2\nplot(h, -2, 2, label=\"h\")\nplot!(g, -2, 2, label=\"g\")\n\n\n\n\nThe graph shows a tight bound near \\(0\\) and then a bound over this viewing window.\nSimilarly, for \\(f(x) = \\log(1 + x)\\) we have the following at \\(c=0\\):\n\\[\nf'(x) = 1/(1+x), \\quad f''(x) = -1/(1+x)^2.\n\\]\nSo, as \\(f(c)=0\\) and \\(f'(c) = 1\\), we have\n\\[\n\\lvert f(x) - x\\rvert \\leq \\lvert f''(\\xi)\\rvert \\cdot \\frac{x^2}{2}\n\\]\nWe see that \\(\\lvert f''(x)\\rvert\\) is decreasing for \\(x > -1\\). So if \\(-1 < x < c\\) we have\n\\[\n\\lvert f(x) - x\\rvert \\leq \\lvert f''(x)\\rvert \\cdot \\frac{x^2}{2} = \\frac{x^2}{2(1+x)^2}.\n\\]\nAnd for \\(c=0 < x\\), we have\n\\[\n\\lvert f(x) - x\\rvert \\leq \\lvert f''(0)\\rvert \\cdot \\frac{x^2}{2} = x^2/2.\n\\]\nPlotting, we verify the bound on \\(|\\log(1+x)-x|\\):\n\nh(x) = abs(log(1+x) - x)\ng(x) = x < 0 ? 
x^2/(2*(1+x)^2) : x^2/2\nplot(h, -0.5, 2, label=\"h\")\nplot!(g, -0.5, 2, label=\"g\")\n\n\n\n\nAgain, we see the very close bound near \\(0\\), which widens at the edges of the viewing window.\n\n29.3.1 Why is the remainder term as it is?\nTo see formally why the remainder is as it is, we recall the mean value theorem in the extended form of Cauchy. Suppose \\(c=0\\), \\(x > 0\\), and let \\(h(x) = f(x) - (f(0) + f'(0) x)\\) and \\(g(x) = x^2\\). Then we have that there exists an \\(e\\) with \\(0 < e < x\\) such that\n\\[\n\\text{error} = h(x) - h(0) = (g(x) - g(0)) \\frac{h'(e)}{g'(e)} = x^2 \\cdot \\frac{1}{2} \\cdot \\frac{f'(e) - f'(0)}{e} =\nx^2 \\cdot \\frac{1}{2} \\cdot f''(\\xi).\n\\]\nThe value of \\(\\xi\\), from the mean value theorem applied to \\(f'(x)\\), satisfies \\(0 < \\xi < e < x\\), so is in \\([0,x].\\)\n\n\n29.3.2 The big (and small) “oh”\nSymPy can find the tangent line expression as a special case of its series function (which implements Taylor series). The series function needs an expression to approximate; a variable specified, as there may be parameters in the expression; a value \\(c\\) for where the expansion is taken, with default \\(0\\); and a number of terms, for this example \\(2\\) for a constant and linear term. (There is also an optional dir argument for one-sided expansions.)\nHere we see the answer provided for \\(e^{\\sin(x)}\\):\n\n@syms x\nseries(exp(sin(x)), x, 0, 2)\n\n \n\\[\n1 + x + O\\left(x^{2}\\right)\n\\]\n\n\n\nThe expression \\(1 + x\\) comes from the fact that exp(sin(0)) is \\(1\\), and the derivative exp(sin(0)) * cos(0) is also \\(1\\). But what is the \\(\\mathcal{O}(x^2)\\)?\nWe know the answer is precisely \\(f''(\\xi)/2 \\cdot x^2\\) for some \\(\\xi\\), but if we are only concerned about the scale as \\(x\\) goes to zero, then, when \\(f''\\) is continuous, the error divided by \\(x^2\\) goes to some finite value (\\(f''(0)/2\\)). 
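That limiting ratio can be checked numerically. A minimal sketch (the name tangentline is chosen here for illustration; the tangent line \\(1+x\\) is read off the series output above, and \\(f''(0)/2 = 1/2\\) for this function):

```julia
# Numeric check: (f(x) - tangentline(x)) / x^2 should stay bounded as x -> 0,
# settling near f''(0)/2 = 1/2 for f(x) = exp(sin(x))
f(x) = exp(sin(x))
tangentline(x) = 1 + x   # tangent line at c = 0
ratios = [(f(x) - tangentline(x)) / x^2 for x in (0.1, 0.01, 0.001)]
```

The ratios settle near \\(0.5\\), consistent with the claim that the error is “big oh” of \\(x^2\\).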
More generally, if the error divided by \\(x^2\\) is bounded as \\(x\\) goes to \\(0\\), then we say the error is “big oh” of \\(x^2\\).\nThe big “oh” notation, \\(f(x) = \\mathcal{O}(g(x))\\), says that the ratio \\(f(x)/g(x)\\) is bounded as \\(x\\) goes to \\(0\\) (or some other value \\(c\\), depending on the context). A little “oh” (e.g., \\(f(x) = \\mathcal{o}(g(x))\\)) means that the limit of \\(f(x)/g(x)\\) is \\(0\\), as \\(x\\rightarrow 0\\), a much stronger assertion.\nBig “oh” and little “oh” give us a sense of how good an approximation is without being bogged down in the details of the exact value. As such they are useful guides in focusing on what is primary and what is secondary. Applying this to our case, we have this rough form of the tangent line approximation valid for functions having a continuous second derivative at \\(c\\):\n\\[\nf(x) = f(c) + f'(c)(x-c) + \\mathcal{O}((x-c)^2).\n\\]\n\nExample: the algebra of tangent line approximations\nSuppose \\(f(x)\\) and \\(g(x)\\) are represented by their tangent lines about \\(c\\), respectively:\n\\[\n\\begin{align*}\nf(x) &= f(c) + f'(c)(x-c) + \\mathcal{O}((x-c)^2), \\\\\ng(x) &= g(c) + g'(c)(x-c) + \\mathcal{O}((x-c)^2).\n\\end{align*}\n\\]\nConsider the sum; after rearranging we have:\n\\[\n\\begin{align*}\nf(x) + g(x) &= \\left(f(c) + f'(c)(x-c) + \\mathcal{O}((x-c)^2)\\right) + \\left(g(c) + g'(c)(x-c) + \\mathcal{O}((x-c)^2)\\right)\\\\\n&= \\left(f(c) + g(c)\\right) + \\left(f'(c)+g'(c)\\right)(x-c) + \\mathcal{O}((x-c)^2).\n\\end{align*}\n\\]\nThe two big “Oh” terms become just one as the sum of a constant times \\((x-c)^2\\) plus a constant times \\((x-c)^2\\) is just some other constant times \\((x-c)^2\\). 
What we can read off from this is the term multiplying \\((x-c)\\) is just the derivative of \\(f(x) + g(x)\\) (from the sum rule), so this too is a tangent line approximation.\nIs it a coincidence that a basic algebraic operation with tangent line approximations produces a tangent line approximation? Lets try multiplication:\n\\[\n\\begin{align*}\nf(x) \\cdot g(x) &= [f(c) + f'(c)(x-c) + \\mathcal{O}((x-c)^2)] \\cdot [g(c) + g'(c)(x-c) + \\mathcal{O}((x-c)^2)]\\\\\n&=[f(c) + f'(c)(x-c)] \\cdot [g(c) + g'(c)(x-c)] + [f(c) + f'(c)(x-c)] \\cdot \\mathcal{O}((x-c)^2) + [g(c) + g'(c)(x-c)] \\cdot \\mathcal{O}((x-c)^2) + [\\mathcal{O}((x-c)^2)]^2\\\\\n&= [f(c) + f'(c)(x-c)] \\cdot [g(c) + g'(c)(x-c)] + \\mathcal{O}((x-c)^2)\\\\\n&= f(c) \\cdot g(c) + [f'(c)\\cdot g(c) + f(c)\\cdot g'(c)] \\cdot (x-c) + [f'(c)\\cdot g'(c) \\cdot (x-c)^2 + \\mathcal{O}((x-c)^2)] \\\\\n&= f(c) \\cdot g(c) + [f'(c)\\cdot g(c) + f(c)\\cdot g'(c)] \\cdot (x-c) + \\mathcal{O}((x-c)^2)\n\\end{align*}\n\\]\nThe big “oh” notation just sweeps up many things including any products of it and the term \\(f'(c)\\cdot g'(c) \\cdot (x-c)^2\\). Again, we see from the product rule that this is just a tangent line approximation for \\(f(x) \\cdot g(x)\\).\nThe basic mathematical operations involving tangent lines can be computed just using the tangent lines when the desired accuracy is at the tangent line level. 
This is even true for composition, though there the outer and inner functions may have different “\\(c\\)”s.\nKnowing this can simplify the task of finding tangent line approximations of compound expressions.\nFor example, suppose we know that at \\(c=0\\) we have these formulas, where \\(a \\approx b\\) is a shorthand for the more formal \\(a=b + \\mathcal{O}(x^2)\\):\n\\[\n\\sin(x) \\approx x, \\quad e^x \\approx 1 + x, \\quad \\text{and}\\quad 1/(1+x) \\approx 1 - x.\n\\]\nThen we can immediately see these tangent line approximations about \\(x=0\\):\n\\[\ne^x \\cdot \\sin(x) \\approx (1+x) \\cdot x = x + x^2 \\approx x,\n\\]\nand\n\\[\n\\frac{\\sin(x)}{e^x} \\approx \\frac{x}{1 + x} \\approx x \\cdot(1-x) = x-x^2 \\approx x.\n\\]\nSince \\(\\sin(0) = 0\\), we can use these to find the tangent line approximation of\n\\[\ne^{\\sin(x)} \\approx e^x \\approx 1 + x.\n\\]\nNote that \\(\\sin(\\exp(x))\\) is approximately \\(\\sin(1+x)\\) but not approximately \\(1+x\\), as the expansion for \\(\\sin\\) about \\(1\\) is not simply \\(x\\).\n\n\n\n29.3.3 The TaylorSeries package\nThe TaylorSeries package will do these calculations in a manner similar to how SymPy transforms a function and a symbolic variable into a symbolic expression.\nFor example, we have\n\nt = Taylor1(Float64, 1)\n\n 1.0 t + 𝒪(t²)\n\n\nThe number type and the order are specified to the constructor. Linearization is order \\(1\\), other orders will be discussed later. This variable can now be composed with mathematical functions and the linearization of the function will be returned:\n\nsin(t), exp(t), 1/(1+t)\n\n( 1.0 t + 𝒪(t²), 1.0 + 1.0 t + 𝒪(t²), 1.0 - 1.0 t + 𝒪(t²))\n\n\n\nsin(t)/exp(t), exp(sin(t))\n\n( 1.0 t + 𝒪(t²), 1.0 + 1.0 t + 𝒪(t²))\n\n\n\nExample: Automatic differentiation\nAutomatic differentiation (forward mode) essentially uses this technique. A “dual” is introduced which has terms \\(a + b\\epsilon\\) where \\(\\epsilon^2 = 0\\). 
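A toy version of this arithmetic can be written by hand. The type D below is defined here just for illustration (it is not the DualNumbers implementation), but it shows how the \\(\\epsilon^2 = 0\\) rule makes derivatives fall out of ordinary arithmetic:

```julia
# Toy dual number a + b*eps with eps^2 = 0; the b slot carries the derivative
struct D
    a::Float64
    b::Float64
end
Base.:+(x::D, y::D) = D(x.a + y.a, x.b + y.b)
Base.:*(x::D, y::D) = D(x.a * y.a, x.a * y.b + x.b * y.a)  # the eps^2 term drops out
Base.sin(x::D) = D(sin(x.a), cos(x.a) * x.b)               # chain rule built in

v = sin(D(0.0, 1.0))   # v.a holds sin(0), v.b holds cos(0) * 1
```

Evaluating sin at D(0.0, 1.0) returns the pair \\((\\sin(0), \\cos(0))\\): the value and the derivative at \\(0\\).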
The \\(\\epsilon\\) is like \\(x\\) in a linear expansion, so the a coefficient encodes the value and the b coefficient reflects the derivative at the value. Numbers are treated like a variable, so their “b coefficient” is a 1. Here then is how 0 is encoded:\n\nDual(0, 1)\n\n0 + 1ɛ\n\n\nThen what is \\(\\sin(x)\\)? It should reflect both \\((\\sin(0), \\cos(0))\\), the latter being the derivative of \\(\\sin\\). We can see this is almost what is computed behind the scenes through:\n\nx = Dual(0, 1)\n@code_lowered sin(x)\n\n\nCodeInfo(\n1 ─ x = DualNumbers.value(z)\n│ xp = DualNumbers.epsilon(z)\n│ %3 = DualNumbers.sin(x)\n│ %4 = xp\n│ %5 = DualNumbers.cos(x)\n│ %6 = %4 * %5\n│ %7 = DualNumbers.Dual(%3, %6)\n└── return %7\n)\n\n\n\nThis output of @code_lowered can be confusing, but this simple case neednt be. Working from the end we see an assignment to a variable named %7 of Dual(%3, %6). The value of %3 is sin(x) where x is the value 0 above. The value of %6 is cos(x) times the value 1 above (the xp), which reflects the chain rule being used. (The derivative of sin(u) is cos(u)*du.) So this dual number encodes both the function value at 0 and the derivative of the function at 0.\nSimilarly, we can see what happens to log(x) at 1 (encoded by Dual(1,1)):\n\nx = Dual(1, 1)\n@code_lowered log(x)\n\n\nCodeInfo(\n1 ─ x = DualNumbers.value(z)\n│ xp = DualNumbers.epsilon(z)\n│ %3 = DualNumbers.log(x)\n│ %4 = xp\n│ %5 = 1 / x\n│ %6 = %4 * %5\n│ %7 = DualNumbers.Dual(%3, %6)\n└── return %7\n)\n\n\n\nWe can see the derivative again reflects the chain rule, it being given by 1/x * xp where xp acts like dx (from assignments %5 and %4). Comparing the two outputs, we see only the assignment to %5 differs, it reflecting the derivative of the function."
},
{
"objectID": "derivatives/linearization.html#questions",
"href": "derivatives/linearization.html#questions",
"title": "29  Linearization",
"section": "29.4 Questions",
"text": "29.4 Questions\n\nQuestion\nWhat is the right linear approximation for \\(\\sqrt{1 + x}\\) near \\(0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(1 + (1/2) \\cdot x\\)\n \n \n\n\n \n \n \n \n \\(1 - (1/2) \\cdot x\\)\n \n \n\n\n \n \n \n \n \\(1 + x^{1/2}\\)\n \n \n\n\n \n \n \n \n \\(1 + 1/2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat is the right linear approximation for \\((1 + x)^k\\) near \\(0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(1 + x^k\\)\n \n \n\n\n \n \n \n \n \\(1 + k\\)\n \n \n\n\n \n \n \n \n \\(1 - k \\cdot x\\)\n \n \n\n\n \n \n \n \n \\(1 + k \\cdot x\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat is the right linear approximation for \\(\\cos(\\sin(x))\\) near \\(0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(1 + x\\)\n \n \n\n\n \n \n \n \n \\(1 - x^2/2\\)\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \\(x\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat is the right linear approximation for \\(\\tan(x)\\) near \\(0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(1 - x\\)\n \n \n\n\n \n \n \n \n \\(x\\)\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \\(1 + x\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat is the right linear approximation of \\(\\sqrt{25 + x}\\) near \\(x=0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(1 - (1/2) \\cdot x\\)\n \n \n\n\n \n \n \n \n \\(5 \\cdot (1 + (1/2) \\cdot (x/25))\\)\n \n \n\n\n \n \n \n \n \\(1 + x\\)\n \n \n\n\n \n \n \n \n \\(25\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x) = \\sqrt{x}\\). Find the actual error in approximating \\(f(26)\\) by the value of the tangent line at \\((25, f(25))\\) at \\(x=26\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nAn estimate of some quantity was \\(12.34\\) the actual value was \\(12\\). 
What was the percentage error?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the percentage error in estimating \\(\\sin(5^\\circ)\\) by \\(5 \\pi/180\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe side length of a square is measured roughly to be \\(2.0\\) cm. The actual length \\(2.2\\) cm. What is the difference in area (in absolute values) as estimated by a tangent line approximation.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe Birthday problem computes the probability that in a group of \\(n\\) people, under some assumptions, that no two share a birthday. Without trying to spoil the problem, we focus on the calculus specific part of the problem below:\n\\[\n\\begin{align*}\np\n&= \\frac{365 \\cdot 364 \\cdot \\cdots (365-n+1)}{365^n} \\\\\n&= \\frac{365(1 - 0/365) \\cdot 365(1 - 1/365) \\cdot 365(1-2/365) \\cdot \\cdots \\cdot 365(1-(n-1)/365)}{365^n}\\\\\n&= (1 - \\frac{0}{365})\\cdot(1 -\\frac{1}{365})\\cdot \\cdots \\cdot (1-\\frac{n-1}{365}).\n\\end{align*}\n\\]\nTaking logarithms, we have \\(\\log(p)\\) is\n\\[\n\\log(1 - \\frac{0}{365}) + \\log(1 -\\frac{1}{365})+ \\cdots + \\log(1-\\frac{n-1}{365}).\n\\]\nNow, use the tangent line approximation for \\(\\log(1 - x)\\) and the sum formula for \\(0 + 1 + 2 + \\dots + (n-1)\\) to simplify the value of \\(\\log(p)\\):\n\n\n\n \n \n \n \n \n \n \n \n \n \\(-n(n-1)/2/365\\)\n \n \n\n\n \n \n \n \n \\(-n(n-1)/2\\cdot 365\\)\n \n \n\n\n \n \n \n \n \\(-n^2/(2\\cdot 365)\\)\n \n \n\n\n \n \n \n \n \\(-n^2 / 2 \\cdot 365\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIf \\(n = 10\\), what is the approximation for \\(p\\) (not \\(\\log(p)\\))?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIf \\(n=100\\), what is the approximation for \\(p\\) (not \\(\\log(p)\\)?"
},
{
"objectID": "derivatives/newtons_method.html",
"href": "derivatives/newtons_method.html",
"title": "30  Newtons method",
"section": "",
"text": "This section uses these add-on packages:\nThe Babylonian method is an algorithm to find an approximate value for \\(\\sqrt{k}\\). It was described by the first-century Greek mathematician Hero of Alexandria.\nThe method starts with some initial guess, called \\(x_0\\). It then applies a formula to produce an improved guess. This is repeated until the improved guess is accurate enough or it is clear the algorithm fails to work.\nFor the Babylonian method, the next guess, \\(x_{i+1}\\), is derived from the current guess, \\(x_i\\). In mathematical notation, this is the updating step:\n\\[\nx_{i+1} = \\frac{1}{2}(x_i + \\frac{k}{x_i})\n\\]\nWe use this algorithm to approximate the square root of \\(2\\), a value known to the Babylonians.\nStart with \\(x\\), then form \\(x/2 + 1/x\\), from this again form \\(x/2 + 1/x\\), repeat.\nWe represent this step using a function\nLets look starting with \\(x = 2\\) as a rational number:\nOur estimate improved from something which squared to \\(4\\) down to something which squares to \\(2.25.\\) A big improvement, but there is still more to come. Had we done one more step:\nWe now see accuracy until the third decimal point.\nThis is now accurate to the sixth decimal point. That is about as far as we, or the Bablyonians, would want to go by hand. Using rational numbers quickly grows out of hand. The next step shows the explosion.\n(In the above, we used reduce to repeat a function call \\(4\\) times, as an alternative to the composition operation. In this section we show a few styles to do this repetition before introducing a packaged function.)\nHowever, with the advent of floating point numbers, the method stays quite manageable:\nWe can see that the algorithm - to the precision offered by floating point numbers - has resulted in an answer 1.414213562373095. This answer is an approximation to the actual answer. 
Approximation is necessary, as \\(\\sqrt{2}\\) is an irrational number and so can never be exactly represented in floating point. That being said, we can see that the value of \\(f(x)\\) is accurate to the last decimal place, so our approximation is very close and is achieved in a few steps."
},
{
"objectID": "derivatives/newtons_method.html#newtons-generalization",
"href": "derivatives/newtons_method.html#newtons-generalization",
"title": "30  Newtons method",
"section": "30.1 Newtons generalization",
"text": "30.1 Newtons generalization\nLet \\(f(x) = x^3 - 2x -5\\). The value of \\(2\\) is almost a zero, but not quite, as \\(f(2) = -1\\). We can check that there are no rational roots. Though there is a method to solve the cubic it may be difficult to compute and will not be as generally applicable as some algorithm like the Babylonian method to produce an approximate answer.\nIs there some generalization to the Babylonian method?\nWe know that the tangent line is a good approximation to the function at the point. Looking at this graph gives a hint as to an algorithm:\n\n\n\n\n\nThe tangent line and the function nearly agree near \\(2\\). So much so, that the intersection point of the tangent line with the \\(x\\) axis nearly hides the actual zero of \\(f(x)\\) that is near \\(2.1\\).\nThat is, it seems that the intersection of the tangent line and the \\(x\\) axis should be an improved approximation for the zero of the function.\nLet \\(x_0\\) be \\(2\\), and \\(x_1\\) be the intersection point of the tangent line at \\((x_0, f(x_0))\\) with the \\(x\\) axis. Then by the definition of the tangent line:\n\\[\nf'(x_0) = \\frac{\\Delta y }{\\Delta x} = \\frac{f(x_0)}{x_0 - x_1}.\n\\]\nThis can be solved for \\(x_1\\) to give \\(x_1 = x_0 - f(x_0)/f'(x_0)\\). 
In general, if we had \\(x_i\\) and used the intersection point of the tangent line to produce \\(x_{i+1}\\) we would have Newtons method:\n\\[\nx_{i+1} = x_i - \\frac{f(x_i)}{f'(x_i)}.\n\\]\nUsing automatic derivatives, as brought in with the CalculusWithJulia package, we can implement this algorithm.\nThe algorithm above starts at \\(2\\) and then becomes:\n\nf(x) = x^3 - 2x - 5\nx0 = 2.0\nx1 = x0 - f(x0) / f'(x0)\n\n2.1\n\n\nWe can see we are closer to a zero:\n\nf(x0), f(x1)\n\n(-1.0, 0.06100000000000083)\n\n\nTrying again, we have\n\nx2 = x1 - f(x1)/ f'(x1)\nx2, f(x2), f(x1)\n\n(2.094568121104185, 0.00018572317327247845, 0.06100000000000083)\n\n\nAnd again:\n\nx3 = x2 - f(x2)/ f'(x2)\nx3, f(x3), f(x2)\n\n(2.094551481698199, 1.7397612239733462e-9, 0.00018572317327247845)\n\n\n\nx4 = x3 - f(x3)/ f'(x3)\nx4, f(x4), f(x3)\n\n(2.0945514815423265, -8.881784197001252e-16, 1.7397612239733462e-9)\n\n\nWe see now that \\(f(x_4)\\) is within machine tolerance of \\(0\\), so we call \\(x_4\\) an approximate zero of \\(f(x)\\).\n\nNewtons method: Let \\(x_0\\) be an initial guess for a zero of \\(f(x)\\). Iteratively define \\(x_{i+1}\\) in terms of the just generated \\(x_i\\) by:\n\\[\nx_{i+1} = x_i - f(x_i) / f'(x_i).\n\\]\nThen for reasonable functions and reasonable initial guesses, the sequence of points converges to a zero of \\(f\\).\n\nOn the computer, we know that actual convergence will likely never occur, but accuracy to a certain tolerance can often be achieved.\nIn the example above, we kept track of the previous values. This is unnecessary if only the answer is sought. In that case, the update step could use the same variable. Here we use reduce:\n\nxₙ = reduce((x, step) -> x - f(x)/f'(x), 1:4, init=2)\nxₙ, f(xₙ)\n\n(2.0945514815423265, -8.881784197001252e-16)\n\n\nIn practice, the algorithm is implemented not by repeating the update step a fixed number of times, rather by repeating the step until either we converge or it is clear we wont converge. 
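For the example above, the speed of convergence shows up in the error, which is roughly squared at each step. A minimal sketch (the derivative is computed by hand here, and the zero \\(2.0945514815423265\\) is the value found above):

```julia
# Track the error |x_i - alpha| over four Newton steps for f(x) = x^3 - 2x - 5
f(x) = x^3 - 2x - 5
fp(x) = 3x^2 - 2                    # derivative, computed by hand
newton_step(x) = x - f(x) / fp(x)
alpha = 2.0945514815423265          # the approximate zero found above

xs = accumulate((x, _) -> newton_step(x), 1:4; init = 2.0)
errs = abs.(xs .- alpha)            # each error is roughly the square of the last
```

Comparing \\(e_{i+1}\\) with \\(e_i^2\\), the number of accurate digits roughly doubles each step.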
For good guesses and most functions, convergence happens quickly.\n\nNote\n\nNewton looked at this same example in 1699 (B.T. Polyak, Newtons method and its use in optimization, European Journal of Operational Research. 02/2007; 181(3):1086-1096.) though his technique was slightly different as he did not use the derivative, per se, but rather an approximation based on the fact that his function was a polynomial (though identical to the derivative). Raphson (1690) proposed the general form, hence the usual name of the Newton-Raphson method.\n\n\nExamples\n\nExample: visualizing convergence\nThis graphic demonstrates the method and the rapid convergence:\n\nIllustration of Newton's Method converging to a zero of a function.\n\nThis interactive graphic (built using JSXGraph) allows the adjustment of the point x0, initially at \\(0.85\\). Five iterations of Newtons method are illustrated. Different positions of x0 clearly converge, others will not.\n\n\nExample: numeric not algebraic\nFor the function \\(f(x) = \\cos(x) - x\\), we see that SymPy cannot solve symbolically for a zero:\n\n@syms x::real\nsolve(cos(x) - x, x)\n\nLoadError: PyError ($(Expr(:escape, :(ccall(#= /Users/verzani/.julia/packages/PyCall/7a7w0/src/pyfncall.jl:43 =# @pysym(:PyObject_Call), PyPtr, (PyPtr, PyPtr, PyPtr), o, pyargsptr, kw))))) <class 'NotImplementedError'>\nNotImplementedError('multiple generators [x, cos(x)]\\nNo algorithms are implemented to solve equation -x + cos(x)')\n File \"/Users/verzani/.julia/conda/3/lib/python3.7/site-packages/sympy/solvers/solvers.py\", line 1106, in solve\n solution = _solve(f[0], *symbols, **flags)\n File \"/Users/verzani/.julia/conda/3/lib/python3.7/site-packages/sympy/solvers/solvers.py\", line 1720, in _solve\n raise NotImplementedError('\\n'.join([msg, not_impl_msg % f]))\n\n\nWe can find a numeric solution, even though there is no closed-form answer. Here we try Newtons method:\n\nf(x) = cos(x) - x\nx = .5\nx = x - f(x)/f'(x) # 0.7552224171056364\nx = x - f(x)/f'(x) # 0.7391416661498792\nx = x - f(x)/f'(x) # 0.7390851339208068\nx = x - f(x)/f'(x) # 0.7390851332151607\nx = x - f(x)/f'(x)\nx, f(x)\n\n(0.7390851332151607, 0.0)\n\n\nTo machine tolerance the answer is a zero, even though the exact answer is irrational and all finite floating point values can be represented as rational numbers.\n\n\nExample\nUse Newtons method to find the largest real solution to \\(e^x = x^6\\).\nA plot shows us roughly where the value lies:\n\nf(x) = exp(x)\ng(x) = x^6\nplot(f, 0, 25, label=\"f\")\nplot!(g, label=\"g\")\n\n\n\n\nClearly by \\(20\\) the two paths diverge. We know exponentials eventually grow faster than powers, and this is seen in the graph.\nWe use Newtons method to find the intersection point, stopping when the increment \\(f(x)/f'(x)\\) is smaller than 1e-4. 
We need to turn the solution to an equation into a value where a function is \\(0\\). Just moving the terms to one side of the equals sign gives \\(e^x - x^6 = 0\\), or the \\(x\\) we seek is a solution to \\(h(x)=0\\) with \\(h(x) = e^x - x^6\\).\n\nh(x) = exp(x) - x^6\nx = 20\nfor step in 1:10\n delta = h(x)/h'(x)\n x = x - delta\n @show step, x, delta\nend\n\n(step, x, delta) = (1, 19.096144519894025, 0.9038554801059746)\n(step, x, delta) = (2, 18.279618508448504, 0.8165260114455216)\n(step, x, delta) = (3, 17.615584160589165, 0.6640343478593392)\n(step, x, delta) = (4, 17.186229687634423, 0.42935447295474183)\n(step, x, delta) = (5, 17.02052242498673, 0.1657072626476933)\n(step, x, delta) = (6, 16.999206949504874, 0.021315475481855334)\n(step, x, delta) = (7, 16.998887423017564, 0.00031952648730976816)\n(step, x, delta) = (8, 16.998887352296055, 7.072150775034742e-8)\n(step, x, delta) = (9, 16.99888735229605, 2.3862112429152903e-15)\n(step, x, delta) = (10, 16.99888735229605, -1.1931056214576512e-15)\n\n\nSo it takes \\(8\\) steps to get an increment that small and about 10 steps to get to full convergence.\n\n\nExample division as multiplication\nNewton-Raphson Division is a means to divide by multiplying.\nWhy would you want to do that? Well, even for computers division is harder (read slower) than multiplying. The trick is that \\(p/q\\) is simply \\(p \\cdot (1/q)\\), so finding a means to compute a reciprocal by multiplying will reduce division to multiplication.\nWell suppose we have \\(q\\), we could try to use Newtons method to find \\(1/q\\), as it is a solution to \\(f(x) = x - 1/q\\). The Newton update step simplifies to:\n\\[\nx - f(x) / f'(x) \\quad\\text{or}\\quad x - (x - 1/q)/ 1 = 1/q\n\\]\nThat doesnt really help, as Newtons method is just \\(x_{i+1} = 1/q\\). 
That is, it just jumps to the answer, the one we want to compute by some other means!\nTrying again, we simplify the update step for a related function: \\(f(x) = 1/x - q\\) with \\(f'(x) = -1/x^2\\) and then one step of the process is:\n\\[\nx_{i+1} = x_i - (1/x_i - q)/(-1/x_i^2) = -qx^2_i + 2x_i.\n\\]\nNow for \\(q\\) in the interval \\([1/2, 1]\\) we want to get a good initial guess. Here is a claim. We can use \\(x_0=48/17 - 32/17 \\cdot q\\). Lets check graphically that this is a reasonable initial approximation to \\(1/q\\):\n\nplot(q -> 1/q, 1/2, 1, label=\"1/q\")\nplot!(q -> 1/17 * (48 - 32q), label=\"linear approximation\")\n\n\n\n\nIt can be shown that we have for any \\(q\\) in \\([1/2, 1]\\) with initial guess \\(x_0 = 48/17 - 32/17\\cdot q\\) that Newtons method will converge to \\(16\\) digits in no more than this many steps:\n\\[\n\\log_2(\\frac{53 + 1}{\\log_2(17)}).\n\\]\n\na = log2((53 + 1)/log2(17))\nceil(Integer, a)\n\n4\n\n\nThat is \\(4\\) steps suffices.\nFor \\(q = 0.80\\), to find \\(1/q\\) using the above we have\n\nq = 0.80\nx = (48/17) - (32/17)*q\nx = -q*x*x + 2*x\nx = -q*x*x + 2*x\nx = -q*x*x + 2*x\nx = -q*x*x + 2*x\n\n1.25\n\n\nThis method has basically \\(18\\) multiplication and addition operations for one division, so it naively would seem slower, but timing this shows the method is competitive with a regular division."
},
{
"objectID": "derivatives/newtons_method.html#wrapping-in-a-function",
"href": "derivatives/newtons_method.html#wrapping-in-a-function",
"title": "30  Newtons method",
"section": "30.2 Wrapping in a function",
"text": "30.2 Wrapping in a function\nIn the previous examples, we saw fast convergence, guaranteed converge in \\(4\\) steps, and an example where \\(8\\) steps were needed to get the requested level of approximation. Newtons method usually converges quickly, but may converge slowly, and may not converge at all. Automating the task to avoid repeatedly running the update step is a task best done by the computer.\nThe while loop is a good way to repeat commands until some condition is met. With this, we present a simple function implementing Newtons method, we iterate until the update step gets really small (the atol) or the convergence takes more than \\(50\\) steps. (There are other, better choices that could be used to determine when the algorithm should stop, these are just easy to understand.)\n\nfunction nm(f, fp, x0)\n atol = 1e-14\n ctr = 0\n delta = Inf\n while (abs(delta) > atol) && (ctr < 50)\n delta = f(x0)/fp(x0)\n x0 = x0 - delta\n ctr = ctr + 1\n end\n\n ctr < 50 ? x0 : NaN\nend\n\nnm (generic function with 1 method)\n\n\n\nExamples\n\nFind a zero of \\(\\sin(x)\\) starting at \\(x_0=3\\):\n\n\nnm(sin, cos, 3)\n\n3.141592653589793\n\n\nThis is an approximation for \\(\\pi\\), that historically found use, as the convergence is fast.\n\nFind a solution to \\(x^5 = 5^x\\) near \\(2\\):\n\nWriting a function to handle this, we have:\n\nk(x) = x^5 - 5^x\n\nk (generic function with 1 method)\n\n\nWe could find the derivative by hand, but use the automatic one instead:\n\nalpha = nm(k, k', 2)\nalpha, f(alpha)\n\n(1.764921914525776, 5.8411162339467655)\n\n\n\n\n30.2.1 Functions in the Roots package\nTyping in the nm function might be okay once, but would be tedious if it was needed each time. Besides, it isnt as robust to different inputs as possible. 
The Roots package provides a Newton method for find_zero.\nTo use a different method with find_zero, the calling pattern is find_zero(f, x, M) where f represent the function(s), x the initial point(s), and M the method. Here we have:\n\nfind_zero((sin, cos), 3, Roots.Newton())\n\n3.141592653589793\n\n\nOr, if a derivative is not specified, one can be computed using automatic differentiation:\n\nf(x) = sin(x)\nfind_zero((f, f'), 2, Roots.Newton())\n\n3.141592653589793\n\n\nThe argument verbose=true will force a print out of a message summarizing the convergence and showing each step.\n\nf(x) = exp(x) - x^4\nfind_zero((f,f'), 8, Roots.Newton(); verbose=true)\n\nResults of univariate zero finding:\n\n* Converged to: 8.6131694564414\n* Algorithm: Roots.Newton()\n* iterations: 8\n* function evaluations ≈ 16\n* stopped as x_n ≈ x_{n-1} using atol=xatol, rtol=xrtol\n\nTrace:\nx₁ = 8, fx₁ = -1115.0420129582717\nx₂ = 9.1951685161021075, fx₂ = 2700.5339924159998\nx₃ = 8.7944708020788642, fx₃ = 615.76735783715776\nx₄ = 8.6356413896038262, fx₄ = 67.416233795479457\nx₅ = 8.6135576545997825, fx₅ = 1.1446528211590703\nx₆ = 8.6131695743314562, fx₆ = 0.00034750863687804667\nx₇ = 8.6131694564414101, fx₇ = 3.3651303965598345e-11\nx₈ = 8.6131694564413994, fx₈ = 1.8189894035458565e-12\nx₉ = 8.6131694564413994, fx₉ = 1.8189894035458565e-12\n\n\n\n8.6131694564414\n\n\n\nExample: intersection of two graphs\nFind the intersection point between \\(f(x) = \\cos(x)\\) and \\(g(x) = 5x\\) near \\(0\\).\nWe have Newtons method to solve for zeros of \\(f(x)\\), i.e. when \\(f(x) = 0\\). Here we want to solve for \\(x\\) with \\(f(x) = g(x)\\). 
To do so, we make a new function \\(h(x) = f(x) - g(x)\\), that is \\(0\\) when \\(f(x)\\) equals \\(g(x)\\):\n\nf(x) = cos(x)\ng(x) = 5x\nh(x) = f(x) - g(x)\nx0 = find_zero((h,h'), 0, Roots.Newton())\nx0, h(x0), f(x0), g(x0)\n\n(0.19616428118784215, 0.0, 0.9808214059392107, 0.9808214059392107)\n\n\n\nWe redo the above using a parameter for the \\(5\\), as there are some options on how it would be done. We let f(x,p) = cos(x) - p*x. Then we can use Roots.Newton by also defining a derivative:\n\nf(x,p) = cos(x) - p*x\nfp(x,p) = -sin(x) - p\nxn = find_zero((f,fp), pi/4, Roots.Newton(); p=5)\nxn, f(xn, 5)\n\n(0.19616428118784215, 0.0)\n\n\nUsing automatic differentiation is not straightforward, as we must hold the p fixed. For this, we introduce a closure that fixes p and differentiates in the x variable (called u below):\n\nf(x,p) = cos(x) - p*x\nfp(x,p) = (u -> f(u,p))'(x)\nxn = find_zero((f,fp), pi/4, Roots.Newton(); p=5)\n\n0.19616428118784215\n\n\n\n\nExample: Finding \\(c\\) in Rolles Theorem\nThe function \\(r(x) = \\sqrt{1 - \\cos(x^2)^2}\\) has a zero at \\(0\\) and one at \\(a\\) near \\(1.77\\).\n\nr(x) = sqrt(1 - cos(x^2)^2)\nplot(r, 0, 1.77)\n\n\n\n\nAs \\(r(x)\\) is differentiable between \\(0\\) and \\(a\\), Rolles theorem says there will be a value where the derivative is \\(0\\). Find that value.\nThis value will be a zero of the derivative. A graph shows it should be near \\(1.2\\), so we use that as a starting value to get the answer:\n\nfind_zero((r',r''), 1.2, Roots.Newton())\n\n1.2533141373155003"
},
{
"objectID": "derivatives/newtons_method.html#convergence-rates",
"href": "derivatives/newtons_method.html#convergence-rates",
"title": "30  Newtons method",
"section": "30.3 Convergence rates",
"text": "30.3 Convergence rates\nNewtons method is famously known to have “quadratic convergence.” What does this mean? Let the error in the \\(i\\)th step be called \\(e_i = x_i - \\alpha\\). Then Newtons method satisfies a bound of the type:\n\\[\n\\lvert e_{i+1} \\rvert \\leq M_i \\cdot e_i^2.\n\\]\nIf \\(M\\) were just a constant and we suppose \\(e_0 = 10^{-1}\\) then \\(e_1\\) would be less than \\(M 10^{-2}\\) and \\(e_2\\) less than \\(M^2 10^{-4}\\), \\(e_3\\) less than \\(M^3 10^{-8}\\) and \\(e_4\\) less than \\(M^4 10^{-16}\\) which for \\(M=1\\) is basically the machine precision when values are near \\(1\\). That is, for some problems, with a good initial guess it will take around \\(4\\) or so steps to converge.\nTo identify \\(M\\), let \\(\\alpha\\) be the zero of \\(f\\) to be approximated. Assume\n\nThe function \\(f\\) has a continuous second derivative in a neighborhood of \\(\\alpha\\).\nThe derivative \\(f'\\) is non-zero in a neighborhood of \\(\\alpha\\).\n\nThen this linearization holds at each \\(x_i\\) in the above neighborhood:\n\\[\nf(x) = f(x_i) + f'(x_i) \\cdot (x - x_i) + \\frac{1}{2} f''(\\xi) \\cdot (x-x_i)^2.\n\\]\nThe value \\(\\xi\\) is from the mean value theorem and is between \\(x\\) and \\(x_i\\).\nDividing by \\(f'(x_i)\\) and setting \\(x=\\alpha\\) (as \\(f(\\alpha)=0\\)) leaves\n\\[\n0 = \\frac{f(x_i)}{f'(x_i)} + (\\alpha-x_i) + \\frac{1}{2}\\cdot \\frac{f''(\\xi)}{f'(x_i)} \\cdot (\\alpha-x_i)^2.\n\\]\nFor this value, we have\n\\[\n\\begin{align*}\nx_{i+1} - \\alpha\n&= \\left(x_i - \\frac{f(x_i)}{f'(x_i)}\\right) - \\alpha\\\\\n&= \\left(x_i - \\alpha \\right) - \\frac{f(x_i)}{f'(x_i)}\\\\\n&= (x_i - \\alpha) + \\left(\n(\\alpha - x_i) + \\frac{1}{2}\\frac{f''(\\xi) \\cdot(\\alpha - x_i)^2}{f'(x_i)}\n\\right)\\\\\n&= \\frac{1}{2}\\frac{f''(\\xi)}{f'(x_i)} \\cdot(x_i - \\alpha)^2.\n\\end{align*}\n\\]\nThat is\n\\[\ne_{i+1} = \\frac{1}{2}\\frac{f''(\\xi)}{f'(x_i)} e_i^2.\n\\]\nThis convergence to \\(\\alpha\\) 
will be quadratic if:\n\nThe initial guess \\(x_0\\) is not too far from \\(\\alpha\\), so \\(e_0\\) is managed.\nThe derivative at \\(\\alpha\\) is not too close to \\(0\\), hence, by continuity \\(f'(x_i)\\) is not too close to \\(0\\). (As it appears in the denominator). That is, the function cant be too flat, which should make sense, as then the tangent line is nearly parallel to the \\(x\\) axis and would intersect far away.\nThe function \\(f\\) has a continuous second derivative at \\(\\alpha\\).\nThe second derivative is not too big (in absolute value) near \\(\\alpha\\). A large second derivative means the function is very concave, which means it is “turning” a lot. In this case, the function turns away from the tangent line quickly, so the tangent lines zero is not necessarily a good approximation to the actual zero, \\(\\alpha\\).\n\n\n\n\n\n\n\nNote\n\n\n\nThe basic tradeoff: methods like Newtons are faster than the bisection method in terms of function calls, but are not guaranteed to converge, as the bisection method is.\n\n\nWhat can go wrong when one of these isnt the case is illustrated next:\n\n30.3.1 Poor initial step\n\n\n \n Illustration of Newton's Method converging to a zero of a function, but slowly, as the initial guess is very poor and not close to the zero. The algorithm does converge in this illustration, but not quickly and not to the nearest root from the initial guess.\n \n \n\n\n\n\n\n \n Illustration of Newton's method failing to converge, as for some \\(x_i\\), \\(f'(x_i)\\) is too close to \\(0\\). In this instance after a few steps, the algorithm just cycles around the local minimum near \\(0.66\\). The values of \\(x_i\\) repeat in the pattern: \\(1.0002, 0.7503, -0.0833, 1.0002, \\dots\\). This is also an illustration of a poor initial guess. 
If there is a local minimum or maximum between the guess and the zero, such cycles can occur.\n \n \n\n\n\n\n\n30.3.2 The second derivative is too big\n\n\n \n Illustration of Newton's Method not converging. Here the second derivative is too big near the zero - it blows up near \\(0\\) - and the convergence does not occur. Rather the iterates increase in their distance from the zero.\n \n \n\n\n\n\n\n30.3.3 The tangent line at some xᵢ is flat\n\n\n \n The function \\(f(x) = x^{20} - 1\\) has two bad behaviours for Newton's method: for \\(x < 1\\) the derivative is nearly \\(0\\) and for \\(x>1\\) the second derivative is very big. In this illustration, we have an initial guess of \\(x_0=8/9\\). As the tangent line is fairly flat, the next approximation is far away, \\(x_1 = 1.313\\dots\\). As this guess is much bigger than \\(1\\), the ratio \\(f(x)/f'(x) \\approx x^{20}/(20x^{19}) = x/20\\), so \\(x_{i+1} \\approx (19/20)x_i\\), yielding slow, linear convergence until \\(f''(x_i)\\) is moderate. For this function, starting at \\(x_0=8/9\\) takes 11 steps, at \\(x_0=7/8\\) takes 13 steps, at \\(x_0=3/4\\) takes \\(55\\) steps, and at \\(x_0=1/2\\) it takes \\(204\\) steps.\n \n \n\n\n\n\nExample\nSuppose \\(\\alpha\\) is a simple zero for \\(f(x)\\). (The value \\(\\alpha\\) is a zero of multiplicity \\(k\\) if \\(f(x) = (x-\\alpha)^kg(x)\\) where \\(g(\\alpha)\\) is not zero. A simple zero has multiplicity \\(1\\). If \\(f'(\\alpha) \\neq 0\\) and the second derivative exists, then a zero \\(\\alpha\\) will be simple.) Around \\(\\alpha\\), quadratic convergence should apply. However, consider the function \\(g(x) = f(x)^k\\) for some integer \\(k \\geq 2\\). Then \\(\\alpha\\) is still a zero, but the derivative of \\(g\\) at \\(\\alpha\\) is zero, so the tangent line is basically flat. This will slow the convergence. 
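This slowdown is easy to check numerically. Below is a minimal sketch, with illustrative choices of function, starting point, and tolerance (none taken from the text): it counts Newton steps for \\(f(x)=\\cos(x)-x\\), which has a simple zero, and for \\(g(x)=f(x)^2\\), where the same zero has multiplicity \\(2\\):

```julia
# Count Newton steps until the correction is below a tolerance (a sketch).
function newton_steps(f, fp, x; tol=1e-10, maxsteps=100)
    for i in 1:maxsteps
        dx = f(x) / fp(x)
        x -= dx
        abs(dx) < tol && return i
    end
    return maxsteps
end

f(x)  = cos(x) - x        # simple zero near 0.739
fp(x) = -sin(x) - 1
g(x)  = f(x)^2            # same zero, now of multiplicity 2
gp(x) = 2 * f(x) * fp(x)

newton_steps(f, fp, 1.0), newton_steps(g, gp, 1.0)
```

The squared version needs many more iterations, consistent with the linear rate of roughly \\(1/2\\) for a double zero.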
We can see that the update step \\(g(x)/g'(x)\\) becomes \\((1/k) \\cdot f(x)/f'(x)\\), so an extra factor is introduced.\nThe calculation that produces the quadratic convergence now becomes:\n\\[\nx_{i+1} - \\alpha = (x_i - \\alpha) - \\frac{1}{k}\\left((x_i-\\alpha) - \\frac{f''(\\xi)}{2f'(x_i)}(x_i-\\alpha)^2\\right) =\n\\frac{k-1}{k} (x_i-\\alpha) + \\frac{f''(\\xi)}{2kf'(x_i)}(x_i-\\alpha)^2.\n\\]\nAs \\(k > 1\\), the \\((x_i - \\alpha)\\) term dominates, and we see the convergence is linear with \\(\\lvert e_{i+1}\\rvert \\approx (k-1)/k \\lvert e_i\\rvert\\)."
},
{
"objectID": "derivatives/newtons_method.html#questions",
"href": "derivatives/newtons_method.html#questions",
"title": "30  Newtons method",
"section": "30.4 Questions",
"text": "30.4 Questions\n\nQuestion\nLook at this graph with \\(x_0\\) marked with a point:\n\n\n\n\n\nIf one step of Newtons method was used, what would be the value of \\(x_1\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(-2.224\\)\n \n \n\n\n \n \n \n \n \\(-2.80\\)\n \n \n\n\n \n \n \n \n \\(-0.020\\)\n \n \n\n\n \n \n \n \n \\(0.355\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLook at this graph of some increasing, concave up \\(f(x)\\) with initial point \\(x_0\\) marked. Let \\(\\alpha\\) be the zero.\n\n\n\n\n\nWhat can be said about \\(x_1\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n It must be \\(x_0 < x_1 < \\alpha\\)\n \n \n\n\n \n \n \n \n It must be \\(x_1 > \\alpha\\)\n \n \n\n\n \n \n \n \n It must be \\(x_1 < x_0\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nLook at this graph of some increasing, concave up \\(f(x)\\) with initial point \\(x_0\\) marked. Let \\(\\alpha\\) be the zero.\n\n\n\n\n\nWhat can be said about \\(x_1\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n It must be \\(x_1 > x_0\\)\n \n \n\n\n \n \n \n \n It must be \\(x_1 < \\alpha\\)\n \n \n\n\n \n \n \n \n It must be \\(\\alpha < x_1 < x_0\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nSuppose \\(f(x)\\) is increasing and concave up. From the tangent line representation: \\(f(x) = f(c) + f'(c)\\cdot(x-c) + f''(\\xi)/2 \\cdot(x-c)^2\\), explain why it must be that the graph of \\(f(x)\\) lies on or above the tangent line.\n\n\n\n \n \n \n \n \n \n \n \n \n This isn't true. The function \\(f(x) = x^3\\) at \\(x=0\\) provides a counterexample\n \n \n\n\n \n \n \n \n As \\(f''(\\xi)/2 \\cdot(x-c)^2\\) is non-negative, we must have \\(f(x) - (f(c) + f'(c)\\cdot(x-c)) \\geq 0\\).\n \n \n\n\n \n \n \n \n As \\(f''(\\xi) < 0\\) it must be that \\(f(x) - (f(c) + f'(c)\\cdot(x-c)) \\geq 0\\).\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThis question can be used to give a proof for the previous two questions, which can be answered by considering the graphs alone. 
Combined, they say that if a function is increasing and concave up and \\(\\alpha\\) is a zero, then if \\(x_0 < \\alpha\\) it will be \\(x_1 > \\alpha\\), and for any \\(x_i > \\alpha\\), \\(\\alpha <= x_{i+1} <= x_\\alpha\\), so the sequence in Newtons method is decreasing and bounded below; conditions for which it is guaranteed mathematically there will be convergence.\n\n\nQuestion\nLet \\(f(x) = x^2 - 3^x\\). This has derivative \\(2x - 3^x \\cdot \\log(3)\\). Starting with \\(x_0=0\\), what does Newtons method converge on?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x) = \\exp(x) - x^4\\). There are 3 zeros for this function. Which one does Newtons method converge to when \\(x_0=2\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x) = \\exp(x) - x^4\\). As mentioned, there are 3 zeros for this function. Which one does Newtons method converge to when \\(x_0=8\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x) = \\sin(x) - \\cos(4\\cdot x)\\).\nStarting at \\(\\pi/8\\), solve for the root returned by Newtons method\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUsing Newtons method find a root to \\(f(x) = \\cos(x) - x^3\\) starting at \\(x_0 = 1/2\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUse Newtons method to find a root of \\(f(x) = x^5 + x -1\\). Make a quick graph to find a reasonable starting point.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor the following graph, graphically consider the algorithm for a few different starting points.\n\n\n\n\n\nIf \\(x_0\\) is \\(1\\) what occurs?\n\n\n\n \n \n \n \n \n \n \n \n \n The algorithm converges very quickly. A good initial point was chosen.\n \n \n\n\n \n \n \n \n The algorithm converges, but slowly. 
The initial point is close enough to the answer to ensure decreasing errors.\n \n \n\n\n \n \n \n \n The algrithm fails to converge, as it cycles about\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhen \\(x_0 = 1.0\\) the following values are true for \\(f\\):\n\n\n(e₀ = -0.16730397826141874, f₀ = 6.0, f̄₀ = 31.81133480963258, ē₁ = 0.0742015850567364)\n\n\nWhere the values f̄₀ and ē₁ are worst-case estimates when \\(\\xi\\) is between \\(x_0\\) and the zero.\nDoes the magnitude of the error increase or decrease in the first step?\n\n\n\n \n \n \n \n \n \n \n \n \n Appears to increase\n \n \n\n\n \n \n \n \n It decreases\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIf \\(x_0\\) is set near \\(0.40\\) what happens?\n\n\n\n \n \n \n \n \n \n \n \n \n The algorithm converges very quickly. A good initial point was chosen.\n \n \n\n\n \n \n \n \n The algorithm converges, but slowly. The initial point is close enough to the answer to ensure decreasing errors.\n \n \n\n\n \n \n \n \n The algrithm fails to converge, as it cycles about\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhen \\(x_0 = 0.4\\) the following values are true for \\(f\\):\n\n\n(e₀ = -0.7673039782614187, f₀ = 1.1280000000000001, f̄₀ = 31.81133480963258, ē₁ = 8.301903808997139)\n\n\nWhere the values f̄₀ and ē₁ are worst-case estimates when \\(\\xi\\) is between \\(x_0\\) and the zero.\nDoes the magnitude of the error increase or decrease in the first step?\n\n\n\n \n \n \n \n \n \n \n \n \n Appears to increase\n \n \n\n\n \n \n \n \n It decreases\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIf \\(x_0\\) is set near \\(0.75\\) what happens?\n\n\n\n \n \n \n \n \n \n \n \n \n The algorithm converges very quickly. A good initial point was chosen.\n \n \n\n\n \n \n \n \n The algorithm converges, but slowly. 
The initial point is close enough to the answer to ensure decreasing errors.\n \n \n\n\n \n \n \n \n The algrithm fails to converge, as it cycles about\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWill Newtons method converge for the function \\(f(x) = x^5 - x + 1\\) starting at \\(x=1\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No. The initial guess is not close enough\n \n \n\n\n \n \n \n \n No. The second derivative is too big\n \n \n\n\n \n \n \n \n No. The first derivative gets too close to \\(0\\) for one of the \\(x_i\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWill Newtons method converge for the function \\(f(x) = 4x^5 - x + 1\\) starting at \\(x=1\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No. The initial guess is not close enough\n \n \n\n\n \n \n \n \n No. The second derivative is too big, or does not exist\n \n \n\n\n \n \n \n \n No. The first derivative gets too close to \\(0\\) for one of the \\(x_i\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWill Newtons method converge for the function \\(f(x) = x^{10} - 2x^3 - x + 1\\) starting from \\(0.25\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No. The initial guess is not close enough\n \n \n\n\n \n \n \n \n No. The second derivative is too big, or does not exist\n \n \n\n\n \n \n \n \n No. The first derivative gets too close to \\(0\\) for one of the \\(x_i\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWill Newtons method converge for \\(f(x) = 20x/(100 x^2 + 1)\\) starting at \\(0.1\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No. The initial guess is not close enough\n \n \n\n\n \n \n \n \n No. The second derivative is too big, or does not exist\n \n \n\n\n \n \n \n \n No. 
The first derivative gets too close to \\(0\\) for one of the \\(x_i\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWill Newtons method converge to a zero for \\(f(x) = \\sqrt{(1 - x^2)^2}\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No. The initial guess is not close enough\n \n \n\n\n \n \n \n \n No. The second derivative is too big, or does not exist\n \n \n\n\n \n \n \n \n No. The first derivative gets too close to \\(0\\) for one of the \\(x_i\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUse Newtons method to find a root of \\(f(x) = 4x^4 - 5x^3 + 4x^2 -20x -6\\) starting at \\(x_0 = 0\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUse Newtons method to find a zero of \\(f(x) = \\sin(x) - x/2\\) that is bigger than \\(0\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe Newton baffler (defined below) is so named, as Newtons method will fail to find the root for most starting points.\n\nfunction newton_baffler(x)\n if ( x - 0.0 ) < -0.25\n 0.75 * ( x - 0 ) - 0.3125\n elseif ( x - 0 ) < 0.25\n 2.0 * ( x - 0 )\n else\n 0.75 * ( x - 0 ) + 0.3125\n end\nend\n\nnewton_baffler (generic function with 1 method)\n\n\nWill Newtons method find the zero at \\(0.0\\) starting at \\(1\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nConsidering this plot:\n\nplot(newton_baffler, -1.1, 1.1)\n\n\n\n\nStarting with \\(x_0=1\\), you can see why Newtons method will fail. 
Why?\n\n\n\n \n \n \n \n \n \n \n \n \n The first derivative is \\(0\\) at \\(1\\)\n \n \n\n\n \n \n \n \n It doesn't fail, it converges to \\(0\\)\n \n \n\n\n \n \n \n \n The tangent lines for \\(|x| > 0.25\\) intersect at \\(x\\) values with \\(|x| > 0.25\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThis function does not have a small first derivative; or a large second derivative; and the bump up can be made as close to the origin as desired, so the starting point can be very close to the zero. However, even though the conditions of the error term are satisfied, the error term does not apply, as \\(f\\) is not continuously differentiable.\n\n\nQuestion\nLet \\(f(x) = \\sin(x) - x/4\\). Starting at \\(x_0 = 2\\pi\\) Newtons method will converge to a value, but it will take many steps. Using the argument verbose=true for find_zero, how many steps does it take:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the zero that is found?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIs this the closest zero to the starting point, \\(x_0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nQuadratic convergence of Newtons method only applies to simple roots. For example, we can see (using the verbose=true argument to the Roots packages newton method, that it only takes \\(4\\) steps to find a zero to \\(f(x) = \\cos(x) - x\\) starting at \\(x_0 = 1\\). But it takes many more steps to find the same zero for \\(f(x) = (\\cos(x) - x)^2\\).\nHow many?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion: Implicit equations\nThe equation \\(x^2 + x\\cdot y + y^2 = 1\\) is a rotated ellipse.\n\n\n\n\n\nCan we find which point on its graph has the largest \\(y\\) value?\nThis would be straightforward if we could write \\(y(x) = \\dots\\), for then we would simply find the critical points and investiate. But we cant so easily solve for \\(y\\) interms of \\(x\\). 
However, we can use Newtons method to do so:\n\nfunction findy(x)\n fn = y -> (x^2 + x*y + y^2) - 1\n fp = y -> (x + 2y)\n find_zero((fn, fp), sqrt(1 - x^2), Roots.Newton())\nend\n\nfindy (generic function with 1 method)\n\n\nFor a fixed x, this solves for \\(y\\) in the equation: \\(F(y) = x^2 + x \\cdot y + y^2 - 1 = 0\\). It should be that \\((x,y)\\) is a solution:\n\nx = .75\ny = findy(x)\nx^2 + x*y + y^2 ## is this 1?\n\n1.0000000000000002\n\n\nSo we have a means to find \\(y(x)\\), but it is implicit.\nUsing find_zero, find the value \\(x\\) which maximizes y by finding a zero of y'. Use this to find the point \\((x,y)\\) with largest \\(y\\) value.\n\n\n\n \n \n \n \n \n \n \n \n \n \\((-0.57735, 1.15470)\\)\n \n \n\n\n \n \n \n \n \\((0.57735, 0.57735)\\)\n \n \n\n\n \n \n \n \n \\((0, -0.57735)\\)\n \n \n\n\n \n \n \n \n \\((0,0)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n(Using automatic derivatives works for values identified with find_zero as long as the initial point has its type the same as that of x.)\n\n\nQuestion\nIn the last problem we used an approximate derivative (forward difference) in place of the derivative. This can introduce an error due to the approximation. Would Newtons method still converge if the derivative in the algorithm were replaced with an approximate derivative? In general, this can often be done but the convergence can be slower and the sensitivity to a poor initial guess even greater.\nThree common approximations are given by the difference quotient for a fixed \\(h\\): \\(f'(x_i) \\approx (f(x_i+h)-f(x_i))/h\\); the secant line approximation: \\(f'(x_i) \\approx (f(x_i) - f(x_{i-1})) / (x_i - x_{i-1})\\); and the Steffensen approximation \\(f'(x_i) \\approx (f(x_i + f(x_i)) - f(x_i)) / f(x_i)\\) (using \\(h=f(x_i)\\)).\nLets revisit the \\(4\\)-step convergence of Newtons method to the root of \\(f(x) = 1/x - q\\) when \\(q=0.8\\). 
Will these methods be as fast?\nLets define the above approximations for a given f:\n\nq₀ = 0.8\nfq(x) = 1/x - q₀\nsecant_approx(x0,x1) = (fq(x1) - fq(x0)) / (x1 - x0)\ndiffq_approx(x0, h) = secant_approx(x0, x0+h)\nsteff_approx(x0) = diffq_approx(x0, fq(x0))\n\nsteff_approx (generic function with 1 method)\n\n\nThen using the difference quotient would look like:\n\nΔ = 1e-6\nx1 = 42/17 - 32/17 * q₀\nx1 = x1 - fq(x1) / diffq_approx(x1, Δ) # |x1 - xstar| = 0.06511395862036995\nx1 = x1 - fq(x1) / diffq_approx(x1, Δ) # |x1 - xstar| = 0.003391809999860218; etc\n\n0.9647072573629651\n\n\nThe Steffensen method would look like:\n\nx1 = 42/17 - 32/17 * q₀\nx1 = x1 - fq(x1) / steff_approx(x1) # |x1 - xstar| = 0.011117056291670258\nx1 = x1 - fq(x1) / steff_approx(x1) # |x1 - xstar| = 3.502579696146313e-5; etc.\n\n1.0994223446163458\n\n\nAnd the secant method like:\n\nΔ = 1e-6\nx1 = 42/17 - 32/17 * q₀\nx0 = x1 - Δ # we need two initial values\nx0, x1 = x1, x1 - fq(x1) / secant_approx(x0, x1) # |x1 - xstar| = 8.222358365284066e-6\nx0, x1 = x1, x1 - fq(x1) / secant_approx(x0, x1) # |x1 - xstar| = 1.8766323799379592e-6; etc.\n\n(0.9647065698627694, 0.9647070425296034)\n\n\nRepeat each of the above algorithms until abs(x1 - 1.25) is 0 (which will happen for this problem, though not in general). Record the steps.\n\nDoes the difference quotient need more than \\(4\\) steps?\n\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nDoes the secant method need more than \\(4\\) steps?\n\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nDoes the Steffensen method need more than 4 steps?\n\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nAll methods work quickly with this well-behaved problem. 
In general the convergence rates are slightly different for each, with the Steffensen method matching Newtons method and the difference quotient method being slower in general. All can be more sensitive to the initial guess."
},
{
"objectID": "derivatives/more_zeros.html",
"href": "derivatives/more_zeros.html",
"title": "31  Derivative-free alternatives to Newtons method",
"section": "",
"text": "This section uses these add-on packages:\nNewtons method is not the only algorithm of its kind for identifying zeros of a function. In this section we discuss some alternatives."
},
{
"objectID": "derivatives/more_zeros.html#the-find_zerof-x0-function",
"href": "derivatives/more_zeros.html#the-find_zerof-x0-function",
"title": "31  Derivative-free alternatives to Newtons method",
"section": "31.1 The find_zero(f, x0) function",
"text": "31.1 The find_zero(f, x0) function\nThe function find_zero from the Roots package provides several different algorithms for finding a zero of a function, including some derivative-free algorithms for finding zeros when started with an initial guess. The default method is similar to Newtons method in that only a good initial guess is needed. However, the algorithm, while possibly slower in terms of function evaluations and steps, is engineered to be a bit more robust to the choice of initial estimate than Newtons method. (If it finds a bracket, it will use a bisection algorithm which is guaranteed to converge, but can be slower to do so.) Here we see how to call the function:\n\nf(x) = cos(x) - x\nx₀ = 1\nfind_zero(f, x₀)\n\n0.7390851332151607\n\n\nCompare to this related call which uses the bisection method:\n\nfind_zero(f, (0, 1)) ## [0,1] must be a bracketing interval\n\n0.7390851332151607\n\n\nFor this example both give the same answer, but the bisection method is a bit less convenient as a bracketing interval must be pre-specified."
},
{
"objectID": "derivatives/more_zeros.html#the-secant-method",
"href": "derivatives/more_zeros.html#the-secant-method",
"title": "31  Derivative-free alternatives to Newtons method",
"section": "31.2 The secant method",
"text": "31.2 The secant method\nThe default find_zero method above uses a secant-like method unless a bracket is found. The secant method is historic, dating back over \\(3000\\) years. Here we discuss the secant method in a more general framework.\nOne way to view Newtons method is through the inverse of \\(f\\) (assuming it exists): if \\(f(\\alpha) = 0\\) then \\(\\alpha = f^{-1}(0)\\).\nIf \\(f\\) has a simple zero at \\(\\alpha\\) and is locally invertible (that is some \\(f^{-1}\\) exists) then the update step for Newtons method can be identified with:\n\nfitting a polynomial to the local inverse function of \\(f\\) going through the point \\((f(x_0),x_0)\\),\nand matching the slope of \\(f\\) at the same point.\n\nThat is, we can write \\(g(y) = h_0 + h_1 (y-f(x_0))\\). Then \\(g(f(x_0)) = x_0 = h_0\\), so \\(h_0 = x_0\\). From \\(g'(f(x_0)) = 1/f'(x_0)\\), we get \\(h_1 = 1/f'(x_0)\\). That is, \\(g(y) = x_0 + (y-f(x_0))/f'(x_0)\\). At \\(y=0\\), we get the update step \\(x_1 = g(0) = x_0 - f(x_0)/f'(x_0)\\).\nA similar viewpoint can be used to create derivative-free methods.\nFor example, the secant method can be seen as the result of fitting a degree-\\(1\\) polynomial approximation for \\(f^{-1}\\) through two points \\((f(x_0),x_0)\\) and \\((f(x_1), x_1)\\).\nAgain, expressing this approximation as \\(g(y) = h_0 + h_1(y-f(x_1))\\) leads to \\(g(f(x_1)) = x_1 = h_0\\). Substituting \\(f(x_0)\\) gives \\(g(f(x_0)) = x_0 = x_1 + h_1(f(x_0)-f(x_1))\\). Solving for \\(h_1\\) leads to \\(h_1=(x_1-x_0)/(f(x_1)-f(x_0))\\). Then \\(x_2 = g(0) = x_1 - (x_1-x_0)/(f(x_1)-f(x_0)) \\cdot f(x_1)\\). 
This is the first step of the secant method:\n\\[\nx_{n+1} = x_n - f(x_n) \\frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})}.\n\\]\nThat is, where the next step of Newtons method comes from the intersection of the tangent line at \\(x_n\\) with the \\(x\\)-axis, the next step of the secant method comes from the intersection of the secant line defined by \\(x_n\\) and \\(x_{n-1}\\) with the \\(x\\) axis. That is, the secant method simply replaces \\(f'(x_n)\\) with the slope of the secant line between \\(x_n\\) and \\(x_{n-1}\\).\nWe code the update step as λ2:\n\nλ2(f0,f1,x0,x1) = x1 - f1 * (x1-x0) / (f1-f0)\n\nλ2 (generic function with 1 method)\n\n\nThen we can run a few steps to identify the zero of sine starting at \\(3\\) and \\(4\\):\n\nx0,x1 = 4,3\nf0,f1 = sin.((x0,x1))\n@show x1,f1\n\nx0,x1 = x1, λ2(f0,f1,x0,x1)\nf0,f1 = f1, sin(x1)\n@show x1,f1\n\nx0,x1 = x1, λ2(f0,f1,x0,x1)\nf0,f1 = f1, sin(x1)\n@show x1,f1\n\nx0,x1 = x1, λ2(f0,f1,x0,x1)\nf0,f1 = f1, sin(x1)\n@show x1,f1\n\nx0,x1 = x1, λ2(f0,f1,x0,x1)\nf0,f1 = f1, sin(x1)\nx1,f1\n\n(x1, f1) = (3, 0.1411200080598672)\n(x1, f1) = (3.157162792479947, -0.015569509788328599)\n(x1, f1) = (3.14154625558915, 4.639800062679684e-5)\n(x1, f1) = (3.1415926554589646, -1.8691713617942337e-9)\n\n\n(3.141592653589793, 1.2246467991473532e-16)\n\n\nLike Newtons method, the secant method converges quickly for this problem (though its rate is less than the quadratic rate of Newtons method).\nThis method is included in Roots as Secant() (or Order1()):\n\nfind_zero(sin, (4,3), Secant())\n\n3.141592653589793\n\n\nThough the slope of the secant line is related to the derivative, the two agree only in the limit. 
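The hand-stepped iteration above can be wrapped into a loop. This is a minimal sketch, not the Roots implementation; the stopping rule (successive iterates within a tolerance) and the iteration cap are choices made here for illustration:

```julia
# Iterate the secant update x_{n+1} = x_n - f(x_n)*(x_n - x_{n-1})/(f(x_n) - f(x_{n-1}))
# until successive iterates agree to a tolerance (a sketch).
function secant_method(f, x0, x1; tol=1e-12, maxsteps=50)
    f0, f1 = f(x0), f(x1)
    for _ in 1:maxsteps
        x0, x1 = x1, x1 - f1 * (x1 - x0) / (f1 - f0)
        f0, f1 = f1, f(x1)
        abs(x1 - x0) < tol && break
    end
    x1
end

secant_method(sin, 4, 3)
```

The division by f1 - f0 is the weak point: if two iterates have nearly equal function values the update is unstable, which is one reason to prefer a library routine such as find_zero with Secant() in practice.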
The convergence of the secant method is not as fast as Newtons method, though at each step of the secant method, only one new function evaluation is needed, so it can be more efficient for functions that are expensive to compute or differentiate.\nLet \\(\\epsilon_{n+1} = x_{n+1}-\\alpha\\), where \\(\\alpha\\) is assumed to be the simple zero of \\(f(x)\\) that the secant method converges to. A calculation shows that\n\\[\n\\begin{align*}\n\\epsilon_{n+1} &\\approx \\frac{x_n-x_{n-1}}{f(x_n)-f(x_{n-1})} \\frac{(1/2)f''(\\alpha)(\\epsilon_n-\\epsilon_{n-1})}{x_n-x_{n-1}} \\epsilon_n \\epsilon_{n-1}\\\\\n& \\approx \\frac{f''(\\alpha)}{2f'(\\alpha)} \\epsilon_n \\epsilon_{n-1}\\\\\n&= C \\epsilon_n \\epsilon_{n-1}.\n\\end{align*}\n\\]\nThe constant \\(C\\) is similar to that for Newtons method, and reveals potential troubles for the secant method similar to those of Newtons method: a poor initial guess (the initial error is too big), the second derivative is too large, the first derivative too flat near the answer.\nAssuming the error term has the form \\(\\epsilon_{n+1} = A|\\epsilon_n|^\\phi\\) and substituting into the above leads to the equation\n\\[\n\\frac{A^{1+1/\\phi}}{C} = |\\epsilon_n|^{1 - \\phi +1/\\phi}.\n\\]\nThe left side being a constant suggests \\(\\phi\\) solves: \\(1 - \\phi + 1/\\phi = 0\\) or \\(\\phi^2 -\\phi - 1 = 0\\). The solution is the golden ratio, \\((1 + \\sqrt{5})/2 \\approx 1.618\\dots\\).\n\n31.2.1 Steffensens method\nSteffensens method is a secant-like method that converges with \\(|\\epsilon_{n+1}| \\approx C |\\epsilon_n|^2\\). The secant is taken between the points \\((x_n,f(x_n))\\) and \\((x_n + f(x_n), f(x_n + f(x_n)))\\). Like Newtons method this requires \\(2\\) function evaluations per step. Steffensens method is implemented through Roots.Steffensen(). Steffensens method is more sensitive to the initial guess than other methods, so in practice must be used with care, though it is a starting point for many higher-order derivative-free methods."
},
{
"objectID": "derivatives/more_zeros.html#inverse-quadratic-interpolation",
"href": "derivatives/more_zeros.html#inverse-quadratic-interpolation",
"title": "31  Derivative-free alternatives to Newtons method",
"section": "31.3 Inverse quadratic interpolation",
"text": "31.3 Inverse quadratic interpolation\nInverse quadratic interpolation fits a quadratic polynomial through three points, not just two like the Secant method. The third being \\((f(x_2), x_2)\\).\nFor example, here is the inverse quadratic function, \\(g(y)\\), going through three points marked with red dots. The blue dot is found from \\((g(0), 0)\\).\n\n\n\n\n\nHere we use SymPy to identify the degree-\\(2\\) polynomial as a function of \\(y\\), then evaluate it at \\(y=0\\) to find the next step:\n\n@syms y hs[0:2] xs[0:2] fs[0:2]\nH(y) = sum(hᵢ*(y - fs[end])^i for (hᵢ,i) ∈ zip(hs, 0:2))\n\neqs = [H(fᵢ) ~ xᵢ for (xᵢ, fᵢ) ∈ zip(xs, fs)]\nϕ = solve(eqs, hs)\nhy = subs(H(y), ϕ)\n\n \n\\[\nxs₂ + \\frac{\\left(- fs₂ + y\\right)^{2} \\left(- fs₀ xs₁ + fs₀ xs₂ + fs₁ xs₀ - fs₁ xs₂ - fs₂ xs₀ + fs₂ xs₁\\right)}{fs₀^{2} fs₁ - fs₀^{2} fs₂ - fs₀ fs₁^{2} + fs₀ fs₂^{2} + fs₁^{2} fs₂ - fs₁ fs₂^{2}} + \\frac{\\left(- fs₂ + y\\right) \\left(fs₀^{2} xs₁ - fs₀^{2} xs₂ - 2 fs₀ fs₂ xs₁ + 2 fs₀ fs₂ xs₂ - fs₁^{2} xs₀ + fs₁^{2} xs₂ + 2 fs₁ fs₂ xs₀ - 2 fs₁ fs₂ xs₂ - fs₂^{2} xs₀ + fs₂^{2} xs₁\\right)}{fs₀^{2} fs₁ - fs₀^{2} fs₂ - fs₀ fs₁^{2} + fs₀ fs₂^{2} + fs₁^{2} fs₂ - fs₁ fs₂^{2}}\n\\]\n\n\n\nThe value of hy at \\(y=0\\) yields the next guess based on the past three, and is given by:\n\nq⁻¹ = hy(y => 0)\n\n \n\\[\n\\frac{fs₂^{2} \\left(- fs₀ xs₁ + fs₀ xs₂ + fs₁ xs₀ - fs₁ xs₂ - fs₂ xs₀ + fs₂ xs₁\\right)}{fs₀^{2} fs₁ - fs₀^{2} fs₂ - fs₀ fs₁^{2} + fs₀ fs₂^{2} + fs₁^{2} fs₂ - fs₁ fs₂^{2}} - \\frac{fs₂ \\left(fs₀^{2} xs₁ - fs₀^{2} xs₂ - 2 fs₀ fs₂ xs₁ + 2 fs₀ fs₂ xs₂ - fs₁^{2} xs₀ + fs₁^{2} xs₂ + 2 fs₁ fs₂ xs₀ - 2 fs₁ fs₂ xs₂ - fs₂^{2} xs₀ + fs₂^{2} xs₁\\right)}{fs₀^{2} fs₁ - fs₀^{2} fs₂ - fs₀ fs₁^{2} + fs₀ fs₂^{2} + fs₁^{2} fs₂ - fs₁ fs₂^{2}} + xs₂\n\\]\n\n\n\nThough the above can be simplified quite a bit when computed by hand, here we simply make this a function with lambdify which we will use below.\n\nλ3 = lambdify(q⁻¹) # fs, then xs\n\n#118 (generic function with 1 
method)\n\n\n(SymPys lambdify function, by default, picks the order of its arguments lexicographically, in this case they will be the f values then the x values.)\nAn inverse quadratic step is utilized by Brents method, when possible, to yield a rapidly convergent bracketing algorithm implemented as a default zero finder in many software languages. Julias Roots package implements the method in Roots.Brent(). An inverse cubic interpolation is utilized by Alefeld, Potra, and Shi which gives an asymptotically even more rapidly convergent algorithm than Brents (implemented in Roots.AlefeldPotraShi() and also Roots.A42()). This is used as a finishing step in many cases by the default hybrid Order0() method of find_zero.\nIn a bracketing algorithm, the next step should reduce the size of the bracket, so the next iterate should be inside the current bracket. However, quadratic convergence does not guarantee this to happen. As such, sometimes a substitute method must be chosen.\nChandrapatlas method is a bracketing method utilizing an inverse quadratic step as the centerpiece. The key insight is the test to choose between this inverse quadratic step and a bisection step. This is done in the following based on values of \\(\\xi\\) and \\(\\Phi\\) defined within:\n\nfunction chandrapatla(f, u, v, λ; verbose=false)\n a,b = promote(float(u), float(v))\n fa,fb = f(a),f(b)\n @assert fa * fb < 0\n\n if abs(fa) < abs(fb)\n a,b,fa,fb = b,a,fb,fa\n end\n\n c, fc = a, fa\n\n maxsteps = 100\n for ns in 1:maxsteps\n\n Δ = abs(b-a)\n m, fm = (abs(fa) < abs(fb)) ? 
(a, fa) : (b, fb)\n ϵ = eps(m)\n if Δ ≤ 2ϵ\n return m\n end\n @show m,fm\n iszero(fm) && return m\n\n ξ = (a-b)/(c-b)\n Φ = (fa-fb)/(fc-fb)\n\n if Φ^2 < ξ < 1 - (1-Φ)^2\n xt = λ(fa,fc,fb, a,c,b) # inverse quadratic\n else\n xt = a + (b-a)/2\n end\n\n ft = f(xt)\n\n isnan(ft) && break\n\n if sign(fa) == sign(ft)\n c,fc = a,fa\n a,fa = xt,ft\n else\n c,b,a = b,a,xt\n fc,fb,fa = fb,fa,ft\n end\n\n verbose && @show ns, a, fa\n\n end\n error(\"no convergence: [a,b] = $(sort([a,b]))\")\nend\n\nchandrapatla (generic function with 1 method)\n\n\nLike bisection, this method ensures that \\(a\\) and \\(b\\) form a bracket, but it moves \\(a\\) to the newest estimate, so it does not maintain that \\(a < b\\) throughout.\nWe can see it in action on the sine function. Here we pass in \\(\\lambda\\), but in a real implementation (as in Roots.Chandrapatla()) we would have programmed the algorithm to compute the inverse quadratic value.\n\nchandrapatla(sin, 3, 4, λ3, verbose=true)\n\n(m, fm) = (3.0, 0.1411200080598672)\n(ns, a, fa) = (1, 3.5, -0.35078322768961984)\n(m, fm) = (3.0, 0.1411200080598672)\n(ns, a, fa) = \n\n\n(2, 3.1315894157911264, 0.010003070970892524)\n(m, fm) = (3.1315894157911264, 0.010003070970892524)\n(ns, a, fa) = (3, 3.141678836157296, -8.618256739611538e-5)\n(m, fm) = (3.141678836157296, -8.618256739611538e-5)\n(ns, a, fa) = (4, 3.141592600257386, 5.3332407057633926e-8)\n(m, fm) = (3.141592600257386, 5.3332407057633926e-8)\n(ns, a, fa) = (5, 3.1415926535898007, -7.42705188753633e-15)\n(m, fm) = (3.1415926535898007, -7.42705188753633e-15)\n(ns, a, fa) = (6, 3.141592653589793, 1.2246467991473532e-16)\n(m, fm) = (3.141592653589793, 1.2246467991473532e-16)\n(ns, a, fa) = (7, 3.1415926535897936, -3.216245299353273e-16)\n\n\n3.141592653589793\n\n\nThe condition Φ^2 < ξ < 1 - (1-Φ)^2 can be visualized. 
Assume a,b=0,1 and fa,fb=-1/2,1. Then c < a < b, and fc has the same sign as fa, but what values of fc will satisfy the inequality?\n\nξ(c,fc) = (a-b)/(c-b)\nΦ(c,fc) = (fa-fb)/(fc-fb)\nΦl(c,fc) = Φ(c,fc)^2\nΦr(c,fc) = 1 - (1-Φ(c,fc))^2\na,b = 0, 1\nfa,fb = -1/2, 1\nregion = Lt(Φl, ξ) & Lt(ξ,Φr)\nplot(region, xlims=(-2,a), ylims=(-3,0))\n\n\n\nWhen (c,fc) is in the shaded area, the inverse quadratic step is chosen. We can see that fc < fa is needed.\nFor these values, this area is within the area where an inverse quadratic step will result in a value between a and b:\n\nl(c,fc) = λ3(fa,fb,fc,a,b,c)\nregion₃ = ImplicitEquations.Lt(l,b) & ImplicitEquations.Gt(l,a)\nplot(region₃, xlims=(-2,0), ylims=(-3,0))\n\n\n\nThere are values in the parameter space where this does not occur."
},
{
"objectID": "derivatives/more_zeros.html#tolerances",
"href": "derivatives/more_zeros.html#tolerances",
"title": "31  Derivative-free alternatives to Newtons method",
"section": "31.4 Tolerances",
"text": "31.4 Tolerances\nThe chandrapatla algorithm typically waits until abs(b-a) <= 2eps(m) (where \\(m\\) is either \\(b\\) or \\(a\\) depending on the size of \\(f(a)\\) and \\(f(b)\\)) is satisfied. Informally this means the algorithm stops when the two bracketing values are no more than a small amount apart. What is a “small amount?”\nTo understand, we start with the fact that floating point numbers are an approximation to real numbers.\nFloating point numbers effectively represent a number in scientific notation in terms of\n\na sign (plus or minus),\na mantissa (a number in \\([1,2)\\), in binary), and\nan exponent (to represent a power of \\(2\\)).\n\nThe mantissa is of the form 1.xxxxx...xxx where there are \\(m\\) different xs, each possibly a 0 or 1. The ith x indicates if the term 1/2^i should be included in the value. The mantissa is the sum of 1 plus the indicated values of 1/2^i for i in 1 to m. So the last x represents if 1/2^m should be included in the sum. As such, the mantissa represents a discrete set of values, separated by 1/2^m, as that is the smallest difference possible.\nFor example, if m=2 then the possible values for the mantissa are 11 => 1 + 1/2 + 1/4 = 7/4, 10 => 1 + 1/2 = 6/4, 01 => 1 + 1/4 = 5/4, and 00 => 1 = 4/4, values separated by 1/4 = 1/2^m.\nFor \\(64\\)-bit floating point numbers m=52, so the values in the mantissa differ by 1/2^52 = 2.220446049250313e-16. This is the value of eps().\nHowever, this “gap” between numbers is for values when the exponent is 0, that is, the numbers in [1,2). For values in [2,4) the gap is twice as big; between [1/2,1) it is half as big. That is, the gap depends on the size of the number. 
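For example, a quick check of the gap at a few scales (a REPL sketch; these are the standard spacings for 64-bit floats):\n\neps(1.0), eps(2.0), eps(0.5)  # the gap doubles on [2,4) and halves on [1/2,1)\n\n(2.220446049250313e-16, 4.440892098500626e-16, 1.1102230246251565e-16)\n\n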
The gap between x and its next largest floating point number is given by eps(x) and that always satisfies eps(x) <= eps() * abs(x).\nOne way to think about this: the difference between x and the next largest floating point value is basically x*(1+eps()) - x, or x*eps().\nFor the specific example, abs(b-a) <= 2eps(m) means that the gap between a and b is essentially 2 floating point values from the \\(x\\) value with the smallest \\(f(x)\\) value.\nFor bracketing methods that is about as good as you can get. However, once floating values are understood, the absolute best you can get for a bracketing interval would be\n\nalong the way, a value f(c) is found which is exactly 0.0\nthe endpoints of the bracketing interval are adjacent floating point values, meaning the interval cannot be bisected and f changes sign between the two values.\n\nThere can be problems, when the stopping criterion is abs(b-a) <= 2eps(m) and the answer is 0.0, that require engineering around. For example, the algorithm above for the function f(x) = -40*x*exp(-x) does not converge when started with [-9,1], even though 0.0 is an obvious zero.\n\nfu(x) = -40*x*exp(-x)\nchandrapatla(fu, -9, 1, λ3)\n\n(m, fm) = (1.0, -14.715177646857693)\n(m, fm) = (1.0, -14.715177646857693)\n(m, fm) = (1.0, -14.715177646857693)\n(m, fm) = (-0.25, 12.840254166877415)\n(m, fm) = (0.375, -10.309339181864583)\n(m, fm) = (0.0625, -2.3485326570336893)\n(m, fm) = (-0.010153633209987412, 0.41029018616686774)\n(m, fm) = (0.00020479607707152292, -0.008190165597310655)\n(m, fm) = (-2.0313182169488797e-7, 8.125274518297165e-6)\n(m, fm) = (6.287189464795201e-13, -2.5148757859164994e-11)\n(m, fm) = (-5.293955920339377e-23, 2.117582368135751e-21)\n(m, fm) = (1.0097419586828951e-28, -4.0389678347315804e-27)\n(m, fm) = (-5.877471754111438e-39, 2.350988701644575e-37)\n(m, fm) = (1.1210387714598537e-44, -4.484155085839415e-43)\n(m, fm) = (-6.525304467998525e-55, 2.61012178719941e-53)\n(m, fm) = (1.2446030555722283e-60, 
-4.978412222288913e-59)\n(m, fm) = (-7.24454326306137e-71, 2.897817305224548e-69)\n(m, fm) = (1.3817869688151111e-76, -5.527147875260445e-75)\n(m, fm) = (-8.043058733543795e-87, 3.217223493417518e-85)\n(m, fm) = (1.5340917079055395e-92, -6.136366831622158e-91)\n(m, fm) = (-8.929588994392773e-103, 3.5718355977571093e-101)\n(m, fm) = (1.7031839360032603e-108, -6.812735744013041e-107)\n(m, fm) = (-9.913835302014255e-119, 3.965534120805702e-117)\n(m, fm) = (1.8909140209225187e-124, -7.563656083690075e-123)\n\n\nLoadError: no convergence: [a,b] = [-9.913835302014255e-119, 1.8909140209225187e-124]\n\n\nHere the issue is abs(b-a) is tiny (of the order 1e-119) but eps(m) is even smaller.\nFor non-bracketing methods, like Newtons method or the secant method, different criteria are useful. There may not be a bracketing interval for f (for example f(x) = (x-1)^2) so the second criterion above might need to be restated in terms of the last two iterates, \\(x_n\\) and \\(x_{n-1}\\). Calling this difference \\(\\Delta = |x_n - x_{n-1}|\\), we might stop if \\(\\Delta\\) is small enough. As there are scenarios where this can happen, but the function is not at a zero, a check on the size of \\(f\\) is needed.\nHowever, there may be no floating point value where \\(f\\) is exactly 0.0, so checking the size of f(x_n) requires some agreement.\nFirst, if f(x_n) is 0.0 then it makes sense to call x_n an exact zero of \\(f\\), even though this may hold even if x_n, a floating point value, is not mathematically an exact zero of \\(f\\). (Consider f(x) = x^2 - 2x + 1. Mathematically, this is identical to g(x) = (x-1)^2, but f(1 + eps()) is zero, while g(1+eps()) is 4.930380657631324e-32.)\nHowever, there may never be a value with f(x_n) exactly 0.0. (The value of sin(pi) is not zero, for example, as pi is an approximation to \\(\\pi\\); as well, the sin of values adjacent to float(pi) does not produce 0.0 exactly.)\nSuppose x_n is the closest floating point number to \\(\\alpha\\), the zero. 
Then the relative rounding error, \\((x_n - \\alpha)/\\alpha\\), will be some value \\(\\delta\\) with \\(|\\delta|\\) less than eps(); that is, \\(x_n = \\alpha \\cdot (1 + \\delta)\\).\nHow far then can f(x_n) be from \\(0 = f(\\alpha)\\)?\n\\[\nf(x_n) = f(x_n - \\alpha + \\alpha) = f(\\alpha + \\alpha \\cdot \\delta) = f(\\alpha \\cdot (1 + \\delta)),\n\\]\nAssuming \\(f\\) has a derivative, the linear approximation gives:\n\\[\nf(x_n) \\approx f(\\alpha) + f'(\\alpha) \\cdot (\\alpha\\delta) = f'(\\alpha) \\cdot \\alpha \\delta\n\\]\nSo we should consider f(x_n) an approximate zero when it is on the scale of \\(f'(\\alpha) \\cdot \\alpha \\delta\\).\nThat \\(\\alpha\\) factor means we consider a relative tolerance for f. Also important, when x_n is close to 0, is the need for an absolute tolerance, one not dependent on the size of x. So a good condition to check if f(x_n) is small is\nabs(f(x_n)) <= abs(x_n) * rtol + atol, or abs(f(x_n)) <= max(abs(x_n) * rtol, atol)\nwhere the relative tolerance, rtol, would absorb an estimate for \\(f'(\\alpha)\\).\nNow, in Newtons method the update step is \\(f(x_n)/f'(x_n)\\). Naturally when \\(f(x_n)\\) is close to \\(0\\), the update step is small and \\(\\Delta\\) will be close to \\(0\\). However, should \\(f'(x_n)\\) be large, then \\(\\Delta\\) can also be small and the algorithm will possibly stop, as \\(x_{n+1} \\approx x_n\\) but not necessarily \\(x_{n+1} \\approx \\alpha\\). So termination on \\(\\Delta\\) alone can be off. Checking if \\(f(x_{n+1})\\) is an approximate zero is also useful to include in a stopping criterion.\nOne thing to keep in mind is that the right-hand side of the rule abs(f(x_n)) <= abs(x_n) * rtol + atol, as a function of x_n, goes to Inf as x_n increases. 
So if f has 0 as an asymptote (like e^(-x)) for large enough x_n, the rule will be true and x_n could be counted as an approximate zero, despite it not being one.\nSo a modified criteria for convergence might look like:\n\nstop if \\(\\Delta\\) is small and f is an approximate zero with some tolerances\nstop if f is an approximate zero with some tolerances, but be mindful that this rule can identify mathematically erroneous answers.\n\nIt is not uncommon to assign rtol to have a value like sqrt(eps()) to account for accumulated floating point errors and the factor of \\(f'(\\alpha)\\), though in the Roots package it is set smaller by default."
},
{
"objectID": "derivatives/more_zeros.html#questions",
"href": "derivatives/more_zeros.html#questions",
"title": "31  Derivative-free alternatives to Newtons method",
"section": "31.5 Questions",
"text": "31.5 Questions\n\nQuestion\nLet f(x) = tanh(x) (the hyperbolic tangent) and fp(x) = sech(x)^2, its derivative.\nDoes Newtons method (using Roots.Newton()) converge starting at 1.0?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nDoes Newtons method (using Roots.Newton()) converge starting at 1.3?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nDoes the secant method (using Roots.Secant()) converge starting at 1.3? (a second starting value will automatically be chosen, if not directly passed in.)\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor the function f(x) = x^5 - x - 1 both Newtons method and the secant method will converge to the one root when started from 1.0. Using verbose=true as an argument to find_zero, (e.g., find_zero(f, x0, Roots.Secant(), verbose=true)) how many more steps does the secant method need to converge?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nDo the two methods converge to the exact same value?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet f(x) = exp(x) - x^4 and x0=8.0. How many steps (iterations) does it take for the secant method to converge using the default tolerances?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet f(x) = exp(x) - x^4 and a starting bracket be x0 = [8.9]. Then calling find_zero(f,x0, verbose=true) will show that 49 steps are needed for exact bisection to converge. 
What about with the Roots.Brent() algorithm, which uses inverse quadratic steps when it can?\nIt takes how many steps?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe Roots.A42() method uses inverse cubic interpolation, as possible, how many steps does this method take to converge?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe large difference is due to how the tolerances are set within Roots. The `Brent method gets pretty close in a few steps, but takes a much longer time to get close enough for the default tolerances\n\n\nQuestion\nConsider this crazy function defined by:\nf(x) = cos(100*x)-4*erf(30*x-10)\n(The erf function is the (error function](https://en.wikipedia.org/wiki/Error_function) and is in the SpecialFunctions package loaded with CalculusWithJulia.)\nMake a plot over the interval \\([-3,3]\\) to see why it is called “crazy”.\nDoes find_zero find a zero to this function starting from \\(0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIf so, what is the value?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIf not, what is the reason?\n\n\n\n \n \n \n \n \n \n \n \n \n The zero is a simple zero\n \n \n\n\n \n \n \n \n The zero is not a simple zero\n \n \n\n\n \n \n \n \n The function oscillates too much to rely on the tangent line approximation far from the zero\n \n \n\n\n \n \n \n \n We can find an answer\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nDoes find_zero find a zero to this function starting from \\(1\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIf so, what is the value?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIf not, what is the reason?\n\n\n\n \n \n \n \n \n \n \n \n \n The zero is a simple zero\n \n \n\n\n \n \n \n \n The zero is not a simple zero\n \n \n\n\n \n \n \n \n The function oscillates too much to rely on the tangent line approximations far from the 
zero\n \n \n\n\n \n \n \n \n We can find an answer"
},
{
"objectID": "derivatives/lhospitals_rule.html",
"href": "derivatives/lhospitals_rule.html",
"title": "32  LHospitals Rule",
"section": "",
"text": "This section uses these add-on packages:\nLets return to limits of the form \\(\\lim_{x \\rightarrow c}f(x)/g(x)\\) which have an indeterminate form of \\(0/0\\) if both are evaluated at \\(c\\). The typical example being the limit considered by Euler:\n\\[\n\\lim_{x\\rightarrow 0} \\frac{\\sin(x)}{x}.\n\\]\nWe know this is \\(1\\) using a bound from geometry, but might also guess this is one, as we know from linearization near \\(0\\) that we have \\(\\sin(x) \\approx x\\) or, more specifically:\n\\[\n\\sin(x) = x - \\sin(\\xi)x^2/2, \\quad 0 < \\xi < x.\n\\]\nThis would yield:\n\\[\n\\lim_{x \\rightarrow 0} \\frac{\\sin(x)}{x} = \\lim_{x\\rightarrow 0} \\frac{x -\\sin(\\xi) x^2/2}{x} = \\lim_{x\\rightarrow 0} 1 - \\sin(\\xi) \\cdot x/2 = 1.\n\\]\nThis is because we know \\(\\sin(\\xi) x/2\\) has a limit of \\(0\\), when \\(|\\xi| \\leq |x|\\).\nThat doesnt look any easier, as we worried about the error term, but if we just mentally replace \\(\\sin(x)\\) with \\(x\\) - which it basically is near \\(0\\) - then we can see that the limit should be the same as \\(x/x\\) which we know is \\(1\\) without thinking.\nBasically, we found that in terms of limits, if both \\(f(x)\\) and \\(g(x)\\) are \\(0\\) at \\(c\\), that we might be able to just take this limit: \\((f(c) + f'(c) \\cdot(x-c)) / (g(c) + g'(c) \\cdot (x-c))\\) which is just \\(f'(c)/g'(c)\\).\nWouldnt that be nice? 
We could find difficult limits just by differentiating the top and the bottom at \\(c\\) (and not use the messy quotient rule).\nWell, in fact that is more or less true, a fact that dates back to LHospital - who wrote the first textbook on differential calculus - though this result is likely due to one of the Bernoulli brothers.\nThat is, if the right limit of \\(f(x)/g(x)\\) is indeterminate of the form \\(0/0\\), but the right limit of \\(f'(x)/g'(x)\\) is known, possibly by simple continuity, then the right limit of \\(f(x)/g(x)\\) exists and is equal to that of \\(f'(x)/g'(x)\\).\nThe rule equally applies to left limits and limits at \\(c\\). Later we will see there are other generalizations.\nTo apply this rule to Eulers example, \\(\\sin(x)/x\\), we just need to consider that:\n\\[\nL = 1 = \\lim_{x \\rightarrow 0}\\frac{\\cos(x)}{1},\n\\]\nSo, as well, \\(\\lim_{x \\rightarrow 0} \\sin(x)/x = 1\\).\nThis is due to \\(\\cos(x)\\) being continuous at \\(0\\), so this limit is just \\(\\cos(0)/1\\). (More importantly, the tangent line expansion of \\(\\sin(x)\\) at \\(0\\) is \\(\\sin(0) + \\cos(0)x\\), so that \\(\\cos(0)\\) is why this answer is as it is, but we dont need to think in terms of \\(\\cos(0)\\), but rather the tangent-line expansion, which is \\(\\sin(x) \\approx x\\), as \\(\\cos(0)\\) appears as the coefficient.)"
},
{
"objectID": "derivatives/lhospitals_rule.html#idea-behind-lhospitals-rule",
"href": "derivatives/lhospitals_rule.html#idea-behind-lhospitals-rule",
"title": "32  LHospitals Rule",
"section": "32.1 Idea behind LHospitals rule",
"text": "32.1 Idea behind LHospitals rule\nA first proof of LHospitals rule takes advantage of Cauchys generalization of the mean value theorem to two functions. Suppose \\(f(x)\\) and \\(g(x)\\) are continuous on \\([c,b]\\) and differentiable on \\((c,b)\\). On \\((c,x)\\), \\(c < x < b\\), there exists a \\(\\xi\\) with \\(f'(\\xi) \\cdot (g(x) - g(c)) = g'(\\xi) \\cdot (f(x) - f(c))\\). In our formulation, both \\(f(c)\\) and \\(g(c)\\) are zero, so we have, provided we know that \\(g(x)\\) is nonzero, that \\(f(x)/g(x) = f'(\\xi)/g'(\\xi)\\) for some \\(\\xi\\), \\(c < \\xi < x\\). That the right-hand side has a limit as \\(x \\rightarrow c+\\) is true by the assumption that the limit of the ratio of the derivatives exists. (The \\(\\xi\\) part can be removed by considering it as a composition of a function going to \\(c\\).) Thus the right limit of the ratio \\(f/g\\) is known.\n\n\n\n \nGeometric interpretation of \\(L=\\lim_{x \\rightarrow 0} x^2 / (\\sqrt{1 + x} - 1 - x^2)\\). At \\(0\\) this limit is indeterminate of the form \\(0/0\\). The value for a fixed \\(x\\) can be seen as the slope of a secant line of a parametric plot of the two functions, plotted as \\((g, f)\\). In this figure, the limiting \"tangent\" line has \\(0\\) slope, corresponding to the limit \\(L\\). In general, L'Hospital's rule is nothing more than a statement about slopes of tangent lines."
},
{
"objectID": "derivatives/lhospitals_rule.html#generalizations",
"href": "derivatives/lhospitals_rule.html#generalizations",
"title": "32  LHospitals Rule",
"section": "32.2 Generalizations",
"text": "32.2 Generalizations\nLHospitals rule generalizes to other indeterminate forms; in particular, the indeterminate form \\(\\infty/\\infty\\) can be proved at the same time as \\(0/0\\) with a more careful proof.\nThe value \\(c\\) in the limit can also be infinite. Consider this case with \\(c=\\infty\\):\n\\[\n\\begin{align*}\n\\lim_{x \\rightarrow \\infty} \\frac{f(x)}{g(x)} &=\n\\lim_{x \\rightarrow 0} \\frac{f(1/x)}{g(1/x)}\n\\end{align*}\n\\]\nLHospitals rule applies as \\(x \\rightarrow 0\\), so we differentiate to get:\n\\[\n\\begin{align*}\n\\lim_{x \\rightarrow 0} \\frac{[f(1/x)]'}{[g(1/x)]'}\n&= \\lim_{x \\rightarrow 0} \\frac{f'(1/x)\\cdot(-1/x^2)}{g'(1/x)\\cdot(-1/x^2)}\\\\\n&= \\lim_{x \\rightarrow 0} \\frac{f'(1/x)}{g'(1/x)}\\\\\n&= \\lim_{x \\rightarrow \\infty} \\frac{f'(x)}{g'(x)},\n\\end{align*}\n\\]\nAssuming the latter limit exists, LHospitals rule assures the equality\n\\[\n\\lim_{x \\rightarrow \\infty} \\frac{f(x)}{g(x)} =\n\\lim_{x \\rightarrow \\infty} \\frac{f'(x)}{g'(x)},\n\\]\n\nExamples\nFor example, consider\n\\[\n\\lim_{x \\rightarrow \\infty} \\frac{x}{e^x}.\n\\]\nWe see it is of the form \\(\\infty/\\infty\\). Taking advantage of the fact that LHospitals rule applies to limits at \\(\\infty\\), we have that this limit will exist and be equal to this one, should it exist:\n\\[\n\\lim_{x \\rightarrow \\infty} \\frac{1}{e^x}.\n\\]\nThis limit is, of course, \\(0\\), as it is of the form \\(1/\\infty\\). 
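The computation can be confirmed symbolically (a quick check, assuming the @syms x setup used with SymPy throughout these notes):\n\n@syms x\nlimit(x/exp(x), x => oo)\n\n \n\\[\n0\n\\]\n\n\n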
It is not hard to build up from here to show that for any integer value of \\(n>0\\) that:\n\\[\n\\lim_{x \\rightarrow \\infty} \\frac{x^n}{e^x} = 0.\n\\]\nThis is an expression of the fact that exponential functions grow faster than polynomial functions.\nSimilarly, powers grow faster than logarithms, as this limit shows, which is indeterminate of the form \\(\\infty/\\infty\\):\n\\[\n\\lim_{x \\rightarrow \\infty} \\frac{\\log(x)}{x} =\n\\lim_{x \\rightarrow \\infty} \\frac{1/x}{1} = 0,\n\\]\nthe first equality by LHospitals rule, as the second limit exists."
},
{
"objectID": "derivatives/lhospitals_rule.html#other-indeterminate-forms",
"href": "derivatives/lhospitals_rule.html#other-indeterminate-forms",
"title": "32  LHospitals Rule",
"section": "32.3 Other indeterminate forms",
"text": "32.3 Other indeterminate forms\nIndeterminate forms of the type \\(0 \\cdot \\infty\\), \\(0^0\\), \\(\\infty^0\\), \\(\\infty - \\infty\\) can be re-expressed to be in the form \\(0/0\\) or \\(\\infty/\\infty\\) and then LHospitals theorem can be applied.\n\nExample: rewriting \\(0 \\cdot \\infty\\)\nWhat is the limit \\(x \\log(x)\\) as \\(x \\rightarrow 0+\\)? The form is \\(0\\cdot \\infty\\); rewriting, we see this is just:\n\\[\n\\lim_{x \\rightarrow 0+}\\frac{\\log(x)}{1/x}.\n\\]\nLHospitals rule clearly applies to one-sided limits, as well as two-sided ones (our proof sketch used one-sided limits), so this limit will equal the following, should it exist:\n\\[\n\\lim_{x \\rightarrow 0+}\\frac{1/x}{-1/x^2} = \\lim_{x \\rightarrow 0+} -x = 0.\n\\]\n\n\nExample: rewriting \\(0^0\\)\nWhat is the limit \\(x^x\\) as \\(x \\rightarrow 0+\\)? The expression is of the form \\(0^0\\), which is indeterminate. (Even though floating point math defines the value as \\(1\\).) We can rewrite this by taking a log:\n\\[\nx^x = \\exp(\\log(x^x)) = \\exp(x \\log(x)) = \\exp(\\log(x)/(1/x)).\n\\]\nWe just saw that \\(\\lim_{x \\rightarrow 0+}\\log(x)/(1/x) = 0\\). 
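(This inner limit can be double checked with SymPy, assuming the symbolic x used in the other examples of this section:\n\n@syms x\nlimit(log(x)/(1/x), x => 0, dir=\"+\")\n\n \n\\[\n0\n\\]\n\n\n)\n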
So by the rules for limits of compositions and the fact that \\(e^x\\) is continuous, we see \\(\\lim_{x \\rightarrow 0+} x^x = e^0 = 1\\).\n\n\nExample: rewriting \\(\\infty - \\infty\\)\nA limit \\(\\lim_{x \\rightarrow c} f(x) - g(x)\\) of indeterminate form \\(\\infty - \\infty\\) can be reexpressed to be of the from \\(0/0\\) through the transformation:\n\\[\n\\begin{align*}\nf(x) - g(x) &= f(x)g(x) \\cdot (\\frac{1}{g(x)} - \\frac{1}{f(x)}) \\\\\n&= \\frac{\\frac{1}{g(x)} - \\frac{1}{f(x)}}{\\frac{1}{f(x)g(x)}}.\n\\end{align*}\n\\]\nApplying this to\n\\[\nL = \\lim_{x \\rightarrow 1} \\big(\\frac{x}{x-1} - \\frac{1}{\\log(x)}\\big)\n\\]\nWe get that \\(L\\) is equal to the following limit:\n\\[\n\\lim_{x \\rightarrow 1} \\frac{\\log(x) - \\frac{x-1}{x}}{\\frac{x-1}{x} \\log(x)}\n=\n\\lim_{x \\rightarrow 1} \\frac{x\\log(x)-(x-1)}{(x-1)\\log(x)}\n\\]\nIn SymPy we have:\n\n𝒇 = x*log(x) - (x-1)\n𝒈 = (x-1)*log(x)\n𝒇(1), 𝒈(1)\n\n(0, 0)\n\n\nLHospitals rule applies to the form \\(0/0\\), so we try:\n\n𝒇 = diff(𝒇, x)\n𝒈 = diff(𝒈, x)\n𝒇(1), 𝒈(1)\n\n(0, 0)\n\n\nAgain, we get the indeterminate form \\(0/0\\), so we try again with second derivatives:\n\n𝒇 = diff(𝒇, x, x)\n𝒈 = diff(𝒈, x, x)\n𝒇(1), 𝒈(1)\n\n(1, 2)\n\n\nFrom this we see the limit is \\(1/2\\), as could have been done directly:\n\nlimit(𝒇/𝒈, x=>1)\n\n \n\\[\n\\frac{1}{2}\n\\]"
},
{
"objectID": "derivatives/lhospitals_rule.html#the-assumptions-are-necessary",
"href": "derivatives/lhospitals_rule.html#the-assumptions-are-necessary",
"title": "32  LHospitals Rule",
"section": "32.4 The assumptions are necessary",
"text": "32.4 The assumptions are necessary\n\nExample: the limit existing is necessary\nThe following limit is easily seen by comparing terms of largest growth:\n\\[\n1 = \\lim_{x \\rightarrow \\infty} \\frac{x - \\sin(x)}{x}\n\\]\nHowever, the limit of the ratio of the derivatives does not exist:\n\\[\n\\lim_{x \\rightarrow \\infty} \\frac{1 - \\cos(x)}{1},\n\\]\nas the function just oscillates. This shows that LHospitals rule does not apply when the limit of the ratio of the derivatives does not exist.\n\n\nExample: the assumptions matter\nThis example comes from the thesis of Gruntz to highlight possible issues when computer systems do simplifications.\nConsider:\n\\[\n\\lim_{x \\rightarrow \\infty} \\frac{(1/2)\\sin(2x) + x}{\\exp(\\sin(x))\\cdot(\\cos(x)\\sin(x)+x)}.\n\\]\nIf we apply LHospitals rule using simplification we have:\n\nu(x) = 1//2*sin(2x) + x\nv(x) = exp(sin(x))*(cos(x)*sin(x) + x)\nup, vp = diff(u(x),x), diff(v(x),x)\nlimit(simplify(up/vp), x => oo)\n\n \n\\[\n0\n\\]\n\n\n\nHowever, this answer is incorrect. The reason is subtle. The simplification cancels a term of \\(\\cos(x)\\) that appears in the numerator and denominator. Before cancellation, vp will have infinitely many zeros as \\(x\\) approaches \\(\\infty\\), so LHospitals wont apply (the limit wont exist, as every \\(2\\pi\\) the ratio is undefined, so the function is never eventually close to some \\(L\\)).\nThis ratio has no limit, as it oscillates, as confirmed by SymPy:\n\nlimit(u(x)/v(x), x=> oo)\n\n \n\\[\n\\left\\langle e^{-1}, e\\right\\rangle\n\\]"
},
{
"objectID": "derivatives/lhospitals_rule.html#questions",
"href": "derivatives/lhospitals_rule.html#questions",
"title": "32  LHospitals Rule",
"section": "32.5 Questions",
"text": "32.5 Questions\n\nQuestion\nThis function \\(f(x) = \\sin(5x)/x\\) is indeterminate at \\(x=0\\). What type?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(0/0\\)\n \n \n\n\n \n \n \n \n \\(\\infty/\\infty\\)\n \n \n\n\n \n \n \n \n \\(0^0\\)\n \n \n\n\n \n \n \n \n \\(\\infty - \\infty\\)\n \n \n\n\n \n \n \n \n \\(0 \\cdot \\infty\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThis function \\(f(x) = \\sin(x)^{\\sin(x)}\\) is indeterminate at \\(x=0\\). What type?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(0/0\\)\n \n \n\n\n \n \n \n \n \\(\\infty/\\infty\\)\n \n \n\n\n \n \n \n \n \\(0^0\\)\n \n \n\n\n \n \n \n \n \\(\\infty - \\infty\\)\n \n \n\n\n \n \n \n \n \\(0 \\cdot \\infty\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThis function \\(f(x) = (x-2)/(x^2 - 4)\\) is indeterminate at \\(x=2\\). What type?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(0/0\\)\n \n \n\n\n \n \n \n \n \\(\\infty/\\infty\\)\n \n \n\n\n \n \n \n \n \\(0^0\\)\n \n \n\n\n \n \n \n \n \\(\\infty - \\infty\\)\n \n \n\n\n \n \n \n \n \\(0 \\cdot \\infty\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThis function \\(f(x) = (g(x+h) - g(x-h)) / (2h)\\) (\\(g\\) is continuous) is indeterminate at \\(h=0\\). What type?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(0/0\\)\n \n \n\n\n \n \n \n \n \\(\\infty/\\infty\\)\n \n \n\n\n \n \n \n \n \\(0^0\\)\n \n \n\n\n \n \n \n \n \\(\\infty - \\infty\\)\n \n \n\n\n \n \n \n \n \\(0 \\cdot \\infty\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThis function \\(f(x) = x \\log(x)\\) is indeterminate at \\(x=0\\). 
What type?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(0/0\\)\n \n \n\n\n \n \n \n \n \\(\\infty/\\infty\\)\n \n \n\n\n \n \n \n \n \\(0^0\\)\n \n \n\n\n \n \n \n \n \\(\\infty - \\infty\\)\n \n \n\n\n \n \n \n \n \\(0 \\cdot \\infty\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nDoes LHospitals rule apply to this limit:\n\\[\n\\lim_{x \\rightarrow \\pi} \\frac{\\sin(\\pi x)}{\\pi x}.\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n No. It is not indeterminate\n \n \n\n\n \n \n \n \n Yes. It is of the form \\(0/0\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUse LHospitals rule to find the limit\n\\[\nL = \\lim_{x \\rightarrow 0} \\frac{4x - \\sin(x)}{x}.\n\\]\nWhat is \\(L\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUse LHospitals rule to find the limit\n\\[\nL = \\lim_{x \\rightarrow 0} \\frac{\\sqrt{1+x} - 1}{x}.\n\\]\nWhat is \\(L\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUse LHospitals rule one or more times to find the limit\n\\[\nL = \\lim_{x \\rightarrow 0} \\frac{x - \\sin(x)}{x^3}.\n\\]\nWhat is \\(L\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUse LHospitals rule one or more times to find the limit\n\\[\nL = \\lim_{x \\rightarrow 0} \\frac{1 - x^2/2 - \\cos(x)}{x^3}.\n\\]\nWhat is \\(L\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUse LHospitals rule one or more times to find the limit\n\\[\nL = \\lim_{x \\rightarrow \\infty} \\frac{\\log(\\log(x))}{\\log(x)}.\n\\]\nWhat is \\(L\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nBy using a common denominator to rewrite this expression, use LHospitals rule to find the limit\n\\[\nL = \\lim_{x \\rightarrow 0} \\frac{1}{x} - \\frac{1}{\\sin(x)}.\n\\]\nWhat is \\(L\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUse LHospitals rule to find the limit\n\\[\nL = \\lim_{x \\rightarrow \\infty} \\log(x)/x\n\\]\nWhat is 
\\(L\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUsing LHospitals rule, does\n\\[\n\\lim_{x \\rightarrow 0+} x^{\\log(x)}\n\\]\nexist?\nConsider \\(x^{\\log(x)} = e^{\\log(x)\\log(x)}\\).\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUsing LHospitals rule, find the limit of\n\\[\n\\lim_{x \\rightarrow 1} (2-x)^{\\tan(\\pi/2 \\cdot x)}.\n\\]\n(Hint, express as \\(\\exp^{\\tan(\\pi/2 \\cdot x) \\cdot \\log(2-x)}\\) and take the limit of the resulting exponent.)\n\n\n\n \n \n \n \n \n \n \n \n \n It does not exist\n \n \n\n\n \n \n \n \n \\(e^{2/\\pi}\\)\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \\(0\\)\n \n \n\n\n \n \n \n \n \\({2\\pi}\\)"
},
{
"objectID": "derivatives/implicit_differentiation.html",
"href": "derivatives/implicit_differentiation.html",
"title": "33  Implicit Differentiation",
"section": "",
"text": "This section uses these add-on packages:"
},
{
"objectID": "derivatives/implicit_differentiation.html#graphs-of-equations",
"href": "derivatives/implicit_differentiation.html#graphs-of-equations",
"title": "33  Implicit Differentiation",
"section": "33.1 Graphs of equations",
"text": "33.1 Graphs of equations\nAn equation in \\(y\\) and \\(x\\) is an algebraic expression involving an equality with two (or more) variables. An example might be \\(x^2 + y^2 = 1\\).\nThe solutions to an equation in the variables \\(x\\) and \\(y\\) are all points \\((x,y)\\) which satisfy the equation.\nThe graph of an equation is just the set of solutions to the equation represented in the Cartesian plane.\nWith this definition, the graph of a function \\(f(x)\\) is just the graph of the equation \\(y = f(x)\\). In general, graphing an equation is more complicated than graphing a function. For a function, we know for a given value of \\(x\\) what the corresponding value of \\(f(x)\\) is through evaluation of the function. For equations, we may have \\(0\\), \\(1\\) or more \\(y\\) values for a given \\(x\\) and, even more problematic, we may have no rule to find these values.\nThere are a few options for plotting equations in Julia. We will use ImplicitPlots in this section, but note both ImplicitEquations and IntervalConstraintProgramming offer alternatives that are a bit more flexible.\nTo plot an implicit equation using ImplicitPlots requires expressing the relationship in terms of a function, and then plotting the equation f(x,y) = 0. In practice this simply requires all the terms be moved to one side of an equals sign.\nTo plot the circle of radius \\(2\\), or the equation \\(x^2 + y^2 = 2^2\\), we would move all terms to one side, \\(x^2 + y^2 - 2^2 = 0\\), and then express the left hand side through a function:\n\nf(x,y) = x^2 + y^2 - 2^2\n\nf (generic function with 1 method)\n\n\nThis function is then passed to the implicit_plot function, which works with Plots to render the graphic:\n\nimplicit_plot(f)\n\n\n\n\n\n\n\n\n\n\nNote\n\n\n\nThe f is a function of two variables, used here to express one side of an equation. Julia makes this easy to do - just make sure two variables are in the signature of f when it is defined. 
Using functions like this, we can express our equation in the form \\(f(x,y) = c\\) or, more generally, as \\(f(x,y) = g(x,y)\\). The latter can be expressed as \\(h(x,y) = f(x,y) - g(x,y) = 0\\). That is, only the form \\(f(x,y)=0\\) is needed to represent an equation.\n\n\n\n\n\n\n\nNote\n\n\n\nThere are two different styles in Julia to add simple plot recipes. ImplicitPlots adds a new plotting function (implicit_plot); alternatively many packages add a new recipe for the generic plot method using new types. (For example, SymPy has a plot recipe for symbolic types.)\n\n\nOf course, more complicated equations are possible and the steps are similar - only the function definition is more involved. For example, the Devil's curve has the form\n\\[\ny^4 - x^4 + ay^2 + bx^2 = 0.\n\\]\nHere we draw the curve for a particular choice of \\(a\\) and \\(b\\). For illustration purposes, a narrower viewing window is specified below using xlims and ylims:\n\na,b = -1,2\nf(x,y) = y^4 - x^4 + a*y^2 + b*x^2\nimplicit_plot(f; xlims=(-3,3), ylims=(-3,3), legend=false)"
},
{
"objectID": "derivatives/implicit_differentiation.html#tangent-lines-implicit-differentiation",
"href": "derivatives/implicit_differentiation.html#tangent-lines-implicit-differentiation",
"title": "33  Implicit Differentiation",
"section": "33.2 Tangent lines, implicit differentiation",
"text": "33.2 Tangent lines, implicit differentiation\nThe graph \\(x^2 + y^2 = 1\\) has well-defined tangent lines at all points except \\((-1,0)\\) and \\((1, 0)\\), and even at these two points, we could call the vertical lines \\(x=-1\\) and \\(x=1\\) tangent lines. However, recovering the slope of these tangent lines requires expressing \\(y\\) as a function of \\(x\\) and then differentiating that function. Of course, in this example, we would need two functions: \\(f(x) = \\sqrt{1-x^2}\\) and \\(g(x) = - \\sqrt{1-x^2}\\) to do this completely.\nIn general though, we may not be able to solve for \\(y\\) in terms of \\(x\\). What then?\nThe idea is to assume that \\(y\\) is representable by some function of \\(x\\). This makes sense: moving on the curve from \\((x,y)\\) to some nearby point means changing \\(x\\) will cause some change in \\(y\\). This assumption is only made locally - basically meaning a complicated graph is reduced to just a small, well-behaved section of its graph.\nWith this assumption, asking what \\(dy/dx\\) is has an obvious meaning - what is the slope of the tangent line to the graph at \\((x,y)\\). (The assumption eliminates the question of what a tangent line would mean when a graph self-intersects.)\nThe method of implicit differentiation allows this question to be investigated. It begins by differentiating both sides of the equation assuming \\(y\\) is a function of \\(x\\) to derive a new equation involving \\(dy/dx\\).\nFor example, starting with \\(x^2 + y^2 = 1\\), differentiating both sides in \\(x\\) gives:\n\\[\n2x + 2y\\cdot \\frac{dy}{dx} = 0.\n\\]\nThe chain rule was used to find \\((d/dx)(y^2) = [y(x)^2]' = 2y \\cdot dy/dx\\). 
From this we can solve for \\(dy/dx\\) (the resulting equations are linear in \\(dy/dx\\), so can always be solved explicitly):\n\\[\n\\frac{dy}{dx} = -\\frac{x}{y}.\n\\]\nThis says the slope of the tangent line depends on the point \\((x,y)\\) through the formula \\(-x/y\\).\nAs a check, we compare to what we would have found had we solved for \\(y= \\sqrt{1 - x^2}\\) (for \\((x,y)\\) with \\(y \\geq 0\\)). We would have found \\(dy/dx = 1/2 \\cdot 1/\\sqrt{1 - x^2} \\cdot (-2x)\\), which can be simplified to \\(-x/y\\). This should show that the method above - assuming \\(y\\) is a function of \\(x\\) and differentiating - is not only more general, but can even be easier.\nThe name - implicit differentiation - comes from the assumption that \\(y\\) is implicitly defined in terms of \\(x\\). According to the Implicit Function Theorem the above method will work provided the curve has sufficient smoothness near the point \\((x,y)\\).\n\nExamples\nConsider the serpentine equation\n\\[\nx^2y + a\\cdot b \\cdot y - a^2 \\cdot x = 0, \\quad a\\cdot b > 0.\n\\]\nFor \\(a = 2, b=1\\) we have the graph:\n\na, b = 2, 1\nf(x,y) = x^2*y + a * b * y - a^2 * x\nimplicit_plot(f)\n\n\n\n\nWe can see that at each point in the viewing window the tangent line exists due to the smoothness of the curve. 
Moreover, at a point \\((x,y)\\) the tangent will have slope \\(dy/dx\\) satisfying:\n\\[\n2xy + x^2 \\frac{dy}{dx} + a\\cdot b \\frac{dy}{dx} - a^2 = 0.\n\\]\nSolving yields:\n\\[\n\\frac{dy}{dx} = \\frac{a^2 - 2xy}{ab + x^2}.\n\\]\nIn particular, the point \\((0,0)\\) is always on this graph, and the tangent line will have positive slope \\(a^2/(ab) = a/b\\).\n\nThe eight curve has representation\n\\[\nx^4 = a^2(x^2-y^2), \\quad a \\neq 0.\n\\]\nA graph for \\(a=3\\) shows why it has the name it does:\n\na = 3\nf(x,y) = x^4 - a^2*(x^2 - y^2)\nimplicit_plot(f)\n\n\n\n\nThe tangent line at \\((x,y)\\) will have slope \\(dy/dx\\) satisfying:\n\\[\n4x^3 = a^2 \\cdot (2x - 2y \\frac{dy}{dx}).\n\\]\nSolving gives:\n\\[\n\\frac{dy}{dx} = -\\frac{4x^3 - a^2 \\cdot 2x}{a^2 \\cdot 2y}.\n\\]\nThe point \\((3,0)\\) can be seen to be a solution to the equation and should have a vertical tangent line. This also is reflected in the formula, as the denominator is \\(a^2\\cdot 2 y\\), which is \\(0\\) at this point, whereas the numerator is not.\n\n\nExample\nThe quotient rule can be hard to remember, unlike the product rule. No reason to despair: the product rule plus implicit differentiation can be used to recover the quotient rule. Suppose \\(y=f(x)/g(x)\\), then we could also write \\(y g(x) = f(x)\\). Differentiating implicitly gives:\n\\[\n\\frac{dy}{dx} g(x) + y g'(x) = f'(x).\n\\]\nSolving for \\(dy/dx\\) gives:\n\\[\n\\frac{dy}{dx} = \\frac{f'(x) - y g'(x)}{g(x)}.\n\\]\nNot quite what we expect, perhaps, but substituting in \\(f(x)/g(x)\\) for \\(y\\) gives us the usual formula:\n\\[\n\\frac{dy}{dx} = \\frac{f'(x) - \\frac{f(x)}{g(x)} g'(x)}{g(x)} = \\frac{f'(x) g(x) - f(x) g'(x)}{g(x)^2}.\n\\]\n\n\n\n\n\n\nNote\n\n\n\nIn this example we mix notations using \\(g'(x)\\) to represent a derivative of \\(g\\) with respect to \\(x\\) and \\(dy/dx\\) to represent the derivative of \\(y\\) with respect to \\(x\\). This is done to emphasize the value that we are solving for. 
It is just a convention, though; we could just as well have used the “prime” notation for each.\n\n\n\n\nExample: Graphing a tangent line\nLet's see how to add a graph of a tangent line to the graph of an equation. Tangent lines are tangent at a point, so we need a point to discuss.\nReturning to the equation for a circle, \\(x^2 + y^2 = 1\\), let's look at \\((\\sqrt{2}/2, - \\sqrt{2}/2)\\). The derivative is \\(-x/y\\), so the slope at this point is \\(1\\). The line itself has equation \\(y = b + m \\cdot (x-a)\\). The following represents this in Julia:\n\nF(x,y) = x^2 + y^2 - 1\n\na,b = sqrt(2)/2, -sqrt(2)/2\n\nm = -a/b\ntl(x) = b + m * (x-a)\n\nimplicit_plot(F, xlims=(-2, 2), ylims=(-2, 2), aspect_ratio=:equal)\nplot!(tl)\n\n\n\n\nWe added both the implicit plot of \\(F\\) and the tangent line to the graph at the given point.\n\n\nExample\nWhen we assume \\(y\\) is a function of \\(x\\), it may not be feasible to actually find the function algebraically. However, in many cases one can be found numerically. Suppose \\(G(x,y) = c\\) describes the equation. Then for a fixed \\(x\\), \\(y(x)\\) solves \\(G(x,y(x)) - c = 0\\), so \\(y(x)\\) is a zero of a known function. As long as we can piece together which \\(y\\) goes with which, we can find the function.\nFor example, the folium of Descartes has the equation\n\\[\nx^3 + y^3 = 3axy.\n\\]\nSetting \\(a=1\\) we have the graph:\n\n𝒂 = 1\nG(x,y) = x^3 + y^3 - 3*𝒂*x*y\nimplicit_plot(G)\n\n\n\n\nWe can solve for the lower curve, \\(y\\), as a function of \\(x\\), as follows:\n\ny1(x) = minimum(find_zeros(y->G(x,y), -10, 10)) # find_zeros from `Roots`\n\ny1 (generic function with 1 method)\n\n\nThis gives the lower part of the curve, which we can plot with:\n\nplot(y1, -5, 5)\n\n\n\n\nThough, in this case, the cubic equation would admit a closed-form solution, the approach illustrated applies more generally."
},
{
"objectID": "derivatives/implicit_differentiation.html#using-sympy-for-computation",
"href": "derivatives/implicit_differentiation.html#using-sympy-for-computation",
"title": "33  Implicit Differentiation",
"section": "33.3 Using SymPy for computation",
"text": "33.3 Using SymPy for computation\nSymPy can be used to perform implicit differentiation. The three steps are similar: we assume \\(y\\) is a function of \\(x\\), locally; differentiate both sides; solve the result for \\(dy/dx\\).\nLet's do so for the Trident of Newton, which is represented in Cartesian form as follows:\n\\[\nxy = ax^3 + bx^2 + cx + d.\n\\]\nTo approach this task in SymPy, we begin by defining our symbolic expression. For now, we keep the parameters as symbolic values:\n\n@syms a b c d x y\nex = x*y - (a*x^3 + b*x^2 + c*x + d)\n\n \n\\[\n- a x^{3} - b x^{2} - c x - d + x y\n\\]\n\n\n\nTo express that y is locally a function of x, we use a “symbolic function” object:\n\n@syms u()\n\n(u,)\n\n\nThe object u is the symbolic function, and u(x) is a symbolic expression involving a symbolic function. This is what we will use to refer to y.\nAssuming \\(y\\) is a function of \\(x\\), called u(x), this substitution is just a renaming:\n\nex1 = ex(y => u(x))\n\n \n\\[\n- a x^{3} - b x^{2} - c x - d + x u{\\left(x \\right)}\n\\]\n\n\n\nAt this point, we differentiate in x:\n\nex2 = diff(ex1, x)\n\n \n\\[\n- 3 a x^{2} - 2 b x - c + x \\frac{d}{d x} u{\\left(x \\right)} + u{\\left(x \\right)}\n\\]\n\n\n\nThe next step is to solve for \\(dy/dx\\) - the lone answer to the linear equation - which is done as follows:\n\ndydx = diff(u(x), x)\nex3 = solve(ex2, dydx)[1] # pull out lone answer with [1] indexing\n\n \n\\[\n\\frac{3 a x^{2} + 2 b x + c - u{\\left(x \\right)}}{x}\n\\]\n\n\n\nAs this represents an answer in terms of u(x), we replace that term with the original variable:\n\ndydx₁ = ex3(u(x) => y)\n\n \n\\[\n\\frac{3 a x^{2} + 2 b x + c - y}{x}\n\\]\n\n\n\nIf x and y are the variable names, this function will combine the steps above:\n\nfunction dy_dx(eqn, x, y)\n @syms u()\n eqn1 = eqn(y => u(x))\n eqn2 = solve(diff(eqn1, x), diff(u(x), x))[1]\n eqn2(u(x) => y)\nend\n\ndy_dx (generic function with 1 method)\n\n\nLet \\(a = b = c = d = 1\\), then \\((1,4)\\) is a point on the curve. 
We can draw a tangent line to this point with these commands:\n\nH = ex(a=>1, b=>1, c=>1, d=>1)\nx0, y0 = 1, 4\n𝒎 = dydx₁(x=>1, y=>4, a=>1, b=>1, c=>1, d=>1)\nimplicit_plot(lambdify(H); xlims=(-5,5), ylims=(-5,5), legend=false)\nplot!(y0 + 𝒎 * (x-x0))\n\n\n\n\nBasically this includes all the same steps as if done “by hand.” Some effort could have been saved in plotting, had values for the parameters been substituted initially, but not doing so shows their dependence in the derivative.\n\n\n\n\n\n\nWarning\n\n\n\nThe use of lambdify(H) is needed to turn the symbolic expression, H, into a function.\n\n\n\n\n\n\n\n\nNote\n\n\n\nWhile SymPy itself has the plot_implicit function for plotting implicit equations, this works only with PyPlot, not Plots, so we use the ImplicitPlots package in these examples."
},
{
"objectID": "derivatives/implicit_differentiation.html#higher-order-derivatives",
"href": "derivatives/implicit_differentiation.html#higher-order-derivatives",
"title": "33  Implicit Differentiation",
"section": "33.4 Higher order derivatives",
"text": "33.4 Higher order derivatives\nImplicit differentiation can be used to find \\(d^2y/dx^2\\) or other higher-order derivatives. At each stage, the same technique is applied. The only “trick” is that some simplifications can be made.\nFor example, consider \\(x^3 - y^3=3\\). To find \\(d^2y/dx^2\\), we first find \\(dy/dx\\):\n\\[\n3x^2 - (3y^2 \\frac{dy}{dx}) = 0.\n\\]\nWe could solve for \\(dy/dx\\) at this point - it always appears as a linear factor - to get:\n\\[\n\\frac{dy}{dx} = \\frac{3x^2}{3y^2} = \\frac{x^2}{y^2}.\n\\]\nHowever, we differentiate the first equation again, as we generally try to avoid the quotient rule:\n\\[\n6x - (6y \\frac{dy}{dx} \\cdot \\frac{dy}{dx} + 3y^2 \\frac{d^2y}{dx^2}) = 0.\n\\]\nAgain, it must be that \\(d^2y/dx^2\\) appears as a linear factor, so we can solve for it:\n\\[\n\\frac{d^2y}{dx^2} = \\frac{6x - 6y (\\frac{dy}{dx})^2}{3y^2}.\n\\]\nOne last substitution for \\(dy/dx\\) gives:\n\\[\n\\frac{d^2y}{dx^2} = \\frac{6x - 6y (\\frac{x^2}{y^2})^2}{3y^2} = 2\\frac{x}{y^2} - 2\\frac{x^4}{y^5} = 2\\frac{x}{y^2}(1 - \\frac{x^3}{y^3}) = 2\\frac{x}{y^5}(y^3 - x^3) = 2 \\frac{x}{y^5}(-3) = -\\frac{6x}{y^5}.\n\\]\nIt isn't so pretty, but that's all it takes.\nTo visualize, we plot implicitly and notice that:\n\nas we change quadrants from the third to the fourth to the first the concavity changes from down to up to down, as the sign of the second derivative changes from negative to positive to negative;\nand that at these inflection points, the “tangent” line is vertical when \\(y=0\\) and flat when \\(x=0\\).\n\n\nK(x,y) = x^3 - y^3 - 3\nimplicit_plot(K, xlims=(-3, 3), ylims=(-3, 3))\n\n\n\n\nThe same problem can be done symbolically. 
The steps are similar, though the last step (replacing \\(x^3 - y^3\\) with \\(3\\)) isn't done without explicitly asking.\n\n@syms x y u()\n\neqn = K(x,y) # K(x,y) is already x^3 - y^3 - 3\neqn1 = eqn(y => u(x))\ndydx = solve(diff(eqn1,x), diff(u(x), x))[1] # 1 solution\nd2ydx2 = solve(diff(eqn1, x, 2), diff(u(x),x, 2))[1] # 1 solution\neqn2 = d2ydx2(diff(u(x), x) => dydx, u(x) => y)\nsimplify(eqn2)\n\n \n\\[\n\\frac{2 x \\left(- x^{3} + y^{3}\\right)}{y^{5}}\n\\]"
},
{
"objectID": "derivatives/implicit_differentiation.html#inverse-functions",
"href": "derivatives/implicit_differentiation.html#inverse-functions",
"title": "33  Implicit Differentiation",
"section": "33.5 Inverse functions",
"text": "33.5 Inverse functions\nAs mentioned, an inverse function for \\(f(x)\\) is a function \\(g(x)\\) satisfying: \\(y = f(x)\\) if and only if \\(g(y) = x\\) for all \\(x\\) in the domain of \\(f\\) and \\(y\\) in the range of \\(f\\).\nIn short, both \\(f \\circ g\\) and \\(g \\circ f\\) are identity functions on their respective domains. As inverses are unique, their notation, \\(f^{-1}(x)\\), reflects the name of the related function.\nThe chain rule can be used to give the derivative of an inverse function when applied to \\(f(f^{-1}(x)) = x\\). Solving gives \\([f^{-1}(x)]' = 1 / f'(f^{-1}(x))\\).\nThis is great - if we can remember the rules. If not, sometimes implicit differentiation can also help.\nConsider the inverse function for the tangent, which exists when the domain of the tangent function is restricted to \\((-\\pi/2, \\pi/2)\\). The function solves \\(y = \\tan^{-1}(x)\\) or \\(\\tan(y) = x\\). Differentiating this yields:\n\\[\n\\sec(y)^2 \\frac{dy}{dx} = 1.\n\\]\nOr \\(dy/dx = 1/\\sec^2(y)\\).\nBut \\(\\sec(y)^2 = 1 + \\tan(y)^2 = 1 + x^2\\), as can be seen by right-triangle trigonometry. This yields the formula \\(dy/dx = [\\tan^{-1}(x)]' = 1 / (1 + x^2)\\).\n\nExample\nFor a more complicated example, suppose we have a moving trajectory \\((x(t), y(t))\\). The angle it makes with the origin satisfies\n\\[\n\\tan(\\theta(t)) = \\frac{y(t)}{x(t)}.\n\\]\nSuppose \\(\\theta(t)\\) can be defined in terms of the inverse to some function (\\(\\tan^{-1}(x)\\)). 
We can differentiate implicitly to find \\(\\theta'(t)\\) in terms of derivatives of \\(y\\) and \\(x\\):\n\\[\n\\sec^2(\\theta(t)) \\cdot \\theta'(t) = \\frac{y'(t) x(t) - y(t) x'(t)}{x(t)^2}.\n\\]\nBut \\(\\sec^2(\\theta(t)) = (r(t)/x(t))^2 = (x(t)^2 + y(t)^2) / x(t)^2\\), so moving the secant term to the other side gives an explicit, albeit complicated, expression for the derivative of \\(\\theta\\) in terms of the functions \\(x\\) and \\(y\\):\n\\[\n\\theta'(t) = \\frac{x^2(t)}{x^2(t) + y^2(t)} \\cdot \\frac{y'(t) x(t) - y(t) x'(t)}{x(t)^2} = \\frac{y'(t) x(t) - y(t) x'(t)}{x^2(t) + y^2(t)}.\n\\]\nThis could have been made easier, had we leveraged the result of the previous example.\n\n\nExample: from physics\nMany problems are best done with implicit derivatives. A video showing such a problem along with how to do it analytically is here.\nThis video starts with a simple question:\n\nIf you have a rope and heavy ring, where will the ring position itself due to gravity?\n\nWell, suppose you hold the rope in two places, which we can take to be \\((0,0)\\) and \\((a,b)\\). Then let \\((x,y)\\) be all the possible positions of the ring that hold the rope taut. Then we have this picture:\n\n\n\n\n\nSince the length of the rope does not change, we must have for any admissible \\((x,y)\\) that:\n\\[\nL = \\sqrt{x^2 + y^2} + \\sqrt{(a-x)^2 + (b-y)^2},\n\\]\nwhere these terms come from the two hypotenuses in the figure, as computed through the Pythagorean theorem.\n\nIf we assume that the ring will minimize the value of y subject to this constraint, can we solve for y?\n\nWe create a function to represent the equation:\n\nF₀(x, y, a, b) = sqrt(x^2 + y^2) + sqrt((a-x)^2 + (b-y)^2)\n\nF₀ (generic function with 1 method)\n\n\nTo illustrate, we need specific values of \\(a\\), \\(b\\), and \\(L\\):\n\n𝐚, 𝐛, 𝐋 = 3, 3, 10 # L > sqrt{a^2 + b^2}\nF₀(x, y) = F₀(x, y, 𝐚, 𝐛)\n\nF₀ (generic function with 2 methods)\n\n\nOur values \\((x,y)\\) must satisfy \\(F_0(x,y) = L\\). 
Let's graph:\n\nimplicit_plot((x,y) -> F₀(x,y) - 𝐋, xlims=(-5, 7), ylims=(-5, 7))\n\n\n\n\nThe graph is an ellipse, though slightly tilted.\nOkay, now to find the lowest point. This will be when the derivative is \\(0\\). We solve by assuming \\(y\\) is a function of \\(x\\) called u. We have already defined symbolic variables a, b, x, and y; here we define L:\n\n@syms L\n\n(L,)\n\n\nThen\n\neqn = F₀(x,y,a,b) - L\n\n \n\\[\n- L + \\sqrt{x^{2} + y^{2}} + \\sqrt{\\left(0.05 - x\\right)^{2} + \\left(0.25 - y\\right)^{2}}\n\\]\n\n\n\n\neqn_1 = diff(eqn(y => u(x)), x)\neqn_2 = solve(eqn_1, diff(u(x), x))[1]\ndydx₂ = eqn_2(u(x) => y)\n\n \n\\[\n\\frac{- 20.0 x \\sqrt{x^{2} + y^{2}} - 20.0 x \\sqrt{\\left(x - 0.05\\right)^{2} + \\left(y - 0.25\\right)^{2}} + \\sqrt{x^{2} + y^{2}}}{20.0 y \\left(x^{2} + y^{2}\\right)^{0.5} + 20.0 y \\left(\\left(x - 0.05\\right)^{2} + \\left(y - 0.25\\right)^{2}\\right)^{0.5} - 5.0 \\left(x^{2} + y^{2}\\right)^{0.5}}\n\\]\n\n\n\nWe are looking for when the tangent line has \\(0\\) slope, or when dydx is \\(0\\):\n\ncps = solve(dydx₂, x)\n\n2-element Vector{Sym}:\n 0.2⋅y\n 0.2*y/(8.0*y - 1.0)\n\n\nThere are two answers, as we could guess from the graph, but we want the one for the smallest value of \\(y\\), which is the second.\nThe values of dydx depend on any pair \\((x,y)\\), but our solution must also satisfy the equation. That is, for our value of \\(x\\), we need to find the corresponding \\(y\\). This should be possible by substituting:\n\neqn1 = eqn(x => cps[2])\n\n \n\\[\n- L + \\sqrt{y^{2} + \\frac{0.000625 y^{2}}{\\left(y - 0.125\\right)^{2}}} + \\sqrt{\\left(0.25 - y\\right)^{2} + 0.04 \\left(- \\frac{y}{8.0 y - 1.0} + 0.25\\right)^{2}}\n\\]\n\n\n\nWe would try to solve eqn1 for y with solve(eqn1, y), but SymPy can't complete this problem. Instead, we will approach this numerically using find_zero from the Roots package. 
We make the above a function of y alone:\n\neqn2 = eqn1(a=>3, b=>3, L=>10)\nystar = find_zero(eqn2, -3)\n\n-3.4872035192968935\n\n\nOkay, now we need to put this value back into our expression for the x value and also substitute in for the parameters:\n\nxstar = N(cps[2](y => ystar, a =>3, b => 3, L => 10))\n\n0.024134877095571772\n\n\nOur minimum is at (xstar, ystar), as this graphic shows:\n\ntl(x) = ystar + 0 * (x- xstar)\nimplicit_plot((x,y) -> F₀(x,y,3,3) - 10, xlims=(-4, 7), ylims=(-10, 10))\nplot!(tl)\n\n\n\n\nIf you watch the video linked to above, you will see that the surprising fact here is the resting point is such that the angles formed by the rope are the same. Basically this makes the tension in both parts of the rope equal, so there is a static position (if not static, the ring would move and not end in the final position). We can verify this fact numerically by showing the arctangents of the two triangles are the same up to a sign:\n\na0, b0 = 0,0 # the foci of the ellipse are (0,0) and (3,3)\na1, b1 = 3, 3\natan((b0 - ystar)/(a0 - xstar)) + atan((b1 - ystar)/(a1 - xstar)) # ≈ 0\n\n-0.42316792063329345\n\n\nNow, were we lucky and just happened to take \\(a=3\\), \\(b = 3\\) in such a way to make this work? Well, no. But convince yourself by doing the above for different values of \\(b\\).\n\nIn the above, we started with \\(F(x,y) = L\\) and solved symbolically for \\(y=f(x)\\) so that \\(F(x,f(x)) = L\\). Then we took a derivative of \\(f(x)\\) and set this equal to \\(0\\) to solve for the minimum \\(y\\) values.\nHere we try the same problem numerically, using a zero-finding approach to identify \\(f(x)\\).\nStarting with \\(F(x,y) = \\sqrt{x^2 + y^2} + \\sqrt{(x-1)^2 + (y-2)^2}\\) and \\(L=3\\), we have:\n\nF₁(x,y) = F₀(x,y, 1, 2) - 3 # a,b,L = 1,2,3\nimplicit_plot(F₁)\n\n\n\n\nTrying to find the lowest \\(y\\) value, we see from the graph that it is near \\(x=0.1\\). We can do better.\nFirst, we could try to solve for \\(f\\) using find_zero. 
Here is one way:\n\nf₀(x) = find_zero(y -> F₁(x, y), 0)\n\nf₀ (generic function with 1 method)\n\n\nWe use \\(0\\) as an initial guess, as the \\(y\\) value is near \\(0\\). More on this later. We could then just sample many \\(x\\) values between \\(-0.5\\) and \\(1.5\\) and find the one corresponding to the smallest \\(y\\) value:\n\nfindmin([f₀(x) for x ∈ range(-0.5, 1.5, length=100)])\n\n(-0.4142135621686101, 33)\n\n\nThis shows the smallest value is around \\(-0.414\\) and occurs in the \\(33\\)rd position of the sampled \\(x\\) values. Pretty good, but we can do better. We just need to differentiate \\(f\\), solve for \\(f'(x) = 0\\) and then put that value back into \\(f\\) to find the smallest \\(y\\).\nHowever, there is one subtle point. Using automatic differentiation, as implemented in ForwardDiff, with find_zero requires the x0 initial value to have a certain type. In this case, the same type as the “x” passed into \\(f(x)\\). So rather than use an initial value of \\(0\\), we must use an initial value zero(x)! (Otherwise, there will be an error “no method matching Float64(::ForwardDiff.Dual{...”.)\nWith this slight modification, we have:\n\nf₁(x) = find_zero(y -> F₁(x, y), zero(x))\nplot(f₁', -0.5, 1.5)\n\n\n\n\nThe zero of f' is a bit to the right of \\(0\\), say \\(0.2\\); we use find_zero again to find it:\n\nxstar₁ = find_zero(f₁', 0.2)\nxstar₁, f₁(xstar₁)\n\n(0.146446609406726, -0.4142135623730952)\n\n\nIt is important to note that the above uses of find_zero required good initial guesses, which we were fortunate enough to identify."
},
{
"objectID": "derivatives/implicit_differentiation.html#questions",
"href": "derivatives/implicit_differentiation.html#questions",
"title": "33  Implicit Differentiation",
"section": "33.6 Questions",
"text": "33.6 Questions\n\nQuestion\nIs \\((1,1)\\) on the graph of\n\\[\nx^2 - 2xy + y^2 = 1?\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor the equation\n\\[\nx^2y + 2y - 4 x = 0,\n\\]\nif \\(x=4\\), what is a value for \\(y\\) such that \\((x,y)\\) is a point on the graph of the equation?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor the equation\n\\[\n(y-5)\\cdot \\cos(4\\cdot \\sqrt{(x-4)^2 + y^2}) = x\\cdot\\sin(2\\sqrt{x^2 + y^2})\n\\]\nis the point \\((5,0)\\) a solution?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\((x/3)^2 + (y/2)^2 = 1\\). Find the slope of the tangent line at the point \\((3/\\sqrt{2}, 2/\\sqrt{2})\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nQuestion\nThe Lamé curves satisfy:\n\\[\n\\left(\\frac{x}{a}\\right)^n + \\left(\\frac{y}{b}\\right)^n = 1.\n\\]\nAn ellipse is when \\(n=2\\). Take \\(n=3\\), \\(a=1\\), and \\(b=2\\).\nFind a positive value of \\(y\\) when \\(x=1/2\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat expression gives \\(dy/dx\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(b \\cdot (1 - (x/a)^n)^{1/n}\\)\n \n \n\n\n \n \n \n \n \\(-(y/x) \\cdot (x/a)^n \\cdot (y/b)^{-n}\\)\n \n \n\n\n \n \n \n \n \\(-(x/a)^n / (y/b)^n\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(y - x^2 = -\\log(x)\\). At the point \\((1/2, 0.9431...)\\), the graph has a tangent line. 
Find this line, then find its intersection point with the \\(y\\) axis.\nThis intersection is:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe witch of Agnesi is the curve given by the equation\n\\[\ny(x^2 + a^2) = a^3.\n\\]\nIf \\(a=1\\), numerically find a value of \\(y\\) when \\(x=2\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat expression yields \\(dy/dx\\) for this curve:\n\n\n\n \n \n \n \n \n \n \n \n \n \\(2xy / (x^2 + a^2)\\)\n \n \n\n\n \n \n \n \n \\(-2xy/(x^2 + a^2)\\)\n \n \n\n\n \n \n \n \n \\(a^3/(x^2 + a^2)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\n\n\n\nImage number 35 from L'Hospital's calculus book (the first). Given a description of the curve, identify the point E which maximizes the height.\n\n\nThe figure above shows a problem appearing in L'Hospital's first calculus book. Given a function defined implicitly by \\(x^3 + y^3 = axy\\) (with \\(AP=x\\), \\(AM=y\\) and \\(AB=a\\)) find the point \\(E\\) that maximizes the height. In the AMS feature column this problem is illustrated and solved in the historical manner, with the comment that the concept of implicit differentiation wouldn't have occurred to L'Hospital.\nUsing implicit differentiation, find when \\(dy/dx = 0\\).\n\n\n\n \n \n \n \n \n \n \n \n \n \\(y^2 = 3x/a\\)\n \n \n\n\n \n \n \n \n \\(y=3x^2/a\\)\n \n \n\n\n \n \n \n \n \\(y^2=a/(3x)\\)\n \n \n\n\n \n \n \n \n \\(y=a/(3x^2)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nSubstituting the correct value of \\(y\\), above, into the defining equation gives what value for \\(x\\):\n\n\n\n \n \n \n \n \n \n \n \n \n \\(x=(1/3) a 2^{1/3}\\)\n \n \n\n\n \n \n \n \n \\(x=(1/2) a^3 3^{1/3}\\)\n \n \n\n\n \n \n \n \n \\(x=(1/3) a^2 2^{1/2}\\)\n \n \n\n\n \n \n \n \n \\(x=(1/2) a 2^{1/2}\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor the equation of an ellipse:\n\\[\n\\left(\\frac{x}{a}\\right)^2 + \\left(\\frac{y}{b}\\right)^2 = 1,\n\\]\ncompute \\(d^2y/dx^2\\). 
Is this the answer?\n\\[\n\\frac{d^2y}{dx^2} = -\\frac{b^2}{a^2\\cdot y} - \\frac{b^4\\cdot x^2}{a^4\\cdot y^3} = -\\frac{1}{y}\\frac{b^2}{a^2}(1 + \\frac{b^2 x^2}{a^2 y^2}).\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIf \\(y>0\\) is the sign positive or negative?\n\n\n\n \n \n \n \n \n \n \n \n \n positive\n \n \n\n\n \n \n \n \n negative\n \n \n\n\n \n \n \n \n Can be both\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIf \\(x>0\\) is the sign positive or negative?\n\n\n\n \n \n \n \n \n \n \n \n \n positive\n \n \n\n\n \n \n \n \n negative\n \n \n\n\n \n \n \n \n Can be both\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhen \\(x>0\\), the graph of the equation is…\n\n\n\n \n \n \n \n \n \n \n \n \n concave up\n \n \n\n\n \n \n \n \n concave down\n \n \n\n\n \n \n \n \n both concave up and down"
},
{
"objectID": "derivatives/implicit_differentiation.html#appendix",
"href": "derivatives/implicit_differentiation.html#appendix",
"title": "33  Implicit Differentiation",
"section": "33.7 Appendix",
"text": "33.7 Appendix\nThere are other packages in the Julia ecosystem that can plot implicit equations.\n\n33.7.1 The ImplicitEquations package\nThe ImplicitEquations package can plot equations and inequalities. The use is somewhat similar to the examples above, but the object plotted is a predicate, not a function. These predicates are created with functions like Eq or Lt.\nFor example, the ImplicitPlots manual shows a plot of the function \\(f(x,y) = (x^4 + y^4 - 1) \\cdot (x^2 + y^2 - 2) + x^5 \\cdot y\\). Using ImplicitEquations, this equation would be plotted with:\n\nusing ImplicitEquations\nf(x,y) = (x^4 + y^4 - 1) * (x^2 + y^2 - 2) + x^5 * y\nr = Eq(f, 0) # the equation f(x,y) = 0\nplot(r)\n\n\n\n\nUnlike ImplicitPlots, inequalities may be displayed:\n\nf(x,y) = (x^4 + y^4 - 1) * (x^2 + y^2 - 2) + x^5 * y\nr = Lt(f, 0) # the inequality f(x,y) < 0\nplot(r; M=10, N=10) # less blocky\n\n\n\n\nThe rendered plots look “blocky” due to the algorithm used to plot the equations. As there is no rule defining \\((x,y)\\) pairs to plot, a search by regions is done. A region is initially labeled undetermined. If it can be shown that for any value in the region the equation is true (equations can also be inequalities), the region is colored black. If it can be shown it will never be true, the region is dropped. If a black-and-white answer is not clear, the region is subdivided and each subregion is similarly tested. This continues until the remaining undecided regions are smaller than some threshold. Such regions comprise a boundary, and these are also colored black. Only regions are plotted - not \\((x,y)\\) pairs - so the results are blocky. Pass larger values of \\(M\\) and \\(N\\) (with defaults of \\(8\\)) to plot to lower the threshold at the cost of longer computation times, as seen in the last example.\n\n\n33.7.2 The IntervalConstraintProgramming package\nThe IntervalConstraintProgramming package also can be used to graph implicit equations. 
For certain problem descriptions it is significantly faster and makes better graphs. The usage is slightly more involved. We show the commands, but don't run them here, as there are minor conflicts with the CalculusWithJulia package.\nWe specify a problem using the @constraint macro. Using a macro allows expressions to involve free symbols, so the problem is specified in an equation-like manner:\nS = @constraint x^2 + y^2 <= 2\nThe right hand side must be a number.\nThe area to plot over must be specified as an IntervalBox, basically a pair of intervals. The interval \\([a,b]\\) is expressed through a..b:\nJ = -3..3\nX = IntervalArithmetic.IntervalBox(J, J)\nThe pave command does the heavy lifting:\nregion = IntervalConstraintProgramming.pave(S, X)\nA plot can be made of either the boundary, the interior, or both.\nplot(region.inner) # plot interior; use region.boundary for boundary"
},
{
"objectID": "derivatives/related_rates.html",
"href": "derivatives/related_rates.html",
"title": "34  Related rates",
"section": "",
"text": "This section uses these add-on packages:\nRelated rates problems involve two (or more) unknown quantities that are related through an equation. As the two quantities depend on each other, so do their rates of change with respect to some variable, which is often time, though exactly how remains to be discovered. Hence the name “related rates.”"
},
{
"objectID": "derivatives/related_rates.html#questions",
"href": "derivatives/related_rates.html#questions",
"title": "34  Related rates",
"section": "34.1 Questions",
"text": "34.1 Questions\n\nQuestion\nSupply and demand. Suppose demand for product \\(XYZ\\) is \\(d(x)\\) and supply is \\(s(x)\\). The excess demand is \\(d(x) - s(x)\\). Suppose this is positive. How does this influence price? Guess the “law” of economics that applies:\n\n\n\n \n \n \n \n \n \n \n \n \n The rate of change of price will be \\(0\\)\n \n \n\n\n \n \n \n \n The rate of change of price will increase\n \n \n\n\n \n \n \n \n The rate of change of price will be positive and will depend on the rate of change of excess demand.\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n(Theoretically, when demand exceeds supply, prices increase.)\n\n\nQuestion\nWhich makes more sense from an economic viewpoint?\n\n\n\n \n \n \n \n \n \n \n \n \n If the rate of change of unemployment is negative, the rate of change of wages will be negative.\n \n \n\n\n \n \n \n \n If the rate of change of unemployment is negative, the rate of change of wages will be positive.\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n(Colloquially, “the rate of change of unemployment is negative” means the unemployment rate is going down, so there are fewer workers available to fill new jobs.)\n\n\nQuestion\nIn chemistry there is a fundamental relationship between pressure (\\(P\\)), temperature (\\(T)\\) and volume (\\(V\\)) given by \\(PV=cT\\) where \\(c\\) is a constant. 
Which of the following would be true with respect to time?\n\n\n\n \n \n \n \n \n \n \n \n \n The rate of change of pressure is always increasing by \\(c\\)\n \n \n\n\n \n \n \n \n If volume is constant, the rate of change of pressure is proportional to the temperature\n \n \n\n\n \n \n \n \n If volume is constant, the rate of change of pressure is proportional to the rate of change of temperature\n \n \n\n\n \n \n \n \n If pressure is held constant, the rate of change of pressure is proportional to the rate of change of temperature\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA pebble is thrown into a lake causing ripples to form expanding circles. Suppose one of the circles expands at a rate of \\(1\\) foot per second and the radius of the circle is \\(10\\) feet, what is the rate of change of the area enclosed by the circle?\n\n\n\n \n \n \n \n \n\n \n  feet$^2$/second \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA pizza maker tosses some dough in the air. The dough is formed in a circle with radius \\(10\\). As it rotates, its area increases at a rate of \\(1\\) inch\\(^2\\) per second. What is the rate of change of the radius?\n\n\n\n \n \n \n \n \n\n \n  inches/second \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nAn FBI agent with a powerful spyglass is located in a boat anchored 400 meters offshore. A gangster under surveillance is driving along the shore. Assume the shoreline is straight and that the gangster is 1 km from the point on the shore nearest to the boat. If the spyglasses must rotate at a rate of \\(\\pi/4\\) radians per minute to track the gangster, how fast is the gangster moving? (In kilometers per minute.) Source.\n\n\n\n \n \n \n \n \n\n \n  kilometers/minute \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA flood lamp is installed on the ground 200 feet from a vertical wall. A six foot tall man is walking towards the wall at the rate of 4 feet per second. 
How fast is the tip of his shadow moving down the wall when he is 50 feet from the wall? Source. (As the question is written the answer should be positive.)\n\n\n\n \n \n \n \n \n\n \n  feet/second \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the hyperbola \\(y = 1/x\\) and think of it as a slide. A particle slides along the hyperbola so that its \\(x\\)-coordinate is increasing at a rate of \\(f(x)\\) units/sec. If its \\(y\\)-coordinate is decreasing at a constant rate of \\(1\\) unit/sec, what is \\(f(x)\\)? Source.\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f(x) = 1/x\\)\n \n \n\n\n \n \n \n \n \\(f(x) = x^0\\)\n \n \n\n\n \n \n \n \n \\(f(x) = x\\)\n \n \n\n\n \n \n \n \n \\(f(x) = x^2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA balloon is in the shape of a sphere, fortunately, as this gives a known formula, \\(V=4/3 \\pi r^3\\), for the volume. If the balloon is being filled so that the rate of change of volume per unit time is \\(2\\) and the radius is \\(3\\), what is the rate of change of the radius per unit time?\n\n\n\n \n \n \n \n \n\n \n  units per unit time \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the curve \\(f(x) = x^2 - \\log(x)\\). For a given \\(x\\), the tangent line intersects the \\(y\\) axis. Where?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(y = 1 - x^2 - \\log(x)\\)\n \n \n\n\n \n \n \n \n \\(y = x(2x - 1/x)\\)\n \n \n\n\n \n \n \n \n \\(y = 1 - \\log(x)\\)\n \n \n\n\n \n \n \n \n \\(y = 1 - x^2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIf \\(dx/dt = -1\\), what is \\(dy/dt\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(dy/dt = 2x + 1/x\\)\n \n \n\n\n \n \n \n \n \\(dy/dt = 1 - x^2 - \\log(x)\\)\n \n \n\n\n \n \n \n \n \\(dy/dt = 1\\)\n \n \n\n\n \n \n \n \n \\(dy/dt = -2x - 1/x\\)"
},
{
"objectID": "derivatives/taylor_series_polynomials.html",
"href": "derivatives/taylor_series_polynomials.html",
"title": "35  Taylor Polynomials and other Approximating Polynomials",
"section": "",
"text": "This section uses these add-on packages:\nThe tangent line was seen to be the “best” linear approximation to a function at a point \\(c\\). Approximating a function by a linear function gives an easier to use approximation at the expense of accuracy. It suggests a tradeoff between ease and accuracy. Is there a way to gain more accuracy at the expense of ease?\nQuadratic functions are still fairly easy to work with. Is it possible to find the best “quadratic” approximation to a function at a point \\(c\\).\nMore generally, for a given \\(n\\), what would be the best polynomial of degree \\(n\\) to approximate \\(f(x)\\) at \\(c\\)?\nWe will see in this section how the Taylor polynomial answers these questions, and is the appropriate generalization of the tangent line approximation."
},
{
"objectID": "derivatives/taylor_series_polynomials.html#the-secant-line-and-the-tangent-line",
"href": "derivatives/taylor_series_polynomials.html#the-secant-line-and-the-tangent-line",
"title": "35  Taylor Polynomials and other Approximating Polynomials",
"section": "35.1 The secant line and the tangent line",
"text": "35.1 The secant line and the tangent line\nWe approach this general problem much more indirectly than is needed. We introducing notations that are attributed to Newton and proceed from there. By leveraging SymPy we avoid tedious computations and hopefully gain some insight.\nSuppose \\(f(x)\\) is a function which is defined in a neighborhood of \\(c\\) and has as many continuous derivatives as we care to take at \\(c\\).\nWe have two related formulas:\n\nThe secant line connecting \\((c, f(c))\\) and \\((c+h, f(c+h))\\) for a value of \\(h>0\\) is given in point-slope form by\n\n\\[\nsl(x) = f(c) + \\frac{(f(c+h) - f(c))}{h} \\cdot (x-c).\n\\]\nThe slope is the familiar approximation to the derivative: \\((f(c+h)-f(c))/h\\).\n\nThe tangent line to the graph of \\(f(x)\\) at \\(x=c\\) is described by the function\n\n\\[\ntl(x) = f(c) + f'(c) \\cdot(x - c).\n\\]\nThe key is the term multiplying \\((x-c)\\) for the secant line is an approximation to the related term for the tangent line. That is, the secant line approximates the tangent line, which is the linear function that best approximates the function at the point \\((c, f(c))\\). 
This is quantified by the mean value theorem which states under our assumptions on \\(f(x)\\) that there exists some \\(\\xi\\) between \\(x\\) and \\(c\\) for which:\n\\[\nf(x) - tl(x) = \\frac{f''(\\xi)}{2} \\cdot (x-c)^2.\n\\]\nThe term “best” is deserved, as any other straight line will differ at least in an \\((x-c)\\) term, which in general is larger than an \\((x-c)^2\\) term for \\(x\\) “near” \\(c\\).\n(This is a consequence of Cauchy's mean value theorem with \\(F(c) = f(c) - f'(c)\\cdot(c-x)\\) and \\(G(c) = (c-x)^2\\)\n\\[\n\\begin{align*}\n\\frac{F'(\\xi)}{G'(\\xi)} &=\n\\frac{f'(\\xi) - f''(\\xi)(\\xi-x) - f'(\\xi)\\cdot 1}{2(\\xi-x)} \\\\\n&= -f''(\\xi)/2\\\\\n&= \\frac{F(c) - F(x)}{G(c) - G(x)}\\\\\n&= \\frac{f(c) - f'(c)(c-x) - (f(x) - f'(x)(x-x))}{(c-x)^2 - (x-x)^2} \\\\\n&= \\frac{f(c) + f'(c)(x-c) - f(x)}{(x-c)^2}\n\\end{align*}\n\\]\nThat is, \\(f(x) = f(c) + f'(c)(x-c) + f''(\\xi)/2\\cdot(x-c)^2\\), or \\(f(x)-tl(x)\\) is as described.)\nThe secant line also has an interpretation that will generalize - it is the smallest order polynomial that goes through, or interpolates, the points \\((c,f(c))\\) and \\((c+h, f(c+h))\\). This is obvious from the construction - as this is how the slope is derived - but from the formula itself requires showing \\(sl(c) = f(c)\\) and \\(sl(c+h) = f(c+h)\\). The former is straightforward, as \\((c-c) = 0\\), so clearly \\(sl(c) = f(c)\\). 
The latter requires a bit of algebra.\nWe have:\n\nThe best linear approximation at a point \\(c\\) is related to the linear polynomial interpolating the points \\(c\\) and \\(c+h\\) as \\(h\\) goes to \\(0\\).\n\nThis is the relationship we seek to generalize through our roundabout approach below:\n\nThe best approximation at a point \\(c\\) by a polynomial of degree \\(n\\) or less is related to the polynomial interpolating through the points \\(c, c+h, \\dots, c+nh\\) as \\(h\\) goes to \\(0\\).\n\nAs in the linear case, there is flexibility in the exact points chosen for the interpolation.\n\nNow, we take a small detour to define some notation. Instead of writing our two points as \\(c\\) and \\(c+h,\\) we use \\(x_0\\) and \\(x_1\\). For any set of points \\(x_0, x_1, \\dots, x_n\\), define the divided differences of \\(f\\) inductively, as follows:\n\\[\n\\begin{align}\nf[x_0] &= f(x_0) \\\\\nf[x_0, x_1] &= \\frac{f[x_1] - f[x_0]}{x_1 - x_0}\\\\\n\\cdots &\\\\\nf[x_0, x_1, x_2, \\dots, x_n] &= \\frac{f[x_1, \\dots, x_n] - f[x_0, x_1, x_2, \\dots, x_{n-1}]}{x_n - x_0}.\n\\end{align}\n\\]\nWe see the first two values look familiar, and to generate more we just take certain ratios akin to those formed when finding a secant line.\nWith this notation the secant line can be re-expressed as:\n\\[\nsl(x) = f[c] + f[c, c+h] \\cdot (x-c).\n\\]\nIf we think of \\(f[c, c+h]\\) as an approximate first derivative, we have an even stronger parallel between a secant line at \\(x=c\\) and the tangent line at \\(x=c\\): \\(tl(x) = f(c) + f'(c)\\cdot (x-c)\\).\nWe use SymPy to investigate. First we create a recursive function to compute the divided differences:\n\ndivided_differences(f, x) = f(x)\n\nfunction divided_differences(f, x, xs...)\n xs = sort(vcat(x, xs...))\n (divided_differences(f, xs[2:end]...) 
- divided_differences(f, xs[1:end-1]...)) / (xs[end] - xs[1])\nend\n\ndivided_differences (generic function with 2 methods)\n\n\nIn the following, by adding a getindex method, we enable the [] notation of Newton to work with symbolic functions, like u() defined below, which is used in place of \\(f\\):\n\nBase.getindex(u::SymFunction, xs...) = divided_differences(u, xs...)\n\n@syms x::real c::real h::positive u()\nex = u[c, c+h]\n\n \n\\[\n\\frac{- u{\\left(c \\right)} + u{\\left(c + h \\right)}}{h}\n\\]\n\n\n\nWe can take a limit and see the familiar (yet differently represented) value of \\(u'(c)\\):\n\nlimit(ex, h => 0)\n\n \n\\[\n\\left. \\frac{d}{d \\xi_{1}} u{\\left(\\xi_{1} \\right)} \\right|_{\\substack{ \\xi_{1}=c }}\n\\]\n\n\n\nThe choice of points is flexible. Here we use \\(c-h\\) and \\(c+h\\):\n\nlimit(u[c-h, c+h], h=>0)\n\n \n\\[\n\\left. \\frac{d}{d \\xi_{1}} u{\\left(\\xi_{1} \\right)} \\right|_{\\substack{ \\xi_{1}=c }}\n\\]\n\n\n\nNow, lets look at:\n\nex₂ = u[c, c+h, c+2h]\nsimplify(ex₂)\n\n \n\\[\n\\frac{u{\\left(c \\right)} - 2 u{\\left(c + h \\right)} + u{\\left(c + 2 h \\right)}}{2 h^{2}}\n\\]\n\n\n\nNot so bad after simplification. The limit shows this to be an approximation to the second derivative divided by \\(2\\):\n\nlimit(ex₂, h => 0)\n\n \n\\[\n\\frac{\\left. \\frac{d^{2}}{d \\xi_{1}^{2}} u{\\left(\\xi_{1} \\right)} \\right|_{\\substack{ \\xi_{1}=c }}}{2}\n\\]\n\n\n\n(The expression is, up to a divisor of \\(2\\), the second order forward difference equation, a well-known approximation to \\(f''\\).)\nThis relationship between higher-order divided differences and higher-order derivatives generalizes. This is expressed in this theorem:\n\nSuppose \\(m=x_0 < x_1 < x_2 < \\dots < x_n=M\\) are distinct points. 
If \\(f\\) has \\(n\\) continuous derivatives then there exists a value \\(\\xi\\), where \\(m < \\xi < M\\), satisfying:\n\n\\[\nf[x_0, x_1, \\dots, x_n] = \\frac{1}{n!} \\cdot f^{(n)}(\\xi).\n\\]\nThis immediately applies to the above, where we parameterized by \\(h\\): \\(x_0=c, x_1=c+h, x_2 = c+2h\\). For then, as \\(h\\) goes to \\(0\\), it must be that \\(m, M \\rightarrow c\\), and so the limit of the divided differences must converge to \\((1/2!) \\cdot f^{(2)}(c)\\), as \\(f^{(2)}(\\xi)\\) converges to \\(f^{(2)}(c)\\).\nA proof based on Rolles theorem appears in the appendix."
},
{
"objectID": "derivatives/taylor_series_polynomials.html#quadratic-approximations-interpolating-polynomials",
"href": "derivatives/taylor_series_polynomials.html#quadratic-approximations-interpolating-polynomials",
"title": "35  Taylor Polynomials and other Approximating Polynomials",
"section": "35.2 Quadratic approximations; interpolating polynomials",
"text": "35.2 Quadratic approximations; interpolating polynomials\nWhy the fuss? The answer comes from a result of Newton on interpolating polynomials. Consider a function \\(f\\) and \\(n+1\\) points \\(x_0\\), \\(x_1, \\dots, x_n\\). Then an interpolating polynomial is a polynomial of least degree that goes through each point \\((x_i, f(x_i))\\). The Newton form of such a polynomial can be written as:\n\\[\n\\begin{align*}\nf[x_0] &+ f[x_0,x_1] \\cdot (x-x_0) + f[x_0, x_1, x_2] \\cdot (x-x_0) \\cdot (x-x_1) + \\\\\n& \\cdots + f[x_0, x_1, \\dots, x_n] \\cdot (x-x_0)\\cdot \\cdots \\cdot (x-x_{n-1}).\n\\end{align*}\n\\]\nThe case \\(n=0\\) gives the value \\(f[x_0] = f(c)\\), which can be interpreted as the slope-\\(0\\) line that goes through the point \\((c,f(c))\\).\nWe are familiar with the case \\(n=1\\), with \\(x_0=c\\) and \\(x_1=c+h\\), this becomes our secant-line formula:\n\\[\nf[c] + f[c, c+h](x-c).\n\\]\nAs mentioned, we can verify directly that it interpolates the points \\((c,f(c))\\) and \\((c+h, f(c+h))\\). He we let SymPy do the algebra:\n\np₁ = u[c] + u[c, c+h] * (x-c)\np₁(x => c) - u(c), p₁(x => c+h) - u(c+h)\n\n(0, 0)\n\n\nNow for something new. Take the \\(n=2\\) case with \\(x_0 = c\\), \\(x_1 = c + h\\), and \\(x_2 = c+2h\\). Then the interpolating polynomial is:\n\\[\nf[c] + f[c, c+h](x-c) + f[c, c+h, c+2h](x-c)(x-(c+h)).\n\\]\nWe add the next term to our previous polynomial and simplify\n\np₂ = p₁ + u[c, c+h, c+2h] * (x-c) * (x-(c+h))\nsimplify(p₂)\n\n \n\\[\n\\frac{h^{2} u{\\left(c \\right)} + h \\left(c - x\\right) \\left(u{\\left(c \\right)} - u{\\left(c + h \\right)}\\right) + \\frac{\\left(c - x\\right) \\left(c + h - x\\right) \\left(u{\\left(c \\right)} - 2 u{\\left(c + h \\right)} + u{\\left(c + 2 h \\right)}\\right)}{2}}{h^{2}}\n\\]\n\n\n\nWe can check that this interpolates the three points. 
Notice that at \\(x_0=c\\) and \\(x_1=c+h\\), the last term, \\(f[x_0, x_1, x_2]\\cdot(x-x_0)(x-x_1)\\), vanishes, so we already have the polynomial interpolating there. Only the value \\(x_2=c+2h\\) remains to be checked:\n\np₂(x => c+2h) - u(c+2h)\n\n \n\\[\nh \\left(- \\frac{- u{\\left(c \\right)} + u{\\left(c + h \\right)}}{h} + \\frac{- u{\\left(c + h \\right)} + u{\\left(c + 2 h \\right)}}{h}\\right) - u{\\left(c \\right)} + 2 u{\\left(c + h \\right)} - u{\\left(c + 2 h \\right)}\n\\]\n\n\n\nHmm, doesn't seem correct - that was supposed to be \\(0\\). The issue isn't the math, it is that SymPy needs to be encouraged to simplify:\n\nsimplify(p₂(x => c+2h) - u(c+2h))\n\n \n\\[\n0\n\\]\n\n\n\nBy contrast, at the point \\(x=c+3h\\) we have no guarantee of interpolation, and indeed don't, as this expression is not always zero:\n\nsimplify(p₂(x => c+3h) - u(c+3h))\n\n \n\\[\nu{\\left(c \\right)} - 3 u{\\left(c + h \\right)} + 3 u{\\left(c + 2 h \\right)} - u{\\left(c + 3 h \\right)}\n\\]\n\n\n\nInterpolating polynomials are of interest in their own right, but for now we want to use them as motivation for the best polynomial approximation of a certain degree for a function. 
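Before generalizing, a quick numeric check may build confidence. The following sketch (illustrative only; it re-creates the recursive divided differences defined earlier in this section) evaluates the quadratic Newton-form polynomial at the three nodes and confirms it reproduces the function values there:

```julia
# Recursive divided differences, as defined earlier in this section
divided_differences(f, x) = f(x)
function divided_differences(f, x, xs...)
    ys = sort(vcat(x, xs...))            # put the nodes in order
    (divided_differences(f, ys[2:end]...) -
     divided_differences(f, ys[1:end-1]...)) / (ys[end] - ys[1])
end

f(x) = cos(x)
c, h = 0.0, 1/4

# Quadratic Newton-form interpolating polynomial through c, c+h, c+2h
p2(x) = divided_differences(f, c) +
        divided_differences(f, c, c+h) * (x - c) +
        divided_differences(f, c, c+h, c+2h) * (x - c) * (x - (c + h))

# The polynomial should match f at each node, up to floating point error
resid = maximum(abs(p2(xi) - f(xi)) for xi in (c, c+h, c+2h))
```

The residual is on the order of machine precision, numerically confirming the interpolation property that was checked symbolically above.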
Motivated by how the secant line leads to the tangent line, we note that coefficients of the quadratic interpolating polynomial above have limits as \\(h\\) goes to \\(0\\), leaving this polynomial:\n\\[\nf(c) + f'(c) \\cdot (x-c) + \\frac{1}{2!} \\cdot f''(c) (x-c)^2.\n\\]\nThis is clearly related to the tangent line approximation of \\(f(x)\\) at \\(x=c\\), but carrying an extra quadratic term.\nHere we visualize the approximations with the function \\(f(x) = \\cos(x)\\) at \\(c=0\\).\n\nf(x) = cos(x)\na, b = -pi/2, pi/2\nc = 0\nh = 1/4\n\nfp = -sin(c) # by hand, or use diff(f), ...\nfpp = -cos(c)\n\n\np = plot(f, a, b, linewidth=5, legend=false, color=:blue)\nplot!(p, x->f(c) + fp*(x-c), a, b; color=:green, alpha=0.25, linewidth=5) # tangent line is flat\nplot!(p, x->f(c) + fp*(x-c) + (1/2)*fpp*(x-c)^2, a, b; color=:green, alpha=0.25, linewidth=5) # a parabola\np\n\n\n\n\nThis graph illustrates that the extra quadratic term can track the curvature of the function, whereas the tangent line itself cant. 
So, we have a polynomial which is a “better” approximation, is it the best approximation?\nThe Cauchy mean value theorem, as in the case of the tangent line, will guarantee the existence of \\(\\xi\\) between \\(c\\) and \\(x\\), for which\n\\[\nf(x) - \\left(f(c) + f'(c) \\cdot(x-c) + \\frac{1}{2}\\cdot f''(c) \\cdot (x-c)^2 \\right) =\n\\frac{1}{3!}f'''(\\xi) \\cdot (x-c)^3.\n\\]\nIn this sense, the above quadratic polynomial, called the Taylor Polynomial of degree 2, is the best quadratic approximation to \\(f\\), as the difference goes to \\(0\\) at a rate of \\((x-c)^3\\).\nThe graphs of the secant line and approximating parabola for \\(h=1/4\\) are similar:\n\nf(x) = cos(x)\na, b = -pi/2, pi/2\nc = 0\nh = 1/4\n\nx0, x1, x2 = c-h, c, c+h\n\nf0 = divided_differences(f, x0)\nfd = divided_differences(f, x0, x1)\nfdd = divided_differences(f, x0, x1, x2)\n\np = plot(f, a, b, color=:blue, linewidth=5, legend=false)\nplot!(p, x -> f0 + fd*(x-x0), a, b, color=:green, alpha=0.25, linewidth=5);\nplot!(p, x -> f0 + fd*(x-x0) + fdd * (x-x0)*(x-x1), a,b, color=:green, alpha=0.25, linewidth=5);\np\n\n\n\n\nThough similar, the graphs are not identical, as the interpolating polynomials arent the best approximations. For example, in the tangent-line graph the parabola only intersects the cosine graph at \\(x=0\\), whereas for the secant-line graph - by definition - the parabola intersects the graph at least \\(2\\) times and the interpolating polynomial \\(3\\) times (at \\(x_0\\), \\(x_1\\), and \\(x_2\\)).\n\nExample\nConsider the function \\(f(t) = \\log(1 + t)\\). We have mentioned that for \\(t\\) small, the value \\(t\\) is a good approximation. 
A better one becomes:\n\\[\nf(0) + f'(0) \\cdot t + \\frac{1}{2} \\cdot f''(0) \\cdot t^2 = 0 + 1t - \\frac{t^2}{2}\n\\]\nA graph shows the difference:\n\nf(t) = log(1 + t)\na, b = -1/2, 1\nplot(f, a, b, legend=false, linewidth=5)\nplot!(t -> t, a, b)\nplot!(t -> t - t^2/2, a, b)\n\n\n\n\nThough we can see that the tangent line is a good approximation, the quadratic polynomial tracks the logarithm better farther from \\(c=0\\).\n\n\nExample\nA wire is bent in the form of a half circle with radius \\(R\\) centered at \\((0,R)\\), so the bottom of the wire is at the origin. A bead is released on the wire at angle \\(\\theta\\). As time evolves, the bead will slide back and forth. How? (Ignoring friction.)\nLet \\(U\\) be the potential energy, \\(U=mgh = mgR \\cdot (1 - \\cos(\\theta))\\). The velocity of the object will depend on \\(\\theta\\) - it will be \\(0\\) at the high point, and largest in magnitude at the bottom - and is given by \\(v(\\theta) = R \\cdot d\\theta/dt\\). (The bead moves along the wire so its distance traveled is \\(R\\cdot \\Delta \\theta\\), this, then, is just the time derivative of distance.)\nBy ignoring friction, the total energy is conserved giving:\n\\[\nK = \\frac{1}{2}m v^2 + mgR \\cdot (1 - \\cos(\\theta)) =\n\\frac{1}{2} m R^2 (\\frac{d\\theta}{dt})^2 + mgR \\cdot (1 - \\cos(\\theta)).\n\\]\nThe value of \\(1-\\cos(\\theta)\\) inhibits further work which would be possible were there an easier formula there. In fact, we could use the excellent quadratic approximation \\(\\cos(\\theta) \\approx 1 - \\theta^2/2\\), so that \\(1 - \\cos(\\theta) \\approx \\theta^2/2\\). 
Then we have:\n\\[\nK \\approx \\frac{1}{2} m R^2 (\\frac{d\\theta}{dt})^2 + mgR \\cdot \\frac{\\theta^2}{2}.\n\\]\nAssuming equality and differentiating in \\(t\\) gives by the chain rule:\n\\[\n0 = \\frac{1}{2} m R^2 2\\frac{d\\theta}{dt} \\cdot \\frac{d^2\\theta}{dt^2} + mgR \\theta\\cdot \\frac{d\\theta}{dt}.\n\\]\nThis can be solved to give this relationship:\n\\[\n\\frac{d^2\\theta}{dt^2} = - \\frac{g}{R}\\theta.\n\\]\nThe solution to this “equation” can be written (in some parameterization) as \\(\\theta(t)=A\\cos \\left(\\omega t+\\phi \\right)\\). This motion is the well-studied simple harmonic oscillator, a model for a simple pendulum.\n\n\nExample: optimization\nConsider the following approach to finding the minimum or maximum of a function:\n\nAt \\(x_k\\) fit a quadratic polynomial to \\(f(x)\\) matching the derivative and second derivative of \\(f\\).\nLet \\(x_{k+1}\\) be at the vertex of this fitted quadratic polynomial\nIterate to convergence\n\nThe polynomial in question will be the Taylor polynomial of degree \\(2\\):\n\\[\nT_2(x) = f(x_k) + f'(x_k)(x-x_k) + \\frac{f''(x_k)}{2}(x - x_k)^2\n\\]\nThe vertex of this quadratic polynomial will be when its derivative is \\(0\\), which can be solved for \\(x_{k+1}\\) giving:\n\\[\nx_{k+1} = x_k - \\frac{f'(x_k)}{f''(x_k)}.\n\\]\nThis assumes \\(f''(x_k)\\) is non-zero.\nOn inspection, it is seen that this is Newton's method applied to \\(f'(x)\\). This method, when convergent, finds a zero of \\(f'(x)\\). We know that should the algorithm converge, it will have found a critical point, not necessarily the location of a local extremum."
},
{
"objectID": "derivatives/taylor_series_polynomials.html#the-taylor-polynomial-of-degree-n",
"href": "derivatives/taylor_series_polynomials.html#the-taylor-polynomial-of-degree-n",
"title": "35  Taylor Polynomials and other Approximating Polynomials",
"section": "35.3 The Taylor polynomial of degree \\(n\\)",
"text": "35.3 The Taylor polynomial of degree \\(n\\)\nStarting with the Newton form of the interpolating polynomial of smallest degree:\n\\[\n\\begin{align*}\nf[x_0] &+ f[x_0,x_1] \\cdot (x - x_0) + f[x_0, x_1, x_2] \\cdot (x - x_0)\\cdot(x-x_1) + \\\\\n& \\cdots + f[x_0, x_1, \\dots, x_n] \\cdot (x-x_0) \\cdot \\cdots \\cdot (x-x_{n-1}).\n\\end{align*}\n\\]\nand taking \\(x_i = c + i\\cdot h\\), for a given \\(n\\), we have in the limit as \\(h > 0\\) goes to zero that coefficients of this polynomial converge to the coefficients of the Taylor Polynomial of degree n:\n\\[\nf(c) + f'(c)\\cdot(x-c) + \\frac{f''(c)}{2!}(x-c)^2 + \\cdots + \\frac{f^{(n)}(c)}{n!} (x-c)^n.\n\\]\nThis polynomial will be the best approximation of degree \\(n\\) or less to the function \\(f\\), near \\(c\\). The error will be given - again by an application of the Cauchy mean value theorem:\n\\[\n\\frac{1}{(n+1)!} \\cdot f^{(n+1)}(\\xi) \\cdot (x-c)^n\n\\]\nfor some \\(\\xi\\) between \\(c\\) and \\(x\\).\nThe Taylor polynomial for \\(f\\) about \\(c\\) of degree \\(n\\) can be computed by taking \\(n\\) derivatives. For such a task, the computer is very helpful. In SymPy the series function will compute the Taylor polynomial for a given \\(n\\). For example, here is the series expansion to 10 terms of the function \\(\\log(1+x)\\) about \\(c=0\\):\n\nc, n = 0, 10\nl = series(log(1 + x), x, c, n+1)\n\n \n\\[\nx - \\frac{x^{2}}{2} + \\frac{x^{3}}{3} - \\frac{x^{4}}{4} + \\frac{x^{5}}{5} - \\frac{x^{6}}{6} + \\frac{x^{7}}{7} - \\frac{x^{8}}{8} + \\frac{x^{9}}{9} - \\frac{x^{10}}{10} + O\\left(x^{11}\\right)\n\\]\n\n\n\nA pattern can be observed.\nUsing series, we can see Taylor polynomials for several familiar functions:\n\nseries(1/(1-x), x, 0, 10) # sum x^i for i in 0:n\n\n \n\\[\n1 + x + x^{2} + x^{3} + x^{4} + x^{5} + x^{6} + x^{7} + x^{8} + x^{9} + O\\left(x^{10}\\right)\n\\]\n\n\n\n\nseries(exp(x), x, 0, 10) # sum x^i/i! 
for i in 0:n\n\n \n\\[\n1 + x + \\frac{x^{2}}{2} + \\frac{x^{3}}{6} + \\frac{x^{4}}{24} + \\frac{x^{5}}{120} + \\frac{x^{6}}{720} + \\frac{x^{7}}{5040} + \\frac{x^{8}}{40320} + \\frac{x^{9}}{362880} + O\\left(x^{10}\\right)\n\\]\n\n\n\n\nseries(sin(x), x, 0, 10) # sum (-1)^i * x^(2i+1) / (2i+1)! for i in 0:n\n\n \n\\[\nx - \\frac{x^{3}}{6} + \\frac{x^{5}}{120} - \\frac{x^{7}}{5040} + \\frac{x^{9}}{362880} + O\\left(x^{10}\\right)\n\\]\n\n\n\n\nseries(cos(x), x, 0, 10) # sum (-1)^i * x^(2i) / (2i)! for i in 0:n\n\n \n\\[\n1 - \\frac{x^{2}}{2} + \\frac{x^{4}}{24} - \\frac{x^{6}}{720} + \\frac{x^{8}}{40320} + O\\left(x^{10}\\right)\n\\]\n\n\n\nEach of these last three has a pattern that can be expressed quite succinctly if the denominator is recognized as \\(n!\\).\nThe output of series includes a big “Oh” term, which identifies the scale of the error term, but also gets in the way of using the output. SymPy provides the removeO method to strip this. (It is called as object.removeO(), as it is a method of an object in SymPy.)\n\n\n\n\n\n\nNote\n\n\n\nA Taylor polynomial of degree \\(n\\) consists of \\(n+1\\) terms and an error term. The “Taylor series” is an infinite collection of terms, the first \\(n+1\\) matching the Taylor polynomial of degree \\(n\\). The fact that series are infinite means care must be taken when even talking about their existence, unlike a Taylor polynomial, which is just a polynomial and exists as long as a sufficient number of derivatives are available.\n\n\nWe define a function to compute Taylor polynomials from a function. The following returns a function, not a symbolic object, using D, from CalculusWithJulia, which is based on ForwardDiff.derivative, to find higher-order derivatives:\n\nfunction taylor_poly(f, c=0, n=2)\n x -> f(c) + sum(D(f, i)(c) * (x-c)^i / factorial(i) for i in 1:n)\nend\n\ntaylor_poly (generic function with 3 methods)\n\n\nWith a function, we can compare values. 
For example, here we see the difference between the Taylor polynomial and the answer for a small value of \\(x\\):\n\na = .1\nf(x) = log(1+x)\nTn = taylor_poly(f, 0, 5)\nTn(a) - f(a)\n\n1.5352900840925887e-7\n\n\n\n35.3.1 Plotting\nLet's now visualize a function and the two approximations - the Taylor polynomial and the interpolating polynomial. We use this function to generate the interpolating polynomial as a function:\n\nfunction newton_form(f, xs)\n x -> begin\n tot = divided_differences(f, xs[1])\n for i in 2:length(xs)\n tot += divided_differences(f, xs[1:i]...) * prod([x-xs[j] for j in 1:(i-1)])\n end\n tot\n end\nend\n\nnewton_form (generic function with 1 method)\n\n\nTo see a plot, we have\n\n𝒇(x) = sin(x)\n𝒄, 𝒉, 𝒏 = 0, 1/4, 4\nint_poly = newton_form(𝒇, [𝒄 + i*𝒉 for i in 0:𝒏])\ntp = taylor_poly(𝒇, 𝒄, 𝒏)\n𝒂, 𝒃 = -pi, pi\nplot(𝒇, 𝒂, 𝒃; linewidth=5, label=\"f\")\nplot!(int_poly; color=:green, label=\"interpolating\")\nplot!(tp; color=:red, label=\"Taylor\")\n\n\n\n\nTo get a better sense, we plot the residual differences here:\n\nd1(x) = 𝒇(x) - int_poly(x)\nd2(x) = 𝒇(x) - tp(x)\nplot(d1, 𝒂, 𝒃; color=:blue, label=\"interpolating\")\nplot!(d2; color=:green, label=\"Taylor\")\n\n\n\n\nThe graph should be \\(0\\) at each of the points in xs, which we can verify in the graph above. 
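As a small numeric aside (a sketch with a hand-computed Taylor polynomial, not code from this section): for the cosine at \\(c=0\\) the degree-\\(2\\) Taylor polynomial is \\(1 - x^2/2\\), and since the \\(x^3\\) coefficient of cosine vanishes, the error behaves like \\(x^4/4!\\) near \\(0\\). Halving \\(x\\) should therefore cut the error by a factor of roughly \\(2^4 = 16\\):

```julia
f(x) = cos(x)
T2(x) = 1 - x^2/2               # degree-2 Taylor polynomial at c = 0
err(x) = abs(f(x) - T2(x))      # actual approximation error

ratio = err(0.2) / err(0.1)     # expected near 16, as error ~ x^4/24
```

The computed ratio is close to \\(16\\), matching the predicted quartic decay of the error.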
Plotting over a wider region shows a common phenomenon that these polynomials approximate the function near the values, but quickly deviate away:\nIn this graph we make a plot of the Taylor polynomial for different sizes of \\(n\\) for the function \\(f(x) = 1 - \\cos(x)\\):\n\nf(x) = 1 - cos(x)\na, b = -pi, pi\nplot(f, a, b, linewidth=5, label=\"f\")\nplot!(taylor_poly(f, 0, 2), label=\"T₂\")\nplot!(taylor_poly(f, 0, 4), label=\"T₄\")\nplot!(taylor_poly(f, 0, 6), label=\"T₆\")\n\n\n\n\nThough all are good approximations near \\(c=0\\), as more terms are included, the Taylor polynomial becomes a better approximation over a wider range of values.\n\nExample: period of an orbiting satellite\nKeplers third law of planetary motion states:\n\nThe square of the orbital period of a planet is directly proportional to the cube of the semi-major axis of its orbit.\n\nIn formulas, \\(P^2 = a^3 \\cdot (4\\pi^2) / (G\\cdot(M + m))\\), where \\(M\\) and \\(m\\) are the respective masses. Suppose a satellite is in low earth orbit with a constant height, \\(a\\). Use a Taylor polynomial to approximate the period using Keplers third law to relate the quantities.\nSuppose \\(R\\) is the radius of the earth and \\(h\\) the height above the earth assuming \\(h\\) is much smaller than \\(R\\). The mass \\(m\\) of a satellite is negligible to that of the earth, so \\(M+m=M\\) for this purpose. 
We have:\n\\[\nP = \\frac{2\\pi}{\\sqrt{G\\cdot M}} \\cdot (h+R)^{3/2} = \\frac{2\\pi}{\\sqrt{G\\cdot M}} \\cdot R^{3/2} \\cdot (1 + h/R)^{3/2} = P_0 \\cdot (1 + h/R)^{3/2},\n\\]\nwhere \\(P_0\\) collects terms that involve the constants.\nWe can expand \\((1+x)^{3/2}\\) to fifth order, to get:\n\\[\n(1+x)^{3/2} \\approx 1 + \\frac{3x}{2} + \\frac{3x^2}{8} - \\frac{1x^3}{16} + \\frac{3x^4}{128} -\\frac{3x^5}{256}\n\\]\nOur approximation becomes:\n\\[\nP \\approx P_0 \\cdot (1 + \\frac{3(h/R)}{2} + \\frac{3(h/R)^2}{8} - \\frac{(h/R)^3}{16} + \\frac{3(h/R)^4}{128} - \\frac{3(h/R)^5}{256}).\n\\]\nTypically, if \\(h\\) is much smaller than \\(R\\) the first term is enough giving a formula like \\(P \\approx P_0 \\cdot(1 + \\frac{3h}{2R})\\).\nA satellite phone utilizes low orbit satellites to relay phone communications. The Iridium system uses satellites with an elevation \\(h=780km\\). The radius of the earth is \\(3,959 miles\\), the mass of the earth is \\(5.972 × 10^{24} kg\\), and the gravitational constant, \\(G\\) is \\(6.67408 \\cdot 10^{-11}\\) \\(m^3/(kg \\cdot s^2)\\).\nCompare the approximate value with \\(1\\) term to the exact value.\n\nG = 6.67408e-11\nH = 780 * 1000\nR = 3959 * 1609.34 # 1609 meters per mile\nM = 5.972e24\nP0, HR = (2pi)/sqrt(G*M) * R^(3/2), H/R\n\nPreal = P0 * (1 + HR)^(3/2)\nP1 = P0 * (1 + 3*HR/2)\nPreal, P1\n\n(6018.78431252517, 5990.893153415102)\n\n\nWith terms out to the fifth power, we get a better approximation:\n\nP5 = P0 * (1 + 3*HR/2 + 3*HR^2/8 - HR^3/16 + 3*HR^4/128 - 3*HR^5/256)\n\n6018.784204505923\n\n\nThe units of the period above are in seconds. That answer here is about \\(100\\) minutes:\n\nPreal/60\n\n100.31307187541951\n\n\nWhen \\(H\\) is much smaller than \\(R\\) the approximation with \\(5\\)th order is really good, and serviceable with just \\(1\\) term. Next we check if this is the same when \\(H\\) is larger than \\(R\\).\n\nThe height of a GPS satellite is about \\(12,550\\) miles. 
Compute the period of a circular orbit and compare with the estimates.\n\nHₛ = 12250 * 1609.34 # 1609 meters per mile\nHRₛ = Hₛ/R\n\nPrealₛ = P0 * (1 + HRₛ)^(3/2)\nP1ₛ = P0 * (1 + 3*HRₛ/2)\nP5ₛ = P0 * (1 + 3*HRₛ/2 + 3*HRₛ^2/8 - HRₛ^3/16 + 3*HRₛ^4/128 - 3*HRₛ^5/256)\n\nPrealₛ, P1ₛ, P5ₛ\n\n(41930.52564789311, 28553.22950490504, 31404.73066617854)\n\n\nWe see the Taylor polynomial underestimates badly in this case. A reminder that these approximations are locally good, but may not be good on all scales. Here \\(h \\approx 3R\\). We can see from this graph of \\((1+x)^{3/2}\\) and its \\(5\\)th degree Taylor polynomial \\(T_5\\) that it is a bad approximation when \\(x > 2\\).\n\n\n\n\n\n\nFinally, we show how to use the Unitful package. This package allows us to define different units, carry these units through computations, and convert between similar units with uconvert. In this example, we define several units, then show how they can then be used as constants.\n\nm, mi, kg, s, hr = u\"m\", u\"mi\", u\"kg\", u\"s\", u\"hr\"\n\nG = 6.67408e-11 * m^3 / kg / s^2\nH = uconvert(m, 12250 * mi) # unit convert miles to meter\nR = uconvert(m, 3959 * mi)\nM = 5.972e24 * kg\n\nP0, HR = (2pi)/sqrt(G*M) * R^(3/2), H/R\nPreal = P0 * (1 + HR)^(3/2) # in seconds\nPreal, uconvert(hr, Preal) # ≈ 11.65 hours\n\n(41930.68197490307 s, 11.647411659695297 hr)\n\n\nWe see Preal has the right units - the units of mass and distance cancel leaving a measure of time - but it is hard to sense how long this is. Converting to hours, helps us see the satellite orbits about twice per day.\n\n\nExample: computing \\(\\log(x)\\)\nWhere exactly does the value assigned to \\(\\log(5)\\) come from? The value needs to be computed. At some level, many questions resolve down to the basic operations of addition, subtraction, multiplication, and division. Preferably not the latter, as division is slow. 
Polynomials then should be fast to compute, and so computing logarithms using a polynomial becomes desirable.\nBut how? One can see details of a possible way here.\nFirst, there is usually a reduction stage. In this phase, the problem is transformed into one involving only a fixed interval of values. For this, values \\(k\\) and \\(m\\) are found so that \\(x = 2^k \\cdot (1+m)\\) and \\(\\sqrt{2}/2 < 1+m < \\sqrt{2}\\). If these are found, then \\(\\log(x)\\) can be computed with \\(k \\cdot \\log(2) + \\log(1+m)\\). The first value - a multiplication - can easily be computed using a pre-computed value of \\(\\log(2)\\); the second then reduces the problem to an interval.\nNow, for this problem a further trick is utilized, writing \\(s = m/(2+m)\\) so that \\(\\log(1+m)=\\log(1+s)-\\log(1-s)\\) for some small range of \\(s\\) values. These combined make it possible to compute \\(\\log(x)\\) for any real \\(x\\).\nTo compute \\(\\log(1\\pm s)\\), we can find a Taylor polynomial. 
Lets go out to degree \\(19\\) and use SymPy to do the work:\n\n@syms s\naₗ = series(log(1 + s), s, 0, 19)\nbₗ = series(log(1 - s), s, 0, 19)\na_b = (aₗ - bₗ).removeO() # remove\"Oh\" not remove\"zero\"\n\n \n\\[\n\\frac{2 s^{17}}{17} + \\frac{2 s^{15}}{15} + \\frac{2 s^{13}}{13} + \\frac{2 s^{11}}{11} + \\frac{2 s^{9}}{9} + \\frac{2 s^{7}}{7} + \\frac{2 s^{5}}{5} + \\frac{2 s^{3}}{3} + 2 s\n\\]\n\n\n\nThis is re-expressed as \\(2s + s \\cdot p\\) with \\(p\\) given by:\n\ncancel((a_b - 2s)/s)\n\n \n\\[\n\\frac{2 s^{16}}{17} + \\frac{2 s^{14}}{15} + \\frac{2 s^{12}}{13} + \\frac{2 s^{10}}{11} + \\frac{2 s^{8}}{9} + \\frac{2 s^{6}}{7} + \\frac{2 s^{4}}{5} + \\frac{2 s^{2}}{3}\n\\]\n\n\n\nNow, \\(2s = m - s\\cdot m\\), so the above can be reworked to be \\(\\log(1+m) = m - s\\cdot(m-p)\\).\n(For larger values of \\(m\\), a similar, but different approximation, can be used to minimize floating point errors.)\nHow big can the error be between this approximation and \\(\\log(1+m)\\)? We plot to see how big \\(s\\) can be:\n\n@syms v\nplot(v/(2+v), sqrt(2)/2 - 1, sqrt(2)-1)\n\n\n\n\nThis shows \\(s\\) is at most\n\nMax = (v/(2+v))(v => sqrt(2) - 1)\n\n \n\\[\n0.17157287525381\n\\]\n\n\n\nThe error term is like \\(2/19 \\cdot \\xi^{19}\\), which is largest at this maximal value of \\(s\\). Large is relative - it is really small:\n\n(2/19)*Max^19\n\n \n\\[\n2.99778410043418 \\cdot 10^{-16}\n\\]\n\n\n\nBasically that is machine precision. This means that, as far as can be told on the computer, the value produced by \\(2s + s \\cdot p\\) is about as accurate as can be done.\nTo try this out, we compute \\(\\log(5)\\). 
We have \\(5 = 2^2(1+0.25)\\), so \\(k=2\\) and \\(m=0.25\\).\n\nk, m = 2, 0.25\n𝒔 = m / (2+m)\npₗ = 2 * sum(𝒔^(2i)/(2i+1) for i in 1:8) # where the polynomial approximates the logarithm...\n\nlog(1 + m), m - 𝒔*(m-pₗ), log(1 + m) - ( m - 𝒔*(m-pₗ))\n\n(0.22314355131420976, 0.22314355131420976, 0.0)\n\n\nThe two values differ by less than \\(10^{-16}\\), as advertised. Re-assembling then, we compare the computed values:\n\nΔ = k * log(2) + (m - 𝒔*(m-pₗ)) - log(5)\n\n0.0\n\n\nThe actual code is different, as the Taylor polynomial isnt used. The Taylor polynomial is a great approximation near a point, but there might be better polynomial approximations for all values in an interval. In this case there is, and that polynomial is used in the production setting. This makes things a bit more efficient, but the basic idea remains - for a prescribed accuracy, a polynomial approximation can be found over a given interval, which can be cleverly utilized to solve for all applicable values.\n\n\nExample: higher order derivatives of the inverse function\nFor notational purposes, let \\(g(x)\\) be the inverse function for \\(f(x)\\). 
Assume both functions have a Taylor polynomial expansion:\n\\[\n\\begin{align*}\nf(x_0 + \\Delta_x) &= f(x_0) + a_1 \\Delta_x + a_2 (\\Delta_x)^2 + \\cdots + a_n (\\Delta_x)^n + \\dots\\\\\ng(y_0 + \\Delta_y) &= g(y_0) + b_1 \\Delta_y + b_2 (\\Delta_y)^2 + \\cdots + b_n (\\Delta_y)^n + \\dots\n\\end{align*}\n\\]\nThen using \\(x = g(f(x))\\), expanding the terms and using \\(\\approx\\) to drop the \\(\\dots\\), we have:\n\\[\n\\begin{align*}\nx_0 + \\Delta_x &= g(f(x_0 + \\Delta_x)) \\\\\n&\\approx g(f(x_0) + \\sum_{j=1}^n a_j (\\Delta_x)^j) \\\\\n&\\approx g(f(x_0)) + \\sum_{i=1}^n b_i \\left(\\sum_{j=1}^n a_j (\\Delta_x)^j \\right)^i \\\\\n&\\approx x_0 + \\sum_{i=1}^{n-1} b_i \\left(\\sum_{j=1}^n a_j (\\Delta_x)^j\\right)^i + b_n \\left(\\sum_{j=1}^n a_j (\\Delta_x)^j\\right)^n\n\\end{align*}\n\\]\nThat is:\n\\[\nb_n \\left(\\sum_{j=1}^n a_j (\\Delta_x)^j \\right)^n =\n(x_0 + \\Delta_x) - \\left( x_0 + \\sum_{i=1}^{n-1} b_i \\left(\\sum_{j=1}^n a_j (\\Delta_x)^j \\right)^i \\right)\n\\]\nSolving for \\(b_n = g^{(n)}(y_0) / n!\\) gives the formal expression:\n\\[\ng^{(n)}(y_0) = n! \\cdot \\lim_{\\Delta_x \\rightarrow 0}\n\\frac{\\Delta_x - \\sum_{i=1}^{n-1} b_i \\left(\\sum_{j=1}^n a_j (\\Delta_x)^j \\right)^i}{\n\\left(\\sum_{j=1}^n a_j (\\Delta_x)^j\\right)^n}\n\\]\n(This is following Liptaj).\nWe will use SymPy to take this limit for the first 4 derivatives. 
Here is some code that expands \\(x + \\Delta_x = g(f(x_0 + \\Delta_x))\\) and then uses SymPy to solve:\n\n@syms x₀ Δₓ f[1:4] g[1:4]\n\nas(i) = f[i]/factorial(i)\nbs(i) = g[i]/factorial(i)\n\ngᵏs = Any[]\neqns = Any[]\nfor n ∈ 1:4\n    Δy = sum(as(j) * Δₓ^j for j ∈ 1:n)\n    left = x₀ + Δₓ\n    right = x₀ + sum(bs(i)*Δy^i for i ∈ 1:n)\n\n    eqn = left ~ right\n    push!(eqns, eqn)\n\n    gⁿ = g[n]\n    ϕ = solve(eqn, gⁿ)[1]\n\n    # replace gᵢs in terms of computed fᵢs\n    for j ∈ 1:n-1\n        ϕ = subs(ϕ, g[j] => gᵏs[j])\n    end\n\n    L = limit(ϕ, Δₓ => 0)\n    push!(gᵏs, L)\n\nend\ngᵏs\n\n4-element Vector{Any}:\n 1/f₁\n -f₂/f₁^3\n (-12*f₁^2*f₃ + 36*f₁*f₂^2)/(12*f₁^6)\n (-3456*f₁^9*f₄ + 34560*f₁^8*f₂*f₃ - 51840*f₁^7*f₂^3)/(3456*f₁^14)\n\n\nWe can see the expected g' = 1/f' (more precisely, \\(g'(y) = 1/f'(f^{-1}(y))\\), where the point of evaluation is not written). In addition, we get 3 more formulas, hinting that the answers grow rapidly in terms of their complexity.\nFor each n, the code above sets up the two sides, left and right, of an equation involving the higher-order derivatives of \\(g\\). For example, when n=2 we have:\n\neqns[2]\n\n \n\\[\nx₀ + Δₓ = g₁ \\left(f₁ Δₓ + \\frac{f₂ Δₓ^{2}}{2}\\right) + \\frac{g₂ \\left(f₁ Δₓ + \\frac{f₂ Δₓ^{2}}{2}\\right)^{2}}{2} + x₀\n\\]\n\n\n\nThe solve function is used to identify \\(g^{(n)}\\) represented in terms of lower-order derivatives of \\(g\\). These values have been computed and stored and are then substituted into ϕ. Afterwards a limit is taken and the answer recorded."
},
{
"objectID": "derivatives/taylor_series_polynomials.html#questions",
"href": "derivatives/taylor_series_polynomials.html#questions",
"title": "35  Taylor Polynomials and other Approximating Polynomials",
"section": "35.4 Questions",
"text": "35.4 Questions\n\nQuestion\nCompute the Taylor polynomial of degree \\(10\\) for \\(\\sin(x)\\) about \\(c=0\\) using SymPy. Based on the form, which formula seems appropriate:\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\sum_{k=1}^{10} (-1)^{n+1} x^n/n\\)\n \n \n\n\n \n \n \n \n \\(\\sum_{k=0}^{4} (-1)^k/(2k+1)! \\cdot x^{2k+1}\\)\n \n \n\n\n \n \n \n \n \\(\\sum_{k=0}^{10} x^k\\)\n \n \n\n\n \n \n \n \n \\(\\sum_{k=0}^{10} x^n/n!\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute the Taylor polynomial of degree \\(10\\) for \\(e^x\\) about \\(c=0\\) using SymPy. Based on the form, which formula seems appropriate:\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\sum_{k=0}^{4} (-1)^k/(2k+1)! \\cdot x^{2k+1}\\)\n \n \n\n\n \n \n \n \n \\(\\sum_{k=1}^{10} (-1)^{n+1} x^n/n\\)\n \n \n\n\n \n \n \n \n \\(\\sum_{k=0}^{10} x^k\\)\n \n \n\n\n \n \n \n \n \\(\\sum_{k=0}^{10} x^n/n!\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute the Taylor polynomial of degree \\(10\\) for \\(1/(1-x)\\) about \\(c=0\\) using SymPy. Based on the form, which formula seems appropriate:\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\sum_{k=0}^{10} x^n/n!\\)\n \n \n\n\n \n \n \n \n \\(\\sum_{k=1}^{10} (-1)^{n+1} x^n/n\\)\n \n \n\n\n \n \n \n \n \\(\\sum_{k=0}^{4} (-1)^k/(2k+1)! \\cdot x^{2k+1}\\)\n \n \n\n\n \n \n \n \n \\(\\sum_{k=0}^{10} x^k\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(T_5(x)\\) be the Taylor polynomial of degree \\(5\\) for the function \\(\\sqrt{1+x}\\) about \\(x=0\\). What is the coefficient of the \\(x^5\\) term?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(7/256\\)\n \n \n\n\n \n \n \n \n \\(-5/128\\)\n \n \n\n\n \n \n \n \n \\(2/15\\)\n \n \n\n\n \n \n \n \n \\(1/5!\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe \\(5\\)th order Taylor polynomial for \\(\\sin(x)\\) about \\(c=0\\) is: \\(x - x^3/3! + x^5/5!\\). 
Use this to find the first \\(3\\) terms of the Taylor polynomial of \\(\\sin(x^2)\\) about \\(c=0\\).\nThey are:\n\n\n\n \n \n \n \n \n \n \n \n \n \\(x^2\\)\n \n \n\n\n \n \n \n \n \\(x^2 \\cdot (x - x^3/3! + x^5/5!)\\)\n \n \n\n\n \n \n \n \n \\(x^2 - x^6/3! + x^{10}/5!\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA more direct derivation of the form of the Taylor polynomial (here taken about \\(c=0\\)) is to assume a polynomial form that matches \\(f\\):\n\\[\nf(x) = a + bx + cx^2 + dx^3 + ex^4 + \\cdots\n\\]\nIf this is true, then formally evaluating at \\(x=0\\) gives \\(f(0) = a\\), so \\(a\\) is determined. Similarly, formally differentiating and evaluating at \\(0\\) gives \\(f'(0) = b\\). What is the result of formally differentiating \\(4\\) times and evaluating at \\(0\\):\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f''''(0) = 0\\)\n \n \n\n\n \n \n \n \n \\(f''''(0) = e\\)\n \n \n\n\n \n \n \n \n \\(f''''(0) = 4 \\cdot 3 \\cdot 2 e = 4! e\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nHow big an error is there in approximating \\(e^x\\) by its \\(5\\)th degree Taylor polynomial about \\(c=0\\), \\(1 + x + x^2/2! + x^3/3! + x^4/4! + x^5/5!\\), over \\([-1,1]\\)?\nThe error is known to be \\(( f^{(6)}(\\xi)/6!) 
\\cdot x^6\\) for some \\(\\xi\\) in \\([-1,1]\\).\n\nThe \\(6\\)th derivative of \\(e^x\\) is still \\(e^x\\):\n\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nWhich is true about the function \\(e^x\\):\n\n\n\n\n \n \n \n \n \n \n \n \n \n It is decreasing\n \n \n\n\n \n \n \n \n It both increases and decreases\n \n \n\n\n \n \n \n \n It is increasing\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nThe maximum value of \\(e^x\\) over \\([-1,1]\\) occurs at\n\n\n\n\n \n \n \n \n \n \n \n \n \n An end point\n \n \n\n\n \n \n \n \n A critical point\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nWhich theorem tells you that for a continuous function over closed interval, a maximum value will exist?\n\n\n\n\n \n \n \n \n \n \n \n \n \n The mean value theorem\n \n \n\n\n \n \n \n \n The extreme value theorem\n \n \n\n\n \n \n \n \n The intermediate value theorem\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nWhat is the largest possible value of the error:\n\n\n\n\n \n \n \n \n \n \n \n \n \n \\(1/6!\\cdot e^1 \\cdot 1^6\\)\n \n \n\n\n \n \n \n \n \\(1^6 \\cdot 1 \\cdot 1^6\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe error in using \\(T_k(x)\\) to approximate \\(e^x\\) over the interval \\([-1/2, 1/2]\\) is \\((1/(k+1)!) e^\\xi x^{k+1}\\), for some \\(\\xi\\) in the interval. This is less than \\(1/((k+1)!) 
e^{1/2} (1/2)^{k+1}\\).\n\nWhy?\n\n\n\n\n \n \n \n \n \n \n \n \n \n The function has a critical point at \\(x=1/2\\)\n \n \n\n\n \n \n \n \n The function is monotonic in \\(k\\), so achieves its maximum at \\(k+1\\)\n \n \n\n\n \n \n \n \n The function \\(e^x\\) is increasing, so takes on its largest value at the endpoint and the function \\(|x^n| \\leq |x|^n \\leq (1/2)^n\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nAssuming the above is right, find the smallest value \\(k\\) guaranteeing a error no more than \\(10^{-16}\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nThe function \\(f(x) = (1 - x + x^2) \\cdot e^x\\) has a Taylor polynomial about \\(0\\) such that all coefficients are rational numbers. Is it true that the numerators are all either \\(1\\) or prime? (From the 2014 Putnam exam.)\n\nHere is one way to get all the values bigger than 1:\n\nex = (1 - x + x^2)*exp(x)\nTn = series(ex, x, 0, 100).removeO()\nps = sympy.Poly(Tn, x).coeffs()\nqs = numer.(ps)\nqs[qs .> 1] |> Tuple # format better for output\n\n(97, 89, 83, 79, 73, 71, 67, 61, 59, 53, 47, 43, 41, 37, 31, 29, 23, 19, 17, 13, 11, 7, 5, 2, 3, 2)\n\n\nVerify by hand that each of the remaining values is a prime number to answer the question (Or you can use sympy.isprime.(qs)).\nAre they all prime or \\(1\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No"
},
{
"objectID": "derivatives/taylor_series_polynomials.html#appendix",
"href": "derivatives/taylor_series_polynomials.html#appendix",
"title": "35  Taylor Polynomials and other Approximating Polynomials",
"section": "35.5 Appendix",
"text": "35.5 Appendix\nWe mentioned two facts that could use a proof: the Newton form of the interpolating polynomial and the mean value theorem for divided differences. Our explanation tries to emphasize a parallel with the secant lines relationship with the tangent line. The standard way to discuss Taylor polynomials is different (also more direct) and so these two proofs are not in most calculus texts.\nA proof of the Newton form can be done knowing that the interpolating polynomial is unique and can be expressed either as\n\\[\ng(x)=a_0 + a_1 (x-x_0) + \\cdots + a_n (x-x_0)\\cdot\\cdots\\cdot(x-x_{n-1})\n\\]\nor in this reversed form\n\\[\nh(x)=b_0 + b_1 (x-x_n) + b_2(x-x_n)(x-x_{n-1}) + \\cdots + b_n (x-x_n)(x-x_{n-1})\\cdot\\cdots\\cdot(x-x_1).\n\\]\nThese two polynomials are of degree \\(n\\) or less and have \\(u(x) = h(x)-g(x)=0\\), by uniqueness. So the coefficients of \\(u(x)\\) are \\(0\\). We have that the coefficient of \\(x^n\\) must be \\(a_n-b_n\\) so \\(a_n=b_n\\). Our goal is to express \\(a_n\\) in terms of \\(a_{n-1}\\) and \\(b_{n-1}\\). Focusing on the \\(x^{n-1}\\) term, we have:\n\\[\n\\begin{align*}\nb_n(x-x_n)(x-x_{n-1})\\cdot\\cdots\\cdot(x-x_1)\n&- a_n\\cdot(x-x_0)\\cdot\\cdots\\cdot(x-x_{n-1}) \\\\\n&=\na_n [(x-x_1)\\cdot\\cdots\\cdot(x-x_{n-1})] [(x- x_n)-(x-x_0)] \\\\\n&= -a_n \\cdot(x_n - x_0) x^{n-1} + p_{n-2},\n\\end{align*}\n\\]\nwhere \\(p_{n-2}\\) is a polynomial of at most degree \\(n-2\\). (The expansion of \\((x-x_1)\\cdot\\cdots\\cdot(x-x_{n-1}))\\) leaves \\(x^{n-1}\\) plus some lower degree polynomial.) Similarly, we have \\(a_{n-1}(x-x_0)\\cdot\\cdots\\cdot(x-x_{n-2}) = a_{n-1}x^{n-1} + q_{n-2}\\) and \\(b_{n-1}(x-x_n)\\cdot\\cdots\\cdot(x-x_2) = b_{n-1}x^{n-1}+r_{n-2}\\). Combining, we get that the \\(x^{n-1}\\) term of \\(u(x)\\) is\n\\[\n(b_{n-1}-a_{n-1}) - a_n(x_n-x_0) = 0.\n\\]\nOn rearranging, this yields \\(a_n = (b_{n-1}-a_{n-1}) / (x_n - x_0)\\). 
By induction - that \\(a_i=f[x_0, x_1, \\dots, x_i]\\) and \\(b_i = f[x_n, x_{n-1}, \\dots, x_{n-i}]\\) (which has trivial base case) - this is \\((f[x_1, \\dots, x_n] - f[x_0,\\dots, x_{n-1}])/(x_n-x_0)\\).\nNow, assuming the Newton form is correct, a proof of the mean value theorem for divided differences comes down to Rolles theorem. Starting from the Newton form of the polynomial and expanding in terms of \\(1, x, \\dots, x^n\\) we see that \\(g(x) = p_{n-1}(x) + f[x_0, x_1, \\dots,x_n]\\cdot x^n\\), where now \\(p_{n-1}(x)\\) is a polynomial of degree at most \\(n-1\\). That is, the coefficient of \\(x^n\\) is \\(f[x_0, x_1, \\dots, x_n]\\). Consider the function \\(h(x)=f(x) - g(x)\\). It has zeros \\(x_0, x_1, \\dots, x_n\\).\nBy Rolles theorem, between any two such zeros \\(x_i, x_{i+1}\\), \\(0 \\leq i < n\\) there must be a zero of the derivative of \\(h(x)\\), say \\(\\xi^1_i\\). So \\(h'(x)\\) has zeros \\(\\xi^1_0 < \\xi^1_1 < \\dots < \\xi^1_{n-1}\\).\nWe visualize this with \\(f(x) = \\sin(x)\\) and \\(x_i = i\\) for \\(i=0, 1, 2, 3\\). The \\(x_i\\) values are indicated with circles, the \\(\\xi^1_i\\) values indicated with squares:\n\n\n\n\n\nAgain by Rolles theorem, between any pair of adjacent zeros \\(\\xi^1_i, \\xi^1_{i+1}\\) there must be a zero \\(\\xi^2_i\\) of \\(h''(x)\\). So there are \\(n-1\\) zeros of \\(h''(x)\\). Continuing, we see that there will be \\(n+1-3\\) zeros of \\(h^{(3)}(x)\\), \\(n+1-4\\) zeros of \\(h^{(4)}(x)\\), \\(\\dots\\), \\(n+1-(n-1)\\) zeros of \\(h^{(n-1)}(x)\\), and finally \\(n+1-n\\) (\\(1\\)) zeros of \\(h^{(n)}(x)\\). Call this last zero \\(\\xi\\). It satisfies \\(x_0 \\leq \\xi \\leq x_n\\). Further, \\(0 = h^{(n)}(\\xi) = f^{(n)}(\\xi) - g^{(n)}(\\xi)\\). But \\(g\\) is a degree \\(n\\) polynomial, so the \\(n\\)th derivative is the coefficient of \\(x^n\\) times \\(n!\\). In this case we have \\(0 = f^{(n)}(\\xi) - f[x_0, \\dots, x_n] n!\\). Rearranging yields the result."
},
{
"objectID": "integrals/area.html",
"href": "integrals/area.html",
"title": "36  Area under a curve",
"section": "",
"text": "This section uses these add-on packages:\nThe question of area has long fascinated human culture. As children, we learn early on the formulas for the areas of some geometric figures: a square is \\(b^2\\), a rectangle \\(b\\cdot h\\), a triangle \\(1/2 \\cdot b \\cdot h\\), and for a circle, \\(\\pi r^2\\). The area of a rectangle is often the intuitive basis for illustrating multiplication. The area of a triangle has been known for ages. Even complicated expressions, such as Herons formula, which relates the area of a triangle to measurements from its perimeter, have been around for 2000 years. The formula for the area of a circle is also quite old. Wikipedia dates it as far back as the Rhind papyrus of 1700 BC, with the approximation of \\(256/81\\) for \\(\\pi\\).\nThe modern approach to area begins with a non-negative function \\(f(x)\\) over an interval \\([a,b]\\). The goal is to compute the area under the graph. That is, the area between \\(f(x)\\) and the \\(x\\)-axis for \\(a \\leq x \\leq b\\).\nFor some functions, this area can be computed by geometry, for example, here we see the area under \\(f(x)\\) is just \\(1\\), as it is a triangle with base \\(2\\) and height \\(1\\):\nSimilarly, we know this area is also \\(1\\), it being a square:\nThis one is simply \\(\\pi/2\\), it being half a circle of radius \\(1\\):\nAnd this area can be broken into a sum of the area of a square and the area of a rectangle, or \\(1 + 1/2\\):\nBut what of more complicated areas? Can these have their area computed?"
},
{
"objectID": "integrals/area.html#approximating-areas",
"href": "integrals/area.html#approximating-areas",
"title": "36  Area under a curve",
"section": "36.1 Approximating areas",
"text": "36.1 Approximating areas\nIn a previous section, we saw this animation:\n\n\n \n The first triangle has area \\(1/2\\), the second has area \\(1/8\\), then \\(2\\) have area \\((1/8)^2\\), \\(4\\) have area \\((1/8)^3\\), ... With some algebra, the total area then should be \\(1/2 \\cdot (1 + (1/4) + (1/4)^2 + \\cdots) = 2/3\\).\n \n \n\n\n\nThis illustrates a method of Archimedes to compute the area contained in a parabola using the method of exhaustion. Archimedes leveraged a fact he discovered relating the areas of triangles inscribed in parabolic segments to create a sum that could be computed.\nThe pursuit of computing areas persisted. The method of computing area by finding a square with an equivalent area was known as quadrature. Over the years, many figures had their area computed, for example, the area under the graph of the cycloid (…Galileo tried empirically to find this using a tracing on sheet metal and a scale).\nHowever, as areas of geometric objects were replaced by the more general question of area related to graphs of functions, a more general study was called for.\nOne such approach is illustrated in this figure due to Beeckman from 1618 (from Bressoud)\nBeeckman actually did more than find the area. He generalized the relationship of rate \\(\\times\\) time \\(=\\) distance. The line was interpreted as a velocity; the “squares”, then, provided an approximate distance traveled when the velocity is taken as constant on each small time interval. Then the distance traveled can be approximated by a smaller quantity - just add the area of the rectangles squarely within the desired area (\\(6+16+6\\)) - and a larger quantity - by including all rectangles that have a portion of their area within the desired area (\\(10 + 16 + 10\\)). Beeckman argued that the error vanishes as the rectangles get smaller.\nAdding up the smaller “squares” can be a bit more efficient if we were to add all those in a row, or column at once. 
We would then add the areas of a smaller number of rectangles. For this curve, the two approaches are basically identical. For other curves, identifying which squares in a row would be added is much more complicated (though useful), but for a curve generated by a function, identifying which “squares” go in a rectangle is quite easy; in fact, we can see the rectangle will have a base given by that of the squares and a height depending on the function.\n\n36.1.1 Adding rectangles\nThe idea of the Riemann sum then is to approximate the area under the curve by the area of well-chosen rectangles in such a way that as the bases of the rectangles get smaller (hence adding more rectangles) the error in approximation vanishes.\nDefine a partition of \\([a,b]\\) to be a selection of points \\(a = x_0 < x_1 < \\cdots < x_{n-1} < x_n = b\\). The norm of the partition is the largest of all the differences \\(\\lvert x_i - x_{i-1} \\rvert\\). For a partition, consider an arbitrary selection of points \\(c_i\\) satisfying \\(x_{i-1} \\leq c_i \\leq x_{i}\\), \\(1 \\leq i \\leq n\\). Then the following is a Riemann sum:\n\\[\nS_n = f(c_1) \\cdot (x_1 - x_0) + f(c_2) \\cdot (x_2 - x_1) + \\cdots + f(c_n) \\cdot (x_n - x_{n-1}).\n\\]\nClearly for a given partition and choice of \\(c_i\\), the above can be computed. Each term \\(f(c_i)\\cdot(x_i-x_{i-1})\\) can be visualized as the area of a rectangle with base spanning from \\(x_{i-1}\\) to \\(x_i\\) and height given by the function value at \\(c_i\\). 
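Such a sum is easy to compute numerically. The following is a small sketch (the helper name riemann_sum and the particular choices of partition and tags are ours, for illustration, not from the text); it evaluates a right Riemann sum for \\(f(x) = x^2\\) over \\([0,1]\\), giving a value near the expected \\(1/3\\):

```julia
# Riemann sum Σ f(cᵢ)(xᵢ - xᵢ₋₁) for partition points xs and tags cs
# (an illustrative helper; the name is ours, not from the text)
riemann_sum(f, xs, cs) = sum(f(c) * (xr - xl)
                             for (xl, xr, c) in zip(xs[1:end-1], xs[2:end], cs))

n = 1000
xs = range(0, 1, length=n+1)       # regular partition of [0, 1]
cs = xs[2:end]                     # tags cᵢ = xᵢ give a right Riemann sum
S = riemann_sum(x -> x^2, xs, cs)  # close to 1/3, Archimedes' area
```

Increasing n shrinks the norm of the partition, and the computed value settles toward the limiting area.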
The following visualizes left Riemann sums for different values of \\(n\\) in a way that makes Beeckmans intuition plausible that as the number of rectangles gets larger, the approximate sum will get closer to the actual area.\n\n\n \n Illustration of left Riemann sum for increasing \\(n\\) values\n \n \n\n\n\nTo successfully compute a good approximation for the area, we would need to choose \\(c_i\\) and the partition so that a formula can be found to express the dependence on the size of the partition.\nFor Archimedes problem - finding the area under \\(f(x)=x^2\\) between \\(0\\) and \\(1\\) - if we take as a partition \\(x_i = i/n\\) and \\(c_i = x_i\\), then the above sum becomes:\n\\[\n\\begin{align*}\nS_n &= f(c_1) \\cdot (x_1 - x_0) + f(c_2) \\cdot (x_2 - x_1) + \\cdots + f(c_n) \\cdot (x_n - x_{n-1})\\\\\n&= (x_1)^2 \\cdot \\frac{1}{n} + (x_2)^2 \\cdot \\frac{1}{n} + \\cdots + (x_n)^2 \\cdot \\frac{1}{n}\\\\\n&= 1^2 \\cdot \\frac{1}{n^3} + 2^2 \\cdot \\frac{1}{n^3} + \\cdots + n^2 \\cdot \\frac{1}{n^3}\\\\\n&= \\frac{1}{n^3} \\cdot (1^2 + 2^2 + \\cdots + n^2) \\\\\n&= \\frac{1}{n^3} \\cdot \\frac{n\\cdot(n+1)\\cdot(2n+1)}{6}.\n\\end{align*}\n\\]\nThe latter uses a well-known formula for the sum of squares of the first \\(n\\) natural numbers.\nWith this expression, it is readily seen that as \\(n\\) gets large this value gets close to \\(2/6 = 1/3\\).\n\n\n\n\n\n\nNote\n\n\n\nThe above approach, like Archimedes, ends with a limit being taken. The answer comes from using a limit to add a big number of small values. As with all limit questions, worrying about whether a limit exists is fundamental. For this problem, we will see that for the general statement there is a stretching of the formal concept of a limit.\n\n\n\nThere is a more compact notation for \\(x_1 + x_2 + \\cdots + x_n\\), this using the summation notation or capital sigma. 
We have:\n\\[\n\\Sigma_{i = 1}^n x_i = x_1 + x_2 + \\cdots + x_n\n\\]\nThe notation includes three pieces of information:\n\nThe \\(\\Sigma\\) is an indication of a sum\nThe \\({i=1}\\) and \\(n\\) sub- and superscripts indicate the range to sum over.\nThe term \\(x_i\\) is a general term describing the \\(i\\)th entry, where it is understood that \\(i\\) is just some arbitrary indexing value.\n\nWith this notation, a Riemann sum can be written as \\(\\Sigma_{i=1}^n f(c_i)(x_i-x_{i-1})\\).\n\n\n36.1.2 Other sums\nThe choice of the \\(c_i\\) will give different answers for the approximation, though for an integrable function these differences will vanish in the limit. Some common choices are:\n\nUsing the right hand endpoint of the interval \\([x_{i-1}, x_i]\\) giving the right-Riemann sum, \\(R_n\\).\nThe choice \\(c_i = x_{i-1}\\) gives the left-Riemann sum, \\(L_n\\).\nThe choice \\(c_i = (x_i + x_{i-1})/2\\) is the midpoint rule, \\(M_n\\).\nIf the function is continuous on the closed subinterval \\([x_{i-1}, x_i]\\), then, by the extreme value theorem, it will take on its minimum and maximum values there. We could take \\(c_i\\) to correspond to either the maximum or the minimum. These choices give the “upper Riemann-sums” and “lower Riemann-sums”.\n\nThe choice of partition can also give different answers. A common choice is to break the interval into \\(n\\) equal-sized pieces. With \\(\\Delta = (b-a)/n\\), the endpoints form the arithmetic sequence \\(a = a + 0 \\cdot \\Delta < a + 1 \\cdot \\Delta < a + 2 \\cdot \\Delta < \\cdots < a + n \\cdot \\Delta = b\\) with \\(x_i = a + i (b-a)/n\\). (The range(a, b, length=n+1) command will compute these.) 
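These choices of \\(c_i\\) are easy to compare numerically. This sketch (our own illustration; the names Lₙ, Rₙ, and Mₙ are ours) computes the left, right, and midpoint sums for \\(f(x) = x^2\\) over \\([0,1]\\) with a regular partition:

```julia
# Left, right, and midpoint Riemann sums for f(x) = x² over [0, 1]
# (an illustrative sketch; the names Lₙ, Rₙ, Mₙ are ours)
f(x) = x^2
a, b, n = 0, 1, 50
xs = range(a, b, length=n+1)   # regular partition
Δ = (b - a) / n

Lₙ = sum(f(x) * Δ for x in xs[1:end-1])   # cᵢ = xᵢ₋₁
Rₙ = sum(f(x) * Δ for x in xs[2:end])     # cᵢ = xᵢ
Mₙ = sum(f((xl + xr)/2) * Δ for (xl, xr) in zip(xs[1:end-1], xs[2:end]))

(Lₙ, Rₙ, Mₙ)   # all three approach 1/3 as n grows
```

For this increasing function the left sum underestimates and the right sum overestimates the area, while the midpoint rule is noticeably closer than either.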
An alternate choice made below for one problem is to use a geometric progression:\n\\[\na = a(1+\\alpha)^0 < a(1+\\alpha)^1 < a (1+\\alpha)^2 < \\cdots < a (1+\\alpha)^n = b.\n\\]\nThe general statement allows for any partition such that the largest gap goes to \\(0\\).\n\nRiemann sums werent named after Riemann because he was the first to approximate areas using rectangles. Indeed, others had been using even more efficient ways to compute areas for centuries prior to Riemanns work. Rather, Riemann put the definition of the area under the curve on a firm theoretical footing with the following theorem which gives a concrete notion of what functions are integrable:\n\nRiemann Integral: A function \\(f\\) is Riemann integrable over the interval \\([a,b]\\) and its integral will have value \\(V\\) provided for every \\(\\epsilon > 0\\) there exists a \\(\\delta > 0\\) such that for any partition \\(a =x_0 < x_1 < \\cdots < x_n=b\\) with \\(\\lvert x_i - x_{i-1} \\rvert < \\delta\\) and for any choice of points \\(x_{i-1} \\leq c_i \\leq x_{i}\\) this is satisfied:\n\\[\n\\lvert \\sum_{i=1}^n f(c_i)(x_{i} - x_{i-1}) - V \\rvert < \\epsilon.\n\\]\nWhen the integral exists, it is written \\(V = \\int_a^b f(x) dx\\).\n\n\n\n\n\n\n\nHistory note\n\n\n\nThe expression \\(V = \\int_a^b f(x) dx\\) is known as the definite integral of \\(f\\) over \\([a,b]\\). Much earlier than Riemann, Cauchy had defined the definite integral in terms of a sum of rectangular products beginning with \\(S=(x_1 - x_0) f(x_0) + (x_2 - x_1) f(x_1) + \\cdots + (x_n - x_{n-1}) f(x_{n-1})\\) (the left Riemann sum). He showed the limit was well defined for any continuous function. Riemanns formulation relaxes the choice of partition and the choice of the \\(c_i\\) so that integrability can be better understood.\n\n\n\n\n36.1.3 Some immediate consequences\nThe following formulas are consequences when \\(f(x)\\) is integrable. 
These mostly follow through a judicious rearranging of the approximating sums.\nThe area is \\(0\\) when there is no width to the interval to integrate over:\n\n\\[\n\\int_a^a f(x) dx = 0.\n\\]\n\nEven our definition of a partition doesnt really apply, as we assume \\(a < b\\), but clearly if \\(a=x_0=x_n=b\\) then our only “approximating” sum could be \\(f(a)(b-a) = 0\\).\nThe area under a constant function is found from the area of a rectangle, a special case being \\(c=0\\) yielding \\(0\\) area:\n\n\\[\n\\int_a^b c dx = c \\cdot (b-a).\n\\]\n\nFor any partition of \\(a < b\\), we have \\(S_n = c(x_1 - x_0) + c(x_2 -x_1) + \\cdots + c(x_n - x_{n-1})\\). By factoring out the \\(c\\), we have a telescoping sum which means the sum simplifies to \\(S_n = c(x_n-x_0) = c(b-a)\\). Hence any limit must be this constant value.\nScaling the \\(y\\) axis by a constant can be done before or after computing the area:\n\n\\[\n\\int_a^b cf(x) dx = c \\int_a^b f(x) dx.\n\\]\n\nLet \\(a=x_0 < x_1 < \\cdots < x_n=b\\) be any partition. Then we have \\(S_n= cf(c_1)(x_1-x_0) + \\cdots + cf(c_n)(x_n-x_{n-1})\\) \\(=\\) \\(c\\cdot\\left[ f(c_1)(x_1 - x_0) + \\cdots + f(c_n)(x_n - x_{n-1})\\right]\\). The “limit” of the left side is \\(\\int_a^b c f(x) dx\\). The “limit” of the right side is \\(c \\cdot \\int_a^b f(x) dx\\). We call this a “sketch” as a formal proof would show that for any \\(\\epsilon\\) we could choose a \\(\\delta\\) so that any partition with norm less than \\(\\delta\\) will yield a sum within \\(\\epsilon\\) of the limit. Here, then, our “any” partition would be one for which the \\(\\delta\\) on the left hand side applies. 
The computation shows that the same \\(\\delta\\) would apply for the right hand side when \\(\\epsilon\\) is the same.\nThe area is invariant under shifts left or right.\n\n\\[\n\\int_a^b f(x - c) dx = \\int_{a-c}^{b-c} f(x) dx.\n\\]\n\nAny partition \\(a =x_0 < x_1 < \\cdots < x_n=b\\) is related to a partition of \\([a-c, b-c]\\) through \\(a-c = x_0-c < x_1-c < \\cdots < x_n - c = b-c\\). Let \\(d_i=c_i-c\\) denote the corresponding choice of points; then we have:\n\\[\nf(c_1 -c) \\cdot (x_1 - x_0) + f(c_2 -c) \\cdot (x_2 - x_1) + \\cdots + f(c_n -c) \\cdot (x_n - x_{n-1}) =\nf(d_1) \\cdot(x_1-c - (x_0-c)) + f(d_2) \\cdot(x_2-c - (x_1-c)) + \\cdots + f(d_n) \\cdot(x_n-c - (x_{n-1}-c)).\n\\]\nThe left side will have a limit of \\(\\int_a^b f(x-c) dx\\); the right would have a “limit” of \\(\\int_{a-c}^{b-c}f(x)dx\\).\nSimilarly, reflections dont affect the area under the curve; they just require a new parameterization:\n\n\\[\n\\int_a^b f(x) dx = \\int_{-b}^{-a} f(-x) dx\n\\]\n\nThe scaling operation \\(g(x) = f(cx)\\) has the following:\n\n\\[\n\\int_a^b f(c\\cdot x) dx = \\frac{1}{c} \\int_{ca}^{cb}f(x) dx\n\\]\n\nThe scaling operation shifts \\(a\\) to \\(ca\\) and \\(b\\) to \\(cb\\) so the limits of integration make sense. However, the area stretches by \\(c\\) in the \\(x\\) direction, so must contract by \\(c\\) in the \\(y\\) direction to stay in balance. Hence the factor of \\(1/c\\).\nCombining two operations above, the operation \\(g(x) = \\frac{1}{h}f(\\frac{x-c}{h})\\) will leave the area between \\(a\\) and \\(b\\) under \\(g\\) the same as the area under \\(f\\) between \\((a-c)/h\\) and \\((b-c)/h\\).\n\nThe area between \\(a\\) and \\(b\\) can be broken up into the sum of the area between \\(a\\) and \\(c\\) and that between \\(c\\) and \\(b\\).\n\n\\[\n\\int_a^b f(x) dx = \\int_a^c f(x) dx + \\int_c^b f(x) dx.\n\\]\n\nFor this, suppose we have a partition for both the integrals on the right hand side for a given \\(\\epsilon/2\\) and \\(\\delta\\). 
Combining these into a partition of \\([a,b]\\) will mean the norm is still less than \\(\\delta\\). The errors in the approximating sums will combine to be no more than \\(\\epsilon/2 + \\epsilon/2\\), so for a given \\(\\epsilon\\), this \\(\\delta\\) applies.\nThe area to the left of \\(0\\) matches a corresponding area to the right:\n\n\\[\n\\int_{-a}^0 f(x) dx = \\int_0^a f(-x) dx.\n\\]\n\nThis is due to the area on the left and right of \\(0\\) being equivalent.\nThe “reversed” area is the same, only accounted for with a minus sign.\n\n\\[\n\\int_a^b f(x) dx = -\\int_b^a f(x) dx.\n\\]\n\nA consequence of the last few statements is:\n\nIf \\(f(x)\\) is an even function, then \\(\\int_{-a}^a f(x) dx = 2 \\int_0^a f(x) dx\\). If \\(f(x)\\) is an odd function, then \\(\\int_{-a}^a f(x) dx = 0\\).\n\nIf \\(g\\) bounds \\(f\\) then the area under \\(g\\) will bound the area under \\(f\\); in particular, if \\(f(x)\\) is non-negative, the area under \\(f\\) will also be non-negative for any \\(a < b\\). (This assumes that \\(g\\) and \\(f\\) are integrable.)\n\nIf \\(0 \\leq f(x) \\leq g(x)\\) then \\(\\int_a^b f(x) dx \\leq \\int_a^b g(x) dx.\\)\n\nFor any partition of \\([a,b]\\) and choice of \\(c_i\\), we have the term-by-term bound \\(f(c_i)(x_i-x_{i-1}) \\leq g(c_i)(x_i-x_{i-1})\\). So any sequence of partitions that converges to the limits will have this inequality maintained for the sum.\n\n\n36.1.4 Some known integrals\nUsing the definition, we can compute a few definite integrals:\n\n\\[\n\\int_a^b c dx = c \\cdot (b-a).\n\\]\n\n\n\\[\n\\int_a^b x dx = \\frac{b^2}{2} - \\frac{a^2}{2}.\n\\]\n\nThis is just the area of a trapezoid with heights \\(a\\) and \\(b\\) and width \\(b-a\\), or \\(1/2 \\cdot (b + a) \\cdot (b - a)\\). 
The right sum would be:\n\\[\n\\begin{align*}\nS &= x_1 \\cdot (x_1 - x_0) + x_2 \\cdot (x_2 - x_1) + \\cdots + x_n \\cdot (x_n - x_{n-1}) \\\\\n&= (a + 1\\frac{b-a}{n}) \\cdot \\frac{b-a}{n} + (a + 2\\frac{b-a}{n}) \\cdot \\frac{b-a}{n} + \\cdots + (a + n\\frac{b-a}{n}) \\cdot \\frac{b-a}{n}\\\\\n&= n \\cdot a \\cdot (\\frac{b-a}{n}) + (1 + 2 + \\cdots + n) \\cdot (\\frac{b-a}{n})^2 \\\\\n&= n \\cdot a \\cdot (\\frac{b-a}{n}) + \\frac{n(n+1)}{2} \\cdot (\\frac{b-a}{n})^2 \\\\\n& \\rightarrow a \\cdot(b-a) + \\frac{(b-a)^2}{2} \\\\\n&= \\frac{b^2}{2} - \\frac{a^2}{2}.\n\\end{align*}\n\\]\n\n\\[\n\\int_a^b x^2 dx = \\frac{b^3}{3} - \\frac{a^3}{3}.\n\\]\n\nThis is similar to the Archimedes case with \\(a=0\\) and \\(b=1\\) shown above.\n\n\\[\n\\int_a^b x^k dx = \\frac{b^{k+1}}{k+1} - \\frac{a^{k+1}}{k+1},\\quad k \\neq -1.\n\\]\n\nCauchy showed this using a geometric series for the partition, not the arithmetic series \\(x_i = a + i (b-a)/n\\). The series is defined by \\(1 + \\alpha = (b/a)^{1/n}\\), with \\(x_i = a \\cdot (1 + \\alpha)^i\\). Here the bases \\(x_{i+1} - x_i\\) simplify to \\(x_i \\cdot \\alpha\\) and \\(f(x_i) = (a\\cdot(1+\\alpha)^i)^k = a^k (1+\\alpha)^{ik}\\), or \\(f(x_i)(x_{i+1}-x_i) = a^{k+1}\\alpha[(1+\\alpha)^{k+1}]^i\\), so, using \\(u=(1+\\alpha)^{k+1}=(b/a)^{(k+1)/n}\\), \\(f(x_i) \\cdot(x_{i+1} - x_i) = a^{k+1}\\alpha u^i\\). This gives\n\\[\n\\begin{align*}\nS &= a^{k+1}\\alpha u^0 + a^{k+1}\\alpha u^1 + \\cdots + a^{k+1}\\alpha u^{n-1} \\\\\n&= a^{k+1} \\cdot \\alpha \\cdot (u^0 + u^1 + \\cdots + u^{n-1}) \\\\\n&= a^{k+1} \\cdot \\alpha \\cdot \\frac{u^n - 1}{u - 1}\\\\\n&= (b^{k+1} - a^{k+1}) \\cdot \\frac{\\alpha}{(1+\\alpha)^{k+1} - 1} \\\\\n&\\rightarrow \\frac{b^{k+1} - a^{k+1}}{k+1}.\n\\end{align*}\n\\]\n\n\\[\n\\int_a^b x^{-1} dx = \\log(b) - \\log(a), \\quad (0 < a < b).\n\\]\n\nAgain, Cauchy showed this using a geometric series. The expression \\(f(x_i) \\cdot(x_{i+1} - x_i)\\) becomes just \\(\\alpha\\). 
So the approximating sum becomes:\n\\[\nS = f(x_0)(x_1 - x_0) + f(x_1)(x_2 - x_1) + \\cdots + f(x_{n-1}) (x_n - x_{n-1}) = \\alpha + \\alpha + \\cdots + \\alpha = n\\alpha.\n\\]\nBut, letting \\(x = 1/n\\), the limit above is just the limit of\n\\[\n\\lim_{x \\rightarrow 0+} \\frac{(b/a)^x - 1}{x} = \\log(b/a) = \\log(b) - \\log(a).\n\\]\n(Using LHopitals rule to compute the limit.)\nCertainly other integrals could be computed with various tricks, but we wont pursue this. There is another way to evaluate integrals using the forthcoming Fundamental Theorem of Calculus.\n\n\n36.1.5 Some other consequences\n\nThe integral is defined in terms of any partition with its norm bounded by \\(\\delta\\). If you know a function \\(f\\) is Riemann integrable, then it is enough to consider just a regular partition \\(x_i = a + i \\cdot (b-a)/n\\) when forming the sums, as was done above. It is just that showing a limit for just this particular type of partition would not be sufficient to prove Riemann integrability.\nThe choice of \\(c_i\\) is arbitrary to allow for maximum flexibility. The Darboux integrals use the maximum and minimum over the subinterval. To prove integrability, it is sufficient to show that the limit exists with just these choices.\nMost importantly,\n\n\nA continuous function on \\([a,b]\\) is Riemann integrable on \\([a,b]\\).\n\nThe main idea behind this is that the difference between the maximum and minimum values over a partition gets small. That is, if \\([x_{i-1}, x_i]\\) is like \\(1/n\\) in length, then the difference between the maximum of \\(f\\) over this interval, \\(M\\), and the minimum, \\(m\\), over this interval will go to zero as \\(n\\) gets big. That \\(m\\) and \\(M\\) exist is due to the extreme value theorem; that this difference goes to \\(0\\) is a consequence of continuity. 
That this value goes to \\(0\\) at the same rate no matter what interval is being discussed is a consequence of a notion of uniform continuity, a concept discussed in advanced calculus, but which holds for continuous functions on closed intervals. Armed with this, the Riemann sum for a general partition can be bounded by this difference times \\(b-a\\), which will go to zero. So the upper and lower Riemann sums will converge to the same value.\n\nA “jump”, or discontinuity of the first kind, is a value \\(c\\) in \\([a,b]\\) where \\(\\lim_{x \\rightarrow c+} f(x)\\) and \\(\\lim_{x \\rightarrow c-}f(x)\\) both exist, but are not equal. It is true that a function that is not continuous on \\(I=[a,b]\\), but only has discontinuities of the first kind on \\(I\\), will be Riemann integrable on \\(I\\).\n\nFor example, the function \\(f(x) = 1\\) for \\(x\\) in \\([0,1]\\) and \\(0\\) otherwise will be integrable, as it is continuous at all but two points, \\(0\\) and \\(1\\), where it jumps.\n\nSome functions can have infinitely many points of discontinuity and still be integrable. The example of \\(f(x) = 1/q\\) when \\(x=p/q\\) is rational, and \\(0\\) otherwise, is often used as an example."
},
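The power-rule formula above is easy to sanity-check numerically. A minimal sketch in plain Julia, using a simple right-Riemann sum (the helper name `right_riemann` is our own, not from the notes):

```julia
# Right-Riemann approximation of ∫ₐᵇ f(x) dx with n equal subintervals
function right_riemann(f, a, b, n)
    h = (b - a) / n
    sum(f(a + i*h) * h for i in 1:n)
end

# ∫₁² x³ dx should be (2⁴ - 1⁴)/4 = 15/4
approx = right_riemann(x -> x^3, 1, 2, 50_000)
exact  = (2^4 - 1^4) / 4
abs(approx - exact)   # small; the right sum slightly overestimates an increasing f
```

For an increasing `f` the overestimate shrinks like \(1/n\), consistent with the error estimate derived later in this section.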
{
"objectID": "integrals/area.html#numeric-integration",
"href": "integrals/area.html#numeric-integration",
"title": "36  Area under a curve",
"section": "36.2 Numeric integration",
"text": "36.2 Numeric integration\nThe Riemann sum approach gives a method to approximate the value of a definite integral. We just compute an approximating sum for a large value of \\(n\\), so large that the limiting value and the approximating sum are close.\nTo see the mechanics, lets again return to Archimedes problem and compute \\(\\int_0^1 x^2 dx\\).\nLet us fix some values:\n\na, b = 0, 1\nf(x) = x^2\n\nf (generic function with 1 method)\n\n\nThen for a given \\(n\\) we have some steps to do: create the partition, find the \\(c_i\\), multiply the pieces and add up. Here is one way to do all this:\n\nn = 5\nxs = a:(b-a)/n:b # also range(a, b, length=n)\ndeltas = diff(xs) # forms x2-x1, x3-x2, ..., xn-xn-1\ncs = xs[1:end-1] # finds left-hand end points. xs[2:end] would be right-hand ones.\n\n0.0:0.2:0.8\n\n\nNow to multiply the values. We want to sum the product f(cs[i]) * deltas[i], here is one way to do so:\n\nsum(f(cs[i]) * deltas[i] for i in 1:length(deltas))\n\n0.24000000000000002\n\n\nOur answer is not so close to the value of \\(1/3\\), but what did we expect - we only used \\(n=5\\) intervals. 
Trying again with \\(50,000\\) gives us:\n\nn = 50_000\nxs = a:(b-a)/n:b\ndeltas = diff(xs)\ncs = xs[1:end-1]\nsum(f(cs[i]) * deltas[i] for i in 1:length(deltas))\n\n0.3333233333999998\n\n\nThis value is about \\(10^{-5}\\) off from the actual answer of \\(1/3\\).\nWe should expect that larger values of \\(n\\) will produce better approximate values, as long as numeric issues dont get involved.\nBefore continuing, we define a function to compute the Riemann sum for us with an extra argument to specifying one of four methods for computing \\(c_i\\):\nfunction riemann(f::Function, a::Real, b::Real, n::Int; method=\"right\")\n if method == \"right\"\n meth = f -> (lr -> begin l,r = lr; f(r) * (r-l) end)\n elseif method == \"left\"\n meth = f -> (lr -> begin l,r = lr; f(l) * (r-l) end)\n elseif method == \"trapezoid\"\n meth = f -> (lr -> begin l,r = lr; (1/2) * (f(l) + f(r)) * (r-l) end)\n elseif method == \"simpsons\"\n meth = f -> (lr -> begin l,r=lr; (1/6) * (f(l) + 4*(f((l+r)/2)) + f(r)) * (r-l) end)\n end\n\n xs = range(a, b, n+1)\n pairs = zip(xs[begin:end-1], xs[begin+1:end]) # (x₀,x₁), …, (xₙ₋₁,xₙ)\n sum(meth(f), pairs)\n\nend\n(This function is defined in CalculusWithJulia and need not be copied over if that package is loaded.)\nWith this, we can easily find an approximate answer. We wrote the function to use the familiar template action(function, arguments...), so we pass in a function and arguments to describe the problem (a, b, and n and, optionally, the method):\n\n𝒇(x) = exp(x)\nriemann(𝒇, 0, 5, 10) # S_10\n\n187.324835773627\n\n\nOr with more intervals in the partition\n\nriemann(𝒇, 0, 5, 50_000)\n\n147.42052988337647\n\n\n(The answer is \\(e^5 - e^0 = 147.4131591025766\\dots\\), which shows that even \\(50,000\\) partitions is not enough to guarantee many digits of accuracy.)"
},
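To see how the choice of method matters, the following sketch compares the four rules on \(\int_0^5 e^x dx\) (this assumes the `riemann` function defined above is in scope):

```julia
exact = exp(5) - 1   # ∫₀⁵ eˣ dx = e⁵ - e⁰

for method in ("left", "right", "trapezoid", "simpsons")
    approx = riemann(exp, 0, 5, 1_000; method=method)
    println(rpad(method, 10), abs(approx - exact))
end
```

The left and right errors shrink like \(1/n\), the trapezoid error like \(1/n^2\), and Simpson's like \(1/n^4\), which is why the later rules get many more digits from the same partition.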
{
"objectID": "integrals/area.html#negative-area",
"href": "integrals/area.html#negative-area",
"title": "36  Area under a curve",
"section": "36.3 “Negative” area",
"text": "36.3 “Negative” area\nSo far, we have had the assumption that \\(f(x) \\geq 0\\), as that allows us to define the concept of area. We can define the signed area between \\(f(x)\\) and the \\(x\\) axis through the definite integral:\n\\[\nA = \\int_a^b f(x) dx.\n\\]\nThe right hand side is defined whenever the Riemann limit exists and in that case we call \\(f(x)\\) Riemann integrable. (The definition does not suppose \\(f\\) is non-negative.)\nSuppose \\(f(a) = f(b) = 0\\) for \\(a < b\\) and for all \\(a < x < b\\) we have \\(f(x) < 0\\). Then we can see easily from the geometry (or from the Riemann sum approximation) that\n\\[\n\\int_a^b f(x) dx = - \\int_a^b \\lvert f(x) \\rvert dx.\n\\]\nIf we think of the area below the \\(x\\) axis as “signed” area carrying a minus sign, then the total area can be seen again as a sum, only this time some of the summands may be negative.\n\nExample\nConsider a function \\(g(x)\\) defined through its piecewise linear graph:\n\n\n\n\n\n\nCompute \\(\\int_{-3}^{-1} g(x) dx\\). The area comprised of a square of area \\(1\\) and a triangle with area \\(1/2\\), so should be \\(3/2\\).\nCompute \\(\\int_{-3}^{0} g(x) dx\\). In addition to the above, there is a triangle with area \\(1/2\\), but since the function is negative, this area is added in as \\(-1/2\\). In total then we have \\(1 + 1/2 - 1/2 = 1\\) for the answer.\nCompute \\(\\int_{-3}^{1} g(x) dx\\):\n\nWe could add the signed area over \\([0,1]\\) to the above, but instead see a square of area \\(1\\), a triangle with area \\(1/2\\) and a triangle with signed area \\(-1\\). The total is then \\(1/2\\).\n\nCompute \\(\\int_{-3}^{3} g(x) dx\\):\n\nWe could add the area, but lets use a symmetry trick. This is clearly twice our second answer, or \\(2\\). (This is because \\(g(x)\\) is an even function, as we can tell from the graph.)\n\n\nExample\nSuppose \\(f(x)\\) is an odd function, then \\(f(x) = - f(-x)\\) for any \\(x\\). 
So the signed area between \\([-a,0]\\) is related to the signed area between \\([0,a]\\) but of different sign. This gives \\(\\int_{-a}^a f(x) dx = 0\\) for odd functions.\nAn immediate consequence would be \\(\\int_{-\\pi}^\\pi \\sin(x) dx = 0\\), as would \\(\\int_{-a}^a x^k dx\\) for any odd integer \\(k > 0\\).\n\n\nExample\nNumerically estimate the definite integral \\(\\int_0^2 x\\log(x) dx\\). (We redefine the function to be \\(0\\) at \\(0\\), so it is continuous.)\nWe have to be a bit careful with the Riemann sum, as the left Riemann sum will have an issue at \\(0=x_0\\) (0*log(0) returns NaN which will poison any subsequent arithmetic operations, so the value returned will be NaN and not an approximate answer). We could define our function with a check:\n\n𝒉(x) = x > 0 ? x * log(x) : 0.0\n\n𝒉 (generic function with 1 method)\n\n\nThis is actually inefficient, as the check for the size of x will slow things down a bit. Since we will call this function 50,000 times, we would like to avoid this, if we can. In this case just using the right sum will work:\n\nh(x) = x * log(x)\nriemann(h, 0, 2, 50_000, method=\"right\")\n\n0.38632208884775826\n\n\n(The default is \"right\", so no method specified would also work.)\n\n\nExample\nLet \\(j(x) = \\sqrt{1 - x^2}\\). The area under the curve between \\(-1\\) and \\(1\\) is \\(\\pi/2\\). Using a Riemann sum with 4 equal subintervals and the midpoint, estimate \\(\\pi\\). How close are you?\nThe partition is \\(-1 < -1/2 < 0 < 1/2 < 1\\). The midpoints are \\(-3/4, -1/4, 1/4, 3/4\\). 
We thus have that \\(\\pi/2\\) is approximately:\n\nxs = range(-1, 1, length=5)\ndeltas = diff(xs)\ncs = [-3/4, -1/4, 1/4, 3/4]\nj(x) = sqrt(1 - x^2)\na = sum(j(c)*delta for (c,delta) in zip(cs, deltas))\na, pi/2 # π ≈ 2a\n\n(1.629683664318002, 1.5707963267948966)\n\n\n(For variety, we used an alternate way to sum over two vectors.)\nSo \\(\\pi\\) is about 2a.\n\n\nExample\nWe have the well-known triangle inequality which says for an individual sum: \\(\\lvert a + b \\rvert \\leq \\lvert a \\rvert +\\lvert b \\rvert\\). Applying this recursively to a partition with \\(a < b\\) gives:\n\\[\n\\begin{align*}\n\\lvert f(c_1)(x_1-x_0) + f(c_2)(x_2-x_1) + \\cdots + f(c_n) (x_n-x_{n-1}) \\rvert\n& \\leq\n\\lvert f(c_1)(x_1-x_0) \\rvert + \\lvert f(c_2)(x_2-x_1)\\rvert + \\cdots +\\lvert f(c_n) (x_n-x_{n-1}) \\rvert \\\\\n&= \\lvert f(c_1)\\rvert (x_1-x_0) + \\lvert f(c_2)\\rvert (x_2-x_1)+ \\cdots +\\lvert f(c_n) \\rvert(x_n-x_{n-1}).\n\\end{align*}\n\\]\nThis suggests that the following inequality holds for integrals:\n\n\\(\\lvert \\int_a^b f(x) dx \\rvert \\leq \\int_a^b \\lvert f(x) \\rvert dx\\).\n\nThis can be used to give bounds on the size of an integral. For example, suppose you know that \\(f(x)\\) is continuous on \\([a,b]\\) and takes its maximum value of \\(M\\) and minimum value of \\(m\\). Letting \\(K\\) be the larger of \\(\\lvert M\\rvert\\) and \\(\\lvert m \\rvert\\) gives this bound when \\(a < b\\):\n\\[\n\\lvert\\int_a^b f(x) dx \\rvert \\leq \\int_a^b \\lvert f(x) \\rvert dx \\leq \\int_a^b K dx = K(b-a).\n\\]\nWhile such bounds are often disappointing when looking for specific values, they are very useful when establishing general truths, such as is done with proofs."
},
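The odd-function fact can be confirmed numerically. A sketch, assuming the `riemann` function from the previous section is available (the function `f` here is our own example):

```julia
f(x) = x^3 - sin(x)                       # an odd function
val = riemann(f, -2, 2, 10_000, method="trapezoid")
abs(val)                                  # essentially 0: the signed areas cancel
```

With a partition symmetric about \(0\), the trapezoid terms on \([-x_2,-x_1]\) and \([x_1,x_2]\) cancel exactly, so any remainder is pure floating point noise.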
{
"objectID": "integrals/area.html#error-estimate",
"href": "integrals/area.html#error-estimate",
"title": "36  Area under a curve",
"section": "36.4 Error estimate",
"text": "36.4 Error estimate\nThe Riemann sum above is actually extremely inefficient. To see how much, we can derive an estimate for the error in approximating the value using an arithmetic progression as the partition. Lets assume that our function \\(f(x)\\) is increasing, so that the right sum gives an upper estimate and the left sum a lower estimate, so the error in the estimate will be between these two values:\n\\[\n\\begin{align*}\n\\text{error} &\\leq\n\\left[\nf(x_1) \\cdot (x_{1} - x_0) + f(x_2) \\cdot (x_{2} - x_1) + \\cdots + f(x_{n-1})(x_{n-1} - x_n) + f(x_n) \\cdot (x_n - x_{n-1})\\right]\\\\\n&-\n\\left[f(x_0) \\cdot (x_{1} - x_0) + f(x_1) \\cdot (x_{2} - x_1) + \\cdots + f(x_{n-1})(x_{n-1} - x_n)\\right] \\\\\n&= \\frac{b-a}{n} \\cdot (\\left[f(x_1) + f(x_2) + \\cdots f(x_n)\\right] - \\left[f(x_0) + \\cdots f(x_{n-1})\\right]) \\\\\n&= \\frac{b-a}{n} \\cdot (f(b) - f(a)).\n\\end{align*}\n\\]\nWe see the error goes to \\(0\\) at a rate of \\(1/n\\) with the constant depending on \\(b-a\\) and the function \\(f\\). In general, a similar bound holds when \\(f\\) is not monotonic.\nThere are other ways to approximate the integral that use fewer points in the partition. Simpsons rule is one, where instead of approximating the area with rectangles that go through some \\(c_i\\) in \\([x_{i-1}, x_i]\\) instead the function is approximated by the quadratic polynomial going through \\(x_{i-1}\\), \\((x_i + x_{i-1})/2\\), and \\(x_i\\) and the exact area under that polynomial is used in the approximation. 
The explicit formula is:\n\\[\nA \\approx \\frac{b-a}{3n} (f(x_0) + 4 f(x_1) + 2f(x_2) + 4f(x_3) + \\cdots + 2f(x_{n-2}) + 4f(x_{n-1}) + f(x_n)).\n\\]\nThe error in this approximation can be shown to be\n\\[\n\\text{error} \\leq \\frac{(b-a)^5}{180n^4} \\text{max}_{\\xi \\text{ in } [a,b]} \\lvert f^{(4)}(\\xi) \\rvert.\n\\]\nThat is, the error is like \\(1/n^4\\) with constants depending on the length of the interval, \\((b-a)^5\\), and the maximum value of the fourth derivative over \\([a,b]\\). This is significant, the error in \\(10\\) steps of Simpsons rule is on the scale of the error of \\(10,000\\) steps of the Riemann sum for well-behaved functions.\n\n\n\n\n\n\nNote\n\n\n\nThe Wikipedia article mentions that Kepler used a similar formula \\(100\\) years prior to Simpson, or about \\(200\\) years before Riemann published his work. Again, the value in Riemanns work is not the computation of the answer, but the framework it provides in determining if a function is Riemann integrable or not."
},
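The \(1/n\) versus \(1/n^4\) rates quoted above can be seen directly. A sketch (again assuming the `riemann` function defined earlier is in scope):

```julia
exact = exp(1) - 1    # ∫₀¹ eˣ dx
for n in (10, 100, 1000)
    r = abs(riemann(exp, 0, 1, n, method="right")    - exact)
    s = abs(riemann(exp, 0, 1, n, method="simpsons") - exact)
    println("n = ", n, ": right ≈ ", r, ", simpsons ≈ ", s)
end
```

Each tenfold increase in \(n\) should cut the right-sum error by about a factor of \(10\) and Simpson's error by about \(10^4\), until floating point precision takes over.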
{
"objectID": "integrals/area.html#gauss-quadrature",
"href": "integrals/area.html#gauss-quadrature",
"title": "36  Area under a curve",
"section": "36.5 Gauss quadrature",
"text": "36.5 Gauss quadrature\nThe formula for Simpsons rule was the composite formula. If just a single rectangle is approximated over \\([a,b]\\) by a parabola interpolating the points \\(x_1=a\\), \\(x_2=(a+b)/2\\), and \\(x_3=b\\), the formula is:\n\\[\n\\frac{b-a}{6}(f(x_1) + 4f(x_2) + f(x_3)).\n\\]\nThis formula will actually be exact for any 3rd degree polynomial. In fact an entire family of similar approximations using \\(n\\) points can be made exact for any polynomial of degree \\(n-1\\) or lower. But with non-evenly spaced points, even better results can be found.\nThe formulas for an approximation to the integral \\(\\int_{-1}^1 f(x) dx\\) discussed so far can be written as:\n\\[\n\\begin{align*}\nS &= f(x_1) \\Delta_1 + f(x_2) \\Delta_2 + \\cdots + f(x_n) \\Delta_n\\\\\n &= w_1 f(x_1) + w_2 f(x_2) + \\cdots + w_n f(x_n).\n\\end{align*}\n\\]\nThe \\(w\\)s are “weights” and the \\(x\\)s are nodes. A Gaussian quadrature rule is a set of weights and nodes for \\(i=1, \\dots n\\) for which the sum is exact for any \\(f\\) which is a polynomial of degree \\(2n-1\\) or less. Such choices then also approximate well the integrals of functions which are not polynomials of degree \\(2n-1\\), provided \\(f\\) can be well approximated by a polynomial over \\([-1,1]\\). (Which is the case for the “nice” functions we encounter.) Some examples are given in the questions.\n\n36.5.1 The quadgk function\nIn Julia a modification of the Gauss quadrature rule is implemented in the quadgk function (from the QuadGK package) to give numeric approximations to integrals. The quadgk function also has the familiar interface action(function, arguments...). Unlike our riemann function, there is no n specified, as the number of steps is adaptively determined. (There is more partitioning occurring where the function is changing rapidly.) Instead, the algorithm outputs an estimate on the possible error along with the answer. 
Instead of \\(n\\), some trickier problems require a specification of an error threshold.\nTo use the function, we have:\n\nf(x) = x * log(x)\nquadgk(f, 0, 2)\n\n(0.38629436103070175, 4.856575104393709e-9)\n\n\nAs mentioned, there are two values returned: an approximate answer, and an error estimate. In this example we see that the value of \\(0.3862943610307017\\) is accurate to within \\(10^{-9}\\). (The actual answer is \\(-1 + 2\\cdot \\log(2)\\) and the error is only \\(10^{-11}\\). The reported error is an upper bound, and may be conservative, as with this problem.) Our previous answer using \\(50,000\\) right-Riemann sums was \\(0.38632208884775737\\) and is only accurate to \\(10^{-5}\\). By contrast, this method uses just \\(256\\) function evaluations in the above problem.\nThe method should be exact for polynomial functions:\n\nf(x) = x^5 - x + 1\nquadgk(f, -2, 2)\n\n(3.9999999999999973, 8.881784197001252e-16)\n\n\nThe error term is \\(0\\), answer is \\(4\\) up to the last unit of precision (1 ulp), so any error is only in floating point approximations.\nFor the numeric computation of definite integrals, the quadgk function should be used over the Riemann sums or even Simpsons rule.\nHere are some sample integrals computed with quadgk:\n\\[\n\\int_0^\\pi \\sin(x) dx\n\\]\n\nquadgk(sin, 0, pi)\n\n(2.0, 1.7905676941154525e-12)\n\n\n(Again, the actual answer is off only in the last digit, the error estimate is an upper bound.)\n\\[\n\\int_0^2 x^x dx\n\\]\n\nquadgk(x -> x^x, 0, 2)\n\n(2.8338767448900546, 1.9481752001546115e-8)\n\n\n\\[\n\\int_0^5 e^x dx\n\\]\n\nquadgk(exp, 0, 5)\n\n(147.41315910257657, 2.6594506152832764e-8)\n\n\nWhen composing the answer with other functions it may be desirable to drop the error in the answer. Two styles can be used for this. 
The first is to just name the two returned values:\n\nA, err = quadgk(cos, 0, pi/4)\nA\n\n0.7071067811865475\n\n\nThe second is to ask for just the first component of the returned value:\n\nA = quadgk(tan, 0, pi/4)[1] # or first(quadgk(tan, 0, pi/4))\n\n0.3465735902799726\n\n\n\nTo visualize the choice of nodes by the algorithm, we have for \\(f(x)=\\sin(x)\\) over \\([0,\\pi]\\) relatively few nodes used to get a high-precision estimate:\n\n\n\n\n\nFor a more oscillatory function, more nodes are chosen:\n\n\n\n\n\n\nExample\nIn probability theory, a univariate density is a function, \\(f(x)\\) such that \\(f(x) \\geq 0\\) and \\(\\int_a^b f(x) dx = 1\\), where \\(a\\) and \\(b\\) are the range of the distribution. The Von Mises distribution, takes the form\n\\[\nk(x) = C \\cdot \\exp(\\cos(x)), \\quad -\\pi \\leq x \\leq \\pi\n\\]\nCompute \\(C\\) (numerically).\nThe fact that \\(1 = \\int_{-\\pi}^\\pi C \\cdot \\exp(\\cos(x)) dx = C \\int_{-\\pi}^\\pi \\exp(\\cos(x)) dx\\) implies that \\(C\\) is the reciprocal of\n\nk(x) = exp(cos(x))\nA,err = quadgk(k, -pi, pi)\n\n(7.954926521012919, 3.9023298370466364e-8)\n\n\nSo\n\nC = 1/A\nk₁(x) = C * exp(cos(x))\n\nk₁ (generic function with 1 method)\n\n\nThe cumulative distribution function for \\(k(x)\\) is \\(K(x) = \\int_{-\\pi}^x k(u) du\\), \\(-\\pi \\leq x \\leq \\pi\\). We just showed that \\(K(\\pi) = 1\\) and it is trivial that \\(K(-\\pi) = 0\\). The quantiles of the distribution are the values \\(q_1\\), \\(q_2\\), and \\(q_3\\) for which \\(K(q_i) = i/4\\). Can we find these?\nFirst we define a function, that computes \\(K(x)\\):\n\nK(x) = quadgk(k₁, -pi, x)[1]\n\nK (generic function with 1 method)\n\n\n(The trailing [1] is so only the answer - and not the error - is returned.)\nThe question asks us to solve \\(K(x) = 0.25\\), \\(K(x) = 0.5\\) and \\(K(x) = 0.75\\). The Roots package can be used for such work, in particular find_zero. 
We will use a bracketing method, as clearly \\(K(x)\\) is increasing, as \\(k(u)\\) is positive, so we can just bracket our answer with \\(-\\pi\\) and \\(\\pi\\). (We solve \\(K(x) - p = 0\\), so \\(K(\\pi) - p > 0\\) and \\(K(-\\pi)-p < 0\\).). We could do this with [find_zero(x -> K(x) - p, (-pi, pi)) for p in [0.25, 0.5, 0.75]], but that is a bit less performant than using the solve interface for this task:\n\nZ = ZeroProblem((x,p) -> K(x) - p, (-pi, pi))\nsolve.(Z, (1/4, 1/2, 3/4))\n\n(-0.8097673745015915, 0.0, 0.809767374501629)\n\n\nThe middle one is clearly \\(0\\). This distribution is symmetric about \\(0\\), so half the area is to the right of \\(0\\) and half to the left, so clearly when \\(p=0.5\\), \\(x\\) is \\(0\\). The other two show that the area to the left of \\(-0.809767\\) is equal to the area to the right of \\(0.809767\\) and equal to \\(0.25\\)."
},
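As an illustration of the exactness claim, the two-point Gauss rule has nodes \(\pm 1/\sqrt{3}\) and weights \(1\), and is exact for polynomials of degree \(2\cdot 2 - 1 = 3\). A minimal check in plain Julia (the helper name `gauss2` and the polynomial `p` are our own):

```julia
nodes   = (-1/sqrt(3), 1/sqrt(3))
weights = (1.0, 1.0)
gauss2(f) = sum(w * f(x) for (w, x) in zip(weights, nodes))

p(x) = 4x^3 - 3x^2 + 2x - 1
gauss2(p)          # ≈ -4, the exact value of ∫₋₁¹ p(x) dx
```

Just two function evaluations reproduce the integral of any cubic over \([-1,1]\), up to floating point roundoff.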
{
"objectID": "integrals/area.html#questions",
"href": "integrals/area.html#questions",
"title": "36  Area under a curve",
"section": "36.6 Questions",
"text": "36.6 Questions\n\nQuestion\nUsing geometry, compute the definite integral:\n\\[\n\\int_{-5}^5 \\sqrt{5^2 - x^2} dx.\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUsing geometry, compute the definite integral:\n\\[\n\\int_{-2}^2 (2 - \\lvert x\\rvert) dx\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUsing geometry, compute the definite integral:\n\\[\n\\int_0^3 3 dx + \\int_3^9 (3 + 3(x-3)) dx\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUsing geometry, compute the definite integral:\n\\[\n\\int_0^5 \\lfloor x \\rfloor dx\n\\]\n(The notation \\(\\lfloor x \\rfloor\\) is the integer such that \\(\\lfloor x \\rfloor \\leq x < \\lfloor x \\rfloor + 1\\).)\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUsing geometry, compute the definite integral between \\(-3\\) and \\(3\\) of this graph comprised of lines and circular arcs:\n\n\n\n\n\nThe value is:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor the function \\(f(x) = \\sin(\\pi x)\\), estimate the integral for \\(-1\\) to \\(1\\) using a left-Riemann sum with the partition \\(-1 < -1/2 < 0 < 1/2 < 1\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWithout doing any real work, find this integral:\n\\[\n\\int_{-\\pi/4}^{\\pi/4} \\tan(x) dx.\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWithout doing any real work, find this integral:\n\\[\n\\int_3^5 (1 - \\lvert x-4 \\rvert) dx\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose you know that for the integrable function \\(\\int_a^b f(u)du =1\\) and \\(\\int_a^c f(u)du = p\\). 
If \\(a < c < b\\) what is \\(\\int_c^b f(u)du\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(p\\)\n \n \n\n\n \n \n \n \n \\(1-p\\)\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \\(p^2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat is \\(\\int_0^2 x^4 dx\\)? Use the rule for integrating \\(x^n\\).\n\n\n\n \n \n \n \n \n \n \n \n \n \\(2^5 - 0^5\\)\n \n \n\n\n \n \n \n \n \\(2^4/4 - 0^4/4\\)\n \n \n\n\n \n \n \n \n \\(2^5/5 - 0^5/5\\)\n \n \n\n\n \n \n \n \n \\(3\\cdot 2^3 - 3 \\cdot 0^3\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSolve for a value of \\(x\\) for which:\n\\[\n\\int_1^x \\frac{1}{u}du = 1.\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSolve for a value of \\(n\\) for which\n\\[\n\\int_0^1 x^n dx = \\frac{1}{12}.\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose \\(f(x) > 0\\) and \\(a < c < b\\). Define \\(F(x) = \\int_a^x f(u) du\\). What can be said about \\(F(b)\\) and \\(F(c)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n The area between \\(c\\) and \\(b\\) must be positive, so \\(F(c) < F(b)\\).\n \n \n\n\n \n \n \n \n \\(F(x)\\) is continuous, so between \\(a\\) and \\(b\\) has an extreme value, which must be at \\(c\\). 
So \\(F(c) \\geq F(b)\\).\n \n \n\n\n \n \n \n \n \\(F(b) - F(c) = F(a).\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor the right Riemann sum approximating \\(\\int_0^{10} e^x dx\\) with \\(n=100\\) subintervals, what would be a good estimate for the error?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(10/100\\)\n \n \n\n\n \n \n \n \n \\((10 - 0) \\cdot e^{10} / 100^4\\)\n \n \n\n\n \n \n \n \n \\((10 - 0)/100 \\cdot (e^{10} - e^{0})\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUse quadgk to find the following definite integral:\n\\[\n\\int_1^4 x^x dx .\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUse quadgk to find the following definite integral:\n\\[\n\\int_0^3 e^{-x^2} dx .\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUse quadgk to find the following definite integral:\n\\[\n\\int_0^{9/10} \\tan(u \\frac{\\pi}{2}) du. .\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUse quadgk to find the following definite integral:\n\\[\n\\int_{-1/2}^{1/2} \\frac{1}{\\sqrt{1 - x^2}} dx\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\n\n\nJXG = require(\"jsxgraph\");\n\nb = JXG.JSXGraph.initBoard('jsxgraph', {\n boundingbox: [-0.5,0.3,1.5,-1/4], axis:true\n});\n\ng = function(x) { return x*x*x*x + 10*x*x - 60* x + 100}\nf = function(x) {return 1/Math.sqrt(g(x))};\n\ntype = \"right\";\nl = 0;\nr = 1;\nrsum = function() {\n return JXG.Math.Numerics.riemannsum(f,n.Value(), type, l, r);\n};\nn = b.create('slider', [[0.1, -0.05],[0.75,-0.05], [2,1,50]],{name:'n',snapWidth:1});\n\ngraph = b.create('functiongraph', [f, l, r]);\nos = b.create('riemannsum',\n [f,\n function(){ return n.Value();},\n type, l, r\n ],\n {fillColor:'#ffff00', fillOpacity:0.3});\n\nb.create('text', [0.1,0.25, function(){\n return 'Riemann 
sum='+(rsum().toFixed(4));\n}]);\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nThe interactive graphic shows the area of a right-Riemann sum for different partitions. The function is\n\\[\nf(x) = \\frac{1}{\\sqrt{ x^4 + 10x^2 - 60x + 100}}\n\\]\nWhen \\(n=5\\) what is the area of the Riemann sum?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhen \\(n=50\\) what is the area of the Riemann sum?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nUsing quadgk what is the area under the curve?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nGauss nodes for approximating the integral \\(\\int_{-1}^1 f(x) dx\\) for \\(n=4\\) are:\n\nns = [-0.861136, -0.339981, 0.339981, 0.861136]\n\n4-element Vector{Float64}:\n -0.861136\n -0.339981\n 0.339981\n 0.861136\n\n\nThe corresponding weights are\n\nwts = [0.347855, 0.652145, 0.652145, 0.347855]\n\n4-element Vector{Float64}:\n 0.347855\n 0.652145\n 0.652145\n 0.347855\n\n\nUse these to estimate the integral \\(\\int_{-1}^1 \\cos(\\pi/2 \\cdot x)dx\\) with \\(w_1f(x_1) + w_2 f(x_2) + w_3 f(x_3) + w_4 f(x_4)\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe actual answer is \\(4/\\pi\\). How far off is the approximation based on 4 points?\n\n\n\n \n \n \n \n \n \n \n \n \n around \\(10^{-1}\\)\n \n \n\n\n \n \n \n \n around \\(10^{-2}\\)\n \n \n\n\n \n \n \n \n around \\(10^{-4}\\)\n \n \n\n\n \n \n \n \n around \\(10^{-6}\\)\n \n \n\n\n \n \n \n \n around \\(10^{-8}\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUsing the Gauss nodes and weights from the previous question, estimate the integral of \\(f(x) = e^x\\) over \\([-1, 1]\\). The value is:"
},
{
"objectID": "integrals/ftc.html",
"href": "integrals/ftc.html",
"title": "37  Fundamental Theorem or Calculus",
"section": "",
"text": "This section uses these add-on packages:\nWe refer to the example from the section on transformations where two operators on functions were defined:\n\\[\nD(f)(k) = f(k) - f(k-1), \\quad S(f)(k) = f(1) + f(2) + \\cdots + f(k).\n\\]\nIt was remarked that these relationships hold: \\(D(S(f))(k) = f(k)\\) and \\(S(D(f))(k) = f(k) - f(0)\\). These being a consequence of the inverse relationship between addition and subtraction. These two relationships are examples of a more general pair of relationships known as the Fundamental theorem of calculus or FTC.\nWe will see that with suitable rewriting, the derivative of a function is related to a certain limit of D(f) and the definite integral of a function is related to a certain limit of S(f). The addition and subtraction rules encapsulated in the relations of \\(D(S(f))(k) = f(k)\\) and \\(S(D(f))(k) = f(k) - f(0)\\) then generalize to these calculus counterparts.\nThe FTC details the interconnectivity between the operations of integration and differentiation.\nFor example:\nThat is, what is \\(A = \\int_a^b f'(x) dx\\)? (Assume \\(f'\\) is continuous.)\nTo investigate, we begin with the right Riemann sum using \\(h = (b-a)/n\\):\n\\[\nA \\approx S_n = \\sum_{i=1}^n f'(a + ih) \\cdot h.\n\\]\nBut the mean value theorem says that for small \\(h\\) we have \\(f'(x) \\approx (f(x) - f(x-h))/h\\). Using this approximation with \\(x=a+ih\\) gives:\n\\[\nA \\approx\n\\sum_{i=1}^n \\left(f(a + ih) - f(a + (i-1)h)\\right).\n\\]\nIf we let \\(g(i) = f(a + ih)\\), then the summand above is just \\(g(i) - g(i-1) = D(g)(i)\\) and the above then is just the sum of the \\(D(g)(i)\\)s, or:\n\\[\nA \\approx S(D(g))(n) = g(n) - g(0).\n\\]\nBut \\(g(n) - g(0) = f(a + nh) - f(a + 0h) = f(b) - f(a)\\). 
That is, we expect that the \\(\\approx\\) in the limit becomes \\(=\\), or:\n\\[\n\\int_a^b f'(x) dx = f(b) - f(a).\n\\]\nThis is indeed the case.\nThe other question would be\nThat is, can we find the derivative of \\(\\int_0^x f(u) du\\)? (The derivative is in \\(x\\); the variable \\(u\\) is a dummy variable of integration.)\nLets look first at the integral using the right-Riemann sum, again using \\(h=(b-a)/n\\):\n\\[\n\\int_a^b f(u) du \\approx f(a + 1h)h + f(a + 2h)h + \\cdots + f(a + nh)h = S(g)(n),\n\\]\nwhere we define \\(g(i) = f(a + ih)h\\). In the above, \\(n\\) relates to \\(b\\), but we could have stopped accumulating at any value. The analog for \\(S(g)(k)\\) would be \\(\\int_a^x f(u) du\\) where \\(x = a + kh\\). That is, we can make a function out of integration by considering the mapping \\((x, \\int_a^x f(u) du)\\). This might be written as \\(F(x) = \\int_a^x f(u)du\\). With this definition, can we take a derivative in \\(x\\)?\nAgain, we fix a large \\(n\\) and let \\(h=(b-a)/n\\). And suppose \\(x = a + Mh\\) for some \\(M\\). 
Then writing out the approximations to both the definite integral and the derivative we have\n\\[\n\\begin{align*}\nF'(x) = & \\frac{d}{dx} \\int_a^x f(u) du \\\\\n& \\approx \\frac{F(x) - F(x-h)}{h} \\\\\n&= \\frac{\\int_a^x f(u) du - \\int_a^{x-h} f(u) du}{h}\\\\\n& \\approx \\frac{\\left(f(a + 1h)h + f(a + 2h)h + \\cdots + f(a + (M-1)h)h + f(a + Mh)h\\right)}{h}\\\\\n&- \\quad\n\\frac{\\left(f(a + 1h)h + f(a + 2h)h + \\cdots + f(a + (M-1)h)h \\right)}{h} \\\\\n& = \\left(f(a + 1h) + \\quad f(a + 2h) + \\cdots + f(a + (M-1)h) + f(a + Mh)\\right)\\\\\n&- \\quad\n\\left(f(a + 1h) + f(a + 2h) + \\cdots + f(a + (M-1)h) \\right) \\\\\n&= f(a + Mh).\n\\end{align*}\n\\]\nIf \\(g(i) = f(a + ih)\\), then the above becomes\n\\[\n\\begin{align*}\nF'(x) & \\approx D(S(g))(M) \\\\\n&= f(a + Mh)\\\\\n&= f(x).\n\\end{align*}\n\\]\nThat is \\(F'(x) \\approx f(x)\\).\nIn the limit, then, we would expect that\n\\[\n\\frac{d}{dx} \\int_a^x f(u) du = f(x).\n\\]\nWith these heuristics, we now have:"
},
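The discrete identities that motivate the FTC can be verified directly. A sketch of the `D` and `S` operators in plain Julia (the test function `f` is our own choice):

```julia
D(f) = k -> f(k) - f(k-1)                 # finite difference
S(f) = k -> sum(f(i) for i in 1:k)        # partial sum f(1) + ⋯ + f(k)

f(k) = k^2 + 1
D(S(f))(5) == f(5)            # true: differencing undoes summing
S(D(f))(5) == f(5) - f(0)     # true: summing differences telescopes
```

The second identity is just a telescoping sum, which is the discrete shadow of \(\int_a^b f'(x)\,dx = f(b) - f(a)\).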
{
"objectID": "integrals/ftc.html#using-the-fundamental-theorem-of-calculus-to-evaluate-definite-integrals",
"href": "integrals/ftc.html#using-the-fundamental-theorem-of-calculus-to-evaluate-definite-integrals",
"title": "37  Fundamental Theorem or Calculus",
"section": "37.1 Using the fundamental theorem of calculus to evaluate definite integrals",
"text": "37.1 Using the fundamental theorem of calculus to evaluate definite integrals\nThe major use of the FTC is the computation of \\(\\int_a^b f(x) dx\\). Rather then resort to Riemann sums or geometric arguments, there is an alternative - when possible, find a function \\(F\\) with \\(F'(x) = f(x)\\) and compute \\(F(b) - F(a)\\).\nSome examples:\n\nConsider the problem of Archimedes, \\(\\int_0^1 x^2 dx\\). Clearly, we have with \\(f(x) = x^2\\) that \\(F(x)=x^3/3\\) will satisfy the assumptions of the FTC, so that:\n\n\\[\n\\int_0^1 x^2 dx = F(1) - F(0) = \\frac{1^3}{3} - \\frac{0^3}{3} = \\frac{1}{3}.\n\\]\n\nMore generally, we know if \\(n\\neq-1\\) that if \\(f(x) = x^{n}\\), that\n\n\\[\nF(x) = x^{n+1}/(n+1)\n\\]\nwill satisfy \\(F'(x)=f(x)\\), so that\n\\[\n\\int_a^b x^n dx = \\frac{b^{n+1} - a^{n+1}}{n+1}, \\quad n\\neq -1.\n\\]\n(Well almost! We must be careful to know that \\(a \\cdot b > 0\\), as otherwise we will encounter a place where \\(f(x)\\) may not be integrable.)\nWe note that the above includes the case of a constant, or \\(n=0\\).\nWhat about the case \\(n=-1\\), or \\(f(x) = 1/x\\), that is not covered by the above? For this special case, it is known that \\(F(x) = \\log(x)\\) (natural log) will have \\(F'(x) = 1/x\\). This gives for \\(0 < a < b\\):\n\\[\n\\int_a^b \\frac{1}{x} dx = \\log(b) - \\log(a).\n\\]\n\nLet \\(f(x) = \\cos(x)\\). How much area is between \\(-\\pi/2\\) and \\(\\pi/2\\)? We have that \\(F(x) = \\sin(x)\\) will have \\(F'(x) = f(x)\\), so:\n\n\\[\n\\int_{-\\pi/2}^{\\pi/2} \\cos(x) dx = F(\\pi/2) - F(-\\pi/2) = 1 - (-1) = 2.\n\\]\n\n37.1.1 An alternate notation for \\(F(b) - F(a)\\)\nThe expression \\(F(b) - F(a)\\) is often written in this more compact form:\n\\[\n\\int_a^b f(x) dx = F(b) - F(a) = F(x)\\big|_{x=a}^b, \\text{ or just expr}\\big|_{x=a}^b.\n\\]\nThe vertical bar is used for the evaluation step, in this case the \\(a\\) and \\(b\\) mirror that of the definite integral. 
This notation lends itself to working inline, as we illustrate with this next problem where we “know” a function “\\(F\\)”, so just express it “inline”:\n\\[\n\\int_0^{\\pi/4} \\sec^2(x) dx = \\tan(x) \\big|_{x=0}^{\\pi/4} = 1 - 0 = 1.\n\\]\nA consequence of this notation is:\n\\[\nF(x) \\big|_{x=a}^b = -F(x) \\big|_{x=b}^a.\n\\]\nThis says nothing more than \\(F(b)-F(a) = -(F(a) - F(b))\\), though more compactly."
},
{
"objectID": "integrals/ftc.html#the-indefinite-integral",
"href": "integrals/ftc.html#the-indefinite-integral",
"title": "37  Fundamental Theorem or Calculus",
"section": "37.2 The indefinite integral",
"text": "37.2 The indefinite integral\nA function \\(F(x)\\) with \\(F'(x) = f(x)\\) is known as an antiderivative of \\(f\\). For a given \\(f\\), there are infinitely many antiderivatives: if \\(F(x)\\) is one, then so is \\(G(x) = F(x) + C\\). But - due to the mean value theorem - all antiderivatives for \\(f\\) differ at most by a constant.\nThe indefinite integral of \\(f(x)\\) is denoted by:\n\\[\n\\int f(x) dx.\n\\]\n(There are no limits of integration.) There are two possible definitions: this refers to the set of all antiderivatives, or is just one of the set of all antiderivatives for \\(f\\). The former gives rise to expressions such as\n\\[\n\\int x^2 dx = \\frac{x^3}{3} + C\n\\]\nwhere \\(C\\) is the constant of integration and isnt really a fixed constant, but any possible constant. These notes will follow the lead of SymPy and not give a \\(C\\) in the expression, but instead rely on the reader to understand that there could be many other possible expressions given, though all differ by no more than a constant. This means, that \\(\\int f(x) dx\\) refers to an antiderivative, not the collection of all antiderivatives.\n\n37.2.1 The integrate function from SymPy\nSymPy provides the integrate function to perform integration. There are two usages:\n\nintegrate(ex, var) to find an antiderivative\nintegrate(ex, (var, a, b)) to find the definite integral. 
This integrates the expression in the variable var from a to b.\n\nTo illustrate, this call finds an antiderivative:\n\n@syms x\nintegrate(sin(x),x)\n\n \n\\[\n- \\cos{\\left(x \\right)}\n\\]\n\n\n\nWhereas this call computes the “area” under \\(f(x)\\) between a and b:\n\nintegrate(sin(x), (x, 0, pi))\n\n \n\\[\n2.0\n\\]\n\n\n\nAs does this for a different function:\n\nintegrate(acos(1-x), (x, 0, 2))\n\n \n\\[\n\\pi\n\\]\n\n\n\nAnswers may depend on conditions, as here, where the case \\(n=-1\\) breaks a pattern:\n\n@syms x::real n::real\nintegrate(x^n, x) # indefinite integral\n\n \n\\[\n\\begin{cases} \\frac{x^{n + 1}}{n + 1} & \\text{for}\\: n \\neq -1 \\\\\\log{\\left(x \\right)} & \\text{otherwise} \\end{cases}\n\\]\n\n\n\nAnswers may depend on specific assumptions:\n\n@syms u\nintegrate(abs(u),u)\n\n \n\\[\n\\int \\left|{u}\\right|\\, du\n\\]\n\n\n\nYet\n\n@syms u::real\nintegrate(abs(u),u)\n\n \n\\[\n\\begin{cases} - \\frac{u^{2}}{2} & \\text{for}\\: u \\leq 0 \\\\\\frac{u^{2}}{2} & \\text{otherwise} \\end{cases}\n\\]\n\n\n\nAnswers may not be available in terms of elementary functions, but may be available in terms of special functions.\n\n@syms x::real\nintegrate(x / sqrt(1-x^3), x)\n\n \n\\[\n\\frac{x^{2} \\Gamma\\left(\\frac{2}{3}\\right) {{}_{2}F_{1}\\left(\\begin{matrix} \\frac{1}{2}, \\frac{2}{3} \\\\ \\frac{5}{3} \\end{matrix}\\middle| {x^{3} e^{2 i \\pi}} \\right)}}{3 \\Gamma\\left(\\frac{5}{3}\\right)}\n\\]\n\n\n\nThe different cases explored by integrate are described after the questions."
},
{
"objectID": "integrals/ftc.html#rules-of-integration",
"href": "integrals/ftc.html#rules-of-integration",
"title": "37  Fundamental Theorem or Calculus",
"section": "37.3 Rules of integration",
"text": "37.3 Rules of integration\nThere are some “rules” of integration that allow integrals to be re-expressed. These follow from the rules of derivatives.\n\nThe integral of a constant times a function:\n\n\\[\n\\int c \\cdot f(x) dx = c \\cdot \\int f(x) dx.\n\\]\nThis follows as if \\(F(x)\\) is an antiderivative of \\(f(x)\\), then \\([cF(x)]' = c f(x)\\) by rules of derivatives.\n\nThe integral of a sum of functions:\n\n\\[\n\\int (f(x) + g(x)) dx = \\int f(x) dx + \\int g(x) dx.\n\\]\nThis follows immediately as if \\(F(x)\\) and \\(G(x)\\) are antiderivatives of \\(f(x)\\) and \\(g(x)\\), then \\([F(x) + G(x)]' = f(x) + g(x)\\), so the right hand side will have a derivative of \\(f(x) + g(x)\\).\nIn fact, this more general form where \\(c\\) and \\(d\\) are constants covers both cases:\n\\[\n\\int (cf(x) + dg(x)) dx = c \\int f(x) dx + d \\int g(x) dx.\n\\]\nThis statement is nothing more than the derivative formula \\([cf(x) + dg(x)]' = cf'(x) + dg'(x)\\). The product rule gives rise to a technique called integration by parts and the chain rule gives rise to a technique of integration by substitution, but we defer those discussions to other sections.\n\nExamples\n\nThe antiderivative of the polynomial \\(p(x) = a_n x^n + \\cdots a_1 x + a_0\\) follows from the linearity of the integral and the general power rule:\n\n\\[\n\\begin{align}\n\\int (a_n x^n + \\cdots a_1 x + a_0) dx\n&= \\int a_nx^n dx + \\cdots \\int a_1 x dx + \\int a_0 dx \\\\\n&= a_n \\int x^n dx + \\cdots + a_1 \\int x dx + a_0 \\int dx \\\\\n&= a_n\\frac{x^{n+1}}{n+1} + \\cdots + a_1 \\frac{x^2}{2} + a_0 \\frac{x}{1}.\n\\end{align}\n\\]\n\nMore generally, a Laurent polynomial allows for terms with negative powers. These too can be handled by the above. 
For example\n\n\\[\n\\begin{align}\n\\int (\\frac{2}{x} + 2 + 2x) dx\n&= \\int \\frac{2}{x} dx + \\int 2 dx + \\int 2x dx \\\\\n&= 2\\int \\frac{1}{x} dx + 2 \\int dx + 2 \\int xdx\\\\\n&= 2\\log(x) + 2x + 2\\frac{x^2}{2}.\n\\end{align}\n\\]\n\nConsider this integral:\n\n\\[\n\\int_0^\\pi 100 \\sin(x) dx = F(\\pi) - F(0),\n\\]\nwhere \\(F(x)\\) is an antiderivative of \\(100\\sin(x)\\). But:\n\\[\n\\int 100 \\sin(x) dx = 100 \\int \\sin(x) dx = 100 (-\\cos(x)).\n\\]\nSo the answer to the question is\n\\[\n\\int_0^\\pi 100 \\sin(x) dx = (100 (-\\cos(\\pi))) - (100(-\\cos(0))) = (100(-(-1))) - (100(-1)) = 200.\n\\]\nThis seems like a lot of work, and indeed it is more than is needed. The following would be more typical once the rules are learned:\n\\[\n\\int_0^\\pi 100 \\sin(x) dx = -100(-\\cos(x)) \\big|_0^{\\pi} = 100 \\cos(x) \\big|_{\\pi}^0 = 100(1) - 100(-1) = 200.\n\\]"
},
{
"objectID": "integrals/ftc.html#the-derivative-of-the-integral",
"href": "integrals/ftc.html#the-derivative-of-the-integral",
"title": "37  Fundamental Theorem or Calculus",
"section": "37.4 The derivative of the integral",
"text": "37.4 The derivative of the integral\nThe relationship that \\([\\int_a^x f(u) du]' = f(x)\\) is a bit harder to appreciate, as it doesnt help answer many ready made questions. Here we give some examples of its use.\nFirst, the expression defining an antiderivative, or indefinite integral, is given in term of a definite integral:\n\\[\nF(x) = \\int_a^x f(u) du.\n\\]\nThe value of \\(a\\) does not matter, as long as the integral is defined.\n\n\n \n Illustration showing \\(F(x) = \\int_a^x f(u) du\\) is a function that accumulates area. The value of \\(A\\) is the area over \\([x_{n-1}, x_n]\\) and also the difference \\(F(x_n) - F(x_{n-1})\\).\n \n \n\n\n\nThe picture for this, for non-negative \\(f\\), is of accumulating area as \\(x\\) increases. It can be used to give insight into some formulas:\nFor any function, we know that \\(F(b) - F(c) + F(c) - F(a) = F(b) - F(a)\\). For this specific function, this translates into this property of the integral:\n\\[\n\\int_a^b f(x) dx = \\int_a^c f(x) dx + \\int_c^b f(x) dx.\n\\]\nSimilarly, \\(\\int_a^a f(x) dx = F(a) - F(a) = 0\\) follows.\nTo see that the value of \\(a\\) does not matter, consider \\(a_0 < a_1\\). Then we have with\n\\[\nF(x) = \\int_{a_0}^x f(u)du, \\quad G(x) = \\int_{a_0}^x f(u)du,\n\\]\nThat \\(F(x) = G(x) + \\int_{a_0}^{a_1} f(u) du\\). The additional part may look complicated, but the point is that as far as \\(x\\) is involved, it is a constant. Hence both \\(F\\) and \\(G\\) are antiderivatives if either one is.\n\nExample\nFrom the familiar formula rate \\(\\times\\) time \\(=\\) distance, we “know,” for example, that a car traveling 60 miles an hour for one hour will have traveled 60 miles. This allows us to translate statements about the speed (or more generally velocity) into statements about position at a given time. If the speed is not constant, we dont have such an easy conversion.\nSuppose our velocity at time \\(t\\) is \\(v(t)\\), and always positive. 
We want to find the position at time \\(t\\), \\(x(t)\\). Lets assume \\(x(0) = 0\\). Let \\(h\\) be some small time step, say \\(h=(t - 0)/n\\) for some large \\(n>0\\). Then we can approximate \\(v(t)\\) between \\([ih, (i+1)h)\\) by \\(v(ih)\\). This is a constant so the change in position over the time interval \\([ih, (i+1)h)\\) would simply be \\(v(ih) \\cdot h\\), and ignoring the accumulated errors, the approximate position at time \\(t\\) would be found by adding these pieces together: \\(x(t) \\approx v(0h)\\cdot h + v(1h)\\cdot h + v(2h) \\cdot h + \\cdots + v(nh)h\\). But we recognize this (as did Beeckman in 1618) as nothing more than an approximation for the Riemann sum of \\(v\\) over the interval \\([0, t]\\). That is, we expect:\n\\[\nx(t) = \\int_0^t v(u) du.\n\\]\nHopefully this makes sense: our position is the result of accumulating our change in position over small units of time. The old one-foot-in-front-of-another approach to walking out the door.\nThe above was simplified by the assumption that \\(x(0) = 0\\). What if \\(x(0) = x_0\\) for some non-zero value? Then the above is not exactly correct, as \\(\\int_0^0 v(u) du = 0\\). So instead, we might write this more concretely as:\n\\[\nx(t) = x_0 + \\int_0^t v(u) du.\n\\]\nThere is a similar relationship between velocity and acceleration, but lets think about it formally. If we know that the acceleration is the rate of change of velocity, then we have \\(a(t) = v'(t)\\). By the FTC, then\n\\[\n\\int_0^t a(u) du = \\int_0^t v'(u) du = v(t) - v(0).\n\\]\nRewriting gives a similar statement as before:\n\\[\nv(t) = v_0 + \\int_0^t a(u) du.\n\\]\n\n\nExample\nIn probability theory, for a positive, continuous random variable, the probability that the random value is less than \\(a\\) is given by \\(P(X \\leq a) = F(a) = \\int_{0}^a f(x) dx\\). 
(Positive means the integral starts at \\(0\\), whereas in general it could be \\(-\\infty\\), a minor complication that we havent yet discussed.)\nFor example, the exponential distribution with rate \\(1\\) has \\(f(x) = e^{-x}\\). Compute \\(F(x)\\).\nThis is just \\(F(x) = \\int_0^x e^{-u} du = -e^{-u}\\big|_0^x = 1 - e^{-x}\\).\nThe “uniform” distribution on \\([a,b]\\) has\n\\[\nF(x) =\n\\begin{cases}\n0 & x < a\\\\\n\\frac{x-a}{b-a} & a \\leq x \\leq b\\\\\n1 & x > b\n\\end{cases}\n\\]\nFind \\(f(x)\\). There are some subtleties here. If we assume that \\(F(x) = \\int_0^x f(u) du\\) then we know if \\(f(x)\\) is continuous that \\(F'(x) = f(x)\\). Differentiating we get\n\\[\nf(x) = \\begin{cases}\n0 & x < a\\\\\n\\frac{1}{b-a} & a < x < b\\\\\n0 & x > b\n\\end{cases}\n\\]\nHowever, the function \\(f\\) is not continuous on \\([a,b]\\) and \\(F(x)\\) is not differentiable at \\(x=a\\) or \\(x=b\\). It is true that \\(f\\) is integrable, and where \\(F\\) is differentiable \\(F'=f\\). So \\(f\\) is determined except possibly at the points \\(x=a\\) and \\(x=b\\).\n\n\nExample\nThe error function is defined by \\(\\text{erf}(x) = 2/\\sqrt{\\pi}\\int_0^x e^{-u^2} du\\). It is implemented in Julia through erf. Suppose we were to ask where it takes on its maximum value, what would we find?\nThe answer will either be at a critical point, at \\(0\\) or as \\(x\\) goes to \\(\\infty\\). We can differentiate to find critical points:\n\\[\n[\\text{erf}(x)]' = \\frac{2}{\\sqrt{\\pi}}e^{-x^2}.\n\\]\nOh, this is never \\(0\\), so there are no critical points. The maximum occurs at \\(0\\) or as \\(x\\) goes to \\(\\infty\\). Clearly at \\(0\\), we have \\(\\text{erf}(0)=0\\), so the answer will be as \\(x\\) goes to \\(\\infty\\).\nIn retrospect, this is a silly question. 
As \\(f(x) > 0\\) for all \\(x\\), we must have that \\(F(x)\\) is strictly increasing, so never gets to a local maximum.\n\n\nExample\nThe Dawson function is\n\\[\nF(x) = e^{-x^2} \\int_0^x e^{t^2} dt\n\\]\nCharacterize any local maxima or minima.\nFor this we need to consider the product rule. The fundamental theorem of calculus will help with the right-hand side. We have:\n\\[\nF'(x) = (-2x)e^{-x^2} \\int_0^x e^{t^2} dt + e^{-x^2} e^{x^2} = -2x F(x) + 1\n\\]\nWe need to figure out when this is \\(0\\). For that, we use some numeric math.\n\nF(x) = exp(-x^2) * quadgk(t -> exp(t^2), 0, x)[1]\nFp(x) = -2x*F(x) + 1\ncps = find_zeros(Fp, -4, 4)\n\n2-element Vector{Float64}:\n -0.9241388730045916\n 0.9241388730045916\n\n\nWe could take a second derivative to characterize. For that we use \\(F''(x) = [-2xF(x) + 1]' = -2F(x) + -2x(-2xF(x) + 1)\\), so\n\nFpp(x) = -2F(x) + 4x^2*F(x) - 2x\nFpp.(cps)\n\n2-element Vector{Float64}:\n 1.0820884492703637\n -1.0820884492703637\n\n\nThe first value being positive says there is a relative minimum at \\(-0.924139\\); at \\(0.924139\\) there is a relative maximum.\n\n\nExample\nReturning to probability, suppose there are \\(n\\) positive random numbers \\(X_1\\), \\(X_2\\), …, \\(X_n\\). A natural question might be to ask what formula describes the largest of these values, assuming each is identical in some way. A description that is helpful is to define \\(F(a) = P(X \\leq a)\\) for some random number \\(X\\). That is, the probability that \\(X\\) is less than or equal to \\(a\\) is \\(F(a)\\). For many situations, there is a density function, \\(f\\), for which \\(F(a) = \\int_0^a f(x) dx\\).\nUnder assumptions that the \\(X\\) are identical and independent, the largest value, \\(M\\), may be characterized by \\(P(M \\leq a) = \\left[F(a)\\right]^n\\). 
Using \\(f\\) and \\(F\\), describe the derivative of this expression.\nThis problem is constructed to take advantage of the FTC, and we have:\n\\[\n\\begin{align*}\n\\left[P(M \\leq a)\\right]'\n&= \\left[F(a)^n\\right]'\\\\\n&= n \\cdot F(a)^{n-1} \\left[F(a)\\right]'\\\\\n&= n F(a)^{n-1}f(a)\n\\end{align*}\n\\]\n\n\nExample\nSuppose again probabilities of a random number between \\(0\\) and \\(1\\), say, are given by a positive, continuous function \\(f(x)\\) on \\((0,1)\\) by \\(F(a) = P(X \\leq a) = \\int_0^a f(x) dx\\). The median value of the random number is a value of \\(a\\) for which \\(P(X \\leq a) = 1/2\\). Such an \\(a\\) turns \\(X\\) into a coin toss: betting that \\(X\\) is less than \\(a\\) is like betting on heads to come up. More generally the \\(q\\)th quantile of \\(X\\) is a number \\(a\\) with \\(P(X \\leq a) = q\\). The definition is fine, but for a given \\(f\\) and \\(q\\) can we find \\(a\\)?\nAbstractly, we are solving \\(F(a) = q\\) or \\(F(a)-q = 0\\) for \\(a\\). That is, this is a zero-finding question. We have discussed different options for this problem: bisection, a range of derivative free methods, and Newtons method. As evaluating \\(F\\) involves an integral, which may involve many evaluations of \\(f\\), a method which converges quickly is preferred. For that, Newtons method is a good idea, it having quadratic convergence in this case, as \\(a\\) is a simple zero given that \\(F\\) is increasing under the assumptions above.\nNewtons method involves the update step x = x - f(x)/f'(x). For this “\\(f\\)” is \\(h(x) = \\int_0^x f(u) du - q\\). 
The derivative is easy, the FTC just applies: \\(h'(x) = f(x)\\); no need for automatic differentiation, which may not even apply to this setup.\nTo do a concrete example, we take the Beta(\\(\\alpha, \\beta\\)) distribution (\\(\\alpha, \\beta > 0\\)) which has density, \\(f\\), over \\([0,1]\\) given by\n\\[\nf(x) = x^{\\alpha-1}\\cdot (1-x)^{\\beta-1} \\cdot \\frac{\\Gamma(\\alpha+\\beta)}{\\Gamma(\\alpha)\\Gamma(\\beta)}\n\\]\nThe Wikipedia link above gives an approximate answer for the median of \\((\\alpha-1/3)/(\\alpha+\\beta-2/3)\\) when \\(\\alpha,\\beta > 1\\). Lets see how correct this is when \\(\\alpha=5\\) and \\(\\beta=6\\). The gamma function used below implements \\(\\Gamma\\). It is in the SpecialFunctions package, which is loaded with the CalculusWithJulia package.\n\nalpha, beta = 5,6\nf(x) = x^(alpha-1)*(1-x)^(beta-1) * gamma(alpha + beta) / (gamma(alpha) * gamma(beta))\nq = 1/2\nh(x) = first(quadgk(f, 0, x)) - q\nhp(x) = f(x)\n\nx0 = (alpha-1/3)/(alpha + beta - 2/3)\nxstar = find_zero((h, hp), x0, Roots.Newton())\n\nxstar, x0\n\n(0.4516941562236631, 0.45161290322580644)\n\n\nThe asymptotic answer agrees with the answer in the first four decimal places.\nAs an aside, we ask how many function evaluations were taken? We can track this with a trick - using a closure to record when \\(f\\) is called:\n\nfunction FnWrapper(f)\n ctr = 0\n function(x)\n ctr += 1\n f(x)\n end\nend\n\nFnWrapper (generic function with 1 method)\n\n\nThen we have the above using FnWrapper(f) in place of f:\n\nff = FnWrapper(f)\nF(x) = first(quadgk(ff, 0, x))\nh(x) = F(x) - q\nhp(x) = ff(x)\nxstar = find_zero((h, hp), x0, Roots.Newton())\nxstar, ff.ctr\n\n(0.4516941562236631, Core.Box(48))\n\n\nSo the answer is the same. 
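As an aside, a call counter need not rely on closure internals such as Core.Box; one of the many alternate implementations keeps the count in a Ref cell that is returned alongside the wrapped function. A sketch (the names counted and ctr are illustrative, not from the text):

```julia
# Count calls to f without peeking at closure internals:
# the mutable state lives in a Ref handed back to the caller.
function counted(f)
    ctr = Ref(0)                   # mutable counter cell
    g = x -> (ctr[] += 1; f(x))    # bump the counter, then delegate to f
    return g, ctr
end

g, n = counted(sin)
g(0.0); g(1.0); g(2.0)
n[]                                # 3 calls recorded so far
```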
Newtons method converged in 3 steps, and called h or hp 5 times.\nAssuming the number inside Core.Box is the value of ctr, we see not so many function calls, just \\(48\\).\nWere f very expensive to compute or h expensive to compute (which can happen if, say, f were highly oscillatory) then steps could be made to cut this number down, such as evaluating \\(F(x_n) = \\int_{x_0}^{x_n} f(x) dx\\), using linearity, as \\(\\int_0^{x_0} f(x) dx + \\int_{x_0}^{x_1}f(x)dx + \\int_{x_1}^{x_2}f(x)dx + \\cdots + \\int_{x_{n-1}}^{x_n}f(x)dx\\). Then all but the last term could be stored from the previous steps of Newtons method. The last term presumably being less costly as it would typically involve a small interval.\n\n\n\n\n\n\nNote\n\n\n\nThe trick using a closure relies on an internal way of accessing elements in a closure. The same trick could be implemented many different ways which arent reliant on undocumented internals, this approach was just a tad more convenient. It shouldnt be copied for work intended for distribution, as the internals may change without notice or deprecation.\n\n\n\n\nExample\nA junior engineer at Treadmillz.com is tasked with updating the display of calories burned for an older-model treadmill. The old display involved a sequence of LED “dots” that updated each minute. The last 10 minutes were displayed. Each dot corresponded to one calorie burned, so the total number of calories burned in the past 10 minutes was the number of dots displayed, or the sum of each column of dots. An example might be:\n **\n ****\n *****\n ********\n**********\nIn this example display there was 1 calorie burned in the first minute, then 2, then 5, 5, 4, 3, 2, 2, 1. The total is \\(24\\).\nIn her work the junior engineer found this old function for updating the display\nfunction cnew = update(Cnew, Cold)\n cnew = Cnew - Cold\nend\nShe discovered that the function was written awhile ago, and in MATLAB. 
The function receives the values Cnew and Cold which indicate the total number of calories burned up until that time frame. The value cnew is the number of calories burned in the minute. (Some other engineer has cleverly figured out how many calories have been burned during the time on the machine.)\nThe new display will have twice as many dots, so the display can be updated every 30 seconds and still display 10 minutes worth of data. What should the update function now look like?\nHer first attempt was simply to rewrite the function in Julia:\n\nfunction update₁(Cnew, Cold)\n cnew = Cnew - Cold\nend\n\nupdate₁ (generic function with 1 method)\n\n\nThis has the advantage that each “dot” still represents a calorie burned, so that a user can still count the dots to see the total burned in the past 10 minutes.\n * *\n ****** *\n ************* *\nSadly though, users didnt like it. Instead of a set of dots being, say, 5 high, they were now 3 high and 2 high. It “looked” like they were doing less work! What to do?\nThe users actually were not responding to the number of dots, which hadnt changed, but rather the area that they represented - and this shrank in half. (It is much easier to visualize area than count dots when tired.) How to adjust for that?\nWell our engineer knew - double the dots and count each as half a calorie. This makes the “area” constant. She also generalized letting n be the number of updates per minute, in anticipation of even further improvements in the display technology:\n\nfunction update(Cnew, Cold, n)\n cnew = (Cnew - Cold) * n\nend\n\nupdate (generic function with 1 method)\n\n\nThen the “area” represented by the dots stays fixed over this time frame.\nThe engineer then thought a bit more, as the form of her answer seemed familiar. She decides to parameterize it in terms of \\(t\\) and found with \\(h=1/n\\): c(t) = (C(t) - C(t-h))/h. Ahh - the derivative approximation. But then what is the “area”? 
It is no longer just the sum of the dots, but in terms of the functions she finds that each column represents \\(c(t)\\cdot h\\), and the sum is just \\(c(t_1)h + c(t_2)h + \\cdots + c(t_n)h\\) which looks like an approximate integral.\nIf the display were to reach the modern age and replace LED “dots” with a higher-pixel display, then the function to display would be \\(c(t) = C'(t)\\) and the area displayed would be \\(\\int_{t-10}^t c(u) du\\).\nThinking a bit harder, she knows that her update function is getting \\(C(t)\\), and displaying the rate of calorie burn leads to the area displayed being interpretable as the total calories burned between \\(t\\) and \\(t-10\\) (or \\(C(t)-C(t-10)\\)) by the fundamental theorem of calculus."
},
{
"objectID": "integrals/ftc.html#questions",
"href": "integrals/ftc.html#questions",
"title": "37  Fundamental Theorem or Calculus",
"section": "37.5 Questions",
"text": "37.5 Questions\n\nQuestion\nIf \\(F(x) = e^{x^2}\\) is an antiderivative for \\(f\\), find \\(\\int_0^2 f(x) dx\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIf \\(\\sin(x) - x\\cos(x)\\) is an antiderivative for \\(x\\sin(x)\\), find the following integral \\(\\int_0^\\pi x\\sin(x) dx\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind an antiderivative then evaluate \\(\\int_0^1 x(1-x) dx\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUse the fact that \\([e^x]' = e^x\\) to evaluate \\(\\int_0^e (e^x - 1) dx\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the value of \\(\\int_0^1 (1-x^2/2 + x^4/24) dx\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUsing SymPy, what is an antiderivative for \\(x^2 \\sin(x)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(-x^2\\cos(x)\\)\n \n \n\n\n \n \n \n \n \\(-x^2\\cos(x) + 2x\\sin(x)\\)\n \n \n\n\n \n \n \n \n \\(-x^2\\cos(x) + 2x\\sin(x) + 2\\cos(x)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUsing SymPy, what is an antiderivative for \\(xe^{-x}\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(-e^{-x}\\)\n \n \n\n\n \n \n \n \n \\(-xe^{-x}\\)\n \n \n\n\n \n \n \n \n \\(-(1+x) e^{-x}\\)\n \n \n\n\n \n \n \n \n \\(-(1 + x + x^2) e^{-x}\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUsing SymPy, integrate the function \\(\\int_0^{2\\pi} e^x \\cdot \\sin(x) dx\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA particle has velocity \\(v(t) = 2t^2 - t\\) between \\(0\\) and \\(1\\). If \\(x(0) = 0\\), find the position \\(x(1)\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA particle has acceleration given by \\(\\sin(t)\\) between \\(0\\) and \\(\\pi\\). 
If the initial velocity is \\(v(0) = 0\\), find \\(v(\\pi/2)\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe position of a particle is given by \\(x(t) = \\int_0^t g(u) du\\), where \\(x(0)=0\\) and \\(g(u)\\) is given by this piecewise linear graph:\n\n\n\n\n\n\nThe velocity of the particle is positive over:\n\n\n\n\n \n \n \n \n \n \n \n \n \n It is always positive\n \n \n\n\n \n \n \n \n It is always negative\n \n \n\n\n \n \n \n \n Between \\(0\\) and \\(1\\)\n \n \n\n\n \n \n \n \n Between \\(1\\) and \\(5\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nThe position of the particle is \\(0\\) at \\(t=0\\) and:\n\n\n\n\n \n \n \n \n \n \n \n \n \n \\(t=1\\)\n \n \n\n\n \n \n \n \n \\(t=2\\)\n \n \n\n\n \n \n \n \n \\(t=3\\)\n \n \n\n\n \n \n \n \n \\(t=4\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nThe position of the particle at time \\(t=5\\) is?\n\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nOn the interval \\([2,3]\\):\n\n\n\n\n \n \n \n \n \n \n \n \n \n The position, \\(x(t)\\), stays constant\n \n \n\n\n \n \n \n \n The position, \\(x(t)\\), increases with a slope of \\(1\\)\n \n \n\n\n \n \n \n \n The position, \\(x(t)\\), increases quadratically from \\(-1/2\\) to \\(1\\)\n \n \n\n\n \n \n \n \n The position, \\(x(t)\\), increases quadratically from \\(0\\) to \\(1\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(F(x) = \\int_{t-10}^t f(u) du\\) for \\(f(u)\\) a positive, continuous function. What is \\(F'(t)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f(t)\\)\n \n \n\n\n \n \n \n \n \\(-f(t-10)\\)\n \n \n\n\n \n \n \n \n \\(f(t) - f(t-10)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose \\(f(x) \\geq 0\\) and \\(F(x) = \\int_0^x f(u) du\\). \\(F(x)\\) is continuous and so has a maximum value on the interval \\([0,1]\\) taken at some \\(c\\) in \\([0,1]\\). 
It is\n\n\n\n \n \n \n \n \n \n \n \n \n At a critical point\n \n \n\n\n \n \n \n \n At the endpoint \\(0\\)\n \n \n\n\n \n \n \n \n At the endpoint \\(1\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(F(x) = \\int_0^x f(u) du\\), where \\(f(x)\\) is given by the graph below. Identify the \\(x\\) values of all relative maxima of \\(F(x)\\). Explain why you know these are the values.\n\n\n\n\n\n\n\n\n \n \n \n \n \n \n \n \n \n The derivative of \\(F\\) is \\(f\\), so by the second derivative test, \\(x=7\\)\n \n \n\n\n \n \n \n \n The derivative of \\(F\\) is \\(f\\), so by the first derivative test, \\(x=3, 9\\)\n \n \n\n\n \n \n \n \n The graph of \\(f\\) has relative maxima at \\(x=2,6,8\\)\n \n \n\n\n \n \n \n \n The derivative of \\(F\\) is \\(f\\), so by the first derivative test, \\(x=1,5\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose \\(f(x)\\) is monotonically decreasing with \\(f(0)=1\\), \\(f(1/2) = 0\\) and \\(f(1) = -1\\). Let \\(F(x) = \\int_0^x f(u) du\\). \\(F(x)\\) is continuous and so has a maximum value on the interval \\([0,1]\\) taken at some \\(c\\) in \\([0,1]\\). It is\n\n\n\n \n \n \n \n \n \n \n \n \n At a critical point, either \\(0\\) or \\(1\\)\n \n \n\n\n \n \n \n \n At a critical point, \\(1/2\\)\n \n \n\n\n \n \n \n \n At the endpoint \\(0\\)\n \n \n\n\n \n \n \n \n At the endpoint \\(1\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nBarrow presented a version of the fundamental theorem of calculus in a 1670 volume edited by Newton, Barrows student (cf. Wagner). His version can be stated as follows (cf. Jardine):\nConsider the following figure where \\(f\\) is a strictly increasing function with \\(f(0) = 0\\). and \\(x > 0\\). The function \\(A(x) = \\int_0^x f(u) du\\) is also plotted. The point \\(Q\\) is \\(f(x)\\), and the point \\(P\\) is \\(A(x)\\). 
The point \\(T\\) is chosen to so that the length between \\(T\\) and \\(x\\) times the length between \\(Q\\) and \\(x\\) equals the length from \\(P\\) to \\(x\\). (\\(\\lvert Tx \\rvert \\cdot \\lvert Qx \\rvert = \\lvert Px \\rvert\\).) Barrow showed that the line segment \\(PT\\) is tangent to the graph of \\(A(x)\\). This figure illustrates the labeling for some function:\n\n\n\n\n\nThe fact that \\(\\lvert Tx \\rvert \\cdot \\lvert Qx \\rvert = \\lvert Px \\rvert\\) says what in terms of \\(f(x)\\), \\(A(x)\\) and \\(A'(x)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\lvert Tx \\rvert \\cdot f(x) = A(x)\\)\n \n \n\n\n \n \n \n \n \\(A(x) / \\lvert Tx \\rvert = A'(x)\\)\n \n \n\n\n \n \n \n \n \\(A(x) \\cdot A'(x) = f(x)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe fact that \\(\\lvert PT \\rvert\\) is tangent says what in terms of \\(f(x)\\), \\(A(x)\\) and \\(A'(x)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\lvert Tx \\rvert \\cdot f(x) = A(x)\\)\n \n \n\n\n \n \n \n \n \\(A(x) / \\lvert Tx \\rvert = A'(x)\\)\n \n \n\n\n \n \n \n \n \\(A(x) \\cdot A'(x) = f(x)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nSolving, we get:\n\n\n\n \n \n \n \n \n \n \n \n \n \\(A(x) = A^2(x) / f(x)\\)\n \n \n\n\n \n \n \n \n \\(A'(x) = A(x)\\)\n \n \n\n\n \n \n \n \n \\(A(x) = f(x)\\)\n \n \n\n\n \n \n \n \n \\(A'(x) = f(x)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nAccording to Bressoud “Newton observes that the rate of change of an accumulated quantity is the rate at which that quantity is accumulating”. Which part of the FTC does this refer to:\n\n\n\n \n \n \n \n \n \n \n \n \n Part 1: \\([\\int_a^x f(u) du]' = f\\)\n \n \n\n\n \n \n \n \n Part 2: \\(\\int_a^b f(u) du = F(b)- F(a)\\)."
},
{
"objectID": "integrals/ftc.html#more-on-sympys-integrate",
"href": "integrals/ftc.html#more-on-sympys-integrate",
"title": "37  Fundamental Theorem or Calculus",
"section": "37.6 More on SymPys integrate",
"text": "37.6 More on SymPys integrate\nFinding the value of a definite integral through the fundamental theorem of calculus relies on the algebraic identification of an antiderivative. This is difficult to do by hand and by computer, and is complicated by the fact that not every elementaryfunction has an elementary antiderivative. SymPys documentation on integration indicates that several different means to integrate a function are used internally. As it is of interest here, it is copied with just minor edits below (from an older version of SymPy):\n\nSimple heuristics (based on pattern matching and integral table):\n\nmost frequently used functions (e.g. polynomials, products of trigonometric functions)\n\n\n\nIntegration of rational functions:\n\nA complete algorithm for integrating rational functions is implemented (the Lazard-Rioboo-Trager algorithm). The algorithm also uses the partial fraction decomposition algorithm implemented in apart as a preprocessor to make this process faster. Note that the integral of a rational function is always elementary, but in general, it may include a RootSum.\n\n\n\nFull Risch algorithm:\n\nThe Risch algorithm is a complete decision procedure for integrating elementary functions, which means that given any elementary function, it will either compute an elementary antiderivative, or else prove that none exists. Currently, part of transcendental case is implemented, meaning elementary integrals containing exponentials, logarithms, and (soon!) trigonometric functions can be computed. The algebraic case, e.g., functions containing roots, is much more difficult and is not implemented yet.\nIf the routine fails (because the integrand is not elementary, or because a case is not implemented yet), it continues on to the next algorithms below. If the routine proves that the integrals is nonelementary, it still moves on to the algorithms below, because we might be able to find a closed-form solution in terms of special functions. 
If risch=true, however, it will stop here.\n\n\n\nThe Meijer G-Function algorithm:\n\nThis algorithm works by first rewriting the integrand in terms of very general Meijer G-Function (meijerg in SymPy), integrating it, and then rewriting the result back, if possible. This algorithm is particularly powerful for definite integrals (which is actually part of a different method of Integral), since it can compute closed-form solutions of definite integrals even when no closed-form indefinite integral exists. But it also is capable of computing many indefinite integrals as well.\nAnother advantage of this method is that it can use some results about the Meijer G-Function to give a result in terms of a Piecewise expression, which allows to express conditionally convergent integrals.\nSetting meijerg=true will cause integrate to use only this method.\n\n\n\nThe “manual integration” algorithm:\n\nThis algorithm tries to mimic how a person would find an antiderivative by hand, for example by looking for a substitution or applying integration by parts. This algorithm does not handle as many integrands but can return results in a more familiar form.\nSometimes this algorithm can evaluate parts of an integral; in this case integrate will try to evaluate the rest of the integrand using the other methods here.\nSetting manual=true will cause integrate to use only this method.\n\n\n\nThe Heuristic Risch algorithm:\n\nThis is a heuristic version of the Risch algorithm, meaning that it is not deterministic. This is tried as a last resort because it can be very slow. It is still used because not enough of the full Risch algorithm is implemented, so that there are still some integrals that can only be computed using this method. The goal is to implement enough of the Risch and Meijer G-function methods so that this can be deleted. Setting heurisch=true will cause integrate to use only this method. Set heurisch=false to not use it."
},
{
"objectID": "integrals/substitution.html",
"href": "integrals/substitution.html",
"title": "38  Substitution",
"section": "",
"text": "This section uses these add-on packages:\nThe technique of \\(u\\)-substitution is derived from reversing the chain rule: \\([f(g(x))]' = f'(g(x)) g'(x)\\).\nSuppose that \\(g\\) is continuous and \\(u(x)\\) is differentiable with \\(u'(x)\\) being Riemann integrable. Then both these integrals are defined:\n\\[\n\\int_a^b g(u(t)) \\cdot u'(t) dt, \\quad \\text{and}\\quad \\int_{u(a)}^{u(b)} g(x) dx.\n\\]\nWe wish to show they are equal.\nLet \\(G\\) be an antiderivative of \\(g\\), which exists as \\(g\\) is assumed to be continuous. (By the Fundamental Theorem part I.) Consider the composition \\(G \\circ u\\). The chain rule gives:\n\\[\n[G \\circ u]'(t) = G'(u(t)) \\cdot u'(t) = g(u(t)) \\cdot u'(t).\n\\]\nSo,\n\\[\n\\begin{align*}\n\\int_a^b g(u(t)) \\cdot u'(t) dt &= \\int_a^b (G \\circ u)'(t) dt\\\\\n&= (G\\circ u)(b) - (G\\circ u)(a) \\quad\\text{(the FTC, part II)}\\\\\n&= G(u(b)) - G(u(a)) \\\\\n&= \\int_{u(a)}^{u(b)} g(x) dx. \\quad\\text{(the FTC part II)}\n\\end{align*}\n\\]\nThat is, this substitution formula applies:\nFurther, for indefinite integrals,\nWe have seen a special case of substitution where \\(u(x) = x-c\\) in the formula \\(\\int_{a-c}^{b-c} g(x) dx= \\int_a^b g(x-c)dx\\).\nThe main use of this is to take complicated things inside of the function \\(g\\) out of the function (the \\(u(x)\\)) by renaming them, then accounting for the change of name.\nSome examples are in order.\nConsider:\n\\[\n\\int_0^{\\pi/2} \\cos(x) e^{\\sin(x)} dx.\n\\]\nClearly the \\(\\sin(x)\\) inside the exponential is an issue. If we let \\(u(x) = \\sin(x)\\), then \\(u'(x) = \\cos(x)\\), and this becomes\n\\[\n\\int_0^{\\pi/2} u'(x) e^{u(x)} dx =\n\\int_{u(0)}^{u(\\pi/2)} e^x dx = e^x \\big|_{\\sin(0)}^{\\sin(\\pi/2)} = e^1 - e^0.\n\\]\nThis all worked, as the problem was such that it was more or less obvious what to choose for \\(u\\) and \\(G\\)."
},
{
"objectID": "integrals/substitution.html#sympy-and-substitution",
"href": "integrals/substitution.html#sympy-and-substitution",
"title": "38  Substitution",
"section": "38.1 SymPy and substitution",
"text": "38.1 SymPy and substitution\nThe integrate function in SymPy can handle most problems which involve substitution. Here are a few examples:\n\nThis integral, \\(\\int_0^2 4x/\\sqrt{x^2 +1}dx\\), involves a substitution for \\(x^2 + 1\\):\n\n\n@syms x::real t::real\nintegrate(4x / sqrt(x^2 + 1), (x, 0, 2))\n\n \n\\[\n-4 + 4 \\sqrt{5}\n\\]\n\n\n\n\nThis integral, \\(\\int_e^{e^2} 1/(x\\log(x)) dx\\) involves a substitution of \\(u=\\log(x)\\). Here we see the answer:\n\nf(x) = 1/(x*log(x))\nintegrate(f(x), (x, sympy.E, sympy.E^2))\n\n \n\\[\n\\log{\\left(2 \\right)}\n\\]\n\n\n\n(We used sympy.E - and not e - to avoid any conversion to floating point, which could yield an inexact answer.)\nThe antiderivative is interesting here; it being an iterated logarithm.\n\nintegrate(1/(x*log(x)), x)\n\n \n\\[\n\\log{\\left(\\log{\\left(x \\right)} \\right)}\n\\]\n\n\n\n\n38.1.1 Failures…\nNot every integral problem lends itself to solution by substitution. For example, we can use substitution to evaluate the integral of \\(xe^{-x^2}\\), but not for \\(e^{-x^2}\\) or \\(x^2e^{-x^2}\\). The first has no familiar antiderivative; the second is done by a different technique.\nEven when substitution can be used, SymPy may not be able to algorithmically identify it. The main algorithm used can determine if expressions involving rational functions, radicals, logarithms, and exponential functions are integrable. Missing from this list are absolute values.\nFor some such problems, we can help SymPy out - by breaking the integral into pieces where we know the sign of the expression.\nFor substitution problems, we can also help out. 
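As a small, hedged illustration of the sign-splitting idea just mentioned, here is a sketch in Python/SymPy (the notes drive SymPy from Julia; the integrand \\(\\lvert x \\rvert\\) and the interval are our own choices, not from the notes):

```python
from sympy import symbols, integrate, Rational

x = symbols('x', real=True)

# Split the integral of |x| over [-1, 2] at the sign change by hand,
# since absolute values can stump the integration algorithm:
# on [-1, 0] we have |x| = -x, and on [0, 2] we have |x| = x
left = integrate(-x, (x, -1, 0))    # 1/2
right = integrate(x, (x, 0, 2))     # 2
total = left + right                # 5/2
```

The same splitting works for any integrand whose sign changes at known points.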
For example, to find an antiderivative for\n\\[\n\\int(1 + \\log(x)) \\sqrt{1 + (x\\log(x))^2} dx\n\\]\nA quick attempt with SymPy turns up nothing:\n\n𝒇(x) = (1 + log(x)) * sqrt(1 + (x*log(x))^2 )\nintegrate(𝒇(x), x)\n\n \n\\[\n\\int \\sqrt{x^{2} \\log{\\left(x \\right)}^{2} + 1} \\left(\\log{\\left(x \\right)} + 1\\right)\\, dx\n\\]\n\n\n\nBut were we to try \\(u=x\\log(x)\\), wed see that this simplifies to \\(\\int \\sqrt{1 + u^2} du\\), which has some hope of having an antiderivative.\nWe can help SymPy out by substitution:\n\nu(x) = x * log(x)\n@syms w dw\nex = 𝒇(x)\nex₁ = ex(u(x) => w, diff(u(x),x) => dw)\n\n \n\\[\ndw \\sqrt{w^{2} + 1}\n\\]\n\n\n\nThis verifies the above. Can it be integrated in w? The “dw” is only for familiarity, SymPy doesnt use this, so we set it to 1 then integrate:\n\nex₂ = ex₁(dw => 1)\nex₃ = integrate(ex₂, w)\n\n \n\\[\n\\frac{w \\sqrt{w^{2} + 1}}{2} + \\frac{\\operatorname{asinh}{\\left(w \\right)}}{2}\n\\]\n\n\n\nFinally, we put back in the u(x) to get an antiderivative.\n\nex₃(w => u(x))\n\n \n\\[\n\\frac{x \\sqrt{x^{2} \\log{\\left(x \\right)}^{2} + 1} \\log{\\left(x \\right)}}{2} + \\frac{\\operatorname{asinh}{\\left(x \\log{\\left(x \\right)} \\right)}}{2}\n\\]\n\n\n\n\n\n\n\n\n\nNote\n\n\n\nLest it be thought this is an issue with SymPy, but not other systems, this example was borrowed from an illustration for helping Mathematica."
},
{
"objectID": "integrals/substitution.html#trigonometric-substitution",
"href": "integrals/substitution.html#trigonometric-substitution",
"title": "38  Substitution",
"section": "38.2 Trigonometric substitution",
"text": "38.2 Trigonometric substitution\nWait, in the last example an antiderivative for \\(\\sqrt{1 + u^2}\\) was found. But how? We havent discussed this yet.\nThis can be found using trigonometric substitution. In this example, we know that \\(1 + \\tan(\\theta)^2\\) simplifies to \\(\\sec(\\theta)^2\\), so we might try a substitution of \\(\\tan(u)=x\\). This would simplify \\(\\sqrt{1 + x^2}\\) to \\(\\sqrt{1 + \\tan(u)^2} = \\sqrt{\\sec(u)^2}\\) which is \\(\\lvert \\sec(u) \\rvert\\). What of \\(du\\)? The chain rule gives \\(\\sec(u)^2du = dx\\). In short we get:\n\\[\n\\int \\sqrt{1 + x^2} dx = \\int \\sec(u)^2 \\lvert \\sec(u) \\rvert du = \\int \\sec(u)^3 du,\n\\]\nif we know \\(\\sec(u) \\geq 0\\).\nThis still leaves the question of integrating \\(\\sec(u)^3\\), which we arent (yet) prepared to discuss, but we see that this type of substitution can re-express an integral in a new way that may pay off.\n\nExamples\nLets see some examples where a trigonometric substitution is all that is needed.\n\nExample\nConsider \\(\\int 1/(1+x^2) dx\\). This is an antiderivative of some function, but if that isnt observed, we might notice the \\(1+x^2\\) and try to simplify that. First, an attempt at a \\(u\\)-substitution:\nLetting \\(u = 1+x^2\\) we get \\(du = 2xdx\\) which gives \\(\\int (1/u) (1/(2x)) du\\). We arent able to address the “\\(1/(2x)\\)” part successfully, so this attempt is for naught.\nNow we try a trigonometric substitution, taking advantage of the identity \\(1+\\tan(x)^2 = \\sec(x)^2\\). Letting \\(\\tan(u) = x\\) yields \\(\\sec(u)^2 du = dx\\) and we get:\n\\[\n\\int \\frac{1}{1+x^2} dx = \\int \\frac{1}{1 + \\tan(u)^2} \\sec(u)^2 du = \\int 1 du = u.\n\\]\nBut \\(\\tan(u) = x\\), so in terms of \\(x\\), an antiderivative is just \\(\\tan^{-1}(x)\\), or the arctangent. 
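A quick numeric sanity check of this antiderivative can be made with a midpoint Riemann sum; this plain-Python sketch (our own, not part of the notes) compares the area under \\(1/(1+x^2)\\) over \\([0,1]\\) with \\(\\tan^{-1}(1) - \\tan^{-1}(0) = \\pi/4\\):

```python
from math import pi

# Midpoint Riemann sum for 1/(1+x**2) over [0, 1]; the fundamental
# theorem says this should approach atan(1) - atan(0) = pi/4
n = 100_000
dx = 1.0 / n
riemann = sum(1.0 / (1 + ((i + 0.5) * dx)**2) for i in range(n)) * dx
# riemann is approximately 0.7853981... = pi/4
```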
Here we verify with SymPy:\n\nintegrate(1/(1+x^2), x)\n\n \n\\[\n\\operatorname{atan}{\\left(x \\right)}\n\\]\n\n\n\nThe general form allows \\(a^2 + (bx)^2\\) in the denominator (squared so both are positive and the answer is nicer):\n\n@syms a::real, b::real, x::real\nintegrate(1 / (a^2 + (b*x)^2), x)\n\n \n\\[\n\\frac{\\operatorname{atan}{\\left(\\frac{b x}{a} \\right)}}{a b}\n\\]\n\n\n\n\n\nExample\nThe expression \\(1-x^2\\) can be attacked by the substitution \\(\\sin(u) =x\\) as then \\(1-x^2 = 1-\\sin(u)^2 = \\cos(u)^2\\). Here we see this substitution being used successfully:\n\\[\n\\begin{align*}\n\\int \\frac{1}{\\sqrt{9 - x^2}} dx &= \\int \\frac{1}{\\sqrt{9 - (3\\sin(u))^2}} \\cdot 3\\cos(u) du\\\\\n&=\\int \\frac{1}{3\\sqrt{1 - \\sin(u)^2}}\\cdot3\\cos(u) du \\\\\n&= \\int du \\\\\n&= u \\\\\n&= \\sin^{-1}(x/3).\n\\end{align*}\n\\]\nFurther substitution allows the following integral to be solved for an antiderivative:\n\n@syms a::real, b::real\nintegrate(1 / sqrt(a^2 - b^2*x^2), x)\n\n \n\\[\n\\begin{cases} - \\frac{i x \\left|{a}\\right| \\operatorname{acosh}{\\left(\\frac{\\left|{b}\\right| \\left|{x}\\right|}{\\left|{a}\\right|} \\right)}}{a \\left|{b}\\right| \\left|{x}\\right|} + \\frac{\\pi x \\left|{a}\\right|}{2 a \\left|{b}\\right| \\left|{x}\\right|} & \\text{for}\\: \\frac{b^{2} x^{2}}{a^{2}} > 1 \\\\\\frac{x \\left|{a}\\right| \\operatorname{asin}{\\left(\\frac{\\left|{b}\\right| \\left|{x}\\right|}{\\left|{a}\\right|} \\right)}}{a \\left|{b}\\right| \\left|{x}\\right|} & \\text{otherwise} \\end{cases}\n\\]\n\n\n\n\n\nExample\nThe expression \\(x^2 - 1\\) is a bit different; this lends itself to \\(\\sec(u) = x\\) for a substitution, for \\(\\sec(u)^2 - 1 = \\tan(u)^2\\). 
For example, we try \\(\\sec(u) = x\\) to integrate:\n\\[\n\\begin{align*}\n\\int \\frac{1}{\\sqrt{x^2 - 1}} dx &= \\int \\frac{1}{\\sqrt{\\sec(u)^2 - 1}} \\cdot \\sec(u)\\tan(u) du\\\\\n&=\\int \\frac{1}{\\tan(u)}\\sec(u)\\tan(u) du\\\\\n&= \\int \\sec(u) du.\n\\end{align*}\n\\]\nThis doesnt seem that helpful, but the antiderivative of \\(\\sec(u)\\) is \\(\\log\\lvert (\\sec(u) + \\tan(u))\\rvert\\), so we can proceed to get:\n\\[\n\\begin{align*}\n\\int \\frac{1}{\\sqrt{x^2 - 1}} dx &= \\int \\sec(u) du\\\\\n&= \\log\\lvert (\\sec(u) + \\tan(u))\\rvert\\\\\n&= \\log\\lvert x + \\sqrt{x^2-1} \\rvert.\n\\end{align*}\n\\]\nSymPy gives a different representation using the arccosine:\n\n@syms a::positive, b::positive, x::real\nintegrate(1 / sqrt(a^2*x^2 - b^2), x)\n\n \n\\[\n\\frac{\\operatorname{acosh}{\\left(\\frac{a x}{b} \\right)}}{a}\n\\]\n\n\n\n\n\nExample\nThe equation of an ellipse is \\(x^2/a^2 + y^2/b^2 = 1\\). Suppose \\(a,b>0\\). The area under the function \\(b \\sqrt{1 - x^2/a^2}\\) between \\(-a\\) and \\(a\\) will then be half the area of the ellipse. Find the area enclosed by the ellipse.\nWe need to compute:\n\\[\n2\\int_{-a}^a b \\sqrt{1 - x^2/a^2} dx =\n4 b \\int_0^a\\sqrt{1 - x^2/a^2} dx.\n\\]\nLetting \\(\\sin(u) = x/a\\) gives \\(a\\cos(u)du = dx\\) and an antiderivative is found with:\n\\[\n4 b \\int_0^a \\sqrt{1 - x^2/a^2} dx = 4b \\int_0^{\\pi/2} \\sqrt{1-\\sin(u)^2} a \\cos(u) du\n= 4ab \\int_0^{\\pi/2} \\cos(u)^2 du\n\\]\nThe identity \\(\\cos(u)^2 = (1 + \\cos(2u))/2\\) makes this tractable:\n\\[\n\\begin{align*}\n4ab \\int_0^{\\pi/2} \\cos(u)^2 du\n&= 4ab\\int_0^{\\pi/2}(\\frac{1}{2} + \\frac{\\cos(2u)}{2}) du\\\\\n&= 4ab(\\frac{1}{2}u + \\frac{\\sin(2u)}{4})\\big|_0^{\\pi/2}\\\\\n&= 4ab (\\pi/4 + 0) = \\pi ab.\n\\end{align*}\n\\]\nKeeping in mind that a circle with radius \\(a\\) is an ellipse with \\(b=a\\), we see that this gives the correct answer for a circle."
},
{
"objectID": "integrals/substitution.html#questions",
"href": "integrals/substitution.html#questions",
"title": "38  Substitution",
"section": "38.3 Questions",
"text": "38.3 Questions\n\nQuestion\nFor \\(\\int \\sin(x) \\cos(x) dx\\), let \\(u=\\sin(x)\\). What is the resulting substitution?\n\\(\\int u (1 - u^2) du\\)\n\\(\\int u du\\)\n\\(\\int u \\cos(x) du\\)\n\nQuestion\nFor \\(\\int \\tan(x)^4 \\sec(x)^2 dx\\) what \\(u\\)-substitution makes this easy?\n\\(u=\\tan(x)\\)\n\\(u=\\sec(x)^2\\)\n\\(u=\\sec(x)\\)\n\\(u=\\tan(x)^4\\)\n\nQuestion\nFor \\(\\int x \\sqrt{x^2 - 1} dx\\) what \\(u\\) substitution makes this easy?\n\\(u=\\sqrt{x^2 - 1}\\)\n\\(u=x^2 - 1\\)\n\\(u=x\\)\n\\(u=x^2\\)\n\nQuestion\nFor \\(\\int x^2(1-x)^2 dx\\) will the substitution \\(u=1-x\\) prove effective?\nYes\nNo\nWhat about expanding the factored polynomial to get a fourth degree polynomial, will this prove effective?\nYes\nNo\n\nQuestion\nFor \\(\\int (\\log(x))^3/x dx\\) the substitution \\(u=\\log(x)\\) reduces this to what?\n\\(\\int u du\\)\n\\(\\int u^3/x du\\)\n\\(\\int u^3 du\\)\n\nQuestion\nFor \\(\\int \\tan(x) dx\\) what substitution will prove effective?\n\\(u=\\tan(x)\\)\n\\(u=\\sin(x)\\)\n\\(u=\\cos(x)\\)\n\nQuestion\nIntegrating \\(\\int_0^1 x \\sqrt{1 - x^2} dx\\) can be done by using the 
\\(u\\)-substitution \\(u=1-x^2\\). This yields an integral\n\\[\n\\int_a^b \\frac{-\\sqrt{u}}{2} du.\n\\]\nWhat are \\(a\\) and \\(b\\)?\n\\(a=1,~ b=0\\)\n\\(a=0,~ b=0\\)\n\\(a=1,~ b=1\\)\n\\(a=0,~ b=1\\)\n\nQuestion\nThe integral \\(\\int \\sqrt{1 - x^2} dx\\) lends itself to what substitution?\n\\(u = 1 - x^2\\)\n\\(\\tan(u) = x\\)\n\\(\\sin(u) = x\\)\n\\(\\sec(u) = x\\)\n\nQuestion\nThe integral \\(\\int x/(1+x^2) dx\\) lends itself to what substitution?\n\\(u = 1 + x^2\\)\n\\(\\sec(u) = x\\)\n\\(\\tan(u) = x\\)\n\\(\\sin(u) = x\\)\n\nQuestion\nThe integral \\(\\int dx / \\sqrt{1 - x^2}\\) lends itself to what substitution?\n\\(\\tan(u) = x\\)\n\\(u = 1 - x^2\\)\n\\(\\sin(u) = x\\)\n\\(\\sec(u) = x\\)\n\nQuestion\nThe integral \\(\\int dx / \\sqrt{x^2 - 16}\\) lends itself to what substitution?\n\\(4\\sec(u) = x\\)\n\\(\\sec(u) = x\\)\n\\(\\sin(u) = x\\)\n\\(4\\sin(u) = x\\)\n\nQuestion\nThe integral \\(\\int dx / (a^2 + x^2)\\) lends itself to what substitution?\n\\(a\\tan(u) = x\\)\n\\(\\tan(u) = x\\)\n\\(a\\sec(u) = x\\)\n\\(\\sec(u) = x\\)\n\nQuestion\nThe integral \\(\\int_{1/2}^1 \\sqrt{1 - x^2}dx\\) can be 
approached with the substitution \\(\\sin(u) = x\\) giving:\n\\[\n\\int_a^b \\cos(u)^2 du.\n\\]\nWhat are \\(a\\) and \\(b\\)?\n\\(a=1/2,~ b= 1\\)\n\\(a=\\pi/3,~ b=\\pi/2\\)\n\\(a=\\pi/6,~ b=\\pi/2\\)\n\\(a=\\pi/4,~ b=\\pi/2\\)\n\nQuestion\nHow would we verify that \\(\\log\\lvert (\\sec(u) + \\tan(u))\\rvert\\) is an antiderivative for \\(\\sec(u)\\)?\nWe could differentiate \\(\\log\\lvert (\\sec(u) + \\tan(u))\\rvert\\)\nWe could differentiate \\(\\sec(u)\\)."
},
{
"objectID": "integrals/integration_by_parts.html",
"href": "integrals/integration_by_parts.html",
"title": "39  Integration By Parts",
"section": "",
"text": "This section uses these add-on packages:\nSo far we have seen that the derivative rules lead to integration rules. In particular:\nNow we turn our attention to the implications of the product rule: \\([uv]' = u'v + uv'\\). The resulting technique is called integration by parts.\nThe following illustrates integration by parts of the integral \\((uv)'\\) over \\([a,b]\\).\nThe figure is a parametric plot of \\((u,v)\\) with the points \\((u(a), v(a))\\) and \\((u(b), v(b))\\) marked. The difference \\(u(b)v(b) - u(a)v(a) = u(x)v(x) \\mid_a^b\\) is shaded. This area breaks into two pieces, \\(A\\) and \\(B\\), partitioned by the curve. If \\(u\\) is increasing and the curve is parameterized by \\(t \\rightarrow u^{-1}(t)\\), then \\(A=\\int_{u^{-1}(a)}^{u^{-1}(b)} v(u^{-1}(t))dt\\). A \\(u\\)-substitution with \\(t = u(x)\\) changes this into the integral \\(\\int_a^b v(x) u'(x) dx\\). Similarly, for increasing \\(v\\), it can be seen that \\(B=\\int_a^b u(x) v'(x) dx\\). This suggests a relationship between the integral of \\(u v'\\), the integral of \\(u' v\\) and the value \\(u(b)v(b) - u(a)v(a)\\).\nIn terms of formulas, by the fundamental theorem of calculus:\n\\[\nu(x)\\cdot v(x)\\big|_a^b = \\int_a^b [u(x) v(x)]' dx = \\int_a^b u'(x) \\cdot v(x) dx + \\int_a^b u(x) \\cdot v'(x) dx.\n\\]\nThis is re-expressed as\n\\[\n\\int_a^b u(x) \\cdot v'(x) dx = u(x) \\cdot v(x)\\big|_a^b - \\int_a^b v(x) \\cdot u'(x) dx,\n\\]\nor, more informally, as \\(\\int udv = uv - \\int v du\\).\nThis can sometimes be confusingly written as:\n\\[\n\\int f(x) g'(x) dx = f(x)g(x) - \\int f'(x) g(x) dx.\n\\]\n(The confusion coming from the fact that the indefinite integrals are only defined up to a constant.)\nHow does this help? It allows us to differentiate parts of an integral in hopes it makes the result easier to integrate.\nAn illustration can clarify.\nConsider the integral \\(\\int_0^\\pi x\\sin(x) dx\\). 
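Before working this example by hand, the by-parts identity itself can be checked symbolically. A minimal Python/SymPy sketch (our own check; the notes call SymPy from Julia) for the sample pair \\(u=x\\), \\(v=-\\cos(x)\\) on \\([0,\\pi]\\):

```python
from sympy import symbols, cos, integrate, diff, pi

# Verify: integral of u*v' over [a, b] equals u*v evaluated at the
# endpoints minus the integral of v*u' over [a, b]
x = symbols('x')
u, v = x, -cos(x)
a, b = 0, pi
lhs = integrate(u * diff(v, x), (x, a, b))
rhs = (u * v).subs(x, b) - (u * v).subs(x, a) - integrate(v * diff(u, x), (x, a, b))
# both sides reduce to pi
```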
If we let \\(u=x\\) and \\(dv=\\sin(x) dx\\), then \\(du = 1dx\\) and \\(v=-\\cos(x)\\). The above then says:\n\\[\n\\begin{align*}\n\\int_0^\\pi x\\sin(x) dx &= \\int_0^\\pi u dv\\\\\n&= uv\\big|_0^\\pi - \\int_0^\\pi v du\\\\\n&= x \\cdot (-\\cos(x)) \\big|_0^\\pi - \\int_0^\\pi (-\\cos(x)) dx\\\\\n&= \\pi (-\\cos(\\pi)) - 0(-\\cos(0)) + \\int_0^\\pi \\cos(x) dx\\\\\n&= \\pi + \\sin(x)\\big|_0^\\pi\\\\\n&= \\pi.\n\\end{align*}\n\\]\nThe technique means one part is differentiated and one part integrated. The art is to break the integrand up into a piece that gets easier through differentiation and a piece that doesnt get much harder through integration."
},
{
"objectID": "integrals/integration_by_parts.html#area-related-to-parameterized-curves",
"href": "integrals/integration_by_parts.html#area-related-to-parameterized-curves",
"title": "39  Integration By Parts",
"section": "39.1 Area related to parameterized curves",
"text": "39.1 Area related to parameterized curves\nThe figure introduced to motivate the integration by parts formula also suggests that areas described parametrically (by a pair of functions \\(x=u(t), y=v(t)\\) for \\(a \\le t \\le b\\)) can have their area computed.\nWhen \\(u(t)\\) is strictly increasing, and hence has an inverse function, then re-parameterizing by \\(\\phi(t) = u^{-1}(t)\\) gives \\(x=u(u^{-1}(t))=t, y=v(u^{-1}(t))\\) and integrating this gives the area by \\(A=\\int_a^b v(t) u'(t) dt\\).\nHowever, the correct answer requires understanding a minus sign. Consider the area enclosed by \\(x(t) = \\cos(t), y(t) = \\sin(t)\\):\n\n\n\n\n\nWe added a rectangle for a Riemann sum for \\(t_i = \\pi/3\\) and \\(t_{i+1} = \\pi/3 + \\pi/8\\). The height of this rectangle is \\(y(t_i)\\), the base is of length \\(x(t_i) - x(t_{i+1})\\) given the orientation of how the circular curve is parameterized (counter clockwise here).\nTaking this Riemann sum approach, we can approximate the area under the curve parameterized by \\((u(t), v(t))\\) over the time range \\([t_i, t_{i+1}]\\) as a rectangle with height \\(y(t_i)\\) and base \\(x(t_{i}) - x(t_{i+1})\\). Then we get, as expected:\n\\[\n\\begin{align*}\nA &\\approx \\sum_i y(t_i) \\cdot (x(t_{i}) - x(t_{i+1}))\\\\\n &= - \\sum_i y(t_i) \\cdot (x(t_{i+1}) - x(t_{i}))\\\\\n &= - \\sum_i y(t_i) \\cdot \\frac{x(t_{i+1}) - x(t_i)}{t_{i+1}-t_i} \\cdot (t_{i+1}-t_i)\\\\\n &\\approx -\\int_a^b y(t) x'(t) dt.\n\\end{align*}\n\\]\nSo with a counterclockwise rotation, the actual answer for the area includes a minus sign. If the area is traced out in a clockwise manner, there is no minus sign.\nThis is a case of Greens Theorem to be taken up in Greens Theorem, Stokes Theorem, and the Divergence Theorem.\n\nExample\nApply the formula to a parameterized circle to ensure the signed area is properly computed. 
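As a warm-up, the signed-area formula can be checked numerically for the unit circle; this plain-Python Riemann-sum sketch (our own, not from the notes) should reproduce \\(\\pi\\):

```python
from math import sin, pi

# Approximate A = -integral of y(t) * x'(t) dt over [0, 2*pi] for the
# unit circle x(t) = cos(t), y(t) = sin(t), so x'(t) = -sin(t)
n = 200_000
dt = 2 * pi / n
area = 0.0
for i in range(n):
    t = (i + 0.5) * dt          # midpoint of each subinterval
    area += -sin(t) * (-sin(t)) * dt
# area is approximately pi
```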
If we use \\(x(t) = r\\cos(t)\\) and \\(y(t) = r\\sin(t)\\) then the motion is counterclockwise:\n\n@syms 𝒓 t\n𝒙 = 𝒓 * cos(t)\n𝒚 = 𝒓 * sin(t)\n-integrate(𝒚 * diff(𝒙, t), (t, 0, 2PI))\n\n \n\\[\n\\pi 𝒓^{2}\n\\]\n\n\n\nWe see the expected answer for the area of a circle.\n\n\nExample\nApply the formula to find the area under one arch of a cycloid, parameterized by \\(x(t) = t - \\sin(t), y(t) = 1 - \\cos(t)\\).\nWorking symbolically, we have one arch given by the following described in a clockwise manner, so we use \\(\\int y(t) x'(t) dt\\):\n\n@syms t\n𝒙 = t - sin(t)\n𝒚 = 1 - cos(t)\nintegrate(𝒚 * diff(𝒙, t), (t, 0, 2PI))\n\n \n\\[\n3 \\pi\n\\]\n\n\n\n(Galileo was thwarted in finding this answer exactly and resorted to constructing one from metal to estimate the value.)\n\n\nExample\nConsider the example \\(x(t) = \\cos(t) + t\\sin(t), y(t) = \\sin(t) - t\\cos(t)\\) for \\(0 \\leq t \\leq 2\\pi\\).\n\n\n\n\n\nHow much area is enclosed by this curve and the \\(x\\) axis? The area is described in a counterclockwise manner, so we have:\n\nlet\n    x(t) = cos(t) + t*sin(t)\n    y(t) = sin(t) - t*cos(t)\n    yx(t) = -y(t) * x'(t) # yx\\prime[tab]\n    quadgk(yx, 0, 2pi)\nend\n\n(44.483294893989545, 6.295185999150021e-7)\n\n\nThis particular problem could also have been done symbolically, but many curves will need to have a numeric approximation used."
},
{
"objectID": "integrals/integration_by_parts.html#questions",
"href": "integrals/integration_by_parts.html#questions",
"title": "39  Integration By Parts",
"section": "39.2 Questions",
"text": "39.2 Questions\n\nQuestion\nIn the integral \\(\\int \\log(x) dx\\) we let \\(u=\\log(x)\\) and \\(dv=dx\\). What are \\(du\\) and \\(v\\)?\n\\(du=1/x dx\\quad v = x^2/2\\)\n\\(du=1/x dx \\quad v = x\\)\n\\(du=x\\log(x) dx\\quad v = 1\\)\n\nQuestion\nIn the integral \\(\\int \\sec(x)^3 dx\\) we let \\(u=\\sec(x)\\) and \\(dv = \\sec(x)^2 dx\\). What are \\(du\\) and \\(v\\)?\n\\(du=\\sec(x)\\tan(x)dx \\quad v=\\tan(x)\\)\n\\(du=\\tan(x) dx \\quad v=\\sec(x)\\tan(x)\\)\n\\(du=\\csc(x) dx \\quad v=\\sec(x)^3 / 3\\)\n\nQuestion\nIn the integral \\(\\int e^{-x} \\cos(x)dx\\) we let \\(u=e^{-x}\\) and \\(dv=\\cos(x) dx\\). What are \\(du\\) and \\(v\\)?\n\\(du=-e^{-x} dx \\quad v=\\sin(x)\\)\n\\(du=-e^{-x} dx \\quad v=-\\sin(x)\\)\n\\(du=\\sin(x)dx \\quad v=-e^{-x}\\)\n\nQuestion\nFind the value of \\(\\int_1^4 x \\log(x) dx\\). You can integrate by parts.\n\nQuestion\nFind the value of \\(\\int_0^{\\pi/2} x\\cos(2x) dx\\). You can integrate by parts.\n\nQuestion\nFind the value of \\(\\int_1^e (\\log(x))^2 dx\\). You can integrate by parts.\n\nQuestion\nIntegration by parts can be used to provide “reduction” formulas, where an antiderivative is written in terms of another antiderivative with a lower power. 
Which is the proper reduction formula for \\(\\int (\\log(x))^n dx\\)?\n\\(x(\\log(x))^n - \\int (\\log(x))^{n-1} dx\\)\n\\(\\int (\\log(x))^{n+1}/(n+1) dx\\)\n\\(x(\\log(x))^n - n \\int (\\log(x))^{n-1} dx\\)\n\nQuestion\nThe Wikipedia page has a rule of thumb with an acronym LIATE to indicate what is a good candidate to be “\\(u\\)”: Log function, Inverse functions, Algebraic functions (\\(x^n\\)), Trigonometric functions, and Exponential functions.\nConsider the integral \\(\\int x \\cos(x) dx\\). Which letter should be tried first?\nL\nI\nA\nT\nE\n\nConsider the integral \\(\\int x^2\\log(x) dx\\). Which letter should be tried first?\nL\nI\nA\nT\nE\n\nConsider the integral \\(\\int x^2 \\sin^{-1}(x) dx\\). Which letter should be tried first?\nL\nI\nA\nT\nE\n\nConsider the integral \\(\\int e^x \\sin(x) dx\\). Which letter should be tried first?\nL\nI\nA\nT\nE\n\nQuestion\nFind an antiderivative for \\(\\cos^{-1}(x)\\) using the integration by parts formula.\n\\(-\\sin^{-1}(x)\\)\n\\(x\\cos^{-1}(x)-\\sqrt{1 - x^2}\\)\n\\(x^2/2 \\cos^{-1}(x) - x\\sqrt{1-x^2}/4 - \\cos^{-1}(x)/4\\)"
},
{
"objectID": "integrals/partial_fractions.html",
"href": "integrals/partial_fractions.html",
"title": "40  Partial Fractions",
"section": "",
"text": "using CalculusWithJulia\nusing SymPy\nIntegration is facilitated when an antiderivative for \\(f\\) can be found, as then definite integrals can be evaluated through the fundamental theorem of calculus.\nHowever, despite differentiation being an algorithmic procedure, integration is not. There are “tricks” to try, such as substitution and integration by parts. These work in some cases. However, there are classes of functions for which algorithms exist. For example, the SymPy integrate function mostly implements an algorithm that decides if an elementary function has an antiderivative. The elementary functions include exponentials, their inverses (logarithms), trigonometric functions, their inverses, and powers, including \\(n\\)th roots. Not every elementary function will have an antiderivative comprised of (finite) combinations of elementary functions. The typical example is \\(e^{x^2}\\), which has no simple antiderivative, despite its ubiquitousness.\nThere are classes of functions where an (elementary) antiderivative can always be found. Polynomials provide a case. More surprisingly, so do their ratios, rational functions."
},
{
"objectID": "integrals/partial_fractions.html#partial-fraction-decomposition",
"href": "integrals/partial_fractions.html#partial-fraction-decomposition",
"title": "40  Partial Fractions",
"section": "40.1 Partial fraction decomposition",
"text": "40.1 Partial fraction decomposition\nLet \\(f(x) = p(x)/q(x)\\), where \\(p\\) and \\(q\\) are polynomial functions with real coefficients. Further, we assume without comment that \\(p\\) and \\(q\\) have no common factors. (If they did, we can divide them out, an act which has no effect on the integrability of \\(f(x)\\).)\nThe function \\(q(x)\\) will factor over the real numbers. The fundamental theorem of algebra can be applied to say that \\(q(x)=q_1(x)^{n_1} \\cdots q_k(x)^{n_k}\\) where \\(q_i(x)\\) is a linear or quadratic polynomial and \\(n_i\\) a positive integer.\n\nPartial Fraction Decomposition: There are unique polynomials \\(a_{ij}\\) with degree \\(a_{ij} <\\) degree \\(q_i\\) such that\n\\[\n\\frac{p(x)}{q(x)} = a(x) + \\sum_{i=1}^k \\sum_{j=1}^{n_i} \\frac{a_{ij}(x)}{q_i(x)^j}.\n\\]\n\nThe method is attributed to John Bernoulli, one of the prolific Bernoulli brothers who put a stamp on several areas of math. This Bernoulli was a mentor to Euler.\nThis basically says that each factor \\(q_i(x)^{n_i}\\) contributes a term like:\n\\[\n\\frac{a_{i1}(x)}{q_i(x)^1} + \\frac{a_{i2}(x)}{q_i(x)^2} + \\cdots + \\frac{a_{in_i}(x)}{q_i(x)^{n_i}},\n\\]\nwhere each \\(a_{ij}(x)\\) has degree less than the degree of \\(q_i(x)\\).\nThe value of this decomposition is that the terms \\(a_{ij}(x)/q_i(x)^j\\) each have an antiderivative, and so the sum of them will also have an antiderivative.\n\n\n\n\n\n\nNote\n\n\n\nMany calculus texts will give some examples for finding a partial fraction decomposition. We push that work off to SymPy, as for all but the easiest cases - a few are in the problems - it can be a bit tedious.\n\n\nIn SymPy, the apart function will find the partial fraction decomposition when a factorization is available. 
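A convenient check that a decomposition is faithful is to recombine its pieces. A short Python/SymPy sketch (our own, smaller example; the notes drive SymPy from Julia):

```python
from sympy import symbols, apart, together, simplify

x = symbols('x', real=True)

# apart() produces the partial fraction decomposition; together()
# recombines it, and the difference should simplify to zero
ex = (x - 2) * (x - 3) / (x * (x - 1)**2)
pf = apart(ex, x)
residual = simplify(together(pf) - ex)   # 0
```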
For example, here we see \\(n_i\\) terms for each power of \\(q_i\\):\n\n@syms a::real b::real c::real A::real B::real x::real\n\n(a, b, c, A, B, x)\n\n\n\napart((x-2)*(x-3) / (x*(x-1)^2*(x^2 + 2)^3))\n\n \n\\[\n- \\frac{8 x - 13}{9 \\left(x^{2} + 2\\right)^{3}} - \\frac{35 x - 34}{54 \\left(x^{2} + 2\\right)^{2}} - \\frac{45 x - 28}{108 \\left(x^{2} + 2\\right)} - \\frac{1}{3 \\left(x - 1\\right)} + \\frac{2}{27 \\left(x - 1\\right)^{2}} + \\frac{3}{4 x}\n\\]\n\n\n\n\n40.1.1 Sketch of proof\nA standard proof uses two facts of number systems: the division algorithm and a representation of the greatest common divisor in terms of sums, extended to polynomials. Our sketch shows how these are used.\nTake one of the factors of the denominator, and consider this representation of the rational function \\(P(x)/(q(x)^k Q(x))\\) where there are no common factors to any of the three polynomials.\nSince \\(q(x)\\) and \\(Q(x)\\) share no factors, Bezouts identity says there exist polynomials \\(a(x)\\) and \\(b(x)\\) with:\n\\[\na(x) Q(x) + b(x) q(x) = 1.\n\\]\nThen dividing by \\(q^k(x)Q(x)\\) gives the decomposition\n\\[\n\\frac{1}{q(x)^k Q(x)} = \\frac{a(x)}{q(x)^k} + \\frac{b(x)}{q(x)^{k-1}Q(x)}.\n\\]\nSo we get by multiplying the \\(P(x)\\):\n\\[\n\\frac{P(x)}{q(x)^k Q(x)} = \\frac{A(x)}{q(x)^k} + \\frac{B(x)}{q(x)^{k-1}Q(x)}.\n\\]\nThis may look more complicated, but what it does is peel off one term (the first) and leave something which is smaller, in this case by a factor of \\(q(x)\\). This process can be repeated pulling off a power of a factor at a time until nothing is left to do.\nWhat remains is to establish that we can take \\(A(x) = a(x)\\cdot P(x)\\) with a degree less than that of \\(q(x)\\).\nIn Proposition 3.8 of Bradley and Cook we can see how. Recall the division algorithm, for example, says there are \\(q_k\\) and \\(r_k\\) with \\(A=q\\cdot q_k + r_k\\) where the degree of \\(r_k\\) is less than that of \\(q\\), which is linear or quadratic. 
This is repeatedly applied below:\n\\[\n\\begin{align*}\n\\frac{A}{q^k} &= \\frac{q\\cdot q_k + r_k}{q^k}\\\\\n&= \\frac{r_k}{q^k} + \\frac{q_k}{q^{k-1}}\\\\\n&= \\frac{r_k}{q^k} + \\frac{q \\cdot q_{k-1} + r_{k-1}}{q^{k-1}}\\\\\n&= \\frac{r_k}{q^k} + \\frac{r_{k-1}}{q^{k-1}} + \\frac{q_{k-1}}{q^{k-2}}\\\\\n&= \\frac{r_k}{q^k} + \\frac{r_{k-1}}{q^{k-1}} + \\frac{q\\cdot q_{k-2} + r_{k-2}}{q^{k-2}}\\\\\n&= \\cdots\\\\\n&= \\frac{r_k}{q^k} + \\frac{r_{k-1}}{q^{k-1}} + \\cdots + q_1.\n\\end{align*}\n\\]\nSo the term \\(A(x)/q(x)^k\\) can be expressed in terms of a sum where the numerators of each term have degree less than \\(q(x)\\), as expected by the statement of the theorem."
},
{
"objectID": "integrals/partial_fractions.html#integrating-the-terms-in-a-partial-fraction-decomposition",
"href": "integrals/partial_fractions.html#integrating-the-terms-in-a-partial-fraction-decomposition",
"title": "40  Partial Fractions",
"section": "40.2 Integrating the terms in a partial fraction decomposition",
"text": "40.2 Integrating the terms in a partial fraction decomposition\nWe discuss, by example, how each type of possible term in a partial fraction decomposition has an antiderivative. Hence, rational functions will always have an antiderivative that can be computed.\n\n40.2.1 Linear factors\nFor \\(j=1\\), if \\(q_i\\) is linear, then \\(a_{ij}/q_i^j\\) must look like a constant over a linear term, or something like:\n\np = a/(x-c)\n\n \n\\[\n\\frac{a}{- c + x}\n\\]\n\n\n\nThis has a logarithmic antiderivative:\n\nintegrate(p, x)\n\n \n\\[\na \\log{\\left(- c + x \\right)}\n\\]\n\n\n\nFor \\(j > 1\\), we have powers.\n\n@syms j::positive\nintegrate(a/(x-c)^j, x)\n\n \n\\[\na \\left(\\begin{cases} \\frac{\\left(- c + x\\right)^{1 - j}}{1 - j} & \\text{for}\\: j \\neq 1 \\\\\\log{\\left(- c + x \\right)} & \\text{otherwise} \\end{cases}\\right)\n\\]\n\n\n\n\n\n40.2.2 Quadratic factors\nWhen \\(q_i\\) is quadratic, it looks like \\(ax^2 + bx + c\\). Then \\(a_{ij}\\) can be a constant or a linear polynomial. The latter can be written as \\(Ax + B\\).\nThe integral of the following general form is presented below:\n\\[\n\\frac{Ax + B}{(ax^2 + bx + c)^j}.\n\\]\nWith SymPy, we consider a few cases of the following form, which results from a shift of \\(x\\):\n\\[\n\\frac{Ax + B}{((ax)^2 \\pm 1)^j}\n\\]\nThis can be done by finding a \\(d\\) so that \\(a(x-d)^2 + b(x-d) + c = dx^2 + e = e\\left(\\left(\\sqrt{d/e}\\, x\\right)^2 \\pm 1\\right)\\).\nThe integrals of the type \\(Ax/((ax)^2 \\pm 1)^j\\) can be completed by \\(u\\)-substitution, with \\(u=(ax)^2 \\pm 1\\).\nFor example,\n\nintegrate(A*x/((a*x)^2 + 1)^4, x)\n\n \n\\[\n- \\frac{A}{6 a^{8} x^{6} + 18 a^{6} x^{4} + 18 a^{4} x^{2} + 6 a^{2}}\n\\]\n\n\n\nThe integrals of the type \\(B/((ax)^2\\pm 1)^j\\) are completed by trigonometric substitution and various reduction formulas. They can get involved, but are tractable. 
For example:\n\nintegrate(B/((a*x)^2 + 1)^4, x)\n\n \n\\[\nB \\left(\\frac{15 a^{4} x^{5} + 40 a^{2} x^{3} + 33 x}{48 a^{6} x^{6} + 144 a^{4} x^{4} + 144 a^{2} x^{2} + 48} + \\frac{5 \\operatorname{atan}{\\left(a x \\right)}}{16 a}\\right)\n\\]\n\n\n\nand\n\nintegrate(B/((a*x)^2 - 1)^4, x)\n\n \n\\[\nB \\left(\\frac{- 15 a^{4} x^{5} + 40 a^{2} x^{3} - 33 x}{48 a^{6} x^{6} - 144 a^{4} x^{4} + 144 a^{2} x^{2} - 48} - \\frac{5 \\log{\\left(x - \\frac{1}{a} \\right)}}{32 a} + \\frac{5 \\log{\\left(x + \\frac{1}{a} \\right)}}{32 a}\\right)\n\\]\n\n\n\n\nIn Bronstein this characterization can be found - “This method, which dates back to Newton, Leibniz and Bernoulli, should not be used in practice, yet it remains the method found in most calculus texts and is often taught. Its major drawback is the factorization of the denominator of the integrand over the real or complex numbers.” We can also find the following formulas which formalize the above exploratory calculations (\\(j>1\\) and \\(b^2 - 4c < 0\\) below):\n\\[\n\\begin{align*}\n\\int \\frac{A}{(x-a)^j} &= \\frac{A}{1-j}\\frac{1}{(x-a)^{j-1}}\\\\\n\\int \\frac{A}{x-a} &= A\\log(x-a)\\\\\n\\int \\frac{Bx+C}{x^2 + bx + c} &= \\frac{B}{2} \\log(x^2 + bx + c) + \\frac{2C-bB}{\\sqrt{4c-b^2}}\\cdot \\arctan\\left(\\frac{2x+b}{\\sqrt{4c-b^2}}\\right)\\\\\n\\int \\frac{Bx+C}{(x^2 + bx + c)^j} &= \\frac{B' x + C'}{(x^2 + bx + c)^{j-1}} + \\int \\frac{C''}{(x^2 + bx + c)^{j-1}}\n\\end{align*}\n\\]\nThe first returns a rational function; the second yields a logarithm term; the third yields a logarithm and an arctangent term; while the last, which has explicit constants available, provides a reduction that can be recursively applied.\nThat is, integrating \\(f(x)/g(x)\\), a rational function, will yield an output that looks like the following, where the functions are polynomials:\n\\[\n\\int f(x)/g(x) \\, dx = P(x) + \\frac{C(x)}{D(x)} + \\sum v_i \\log(V_i(x)) + \\sum w_j \\arctan(W_j(x))\n\\]\n(Bronstein also sketches the modern 
method which is to use a Hermite reduction to express \\(\\int (f/g) dx = p/q + \\int (g/h) dx\\), where \\(h\\) is square free (the “j” are all \\(1\\)). The latter can be written over the complex numbers as logarithmic terms of the form \\(\\log(x-a)\\), the “a”s found following a method due to Trager and Lazard, and Rioboo, which is mentioned in the SymPy documentation as the method used.)\n\nExamples\nFind an antiderivative for \\(1/(x\\cdot(x^2+1)^2)\\).\nA partial fraction decomposition is:\n\nq = (x * (x^2 + 1)^2)\napart(1/q)\n\n \n\\[\n- \\frac{x}{x^{2} + 1} - \\frac{x}{\\left(x^{2} + 1\\right)^{2}} + \\frac{1}{x}\n\\]\n\n\n\nWe see three terms. The first and second will be done by \\(u\\)-substitution, the third by a logarithm:\n\nintegrate(1/q, x)\n\n \n\\[\n\\log{\\left(x \\right)} - \\frac{\\log{\\left(x^{2} + 1 \\right)}}{2} + \\frac{1}{2 x^{2} + 2}\n\\]\n\n\n\n\nFind an antiderivative of \\(1/(x^2 - 2x-3)\\).\nWe again just let SymPy do the work. A partial fraction decomposition is given by:\n\n𝒒 = (x^2 - 2x - 3)\napart(1/𝒒)\n\n \n\\[\n- \\frac{1}{4 \\left(x + 1\\right)} + \\frac{1}{4 \\left(x - 3\\right)}\n\\]\n\n\n\nWe see what should yield two logarithmic terms:\n\nintegrate(1/𝒒, x)\n\n \n\\[\n\\frac{\\log{\\left(x - 3 \\right)}}{4} - \\frac{\\log{\\left(x + 1 \\right)}}{4}\n\\]\n\n\n\n\n\n\n\n\n\nNote\n\n\n\nSymPy will find \\(\\log(x)\\) as an antiderivative for \\(1/x\\), but more generally, \\(\\log(\\lvert x\\rvert)\\) is one.\n\n\n\nExample\nThe answers found can become quite involved. 
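Any antiderivative SymPy returns can be verified by differentiating; a minimal sketch using the first example above:

```julia
using SymPy

@syms x
q = x * (x^2 + 1)^2
F = integrate(1 / q, x)
simplify(diff(F, x) - 1 / q)   # a correct antiderivative simplifies to 0
```

Checking by differentiation is often far easier than producing the antiderivative in the first place. 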
Corless, Moir, Maza, and Xie use this example which at first glance seems tame enough:\n\nex = (x^2 - 1) / (x^4 + 5x^2 + 7)\n\n \n\\[\n\\frac{x^{2} - 1}{x^{4} + 5 x^{2} + 7}\n\\]\n\n\n\nBut the integral is something best suited to a computer algebra system:\n\nintegrate(ex, x)\n\n \n\\[\n\\sqrt{\\frac{17}{84} + \\frac{13 \\sqrt{7}}{168}} \\log{\\left(x^{2} + x \\left(- \\frac{38 \\sqrt{6} \\sqrt{34 + 13 \\sqrt{7}}}{9} - \\frac{1301 \\sqrt{42} \\sqrt{34 + 13 \\sqrt{7}}}{1638} + \\frac{19 \\sqrt{42} \\sqrt{34 + 13 \\sqrt{7}} \\sqrt{884 \\sqrt{7} + 2339}}{546}\\right) - \\frac{2124092 \\sqrt{884 \\sqrt{7} + 2339}}{31941} - \\frac{9481 \\sqrt{7} \\sqrt{884 \\sqrt{7} + 2339}}{378} + \\frac{290246555}{63882} + \\frac{4221850 \\sqrt{7}}{2457} \\right)} - \\sqrt{\\frac{17}{84} + \\frac{13 \\sqrt{7}}{168}} \\log{\\left(x^{2} + x \\left(- \\frac{19 \\sqrt{42} \\sqrt{34 + 13 \\sqrt{7}} \\sqrt{884 \\sqrt{7} + 2339}}{546} + \\frac{1301 \\sqrt{42} \\sqrt{34 + 13 \\sqrt{7}}}{1638} + \\frac{38 \\sqrt{6} \\sqrt{34 + 13 \\sqrt{7}}}{9}\\right) - \\frac{2124092 \\sqrt{884 \\sqrt{7} + 2339}}{31941} - \\frac{9481 \\sqrt{7} \\sqrt{884 \\sqrt{7} + 2339}}{378} + \\frac{290246555}{63882} + \\frac{4221850 \\sqrt{7}}{2457} \\right)} + 2 \\sqrt{- \\frac{\\sqrt{884 \\sqrt{7} + 2339}}{84} + \\frac{17}{84} + \\frac{13 \\sqrt{7}}{56}} \\operatorname{atan}{\\left(\\frac{78 \\sqrt{42} x}{- 9 \\sqrt{- 2 \\sqrt{884 \\sqrt{7} + 2339} + 34 + 39 \\sqrt{7}} + 19 \\sqrt{884 \\sqrt{7} + 2339} \\sqrt{- 2 \\sqrt{884 \\sqrt{7} + 2339} + 34 + 39 \\sqrt{7}}} - \\frac{988 \\sqrt{7} \\sqrt{34 + 13 \\sqrt{7}}}{- 9 \\sqrt{- 2 \\sqrt{884 \\sqrt{7} + 2339} + 34 + 39 \\sqrt{7}} + 19 \\sqrt{884 \\sqrt{7} + 2339} \\sqrt{- 2 \\sqrt{884 \\sqrt{7} + 2339} + 34 + 39 \\sqrt{7}}} - \\frac{1301 \\sqrt{34 + 13 \\sqrt{7}}}{- 9 \\sqrt{- 2 \\sqrt{884 \\sqrt{7} + 2339} + 34 + 39 \\sqrt{7}} + 19 \\sqrt{884 \\sqrt{7} + 2339} \\sqrt{- 2 \\sqrt{884 \\sqrt{7} + 2339} + 34 + 39 \\sqrt{7}}} + \\frac{57 \\sqrt{34 + 13 \\sqrt{7}} \\sqrt{884 
\\sqrt{7} + 2339}}{- 9 \\sqrt{- 2 \\sqrt{884 \\sqrt{7} + 2339} + 34 + 39 \\sqrt{7}} + 19 \\sqrt{884 \\sqrt{7} + 2339} \\sqrt{- 2 \\sqrt{884 \\sqrt{7} + 2339} + 34 + 39 \\sqrt{7}}} \\right)} + 2 \\sqrt{- \\frac{\\sqrt{884 \\sqrt{7} + 2339}}{84} + \\frac{17}{84} + \\frac{13 \\sqrt{7}}{56}} \\operatorname{atan}{\\left(\\frac{78 \\sqrt{42} x}{- 9 \\sqrt{- 2 \\sqrt{884 \\sqrt{7} + 2339} + 34 + 39 \\sqrt{7}} + 19 \\sqrt{884 \\sqrt{7} + 2339} \\sqrt{- 2 \\sqrt{884 \\sqrt{7} + 2339} + 34 + 39 \\sqrt{7}}} - \\frac{57 \\sqrt{34 + 13 \\sqrt{7}} \\sqrt{884 \\sqrt{7} + 2339}}{- 9 \\sqrt{- 2 \\sqrt{884 \\sqrt{7} + 2339} + 34 + 39 \\sqrt{7}} + 19 \\sqrt{884 \\sqrt{7} + 2339} \\sqrt{- 2 \\sqrt{884 \\sqrt{7} + 2339} + 34 + 39 \\sqrt{7}}} + \\frac{1301 \\sqrt{34 + 13 \\sqrt{7}}}{- 9 \\sqrt{- 2 \\sqrt{884 \\sqrt{7} + 2339} + 34 + 39 \\sqrt{7}} + 19 \\sqrt{884 \\sqrt{7} + 2339} \\sqrt{- 2 \\sqrt{884 \\sqrt{7} + 2339} + 34 + 39 \\sqrt{7}}} + \\frac{988 \\sqrt{7} \\sqrt{34 + 13 \\sqrt{7}}}{- 9 \\sqrt{- 2 \\sqrt{884 \\sqrt{7} + 2339} + 34 + 39 \\sqrt{7}} + 19 \\sqrt{884 \\sqrt{7} + 2339} \\sqrt{- 2 \\sqrt{884 \\sqrt{7} + 2339} + 34 + 39 \\sqrt{7}}} \\right)}\n\\]"
},
{
"objectID": "integrals/partial_fractions.html#questions",
"href": "integrals/partial_fractions.html#questions",
"title": "40  Partial Fractions",
"section": "40.3 Questions",
"text": "40.3 Questions\n\nQuestion\nThe partial fraction decomposition of \\(1/(x(x-1))\\) must be of the form \\(A/x + B/(x-1)\\).\nWhat is \\(A\\)? (Use SymPy or just put the sum over a common denominator and solve for \\(A\\) and \\(B\\).)\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is \\(B\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe following gives the partial fraction decomposition for a rational expression:\n\\[\n\\frac{3x+5}{(1-2x)^2} = \\frac{A}{1-2x} + \\frac{B}{(1-2x)^2}.\n\\]\nFind \\(A\\) (being careful with the sign):\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nFind \\(B\\):\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe following specifies the general partial fraction decomposition for a rational expression:\n\\[\n\\frac{1}{(x+1)(x-1)^2} = \\frac{A}{x+1} + \\frac{B}{x-1} + \\frac{C}{(x-1)^2}.\n\\]\nFind \\(A\\):\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nFind \\(B\\):\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nFind \\(C\\):\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute the following exactly:\n\\[\n\\int_0^1 \\frac{(x-2)(x-3)}{(x-4)^2\\cdot(x-5)} dx\n\\]\nIs \\(-6\\log(5) - 5\\log(3) - 1/6 + 11\\log(4)\\) the answer?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIn the assumptions for the partial fraction decomposition is the fact that \\(p(x)\\) and \\(q(x)\\) share no common factors. 
Suppose, this isnt the case and in fact we have:\n\\[\n\\frac{p(x)}{q(x)} = \\frac{(x-c)^m s(x)}{(x-c)^n t(x)}.\n\\]\nHere \\(s\\) and \\(t\\) are polynomials such that \\(s(c)\\) and \\(t(c)\\) are non-zero.\nIf \\(m > n\\), then why can we cancel out the \\((x-c)^n\\) and not have a concern?\n\n\n\n \n \n \n \n \n \n \n \n \n SymPy allows it.\n \n \n\n\n \n \n \n \n The value \\(c\\) is a removable singularity, so the integral will be identical.\n \n \n\n\n \n \n \n \n The resulting function has an identical domain and is equivalent for all \\(x\\).\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIf \\(m = n\\), then why can we cancel out the \\((x-c)^n\\) and not have a concern?\n\n\n\n \n \n \n \n \n \n \n \n \n SymPy allows it.\n \n \n\n\n \n \n \n \n The value \\(c\\) is a removable singularity, so the integral will be identical.\n \n \n\n\n \n \n \n \n The resulting function has an identical domain and is equivalent for all \\(x\\).\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIf \\(m < n\\), then why can we cancel out the \\((x-c)^n\\) and not have a concern?\n\n\n\n \n \n \n \n \n \n \n \n \n SymPy allows it.\n \n \n\n\n \n \n \n \n The value \\(c\\) is a removable singularity, so the integral will be identical.\n \n \n\n\n \n \n \n \n The resulting function has an identical domain and is equivalent for all \\(x\\).\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe partial fraction decomposition, as presented, factors the denominator polynomial into linear and quadratic factors over the real numbers. 
Alternatively, factoring over the complex numbers is possible, resulting in terms like:\n\\[\n\\frac{a + ib}{x - (\\alpha + i \\beta)} + \\frac{a - ib}{x - (\\alpha - i \\beta)}\n\\]\nHow to see that these give rise to real answers on integration is the point of this question.\nBreaking the terms up over \\(a\\) and \\(b\\) we have:\n\\[\n\\begin{align*}\nI &= \\frac{a}{x - (\\alpha + i \\beta)} + \\frac{a}{x - (\\alpha - i \\beta)} \\\\\nII &= i\\frac{b}{x - (\\alpha + i \\beta)} - i\\frac{b}{x - (\\alpha - i \\beta)}\n\\end{align*}\n\\]\nIntegrating \\(I\\) leads to two logarithmic terms, which are combined to give:\n\\[\n\\int I dx = a\\cdot \\log((x-(\\alpha+i\\beta)) \\cdot (x - (\\alpha-i\\beta)))\n\\]\nThis involves no complex numbers, as:\n\n\n\n \n \n \n \n \n \n \n \n \n The \\(\\beta\\) are \\(0\\), as the polynomials in question are real\n \n \n\n\n \n \n \n \n The complex numbers are complex conjugates, so the term in the logarithm will simply be \\(x - 2\\alpha x + \\alpha^2 + \\beta^2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe term \\(II\\) benefits from this computation (attributed to Rioboo by Corless et. al)\n\\[\n\\frac{d}{dx} i \\log(\\frac{X+iY}{X-iY}) = 2\\frac{d}{dx}\\arctan(\\frac{X}{Y})\n\\]\nApplying this with \\(X=x - \\alpha\\) and \\(Y=-\\beta\\) shows that \\(\\int II dx\\) will be\n\n\n\n \n \n \n \n \n \n \n \n \n \\(-2b\\arctan((x - \\alpha)/(\\beta))\\)\n \n \n\n\n \n \n \n \n \\(2b\\sec^2(-(x-\\alpha)/(-\\beta))\\)"
},
{
"objectID": "integrals/improper_integrals.html",
"href": "integrals/improper_integrals.html",
"title": "41  Improper Integrals",
"section": "",
"text": "This section uses these add-on packages:\nA function \\(f(x)\\) is Riemann integrable over an interval \\([a,b]\\) if some limit involving Riemann sums exists. This limit will fail to exist if \\(f(x)\\) is unbounded on \\([a,b]\\). As well, the Riemann sum idea is undefined if either \\(a\\) or \\(b\\) (or both) are infinite, so the limit won't exist in this case.\nTo define integrals with either functions having singularities or infinite domains, the idea of an improper integral is introduced with definitions to handle the two cases above."
},
{
"objectID": "integrals/improper_integrals.html#infinite-domains",
"href": "integrals/improper_integrals.html#infinite-domains",
"title": "41  Improper Integrals",
"section": "41.1 Infinite domains",
"text": "41.1 Infinite domains\nLet \\(f(x)\\) be a reasonable function, so reasonable that for any \\(a < b\\) the function is Riemann integrable, meaning \\(\\int_a^b f(x)dx\\) exists.\nWhat needs to be the case so that we can discuss the integral over the entire real number line?\nClearly something. The function \\(f(x) = 1\\) is reasonable by the idea above. Clearly the integral over any \\([a,b]\\) is just \\(b-a\\), but the limit over an unbounded domain would be \\(\\infty\\). Even though limits of infinity can be of interest in some cases, not so here. What will ensure that the area is finite over an infinite region?\nOr is that even the right question? Now consider \\(f(x) = \\sin(\\pi x)\\). Over every interval of the type \\([-2n, 2n]\\) the area is \\(0\\), and over any interval \\([a,b]\\) the area never gets bigger than \\(2\\). But still this function does not have a well defined area on an infinite domain.\nThe right question involves a limit. Fix a finite \\(a\\). We define the definite integral over \\([a,\\infty)\\) to be\n\\[\n\\int_a^\\infty f(x) dx = \\lim_{M \\rightarrow \\infty} \\int_a^M f(x) dx,\n\\]\nwhen the limit exists. Similarly, we define the definite integral over \\((-\\infty, a]\\) through\n\\[\n\\int_{-\\infty}^a f(x) dx = \\lim_{M \\rightarrow -\\infty} \\int_M^a f(x) dx.\n\\]\nFor the interval \\((-\\infty, \\infty)\\) we need both of these limits to exist, and then:\n\\[\n\\int_{-\\infty}^\\infty f(x) dx = \\lim_{M \\rightarrow -\\infty} \\int_M^a f(x) dx + \\lim_{M \\rightarrow \\infty} \\int_a^M f(x) dx.\n\\]\n\n\n\n\n\n\nNote\n\n\n\nWhen the integral exists, it is said to converge. 
If it doesn't exist, it is said to diverge.\n\n\nExamples\n\nThe function \\(f(x) = 1/x^2\\) is integrable over \\([1, \\infty)\\), as this limit exists:\n\n\\[\n\\lim_{M \\rightarrow \\infty} \\int_1^M \\frac{1}{x^2}dx = \\lim_{M \\rightarrow \\infty} -\\frac{1}{x}\\big|_1^M\n= \\lim_{M \\rightarrow \\infty} 1 - \\frac{1}{M} = 1.\n\\]\n\nThe function \\(f(x) = 1/x^{1/2}\\) is not integrable over \\([1, \\infty)\\), as this limit fails to exist:\n\n\\[\n\\lim_{M \\rightarrow \\infty} \\int_1^M \\frac{1}{x^{1/2}}dx = \\lim_{M \\rightarrow \\infty} \\frac{x^{1/2}}{1/2}\\big|_1^M\n= \\lim_{M \\rightarrow \\infty} 2\\sqrt{M} - 2 = \\infty.\n\\]\nThe limit is infinite, so does not exist except in an extended sense.\n\nThe function \\(x^n e^{-x}\\) for \\(n = 1, 2, \\dots\\) is integrable over \\([0,\\infty)\\).\n\nBefore showing this, we recall the fundamental theorem of calculus. The limit existing is the same as saying the limit of \\(F(M) - F(a)\\) exists for an antiderivative of \\(f(x)\\).\nFor this particular problem, it can be shown by integration by parts that for positive integer values of \\(n\\) an antiderivative exists of the form \\(F(x) = p(x)e^{-x}\\), where \\(p(x)\\) is a polynomial of degree \\(n\\). But we've seen that for any \\(n>0\\), \\(\\lim_{x \\rightarrow \\infty} x^n e^{-x} = 0\\), so the same is true for any polynomial. So, \\(\\lim_{M \\rightarrow \\infty} F(M) - F(0) = -F(0)\\).\n\nThe function \\(e^x\\) is integrable over \\((-\\infty, a]\\) but not \\([a, \\infty)\\) for any finite \\(a\\). This is because an antiderivative is \\(F(x) = e^x\\), and this has a limit as \\(x\\) goes to \\(-\\infty\\), but not \\(\\infty\\).\n\nLet \\(f(x) = x e^{-x^2}\\). This function has an integral over \\([0, \\infty)\\) and more generally \\((-\\infty, \\infty)\\). To see this, we note that as it is an odd function, the area from \\(0\\) to \\(M\\) is the opposite sign of that from \\(-M\\) to \\(0\\). 
So \\(\\lim_{M \\rightarrow \\infty} (F(M) - F(0)) = \\lim_{M \\rightarrow -\\infty} (F(0) - (-F(\\lvert M\\rvert)))\\). We only then need to investigate the one limit. But we can see by substitution with \\(u=x^2\\), that an antiderivative is \\(F(x) = (-1/2) \\cdot e^{-x^2}\\). Clearly, \\(\\lim_{M \\rightarrow \\infty}F(M) = 0\\), so the answer is well defined, and the area from \\(0\\) to \\(\\infty\\) is just \\(1/2\\). From \\(-\\infty\\) to \\(0\\) it is \\(-1/2\\) and the total area is \\(0\\), as the two sides “cancel” out.\nLet \\(f(x) = \\sin(x)\\). Even though \\(\\lim_{M \\rightarrow \\infty} (F(M) - F(-M) ) = 0\\), this function is not integrable. The fact is we need both the limits \\(F(M)\\) and \\(F(-M)\\) to exist as \\(M\\) goes to \\(\\infty\\). In this case, even though the area cancels if \\(\\infty\\) is approached at the same rate, this isn't sufficient to guarantee the two limits exist independently.\nWill the function \\(f(x) = 1/(x\\cdot(\\log(x))^2)\\) have an integral over \\([e, \\infty)\\)?\n\nWe first find an antiderivative using the \\(u\\)-substitution \\(u(x) = \\log(x)\\):\n\\[\n\\int_e^M \\frac{1}{x \\log(x)^{2}} dx\n= \\int_{\\log(e)}^{\\log(M)} \\frac{1}{u^{2}} du\n= \\frac{-1}{u} \\big|_{1}^{\\log(M)}\n= \\frac{-1}{\\log(M)} - \\frac{-1}{1}\n= 1 - \\frac{1}{\\log(M)}.\n\\]\nAs \\(M\\) goes to \\(\\infty\\), this will converge to \\(1\\).\n\nThe sinc function \\(f(x) = \\sin(\\pi x)/(\\pi x)\\) does not have a nice antiderivative. Seeing if the limit exists is a bit of a problem. However, this function is important enough that there is a built-in function, Si, that computes \\(\\int_0^x \\sin(u)/u\\cdot du\\). This function can be used through sympy.Si(...):\n\n\n@syms M\nlimit(sympy.Si(M), M => oo)\n\n \n\\[\n\\frac{\\pi}{2}\n\\]\n\n\n\n\n\n41.1.1 Numeric integration\nThe quadgk function (available through QuadGK) is able to accept Inf and -Inf as endpoints of the interval. 
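This gives a quick numeric check of the \(1/(x\cdot(\log(x))^2)\) computation above; a minimal sketch:

```julia
using QuadGK

# the improper integral computed above: the area over [e, oo) should be 1
f(x) = 1 / (x * log(x)^2)
val, err = quadgk(f, exp(1), Inf)
val   # ≈ 1
```

The second returned value, `err`, is an estimate of the numeric error. 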
For example, this will integrate \\(e^{-x^2/2}\\) over the real line:\n\nf(x) = exp(-x^2/2)\nquadgk(f, -Inf, Inf)\n\n(2.506628274639168, 3.608438072243189e-8)\n\n\n(It may not be obvious, but this is \\(\\sqrt{2\\pi}\\).)"
},
{
"objectID": "integrals/improper_integrals.html#singularities",
"href": "integrals/improper_integrals.html#singularities",
"title": "41  Improper Integrals",
"section": "41.2 Singularities",
"text": "41.2 Singularities\nSuppose \\(\\lim_{x \\rightarrow c}f(x) = \\infty\\) or \\(-\\infty\\). Then a Riemann sum that contains an interval including \\(c\\) will not be finite if the point chosen in the interval is \\(c\\). Though we could choose another point, this is not enough, as the definition must hold for any choice of the \\(c_i\\).\nHowever, if \\(c\\) is isolated, we can get close to \\(c\\) and see how the area changes.\nSuppose \\(a < c\\). We define \\(\\int_a^c f(x) dx = \\lim_{M \\rightarrow c-} \\int_a^M f(x) dx\\). If this limit exists, the definite integral with \\(c\\) is well defined. Similarly, the integral from \\(c\\) to \\(b\\), where \\(b > c\\), can be defined by a right limit going to \\(c\\). The integral from \\(a\\) to \\(b\\) will exist if both the limits are finite.\n\nExamples\n\nConsider the example of the initial illustration, \\(f(x) = 1/\\sqrt{x}\\) at \\(0\\). Here \\(f(0)= \\infty\\), so the usual notion of a limit won't apply to \\(\\int_0^1 f(x) dx\\). However,\n\n\\[\n\\lim_{M \\rightarrow 0+} \\int_M^1 \\frac{1}{\\sqrt{x}} dx\n= \\lim_{M \\rightarrow 0+} \\frac{\\sqrt{x}}{1/2} \\big|_M^1\n= \\lim_{M \\rightarrow 0+} 2(1) - 2\\sqrt{M} = 2.\n\\]\n\n\n\n\n\n\nNote\n\n\n\nThe cases \\(f(x) = x^{-n}\\) for \\(n > 0\\) are tricky to keep straight. For \\(n > 1\\), the functions can be integrated over \\([1,\\infty)\\), but not \\((0,1]\\). For \\(0 < n < 1\\), the functions can be integrated over \\((0,1]\\) but not \\([1, \\infty)\\).\n\n\n\nNow consider \\(f(x) = 1/x\\). Is this integral \\(\\int_0^1 1/x \\cdot dx\\) defined? 
It will be if this limit exists:\n\n\\[\n\\lim_{M \\rightarrow 0+} \\int_M^1 \\frac{1}{x} dx\n= \\lim_{M \\rightarrow 0+} \\log(x) \\big|_M^1\n= \\lim_{M \\rightarrow 0+} \\log(1) - \\log(M) = \\infty.\n\\]\nAs the limit does not exist, the function is not integrable around \\(0\\).\n\nSymPy may give answers which do not coincide with our definitions, as it uses complex numbers as a default assumption. In this case it returns the proper answer when integrated from \\(0\\) to \\(1\\) and NaN for an integral over \\((-1,1)\\):\n\n\n@syms x\nintegrate(1/x, (x, 0, 1)), integrate(1/x, (x, -1, 1))\n\n(oo, nan)\n\n\n\nSuppose you know \\(\\int_1^\\infty x^2 f(x) dx\\) exists. Does this imply \\(\\int_0^1 f(1/x) dx\\) exists?\n\nWe need to consider the limit of \\(\\int_M^1 f(1/x) dx\\). We try the \\(u\\)-substitution \\(u(x) = 1/x\\). This gives \\(du = -(1/x^2)dx = -u^2 dx\\), so \\(dx = -du/u^2\\). The substitution becomes:\n\\[\n\\int_M^1 f(1/x) dx = \\int_{1/M}^{1} f(u) \\left(-\\frac{1}{u^2}\\right) du = \\int_1^{1/M} \\frac{f(u)}{u^2} du.\n\\]\nAs \\(M \\rightarrow 0+\\), \\(1/M \\rightarrow \\infty\\). For \\(u \\geq 1\\) and nonnegative \\(f\\), \\(f(u)/u^2 \\leq u^2 f(u)\\), so the right side converges by comparison with the assumed integral. Thus we get \\(f(1/x)\\) is integrable over \\((0,1]\\).\n\n\n41.2.1 Numeric integration\nSo far our use of the quadgk function specified the region to integrate via a, b, as in quadgk(f, a, b). In fact, it can specify values in between for which the function should not be sampled. For example, were we to integrate \\(1/\\sqrt{\\lvert x\\rvert}\\) over \\([-1,1]\\), we would want to avoid \\(0\\) as a point to sample. Here is how:\n\nf(x) = 1 / sqrt(abs(x))\nquadgk(f, -1, 0, 1)\n\n(3.999999962817228, 5.736423067171012e-8)\n\n\nJust trying quadgk(f, -1, 1) leads to a DomainError, as 0 will be one of the points sampled. The general call is like quadgk(f, a, b, c, d,...) which integrates over \\((a,b)\\) and \\((b,c)\\) and \\((c,d)\\), \\(\\dots\\). The algorithm is not supposed to evaluate the function at the endpoints of the intervals."
},
{
"objectID": "integrals/improper_integrals.html#probability-applications",
"href": "integrals/improper_integrals.html#probability-applications",
"title": "41  Improper Integrals",
"section": "41.3 Probability applications",
"text": "41.3 Probability applications\nA probability density is a function \\(f(x) \\geq 0\\) which is integrable on \\((-\\infty, \\infty)\\) and for which \\(\\int_{-\\infty}^\\infty f(x) dx =1\\). The cumulative distribution function is defined by \\(F(x)=\\int_{-\\infty}^x f(u) du\\).\nProbability densities are a good example of using improper integrals.\n\nShow that \\(f(x) = (1/\\pi) (1/(1 + x^2))\\) is a probability density function.\n\nWe need to show that the integral exists and is \\(1\\). For this, we use the fact that \\((1/\\pi) \\cdot \\tan^{-1}(x)\\) is an antiderivative. Then we have:\n\\[\n\\lim_{M \\rightarrow \\infty} F(M) = (1/\\pi) \\cdot \\pi/2\n\\]\nand as \\(\\tan^{-1}(x)\\) is odd, we must have \\(F(-\\infty) = \\lim_{M \\rightarrow -\\infty} F(M) = -(1/\\pi) \\cdot \\pi/2\\). All told, \\(F(\\infty) - F(-\\infty) = 1/2 - (-1/2) = 1\\).\n\nShow that \\(f(x) = 1/(b-a)\\) for \\(a \\leq x \\leq b\\) and \\(0\\) otherwise is a probability density.\n\nThe integral from \\(-\\infty\\) to \\(a\\) of \\(f(x)\\) is just an integral of the constant \\(0\\), so will be \\(0\\). (This is the only constant with finite area over an infinite domain.) Similarly, the integral from \\(b\\) to \\(\\infty\\) will be \\(0\\). This means:\n\\[\n\\int_{-\\infty}^\\infty f(x) dx = \\int_a^b \\frac{1}{b-a} dx = 1.\n\\]\n(One might also comment that \\(f\\) is Riemann integrable on any \\([-M,M]\\) despite being discontinuous at \\(a\\) and \\(b\\).)\n\nShow that if \\(f(x)\\) is a probability density then so is \\(f(x-c)\\) for any \\(c\\).\n\nWe have by the \\(u\\)-substitution\n\\[\n\\int_{-\\infty}^\\infty f(x-c)dx = \\int_{u(-\\infty)}^{u(\\infty)} f(u) du = \\int_{-\\infty}^\\infty f(u) du = 1.\n\\]\nThe key is that we can use the regular \\(u\\)-substitution formula provided \\(\\lim_{M \\rightarrow \\infty} u(M) = u(\\infty)\\) is defined. 
(The informal notation \\(u(\\infty)\\) is defined by that limit.)\n\nIf \\(f(x)\\) is a probability density, then so is \\((1/h) f((x-c)/h)\\) for any \\(c, h > 0\\).\n\nAgain, by a \\(u\\) substitution with, now, \\(u(x) = (x-c)/h\\), we have \\(du = (1/h) \\cdot dx\\) and the result follows just as before:\n\\[\n\\int_{-\\infty}^\\infty \\frac{1}{h}f(\\frac{x-c}{h})dx = \\int_{u(-\\infty)}^{u(\\infty)} f(u) du = \\int_{-\\infty}^\\infty f(u) du = 1.\n\\]\n\nIf \\(F(x) = 1 - e^{-x}\\), for \\(x \\geq 0\\), and \\(0\\) otherwise, find \\(f(x)\\).\n\nWe want to just say \\(F'(x)= e^{-x}\\) so \\(f(x) = e^{-x}\\). But some care is needed. First, that isn't right. The derivative for \\(x<0\\) of \\(F(x)\\) is \\(0\\), so \\(f(x) = 0\\) if \\(x < 0\\). What about for \\(x>0\\)? The derivative is \\(e^{-x}\\), but is that the right answer? \\(F(x) = \\int_{-\\infty}^x f(u) du\\), so we have to at least discuss if the \\(-\\infty\\) affects things. In this case, and in general, the answer is no. For any \\(x\\) we can find \\(M < x\\) so that we have \\(F(x) = \\int_{-\\infty}^M f(u) du + \\int_M^x f(u) du\\). The first part is a constant, so will have derivative \\(0\\); the second will have derivative \\(f(x)\\), if the derivative exists (and it will exist at \\(x\\) if \\(f\\) is continuous in a neighborhood of \\(x\\)).\nFinally, at \\(x=0\\) we have an issue, as \\(F'(0)\\) does not exist. The left limit of the secant line approximation is \\(0\\), the right limit of the secant line approximation is \\(1\\). So, we can take \\(f(x) = e^{-x}\\) for \\(x > 0\\) and \\(0\\) otherwise, noting that redefining \\(f(x)\\) at a point will not affect the integral as long as the point is finite."
},
{
"objectID": "integrals/improper_integrals.html#questions",
"href": "integrals/improper_integrals.html#questions",
"title": "41  Improper Integrals",
"section": "41.4 Questions",
"text": "41.4 Questions\n\nQuestion\nIs \\(f(x) = 1/x^{100}\\) integrable around \\(0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIs \\(f(x) = 1/x^{1/3}\\) integrable around \\(0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIs \\(f(x) = x\\cdot\\log(x)\\) integrable on \\([1,\\infty)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIs \\(f(x) = \\log(x)/ x\\) integrable on \\([1,\\infty)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIs \\(f(x) = \\log(x)\\) integrable on \\([1,\\infty)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute the integral \\(\\int_0^\\infty 1/(1+x^2) dx\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute the the integral \\(\\int_1^\\infty \\log(x)/x^2 dx\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute the integral \\(\\int_0^2 (x-1)^{2/3} dx\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFrom the relationship that if \\(0 \\leq f(x) \\leq g(x)\\) then \\(\\int_a^b f(x) dx \\leq \\int_a^b g(x) dx\\) it can be deduced that\n\nif \\(\\int_a^\\infty f(x) dx\\) diverges, then so does \\(\\int_a^\\infty g(x) dx\\).\nif \\(\\int_a^\\infty g(x) dx\\) converges, then so does \\(\\int_a^\\infty f(x) dx\\).\n\nLet \\(f(x) = \\lvert \\sin(x)/x^2 \\rvert\\).\nWhat can you say about \\(\\int_1^\\infty f(x) dx\\), as \\(f(x) \\leq 1/x^2\\) on \\([1, \\infty)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n It is convergent\n \n \n\n\n \n \n \n \n It is divergent\n \n \n\n\n \n \n \n \n Can't say\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nLet \\(f(x) = \\lvert \\sin(x) \\rvert / 
x\\).\nWhat can you say about \\(\\int_1^\\infty f(x) dx\\), as \\(f(x) \\leq 1/x\\) on \\([1, \\infty)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n It is convergent\n \n \n\n\n \n \n \n \n It is divergent\n \n \n\n\n \n \n \n \n Can't say\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nLet \\(f(x) = 1/\\sqrt{x^2 - 1}\\). What can you say about \\(\\int_1^\\infty f(x) dx\\), as \\(f(x) \\geq 1/x\\) on \\([1, \\infty)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n It is convergent\n \n \n\n\n \n \n \n \n It is divergent\n \n \n\n\n \n \n \n \n Can't say\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nLet \\(f(x) = 1 + 4x^2\\). What can you say about \\(\\int_1^\\infty f(x) dx\\), as \\(f(x) \\leq 1/x^2\\) on \\([1, \\infty)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n It is convergent\n \n \n\n\n \n \n \n \n It is divergent\n \n \n\n\n \n \n \n \n Can't say\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nLet \\(f(x) = \\lvert \\sin(x)^{10}\\rvert/e^x\\). What can you say about \\(\\int_1^\\infty f(x) dx\\), as \\(f(x) \\leq e^{-x}\\) on \\([1, \\infty)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n It is convergent\n \n \n\n\n \n \n \n \n It is divergent\n \n \n\n\n \n \n \n \n Can't say\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe difference between “blowing up” at \\(0\\) versus being integrable at \\(\\infty\\) can be seen to be related through the \\(u\\)-substitution \\(u=1/x\\). With this \\(u\\)-substitution, what becomes of \\(\\int_0^1 x^{-2/3} dx\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\int_1^\\infty u^{2/3}/u^2 \\cdot du\\)\n \n \n\n\n \n \n \n \n \\(\\int_0^\\infty 1/u \\cdot du\\)\n \n \n\n\n \n \n \n \n \\(\\int_0^1 u^{2/3} \\cdot du\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe antiderivative of \\(f(x) = 1/\\pi \\cdot 1/\\sqrt{x(1-x)}\\) is \\(F(x)=(2/\\pi)\\cdot \\sin^{-1}(\\sqrt{x})\\).\nFind \\(\\int_0^1 f(x) dx\\)."
},
{
"objectID": "integrals/mean_value_theorem.html",
"href": "integrals/mean_value_theorem.html",
"title": "42  Mean value theorem for integrals",
"section": "",
"text": "This section uses these add-on packages:"
},
{
"objectID": "integrals/mean_value_theorem.html#average-value-of-a-function",
"href": "integrals/mean_value_theorem.html#average-value-of-a-function",
"title": "42  Mean value theorem for integrals",
"section": "42.1 Average value of a function",
"text": "42.1 Average value of a function\nLet \\(f(x)\\) be a continuous function over the interval \\([a,b]\\) with \\(a < b\\).\nThe average value of \\(f\\) over \\([a,b]\\) is defined by:\n\\[\n\\frac{1}{b-a} \\int_a^b f(x) dx.\n\\]\nIf \\(f\\) is a constant, this is just the contant value, as would be expected. If \\(f\\) is piecewise linear, then this is the weighted average of these constants.\n\nExamples\n\nExample: average velocity\nThe average velocity between times \\(a < b\\), is simply the change in position during the time interval divided by the change in time. In notation, this would be \\((x(b) - x(a)) / (b-a)\\). If \\(v(t) = x'(t)\\) is the velocity, then by the second part of the fundamental theorem of calculus, we have, in agreement with the definition above, that:\n\\[\n\\text{average velocity} = \\frac{x(b) - x(a)}{b-a} = \\frac{1}{b-a} \\int_a^b v(t) dt.\n\\]\nThe average speed is the change in total distance over time, which is given by\n\\[\n\\text{average speed} = \\frac{1}{b-a} \\int_a^b \\lvert v(t)\\rvert dt.\n\\]\nLet \\(\\bar{v}\\) be the average velocity. Then we have \\(\\bar{v} \\cdot(b-a) = x(b) - x(a)\\), or the change in position can be written as a constant (\\(\\bar{v}\\)) times the time, as though we had a constant velocity. This is an old intuition. 
Bressoud comments on the special case known to scholars at Merton College around \\(1350\\) that the distance traveled by an object under uniformly increasing velocity starting at \\(v_0\\) and ending at \\(v_t\\) is equal to the distance traveled by an object with constant velocity of \\((v_0 + v_t)/2\\).\n\n\nExample\nWhat is the average value of \\(f(x)=\\sin(x)\\) over \\([0, \\pi]\\)?\n\\[\n\\text{average} = \\frac{1}{\\pi-0} \\int_0^\\pi \\sin(x) dx = \\frac{1}{\\pi} (-\\cos(x)) \\big|_0^\\pi = \\frac{2}{\\pi}\n\\]\nVisually, we have:\n\nplot(sin, 0, pi)\nplot!(x -> 2/pi)\n\n\n\n\n\n\nExample\nWhat is the average value of the function \\(f\\) which is \\(1\\) between \\([0,3]\\), \\(2\\) between \\((3,5]\\) and \\(1\\) between \\((5,6]\\)?\nThough not continuous, \\(f(x)\\) is integrable as it contains only jumps. The integral from \\([0,6]\\) can be computed with geometry: \\(3\\cdot 3 + 2 \\cdot 2 + 1 \\cdot 1 = 14\\). The average then is \\(14/(6-0) = 7/3\\).\n\n\nExample\nWhat is the average value of the function \\(e^{-x}\\) between \\(0\\) and \\(\\log(2)\\)?\n\\[\n\\begin{align*}\n\\text{average} = \\frac{1}{\\log(2) - 0} \\int_0^{\\log(2)} e^{-x} dx\\\\\n&= \\frac{1}{\\log(2)} (-e^{-x}) \\big|_0^{\\log(2)}\\\\\n&= -\\frac{1}{\\log(2)} (\\frac{1}{2} - 1)\\\\\n&= \\frac{1}{2\\log(2)}.\n\\end{align*}\n\\]\nVisualizing, we have\n\nplot(x -> exp(-x), 0, log(2))\nplot!(x -> 1/(2*log(2)))"
},
{
"objectID": "integrals/mean_value_theorem.html#the-mean-value-theorem-for-integrals",
"href": "integrals/mean_value_theorem.html#the-mean-value-theorem-for-integrals",
"title": "42  Mean value theorem for integrals",
"section": "42.2 The mean value theorem for integrals",
"text": "42.2 The mean value theorem for integrals\nIf \\(f(x)\\) is assumed integrable, the average value of \\(f(x)\\) is defined, as above. Re-expressing gives that there exists a \\(K\\) with\n\\[\nK \\cdot (b-a) = \\int_a^b f(x) dx.\n\\]\nWhen we assume that \\(f(x)\\) is continuous, we can describe \\(K\\) as a value in the range of \\(f\\):\n\nThe mean value theorem for integrals: Let \\(f(x)\\) be a continuous function on \\([a,b]\\) with \\(a < b\\). Then there exists \\(c\\) with \\(a \\leq c \\leq b\\) with\n\\(f(c) \\cdot (b-a) = \\int_a^b f(x) dx.\\)`\n\nThe proof comes from the intermediate value theorem and the extreme value theorem. Since \\(f\\) is continuous on a closed interval, there exists values \\(m\\) and \\(M\\) with \\(f(c_m) = m \\leq f(x) \\leq M=f(c_M)\\), for some \\(c_m\\) and \\(c_M\\) in the interval \\([a,b]\\). Since \\(m \\leq f(x) \\leq M\\), we must have:\n\\[\nm \\cdot (b-a) \\leq K\\cdot(b-a) \\leq M\\cdot(b-a).\n\\]\nSo in particular \\(K\\) is in \\([m, M]\\). But \\(m\\) and \\(M\\) correspond to values of \\(f(x)\\), so by the intermediate value theorem, \\(K=f(c)\\) for some \\(c\\) that must lie in between \\(c_m\\) and \\(c_M\\), which means as well that it must be in \\([a,b]\\).\n\nProof of second part of Fundamental Theorem of Calculus\nThe mean value theorem is exactly what is needed to prove formally the second part of the Fundamental Theorem of Calculus. Again, suppose \\(f(x)\\) is continuous on \\([a,b]\\) with \\(a < b\\). For any \\(a < x < b\\), we define \\(F(x) = \\int_a^x f(u) du\\). Then the derivative of \\(F\\) exists and is \\(f\\).\nLet \\(h>0\\). Then consider the forward difference \\((F(x+h) - F(x))/h\\). Rewriting gives:\n\\[\n\\frac{\\int_a^{x+h} f(u) du - \\int_a^x f(u) du}{h} =\\frac{\\int_x^{x+h} f(u) du}{h} = f(\\xi(h)).\n\\]\nThe value \\(\\xi(h)\\) is just the \\(c\\) corresponding to a given value in \\([x, x+h]\\) guaranteed by the mean value theorem. 
We only know that \\(x \\leq \\xi(h) \\leq x+h\\). But this is plenty - it says that \\(\\lim_{h \\rightarrow 0+} \\xi(h) = x\\). Using the fact that \\(f\\) is continuous and the known properties of limits of compositions of functions this gives \\(\\lim_{h \\rightarrow 0+} f(\\xi(h)) = f(x)\\). But this means that the (right) limit of the secant line expression exists and is equal to \\(f(x)\\), which is what we want to prove. Repeating a similar argument when \\(h < 0\\) finishes the proof.\nThe basic notion used is simply that for small \\(h\\), this expression is well approximated by the left Riemann sum taken over \\([x, x+h]\\):\n\\[\nf(\\xi(h)) \\cdot h = \\int_x^{x+h} f(u) du.\n\\]"
},
{
"objectID": "integrals/mean_value_theorem.html#questions",
"href": "integrals/mean_value_theorem.html#questions",
"title": "42  Mean value theorem for integrals",
"section": "42.3 Questions",
"text": "42.3 Questions\n\nQuestion\nBetween \\(0\\) and \\(1\\) a function is constantly \\(1\\). Between \\(1\\) and \\(2\\) the function is constantly \\(2\\). What is the average value of the function over the interval \\([0,2]\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nBetween \\(0\\) and \\(2\\) a function is constantly \\(1\\). Between \\(2\\) and \\(3\\) the function is constantly \\(2\\). What is the average value of the function over the interval \\([0,3]\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhat integral will show the intuition of the Merton College scholars that the distance traveled by an object under uniformly increasing velocity starting at \\(v_0\\) and ending at \\(v_t\\) is equal to the distance traveled by an object with constant velocity of \\((v_0 + v_t)/2\\).\n\n\n\n \n \n \n \n \n \n \n \n \n \\((v(0) + v(t))/2 \\cdot \\int_0^t du = (v(0) + v(t))/2 \\cdot t\\)\n \n \n\n\n \n \n \n \n \\(\\int_0^t (v(0) + v(u))/2 du = v(0)/2\\cdot t + x(u)/2\\ \\big|_0^t\\)\n \n \n\n\n \n \n \n \n \\(\\int_0^t v(u) du = v^2/2 \\big|_0^t\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the average value of \\(\\cos(x)\\) over the interval \\([-\\pi/2, \\pi/2]\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the average value of \\(\\cos(x)\\) over the interval \\([0, \\pi]\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the average value of \\(f(x) = e^{-2x}\\) between \\(0\\) and \\(2\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the average value of \\(f(x) = \\sin(x)^2\\) over the \\(0\\), \\(\\pi\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhich is bigger? 
The average value of \\(f(x) = x^{10}\\) or the average value of \\(g(x) = \\lvert x \\rvert\\) over the interval \\([0,1]\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n That of \\(f(x) = x^{10}\\).\n \n \n\n\n \n \n \n \n That of \\(g(x) = \\lvert x \\rvert\\).\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nDefine a family of functions over the interval \\([0,1]\\) by \\(f(x; a,b) = x^a \\cdot (1-x)^b\\). Which has a greater average, \\(f(x; 2,3)\\) or \\(f(x; 3,4)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f(x; 3,4)\\)\n \n \n\n\n \n \n \n \n \\(f(x; 2,3)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose the average value of \\(f(x)\\) over \\([a,b]\\) is \\(100\\). What is the average value of \\(100 f(x)\\) over \\([a,b]\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose \\(f(x)\\) is continuous and positive on \\([a,b]\\).\n\nExplain why for any \\(x > a\\) it must be that:\n\n\\[\nF(x) = \\int_a^x f(x) dx > 0\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n Because the mean value theorem says this is \\(f(c) (x-a)\\) for some \\(c\\) and both terms are positive by the assumptions\n \n \n\n\n \n \n \n \n Because the definite integral is only defined for positive area, so it is always positive\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nExplain why \\(F(x)\\) is increasing.\n\n\n\n\n \n \n \n \n \n \n \n \n \n By the fundamental theorem of calculus, part I, \\(F'(x) = f(x) > 0\\), hence \\(F(x)\\) is increasing\n \n \n\n\n \n \n \n \n By the intermediate value theorem, as \\(F(x) > 0\\), it must be true that \\(F(x)\\) is increasing\n \n \n\n\n \n \n \n \n By the extreme value theorem, \\(F(x)\\) must reach its maximum, hence it must increase.\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor \\(f(x) = x^2\\), which is bigger: the average of the function \\(f(x)\\) over \\([0,1]\\) or the geometric mean which is the exponential of the average of the logarithm of \\(f\\) over the same interval?\n\n\n\n \n \n \n \n \n \n \n 
\n \n The exponential of the average of \\(\\log(f)\\)\n \n \n\n\n \n \n \n \n The average of \\(f\\)"
},
{
"objectID": "integrals/area_between_curves.html",
"href": "integrals/area_between_curves.html",
"title": "43  Area between two curves",
"section": "",
"text": "This section uses these add-on packages:\nThe definite integral gives the “signed” area between the function \\(f(x)\\) and the \\(x\\)-axis over \\([a,b]\\). Conceptually, this is the area between two curves, \\(f(x)\\) and \\(g(x)=0\\). More generally, this integral:\n\\[\n\\int_a^b (f(x) - g(x)) dx\n\\]\ncan be interpreted as the “signed” area between \\(f(x)\\) and \\(g(x)\\) over \\([a,b]\\). If on this interval \\([a,b]\\) it is true that \\(f(x) \\geq g(x)\\), then this would just be the area, as seen in this figure. The rectangle in the figure has area: \\((f(a)-g(a)) \\cdot (b-a)\\) which could be a term in a left Riemann sum of the integral of \\(f(x) - g(x)\\):\nFor the figure, we have \\(f(x) = \\sqrt{x}\\), \\(g(x)= x^2\\) and \\([a,b] = [1/4, 3/4]\\). The shaded area is then found by:\n\\[\n\\int_{1/4}^{3/4} (x^{1/2} - x^2) dx = (\\frac{x^{3/2}}{3/2} - \\frac{x^3}{3})\\big|_{1/4}^{3/4} = \\frac{\\sqrt{3}}{4} -\\frac{7}{32}.\n\\]"
},
{
"objectID": "integrals/area_between_curves.html#questions",
"href": "integrals/area_between_curves.html#questions",
"title": "43  Area between two curves",
"section": "43.1 Questions",
"text": "43.1 Questions\n\nQuestion\nFind the area enclosed by the curves \\(y=2-x^2\\) and \\(y=x^2 - 3\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the area between \\(f(x) = \\cos(x)\\), \\(g(x) = x\\) and the \\(y\\) axis.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the area between the line \\(y=1/2(x+1)\\) and half circle \\(y=\\sqrt{1 - x^2}\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the area in the first quadrant between the lines \\(y=x\\), \\(y=1\\), and the curve \\(y=x^2 + 4\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the area between \\(y=x^2\\) and \\(y=-x^4\\) for \\(\\lvert x \\rvert \\leq 1\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet f(x) = 1/(sqrt(pi)*gamma(1/2)) * (1 + t^2)^(-1) and g(x) = 1/sqrt(2*pi) * exp(-x^2/2). These graphs intersect in two points. Find the area bounded by them.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n(Where gamma(1/2) is a call to the gamma function.)\n\n\nQuestion\nFind the area in the first quadrant bounded by the graph of \\(x = (y-1)^2\\), \\(x=3-y\\) and \\(x=2\\sqrt{y}\\). (Hint: integrate in the \\(y\\) variable.)\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the total area bounded by the lines \\(x=0\\), \\(x=2\\) and the curves \\(y=x^2\\) and \\(y=x\\). This would be \\(\\int_a^b \\lvert f(x) - g(x) \\rvert dx\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLook at the sculpture Le Tamanoir by Calder. A large scale work. How much does it weigh? Approximately?\nLets try to answer that with an educated guess. The right most figure looks to be about 1/5th the total amount. So if we estimate that piece and multiply by 5 we get a good guess. That part looks like an area of metal bounded by two quadratic polynomials. 
If we compute that area in square inches, then multiply by an assumed thickness of one inch, we have the cubic volume. The density of galvanized steel is 7850 kg/\\(m^3\\), which we convert into pounds/in\\(^3\\) via:\n\n7850 * 2.2 * (1/39.3)^3\n\n0.28452123585283234\n\n\nThe two parabolas, after rotating, might look like the following (with \\(x\\) in inches):\n\\[\nf(x) = x^2/70, \\quad g(x) = 35 + x^2/140\n\\]\nPut this all together to give an estimated weight in pounds.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIs the guess that the entire sculpture is more than two tons?\n\n\n\n \n \n \n \n \n \n \n \n \n Less than two tons\n \n \n\n\n \n \n \n \n More than two tons\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\n\n\n\n\nNote\n\n\n\nWe used area to estimate weight in this example, but Galileo used weight to estimate area. It is mentioned by Martin that in order to estimate the area enclosed by one arch of a cycloid, Galileo cut the arch from some material and compared the weight to the weight of the generating circle. He concluded the area is close to \\(3\\) times that of the circle, a conjecture proved by Roberval in 1634.\n\n\n\n\nQuestion\nFormulas from the business world say that revenue is the integral of marginal revenue, or the additional money from selling 1 more unit. (Marginal revenue is basically the derivative of revenue.) Cost is the integral of marginal cost, or the cost to produce 1 more. 
Suppose we have\n\\[\n\\text{mr}(x) = 2 - \\frac{e^{-x/10}}{1 + e^{-x/10}}, \\quad\n\\text{mc}(x) = 1 - \\frac{1}{2} \\cdot \\frac{e^{-x/5}}{1 + e^{-x/5}}.\n\\]\nFind the profit to produce 100 units: \\(P = \\int_0^{100} (\\text{mr}(x) - \\text{mc}(x)) dx\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCan SymPy do what Archimedes did?\nConsider the following code which sets up the area of an inscribed triangle, A1, and the area of a parabolic segment, A2 for a general parabola:\n\n@syms x::real A::real B::real C::real a::real b::real\nc = (a + b) / 2\nf(x) = A*x^2 + B*x + C\nSecant(f, a, b) = f(a) + (f(b)-f(a))/(b-a) * (x - a)\nA1 = integrate(Secant(f, a, c) - Secant(f,a,b), (x,a,c)) + integrate(Secant(f,c,b)-Secant(f,a,b), (x, c, b))\nA2 = integrate(f(x) - Secant(f,a,b), (x, a, b))\nout = 4//3 * A1 - A2\n\n \n\\[\n\\frac{A a^{3}}{3} + A a^{2} b - A a b^{2} - \\frac{A b^{3}}{3} + a^{2} \\left(- \\frac{A a}{2} - \\frac{A b}{2}\\right) - \\frac{4 a^{2} \\left(\\frac{A a}{4} - \\frac{A b}{4}\\right)}{3} - \\frac{4 a \\left(- \\frac{A a^{2}}{2} + \\frac{A a b}{2}\\right)}{3} - b^{2} \\left(- \\frac{A a}{2} - \\frac{A b}{2}\\right) + \\frac{4 b^{2} \\left(- \\frac{A a}{4} + \\frac{A b}{4}\\right)}{3} + \\frac{4 b \\left(\\frac{A a b}{2} - \\frac{A b^{2}}{2}\\right)}{3} - \\frac{4 \\left(\\frac{a}{2} + \\frac{b}{2}\\right)^{2} \\left(- \\frac{A a}{4} + \\frac{A b}{4}\\right)}{3} + \\frac{4 \\left(\\frac{a}{2} + \\frac{b}{2}\\right)^{2} \\left(\\frac{A a}{4} - \\frac{A b}{4}\\right)}{3} + \\frac{4 \\left(\\frac{a}{2} + \\frac{b}{2}\\right) \\left(- \\frac{A a^{2}}{2} + \\frac{A a b}{2}\\right)}{3} - \\frac{4 \\left(\\frac{a}{2} + \\frac{b}{2}\\right) \\left(\\frac{A a b}{2} - \\frac{A b^{2}}{2}\\right)}{3}\n\\]\n\n\n\nDoes SymPy get the correct output, \\(0\\), after calling simplify?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIn Martin a fascinating history of the 
cycloid can be read.\n\n\n\nFigure from Martin showing the companion curve to the cycloid. As the generating circle rolls, from A to C, the original point of contact, D, traces out an arch of the cycloid. The companion curve is that found by congruent line segments. In the figure, when D was at point P the line segment PQ is congruent to EF (on the original position of the generating circle).\n\n\nIn particular, it can be read that Roberval proved that the area between the cycloid and its companion curve is half the area of the generating circle. Roberval didn't know integration, so finding the area between two curves required other tricks. One is called “Cavalieri's principle.” From the figure above, which of the following would you guess this principle to be:\n\n\n\n \n \n \n \n \n \n \n \n \n If two regions bounded by parallel lines are such that any parallel between them cuts each region in segments of equal length, then the regions have equal area.\n \n \n\n\n \n \n \n \n The area of the cycloid is nearly the area of a semi-ellipse with known values, so one can approximate the area of the cycloid with the formula for the area of an ellipse\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nSuppose the generating circle has radius \\(1\\), so the area shown is \\(\\pi/2\\). The companion curve is then \\(1-\\cos(\\theta)\\) (a fact not used by Roberval). The area under this curve is then\n\n@syms theta\nintegrate(1 - cos(theta), (theta, 0, SymPy.PI))\n\n \n\\[\n\\pi\n\\]\n\n\n\nThat means the area under one-half arch of the cycloid is\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\pi\\)\n \n \n\n\n \n \n \n \n \\((3/2)\\cdot \\pi\\)\n \n \n\n\n \n \n \n \n \\(2\\pi\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nDoubling the answer above gives a value that Galileo had struggled with for many years.\n\n\n\nRoberval, avoiding a trigonometric integral, instead used symmetry to show that the area under the companion curve was half the area of the rectangle, which in this figure is \\(2\\pi\\)."
},
{
"objectID": "integrals/center_of_mass.html",
"href": "integrals/center_of_mass.html",
"title": "44  Center of Mass",
"section": "",
"text": "This section uses these add-on packages:\nThe game of seesaw is one where children earn an early appreciation for the effects of distance and relative weight. For children with equal weights, the seesaw will balance if they sit an equal distance from the center (on opposite sides, of course). However, with unequal weights that isnt the case. If one child weighs twice as much, the other must sit twice as far.\nThe key relationship is that \\(d_1 m_1 = d_2 m_2\\). This come from physics, where the moment about a point is defined by the mass times the distance. This balance relationship says the overall moment balances out. When this is the case, then the center of mass is at the fulcrum point, so there is no impetus to move.\nThe center of mass is an old concept that often allows a possibly complicated relationship involving weights to be reduced to a single point. The seesaw is an example: if the center of mass is at the fulcrum the seesaw can balance.\nIn general, we use position of the mass, rather than use distance from some fixed fulcrum. With this, the center of mass for a finite set of point masses distributed on the real line, is defined by:\n\\[\n\\bar{\\text{cm}} = \\frac{m_1 x_1 + m_2 x_2 + \\cdots + m_n x_n}{m_1 + m_2 + \\cdots + m_n}.\n\\]\nWriting \\(w_i = m_i / (m_1 + m_2 + \\cdots + m_n)\\), we get the center of mass is just a weighted sum: \\(w_1 x_1 + \\cdots + w_n x_n\\), where the \\(w_i\\) are the relative weights.\nWith some rearrangment, we can see that the center of mass satisfies the equation:\n\\[\nw_1 \\cdot (x_1 - \\bar{\\text{cm}}) + w_2 \\cdot (x_2 - \\bar{\\text{cm}}) + \\cdots + w_n \\cdot (x_n - \\bar{\\text{cm}}) = 0.\n\\]\nThe center of mass is a balance of the weighted signed distances. This property of the center of mass being a balancing point makes it of intrinsic interest and can be - in the case of sufficient symmetry - easy to find."
},
{
"objectID": "integrals/center_of_mass.html#center-of-mass-of-figures",
"href": "integrals/center_of_mass.html#center-of-mass-of-figures",
"title": "44  Center of Mass",
"section": "44.1 Center of mass of figures",
"text": "44.1 Center of mass of figures\nConsider now a more general problem, the center of mass of a solid figure. We will restrict our attention to figures that can be represented by functions in the \\(x-y\\) plane which are two dimensional. For example, consider the region in the plane bounded by the \\(x\\) axis and the function \\(1 - \\lvert x \\rvert\\). This is triangle with vertices \\((-1,0)\\), \\((0,1)\\), and \\((1,0)\\).\nThis graph shows that the figure is symmetric:\n\nf(x) = 1 - abs(x)\na, b = -1.5, 1.5\nplot(f, a, b)\nplot!(zero, a, b)\n\n\n\n\nAs the center of mass should be a balancing value, we would guess intuitively that the center of mass in the \\(x\\) direction will be \\(x=0\\).\nBut what should the center of mass formula be?\nAs with many formulas that will end up involving a derived integral, we start with a sum approximation. If the region is described as the area under the graph of \\(f(x)\\) between \\(a\\) and \\(b\\), then we can form a Riemann sum approximation, that is a choice of \\(a = x_0 < x_1 < x_2 \\cdots < x_n = b\\) and points \\(c_1\\), \\(\\dots\\), \\(c_n\\). 
If all the rectangles are made up of a material of uniform density, say \\(\\rho\\), then the mass of each rectangle will be the area times \\(\\rho\\), or \\(\\rho f(c_i) \\cdot (x_i - x_{i-1})\\), for \\(i = 1, \\dots , n\\).\n\n\n\n\n\nThe figure shows the approximating rectangles and circles representing their masses for \\(n=20\\).\nGeneralizing from this figure shows the center of mass for such an approximation will be:\n\\[\n\\begin{align*}\n&\\frac{\\rho f(c_1) (x_1 - x_0) \\cdot x_1 + \\rho f(c_2) (x_2 - x_1) \\cdot x_2 + \\cdots + \\rho f(c_n) (x_n- x_{n-1}) \\cdot x_n}{\\rho f(c_1) (x_1 - x_0) + \\rho f(c_2) (x_2 - x_1) + \\cdots + \\rho f(c_n) (x_n- x_{n-1})} \\\\\n&=\\\\\n&\\quad\\frac{f(c_1) (x_1 - x_0) \\cdot x_1 + f(c_2) (x_2 - x_1) \\cdot x_2 + \\cdots + f(c_n) (x_n- x_{n-1}) \\cdot x_n}{f(c_1) (x_1 - x_0) + f(c_2) (x_2 - x_1) + \\cdots + f(c_n) (x_n- x_{n-1})}.\n\\end{align*}\n\\]\nBut the top part is an approximation to the integral \\(\\int_a^b x f(x) dx\\) and the bottom part the integral \\(\\int_a^b f(x) dx\\). The ratio of these defines the center of mass.\n\nCenter of Mass: The center of mass (in the \\(x\\) direction) of a region in the \\(x-y\\) plane described by the area under a (positive) function \\(f(x)\\) between \\(a\\) and \\(b\\) is given by\n\\(\\text{Center of mass} = \\text{cm}_x = \\frac{\\int_a^b xf(x) dx}{\\int_a^b f(x) dx}.\\)\nFor regions described by a more complicated set of equations, the center of mass is found from the same formula where \\(f(x)\\) is the total height of the region for a given \\(x\\).\n\nFor the triangular shape, we have by the fact that \\(f(x) = 1 - \\lvert x \\rvert\\) is an even function that \\(xf(x)\\) will be odd, so the integral over \\([-1,1]\\) will be \\(0\\). So the center of mass formula applied to this problem agrees with our expectation.\n\nExample\nWhat about the center of mass of the triangle formed by the line \\(x=-1\\), the \\(x\\) axis and the line \\(y=(1-x)/2\\)? 
This too is defined between \\(a=-1\\) and \\(b=1\\), but the center of mass will be negative, as a graph shows more mass to the left of \\(0\\) than to the right:\n\nf(x) = (1-x)/2\nplot(f, -1, 1)\nplot!(zero, -1, 1)\n\n\n\n\nThe formulas give:\n\\[\n\\int_{-1}^1 xf(x) dx = \\int_{-1}^1 x\\cdot (1-x)/2 dx = (\\frac{x^2}{4} - \\frac{x^3}{6})\\big|_{-1}^1 = -\\frac{1}{3}.\n\\]\nThe bottom integral is just the area (or total mass if the \\(\\rho\\) were not canceled) and by geometry is \\(1/2 (1)(2) = 1\\). So \\(\\text{cm}_x = -1/3\\).\n\n\nExample\nFind the center of mass formed by the intersection of the parabolas \\(y=1 - x^2\\) and \\(y=(x-1)^2 - 2\\).\nThe center of mass (in the \\(x\\) direction) can be seen to be close to \\(x=1/2\\):\n\nf1(x) = 1 - x^2\nf2(x) = (x-1)^2 -2\nplot(f1, -3, 3)\nplot!(f2, -3, 3)\n\n\n\n\nTo find it, we need to find the intersection points, then integrate. We do so numerically.\n\nh(x) = f1(x) - f2(x)\na,b = find_zeros(h, -3, 3)\ntop, err = quadgk(x -> x * h(x), a, b)\nbottom, err = quadgk(h, a, b)\ncm = top / bottom\n\n0.5000000000000001\n\n\nOur guess from the diagram proves correct.\n\n\n\n\n\n\nNote\n\n\n\nIt proves convenient to use the -> notation for an anonymous function above, as our function h is not what is being integrated all the time, but some simple modification. If this isn't palatable, a new function could be defined and passed along to quadgk.\n\n\n\n\nExample\nConsider a region bounded by a probability density function. (These functions are non-negative, and integrate to \\(1\\).) 
The center of mass formula simplifies to \\(\\int xf(x) dx\\), as the denominator will be \\(1\\), and the answer is called the mean, and often denoted by the Greek letter \\(\\mu\\).\nFor the probability density \\(f(x) = e^{-x}\\) for \\(x\\geq 0\\) and \\(0\\) otherwise, find the mean.\nWe need to compute \\(\\int_{-\\infty}^\\infty xf(x) dx\\), but in this case since \\(f\\) is \\(0\\) to the left of the origin, we just have:\n\\[\n\\mu = \\int_0^\\infty x e^{-x} dx = -(1+x) \\cdot e^{-x} \\big|_0^\\infty = 1\n\\]\nFor fun, we compare this to the median, which is the value \\(M\\) so that the total area is split in half. That is, the following formula is satisfied: \\(\\int_0^M f(x) dx = 1/2\\). To compute, we have:\n\\[\n\\int_0^M e^{-x} dx = -e^{-x} \\big|_0^M = 1 - e^{-M}.\n\\]\nSolving \\(1/2 = 1 - e^{-M}\\) gives \\(M=\\log(2) \\approx 0.693\\). The median is to the left of the mean in this example.\n\n\n\n\n\n\nNote\n\n\n\nIn this example, we used an infinite region, so the idea of “balancing” may be a bit unrealistic; nonetheless, this intuitive interpretation is still a good one to keep in mind. The point of comparing to the median is that the balancing point is to the right of where the area splits in half. 
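The mean and median just computed can be verified numerically. A quick sketch, assuming the QuadGK and Roots packages that are used elsewhere in these notes:

```julia
using QuadGK, Roots  # quadgk for integration, find_zero for root finding

f(x) = exp(-x)                            # the density, on [0, ∞)

mu = quadgk(x -> x * f(x), 0, Inf)[1]     # the mean, ∫ x f(x) dx, is 1

# the median m solves ∫₀ᵐ f(x) dx = 1/2; it comes out to log(2)
med = find_zero(m -> quadgk(f, 0, m)[1] - 1/2, 1)

mu, med, log(2)
```

The quadgk function accepts Inf as an endpoint, which handles the improper integral directly.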
Basically, the center of mass follows in the direction of the area far to the right of the median, as this area is skewed in that direction.\n\n\n\n\nExample\nA figure is formed by transformations of the function \\(\\phi(u) = e^{2(k-1)} - e^{2(k-u)}\\), for some fixed \\(k\\), as follows:\n\nk = 3\nphi(u) = exp(2(k-1)) - exp(2(k-u))\nf(u) = max(0, phi(u))\ng(u) = min(f(u+1), f(k))\n\nplot(f, 0, k, legend=false)\nplot!(g, 0, k)\nplot!(zero, 0, k)\n\n\n\n\n(This is basically the graph of \\(\\phi(u)\\) and the graph of its shifted value \\(\\phi(u+1)\\), only truncated on the top and bottom.)\nThe center of mass of this figure is found with:\n\nh(x) = g(x) - f(x)\ntop, _ = quadgk(x -> x*h(x), 0, k)\nbottom, _ = quadgk(h, 0, k)\ntop/bottom\n\n0.9626852772498595\n\n\nThis figure has constant slices of length \\(1\\) for fixed values of \\(y\\). If we were to approximate the values with blocks of height \\(1\\), then the center of mass would be to the left of \\(1\\) - for any \\(k\\), but the top most block would have an overhang to the right of \\(1\\) - out to a value of \\(k\\). That is, this figure should balance:\n\n\n\n\n\nSee this paper and its references for some background on this example and its extensions.\n\n\n44.1.1 The \\(y\\) direction.\nWe can talk about the center of mass in the \\(y\\) direction too. The approximating picture uses horizontal rectangles - not vertical ones - and if we describe them by \\(f(y)\\), then the corresponding formulas would be\n\n\\(\\text{center of mass} = \\text{cm}_y = \\frac{\\int_a^b y f(y) dy}{\\int_a^b f(y) dy}.\\)\n\nFor example, consider, again, the triangle bounded by the line \\(x=-1\\), the \\(x\\) axis, and the line \\(y=(1-x)/2\\). 
In terms of describing this in \\(y\\), the function \\(f(y)=2 -2y\\) gives the total length of the horizontal slice (which comes from solving \\(y=(1-x)/2\\) for \\(x\\), the general method to find an inverse function, and subtracting \\(-1\\)) and the interval is \\(y=0\\) to \\(y=1\\). Thus our center of mass in the \\(y\\) direction will be\n\\[\n\\text{cm}_y = \\frac{\\int_0^1 y (2 - 2y) dy}{\\int_0^1 (2 - 2y) dy} = \\frac{(2y^2/2 - 2y^3/3)\\big|_0^1}{1} = \\frac{1}{3}.\n\\]\nHere the center of mass is below \\(1/2\\) as the bulk of the area is. (The bottom area is just \\(1\\), as known from the area of a triangle.)\nAs seen, the computation of the center of mass in the \\(y\\) direction has an identical formula, though it may be more involved if an inverse function must be computed.\n\nExample\nMore generally, consider a right triangle with vertices \\((0,0)\\), \\((0,a)\\), and \\((b,0)\\). The center of mass of this can be computed with the help of the equation for the line that forms the hypotenuse: \\(x/b + y/a = 1\\). We find the center of mass symbolically in the \\(y\\) variable by solving for \\(x\\) in terms of \\(y\\), then integrating from \\(0\\) to \\(a\\):\n\n@syms a b x y\neqn = x/b + y/a - 1\nfy = solve(eqn, x)[1]\nintegrate(y*fy, (y, 0, a)) / integrate(fy, (y, 0, a))\n\n \n\\[\n\\frac{a}{3}\n\\]\n\n\n\nThe answer involves \\(a\\) linearly, but not \\(b\\). If we find the center of mass in \\(x\\), we could do something similar:\n\nfx = solve(eqn, y)[1]\nintegrate(x*fx, (x, 0, b)) / integrate(fx, (x, 0, b))\n\n \n\\[\n\\frac{b}{3}\n\\]\n\n\n\nBut really, we should have just noted that simply by switching the labels \\(a\\) and \\(b\\) in the diagram we could have discovered this formula.\n\n\n\n\n\n\nNote\n\n\n\nThe centroid of a region in the plane is just \\((\\text{cm}_x, \\text{cm}_y)\\). This last fact says the centroid of the right triangle is just \\((b/3, a/3)\\). The centroid can be found by other geometric means. 
The link shows the plumb line method. For triangles, the centroid is also the intersection point of the medians, the lines that connect a vertex with its opposite midpoint.\n\n\n\n\nExample\nCompute the \\(x\\) and \\(y\\) values of the center of mass of the half circle described by the area below the function \\(f(x) = \\sqrt{1 - x^2}\\) and above the \\(x\\)-axis.\nA plot shows the value of cm\\(_x\\) will be \\(0\\) by symmetry:\n\nf(x) = sqrt(1 - x^2)\nplot(f, -1, 1)\n\n\n\n\n(\\(f(x)\\) is even, so \\(xf(x)\\) will be odd.)\nHowever, the value for cm\\(_y\\) will be positive, a bit above the \\(1/3\\) of the last problem. The exact value is computed using slices in the \\(y\\) direction. Solving for \\(x\\) in \\(y=\\sqrt{1-x^2}\\) gives \\(x = \\pm \\sqrt{1-y^2}\\), so \\(f(y) = 2\\sqrt{1 - y^2}\\). The value is then:\n\\[\n\\text{cm}_y = \\frac{\\int_{0}^1 y \\cdot 2 \\sqrt{1 - y^2}dy}{\\int_{0}^1 2\\sqrt{1-y^2}dy} =\n\\frac{-\\frac{2}{3}(1-y^2)^{3/2}\\big|_0^1}{\\pi/2} = \\frac{2/3}{\\pi/2} = \\frac{4}{3\\pi}.\n\\]\nIn fact it is exactly \\(4/(3\\pi) \\approx 0.42\\). The top calculation is done by \\(u\\)-substitution, the bottom by using the area formula for a half circle, \\(\\pi r^2/2\\).\n\n\nExample\nA disc of radius \\(2\\) is centered at the origin, and a disc of radius \\(1\\) is bored out between \\(y=0\\) and \\(y=1\\). Find the resulting center of mass.\nA picture shows that this could be complicated, especially for \\(y > 0\\), as we need to describe the length of the red lines below for \\(-2 < y < 2\\):\n\n\n\n\n\nWe can see that cm\\(_x = 0\\), by symmetry, but to compute cm\\(_y\\) we need to find \\(f(y)\\), which will depend on the value of \\(y\\) between \\(-2\\) and \\(2\\). The outer circle is \\(x^2 + y^2 = 4\\), the inner circle \\(x^2 + (y-1)^2 = 1\\). When \\(y < 0\\), \\(f(y)\\) is the distance across the outer circle, or \\(2\\sqrt{4 - y^2}\\). 
When \\(y \\geq 0\\), \\(f(y)\\) is twice the distance from the bigger circle to the smaller, or \\(2(\\sqrt{4 - y^2} - \\sqrt{1 - (y-1)^2})\\).\nWe use this to compute:\n\nf(y) = y < 0 ? 2*sqrt(4 - y^2) : 2* (sqrt(4 - y^2)- sqrt(1 - (y-1)^2))\ntop, _ = quadgk( y -> y * f(y), -2, 2)\nbottom, _ = quadgk( f, -2, 2)\ntop/bottom\n\n-0.3333333333594305\n\n\nThe nice answer of \\(-1/3\\) makes us think there may be a different way to visualize this. Were we to rearrange the top integral, we could write it as \\(\\int_{-2}^2 y 2 \\sqrt{4 -y^2}dy - \\int_0^2 2y\\sqrt{1 - (y-1)^2}dy\\). Call this \\(A - B\\). The left term, \\(A\\), is part of the center of mass formula for the big circle (its center of mass is this value divided by \\(M=4\\pi\\)), and the right term, \\(B\\), is part of the center of mass formula for the (drilled out) smaller circle (its center of mass is this value divided by \\(m=\\pi\\)). The two centers of mass are weighted according to \\((A/M \\cdot M - B/m \\cdot m)/(M-m) = (A-B)/(M-m)\\). In this case \\(A/M=0\\), \\(B/m=1\\) and \\(M=4m\\), so the answer is \\(-1/3\\)."
},
{
"objectID": "integrals/center_of_mass.html#questions",
"href": "integrals/center_of_mass.html#questions",
"title": "44  Center of Mass",
"section": "44.2 Questions",
"text": "44.2 Questions\n\nQuestion\nFind the center of mass in the \\(x\\) variable for the region bounded by the parabola \\(x=4 - y^2\\) and the \\(y\\) axis.\n\nQuestion\nFind the center of mass in the \\(x\\) variable of the region in the first and fourth quadrants bounded by the ellipse \\((x/2)^2 + (y/3)^2 = 1\\).\n\nQuestion\nFind the center of mass of the region in the first quadrant bounded by the function \\(f(x) = x^3(1-x)^4\\).\n\nQuestion\nLet \\(k\\) and \\(\\lambda\\) be parameters in \\((0, \\infty)\\). The Weibull density is a probability density on \\([0, \\infty)\\) (meaning it is \\(0\\) when \\(x < 0\\)) satisfying:\n\\[\nf(x) = \\frac{k}{\\lambda}\\left(\\frac{x}{\\lambda}\\right)^{k-1} \\exp(-(\\frac{x}{\\lambda})^k)\n\\]\nFor \\(k=2\\) and \\(\\lambda = 2\\), compute the mean. (The center of mass, assuming the total area is \\(1\\).)\n\nQuestion\nThe logistic density depends on two parameters \\(\\mu\\) and \\(s\\) and is given by:\n\\[\nf(x) = \\frac{1}{4s} \\text{sech}(\\frac{x-\\mu}{2s})^2, \\quad -\\infty < x < \\infty.\n\\]\n(Where \\(\\text{sech}\\) is the hyperbolic secant, implemented in julia through sech.)\nFor \\(\\mu=2\\) and \\(s=4\\) compute the mean, or center of mass, of this density.\n\nQuestion\nA region is formed by the part of the area bounded by the circle \\(x^2 + y^2 = 1\\) that lies above the line \\(y=3/4\\). 
Find the center of mass in the \\(y\\) direction (that of the \\(x\\) direction is \\(0\\) by symmetry).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the center of mass in the \\(y\\) direction of the area bounded by the cosine curve and the \\(x\\) axis between \\(-\\pi/2\\) and \\(\\pi/2\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA penny, nickel, dime and quarter are stacked so that their right most edges align and are centered so that the center of mass in the \\(y\\) direction is \\(0\\). Find the center of mass in the \\(x\\) direction.\n\n\n\n\n\nYou will need some specifications, such as these from the US Mint\n diameter(in) weight(gms)\npenny 0.750 2.500\nnickel 0.835 5.000\ndime 0.705 2.268\nquarter 0.955 5.670\n\n(Hint: Though this could be done with integration, it is easier to treat each coin as a single point (its centroid) with the given mass and then apply the formula for sums.)"
},
{
"objectID": "integrals/volumes_slice.html",
"href": "integrals/volumes_slice.html",
"title": "45  Volumes by slicing",
"section": "",
"text": "This section uses these add-on packages:\nAn ad for a summer job says work as the Michelin Man! Sounds promising, but how much will that costume weigh? A very hot summer may make walking around in a heavy costume quite uncomfortable.\nA back-of-the-envelope calculation would start by\nThen the volume would be found by adding:\n\\[\nV = \\pi \\cdot r_1^2 \\cdot h_1 + \\pi \\cdot r_2^2 \\cdot h_2 + \\cdots + \\pi \\cdot r_n^2 \\cdot h_n.\n\\]\nThe weight would come by multiplying the volume by some appropriate density.\nLooking at the sum though, we see the makings of an approximate integral. If the heights were to get infinitely small, we might expect this to approach something like \\(V=\\int_a^b \\pi r(h)^2 dh\\).\nIn fact, we have in general:\nThis formula is derived by approximating the volume by “slabs” with volume \\(A_{xc}(x) \\Delta x\\) and using the Riemann integrals definition to pass to the limit. The discs of the Michelin man are an example, where the cross-sectional area is just that of a circle, or \\(\\pi r^2\\)."
},
{
"objectID": "integrals/volumes_slice.html#solids-of-revolution",
"href": "integrals/volumes_slice.html#solids-of-revolution",
"title": "45  Volumes by slicing",
"section": "45.1 Solids of revolution",
"text": "45.1 Solids of revolution\nWe begin with some examples of a special class of solids - solids of revolution. These have an axis of symmetry from which the slabs are then just circular disks.\nConsider the volume contained in this glass, it will depend on the radius at different values of \\(x\\):\n\n\n\nA wine glass oriented so that it is seen as generated by revolving a curve about the \\(x\\) axis. The radius of revolution varies as a function of \\(x\\) between about \\(0\\) and \\(6.2\\)cm.\n\n\nIf \\(r(x)\\) is the radius as a function of \\(x\\), then the cross sectional area is \\(\\pi r(x)^2\\) so the volume is given by:\n\\[\nV = \\int_a^b \\pi r(x)^2 dx.\n\\]\n\n\n\n\n\n\nNote\n\n\n\nThe formula is for a rotation around the \\(x\\)-axis, but can easily be generalized to rotating around any line (say the \\(y\\)-axis or \\(y=x\\), …) just by adjusting what \\(r(x)\\) is taken to be.\n\n\nFor a numeric example, we consider the original Red Solo Cup. The dimensions of the cup were basically: a top diameter of \\(d_1 = 3~ \\frac{3}{4}\\) inches, a bottom diameter of \\(d_0 = 2~ \\frac{1}{2}\\) inches and a height of \\(h = 4~ \\frac{3}{4}\\) inches.\nThe central axis is straight down. If we rotate the cup so this is the \\(x\\)-axis, then we can get\n\\[\nr(x) = \\frac{d_0}{2} + \\frac{d_1/2 - d_0/2}{h}x = \\frac{5}{4} + \\frac{5}{38}x\n\\]\nThe volume in cubic inches will be:\n\\[\nV = \\int_0^h \\pi r(x)^2 dx\n\\]\nThis is\n\nd0, d1, h = 2.5, 3.75, 4.75\nrad(x) = d0/2 + (d1/2 - d0/2)/h * x\nvol, _ = quadgk(x -> pi * rad(x)^2, 0, h)\n\n(36.917804295114436, 7.105427357601002e-15)\n\n\nSo \\(36.9 \\text{in}^3\\). How many ounces is that? 
It is useful to know that 1 gallon of water is defined as \\(231\\) cubic inches, contains \\(128\\) ounces, and weighs \\(8.34\\) pounds.\nSo our cup holds this many ounces:\n\nozs = vol / 231 * 128\n\n20.456618830193282\n\n\nFull it is about \\(20\\) ounces, though this doesnt really account for the volume taken up by the bottom of the cup, etc.\nIf you are poor with units, Julia can provide some help through the Unitful package. Here the additional UnitfulUS package must also be included, as was done above, to access fluid ounces:\n\nvol * u\"inch\"^3 |> us\"floz\"\n\n20.456618830193282 fl ozᵘˢ\n\n\nBefore Solo “squared” the cup, the Solo cup had markings that - some thought - indicated certain volume amounts.\n\n\n\nMarkings on the red Solo cup indicated various volumes\n\n\nWhat is the height for \\(5\\) ounces (for a glass of wine)? \\(12\\) ounces (for a beer unit)?\nHere the volume is fixed, but the height is not. For \\(v\\) ounces, we need to convert to cubic inches. The conversion is \\(1\\) ounce is \\(231/128 \\text{in}^3\\).\nSo we need to solve \\(v \\cdot (231/128) = \\int_0^h\\pi r(x)^2 dx\\) for \\(h\\) when \\(v=5\\) and \\(v=12\\).\nLets express volume as a function of \\(h\\):\n\nVol(h) = quadgk(x -> pi * rad(x)^2, 0, h)[1]\n\nVol (generic function with 1 method)\n\n\nThen to solve we have:\n\nv₅ = 5\nh5 = find_zero(h -> Vol(h) - v₅ * 231 / 128, 4)\n\n1.5659355800223222\n\n\nand\n\nv₁₂ = 12\nh12 = find_zero(h -> Vol(h) - v₁₂ * 231 / 128, 4)\n\n3.207188125690385\n\n\nAs a percentage of the total height, these are:\n\nh5/h, h12/h\n\n(0.32967064842575206, 0.6751975001453442)\n\n\n\n\n\n\n\n\nNote\n\n\n\nWere performance at issue, Newtons method might also have been considered here, as the derivative is easily computed by the fundamental theorem of calculus.\n\n\n\nExample\nBy rotating the line segment \\(x/r + y/h=1\\) that sits in the first quadrant around the \\(y\\) axis, we will generate a right-circular cone. 
The volume of which can be expressed through the above formula by noting the radius, as a function of \\(y\\), will be \\(R = r(1 - y/h)\\). This gives the well-known volume of a cone:\n\n@syms r h x y\nR = r*(1 - y/h)\nintegrate(pi*R^2, (y, 0, h))\n\n \n\\[\n\\frac{\\pi h r^{2}}{3}\n\\]\n\n\n\nThe frustum of a cone is simply viewed as a cone with its top cut off. If the full cone would have had height \\(h\\) and the frustum stops at height \\(h_1 < h\\), then the volume remaining is \\(\\int_0^{h_1} \\pi r(y)^2 dy = \\frac{\\pi r^2 h}{3}(1 - (1 - h_1/h)^3)\\) - the volume of the full cone less the volume of the smaller cone of height \\(h - h_1\\) that was cut away.\nIt is not unusual to parameterize a cone by the angle \\(\\theta\\) its slanted side makes with its axis and the height. Since \\(r/h=\\tan\\theta\\), this gives the formula \\(V = \\pi/3\\cdot h^3\\tan(\\theta)^2\\).\n\n\nExample\nGabriels horn is a geometric figure of mathematics - but not the real world - which has infinite extent, but finite volume! The figure is found by rotating the curve \\(y=1/x\\) around the \\(x\\) axis from \\(1\\) to \\(\\infty\\). If the volume formula holds, what is the volume of this “horn?”\n\nradius(x) = 1/x\nquadgk(x -> pi*radius(x)^2, 1, Inf)[1]\n\n3.141592653589793\n\n\nThat is a value very reminiscent of \\(\\pi\\), which it is, as the volume is \\(\\pi \\int_1^\\infty 1/x^2 dx\\) and \\(\\int_1^\\infty 1/x^2 dx = -1/x\\big|_1^\\infty=1\\).\n\n\n\n\n\n\nNote\n\n\n\nThe interest in this figure is that soon we will be able to show that it has infinite surface area, leading to the paradox that it seems possible to fill it with paint, but not paint the outside.\n\n\n\nExample\nA movie studio hand is asked to find a prop vase to be used as a Ming vase in an upcoming scene. The dimensions specified are for the outside diameter in centimeters and are given by\n\\[\nd(h) = \\begin{cases}\n2 \\sqrt{26^2 - (h-20)^2} & 0 \\leq h \\leq 44\\\\\n20 \\cdot e^{-(h - 44)/10} & 44 < h \\leq 50.\n\\end{cases}\n\\]\nIf the vase were solid, what would be the volume?\nWe define d using a ternary operator to handle the two cases:\n\nd(h) = h <= 44 ? 
2*sqrt(26^2 - (h-20)^2) : 20 * exp(-(h-44)/10)\nrad(h) = d(h)/2\n\nrad (generic function with 1 method)\n\n\nThe volume in cm\\(^3\\) is then:\n\nVₜ, _ = quadgk(h -> pi * rad(h)^2, 0, 50)\n\n(71687.1744525789, 0.00030474267730795646)\n\n\nFor the actual shoot, the vase is to be filled with ash, to simulate a funeral urn. (It will then be knocked over in a humorous manner, of course.) How much ash is needed if the vase has walls that are 1/2 centimeter thick?\nWe need to subtract \\(0.5\\) from the radius and then recompute:\n\nV_int, _ = quadgk(h -> pi * (rad(h) - 1/2)^2, 1/2, 50)\n\n(68082.16068327641, 0.00044615780792156556)\n\n\nA liter of volume is \\(1000 \\text{cm}^3\\). So this is about \\(68\\) liters, or more than 15 gallons. Perhaps the dimensions given were a bit off.\nWhile we are here, the actual volume of the material in the vase can be computed by subtraction.\n\nVₜ - V_int\n\n3605.013769302488\n\n\n\n\n45.1.1 The washer method\nReturning to the Michelin Man, in our initial back-of-the-envelope calculation we didnt account for the fact that a tire isnt a disc, as it has its center cut out. Continuing, suppose \\(R_i\\) is the outer radius and \\(r_i\\) the inner radius. Then each tire has volume\n\\[\n\\pi R_i^2 h_i - \\pi r_i^2 h_i = \\pi (R_i^2 - r_i^2) h_i.\n\\]\nRather than use \\(\\pi r(x)^2\\) for a cross section, we would use \\(\\pi (R(x)^2 - r(x)^2)\\).\nIn general we call a shape like the tire a “washer” and use this formula for a washers cross section \\(A_{xc}(x) = \\pi(R(x)^2 - r(x)^2)\\).\nThen the volume for the solid of revolution whose cross sections are washers would be:\n\\[\nV = \\int_a^b \\pi \\cdot (R(x)^2 - r(x)^2) dx.\n\\]\n\nExample\nAn artist is working with a half-sphere of material, and wishes to bore out a conical shape. 
What would be the resulting volume, if the two figures are modeled by\n\\[\nR(x) = \\sqrt{1^2 - (x-1)^2}, \\quad r(x) = x,\n\\]\nwith \\(x\\) ranging from \\(x=0\\) to \\(1\\)?\nThe answer comes by integrating:\n\nRad(x) = sqrt(1 - (x-1)^2)\nrad(x) = x\nV, _ = quadgk(x -> pi*(Rad(x)^2 - rad(x)^2), 0, 1)\n\n(1.0471975511965974, 0.0)"
},
{
"objectID": "integrals/volumes_slice.html#solids-with-known-cross-section",
"href": "integrals/volumes_slice.html#solids-with-known-cross-section",
"title": "45  Volumes by slicing",
"section": "45.2 Solids with known cross section",
"text": "45.2 Solids with known cross section\nThe Dart cup company now produces the red solo cup with a square cross section. Suppose the dimensions are the same: a top diameter of \\(d_1 = 3 3/4\\) inches, a bottom diameter of \\(d_0 = 2 1/2\\) inches and a height of \\(h = 4 3/4\\) inches. What is the volume now?\nThe difference, of course, is that cross sections now have area \\(d^2\\), as opposed to \\(\\pi r^2\\). This leads to some difference, which we quantify as follows:\n\nd0, d1, h = 2.5, 3.75, 4.75\nd(x) = d0 + (d1 - d0)/h * x\nvol, _ = quadgk(x -> d(x)^2, 0, h)\nvol / 231 * 128\n\n26.046176046176043\n\n\nThis shape would have more volume - the cross sections are bigger. Presumably the dimensions have changed. Without going out and buying a cup, lets assume the diagonal of the square cross section remained the same as the old diameter, not the side length. This means the largest dimension of a cross section is the same. The diagonal of a square is \\(\\sqrt{2}\\) times larger than its side. What would this do to the area?\nWe could do this two ways: divide \\(d_0\\) and \\(d_1\\) by \\(\\sqrt{2}\\) and recompute. However, each cross section of this narrower cup would simply be a factor of \\(\\sqrt{2}^2 = 2\\) smaller, so the total volume would be halved, or about \\(13\\) ounces. We have that \\(26.04\\) ounces is too big and \\(13.02\\) is too small, so some other overall dimensions must be in use.\n\nExample\nFor a general cone, we use this definition:\n\nA cone is the solid figure bounded by a base in a plane and by a surface (called the lateral surface) formed by the locus of all straight line segments joining the apex to the perimeter of the base.\n\nLet \\(h\\) be the distance from the apex to the base. Consider cones with the property that all planes parallel to the base intersect the cone with the same shape, though perhaps a different scale. This figure shows an example, with the rays coming from the apex defining the volume.\n\n\n\n\n\nA right circular cone is one where this shape is a circle. 
This definition can be more general, as a square-based right pyramid is also such a cone. After possibly reorienting the cone in space so the base is at \\(u=0\\) and the apex at \\(u=h\\) the volume of the cone can be found from:\n\\[\nV = \\int_0^h A_{xc}(u) du.\n\\]\nThe cross sectional area \\(A_{xc}(u)\\) satisfies a formula in terms of \\(A_{xc}(0)\\), the area of the base:\n\\[\nA_{xc}(u) = A_{xc}(0) \\cdot (1 - \\frac{u}{h})^2\n\\]\nSo the integral becomes, using the substitution \\(v = 1 - u/h\\) (so \\(du = -h\\,dv\\)):\n\\[\nV = \\int_0^h A_{xc}(u) du = A_{xc}(0) \\int_0^h (1 - \\frac{u}{h})^2 du = A_{xc}(0) \\int_0^1 v^2 h dv = A_{xc}(0) \\frac{h}{3}.\n\\]\nThis gives a general formula for the volume of such cones.\n\n\n45.2.1 Cavalieris method\nCavalieris Principle is “Suppose two regions in three-space (solids) are included between two parallel planes. If every plane parallel to these two planes intersects both regions in cross-sections of equal area, then the two regions have equal volumes.” (Wikipedia).\nWith the formula for the volume of solids based on cross sections, this is a trivial observation, as the functions giving the cross-sectional area are identical. Still, it can be surprising. Consider a sphere with an interior cylinder bored out of it. (The Napkin ring problem.) The bore has height \\(h\\) - for larger radius spheres this means very wide bores.\n\n\n\n\n\nThe small orange line is rotated, so using the washer method we get the cross sections given by \\(\\pi(r_0^2 - r_i^2)\\), the outer and inner radii, as a function of \\(y\\).\nThe outer radius comes from points \\((x,y)\\) satisfying \\(x^2 + y^2 = R^2\\), so is \\(\\sqrt{R^2 - y^2}\\). 
The inner radius has a constant value, and as indicated in the figure, is \\(\\sqrt{R^2 - (h/2)^2}\\), by the Pythagorean theorem.\nThus the cross sectional area is\n\\[\n\\pi( (\\sqrt{R^2 - y^2})^2 - (\\sqrt{R^2 - (h/2)^2})^2 )\n= \\pi ((R^2 - y^2) - (R^2 - (h/2)^2))\n= \\pi ((\\frac{h}{2})^2 - y^2)\n\\]\nAs this does not depend on \\(R\\), and the limits of integration would always be \\(-h/2\\) to \\(h/2\\) by Cavalieris principle, the volume of the solid will be independent of \\(R\\) too.\nTo actually compute this volume, we take \\(R=h/2\\), so that the bore hole is just a line of no volume; the resulting volume is then that of a sphere with radius \\(h/2\\), or \\(4/3\\pi(h/2)^3 = \\pi h^3/6\\)."
},
{
"objectID": "integrals/volumes_slice.html#the-second-theorem-of-pappus",
"href": "integrals/volumes_slice.html#the-second-theorem-of-pappus",
"title": "45  Volumes by slicing",
"section": "45.3 The second theorem of Pappus",
"text": "45.3 The second theorem of Pappus\nThe second theorem of Pappus says that if a plane figure \\(F\\) is rotated around an axis to form a solid of revolution, the total volume can be written as \\(2\\pi r A(F)\\), where \\(r\\) is the distance the centroid is from the axis of revolution, and \\(A(F)\\) is the area of the plane figure. In short, the distance traveled by the centroid times the area.\nThis can make some computations trivial. For example, we can make a torus (or donut) by rotating the circle \\((x-2)^2 + y^2 = 1\\) about the \\(y\\) axis. As the centroid is clearly \\((2, 0)\\), with \\(r=2\\) in the above formula, and the area of the circle is \\(\\pi 1^2\\), the volume of the donut is \\(2\\pi(2)(\\pi) = 4\\pi^2\\).\n\nExample\nAbove, we found the volume of a cone, as it is a solid of revolution, through the general formula. However, parameterizing the cone as the revolution of a triangle with vertices \\((0,0)\\), \\((r, 0)\\), and \\((0,h)\\) and using the formula for the center of mass in the \\(x\\) direction of such a triangle, \\(r/3\\), we get that the volume of a cone with height \\(h\\) and radius \\(r\\) is \\(2\\pi (r/3)\\cdot (rh/2) = \\pi r^2 h/3\\), in agreement with the calculus based computation."
},
{
"objectID": "integrals/volumes_slice.html#questions",
"href": "integrals/volumes_slice.html#questions",
"title": "45  Volumes by slicing",
"section": "45.4 Questions",
"text": "45.4 Questions\n\nQuestion\nConsider this big Solo cup:\n\n\n\nBig solo cup.\n\n\nIt has approximate dimensions: smaller radius 5 feet, upper radius 8 feet and height 15 feet. How many gallons is it? At \\(8\\) pounds a gallon this would be pretty heavy!\nTwo facts are useful:\n\na cubic foot is 7.48052 gallons\nthe radius as a function of height is \\(r(h) = 5 + (3/15)\\cdot h\\)\n\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIn Glass Shape Influences Consumption Rate for Alcoholic Beverages the authors demonstrate that the shape of the glass can have an effect on the rate of consumption, presumably people drink faster when they arent sure how much they have left. In particular, they comment that people have difficulty judging the half-finished-by-volume mark.\nThis figure shows some of the wide variety of beer-serving glasses:\n\n\n\nA variety of different serving glasses for beer.\n\n\nWe work with metric units, as there is a natural relation between volume in cm\\(^3\\) and liquid measure (\\(1\\) liter = \\(1000\\) cm\\(^3\\), so a \\(16\\)-oz pint glass is roughly \\(450\\) cm\\(^3\\).)\nLet two glasses be given as follows. 
A typical pint glass with linearly increasing radius:\n\\[\nr(h) = 3 + \\frac{1}{5}h, \\quad 0 \\leq h \\leq b;\n\\]\nand a curved-edge one:\n\\[\ns(h) = 2 + \\log(1 + h), \\quad 0 \\leq h \\leq b\n\\]\nThe following functions find the volume as a function of height, \\(h\\):\n\nr1(h) = 3 + h/5\ns1(h) = 2 + log(1 + h)\nr_vol(h) = quadgk(x -> pi*r1(x)^2, 0, h)[1]\ns_vol(h) = quadgk(x -> pi*s1(x)^2, 0, h)[1]\n\ns_vol (generic function with 1 method)\n\n\nFor the straight-sided glass find \\(h\\) so that the volume is \\(450\\).\n\nFor the straight-sided glass find \\(h\\) so that the volume is \\(225\\) (half full).\n\nFor the straight-sided glass, what is the percentage of the total height when the glass is half full? (For a cylinder it would just be 50.)\n\n percent \n\nPeople often confuse the half-way by height amount for the half way by volume, as it is for the cylinder. Take the height for the straight-sided glass filled with \\(450\\) ml, divide it by \\(2\\), then compute the percentage of volume at the half way height to the original.\n\n percent \n\nFor the curved-sided glass find \\(h\\) so that the volume is \\(450\\).\n\nFor the curved-sided glass find \\(h\\) so that the volume is \\(225\\) (half full).\n\nFor the curved-sided glass, what is the percentage of the total height when the glass is half full? (For a cylinder it would just be 50.)\n\n percent \n\nPeople often confuse the half-way by height amount for the half way by volume, as it is for the cylinder. 
Take the height for the curved-sided glass filled with \\(450\\) ml, divide it by \\(2\\), then compute the percentage of volume at the half way height to the original.\n\n percent \n\nQuestion\nA right pyramid has its apex (top point) above the centroid of its base, and for our purposes, each of its cross sections. Suppose a pyramid has a square base of dimension \\(w\\) and height of dimension \\(h\\).\nWill this integral give the volume:\n\\[\nV = \\int_0^h w^2 (1 - \\frac{y}{h})^2 dy?\n\\]\n\nYes\n\nNo\n\nWhat is the volume?\n\n\\(l\\cdot w \\cdot h/ 3\\)\n\n\\(1/3 \\cdot w^2\\cdot h\\)\n\n\\(1/3 \\cdot b\\cdot h\\)\n\nQuestion\nAn ellipsoid is formed by rotating the region in the first and second quadrants bounded by the ellipse \\((x/2)^2 + (y/3)^2=1\\) and the \\(x\\) axis around the \\(x\\) axis. What is the volume of this ellipsoid? Find it numerically.\n\nQuestion\nAn ellipsoid is formed by rotating the region in the first and second quadrants bounded by the ellipse \\((x/a)^2 + (y/b)^2=1\\) and the \\(x\\) axis around the \\(x\\) axis. What is the volume of this ellipsoid? Find it symbolically.\n\n\\(\\pi/3 \\cdot a b^2\\)\n\n\\(4/3 \\cdot \\pi a b^2\\)\n\n\\(4/3 \\cdot \\pi a^2 b\\)\n\nQuestion\nA solid is generated by rotating the region enclosed by the graph \\(y=\\sqrt{x}\\), the lines \\(x=1\\), \\(x=2\\), and \\(y=1\\) about the \\(x\\) axis. 
Find the volume of the solid.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe region enclosed by the graphs of \\(y=x^3 - 1\\) and \\(y=x-1\\) are rotated around the \\(y\\) axis. What is the volume of the solid?\n\n@syms x\nplot(x^3 - 1, 0, 1, legend=false)\nplot!(x-1)\n\n\n\n\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nRotate the region bounded by \\(y=e^x\\), the line \\(x=\\log(2)\\) and the first quadrant about the line \\(x=\\log(2)\\).\n(Be careful, the radius in the formula \\(V=\\int_a^b \\pi r(u)^2 du\\) is from the line \\(x=\\log(2)\\).)\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the volume of rotating the region bounded by the line \\(y=x\\), \\(x=1\\) and the \\(x\\)-axis around the line \\(y=x\\). (The Theorem of Pappus is convenient and the fact that the centroid of the triangular region lies at \\((2/3, 1/3)\\).)\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nRotate the region bounded by the line \\(y=x\\) and the function \\(f(x) = x^2\\) about the line \\(y=x\\). What is the resulting volume?\nYou can integrate in the length along the line \\(y=x\\) (\\(u\\) from \\(0\\) to \\(\\sqrt{2}\\)). The radius then can be found by intersecting the line perpendicular line to \\(y=x\\) at \\(u\\) to the curve \\(f(x)\\). This will do so:\n\ntheta = pi/4 ## we write y=x as y = x * tan(pi/4) for more generality, as this allows other slants.\n\nf(x) = x^2\n𝒙(u) = find_zero(x -> u*sin(theta) - 1/tan(theta) * (x - u*cos(theta)) - f(x), (u*cos(theta), 1))\n𝒓(u) = sqrt((u*cos(theta) - 𝒙(u))^2 + (u*sin(theta) - f(𝒙(u)))^2)\n\n𝒓 (generic function with 1 method)\n\n\n(Though in this case you can also find r(u) using the quadratic formula.)\nWith this, find the volume.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nRepeat (find the volume) only this time with the function \\(f(x) = x^{20}\\)."
},
{
"objectID": "integrals/arc_length.html",
"href": "integrals/arc_length.html",
"title": "46  Arc length",
"section": "",
"text": "This section uses these add-on packages:\nThe length of the jump rope in the picture can be computed by either looking at the packaging it came in, or measuring the length of each plastic segment and multiplying by the number of segments. The former is easier, the latter provides the intuition as to how we can find the length of curves in the \\(x-y\\) plane. The idea is old, Archimedes used fixed length segments of polygons to approximate \\(\\pi\\) using the circumference of circle producing the bounds \\(3~\\frac{1}{7} > \\pi > 3~\\frac{10}{71}\\).\nA more modern application is the algorithm used by GPS devices to record a path taken. However, rather than record times for a fixed distance traveled, the GPS device records position (\\((x,y)\\)) or longitude and latitude at fixed units of time - similar to how parametric functions are used. The device can then compute distance traveled and speed using some familiar formulas."
},
{
"objectID": "integrals/arc_length.html#arc-length-formula",
"href": "integrals/arc_length.html#arc-length-formula",
"title": "46  Arc length",
"section": "46.1 Arc length formula",
"text": "46.1 Arc length formula\nRecall the distance formula gives the distance between two points: \\(\\sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2}\\).\nConsider now two functions \\(g(t)\\) and \\(f(t)\\) and the parameterized graph between \\(a\\) and \\(b\\) given by the points \\((g(t), f(t))\\) for \\(a \\leq t \\leq b\\). Assume that both \\(g\\) and \\(f\\) are differentiable on \\((a,b)\\) and continuous on \\([a,b]\\) and furthermore that \\(\\sqrt{g'(t)^2 + f'(t)^2}\\) is Riemann integrable.\n\nThe arc length of a curve. For \\(f\\) and \\(g\\) as described, the arc length of the parameterized curve is given by\n\\(L = \\int_a^b \\sqrt{g'(t)^2 + f'(t)^2} dt.\\)\nFor the special case of the graph of a function \\(f(x)\\) between \\(a\\) and \\(b\\) the formula becomes \\(L = \\int_a^b \\sqrt{ 1 + f'(x)^2} dx\\) (taking \\(g(t) = t\\)).\n\n\n\n\n\n\n\nNote\n\n\n\nThe form of the integral may seem daunting with the square root and the derivatives. A more general writing would create a vector out of the two functions: \\(\\phi(t) = \\langle g(t), f(t) \\rangle\\). It is natural to then let \\(\\phi'(t) = \\langle g'(t), f'(t) \\rangle\\). With this, the integrand is just the norm - or length - of the derivative, or \\(L=\\int \\| \\phi'(t) \\| dt\\). This is similar to the distance traveled being the integral of the speed, or the absolute value of the derivative of position.\n\n\nTo see why, any partition of the interval \\([a,b]\\) by \\(a = t_0 < t_1 < \\cdots < t_n =b\\) gives rise to \\(n+1\\) points in the plane given by \\((g(t_i), f(t_i))\\).\n\n\n \n The arc length of the parametric curve can be approximated using straight line segments connecting points. 
This gives rise to an integral expression defining the length in terms of the functions \\(f\\) and \\(g\\).\n \n \n\n\n\nThe distance between points \\((g(t_i), f(t_i))\\) and \\((g(t_{i-1}), f(t_{i-1}))\\) is just\n\\[\nd_i = \\sqrt{(g(t_i)-g(t_{i-1}))^2 + (f(t_i)-f(t_{i-1}))^2}\n\\]\nThe total approximate distance of the curve would be \\(L_n = d_1 + d_2 + \\cdots + d_n\\). This is exactly how we would compute the length of the jump rope or the distance traveled from GPS recordings.\nHowever, differences, such as \\(f(t_i)-f(t_{i-1})\\), are the building blocks of approximate derivatives. With an eye towards this, we multiply both top and bottom by \\(t_i - t_{i-1}\\) to get:\n\\[\nL_n = d_1 \\cdot \\frac{t_1 - t_0}{t_1 - t_0} + d_2 \\cdot \\frac{t_2 - t_1}{t_2 - t_1} + \\cdots + d_n \\cdot \\frac{t_n - t_{n-1}}{t_n - t_{n-1}}.\n\\]\nBut looking at each term, we can push the denominator into the square root as:\n\\[\n\\begin{align*}\nd_i &= d_i \\cdot \\frac{t_i - t_{i-1}}{t_i - t_{i-1}}\n\\\\\n&= \\sqrt{ \\left(\\frac{g(t_i)-g(t_{i-1})}{t_i-t_{i-1}}\\right)^2 +\n\\left(\\frac{f(t_i)-f(t_{i-1})}{t_i-t_{i-1}}\\right)^2} \\cdot (t_i - t_{i-1}) \\\\\n&= \\sqrt{ g'(\\xi_i)^2 + f'(\\psi_i)^2} \\cdot (t_i - t_{i-1}).\n\\end{align*}\n\\]\nThe values \\(\\xi_i\\) and \\(\\psi_i\\) are guaranteed by the mean value theorem and must be in \\([t_{i-1}, t_i]\\).\nWith this, if \\(\\sqrt{f'(t)^2 + g'(t)^2}\\) is integrable, as assumed, then as the size of the partition goes to zero, the sum of the \\(d_i\\), \\(L_n\\), must converge to the integral:\n\\[\nL = \\int_a^b \\sqrt{f'(t)^2 + g'(t)^2} dt.\n\\]\n(This needs a technical adjustment to the Riemann theorem, as we are evaluating our function at two points in the interval. A general proof is here.)\n\n\n\n\n\n\nNote\n\n\n\nBressoud notes that Gregory (1668) proved this formula for arc length of the graph of a function by showing that the length of the curve \\(f(x)\\) is defined by the area under \\(\\sqrt{1 + f'(x)^2}\\). 
(It is commented that this was also known a bit earlier by von Heurat.) Gregory went further though, as part of the fundamental theorem of calculus was contained in his work. Gregory then posed this inverse question: given a curve \\(y=g(x)\\) find a function \\(u(x)\\) so that the area under \\(g\\) is equal to the length of the second curve. The answer given was \\(u(x) = (1/c)\\int_a^x \\sqrt{g^2(t) - c^2}\\), which if \\(g(t) = \\sqrt{1 + f'(t)^2}\\) and \\(c=1\\) says \\(u(x) = \\int_a^x f'(t)dt\\).\nAn analogy might be a sausage maker. These take a mass of ground-up sausage material and return a long length of sausage. The material going in would depend on time via an equation like \\(\\int_0^t g(u) du\\) and the length coming out would be a constant (accounting for the cross section) times \\(u(t) = \\int_0^t \\sqrt{1 + g'(s)^2} ds\\).\n\n\nExamples\nLet \\(f(x) = x^2\\). The arc length of the graph of \\(f(x)\\) over \\([0,1]\\) is then \\(L=\\int_0^1 \\sqrt{1 + (2x)^2} dx\\). A substitution of \\(2x = \\sinh(\\theta)\\) (or the trigonometric substitution \\(2x = \\tan(\\theta)\\)) leads to the antiderivative:\n\n@syms x\nF = integrate(sqrt(1 + (2x)^2), x)\n\n \n\\[\n\\frac{x \\sqrt{4 x^{2} + 1}}{2} + \\frac{\\operatorname{asinh}{\\left(2 x \\right)}}{4}\n\\]\n\n\n\n\nF(1) - F(0)\n\n \n\\[\n\\frac{\\operatorname{asinh}{\\left(2 \\right)}}{4} + \\frac{\\sqrt{5}}{2}\n\\]\n\n\n\nThat number has some context, as can be seen from the graph, which gives simple lower and upper bounds of \\(\\sqrt{1^2 + 1^2} = 1.414...\\) and \\(1 + 1 = 2\\).\n\nf(x) = x^2\nplot(f, 0, 1)\n\n\n\n\n\n\n\n\n\n\nNote\n\n\n\nThe integrand \\(\\sqrt{1 + f'(x)^2}\\) may seem odd at first, but it can be interpreted as the length of the hypotenuse of a right triangle with “run” of \\(1\\) and “rise” of \\(f'(x)\\). This triangle is easily formed using the tangent line to the graph of \\(f(x)\\). 
By multiplying by \\(dx\\), the integral is “summing” up the lengths of infinitesimal pieces of the tangent line approximation.\n\n\n\nExample\nLet \\(f(t) = R\\cos(t)\\) and \\(g(t) = R\\sin(t)\\). Then the parametric curve over \\([0, 2\\pi]\\) is a circle. As the curve does not wrap around, the arc-length of the curve is just the circumference of the circle. To see that the arc length formula gives us familiar answers, we have:\n\\[\nL = \\int_0^{2\\pi} \\sqrt{(R\\cos(t))^2 + (-R\\sin(t))^2} dt = R\\int_0^{2\\pi} \\sqrt{\\cos(t)^2 + \\sin(t)^2} dt =\nR\\int_0^{2\\pi} dt = 2\\pi R.\n\\]\n\n\nExample\nLet \\(f(x) = \\log(x)\\). Find the length of the graph of \\(f\\) over \\([1/e, e]\\).\nThe answer is\n\\[\nL = \\int_{1/e}^e \\sqrt{1 + \\left(\\frac{1}{x}\\right)^2} dx.\n\\]\nThis has a messy antiderivative, so we let SymPy compute for us:\n\nex = integrate(sqrt(1 + (1/x)^2), (x, 1/sympy.E, sympy.E)) # sympy.E is symbolic\n\n \n\\[\n- \\frac{1}{\\sqrt{e^{-2} + 1}} - \\operatorname{asinh}{\\left(e^{-1} \\right)} - \\frac{1}{\\sqrt{e^{-2} + 1} e^{2}} + \\frac{1}{\\sqrt{1 + e^{2}}} + \\operatorname{asinh}{\\left(e \\right)} + \\frac{e^{2}}{\\sqrt{1 + e^{2}}}\n\\]\n\n\n\nWhich isnt so satisfying. From a quick graph, we see the answer should be no more than 4, and we see in fact it is\n\nN(ex)\n\n3.196198513599507\n\n\n\n\nExample\nA catenary shape is the shape a hanging chain will take as it is suspended between two posts. It appears elsewhere, for example, power wires will also have this shape as they are suspended between towers. A formula for a catenary can be written in terms of the hyperbolic cosine, cosh in julia or exponentials.\n\\[\ny = a \\cosh(x/a) = a \\cdot \\frac{e^{x/a} + e^{-x/a}}{2}.\n\\]\nSuppose we have the following chain hung between \\(x=-1\\) and \\(x=1\\) with \\(a = 2\\):\n\nchain(x; a=2) = a * cosh(x/a)\nplot(chain, -1, 1)\n\n\n\n\nHow long is the chain? 
Looking at the graph we can guess an answer is between \\(2\\) and \\(2.5\\), say, but it isnt much work to get an approximate numeric answer. Recall, the accompanying CalculusWithJulia package defines f' to find the derivative using the ForwardDiff package.\n\nquadgk(x -> sqrt(1 + chain'(x)^2), -1, 1)[1]\n\n2.0843812219749895\n\n\nWe used a numeric approach, but this can be solved by hand and the answer is surprising.\n\n\nExample\nThis picture of Jasper Johns Near the Lagoon was taken at The Art Institute Chicago.\n\n\n\nOne of Jasper Johns Catenary series. Art Institute of Chicago.\n\n\nThe museum notes have\n\nFor his Catenary series (19972003), of which Near the Lagoon is the largest and last work, Johns formed catenaries—a term used to describe the curve assumed by a cord suspended freely from two points—by tacking ordinary household string to the canvas or its supports.\n\nThis particular catenary has a certain length. The basic dimensions are \\(78\\)in wide and \\(118\\)in drop. We shift the basic function for catenaries to have \\(f(78/2) = f(-78/2) = 0\\) and \\(f(0) = -118\\) (the top curve segment is on the \\(x\\) axis and centered). We let our shifted function be parameterized by\n\\[\nf(x; a, b) = a \\cosh(x/a) - b.\n\\]\nEvaluating at \\(0\\) gives:\n\\[\n-118 = a - b \\text{ or } b = a + 118.\n\\]\nEvaluating at \\(78/2\\) gives: \\(a \\cdot \\cosh(78/(2a)) - (a + 118) = 0\\). This can be solved numerically for a:\n\ncat(x; a=1, b=0) = a*cosh(x/a) - b\nfind_zero(a -> cat(78/2, a=a, b=118 + a), 10)\n\n12.994268574805428\n\n\nRounding, we take \\(a=13\\). With these parameters (\\(a=13\\), \\(b = 131\\)), we compute the length of Johns catenary in inches:\n\na = 13\nb = 118 + a\nf(x) = cat(x; a=13, b=118+13)\nquadgk(x -> sqrt(1 + f'(x)^2), -78/2, 78/2)[1]\n\n260.46474811265745\n\n\n\n\nExample\nSuspension bridges, like the Verrazzano-Narrows Bridge, have different loading than a cable and hence a different shape. 
A parabola is the shape the cable takes under uniform loading (cf. page 19 for a picture).\nThe Verrazzano-Narrows Bridge has a span of \\(1298\\)m.\nSuppose the drop of the main cables is \\(147\\) meters over this span. Then the cable itself can be modeled as a parabola with\n\nThe \\(x\\)-intercepts \\(a = 1298/2\\) and \\(-a\\) and\nvertex \\((0,b)\\) with \\(b=-147\\).\n\nThe parabola that fits these three points is\n\\[\ny = \\frac{-b}{a^2}(x^2 - a^2)\n\\]\nFind the length of the cable in meters.\n\na = 1298/2;\nb = -147;\nf(x) = (-b/a^2)*(x^2 - a^2);\nval, _ = quadgk(x -> sqrt(1 + f'(x)^2), -a, a)\nval\n\n1341.1191077638794\n\n\n\n\n\nThe Verrazzano-Narrows Bridge during construction. The unloaded suspension cables form a catenary.\n\n\n\n\n\nA rendering of the Verrazzano-Narrows Bridge after construction (cf. nycgovparks.org). The uniformly loaded suspension cables would form a parabola, presumably a fact the artist of this rendering knew. (The spelling in the link is not the official spelling, which carries two zs.)\n\n\n\n\nExample\nThe nephroid is a curve that can be described parametrically by\n\\[\n\\begin{align*}\ng(t) &= a(3\\cos(t) - \\cos(3t)), \\\\\nf(t) &= a(3\\sin(t) - \\sin(3t)).\n\\end{align*}\n\\]\nTaking \\(a=1\\) we have this graph:\n\na = 1\n𝒈(t) = a*(3cos(t) - cos(3t))\n𝒇(t) = a*(3sin(t) - sin(3t))\nplot(𝒈, 𝒇, 0, 2pi)\n\n\n\n\nFind the length of the perimeter of the closed figure formed by the graph.\nWe have \\(\\sqrt{g'(t)^2 + f'(t)^2} = \\sqrt{18 - 18\\cos(2t)}\\). An antiderivative isnt forthcoming through SymPy, so we take a numeric approach to find the length:\n\nquadgk(t -> sqrt(𝒈'(t)^2 + 𝒇'(t)^2), 0, 2pi)[1]\n\n23.999999999999993\n\n\nThe answer seems like a floating point approximation of \\(24\\), which suggests that this integral is tractable. 
Pursuing this, the integrand simplifies:\n\\[\n\\begin{align*}\n\\sqrt{g'(t)^2 + f'(t)^2}\n&= \\sqrt{(-3\\sin(t) + 3\\sin(3t))^2 + (3\\cos(t) - 3\\cos(3t))^2} \\\\\n&= 3\\sqrt{(\\sin(t)^2 - 2\\sin(t)\\sin(3t) + \\sin(3t)^2) + (\\cos(t)^2 -2\\cos(t)\\cos(3t) + \\cos(3t)^2)} \\\\\n&= 3\\sqrt{(\\sin(t)^2+\\cos(t)^2) + (\\sin(3t)^2 + \\cos(3t)^2) - 2(\\sin(t)\\sin(3t) + \\cos(t)\\cos(3t))}\\\\\n&= 3\\sqrt{2(1 - (\\sin(t)\\sin(3t) + \\cos(t)\\cos(3t)))}\\\\\n&= 3\\sqrt{2}\\sqrt{1 - \\cos(2t)}\\\\\n&= 3\\sqrt{2}\\sqrt{2\\sin(t)^2}.\n\\end{align*}\n\\]\nThe second to last line comes from the angle subtraction formula applied to \\(\\cos(3t - t)\\) and the last line from the half angle formula for \\(\\cos\\).\nBy graphing, we see that integrating over \\([0,2\\pi]\\) gives twice the answer to integrating over \\([0, \\pi]\\), which allows the simplification to:\n\\[\nL = \\int_0^{2\\pi} \\sqrt{g'(t)^2 + f'(t)^2}dt = \\int_0^{2\\pi} 3\\sqrt{2}\\sqrt{2\\sin(t)^2} dt =\n3 \\cdot 2 \\cdot 2 \\int_0^\\pi \\sin(t) dt = 3 \\cdot 2 \\cdot 2 \\cdot 2 = 24.\n\\]\n\n\nExample\nA teacher of small children assigns his students the task of computing the length of a jump rope by counting the number of \\(1\\)-inch segments it is made of. He knows that if a student is accurate, no matter how fast or slow they count, the answer will be the same. (That is, unless the student starts counting in the wrong direction by mistake.) The teacher knows this, as he is certain that the length of a curve is independent of its parameterization, as it is a property intrinsic to the curve.\nMathematically, suppose a curve is described parametrically by \\((g(t), f(t))\\) for \\(a \\leq t \\leq b\\). A new parameterization is provided by \\(\\gamma(t)\\). Suppose \\(\\gamma\\) is strictly increasing, so that an inverse function exists. (This assumption is implicitly made by the teacher, as it implies the student won't start counting in the wrong direction.) 
Then the same curve is described by composition through \\((g(\\gamma(u)), f(\\gamma(u)))\\), \\(\\gamma^{-1}(a) \\leq u \\leq \\gamma^{-1}(b)\\). That the arc length is the same follows from substitution:\n\\[\n\\begin{align*}\n\\int_{\\gamma^{-1}(a)}^{\\gamma^{-1}(b)} \\sqrt{([g(\\gamma(t))]')^2 + ([f(\\gamma(t))]')^2} dt\n&=\\int_{\\gamma^{-1}(a)}^{\\gamma^{-1}(b)} \\sqrt{(g'(\\gamma(t) )\\gamma'(t))^2 + (f'(\\gamma(t) )\\gamma'(t))^2 } dt \\\\\n&=\\int_{\\gamma^{-1}(a)}^{\\gamma^{-1}(b)} \\sqrt{g'(\\gamma(t))^2 + f'(\\gamma(t))^2} \\gamma'(t) dt\\\\\n&=\\int_a^b \\sqrt{g'(u)^2 + f'(u)^2} du = L\n\\end{align*}\n\\]\n(Using \\(u=\\gamma(t)\\) for the substitution.)\nIn traveling there are two natural parameterizations: one by time, as in “how long have we been driving?”; and the other by distance, as in “how far have we been driving?” Parameterizing by distance, or more technically arc length, has other mathematical advantages.\nTo parameterize by arc length, we just need to consider a special \\(\\gamma\\) defined by:\n\\[\n\\gamma(u) = \\int_0^u \\sqrt{g'(t)^2 + f'(t)^2} dt.\n\\]\nSupposing \\(\\sqrt{g'(t)^2 + f'(t)^2}\\) is continuous and positive, this transformation is increasing, as its derivative by the Fundamental Theorem of Calculus is \\(\\gamma'(u) = \\sqrt{g'(u)^2 + f'(u)^2}\\), which by assumption is positive. (It is certainly non-negative.) So there exists an inverse function. That it exists is one thing; computing all of this is a different matter, of course.\nFor a simple example, we have \\(g(t) = R\\cos(t)\\) and \\(f(t)=R\\sin(t)\\) parameterizing the circle of radius \\(R\\). The arc length between \\(0\\) and \\(t\\) is simply \\(\\gamma(t) = Rt\\), which we can easily see from the formula. 
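A quick numeric check of this, taking the illustrative values \\(R=2\\) and \\(t=1\\), integrates the speed and recovers \\(\\gamma(t) = Rt\\):\n\nR, t = 2, 1\nquadgk(u -> sqrt((-R*sin(u))^2 + (R*cos(u))^2), 0, t)[1]   # Rt, here 2.0\n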
The inverse of this function is \\(\\gamma^{-1}(u) = u/R\\), so we get the parameterization \\((g(t/R), f(t/R))\\) for \\(0 \\leq t \\leq 2\\pi R\\).\nWhat looks at first glance to be just a slightly more complicated equation is that of an ellipse, with \\(g(t) = a\\cos(t)\\) and \\(f(t) = b\\sin(t)\\). Taking \\(a=1\\) and \\(b^2 = 1 + c\\), for \\(c > 0\\), we get that the arc length as a function of \\(t\\) is just\n\\[\n\\begin{align*}\ns(u) &= \\int_0^u \\sqrt{(-\\sin(t))^2 + b^2\\cos(t)^2} dt\\\\\n &= \\int_0^u \\sqrt{\\sin(t)^2 + \\cos(t)^2 + c\\cos(t)^2} dt \\\\\n &=\\int_0^u \\sqrt{1 + c\\cos(t)^2} dt.\n\\end{align*}\n\\]\nBut, despite it not looking too daunting, this integral is not tractable through our techniques and has an answer involving elliptic integrals. We can work numerically though. Letting \\(a=1\\) and \\(b=2\\), the arc length is given by:\n\n𝒂, 𝒃 = 1, 2\n𝒔(u) = quadgk(t -> sqrt(𝒂^2 * sin(t)^2 + 𝒃^2 * cos(t)^2), 0, u)[1]\n\n𝒔 (generic function with 1 method)\n\n\nThis has a graph, which does not look familiar, but we can see it is monotonically increasing, so it will have an inverse function:\n\nplot(𝒔, 0, 2pi)\n\n\n\n\nThe range is \\([0, s(2\\pi)]\\).\nThe inverse function can be found by solving; we use the bracketing version of find_zero for this:\n\nsinv(u) = find_zero(x -> 𝒔(x) - u, (0, 𝒔(2pi)))\n\nsinv (generic function with 1 method)\n\n\nHere we see visually that the new parameterization yields the same curve:\n\ng(t) = 𝒂 * cos(t)\nf(t) = 𝒃 * sin(t)\n\nplot(t -> g(sinv(t)), t -> f(sinv(t)), 0, 𝒔(2*pi))\n\n\n\n\n\n\n\nExample: An implication of concavity\nFollowing (faithfully) Kantrowitz and Neumann, we consider a function \\(f(x)\\) with the property that both \\(f\\) and \\(f'\\) are strictly concave down on \\([a,b]\\) and suppose \\(f(a) = f(b)\\). Further, assume \\(f'\\) is continuous. 
We will see this implies facts about arc-length and other integrals related to \\(f\\).\nThe following figure is clearly of a concave down function. The asymmetry about the critical point will be seen to be a result of the derivative also being concave down. This asymmetry will be characterized in several different ways in the following including showing that the arc length from \\((a,0)\\) to \\((c,f(c))\\) is longer than from \\((c,f(c))\\) to \\((b,0)\\).\n\n\n\n\n\nTake \\(a < u < c < v < b\\) with \\(f(u) = f(v)\\) and \\(c\\) a critical point, as in the picture. There must be a critical point by Rolles theorem, and it must be unique, as the derivative, which exists by the assumptions, must be strictly decreasing due to concavity of \\(f\\) and hence there can be at most \\(1\\) critical point.\nSome facts about this picture can be proven from the definition of concavity:\n\nThe slope of the tangent line at \\(u\\) goes up slower than the slope of the tangent line at \\(v\\) declines: \\(f'(u) < -f'(v)\\).\n\nSince \\(f'\\) is strictly concave, we have for any \\(a<u<v<b\\) from the definition of concavity that for all \\(0 \\leq t \\leq 1\\)\n\\[\ntf'(u) + (1-t)f'(v) < f'(tu + (1-t)v).\n\\]\nSo\n\\[\n\\begin{align*}\n\\int_0^1 (tf'(u) + (1-t)f'(v)) dt &< \\int_0^1 f'(tu + (1-t)v) dt, \\text{or}\\\\\n\\frac{f'(u) + f'(v)}{2} &< \\frac{1}{v-u}\\int_u^v f'(w) dw,\n\\end{align*}\n\\]\nby the substitution \\(w = tu + (1-t)v\\). 
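This averaged inequality is easy to check numerically for a sample strictly concave derivative; taking, purely for illustration, \\(f'(w) = \\sqrt{w}\\) and \\(u, v = 1, 4\\), the average of the endpoint values is smaller than the mean value of the integral:\n\nfp(w) = sqrt(w)    # an illustrative strictly concave f'\nu, v = 1, 4\n(fp(u) + fp(v))/2, quadgk(fp, u, v)[1] / (v - u)   # 1.5 < 14/9 = 1.555...\n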
Using the fundamental theorem of calculus to compute the mean value of the integral of \\(f'\\) over \\([u,v]\\) gives the following as a consequence of strict concavity of \\(f'\\):\n\\[\n\\frac{f'(u) + f'(v)}{2} < \\frac{f(v)-f(u)}{v-u}\n\\]\nThe above is true for any \\(u\\) and \\(v\\), but, by assumption, our \\(u\\) and \\(v\\) under consideration have \\(f(u) = f(v)\\), hence it must be \\(f'(u) < -f'(v)\\).\n\nThe critical point is greater than the midpoint between \\(u\\) and \\(v\\): \\((u+v)/2 < c\\).\n\nThe function \\(f\\) restricted to \\([a,c]\\) and \\([c,b]\\) is strictly monotone, as \\(f'\\) only changes sign at \\(c\\). Hence, there are inverse functions, say \\(f_1^{-1}\\) and \\(f_2^{-1}\\), taking \\([0,m]\\) to \\([a,c]\\) and \\([c,b]\\) respectively, where \\(m = f(c)\\) is the maximum value of \\(f\\). The inverses are differentiable, as \\(f'\\) exists, and must satisfy: \\([f_1^{-1}]'(y) > 0\\) (as \\(f'\\) is positive on \\([a,c]\\)) and, similarly, \\([f_2^{-1}]'(y) < 0\\). By the previous result, the inverses also satisfy:\n\\[\n[f_1^{-1}]'(y) > -[f_2^{-1}]'(y)\n\\]\n(The inequality reversed due to the derivative of the inverse function being related to the reciprocal of the derivative of the function.)\nFor any \\(0 \\leq \\alpha < \\beta \\leq m\\) we have:\n\\[\n\\int_{\\alpha}^{\\beta} ([f_1^{-1}]'(y) +[f_2^{-1}]'(y)) dy > 0\n\\]\nBy the fundamental theorem of calculus:\n\\[\n(f_1^{-1}(y) + f_2^{-1}(y))\\big|_\\alpha^\\beta > 0\n\\]\nOn rearranging:\n\\[\nf_1^{-1}(\\alpha) + f_2^{-1}(\\alpha) < f_1^{-1}(\\beta) + f_2^{-1}(\\beta)\n\\]\nThat is, \\(f_1^{-1} + f_2^{-1}\\) is strictly increasing.\nTaking \\(\\beta=m\\) gives a bound in terms of \\(c\\) for any \\(0 \\leq \\alpha < m\\):\n\\[\nf_1^{-1}(\\alpha) + f_2^{-1}(\\alpha) < 2c.\n\\]\nThe result comes from setting \\(\\alpha=f(u)\\); setting \\(\\alpha=0\\) shows the result for \\([a,b]\\).\n\nThe intersection point of the two tangent lines, \\(d\\), satisfies \\((u+v)/2 < d\\).\n\nIf \\(f(u) = f(v)\\), the previously established 
relationship between the slopes of the tangent lines suggests the answer. However, this statement is actually true more generally, with just the assumption that \\(u < v\\) and not necessarily that \\(f(u)=f(v)\\).\nSolving for \\(d\\) from equations of the tangent lines gives\n\\[\nd = \\frac{f(v)-f(u) + uf'(u) - vf'(v)}{f'(u) - f'(v)}\n\\]\nSo \\((u+v)/2 < d\\) can be re-expressed as\n\\[\n\\frac{f'(u) + f'(v)}{2} < \\frac{f(v) - f(u)}{v-u}\n\\]\nwhich holds by the strict concavity of \\(f'\\), as found previously.\n\nLet \\(h=f(u)\\). The areas under \\(f\\) are such that there is more area in \\([a,u]\\) than \\([v,b]\\) and more area under \\(f(x)-h\\) in \\([u,c]\\) than \\([c,v]\\). In particular more area under \\(f\\) over \\([a,c]\\) than \\([c,b]\\).\n\nUsing the substitution \\(x = f_i^{-1}(y)\\) as needed, we see:\n\\[\n\\begin{align*}\n\\int_a^u f(x) dx &= \\int_0^{f(u)} y [f_1^{-1}]'(y) dy \\\\\n&> -\\int_0^h y [f_2^{-1}]'(y) dy \\\\\n&= \\int_h^0 y [f_2^{-1}]'(y) dy \\\\\n&= \\int_v^b f(x) dx.\n\\end{align*}\n\\]\nFor the latter claim, integrating in the \\(y\\) variable gives\n\\[\n\\begin{align*}\n\\int_u^c (f(x)-h) dx &= \\int_h^m (c - f_1^{-1}(y)) dy\\\\\n&> \\int_h^m (c - f_2^{-1}(y)) dy\\\\\n&= \\int_c^v (f(x)-h) dx\n\\end{align*}\n\\]\nNow, the area under \\(h\\) over \\([u,c]\\) is greater than that over \\([c,v]\\) as \\((u+v)/2 < c\\) or \\(v-c < c-u\\). That means the area under \\(f\\) over \\([u,c]\\) is greater than that over \\([c,v]\\).\n\nThere is more arc length for \\(f\\) over \\([a,u]\\) than \\([v,b]\\); more arc length for \\(f\\) over \\([u,c]\\) than \\([c,v]\\). In particular more arc length over \\([a,c]\\) than \\([c,b]\\).\n\nLet \\(\\phi(z) = f_2^{-1}(f_1(z))\\) be the function taking \\(u\\) to \\(v\\), and \\(a\\) to \\(b\\) and moreover the interval \\([a,u]\\) to \\([v,b]\\). Further, \\(f(z) = f(\\phi(z))\\). 
The function is differentiable, as it is a composition of differentiable functions, and for any \\(a \\leq z \\leq u\\) we have\n\\[\nf'(\\phi(z)) \\cdot \\phi'(z) = f'(z) > 0\n\\]\nor, as \\(f'(\\phi(z)) < 0\\), that \\(\\phi'(z) < 0\\). Moreover, we have by the first assertion that \\(f'(z) < -f'(\\phi(z))\\) so \\(|\\phi'(z)| = |f'(z)/f'(\\phi(z))| < 1\\).\nUsing the substitution \\(x = \\phi(z)\\) gives:\n\\[\n\\begin{align*}\n\\int_v^b \\sqrt{1 + f'(x)^2} dx &=\n\\int_u^a \\sqrt{1 + f'(\\phi(z))^2} \\phi'(z) dz\\\\\n&= \\int_a^u \\sqrt{1 + f'(\\phi(z))^2} |\\phi'(z)| dz\\\\\n&= \\int_a^u \\sqrt{\\phi'(z)^2 + (f'(\\phi(z))\\phi'(z))^2} dz\\\\\n&= \\int_a^u \\sqrt{\\phi'(z)^2 + f'(z)^2} dz\\\\\n&< \\int_a^u \\sqrt{1 + f'(z)^2} dz\n\\end{align*}\n\\]\nLetting \\(u \\rightarrow c\\) we get the inequality\n\\[\n\\int_c^b \\sqrt{1 + f'(x)^2} dx \\leq \\int_a^c \\sqrt{1 + f'(x)^2} dx,\n\\]\nwhich must also hold for any paired \\(u,v=\\phi(u)\\). This allows the use of the strict inequality over \\([v,b]\\) and \\([a,u]\\) to give:\n\\[\n\\int_c^b \\sqrt{1 + f'(x)^2} dx < \\int_a^c \\sqrt{1 + f'(x)^2} dx,\n\\]\nwhich would also hold for any paired \\(u, v\\).\nNow, why is this of interest? Previously, we have considered the example of the trajectory of an arrow on a windy day given in function form by:\n\\[\nf(x) = \\left(\\frac{g}{k v_0\\cos(\\theta)} + \\tan(\\theta) \\right) x + \\frac{g}{k^2}\\log\\left(1 - \\frac{k}{v_0\\cos(\\theta)} x \\right)\n\\]\nThis comes from solving the projectile motion equations with a drag force proportional to the velocity. 
This function satisfies:\n\n@syms gₑ::positive, k::positive, v₀::positive, θ::positive, x::positive\nex = (gₑ/(k*v₀*cos(θ)) + tan(θ))*x + gₑ/k^2 * log(1 - k/(v₀*cos(θ))*x)\ndiff(ex, x, x), diff(ex, x, x, x)\n\n(-gₑ/(v₀^2*(k*x/(v₀*cos(θ)) - 1)^2*cos(θ)^2), 2*gₑ*k/(v₀^3*(k*x/(v₀*cos(θ)) - 1)^3*cos(θ)^3))\n\n\nBoth the second and third derivatives are negative (as \\(0 \\leq x < (v_0\\cos(\\theta))/k\\) due to the logarithm term), so both \\(f\\) and \\(f'\\) are strictly concave down. Hence the results above apply. That is, the arrow will fly further as it goes up than as it comes down, and will carve out more area on its way up than on its way down. The trajectory could also show time versus height, and the same would hold, e.g., the arrow would take longer to go up than come down.\nIn general, the drag force need not be proportional to the velocity, but merely in opposite direction to the velocity vector \\(\\langle x'(t), y'(t) \\rangle\\):\n\\[\n-m W(t, x(t), x'(t), y(t), y'(t)) \\cdot \\langle x'(t), y'(t)\\rangle,\n\\]\nwith the case above corresponding to the constant \\(W = k\\). 
The set of equations then satisfy:\n\\[\n\\begin{align*}\nx''(t) &= - W(t,x(t), x'(t), y(t), y'(t)) \\cdot x'(t)\\\\\ny''(t) &= -g - W(t,x(t), x'(t), y(t), y'(t)) \\cdot y'(t)\\\\\n\\end{align*}\n\\]\nwith initial conditions: \\(x(0) = y(0) = 0\\) and \\(x'(0) = v_0 \\cos(\\theta), y'(0) = v_0 \\sin(\\theta)\\).\nOnly with certain drag forces can this set of equations be solved exactly, though it can be approximated numerically for admissible \\(W\\). But if \\(W\\) is strictly positive, then it can be shown \\(x(t)\\) is increasing on \\([0, x_\\infty)\\) and so invertible, and \\(f(u) = y(x^{-1}(u))\\) is three times differentiable with both \\(f\\) and \\(f'\\) being strictly concave, as it can be shown that (say \\(x(v) = u\\) so \\(dv/du = 1/x'(v) > 0\\)):\n\\[\n\\begin{align*}\nf''(u) &= -\\frac{g}{x'(v)^2} < 0\\\\\nf'''(u) &= \\frac{2gx''(v)}{x'(v)^3} \\cdot \\frac{dv}{du} \\\\\n&= -\\frac{2gW}{x'(v)^2} \\cdot \\frac{dv}{du} < 0\n\\end{align*}\n\\]\nThe latter by differentiating, the former a consequence of the following formulas for derivatives of inverse functions:\n\\[\n\\begin{align*}\n[x^{-1}]'(u) &= 1 / x'(v) \\\\\n[x^{-1}]''(u) &= -x''(v)/(x'(v))^3\n\\end{align*}\n\\]\nFor then\n\\[\n\\begin{align*}\nf(u) &= y(x^{-1}(u)) \\\\\nf'(u) &= y'(x^{-1}(u)) \\cdot {x^{-1}}'(u) \\\\\nf''(u) &= y''(x^{-1}(u))\\cdot[x^{-1}]'(u)^2 + y'(x^{-1}(u)) \\cdot [x^{-1}]''(u) \\\\\n &= y''(v) / (x'(v))^2 - y'(v) \\cdot x''(v) / x'(v)^3\\\\\n &= -g/(x'(v))^2 - W y'/(x'(v))^2 - y'(v) \\cdot (- W \\cdot x'(v)) / x'(v)^3\\\\\n &= -g/x'(v)^2.\n\\end{align*}\n\\]"
},
{
"objectID": "integrals/arc_length.html#questions",
"href": "integrals/arc_length.html#questions",
"title": "46  Arc length",
"section": "46.2 Questions",
"text": "46.2 Questions\n\nQuestion\nThe length of the curve given by \\(f(x) = e^x\\) between \\(0\\) and \\(1\\) is certainly longer than the length of the line connecting \\((0, f(0))\\) and \\((1, f(1))\\). What is that length?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe length of the curve is certainly less than the length of going from \\((0,f(0))\\) to \\((1, f(0))\\) and then up to \\((1, f(1))\\). What is the length of this upper bound?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nNow find the actual length of the curve numerically:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the length of the graph of \\(f(x) = x^{3/2}\\) between \\(0\\) and \\(4\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA pursuit curve is a track an optimal pursuer will take when chasing prey. The function \\(f(x) = x^2 - \\log(x)\\) is an example. Find the length of the curve between \\(1/10\\) and \\(2\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFind the length of the graph of \\(f(x) = \\tan(x)\\) between \\(-\\pi/4\\) and \\(\\pi/4\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nNote, the straight line segment should be a close approximation and has length:\n\nsqrt((tan(pi/4) - tan(-pi/4))^2 + (pi/4 - -pi/4)^2)\n\n2.543108550627035\n\n\n\n\nQuestion\nFind the length of the graph of the function \\(g(x) =\\int_0^x \\tan(x)dx\\) between \\(0\\) and \\(\\pi/4\\) by hand or numerically:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA boat sits at the point \\((a, 0)\\) and a man holds a rope taut attached to the boat at the origin \\((0,0)\\). The man walks on the \\(y\\) axis. 
The position \\(y\\) depends then on the position \\(x\\) of the boat, and if the rope is taut, the position satisfies:\n\\[\ny = a \\ln\\frac{a + \\sqrt{a^2 - x^2}}{x} - \\sqrt{a^2 - x^2}\n\\]\nThis can be entered into julia as:\n\nh(x, a) = a * log((a + sqrt(a^2 - x^2))/x) - sqrt(a^2 - x^2)\n\nh (generic function with 2 methods)\n\n\nLet \\(a=12\\), \\(f(x) = h(x, a)\\). Compute the length the bow of the boat has traveled between \\(x=1\\) and \\(x=a\\) using quadgk.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n(The most elementary description of this curve is in terms of the relationship \\(dy/dx = -\\sqrt{a^2-x^2}/x\\) which could be used in place of D(f) in your work.)\n\n\n\n\n\n\nNote\n\n\n\nTo see an example of how the tractrix can be found in an everyday observation, follow this link on a description of bicycle tracks.\n\n\n\n\nQuestion\nSymPy fails with the brute force approach to finding the length of a catenary, but can with a little help:\n\n@syms x::real a::real\nf(x,a) = a * cosh(x/a)\ninside = 1 + diff(f(x,a), x)^2\n\n \n\\[\n\\sinh^{2}{\\left(\\frac{x}{a} \\right)} + 1\n\\]\n\n\n\nJust trying integrate(sqrt(inside), x) will fail, but if we try integrate(sqrt(simplify(inside), x)) an antiderivative can be found. What is it?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\frac{a \\sinh{\\left(\\frac{x}{a} \\right)} \\cosh{\\left(\\frac{x}{a} \\right)}}{2} - \\frac{x \\sinh^{2}{\\left(\\frac{x}{a} \\right)}}{2} + \\frac{x \\cosh^{2}{\\left(\\frac{x}{a} \\right)}}{2}\\)\n \n \n\n\n \n \n \n \n \\(a \\sinh{\\left(\\frac{x}{a} \\right)}\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA curve is parameterized by \\(g(t) = t + \\sin(t)\\) and \\(f(t) = \\cos(t)\\). Find the arc length of the curve between \\(t=0\\) and \\(\\pi\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe astroid is a curve parameterized by \\(g(t) = \\cos(t)^3\\) and \\(f(t) = \\sin(t)^3\\). 
Find the arc length of the curve between \\(t=0\\) and \\(2\\pi\\). (This can be computed by hand or numerically.)\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA curve is parameterized by \\(g(t) = (2t + 3)^{2/3}/3\\) and \\(f(t) = t + t^2/2\\), for \\(0\\leq t \\leq 3\\). Compute the arc-length numerically or by hand:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe cycloid is parameterized by \\(g(t) = a(t - \\sin(t))\\) and \\(f(t) = a(1 - \\cos(t))\\) for \\(a > 0\\). Taking \\(a=3\\), and \\(t\\) in \\([0, 2\\pi]\\), find the length of the curve traced out. (This was solved by the architect and polymath Wren in 1650.)\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nA cycloid parameterized this way can be generated by a circle of radius \\(a\\). Based on this example, what do you think Wren wrote to Pascal about this length:\n\n\n\n \n \n \n \n \n \n \n \n \n The length of the cycloidal arch is exactly two times the radius of the generating circle.\n \n \n\n\n \n \n \n \n The length of the cycloidal arch is exactly four times the radius of the generating circle.\n \n \n\n\n \n \n \n \n The length of the cycloidal arch is exactly eight times the radius of the generating circle.\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\n\n\n\n\nNote\n\n\n\nIn Martin we read why Wren was mailing Pascal:\nAfter demonstrating mathematical talent at an early age, Blaise Pascal turned his attention to theology, denouncing the study of mathematics as a vainglorious pursuit. Then one night, unable to sleep as the result of a toothache, he began thinking about the cycloid and to his surprise, his tooth stopped aching. Taking this as a sign that he had Gods approval to continue, Pascal spent the next eight days studying the curve. During this time he discovered nearly all of the geometric properties of the cycloid. 
He issued some of his results in \\(1658\\) in the form of a contest, offering a prize of forty Spanish gold pieces and a second prize of twenty pieces."
},
{
"objectID": "integrals/surface_area.html",
"href": "integrals/surface_area.html",
"title": "47  Surface Area",
"section": "",
"text": "This section uses these add-on packages:"
},
{
"objectID": "integrals/surface_area.html#surfaces-of-revolution",
"href": "integrals/surface_area.html#surfaces-of-revolution",
"title": "47  Surface Area",
"section": "47.1 Surfaces of revolution",
"text": "47.1 Surfaces of revolution\n\n\n\nThe exterior of the Jimi Hendrix Museum in Seattle has the signature style of its architect Frank Gehry. The surface is comprised of patches. A general method to find the amount of material to cover the surface - the surface area - might be to add up the area of each of the patches. However, in this section we will see for surfaces of revolution, there is an easier way. (Photo credit to firepanjewellery.)\n\n\n\nThe surface area generated by rotating the graph of \\(f(x)\\) between \\(a\\) and \\(b\\) about the \\(x\\)-axis is given by the integral\n\\[\n\\int_a^b 2\\pi f(x) \\cdot \\sqrt{1 + f'(x)^2} dx.\n\\]\nIf the curve is parameterized by \\((g(t), f(t))\\) between \\(a\\) and \\(b\\) then the surface area is\n\\[\n\\int_a^b 2\\pi f(t) \\cdot \\sqrt{g'(t)^2 + f'(t)^2} dx.\n\\]\nThese formulas do not add in the surface area of either of the ends.\n\n\n\n\n\n\nThe above figure shows a cone (the line \\(y=x\\)) presented as a surface of revolution about the \\(x\\)-axis.\nTo see why this formula is as it is, we look at the parameterized case, the first one being a special instance with \\(g(t) =t\\).\nLet a partition of \\([a,b]\\) be given by \\(a = t_0 < t_1 < t_2 < \\cdots < t_n =b\\). This breaks the curve into a collection of line segments. Consider the line segment connecting \\((g(t_{i-1}), f(t_{i-1}))\\) to \\((g(t_i), f(t_i))\\). Rotating this around the \\(x\\) axis will generate something approximating a disc, but in reality will be the frustum of a cone. What will be the surface area?\nConsider a right-circular cone parameterized by an angle \\(\\theta\\) and the largest radius \\(r\\) (so that the height satisfies \\(r/h=\\tan(\\theta)\\)). 
If this cone were made of paper, cut along a side, and laid out flat, it would form a sector of a circle, whose area would be \\(R^2\\gamma/2\\) where \\(R\\) is the radius of the circle (also the side length of our cone), and \\(\\gamma\\) an angle that we can figure out from \\(r\\) and \\(\\theta\\). To do this, we note that the arc length of the circle's edge is \\(R\\gamma\\) and also the circumference of the bottom of the cone, so \\(R\\gamma = 2\\pi r\\). With all this, we can solve to get \\(A = \\pi r^2/\\sin(\\theta)\\). But we have a frustum of a cone with radii \\(r_0\\) and \\(r_1\\), so the surface area is a difference: \\(A = \\pi (r_1^2 - r_0^2) /\\sin(\\theta)\\).\nRelating this to our values in terms of \\(f\\) and \\(g\\), we have \\(r_1=f(t_i)\\), \\(r_0 = f(t_{i-1})\\), and \\(\\sin(\\theta) = \\Delta f / \\sqrt{(\\Delta g)^2 + (\\Delta f)^2}\\), where \\(\\Delta f = f(t_i) - f(t_{i-1})\\) and similarly for \\(\\Delta g\\).\nPutting this all together, we get that the surface area generated by rotating the line segment around the \\(x\\) axis is\n\\[\n\\text{sa}_i = \\pi (f(t_i)^2 - f(t_{i-1})^2) \\cdot \\sqrt{(\\Delta g)^2 + (\\Delta f)^2} / \\Delta f =\n\\pi (f(t_i) + f(t_{i-1})) \\cdot \\sqrt{(\\Delta g)^2 + (\\Delta f)^2}.\n\\]\n(This is \\(2 \\pi\\) times the average radius times the slant height.)\nAs was done in the derivation of the formula for arc length, these pieces are multiplied both top and bottom by \\(\\Delta t = t_{i} - t_{i-1}\\). 
Carrying the bottom inside the square root and noting that by the mean value theorem \\(\\Delta g/\\Delta t = g'(\\xi)\\) and \\(\\Delta f/\\Delta t = f'(\\psi)\\) for some \\(\\xi\\) and \\(\\psi\\) in \\([t_{i-1}, t_i]\\), this becomes:\n\\[\n\\text{sa}_i = \\pi (f(t_i) + f(t_{i-1})) \\cdot \\sqrt{(g'(\\xi))^2 + (f'(\\psi))^2} \\cdot (t_i - t_{i-1}).\n\\]\nAdding these up, \\(\\text{sa}_1 + \\text{sa}_2 + \\cdots + \\text{sa}_n\\), we get a Riemann sum approximation to the integral\n\\[\n\\text{SA} = \\int_a^b 2\\pi f(t) \\sqrt{g'(t)^2 + f'(t)^2} dt.\n\\]\nIf we assume integrability of the integrand, then as our partition size goes to zero, this approximate surface area converges to the value given by the limit. (As with arc length, this needs a technical adjustment to the Riemann integral theorem, as here we are evaluating the integrand function at four points (\\(t_i\\), \\(t_{i-1}\\), \\(\\xi\\) and \\(\\psi\\)) and not just at some \\(c_i\\).)\n\n\n \n Surface of revolution of \\(f(x) = 2 - x^2\\) about the \\(y\\) axis. The line segments are the images of rotating the secant line connecting \\((1/2, f(1/2))\\) and \\((3/4, f(3/4))\\). These trace out the frustum of a cone which approximates the corresponding surface area of the surface of revolution. In the limit, this approximation becomes exact and a formula for the surface area of surfaces of revolution can be used to compute the value.\n \n \n\n\n\n\nExamples\nLet's see that the surface area of an open cone follows from this formula, even though we just saw how to get this value.\nA cone can be envisioned as rotating the function \\(f(x) = x\\tan(\\theta)\\) between \\(0\\) and \\(h\\) around the \\(x\\) axis. 
This integral yields the surface area:\n\\[\n\\begin{align*}\n\\int_0^h 2\\pi f(x) \\sqrt{1 + f'(x)^2}dx\n&= \\int_0^h 2\\pi x \\tan(\\theta) \\sqrt{1 + \\tan(\\theta)^2}dx \\\\\n&= (2\\pi\\tan(\\theta)\\sqrt{1 + \\tan(\\theta)^2} x^2/2 \\big|_0^h \\\\\n&= \\pi \\tan(\\theta) \\sec(\\theta) h^2 \\\\\n&= \\pi r^2 / \\sin(\\theta).\n\\end{align*}\n\\]\n(There are many ways to express this, we used \\(r\\) and \\(\\theta\\) to match the work above. If the cone is parameterized by a height \\(h\\) and radius \\(r\\), then the surface area of the sides is \\(\\pi r\\sqrt{h^2 + r^2}\\). If the base is included, there is an additional \\(\\pi r^2\\) term.)\n\nExample\nLet the graph of \\(f(x) = x^2\\) from \\(x=0\\) to \\(x=1\\) be rotated around the \\(x\\) axis. What is the resulting surface area generated?\n\\[\n\\text{SA} = \\int_a^b 2\\pi f(x) \\sqrt{1 + f'(x)^2}dx = \\int_0^1 2\\pi x^2 \\sqrt{1 + (2x)^2} dx.\n\\]\nThis integral is done by a trig substitution, but gets involved. We let SymPy do it:\n\n@syms x\nF = integrate(2 * PI * x^2 * sqrt(1 + (2x)^2), x)\n\n \n\\[\n2 \\pi \\left(\\frac{x^{5}}{\\sqrt{4 x^{2} + 1}} + \\frac{3 x^{3}}{8 \\sqrt{4 x^{2} + 1}} + \\frac{x}{32 \\sqrt{4 x^{2} + 1}} - \\frac{\\operatorname{asinh}{\\left(2 x \\right)}}{64}\\right)\n\\]\n\n\n\nWe show F, only to demonstrate that indeed the integral is a bit involved. The actual surface area follows from a definite integral, which we get through the fundamental theorem of calculus:\n\nF(1) - F(0)\n\n \n\\[\n2 \\pi \\left(- \\frac{\\operatorname{asinh}{\\left(2 \\right)}}{64} + \\frac{9 \\sqrt{5}}{32}\\right)\n\\]\n\n\n\n\n\n\n47.1.1 Plotting surfaces of revolution\nThe commands to plot a surface of revolution will be described more clearly later; for now we present them as simply a pattern to be followed in case plots are desired. 
Suppose the curve in the \\(x-y\\) plane is given parametrically by \\((g(u), f(u))\\) for \\(a \\leq u \\leq b\\).\nTo be concrete, we parameterize the circle centered at \\((6,0)\\) with radius \\(2\\) by:\n\ng(u) = 6 + 2sin(u)\nf(u) = 2cos(u)\na, b = 0, 2pi\n\n(0, 6.283185307179586)\n\n\nThe plot of this curve is:\n\nus = range(a, b, length=100)\nplot(g.(us), f.(us), xlims=(-0.5, 9), aspect_ratio=:equal, legend=false)\nplot!([0,0],[-3,3], color=:red, linewidth=5) # y axis emphasis\nplot!([3,9], [0,0], color=:green, linewidth=5) # x axis emphasis\n\n\n\n\nThough parametric plots have a convenience constructor, plot(g, f, a, b), we constructed the points with Julias broadcasting notation, as we will need to do for a surface of revolution. The xlims are adjusted to show the \\(y\\) axis, which is emphasized with a layered line. The line is drawn by specifying two points, \\((x_0, y_0)\\) and \\((x_1, y_1)\\) in the form [x0,x1] and [y0,y1].\nNow, to rotate this about the \\(y\\) axis, creating a surface plot, we have the following pattern:\n\nS(u,v) = [g(u)*cos(v), g(u)*sin(v), f(u)]\nus = range(a, b, length=100)\nvs = range(0, 2pi, length=100)\nws = unzip(S.(us, vs')) # reorganize data\nsurface(ws..., zlims=(-6,6), legend=false)\n\nplot!([0,0], [0,0], [-3,3], color=:red, linewidth=5) # y axis emphasis\n\n\n\n\nThe unzip function is not part of base Julia, rather part of CalculusWithJulia. This function rearranges data into a form consumable by the plotting methods like surface. In this case, the result of S.(us,vs') is a grid (matrix) of points, the result of unzip is three grids of values, one for the \\(x\\) values, one for the \\(y\\) values, and one for the \\(z\\) values. 
A manual adjustment to the zlims is used, as aspect_ratio does not have an effect with the plotly() backend and errors on 3d graphics with pyplot().\nTo rotate this about the \\(x\\) axis, we have this pattern:\n\nS(u,v) = [g(u), f(u)*cos(v), f(u)*sin(v)]\nus = range(a, b, length=100)\nvs = range(0, 2pi, length=100)\nws = unzip(S.(us,vs'))\nsurface(ws..., legend=false)\n\nplot!([3,9], [0,0],[0,0], color=:green, linewidth=5) # x axis emphasis\n\n\n\n\nThe above pattern covers the case of rotating the graph of a function \\(f(x)\\) over \\([a,b]\\) by taking \\(g(t)=t\\).\n\nExample\nRotate the graph of \\(x^x\\) from \\(0\\) to \\(3/2\\) around the \\(x\\) axis. What is the surface area generated?\nWe work numerically for this one, as no antiderivative is forthcoming. Recall, the accompanying CalculusWithJulia package defines f' to return the automatic derivative through the ForwardDiff package.\n\nf(x) = x^x\na, b = 0, 3/2\nval, _ = quadgk(x -> 2pi * f(x) * sqrt(1 + f'(x)^2), a, b)\nval\n\n14.934256764843937\n\n\n(The function is not defined at \\(x=0\\) mathematically, but on the computer it evaluates to \\(1\\), the limiting value. Even were this not the case, the quadgk function doesnt evaluate the function at the points a and b that are specified.)\n\ng(u) = u\nf(u) = u^u\nS(u,v) = [g(u)*cos(v), g(u)*sin(v), f(u)]\nus = range(0, 3/2, length=100)\nvs = range(0, pi, length=100) # not 2pi (to see inside)\nws = unzip(S.(us,vs'))\nsurface(ws..., alpha=0.75)\n\n\n\n\nWe compare this answer to that of the frustum of a cone with radii \\(1\\) and \\((3/2)^{3/2}\\), formed by rotating the line segment connecting \\((0,f(0))\\) with \\((3/2,f(3/2))\\). From looking at the graph of the surface, these values should be comparable.
The surface area of the cone part is \\(\\pi (r_1^2 - r_0^2) / \\sin(\\theta) = \\pi (r_1 + r_0) \\cdot \\sqrt{(\\Delta h)^2 + (r_1-r_0)^2}\\).\n\nf(x) = x^x\nr0, r1 = f(0), f(3/2)\npi * (r1 + r0) * sqrt((3/2)^2 + (r1-r0)^2)\n\n15.310680925915081\n\n\n\n\nExample\nWhat is the surface area generated by Gabriels Horn, the solid formed by rotating \\(1/x\\) for \\(x \\geq 1\\) around the \\(x\\) axis?\n\\[\n\\text{SA} = \\int_a^b 2\\pi f(x) \\sqrt{1 + f'(x)^2}dx =\n\\lim_{M \\rightarrow \\infty} \\int_1^M 2\\pi \\frac{1}{x} \\sqrt{1 + (-1/x^2)^2} dx.\n\\]\nWe do this with SymPy:\n\n@syms M\nex = integrate(2PI * (1/x) * sqrt(1 + (-1/x)^2), (x, 1, M))\n\n \n\\[\n2 \\pi \\left(- \\frac{M}{\\sqrt{M^{2} + 1}} + \\operatorname{asinh}{\\left(M \\right)} - \\frac{1}{M \\sqrt{M^{2} + 1}}\\right) - 2 \\pi \\left(- \\sqrt{2} + \\log{\\left(1 + \\sqrt{2} \\right)}\\right)\n\\]\n\n\n\nThe limit as \\(M\\) gets large is of interest. The only term that might get out of hand is asinh(M). We check its limit:\n\nlimit(asinh(M), M => oo)\n\n \n\\[\n\\infty\n\\]\n\n\n\nSo indeed it does. There is nothing to balance this out, so the integral will be infinite, as this shows:\n\nlimit(ex, M => oo)\n\n \n\\[\n\\infty\n\\]\n\n\n\nThis figure would have infinite surface area, were it possible to actually construct an infinitely long solid. (But it has been shown to have finite volume.)\n\n\nExample\nThe curve described parametrically by \\(g(t) = 2(1 + \\cos(t))\\cos(t)\\) and \\(f(t) = 2(1 + \\cos(t))\\sin(t)\\) from \\(0\\) to \\(\\pi\\) is rotated about the \\(x\\) axis. Find the resulting surface area.\nThe graph shows half a heart, the resulting area will resemble an apple.\n\ng(t) = 2(1 + cos(t)) * cos(t)\nf(t) = 2(1 + cos(t)) * sin(t)\nplot(g, f, 0, 1pi)\n\n\n\n\nThe integrand simplifies to \\(8\\sqrt{2}\\pi \\sin(t) (1 + \\cos(t))^{3/2}\\).
This lends itself to \\(u\\)-substitution with \\(u=\\cos(t)\\).\n\\[\n\\begin{align*}\n\\int_0^\\pi 8\\sqrt{2}\\pi \\sin(t) (1 + \\cos(t))^{3/2} dt\n&= 8\\sqrt{2}\\pi \\int_1^{-1} (1 + u)^{3/2} (-1) du\\\\\n&= 8\\sqrt{2}\\pi (2/5) (1+u)^{5/2} \\big|_{-1}^1\\\\\n&= 8\\sqrt{2}\\pi (2/5) 2^{5/2} = \\frac{2^7 \\pi}{5}.\n\\end{align*}\n\\]"
},
{
"objectID": "integrals/surface_area.html#the-first-theorem-of-pappus",
"href": "integrals/surface_area.html#the-first-theorem-of-pappus",
"title": "47  Surface Area",
"section": "47.2 The first Theorem of Pappus",
"text": "47.2 The first Theorem of Pappus\nThe first theorem of Pappus provides a simpler means to compute the surface area if the distance the centroid is from the axis (\\(\\rho\\)) and the arc length of the curve (\\(L\\)) are both known. In that case, the surface area satisfies:\n\\[\n\\text{SA} = 2 \\pi \\rho L\n\\]\nThat is, the surface area is simply the circumference of the circle traced out by the centroid of the curve times the length of the curve - the distances rotated are collapsed to that of just the centroid.\n\nExample\nThe surface area of an open cone can be computed, as the arc length is \\(\\sqrt{h^2 + r^2}\\) and the centroid of the line is a distance \\(r/2\\) from the axis. This gives SA\\(=2\\pi (r/2) \\sqrt{h^2 + r^2} = \\pi r \\sqrt{h^2 + r^2}\\).\n\n\nExample\nWe can get the surface area of a torus from this formula.\nThe torus is found by rotating the curve \\((x-b)^2 + y^2 = a^2\\) about the \\(y\\) axis. The centroid is a distance \\(b\\) from the axis, the arc length \\(2\\pi a\\), so the surface area is \\(2\\pi (b) (2\\pi a) = 4\\pi^2 a b\\).\nA torus with \\(a=2\\) and \\(b=6\\)\n\n\n\n\n\n\n\nExample\nThe surface area of a sphere will be SA\\(=2\\pi \\rho (\\pi r) = 2 \\pi^2 r \\cdot \\rho\\). What is \\(\\rho\\)? The formula for the centroid of an arc can be derived in a manner similar to that of the centroid of a region.
The formulas are:\n\\[\n\\begin{align*}\n\\text{cm}_x &= \\frac{1}{L} \\int_a^b g(t) \\sqrt{g'(t)^2 + f'(t)^2} dt\\\\\n\\text{cm}_y &= \\frac{1}{L} \\int_a^b f(t) \\sqrt{g'(t)^2 + f'(t)^2} dt.\n\\end{align*}\n\\]\nHere, \\(L\\) is the arc length of the curve.\nFor the sphere parameterized by \\(g(t) = r \\cos(t)\\), \\(f(t) = r\\sin(t)\\), we get that these become\n\\[\n\\text{cm}_x = \\frac{1}{L}\\int_0^\\pi r\\cos(t) \\sqrt{r^2(\\sin(t)^2 + \\cos(t)^2)} dt = \\frac{1}{L}r^2 \\int_0^\\pi \\cos(t) dt = 0.\n\\]\n\\[\n\\text{cm}_y = \\frac{1}{L}\\int_0^\\pi r\\sin(t) \\sqrt{r^2(\\sin(t)^2 + \\cos(t)^2)} dt = \\frac{1}{L}r^2 \\int_0^\\pi \\sin(t) dt = \\frac{1}{\\pi r} r^2 \\cdot 2 = \\frac{2r}{\\pi}.\n\\]\nCombining this, we see that the surface area of a sphere is \\(2 \\pi^2 r (2r/\\pi) = 4\\pi r^2\\), by Pappus Theorem."
},
{
"objectID": "integrals/surface_area.html#questions",
"href": "integrals/surface_area.html#questions",
"title": "47  Surface Area",
"section": "47.3 Questions",
"text": "47.3 Questions\n\nQuestions\nThe graph of \\(f(x) = \\sin(x)\\) from \\(0\\) to \\(\\pi\\) is rotated around the \\(x\\) axis. After a \\(u\\)-substitution, what integral would give the surface area generated?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(-\\int_1^{-1} 2\\pi u^2 \\sqrt{1 + u} du\\)\n \n \n\n\n \n \n \n \n \\(-\\int_1^{-1} 2\\pi u \\sqrt{1 + u^2} du\\)\n \n \n\n\n \n \n \n \n \\(-\\int_1^{-1} 2\\pi \\sqrt{1 + u^2} du\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThough the integral can be computed by hand, give a numeric value.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestions\nThe graph of \\(f(x) = \\sqrt{x}\\) from \\(0\\) to \\(4\\) is rotated around the \\(x\\) axis. Numerically find the surface area generated.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestions\nFind the surface area generated by revolving the graph of the function \\(f(x) = x^3/9\\) from \\(x=0\\) to \\(x=2\\) around the \\(x\\) axis. This can be done by hand or numerically.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestions\n(From Stewart.) If a loaf of bread is in the form of a sphere of radius \\(1\\), the amount of crust for a slice depends on the width, but not where in the loaf it is sliced.\nThat is, this integral with \\(f(x) = \\sqrt{1 - x^2}\\) and \\(u, u+h\\) in \\([-1,1]\\) does not depend on \\(u\\):\n\\[\nA = \\int_u^{u+h} 2\\pi f(x) \\sqrt{1 + f'(x)^2} dx.\n\\]\nIf we let \\(f(x) = y\\) then \\(f'(x) = -x/y\\).
With this, what does the integral above come down to after cancellations?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\int_u^{u+h} 2\\pi x dx\\)\n \n \n\n\n \n \n \n \n \\(\\int_u^{u+h} 2\\pi dx\\)\n \n \n\n\n \n \n \n \n \\(\\int_u^{u+h} 2\\pi y dx\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestions\nFind the surface area of the dome of a sphere generated by rotating the curve given by \\(g(t) = \\cos(t)\\) and \\(f(t) = \\sin(t)\\) for \\(t\\) in \\(0\\) to \\(\\pi/6\\).\nNumerically find the value.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestions\nThe astroid is parameterized by \\(g(t) = a\\cos(t)^3\\) and \\(f(t) = a \\sin(t)^3\\). Let \\(a=1\\) and rotate the curve from \\(t=0\\) to \\(t=\\pi\\) around the \\(x\\) axis. What is the surface area generated?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestions\nFor the curve parameterized by \\(g(t) = a\\cos(t)^5\\) and \\(f(t) = a \\sin(t)^5\\). Let \\(a=1\\) and rotate the curve from \\(t=0\\) to \\(t=\\pi\\) around the \\(x\\) axis. Numerically find the surface area generated."
},
{
"objectID": "ODEs/odes.html",
"href": "ODEs/odes.html",
"title": "48  ODEs",
"section": "",
"text": "This section uses these add-on packages:\nSome relationships are easiest to describe in terms of rates or derivatives. For example:"
},
{
"objectID": "ODEs/odes.html#motion-with-constant-acceleration",
"href": "ODEs/odes.html#motion-with-constant-acceleration",
"title": "48  ODEs",
"section": "48.1 Motion with constant acceleration",
"text": "48.1 Motion with constant acceleration\nLets consider the case of constant acceleration. This describes how nearby objects fall to earth, as the force due to gravity is assumed to be a constant, so the acceleration is the constant force divided by the constant mass.\nWith constant acceleration, what is the velocity?\nAs mentioned, we have \\(dv/dt = a\\) for any velocity function \\(v(t)\\), but in this case, the right hand side is assumed to be constant. How does this restrict the possible functions, \\(v(t)\\), that the velocity can be?\nHere we can integrate to find that any answer must look like the following for some constant of integration:\n\\[\nv(t) = \\int \\frac{dv}{dt} dt = \\int a dt = at + C.\n\\]\nIf we are given the velocity at a fixed time, say \\(v(t_0) = v_0\\), then we can use the definite integral to get:\n\\[\nv(t) - v(t_0) = \\int_{t_0}^t a dt = at - a t_0.\n\\]\nSolving gives:\n\\[\nv(t) = v_0 + a (t - t_0).\n\\]\nThis expresses the velocity at time \\(t\\) in terms of the initial velocity, the constant acceleration and the time duration.\nA natural question might be, is this the only possible answer? There are a few useful ways to think about this.\nFirst, suppose there were another solution, say \\(u(t)\\). Then define \\(w(t)\\) to be the difference: \\(w(t) = v(t) - u(t)\\). We would have that \\(w'(t) = v'(t) - u'(t) = a - a = 0\\). But from the mean value theorem, a function whose derivative is identically \\(0\\) will necessarily be a constant. So at most, \\(v\\) and \\(u\\) will differ by a constant, but if both are equal at \\(t_0\\), they will be equal for all \\(t\\).\nSecond, since the derivative of any solution is a continuous function, it is true by the fundamental theorem of calculus that it must satisfy the form for the antiderivative.
The initial condition makes the answer unique, as the indeterminate \\(C\\) can take only one value.\nSummarizing, we have\n\nIf \\(v(t)\\) satisfies the equation: \\(v'(t) = a\\), \\(v(t_0) = v_0,\\) then the unique solution will be \\(v(t) = v_0 + a (t - t_0)\\).\n\nNext, what about position? Here we know that the time derivative of position yields the velocity, so we should have that the unknown position function satisfies this equation and initial condition:\n\\[\nx'(t) = v(t) = v_0 + a (t - t_0), \\quad x(t_0) = x_0.\n\\]\nAgain, we can integrate to get an answer for any value \\(t\\):\n\\[\n\\begin{align*}\nx(t) - x(t_0) &= \\int_{t_0}^t v(t) dt \\\\\n&= (v_0t + \\frac{1}{2}a t^2 - at_0 t) |_{t_0}^t \\\\\n&= (v_0 - at_0)(t - t_0) + \\frac{1}{2} a (t^2 - t_0^2).\n\\end{align*}\n\\]\nThere are three constants: the initial value for the independent variable, \\(t_0\\), and the two initial values for the velocity and position, \\(v_0, x_0\\). Assuming \\(t_0 = 0\\), we can simplify the above to get a formula familiar from introductory physics:\n\\[\nx(t) = x_0 + v_0 t + \\frac{1}{2} at^2.\n\\]\nAgain, the mean value theorem can show that with the initial value specified this is the only possible solution."
},
{
"objectID": "ODEs/odes.html#first-order-initial-value-problems",
"href": "ODEs/odes.html#first-order-initial-value-problems",
"title": "48  ODEs",
"section": "48.2 First-order initial-value problems",
"text": "48.2 First-order initial-value problems\nThe two problems just looked at can be summarized by the following. We are looking for solutions to an equation of the form (taking \\(y\\) and \\(x\\) as the variables, in place of \\(x\\) and \\(t\\)):\n\\[\ny'(x) = f(x), \\quad y(x_0) = y_0.\n\\]\nThis is called an ordinary differential equation (ODE), as it is an equation involving the ordinary derivative of an unknown function, \\(y\\).\nThis is called a first-order, ordinary differential equation, as there is only the first derivative involved.\nThis is called an initial-value problem, as the value at the initial point \\(x_0\\) is specified as part of the problem.\n\nExamples\nLets look at a few more examples, and then generalize.\n\nExample: Newtons law of cooling\nConsider the ordinary differential equation given by Newtons law of cooling:\n\\[\nT'(t) = -r (T(t) - T_a), \\quad T(0) = T_0\n\\]\nThis equation is also first order, as it involves just the first derivative, but notice that on the right hand side is the function \\(T\\), not the variable being differentiated against, \\(t\\).\nAs we have a difference on the right hand side, we rename the variable through \\(U(t) = T(t) - T_a\\). Then, as \\(U'(t) = T'(t)\\), we have the equation:\n\\[\nU'(t) = -r U(t), \\quad U(0) = U_0.\n\\]\nThis shows that the rate of change of \\(U\\) depends on \\(U\\). Large positive values indicate a negative rate of change - a push back towards the origin, and large negative values of \\(U\\) indicate a positive rate of change - again, a push back towards the origin. We shouldnt be surprised to see either a steady decay towards the origin or oscillations about the origin.\nWhat will we find? This equation is different from the previous two equations, as the function \\(U\\) appears on both sides. However, we can rearrange to get:\n\\[\n\\frac{dU}{dt}\\frac{1}{U(t)} = -r.\n\\]\nThis suggests integrating both sides, as before.
Here we do the “\\(u\\)”-substitution \\(u = U(t)\\), so \\(du = U'(t) dt\\):\n\\[\n-rt + C = \\int \\frac{dU}{dt}\\frac{1}{U(t)} dt =\n\\int \\frac{1}{u}du = \\log(u).\n\\]\nSolving gives: \\(u = U(t) = e^C e^{-rt}\\). Using the initial condition forces \\(e^C = U(0) = T(0) - T_a\\) and so our solution in terms of \\(T(t)\\) is:\n\\[\nT(t) - T_a = (T_0 - T_a) e^{-rt}.\n\\]\nIn words, the initial difference in temperature of the object and the environment exponentially decays to \\(0\\).\nThat is, as \\(t\\) goes to \\(\\infty\\), the right hand side will go to \\(0\\) for \\(r > 0\\), so \\(T(t) \\rightarrow T_a\\) - the temperature of the object will reach the ambient temperature. The rate of this is largest when the difference between \\(T(t)\\) and \\(T_a\\) is largest, so when objects are cooling the statement “hotter things cool faster” is appropriate.\nA graph of the solution for \\(T_0=200\\) and \\(T_a=72\\) and \\(r=1/2\\) is made as follows. Weve added a few line segments from the defining formula, and see that they are indeed tangent to the solution found for the differential equation.\n\n\n\n\n\nThe above is implicitly assuming that there could be no other solution than the one we found. Is that really the case? We will see that there is a theorem that can answer this, but in this case, the trick of taking the difference of two equations satisfying the equation leads to the equation \\(W'(t) = -r W(t), \\text{ and } W(0) = 0\\). This equation has a general solution of \\(W(t) = Ce^{-rt}\\) and the initial condition forces \\(C=0\\), so \\(W(t) = 0\\), as before.
Hence, the initial-value problem for Newtons law of cooling has a unique solution.\nIn general, the equation could be written as (again using \\(y\\) and \\(x\\) as the variables):\n\\[\ny'(x) = g(y), \\quad y(x_0) = y_0\n\\]\nThis is called an autonomous, first-order ODE, as the right-hand side does not depend on \\(x\\) (except through \\(y(x)\\)).\nLet \\(F(y) = \\int_{y_0}^y du/g(u)\\), then a solution to the above is \\(F(y) = x - x_0\\), assuming \\(1/g(u)\\) is integrable.\n\n\nExample: Torricellis law\nTorricellis law describes the speed a jet of water will leave a vessel through an opening below the surface of the water. The formula is \\(v=\\sqrt{2gh}\\), where \\(h\\) is the height of the water above the hole and \\(g\\) the gravitational constant. This arises from equating the kinetic energy gained, \\(1/2 mv^2\\), and potential energy lost, \\(mgh\\), for the exiting water.\nAn application of Torricellis law is to describe the volume of water in a tank over time, \\(V(t)\\). Imagine a cylinder of cross sectional area \\(A\\) with a hole of cross sectional area \\(a\\) at the bottom. Then \\(V(t) = A h(t)\\), with \\(h\\) giving the height.
The change in volume over \\(\\Delta t\\) units of time must be given by the value \\(a v(t) \\Delta t\\), or\n\\[\nV(t+\\Delta t) - V(t) = -a v(t) \\Delta t = -a\\sqrt{2gh(t)}\\Delta t\n\\]\nThis suggests the following formula, written in terms of \\(h(t)\\) should apply:\n\\[\nA\\frac{dh}{dt} = -a \\sqrt{2gh(t)}.\n\\]\nRearranging, this gives an equation\n\\[\n\\frac{dh}{dt} \\frac{1}{\\sqrt{h(t)}} = -\\frac{a}{A}\\sqrt{2g}.\n\\]\nIntegrating both sides yields:\n\\[\n2\\sqrt{h(t)} = -\\frac{a}{A}\\sqrt{2g} t + C.\n\\]\nIf \\(h(0) = h_0 = V(0)/A\\), we can solve for \\(C = 2\\sqrt{h_0}\\), or\n\\[\n\\sqrt{h(t)} = \\sqrt{h_0} -\\frac{1}{2}\\frac{a}{A}\\sqrt{2g} t.\n\\]\nSetting \\(h(t)=0\\) and solving for \\(t\\) shows that the time to drain the tank would be \\((2A)/(a\\sqrt{2g})\\sqrt{h_0}\\).\n\n\nExample\nConsider now the equation\n\\[\ny'(x) = y(x)^2, \\quad y(x_0) = y_0.\n\\]\nThis is called a non-linear ordinary differential equation, as the \\(y\\) variable on the right hand side presents itself in a non-linear form (it is squared). These equations may have solutions that are not defined for all times.\nThis particular problem can be solved as before by moving the \\(y^2\\) to the left hand side and integrating to yield:\n\\[\ny(x) = - \\frac{1}{C + x},\n\\]\nand with the initial condition:\n\\[\ny(x) = \\frac{y_0}{1 - y_0(x - x_0)}.\n\\]\nThis answer can demonstrate blow-up. That is, in a finite range for \\(x\\) values, the \\(y\\) value can go to infinity. For example, if the initial conditions are \\(x_0=0\\) and \\(y_0 = 1\\), then \\(y(x) = 1/(1-x)\\) is only defined for \\(x \\geq x_0\\) on \\([0,1)\\), as at \\(x=1\\) there is a vertical asymptote."
},
{
"objectID": "ODEs/odes.html#separable-equations",
"href": "ODEs/odes.html#separable-equations",
"title": "48  ODEs",
"section": "48.3 Separable equations",
"text": "48.3 Separable equations\nWeve seen equations of the form \\(y'(x) = f(x)\\) and \\(y'(x) = g(y)\\) both solved by integrating. The same tricks will work for equations of the form \\(y'(x) = f(x) \\cdot g(y)\\). Such equations are called separable.\nBasically, we equate up to constants\n\\[\n\\int \\frac{dy}{g(y)} = \\int f(x) dx.\n\\]\nFor example, suppose we have the equation\n\\[\n\\frac{dy}{dx} = x \\cdot y(x), \\quad y(x_0) = y_0.\n\\]\nThen we can find a solution, \\(y(x)\\) through:\n\\[\n\\int \\frac{dy}{y} = \\int x dx,\n\\]\nor\n\\[\n\\log(y) = \\frac{x^2}{2} + C\n\\]\nWhich yields:\n\\[\ny(x) = e^C e^{\\frac{1}{2}x^2}.\n\\]\nSubstituting in \\(x_0\\) yields a value for \\(C\\) in terms of the initial information \\(y_0\\) and \\(x_0\\)."
},
{
"objectID": "ODEs/odes.html#symbolic-solutions",
"href": "ODEs/odes.html#symbolic-solutions",
"title": "48  ODEs",
"section": "48.4 Symbolic solutions",
"text": "48.4 Symbolic solutions\nDifferential equations are classified according to their type. Different types have different methods for solution, when a solution exists.\nThe first-order initial value equations we have seen can be described generally by\n\\[\n\\begin{align*}\ny'(x) &= F(y,x),\\\\\ny(x_0) &= y_0.\n\\end{align*}\n\\]\nSpecial cases include:\n\nlinear if the function \\(F\\) is linear in \\(y\\);\nautonomous if \\(F(y,x) = G(y)\\) (a function of \\(y\\) alone);\nseparable if \\(F(y,x) = G(y)H(x)\\).\n\nAs seen, separable equations are approached by moving the “\\(y\\)” terms to one side, the “\\(x\\)” terms to the other and integrating. This also applies to autonomous equations, with \\(H(x) = 1\\). There are other families of equation types that have exact solutions, and techniques for solution, summarized at this Wikipedia page.\nRather than go over these various families, we demonstrate that SymPy can solve many of these equations symbolically.\nThe solve function in SymPy solves equations for unknown variables. As a differential equation involves an unknown function there is a different function, dsolve. The basic idea is to describe the differential equation using a symbolic function and then call dsolve to solve the expression.\nSymbolic functions are defined by the @syms macro (also see ?symbols) using parentheses to distinguish a function from a variable:\n\n@syms x u() # a symbolic variable and a symbolic function\n\n(x, u)\n\n\nWe will solve the following, known as the logistic equation:\n\\[\nu'(x) = a u(1-u), \\quad a > 0\n\\]\nBefore beginning, we look at the form of the equation. When \\(u=0\\) or \\(u=1\\) the rate of change is \\(0\\), so we expect the function might be bounded within that range. If not, when \\(u\\) gets bigger than \\(1\\), then the slope is negative and when \\(u\\) gets less than \\(0\\), the slope is positive, so there will at least be a drift back to the range \\([0,1]\\). Lets see exactly what happens.
We define a parameter, restricting a to be positive:\n\n@syms a::positive\n\n(a,)\n\n\nTo specify a derivative of u in our equation we can use diff(u(x),x) but here, for visual simplicity, use the Differential operator, as follows:\n\nD = Differential(x)\neqn = D(u)(x) ~ a * u(x) * (1 - u(x)) # use l \\Equal[tab] r, Eq(l,r), or just l - r\n\n \n\\[\n\\frac{d}{d x} u{\\left(x \\right)} = a \\left(1 - u{\\left(x \\right)}\\right) u{\\left(x \\right)}\n\\]\n\n\n\nIn the above, we evaluate the symbolic function at the variable x through the use of u(x) in the expression. The equation above uses ~ to combine the left- and right-hand sides as an equation in SymPy. (A unicode equals is also available for this task). This is a shortcut for Eq(l,r), but even just using l - r would suffice, as the default assumption for an equation is that it is set to 0.\nThe Differential operation is borrowed from the ModelingToolkit package, which will be introduced later.\nTo finish, we call dsolve to find a solution (if possible):\n\nout = dsolve(eqn)\n\n \n\\[\nu{\\left(x \\right)} = \\frac{1}{C_{1} e^{- a x} + 1}\n\\]\n\n\n\nThis answer - to a first-order equation - has one free constant, C_1, which can be solved for from an initial condition. We can see that when \\(a > 0\\), as \\(x\\) goes to positive infinity the solution goes to \\(1\\), and when \\(x\\) goes to negative infinity, the solution goes to \\(0\\) and otherwise is trapped in between, as expected.\nThe limits are confirmed by investigating the limits of the right-hand:\n\nlimit(rhs(out), x => oo), limit(rhs(out), x => -oo)\n\n(1, 0)\n\n\nWe can confirm that the solution is always increasing, hence trapped within \\([0,1]\\) by observing that the derivative is positive when C₁ is positive:\ndiff(rhs(out),x)\nSuppose that \\(u(0) = 1/2\\). Can we solve for \\(C_1\\) symbolically? 
We can use solve, but first we will need to get the symbol for C_1:\n\neq = rhs(out) # just the right hand side\nC1 = first(setdiff(free_symbols(eq), (x,a))) # fish out constant, it is not x or a\nc1 = solve(eq(x=>0) - 1//2, C1)\n\n1-element Vector{Sym}:\n 1\n\n\nAnd we plug in with:\n\neq(C1 => c1[1])\n\n \n\\[\n\\frac{1}{1 + e^{- a x}}\n\\]\n\n\n\nThats a lot of work. The dsolve function in SymPy allows initial conditions to be specified for some equations. In this case, ours is \\(x_0=0\\) and \\(y_0=1/2\\). The extra arguments are passed in through a dictionary to the ics argument:\n\nx0, y0 = 0, Sym(1//2)\ndsolve(eqn, u(x), ics=Dict(u(x0) => y0))\n\n \n\\[\nu{\\left(x \\right)} = \\frac{1}{1 + e^{- a x}}\n\\]\n\n\n\n(The one subtlety is the need to write the rational value as a symbolic expression, as otherwise it will get converted to a floating point value prior to being passed along.)\n\nExample: Hookes law\nIn the first example, we solved for position, \\(x(t)\\), from an assumption of constant acceleration in two steps. The equation relating the two is a second-order equation: \\(x''(t) = a\\), so two constants are generated. That a second-order equation could be reduced to two first-order equations is not a happy circumstance, as it can always be done. Rather than show the technique though, we demonstrate that SymPy can also handle some second-order ODEs.\nHookes law relates the force on an object to its position via \\(F=ma = -kx\\), or \\(x''(t) = -(k/m)x(t)\\).\nSuppose \\(k > 0\\).
Then we can solve, similar to the above, with:\n\n@syms k::positive m::positive\nD2 = D ∘ D # takes second derivative through composition\neqnh = D2(u)(x) ~ -(k/m)*u(x)\ndsolve(eqnh)\n\n \n\\[\nu{\\left(x \\right)} = C_{1} \\sin{\\left(\\frac{\\sqrt{k} x}{\\sqrt{m}} \\right)} + C_{2} \\cos{\\left(\\frac{\\sqrt{k} x}{\\sqrt{m}} \\right)}\n\\]\n\n\n\nHere we find two constants, as anticipated, for we would guess that two integrations are needed in the solution.\nSuppose the spring were started by pulling it down to its lowest point and releasing. The initial position at time \\(0\\) would be \\(-a\\), say, and initial velocity \\(0\\). Here we get the solution specifying initial conditions on the function and its derivative (expressed through u'):\n\ndsolve(eqnh, u(x), ics = Dict(u(0) => -a, D(u)(0) => 0))\n\n \n\\[\nu{\\left(x \\right)} = - a \\cos{\\left(\\frac{\\sqrt{k} x}{\\sqrt{m}} \\right)}\n\\]\n\n\n\nWe get that the motion will follow \\(u(x) = -a \\cos(\\sqrt{k/m}x)\\). This is simple oscillatory behavior. As the spring stretches, the force gets large enough to pull it back, and as it compresses the force gets large enough to push it back. The amplitude of this oscillation is \\(a\\) and the period \\(2\\pi/\\sqrt{k/m}\\). Larger \\(k\\) values mean shorter periods; larger \\(m\\) values mean longer periods.\n\n\nExample: the pendulum\nThe simple gravity pendulum is an idealization of a physical pendulum that models a “bob” with mass \\(m\\) swinging on a massless rod of length \\(l\\) in a frictionless world governed only by the gravitational constant \\(g\\).
The motion can be described by this differential equation for the angle, \\(\\theta\\), made from the vertical:\n\\[\n\\theta''(t) + \\frac{g}{l}\\sin(\\theta(t)) = 0\n\\]\nCan this second-order equation be solved by SymPy?\n\n@syms g::positive l::positive theta()=>\"θ\"\neqnp = D2(theta)(x) + g/l*sin(theta(x))\n\n \n\\[\n\\frac{g \\sin{\\left(θ{\\left(x \\right)} \\right)}}{l} + \\frac{d^{2}}{d x^{2}} θ{\\left(x \\right)}\n\\]\n\n\n\nTrying to do so can cause SymPy to hang or simply give up and repeat its input; no easy answer is forthcoming for this equation.\nIn general, for the first-order initial value problem characterized by \\(y'(x) = F(y,x)\\), there are conditions (Peano and Picard-Lindelof) that can guarantee the existence (and uniqueness) of a solution locally, but there may not be an accompanying method to actually find it. This particular problem has a solution, but it cannot be written in terms of elementary functions.\nHowever, as Huygens first noted, if the angles involved are small, then we can approximate the solution through the linearization \\(\\sin(\\theta(t)) \\approx \\theta(t)\\). The resulting equation for an approximate answer is just that of Hooke:\n\\[\n\\theta''(t) + \\frac{g}{l}\\theta(t) = 0\n\\]\nHere, the solution is in terms of sines and cosines, with period given by \\(T = 2\\pi/\\sqrt{g/l} = 2\\pi\\cdot\\sqrt{l/g}\\). The answer does not depend on the mass, \\(m\\), of the bob nor the amplitude of the motion, provided the small-angle approximation is valid.\nIf we pull the bob back an angle \\(a\\) and release it, then the initial conditions are \\(\\theta(0) = a\\) and \\(\\theta'(0) = 0\\). This gives the solution:\n\neqnp₁ = D2(u)(x) + g/l * u(x)\ndsolve(eqnp₁, u(x), ics=Dict(u(0) => a, D(u)(0) => 0))\n\n \n\\[\nu{\\left(x \\right)} = a \\cos{\\left(\\frac{\\sqrt{g} x}{\\sqrt{l}} \\right)}\n\\]\n\n\n\n\n\nExample: hanging cables\nA chain hangs between two supports a distance \\(L\\) apart.
What shape will it take if there are no forces outside of gravity acting on it? What about if the force is uniform along length of the chain, like a suspension bridge? How will the shape differ then?\nLet \\(y(x)\\) describe the chain at position \\(x\\), with \\(0 \\leq x \\leq L\\), say. We consider first the case of the chain with no force save gravity. Let \\(w(x)\\) be the density of the chain at \\(x\\), taken below to be a constant.\nThe chain is in equilibrium, so tension, \\(T(x)\\), in the chain will be in the direction of the derivative. Let \\(V\\) be the vertical component and \\(H\\) the horizontal component. With only gravity acting on the chain, the value of \\(H\\) will be a constant. The value of \\(V\\) will vary with position.\nAt a point \\(x\\), there is \\(s(x)\\) amount of chain with weight \\(w \\cdot s(x)\\). The tension is in the direction of the tangent line, so:\n\\[\n\\tan(\\theta) = y'(x) = \\frac{w s(x)}{H}.\n\\]\nIn terms of an increment of chain, we have:\n\\[\n\\frac{w ds}{H} = d(y'(x)).\n\\]\nThat is, the ratio of the vertical and horizontal tensions in the increment are in balance with the differential of the derivative.\nBut \\(ds = \\sqrt{dx^2 + dy^2} = \\sqrt{dx^2 + y'(x)^2 dx^2} = \\sqrt{1 + y'(x)^2}dx\\), so we can simplify to:\n\\[\n\\frac{w}{H}\\sqrt{1 + y'(x)^2}dx =y''(x)dx.\n\\]\nThis yields the second-order equation:\n\\[\ny''(x) = \\frac{w}{H} \\sqrt{1 + y'(x)^2}.\n\\]\nWe enter this into Julia:\n\n@syms w::positive H::positive y()\neqnc = D2(y)(x) ~ (w/H) * sqrt(1 + y'(x)^2)\n\n \n\\[\n\\frac{d^{2}}{d x^{2}} y{\\left(x \\right)} = \\frac{w \\sqrt{\\left(\\frac{d}{d x} y{\\left(x \\right)}\\right)^{2} + 1}}{H}\n\\]\n\n\n\nUnfortunately, SymPy needs a bit of help with this problem, by breaking the problem into steps.\nFor the first step we solve for the derivative. 
Let \\(u = y'\\), then we have \\(u'(x) = (w/H)\\sqrt{1 + u(x)^2}\\):\n\neqnc₁ = subs(eqnc, D(y)(x) => u(x))\n\n \n\\[\n\\frac{d}{d x} u{\\left(x \\right)} = \\frac{w \\sqrt{u^{2}{\\left(x \\right)} + 1}}{H}\n\\]\n\n\n\nand can solve via:\n\noutc = dsolve(eqnc₁)\n\n \n\\[\nu{\\left(x \\right)} = \\sinh{\\left(C_{1} + \\frac{w x}{H} \\right)}\n\\]\n\n\n\nSo \\(y'(x) = u(x) = \\sinh(C_1 + w \\cdot x/H)\\). This can be solved by direct integration as there is no \\(y(x)\\) term on the right hand side.\n\nD(y)(x) ~ rhs(outc)\n\n \n\\[\n\\frac{d}{d x} y{\\left(x \\right)} = \\sinh{\\left(C_{1} + \\frac{w x}{H} \\right)}\n\\]\n\n\n\nWe see a simple linear transformation involving the hyperbolic sine. To avoid SymPy struggling with the above equation, and knowing the hyperbolic sine is the derivative of the hyperbolic cosine, we anticipate an answer and verify it:\n\nyc = (H/w)*cosh(C1 + w*x/H)\ndiff(yc, x) == rhs(outc) # == not \\Equal[tab]\n\ntrue\n\n\nThe shape is a hyperbolic cosine, known as the catenary.\n\n\n\nThe cables of an unloaded suspension bridge have a different shape than a loaded suspension bridge. As seen, the cables in this figure would be modeled by a catenary.\n\n\n\nIf the chain has a uniform load like a suspension bridge with a deck sufficient to make the weight of the chain negligible, then how does the above change? Then the vertical tension comes from \\(Udx\\) and not \\(w ds\\), so the equation becomes instead:\n\\[\n\\frac{Udx}{H} = d(y'(x)).\n\\]\nThus \\(y''(x) = U/H\\), a constant, and the answer will be a parabola.\n\n\nExample: projectile motion in a medium\nThe first example describes projectile motion without air resistance. 
If we use \\((x(t), y(t))\\) to describe position at time \\(t\\), the functions satisfy:\n\\[\nx''(t) = 0, \\quad y''(t) = -g.\n\\]\nThat is, the \\(x\\) position - where no forces act - has \\(0\\) acceleration, and the \\(y\\) position - where the force of gravity acts - has constant acceleration, \\(-g\\), where \\(g=9.8m/s^2\\) is the gravitational constant. These equations can be solved to give:\n\\[\nx(t) = x_0 + v_0 \\cos(\\alpha) t, \\quad y(t) = y_0 + v_0\\sin(\\alpha)t - \\frac{1}{2}g \\cdot t^2.\n\\]\nFurthermore, we can solve for \\(t\\) from \\(x(t)\\), to get an equation describing \\(y(x)\\). Here are all the steps:\n\n@syms x0::real y0::real v0::real alpha::real g::real\n@syms t x u()\na1 = dsolve(D2(u)(x) ~ 0, u(x), ics=Dict(u(0) => x0, D(u)(0) => v0 * cos(alpha)))\na2 = dsolve(D2(u)(x) ~ -g, u(x), ics=Dict(u(0) => y0, D(u)(0) => v0 * sin(alpha)))\nts = solve(t - rhs(a1), x)[1]\ny = simplify(rhs(a2)(t => ts))\nsympy.Poly(y, x).coeffs()\n\n3-element Vector{Sym}:\n -g/2\n v₀⋅sin(α)\n y₀\n\n\nThough y is messy, it can be seen that the answer is a quadratic polynomial in \\(x\\) yielding the familiar parabolic motion for a trajectory. The output shows the coefficients.\nIn a resistive medium, there are drag forces at play. 
If this force is proportional to the velocity, say, with proportion \\(\\gamma\\), then the equations become:\n\\[\n\\begin{align*}\nx''(t) &= -\\gamma x'(t), & \\quad y''(t) &= -\\gamma y'(t) -g, \\\\\nx(0) &= x_0, &\\quad y(0) &= y_0,\\\\\nx'(0) &= v_0\\cos(\\alpha),&\\quad y'(0) &= v_0 \\sin(\\alpha).\n\\end{align*}\n\\]\nWe now attempt to solve these.\n\n@syms alpha::real, γ::positive, t::positive, v()\n@syms x_0::real y_0::real v_0::real\nDₜ = Differential(t)\neq₁ = Dₜ(Dₜ(u))(t) ~ - γ * Dₜ(u)(t)\neq₂ = Dₜ(Dₜ(v))(t) ~ -g - γ * Dₜ(v)(t)\n\na₁ = dsolve(eq₁, ics=Dict(u(0) => x_0, Dₜ(u)(0) => v_0 * cos(alpha)))\na₂ = dsolve(eq₂, ics=Dict(v(0) => y_0, Dₜ(v)(0) => v_0 * sin(alpha)))\n\nts = solve(x - rhs(a₁), t)[1]\nyᵣ = rhs(a₂)(t => ts)\n\n \n\\[\n- \\frac{g \\log{\\left(\\frac{v_{0} \\cos{\\left(\\alpha \\right)}}{v_{0} \\cos{\\left(\\alpha \\right)} - x γ + x_{0} γ} \\right)}}{γ^{2}} + \\frac{g + v_{0} γ \\sin{\\left(\\alpha \\right)} + y_{0} γ^{2}}{γ^{2}} + \\frac{\\left(- g - v_{0} γ \\sin{\\left(\\alpha \\right)}\\right) \\left(v_{0} \\cos{\\left(\\alpha \\right)} - x γ + x_{0} γ\\right)}{v_{0} γ^{2} \\cos{\\left(\\alpha \\right)}}\n\\]\n\n\n\nThis gives \\(y\\) as a function of \\(x\\).\nThere are a lot of symbols. Let's simplify by using constants \\(x_0=y_0=0\\):\n\nyᵣ₁ = yᵣ(x_0 => 0, y_0 => 0)\n\n \n\\[\n- \\frac{g \\log{\\left(\\frac{v_{0} \\cos{\\left(\\alpha \\right)}}{v_{0} \\cos{\\left(\\alpha \\right)} - x γ} \\right)}}{γ^{2}} + \\frac{g + v_{0} γ \\sin{\\left(\\alpha \\right)}}{γ^{2}} + \\frac{\\left(- g - v_{0} γ \\sin{\\left(\\alpha \\right)}\\right) \\left(v_{0} \\cos{\\left(\\alpha \\right)} - x γ\\right)}{v_{0} γ^{2} \\cos{\\left(\\alpha \\right)}}\n\\]\n\n\n\nWhat is the trajectory? 
We see that the log function part will have issues when \\(-\\gamma x + v_0 \\cos(\\alpha) = 0\\).\nIf we fix some parameters, we can plot.\n\nv₀, γ₀, α = 200, 1/2, pi/4\nsoln = yᵣ₁(v_0=>v₀, γ=>γ₀, alpha=>α, g=>9.8)\nplot(soln, 0, v₀ * cos(α) / γ₀ - 1/10, legend=false)\n\n\n\n\nWe can see that the resistance makes the path quite non-symmetric."
},
{
"objectID": "ODEs/odes.html#visualizing-a-first-order-initial-value-problem",
"href": "ODEs/odes.html#visualizing-a-first-order-initial-value-problem",
"title": "48  ODEs",
"section": "48.5 Visualizing a first-order initial value problem",
"text": "48.5 Visualizing a first-order initial value problem\nThe solution, \\(y(x)\\), is known through its derivative. A useful tool to visualize the solution to a first-order differential equation is the slope field (or direction field) plot, which, at different values of \\((x,y)\\), plots a vector whose slope is given by \\(y'(x)\\). The vectorfieldplot of the CalculusWithJulia package can be used to produce these.\nFor example, in a previous example we found a solution to \\(y'(x) = x\\cdot y(x)\\), coded as\n\nF(y, x) = y*x\n\nF (generic function with 1 method)\n\n\nSuppose \\(x_0=1\\) and \\(y_0=1\\). Then a direction field plot is drawn through:\n\n@syms x y\nx0, y0 = 1, 1\n\nplot(legend=false)\nvectorfieldplot!((x,y) -> [1, F(y,x)], xlims=(x0, 2), ylims=(y0-5, y0+5))\n\nf(x) = y0*exp(-x0^2/2) * exp(x^2/2)\nplot!(f, linewidth=5)\n\n\n\n\nIn general, if the first-order equation is written as \\(y'(x) = F(y,x)\\), then we plot a “function” that takes \\((x,y)\\) and returns an \\(x\\) value of \\(1\\) and a \\(y\\) value of \\(F(y,x)\\), so the slope is \\(F(y,x)\\).\n\n\n\n\n\n\nNote\n\n\n\nThe order of variables in \\(F(y,x)\\) is conventional with the equation \\(y'(x) = F(y(x),x)\\).\n\n\nThe plots are also useful for illustrating solutions for different initial conditions:\n\np = plot(legend=false)\nx0, y0 = 1, 1\n\nvectorfieldplot!((x,y) -> [1,F(y,x)], xlims=(x0, 2), ylims=(y0-5, y0+5))\nfor y0 in -4:4\n f(x) = y0*exp(-x0^2/2) * exp(x^2/2)\n plot!(f, x0, 2, linewidth=5)\nend\np\n\n\n\n\nSuch solutions are called integral curves. These graphs illustrate the fact that the slope field is tangent to the graph of any integral curve."
},
{
"objectID": "ODEs/odes.html#questions",
"href": "ODEs/odes.html#questions",
"title": "48  ODEs",
"section": "48.6 Questions",
"text": "48.6 Questions\n\nQuestion\nUsing SymPy to solve the differential equation\n\\[\nu' = \\frac{1-x}{u}\n\\]\ngives\n\n@syms x u()\ndsolve(D(u)(x) - (1-x)/u(x))\n\n2-element Vector{Sym}:\n Eq(u(x), -sqrt(C1 - x^2 + 2*x))\n Eq(u(x), sqrt(C1 - x^2 + 2*x))\n\n\nThe two answers track positive and negative solutions. For the initial condition, \\(u(-1)=1\\), we have the second one is appropriate: \\(u(x) = \\sqrt{C_1 - x^2 + 2x}\\). At \\(-1\\) this gives: \\(1 = \\sqrt{C_1-3}\\), so \\(C_1 = 4\\).\nThis value is good for what values of \\(x\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\([-1, \\infty)\\)\n \n \n\n\n \n \n \n \n \\([-1, 4]\\)\n \n \n\n\n \n \n \n \n \\([1-\\sqrt{5}, 1 + \\sqrt{5}]\\)\n \n \n\n\n \n \n \n \n \\([-1, 0]\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose \\(y(x)\\) satisfies\n\\[\ny'(x) = y(x)^2, \\quad y(1) = 1.\n\\]\nWhat is \\(y(3/2)\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSolve the initial value problem\n\\[\ny' = 1 + x^2 + y(x)^2 + x^2 y(x)^2, \\quad y(0) = 1.\n\\]\nUse your answer to find \\(y(1)\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA population is modeled by \\(y(x)\\). The rate of population growth is generally proportional to the population (\\(k y(x)\\)), but as the population gets large, the rate is curtailed \\((1 - y(x)/M)\\).\nSolve the initial value problem\n\\[\ny'(x) = k\\cdot y(x) \\cdot (1 - \\frac{y(x)}{M}),\n\\]\nwhen \\(k=1\\), \\(M=100\\), and \\(y(0) = 20\\). 
Find the value of \\(y(5)\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSolve the initial value problem\n\\[\ny'(t) = \\sin(t) - \\frac{y(t)}{t}, \\quad y(\\pi) = 1\n\\]\nFind the value of the solution at \\(t=2\\pi\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose \\(u(x)\\) satisfies:\n\\[\n\\frac{du}{dx} = e^{-x} \\cdot u(x), \\quad u(0) = 1.\n\\]\nFind \\(u(5)\\) using SymPy.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe differential equation with boundary values\n\\[\n\\frac{d}{dr}\\left(r^2 \\frac{dc}{dr}\\right) = 0, \\quad c(1)=2, \\quad c(10)=1,\n\\]\ncan be solved with SymPy. What is the value of \\(c(5)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(8/9\\)\n \n \n\n\n \n \n \n \n \\(10/9\\)\n \n \n\n\n \n \n \n \n \\(3/2\\)\n \n \n\n\n \n \n \n \n \\(9/10\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe example with projectile motion in a medium has a parameter \\(\\gamma\\) modeling the effect of air resistance. If y is the answer - as would be the case if the example were copy-and-pasted in - what can be said about limit(y, gamma=>0)?\n\n\n\n \n \n \n \n \n \n \n \n \n The limit is a quadratic polynomial in x, mirroring the first part of that example.\n \n \n\n\n \n \n \n \n The limit does not exist; there is a singularity, as seen by setting gamma=0.\n \n \n\n\n \n \n \n \n The limit does not exist, but the limit to oo gives a quadratic polynomial in x, mirroring the first part of that example."
},
{
"objectID": "ODEs/euler.html",
"href": "ODEs/euler.html",
"title": "49  Eulers method",
"section": "",
"text": "This section uses these add-on packages:\nThe following section takes up the task of numerically approximating solutions to differential equations. Julia has a huge set of state-of-the-art tools for this task starting with the DifferentialEquations package. We don't use that package in this section, focusing on simpler methods and implementations for pedagogical purposes, but any further exploration should utilize the tools provided therein. A brief introduction to the package follows in an upcoming section.\nConsider the differential equation:\n\\[\ny'(x) = y(x) \\cdot x, \\quad y(1)=1,\n\\]\nwhich can be solved with SymPy:\nWith the given initial condition, the solution becomes:\nPlotting this solution over the slope field\nwe see that the vectors that are drawn seem to be tangent to the graph of the solution. This is no coincidence: the tangent lines to integral curves are in the direction of the slope field.\nIf the graph of the solution were not there, could we use this fact to approximately reconstruct the solution?\nThat is, if we stitched together pieces of the slope field, would we get a curve that was close to the actual answer?\nThe illustration suggests the answer is yes; let's see. The solution is drawn over \\(x\\) values \\(1\\) to \\(2\\). Let's try piecing together \\(5\\) pieces between \\(1\\) and \\(2\\) and see what we have.\nThe slope-field vectors are scaled versions of the vector [1, F(y,x)]. The 1 is the part in the direction of the \\(x\\) axis, so here we would like that to be \\(0.2\\) (which is \\((2-1)/5\\)). So our vectors would be 0.2 * [1, F(y,x)]. To allow for generality, we use h in place of the specific value \\(0.2\\).\nThen our first piece would be the line connecting \\((x_0,y_0)\\) to\n\\[\n\\langle x_0, y_0 \\rangle + h \\cdot \\langle 1, F(y_0, x_0) \\rangle.\n\\]\nThe above uses vector notation to add the piece scaled by \\(h\\) to the starting point. 
Rather than continue with that notation, we will use subscripts. Let \\(x_1\\), \\(y_1\\) be the position of the tip of the vector. Then we have:\n\\[\nx_1 = x_0 + h, \\quad y_1 = y_0 + h F(y_0, x_0).\n\\]\nWith this notation, it is easy to see what comes next:\n\\[\nx_2 = x_1 + h, \\quad y_2 = y_1 + h F(y_1, x_1).\n\\]\nWe just shifted the indices forward by \\(1\\). But graphically what is this? It takes the tip of the first part of our “stitched” together solution, finds the slope field there ([1, F(y,x)]) and then uses this direction to stitch together one more piece.\nClearly, we can repeat. The \\(n\\)th piece will end at:\n\\[\nx_{n+1} = x_n + h, \\quad y_{n+1} = y_n + h F(y_n, x_n).\n\\]\nFor our example, we can do some numerics. We want \\(h=0.2\\) and \\(5\\) pieces, so values of \\(y\\) at \\(x_0=1, x_1=1.2, x_2=1.4, x_3=1.6, x_4=1.8,\\) and \\(x_5=2\\).\nBelow we do this in a loop. We have to be a bit careful, as in Julia the vector of zeros we create to store our answers begins indexing at \\(1\\), and not \\(0\\).\nSo how did we do? Let's look graphically:\nNot bad. We wouldn't expect this to be exact - due to the concavity of the solution, each step is an underestimate. However, we see it is an okay approximation and would likely be better with a smaller \\(h\\), a topic we pursue in just a bit.\nRather than type in the above command each time, we wrap it all up in a function. The inputs are \\(n\\), \\(a=x_0\\), \\(b=x_n\\), \\(y_0\\), and, most importantly, \\(F\\). The output is massaged into a function through a call to linterp, rather than two vectors. 
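In sketch form, simplified versions of these two functions might look like the following (a sketch consistent with the description here; the definitions used in the notes may differ in detail):\n\nfunction linterp(xs, ys)\n function(x)\n ((x < xs[1]) || (x > xs[end])) && return NaN\n for i in 1:(length(xs) - 1)\n if xs[i] <= x <= xs[i + 1]\n l = (x - xs[i]) / (xs[i + 1] - xs[i])\n return (1 - l) * ys[i] + l * ys[i + 1] # linear interpolation on [xs[i], xs[i+1]]\n end\n end\n end\nend\n\nfunction euler(F, x0, xn, y0, n)\n h = (xn - x0) / n\n xs, ys = zeros(n + 1), zeros(n + 1)\n xs[1], ys[1] = x0, y0\n for i in 1:n\n xs[i + 1] = xs[i] + h\n ys[i + 1] = ys[i] + h * F(ys[i], xs[i]) # one Euler step\n end\n linterp(xs, ys)\nend\n\n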
The linterp function we define below just finds a function that linearly interpolates between the points and is NaN outside of the range of the \\(x\\) values:\nWith that, here is our function to find an approximate solution to \\(y'=F(y,x)\\) with initial condition:\nWith euler, it becomes easy to explore different values.\nFor example, we thought the solution would look better with a smaller \\(h\\) (or larger \\(n\\)). Instead of \\(n=5\\), lets try \\(n=50\\):\nIt is more work for the computer, but not for us, and clearly a much better approximation to the actual answer is found."
},
{
"objectID": "ODEs/euler.html#the-euler-method",
"href": "ODEs/euler.html#the-euler-method",
"title": "49  Eulers method",
"section": "49.1 The Euler method",
"text": "49.1 The Euler method\n\n\n\nFigure from first publication of Euler's method. From Gander and Wanner.\n\n\nThe name of our function reflects the mathematician associated with the iteration:\n\\[\nx_{n+1} = x_n + h, \\quad y_{n+1} = y_n + h \\cdot F(y_n, x_n),\n\\]\nto approximate a solution to the first-order, ordinary differential equation with initial values: \\(y'(x) = F(y,x)\\).\nThe Euler method uses linearization. Each “step” is just an approximation of the function value \\(y(x_{n+1})\\) with the value from the tangent line tangent to the point \\((x_n, y_n)\\).\nEach step introduces an error. The error in one step is known as the local truncation error and can be shown to be about equal to \\(1/2 \\cdot h^2 \\cdot y''(x_{n})\\), assuming \\(y\\) has \\(3\\) or more derivatives.\nThe total error, or more commonly, global truncation error, is the error between the actual answer and the approximate answer at the end of the process. It reflects an accumulation of these local errors. This error is bounded by a constant times \\(h\\). Since it gets smaller as \\(h\\) gets smaller in direct proportion, the Euler method is called first order.\nOther, somewhat more complicated, methods have global truncation errors that involve higher powers of \\(h\\) - that is, for the same size \\(h\\), the error is smaller. An analogy: Riemann sums have error that depends on \\(h\\), whereas other methods of approximating the integral have smaller errors. For example, Simpson's rule has error related to \\(h^4\\). So, while the Euler method may not be employed if there is concern about total resources (time, computer, …), it is important for theoretical purposes, in a manner similar to the role of the Riemann integral.\nIn the examples, we will see that for many problems the simple Euler method is satisfactory, but not always so. The task of numerically solving differential equations is not a one-size-fits-all one. 
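The “first order” claim can be checked empirically: halving \\(h\\) should roughly halve the global error. Here is a small, self-contained check on the model problem \\(y' = y\\), \\(y(0) = 1\\), whose exact value at \\(x=1\\) is \\(e\\) (the problem and the step sizes are chosen just for illustration):\n\nF(y, x) = y\nfor n in (10, 20, 40)\n h = 1 / n\n x, y = 0.0, 1.0\n for i in 1:n\n y = y + h * F(y, x) # one Euler step\n x = x + h\n end\n println((h = h, err = exp(1) - y)) # the error roughly halves as h does\nend\n\n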
In the following, a few different modifications are presented to the basic Euler method, but this just scratches the surface of the topic.\n\nExamples\n\nExample\nConsider the initial value problem \\(y'(x) = x + y(x)\\) with initial condition \\(y(0)=1\\). This problem can be solved exactly. Here we approximate over \\([0,2]\\) using Eulers method.\n\n𝑭(y,x) = x + y\n𝒙0, 𝒙n, 𝒚0 = 0, 2, 1\n𝒇 = euler(𝑭, 𝒙0, 𝒙n, 𝒚0, 25)\n𝒇(𝒙n)\n\n10.696950392438628\n\n\nWe graphically compare our approximate answer with the exact one:\n\nplot(𝒇, 𝒙0, 𝒙n)\n𝒐ut = dsolve(D(u)(x) - 𝑭(u(x),x), u(x), ics = Dict(u(𝒙0) => 𝒚0))\nplot(rhs(𝒐ut), 𝒙0, 𝒙n)\nplot!(𝒇, 𝒙0, 𝒙n)\n\n\n\n\nFrom the graph it appears our value for f(xn) will underestimate the actual value of the solution slightly.\n\n\nExample\nThe equation \\(y'(x) = \\sin(x \\cdot y)\\) is not separable, so need not have an easy solution. The default method will fail. Looking at the available methods with sympy.classify_ode(𝐞qn, u(x)) shows a power series method which can return a power series approximation (a Taylor polynomial). 
Let's compare an approximate answer given by the Euler method to the one returned by SymPy.\nFirst, the SymPy solution:\n\n𝐅(y,x) = sin(x*y)\n𝐞qn = D(u)(x) - 𝐅(u(x), x)\n𝐨ut = dsolve(𝐞qn, hint=\"1st_power_series\")\n\n \n\\[\nu{\\left(x \\right)} = C_{1} + \\frac{C_{1} x^{2}}{2} + \\frac{C_{1} x^{4} \\cdot \\left(3 - C_{1}^{2}\\right)}{24} + O\\left(x^{6}\\right)\n\\]\n\n\n\nIf we assume \\(y(0) = 1\\), we can continue:\n\n𝐨ut1 = dsolve(𝐞qn, u(x), ics=Dict(u(0) => 1), hint=\"1st_power_series\")\n\n \n\\[\nu{\\left(x \\right)} = 1 + \\frac{x^{2}}{2} + \\frac{x^{4}}{12} + O\\left(x^{6}\\right)\n\\]\n\n\n\nThe approximate value given by the Euler method is\n\n𝐱0, 𝐱n, 𝐲0 = 0, 2, 1\n\nplot(legend=false)\nvectorfieldplot!((x,y) -> [1, 𝐅(y,x)], xlims=(𝐱0, 𝐱n), ylims=(0,5))\nplot!(rhs(𝐨ut1).removeO(), linewidth=5)\n\n𝐮 = euler(𝐅, 𝐱0, 𝐱n, 𝐲0, 10)\nplot!(𝐮, linewidth=5)\n\n\n\n\nWe see that the answer found from using a polynomial series matches that of Euler's method for a bit, but as time evolves, the approximate solution given by Euler's method more closely tracks the slope field.\n\n\nExample\nThe Brachistochrone problem was posed by Johann Bernoulli in 1696. It asked for the curve between two points for which an object will fall faster along that curve than any other. For example, a bead sliding on a wire will take a certain amount of time to get from point \\(A\\) to point \\(B\\), the time depending on the shape of the wire. Which shape will take the least amount of time?\n\n\n\nA child's bead game. What shape of wire will produce the shortest time for a bead to slide from the top to the bottom?\n\n\nRestrict our attention to the \\(x\\)-\\(y\\) plane, and consider a path between the points \\((0,A)\\) and \\((B,0)\\). 
Let \\(y(x)\\) be the distance from \\(A\\), so \\(y(0)=0\\) and at the end \\(y\\) will be \\(A\\).\nGalileo knew the straight line was not the curve, but incorrectly thought the answer was a part of a circle.\n\n\n\nAs early as 1638, Galileo showed that an object falling along AC and then CB will fall faster than one traveling along AB, where C is on the arc of a circle. From the History of Math Archive.\n\n\nThis simulation also suggests that a curved path is better than the shorter straight one:\n\n\n \n The race is on. An illustration of beads falling along a path; as can be seen, some paths are faster than others. The fastest path would follow a cycloid. See Bensky and Moelter for details on simulating a bead on a wire.\n \n \n\n\n\nNow, the natural question is which path is best? The solution can be reduced to solving this equation for a positive \\(c\\):\n\\[\n1 + (y'(x))^2 = \\frac{c}{y}, \\quad c > 0.\n\\]\nRe-expressing, this becomes:\n\\[\n\\frac{dy}{dx} = \\sqrt{\\frac{c-y}{y}}.\n\\]\nThis is a separable equation and can be solved, but even SymPy has trouble with this integral. However, the result has been known to be a piece of a cycloid since the insightful Johann Bernoulli used an analogy from light bending to approach the problem. 
The answer is best described parametrically through:\n\\[\nx(u) = c\\cdot u - \\frac{c}{2}\\sin(2u), \\quad y(u) = \\frac{c}{2}(1 - \\cos(2u)), \\quad 0 \\leq u \\leq U.\n\\]\nThe values of \\(U\\) and \\(c\\) must satisfy \\((x(U), y(U)) = (B, A)\\).\nRather than pursue this, we will solve it numerically for a fixed value of \\(c\\) over a fixed interval to see the shape.\nThe equation can be written in terms of \\(y'=F(y,x)\\), where\n\\[\nF(y,x) = \\sqrt{\\frac{c-y}{y}}.\n\\]\nBut as \\(y_0 = 0\\), we immediately would have a problem with the first step, as there would be division by \\(0\\).\nThis says that for the optimal solution, the bead picks up speed by first sliding straight down before heading off towards \\(B\\). That's great for the physics, but runs roughshod over our Euler method, as the first step has an infinity.\nFor this, we can try the backwards Euler method, which uses the slope at \\((x_{n+1}, y_{n+1})\\) rather than \\((x_n, y_n)\\). The update step becomes:\n\\[\ny_{n+1} = y_n + h \\cdot F(y_{n+1}, x_{n+1}).\n\\]\nSeems innocuous, but the value we are trying to find, \\(y_{n+1}\\), is now on both sides of the equation, so it is only implicitly defined. In this code, we use the find_zero function from the Roots package. 
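As a warm-up to the function below, a single implicit step can be carried out for the simple equation \\(y' = -5y\\); here the implicit equation is linear, so the numeric answer can be checked by hand (the names and values are chosen just for illustration):\n\nusing Roots\nh, y₀ = 0.1, 1.0\nG(y, x) = -5y\n## solve y₁ = y₀ + h * G(y₁, x₁) for y₁\ny₁ = find_zero(y -> y₀ + h * G(y, h) - y, y₀)\n## algebraically, y₁ = y₀ / (1 + 5h) = 2/3\n\n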
The caveat is, this function needs a good initial guess, and the one we use below need not be widely applicable.\n\nfunction back_euler(F, x0, xn, y0, n)\n h = (xn - x0)/n\n xs = zeros(n+1)\n ys = zeros(n+1)\n xs[1] = x0\n ys[1] = y0\n for i in 1:n\n xs[i + 1] = xs[i] + h\n ## solve y[i+1] = y[i] + h * F(y[i+1], x[i+1])\n ys[i + 1] = find_zero(y -> ys[i] + h * F(y, xs[i + 1]) - y, ys[i]+h)\n end\n linterp(xs, ys)\nend\n\nback_euler (generic function with 1 method)\n\n\nWe then have with \\(C=1\\) over the interval \\([0,1.2]\\) the following:\n\n𝐹(y, x; C=1) = sqrt(C/y - 1)\n𝑥0, 𝑥n, 𝑦0 = 0, 1.2, 0\ncyc = back_euler(𝐹, 𝑥0, 𝑥n, 𝑦0, 50)\nplot(x -> 1 - cyc(x), 𝑥0, 𝑥n)\n\n\n\n\nRemember, \\(y\\) is the displacement from the top, so it is non-negative. Above we flipped the graph to make it look more like expectation. In general, the trajectory may actually dip below the ending point and come back up. The above wont see this, for as written \\(dy/dx \\geq 0\\), which need not be the case, as the defining equation is in terms of \\((dy/dx)^2\\), so the derivative could have any sign.\n\n\nExample: stiff equations\nThe Euler method is convergent, in that as \\(h\\) goes to \\(0\\), the approximate solution will converge to the actual answer. However, this does not say that for a fixed size \\(h\\), the approximate value will be good. For example, consider the differential equation \\(y'(x) = -5y\\). This has solution \\(y(x)=y_0 e^{-5x}\\). However, if we try the Euler method to get an answer over \\([0,2]\\) with \\(h=0.5\\) we dont see this:\n\n(y,x) = -5y\n𝓍0, 𝓍n, 𝓎0 = 0, 2, 1\n𝓊 = euler(, 𝓍0, 𝓍n, 𝓎0, 4) # n =4 => h = 2/4\nvectorfieldplot((x,y) -> [1, (y,x)], xlims=(0, 2), ylims=(-5, 5))\nplot!(x -> y0 * exp(-5x), 0, 2, linewidth=5)\nplot!(𝓊, 0, 2, linewidth=5)\n\n\n\n\nWhat we see is that the value of \\(h\\) is too big to capture the decay scale of the solution. 
A smaller \\(h\\), can do much better:\n\n𝓊₁ = euler(, 𝓍0, 𝓍n, 𝓎0, 50) # n=50 => h = 2/50\nplot(x -> y0 * exp(-5x), 0, 2)\nplot!(𝓊₁, 0, 2)\n\n\n\n\nThis is an example of a stiff equation. Such equations cause explicit methods like the Euler one problems, as small \\(h\\)s are needed to good results.\nThe implicit, backward Euler method does not have this issue, as we can see here:\n\n𝓊₂ = back_euler(, 𝓍0, 𝓍n, 𝓎0, 4) # n =4 => h = 2/4\nvectorfieldplot((x,y) -> [1, (y,x)], xlims=(0, 2), ylims=(-1, 1))\nplot!(x -> y0 * exp(-5x), 0, 2, linewidth=5)\nplot!(𝓊₂, 0, 2, linewidth=5)\n\n\n\n\n\n\nExample: The pendulum\nThe differential equation describing the simple pendulum is\n\\[\n\\theta''(t) = - \\frac{g}{l}\\sin(\\theta(t)).\n\\]\nThe typical approach to solving for \\(\\theta(t)\\) is to use the small-angle approximation that \\(\\sin(x) \\approx x\\), and then the differential equation simplifies to: \\(\\theta''(t) = -g/l \\cdot \\theta(t)\\), which is easily solved.\nHere we try to get an answer numerically. However, the problem, as stated, is not a first order equation due to the \\(\\theta''(t)\\) term. If we let \\(u(t) = \\theta(t)\\) and \\(v(t) = \\theta'(t)\\), then we get two coupled first order equations:\n\\[\nv'(t) = -g/l \\cdot \\sin(u(t)), \\quad u'(t) = v(t).\n\\]\nWe can try the Euler method here. A simple approach might be this iteration scheme:\n\\[\n\\begin{align*}\nx_{n+1} &= x_n + h,\\\\\nu_{n+1} &= u_n + h v_n,\\\\\nv_{n+1} &= v_n - h \\cdot g/l \\cdot \\sin(u_n).\n\\end{align*}\n\\]\nHere we need two initial conditions: one for the initial value \\(u(t_0)\\) and the initial value of \\(u'(t_0)\\). We have seen if we start at an angle \\(a\\) and release the bob from rest, so \\(u'(0)=0\\) we get a sinusoidal answer to the linearized model. What happens here? 
We let \\(a=1\\), \\(L=5\\) and \\(g=9.8\\):\nWe write a function to solve this starting from \\((x_0, y_0)\\) and ending at \\(x_n\\):\n\nfunction euler2(x0, xn, y0, yp0, n; g=9.8, l = 5)\n xs, us, vs = zeros(n+1), zeros(n+1), zeros(n+1)\n xs[1], us[1], vs[1] = x0, y0, yp0\n h = (xn - x0)/n\n for i = 1:n\n xs[i+1] = xs[i] + h\n us[i+1] = us[i] + h * vs[i]\n vs[i+1] = vs[i] + h * (-g / l) * sin(us[i])\n end\n linterp(xs, us)\nend\n\neuler2 (generic function with 1 method)\n\n\nLet's take \\(a = \\pi/4\\) as the initial angle; then the approximate solution should be \\(\\pi/4\\cos(\\sqrt{g/l}x)\\) with period \\(T = 2\\pi\\sqrt{l/g}\\). We first try to plot them over 4 periods:\n\n𝗅, 𝗀 = 5, 9.8\n𝖳 = 2pi * sqrt(𝗅/𝗀)\n𝗑0, 𝗑n, 𝗒0, 𝗒p0 = 0, 4𝖳, pi/4, 0\nplot(euler2(𝗑0, 𝗑n, 𝗒0, 𝗒p0, 20), 0, 4𝖳)\n\n\n\n\nSomething looks terribly amiss. The issue is the step size, \\(h\\), is too large to capture the oscillations. There are basically only \\(5\\) steps to capture a full up and down motion. Instead, we try to get about \\(20\\) steps per unit of time, so \\(n\\) must be not \\(20\\), but \\(4 \\cdot 20 \\cdot T \\approx 360\\). To this graph, we add the approximate one:\n\nplot(euler2(𝗑0, 𝗑n, 𝗒0, 𝗒p0, 360), 0, 4𝖳)\nplot!(x -> pi/4*cos(sqrt(𝗀/𝗅)*x), 0, 4𝖳)\n\n\n\n\nEven now, we still see that something seems amiss, though the issue is not as dramatic as before. The oscillatory nature of the pendulum is seen, but in the Euler solution, the amplitude grows, which would necessarily mean energy is being put into the system. A familiar instance of a pendulum would be a child on a swing. Without pumping the legs - putting energy in the system - the height of the swing's arc will not grow. Though we now have oscillatory motion, this growth indicates the solution is still not quite right. The issue is likely due to each step mildly overcorrecting and resulting in an overall growth. One of the questions pursues this a bit further."
},
{
"objectID": "ODEs/euler.html#questions",
"href": "ODEs/euler.html#questions",
"title": "49  Eulers method",
"section": "49.2 Questions",
"text": "49.2 Questions\n\nQuestion\nUse Eulers method with \\(n=5\\) to approximate \\(u(1)\\) where\n\\[\nu'(x) = x - u(x), \\quad u(0) = 1\n\\]\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the equation\n\\[\ny' = x \\cdot \\sin(y), \\quad y(0) = 1.\n\\]\nUse Eulers method with \\(n=50\\) to find the value of \\(y(5)\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the ordinary differential equation\n\\[\n\\frac{dy}{dx} = 1 - 2\\frac{y}{x}, \\quad y(1) = 0.\n\\]\nUse Eulers method to solve for \\(y(2)\\) when \\(n=50\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the ordinary differential equation\n\\[\n\\frac{dy}{dx} = \\frac{y \\cdot \\log(y)}{x}, \\quad y(2) = e.\n\\]\nUse Eulers method to solve for \\(y(3)\\) when \\(n=25\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the first-order non-linear ODE\n\\[\ny' = y \\cdot (1-2x), \\quad y(0) = 1.\n\\]\nUse Eulers method with \\(n=50\\) to approximate the solution \\(y\\) over \\([0,2]\\).\nWhat is the value at \\(x=1/2\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the value at \\(x=3/2\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion: The pendulum revisited.\nThe issue with the pendulums solution growing in amplitude can be addressed using a modification to the Euler method attributed to Cromer. The fix is to replace the term sin(us[i]) in the line vs[i+1] = vs[i] + h * (-g / l) * sin(us[i]) of the euler2 function with sin(us[i+1]), which uses the updated angular velocity in the \\(2\\)nd step in place of the value before the step.\nModify the euler2 function to implement the Euler-Cromer method. 
What do you see?\n\n\n\n \n \n \n \n \n \n \n \n \n The same as before - the amplitude grows\n \n \n\n\n \n \n \n \n The solution is identical to that of the approximation found by linearization of the sine term\n \n \n\n\n \n \n \n \n The solution has a constant amplitude, but its period is slightly shorter than that of the approximate solution found by linearization\n \n \n\n\n \n \n \n \n The solution has a constant amplitude, but its period is slightly longer than that of the approximate solution found by linearization"
},
{
"objectID": "ODEs/solve.html",
"href": "ODEs/solve.html",
"title": "50  The problem-algorithm-solve interface",
"section": "",
"text": "using Plots\nusing MonteCarloMeasurements\n\nThe DifferentialEquations.jl package is an entry point to a suite of Julia packages for numerically solving differential equations in Julia and other languages. A common interface is implemented that flexibly adjusts to the many different problems and algorithms covered by this suite of packages. In this section, we review a very informative post by discourse user @genkuroki who very nicely demonstrates the usefulness of the problem-algorithm-solve approach used with DifferentialEquations.jl. We slightly modify the presentation below for our needs, but suggest a perusal of the original post.\n\nExample: FreeFall\nThe motion of an object under a uniform gravitational field is of interest.\nThe parameters that govern the equations of motion are the gravitational constant, g; the initial height, y0; and the initial velocity, v0. The time span for which a solution is sought is tspan.\nA problem consists of these parameters. Typical Julia usage would be to create a structure to hold the parameters, which may be done as follows:\n\nstruct Problem{G, Y0, V0, TS}\n g::G\n y0::Y0\n v0::V0\n tspan::TS\nend\n\nProblem(;g=9.80665, y0=0.0, v0=30.0, tspan=(0.0,8.0)) = Problem(g, y0, v0, tspan)\n\nProblem\n\n\nThe above creates a type, Problem, and a default constructor with default values. (The original uses a more sophisticated setup that allows the two things above to be combined.)\nJust calling Problem() will create a problem suitable for the earth; passing different values for g would be possible for other planets.\nTo solve differential equations, there are many different possible algorithms. 
Here is the construction of two types to indicate two algorithms:\n\nstruct EulerMethod{T}\n dt::T\nend\nEulerMethod(; dt=0.1) = EulerMethod(dt)\n\nstruct ExactFormula{T}\n dt::T\nend\nExactFormula(; dt=0.1) = ExactFormula(dt)\n\nExactFormula\n\n\nThe above just specifies a type for dispatch - the directions indicating what code to use to solve the problem. As seen, each specifies a size for a time step with default of 0.1.\nA type for solutions is useful for different show methods or other methods. One can be created through:\nstruct Solution{Y, V, T, P<:Problem, A}\n y::Y\n v::V\n t::T\n prob::P\n alg::A\nend\nThe different algorithms then can be implemented as part of a generic solve function. Following the post we have:\n\nsolve(prob::Problem) = solve(prob, default_algorithm(prob))\ndefault_algorithm(prob::Problem) = EulerMethod()\n\nfunction solve(prob::Problem, alg::ExactFormula)\n g, y0, v0, tspan = prob.g, prob.y0, prob.v0, prob.tspan\n dt = alg.dt\n t0, t1 = tspan\n t = range(t0, t1 + dt/2; step = dt)\n\n y(t) = y0 + v0*(t - t0) - g*(t - t0)^2/2\n v(t) = v0 - g*(t - t0)\n\n Solution(y.(t), v.(t), t, prob, alg)\nend\n\nfunction solve(prob::Problem, alg::EulerMethod)\n g, y0, v0, tspan = prob.g, prob.y0, prob.v0, prob.tspan\n dt = alg.dt\n t0, t1 = tspan\n t = range(t0, t1 + dt/2; step = dt)\n\n n = length(t)\n y = Vector{typeof(y0)}(undef, n)\n v = Vector{typeof(v0)}(undef, n)\n y[1] = y0\n v[1] = v0\n\n for i in 1:n-1\n v[i+1] = v[i] - g*dt # F*h step of Euler\n y[i+1] = y[i] + v[i]*dt # F*h step of Euler\n end\n\n Solution(y, v, t, prob, alg)\nend\n\nsolve (generic function with 3 methods)\n\n\nThe post has a more elegant means to unpack the parameters from the structures, but for each of the above, the parameters are unpacked, and then the corresponding algorithm employed. 
As of version v1.7 of Julia, the syntax (;g,y0,v0,tspan) = prob could also be employed.\nThe exact formulas, y(t) = y0 + v0*(t - t0) - g*(t - t0)^2/2 and v(t) = v0 - g*(t - t0), follow from well-known physics formulas. Each answer is wrapped in a Solution type so that the answers found can be easily extracted in a uniform manner.\nFor example, plots of each can be obtained through:\n\nearth = Problem()\nsol_euler = solve(earth)\nsol_exact = solve(earth, ExactFormula())\n\nplot(sol_euler.t, sol_euler.y;\n label=\"Euler's method (dt = $(sol_euler.alg.dt))\", ls=:auto)\nplot!(sol_exact.t, sol_exact.y; label=\"exact solution\", ls=:auto)\ntitle!(\"On the Earth\"; xlabel=\"t\", legend=:bottomleft)\n\n\n\n\nFollowing the post, since the time step dt = 0.1 is not small enough, the error of the Euler method is rather large. Next we change the algorithm parameter, dt, to be smaller:\n\nearth₂ = Problem()\nsol_euler₂ = solve(earth₂, EulerMethod(dt = 0.01))\nsol_exact₂ = solve(earth₂, ExactFormula())\n\nplot(sol_euler₂.t, sol_euler₂.y;\n label=\"Euler's method (dt = $(sol_euler₂.alg.dt))\", ls=:auto)\nplot!(sol_exact₂.t, sol_exact₂.y; label=\"exact solution\", ls=:auto)\ntitle!(\"On the Earth\"; xlabel=\"t\", legend=:bottomleft)\n\n\n\n\nIt is worth noting that only the first line is modified, and only the method requires modification.\nWere the moon to be considered, the gravitational constant would need adjustment. 
This parameter is part of the problem, not the solution algorithm.\nSuch adjustments are made by passing different values to the Problem constructor:\n\nmoon = Problem(g = 1.62, tspan = (0.0, 40.0))\nsol_eulerₘ = solve(moon)\nsol_exactₘ = solve(moon, ExactFormula(dt = sol_euler.alg.dt))\n\nplot(sol_eulerₘ.t, sol_eulerₘ.y;\n label=\"Euler's method (dt = $(sol_eulerₘ.alg.dt))\", ls=:auto)\nplot!(sol_exactₘ.t, sol_exactₘ.y; label=\"exact solution\", ls=:auto)\ntitle!(\"On the Moon\"; xlabel=\"t\", legend=:bottomleft)\n\n\n\n\nThe code above also adjusts the time span in addition to the gravitational constant. The algorithm for the exact formula is set to use the dt value from the Euler method, for easier comparison. Otherwise, outside of the labels, the patterns are the same. Only those things that need changing are changed; the rest comes from defaults.\nThe above shows the benefits of using a common interface. Next, the post illustrates how other authors could extend this code, simply by adding a new solve method. 
For example,\n\nstruct Symplectic2ndOrder{T}\n dt::T\nend\nSymplectic2ndOrder(;dt=0.1) = Symplectic2ndOrder(dt)\n\nfunction solve(prob::Problem, alg::Symplectic2ndOrder)\n g, y0, v0, tspan = prob.g, prob.y0, prob.v0, prob.tspan\n dt = alg.dt\n t0, t1 = tspan\n t = range(t0, t1 + dt/2; step = dt)\n\n n = length(t)\n y = Vector{typeof(y0)}(undef, n)\n v = Vector{typeof(v0)}(undef, n)\n y[1] = y0\n v[1] = v0\n\n for i in 1:n-1\n ytmp = y[i] + v[i]*dt/2\n v[i+1] = v[i] - g*dt\n y[i+1] = ytmp + v[i+1]*dt/2\n end\n\n Solution(y, v, t, prob, alg)\nend\n\nsolve (generic function with 4 methods)\n\n\nHad the two prior methods been in a package, the other user could still extend the interface, as above, with just a slight standard modification.\nThe same approach works for this new type:\n\nearth₃ = Problem()\nsol_sympl₃ = solve(earth₃, Symplectic2ndOrder(dt = 2.0))\nsol_exact₃ = solve(earth₃, ExactFormula())\n\nplot(sol_sympl₃.t, sol_sympl₃.y; label=\"2nd order symplectic (dt = $(sol_sympl₃.alg.dt))\", ls=:auto)\nplot!(sol_exact₃.t, sol_exact₃.y; label=\"exact solution\", ls=:auto)\ntitle!(\"On the Earth\"; xlabel=\"t\", legend=:bottomleft)\n\n\n\n\nFinally, the author of the post shows how the interface can compose with other packages in the Julia package ecosystem. 
This example uses the external package MonteCarloMeasurements, which plots the behavior of the system for perturbations of the initial value:\n\nearth₄ = Problem(y0 = 0.0 ± 0.0, v0 = 30.0 ± 1.0)\nsol_euler₄ = solve(earth₄)\nsol_sympl₄ = solve(earth₄, Symplectic2ndOrder(dt = 2.0))\nsol_exact₄ = solve(earth₄, ExactFormula())\n\nylim = (-100, 60)\nP = plot(sol_euler₄.t, sol_euler₄.y;\n label=\"Euler's method (dt = $(sol_euler₄.alg.dt))\", ls=:auto)\ntitle!(\"On the Earth\"; xlabel=\"t\", legend=:bottomleft, ylim)\n\nQ = plot(sol_sympl₄.t, sol_sympl₄.y;\n label=\"2nd order symplectic (dt = $(sol_sympl₄.alg.dt))\", ls=:auto)\ntitle!(\"On the Earth\"; xlabel=\"t\", legend=:bottomleft, ylim)\n\nR = plot(sol_exact₄.t, sol_exact₄.y; label=\"exact solution\", ls=:auto)\ntitle!(\"On the Earth\"; xlabel=\"t\", legend=:bottomleft, ylim)\n\nplot(P, Q, R; size=(720, 600))\n\n\n\n\nThe only change was in the problem, Problem(y0 = 0.0 ± 0.0, v0 = 30.0 ± 1.0), where a different number type is used which accounts for uncertainty. The rest follows the same pattern.\nThis example shows the flexibility of the problem-algorithm-solver pattern while maintaining a consistent pattern for execution."
},
{
"objectID": "ODEs/differential_equations.html",
"href": "ODEs/differential_equations.html",
"title": "51  The DifferentialEquations suite",
"section": "",
"text": "This section uses these add-on packages:\nThe DifferentialEquations suite of packages contains solvers for a wide range of differential equations. This section just briefly touches on ordinary differential equations (ODEs), and so relies only on the OrdinaryDiffEq part of the suite. For more detail on this type and many others covered by the suite of packages, there are many other resources, including the documentation and accompanying tutorials."
},
{
"objectID": "ODEs/differential_equations.html#sir-model",
"href": "ODEs/differential_equations.html#sir-model",
"title": "51  The DifferentialEquations suite",
"section": "51.1 SIR Model",
"text": "51.1 SIR Model\nWe follow along with an introduction to the SIR model for the spread of disease by Smith and Moore. This model received a workout due to the COVID-19 pandemic.\nThe basic model breaks a population into three cohorts: The susceptible individuals, the infected individuals, and the recovered individuals. These add to the population size, \\(N\\), which is fixed, but the cohort sizes vary in time. We name these cohort sizes \\(S(t)\\), \\(I(t)\\), and \\(R(t)\\) and define \\(s(t)=S(t)/N\\), \\(i(t) = I(t)/N\\) and \\(r(t) = R(t)/N\\) to be the respective proportions.\nThe following assumptions are made about these cohorts by Smith and Moore:\n\nNo one is added to the susceptible group, since we are ignoring births and immigration. The only way an individual leaves the susceptible group is by becoming infected.\n\nThis implies the rate of change in time of \\(S(t)\\) depends on the current number of susceptibles, and the amount of interaction with the infected cohorts. The model assumes each infected person has \\(b\\) contacts per day that are sufficient to spread the disease. Not all contacts will be with susceptible people, but if people are assumed to mix within the cohorts, then there will be on average \\(b \\cdot S(t)/N\\) contacts with susceptible people per infected person. As each infected person is modeled identically, the time rate of change of \\(S(t)\\) is:\n\\[\n\\frac{dS}{dt} = - b \\cdot \\frac{S(t)}{N} \\cdot I(t) = -b \\cdot s(t) \\cdot I(t)\n\\]\nIt is negative, as no one is added, only taken off. 
After dividing by \\(N\\), this can also be expressed as \\(s'(t) = -b s(t) i(t)\\).\n\nassume that a fixed fraction \\(k\\) of the infected group will recover during any given day.\n\nThis means the change in time of the recovered depends on \\(k\\) and the number infected, giving rise to the equation\n\\[\n\\frac{dR}{dt} = k \\cdot I(t)\n\\]\nwhich can also be expressed in proportions as \\(r'(t) = k \\cdot i(t)\\).\nFinally, from \\(S(t) + I(t) + R(t) = N\\) we have \\(S'(t) + I'(t) + R'(t) = 0\\) or \\(s'(t) + i'(t) + r'(t) = 0\\).\nCombining, it is possible to express the rate of change of the infected population through:\n\\[\n\\frac{di}{dt} = b \\cdot s(t) \\cdot i(t) - k \\cdot i(t)\n\\]\nThe authors apply this model to flu statistics from Hong Kong where:\n\\[\n\\begin{align*}\nS(0) &= 7,900,000\\\\\nI(0) &= 10\\\\\nR(0) &= 0\\\\\n\\end{align*}\n\\]\nIn Julia we define these: N to model the total population, and u0 to be the proportions.\n\nS0, I0, R0 = 7_900_000, 10, 0\nN = S0 + I0 + R0\nu0 = [S0, I0, R0]/N # initial proportions\n\n3-element Vector{Float64}:\n 0.9999987341788175\n 1.2658211825048323e-6\n 0.0\n\n\nAn estimated set of values for \\(k\\) and \\(b\\) are \\(k=1/3\\), coming from the average period of infectiousness being estimated at three days, and \\(b=1/2\\), which seems low in normal times, but not for an infected person who may be feeling quite ill and staying at home. (The model for COVID would certainly have a larger \\(b\\) value.)\nOkay, the mathematical modeling is done; now we try to solve for the unknown functions using DifferentialEquations.\nTo warm up, if \\(b=0\\) then \\(i'(t) = -k \\cdot i(t)\\) describes the infected. (There is no circulation of people in this case.) 
The solution would be achieved through:\n\nk = 1/3\n\nf(u,p,t) = -k * u # solving u(t) = - k u(t)\ntime_span = (0.0, 20.0)\n\nprob = ODEProblem(f, I0/N, time_span)\nsol = solve(prob, Tsit5(), reltol=1e-8, abstol=1e-8)\n\nplot(sol)\n\n\n\n\nThe sol object is a set of numbers with a convenient plot method. As may have been expected, this graph shows exponential decay.\nA few comments are in order. The problem we want to solve is\n\\[\n\\frac{di}{dt} = -k \\cdot i(t) = F(i(t), k, t)\n\\]\nwhere \\(F\\) depends on the current value (\\(i\\)), a parameter (\\(k\\)), and the time (\\(t\\)). We did not utilize \\(p\\) above for the parameter, as it was easy not to, but could have, and will in the following. The time variable \\(t\\) does not appear by itself in our equation, so only f(u, p, t) = -k * u was used, u the generic name for a solution which in this case is \\(i\\).\nThe problem we set up needs an initial value (the \\(u0\\)) and a time span to solve over. Here we want time to model real time, so use floating point values.\nThe plot shows steady decay, as there is no mixing of infected with others.\nAdding in the interaction requires a bit more work. We now have what is known as a system of equations:\n\\[\n\\begin{align*}\n\\frac{ds}{dt} &= -b \\cdot s(t) \\cdot i(t)\\\\\n\\frac{di}{dt} &= b \\cdot s(t) \\cdot i(t) - k \\cdot i(t)\\\\\n\\frac{dr}{dt} &= k \\cdot i(t)\\\\\n\\end{align*}\n\\]\nSystems of equations can be solved in a similar manner as a single ordinary differential equation, though adjustments are made to accommodate the multiple functions.\nWe use a style that updates values in place, and note that u now holds \\(3\\) different functions at once:\n\nfunction sir!(du, u, p, t)\n k, b = p\n s, i, r = u[1], u[2], u[3]\n\n ds = -b * s * i\n di = b * s * i - k * i\n dr = k * i\n\n du[1], du[2], du[3] = ds, di, dr\nend\n\nsir! (generic function with 1 method)\n\n\nThe notation du is suggestive of both the derivative and a small increment. 
The mathematical formulation follows the derivative; the numeric solution uses a time step and increments the solution over this time step. The Tsit5() solver, used here, adaptively chooses a time step, dt; were the Euler method used, this time step would need to be explicit.\n\n\n\n\n\n\nMutation not re-binding\n\n\n\nThe sir! function has the trailing ! indicating by convention it mutates its first value, du. In this case, through an assignment, as in du[1]=ds. This could use some explanation. The binding du refers to the container holding the \\(3\\) values, whereas du[1] refers to the first value in that container. So du[1]=ds changes the first value, but not the binding of du to the container. That is, du mutates. This would be quite different were the call du = [ds,di,dr], which would create a new binding to a new container and not mutate the values in the original container.\n\n\nWith the update function defined, the problem is set up and a solution found in the same manner:\n\np = (k=1/3, b=1/2) # parameters\ntime_span = (0.0, 150.0) # time span to solve over, 5 months\n\nprob = ODEProblem(sir!, u0, time_span, p)\nsol = solve(prob, Tsit5())\n\nplot(sol)\nplot!(x -> 0.5, linewidth=2) # mark 50% line\n\n\n\n\nThe lower graph shows the number of infected at each day over the five-month period displayed. The peak is around 6-7% of the population at any one time. However, over time the recovered part of the population reaches over 50%, meaning more than half the population is modeled as getting sick.\nNow we change the parameter \\(b\\) and observe the difference. 
We passed in a value p holding our two parameters, so we just need to change that and run the model again:\n\np = (k=1/2, b=2) # change b from 1/2 to 2 -- more daily contact\nprob = ODEProblem(sir!, u0, time_span, p)\nsol = solve(prob, Tsit5())\n\nplot(sol)\n\n\n\n\nThe graphs are somewhat similar, but the steady state is reached much more quickly and nearly everyone became infected.\nWhat about if \\(k\\) were bigger?\n\np = (k=2/3, b=1/2)\nprob = ODEProblem(sir!, u0, time_span, p)\nsol = solve(prob, Tsit5())\n\nplot(sol)\n\n\n\n\nThe graphs show that under these conditions the infections never take off; we have \\(i' = (b\\cdot s-k)i = k\\cdot((b/k) s - 1) i\\) which is always negative, since (b/k)s < 1, so infections will only decay.\nThe solution object is indexed by time, then has the s, i, r estimates. We use this structure below to return the estimated proportion of recovered individuals at the end of the time span.\n\nfunction recovered(k,b)\n prob = ODEProblem(sir!, u0, time_span, (k,b));\n sol = solve(prob, Tsit5());\n s,i,r = last(sol)\n r\nend\n\nrecovered (generic function with 1 method)\n\n\nThis function makes it easy to see the impact of changing the parameters. For example, fixing \\(k=1/3\\) we have:\n\nf(b) = recovered(1/3, b)\nplot(f, 0, 2)\n\n\n\n\nThis very clearly shows the sharp dependence on the value of \\(b\\); below some level, the proportion of people who are ever infected (the recovered cohort) remains near \\(0\\); above that level it can climb quickly towards \\(1\\).\nThe function recovered is of two variables returning a single value. 
In subsequent sections we will see a few \\(3\\)-dimensional plots that are common for such functions, here we skip ahead and show how to visualize multiple function plots at once using “z” values in a graph.\n\nk, ks = 0.1, 0.2:0.1:0.9 # first `k` and then the rest\nbs = range(0, 2, length=100)\nzs = recovered.(k, bs) # find values for fixed k, each of bs\np = plot(bs, k*one.(bs), zs, legend=false) # k*one.(ks) is [k,k,...,k]\nfor k in ks\n plot!(p, bs, k*one.(bs), recovered.(k, bs))\nend\np\n\n\n\n\nThe 3-dimensional graph with plotly can have its viewing angle adjusted with the mouse. When looking down on the \\(x-y\\) plane, which code b and k, we can see the rapid growth along a line related to \\(b/k\\).\nSmith and Moore point out that \\(k\\) is roughly the reciprocal of the number of days an individual is sick enough to infect others. This can be estimated during a breakout. However, they go on to note that there is no direct way to observe \\(b\\), but there is an indirect way.\nThe ratio \\(c = b/k\\) is the number of close contacts per day times the number of days infected which is the number of close contacts per infected individual.\nThis can be estimated from the curves once steady state has been reached (at the end of the pandemic).\n\\[\n\\frac{di}{ds} = \\frac{di/dt}{ds/dt} = \\frac{b \\cdot s(t) \\cdot i(t) - k \\cdot i(t)}{-b \\cdot s(t) \\cdot i(t)} = -1 + \\frac{1}{c \\cdot s}\n\\]\nThis equation does not depend on \\(t\\); \\(s\\) is the dependent variable. It could be solved numerically, but in this case affords an algebraic solution: \\(i = -s + (1/c) \\log(s) + q\\), where \\(q\\) is some constant. The quantity \\(q = i + s - (1/c) \\log(s)\\) does not depend on time, so is the same at time \\(t=0\\) as it is as \\(t \\rightarrow \\infty\\). 
At \\(t=0\\) we have \\(s(0) \\approx 1\\) and \\(i(0) \\approx 0\\), whereas as \\(t \\rightarrow \\infty\\), \\(i(t) \\rightarrow 0\\) and \\(s(t)\\) goes to the steady state value, which can be estimated. Solving with \\(t=0\\), we see \\(q=0 + 1 - (1/c)\\log(1) = 1\\). In the limit then \\(1 = 0 + s_{\\infty} - (1/c)\\log(s_\\infty)\\) or \\(c = \\log(s_\\infty)/(s_\\infty - 1)\\)."
},
{
"objectID": "ODEs/differential_equations.html#trajectory-with-drag",
"href": "ODEs/differential_equations.html#trajectory-with-drag",
"title": "51  The DifferentialEquations suite",
"section": "51.2 Trajectory with drag",
"text": "51.2 Trajectory with drag\nWe now solve numerically the problem of a trajectory with a drag force from air resistance.\nThe general model is:\n\\[\n\\begin{align*}\nx''(t) &= - W(t,x(t), x'(t), y(t), y'(t)) \\cdot x'(t)\\\\\ny''(t) &= -g - W(t,x(t), x'(t), y(t), y'(t)) \\cdot y'(t)\\\\\n\\end{align*}\n\\]\nwith initial conditions: \\(x(0) = y(0) = 0\\) and \\(x'(0) = v_0 \\cos(\\theta), y'(0) = v_0 \\sin(\\theta)\\).\nThis is turned into an ODE by a standard trick. Here we define our function for updating a step. As can be seen the vector u contains both \\(\\langle x,y \\rangle\\) and \\(\\langle x',y' \\rangle\\)\n\nfunction xy!(du, u, p, t)\n g, γ = p.g, p.k\n x, y = u[1], u[2]\n x, y = u[3], u[4] # unicode \\prime[tab]\n\n W = γ\n\n du[1] = x\n du[2] = y\n du[3] = 0 - W * x\n du[4] = -g - W * y\nend\n\nxy! (generic function with 1 method)\n\n\nThis function \\(W\\) is just a constant above, but can be easily modified as desired.\n\n\n\n\n\n\nA second-order ODE is a coupled first-order ODE\n\n\n\nThe “standard” trick is to take a second order ODE like \\(u''(t)=u\\) and turn this into two coupled ODEs by using a new name: \\(v=u'(t)\\) and then \\(v'(t) = u(t)\\). In this application, there are \\(4\\) equations, as we have both \\(x''\\) and \\(y''\\) being so converted. 
The first and second components of \\(du\\) are new variables, the third and fourth show the original equation.\n\n\nThe initial conditions are specified through:\n\nθ = pi/4\nv₀ = 200\nxy₀ = [0.0, 0.0]\nvxy₀ = v₀ * [cos(θ), sin(θ)]\nINITIAL = vcat(xy₀, vxy₀)\n\n4-element Vector{Float64}:\n 0.0\n 0.0\n 141.4213562373095\n 141.42135623730948\n\n\nThe time span can be computed using an upper bound of no drag, for which the classic physics formulas give (when \\(y_0=0\\)) \\((0, 2v_{y0}/g)\\)\n\ng = 9.8\nTSPAN = (0, 2*vxy₀[2] / g)\n\n(0, 28.8615012729203)\n\n\nThis allows us to define an ODEProblem:\n\ntrajectory_problem = ODEProblem(xy!, INITIAL, TSPAN)\n\n\nODEProblem with uType Vector{Float64} and tType Float64. In-place: true\ntimespan: (0.0, 28.8615012729203)\nu0: 4-element Vector{Float64}:\n 0.0\n 0.0\n 141.4213562373095\n 141.42135623730948\n\n\n\nWhen \\(\\gamma = 0\\) there should be no drag and we expect to see a parabola:\n\nps = (g=9.8, k=0)\nSOL = solve(trajectory_problem, Tsit5(); p = ps)\n\nplot(t -> SOL(t)[1], t -> SOL(t)[2], TSPAN...; legend=false)\n\n\n\n\nThe plot is a parametric plot of the \\(x\\) and \\(y\\) parts of the solution over the time span. We can see the expected parabolic shape.\nOn a windy day, the value of \\(k\\) would be positive. Repeating the above with \\(k=1/4\\) gives:\n\nps = (g=9.8, k=1/4)\nSOL = solve(trajectory_problem, Tsit5(); p = ps)\n\nplot(t -> SOL(t)[1], t -> SOL(t)[2], TSPAN...; legend=false)\n\n\n\n\nWe see that the \\(y\\) values have gone negative. The DifferentialEquations package can adjust for that with a callback which terminates the problem once \\(y\\) has gone negative. 
This can be implemented as follows:\n\ncondition(u,t,integrator) = u[2] # called when `u[2]` is negative\naffect!(integrator) = terminate!(integrator) # stop the process\ncb = ContinuousCallback(condition, affect!)\n\nps = (g=9.8, k = 1/4)\nSOL = solve(trajectory_problem, Tsit5(); p = ps, callback=cb)\n\nplot(t -> SOL(t)[1], t -> SOL(t)[2], TSPAN...; legend=false)\n\n\n\n\nFinally, we note that the ModelingToolkit package provides symbolic-numeric computing. This allows the equations to be set up symbolically, as in SymPy before being passed off to DifferentialEquations to solve numerically. The above example with no wind resistance could be translated into the following:\n\n@parameters t γ g\n@variables x(t) y(t)\nD = Differential(t)\n\neqs = [D(D(x)) ~ -γ * D(x),\n D(D(y)) ~ -g - γ * D(y)]\n\n@named sys = ODESystem(eqs)\nsys = ode_order_lowering(sys) # turn 2nd order into 1st\n\nu0 = [D(x) => vxy₀[1],\n D(y) => vxy₀[2],\n x => 0.0,\n y => 0.0]\n\np = [γ => 0.0,\n g => 9.8]\n\nprob = ODEProblem(sys, u0, TSPAN, p, jac=true)\nsol = solve(prob,Tsit5())\n\nplot(t -> sol(t)[3], t -> sol(t)[4], TSPAN..., legend=false)\n\n\n\n\nThe toolkit will automatically generate fast functions and can perform transformations (such as is done by ode_order_lowering) before passing along to the numeric solves."
},
{
"objectID": "differentiable_vector_calculus/polar_coordinates.html",
"href": "differentiable_vector_calculus/polar_coordinates.html",
"title": "52  Polar Coordinates and Curves",
"section": "",
"text": "This section uses these add-on packages:\nThe description of the \\(x\\)-\\(y\\) plane via Cartesian coordinates is not the only possible way, though it is the most familiar one. Here we discuss a different means. Instead of talking about over and up from an origin, we focus on a direction and a distance from the origin."
},
{
"objectID": "differentiable_vector_calculus/polar_coordinates.html#definition-of-polar-coordinates",
"href": "differentiable_vector_calculus/polar_coordinates.html#definition-of-polar-coordinates",
"title": "52  Polar Coordinates and Curves",
"section": "52.1 Definition of polar coordinates",
"text": "52.1 Definition of polar coordinates\nPolar coordinates parameterize the plane through an angle \\(\\theta\\) made from the positive ray of the \\(x\\) axis and a radius \\(r\\).\n\n\n\n\n\nTo recover the Cartesian coordinates from the pair \\((r,\\theta)\\), we have these formulas from right triangle geometry:\n\\[\nx = r \\cos(\\theta),~ y = r \\sin(\\theta).\n\\]\nEach point \\((x,y)\\) corresponds to several possible values of \\((r,\\theta)\\), as any integer multiple of \\(2\\pi\\) added to \\(\\theta\\) will describe the same point. Except for the origin, there is only one pair when we restrict to \\(r > 0\\) and \\(0 \\leq \\theta < 2\\pi\\).\nFor values in the first and fourth quadrants (the range of \\(\\tan^{-1}(x)\\)), we have:\n\\[\nr = \\sqrt{x^2 + y^2},~ \\theta=\\tan^{-1}(y/x).\n\\]\nFor the other two quadrants, the signs of \\(y\\) and \\(x\\) must be considered. This is done with the function atan when two arguments are used.\nFor example, \\((-3, 4)\\) would have polar coordinates:\n\nx,y = -3, 4\nrad, theta = sqrt(x^2 + y^2), atan(y, x)\n\n(5.0, 2.214297435588181)\n\n\nAnd reversing\n\nrad*cos(theta), rad*sin(theta)\n\n(-2.999999999999999, 4.000000000000001)\n\n\nThis figure illustrates:\n\n\n\n\n\nThe case where \\(r < 0\\) is handled by going \\(180\\) degrees in the opposite direction; in other words, the point \\((r, \\theta)\\) can be described as well by \\((-r,\\theta+\\pi)\\)."
},
{
"objectID": "differentiable_vector_calculus/polar_coordinates.html#parameterizing-curves-using-polar-coordinates",
"href": "differentiable_vector_calculus/polar_coordinates.html#parameterizing-curves-using-polar-coordinates",
"title": "52  Polar Coordinates and Curves",
"section": "52.2 Parameterizing curves using polar coordinates",
"text": "52.2 Parameterizing curves using polar coordinates\nIf \\(r=r(\\theta)\\), then the parameterized curve \\((r(\\theta), \\theta)\\) is just the set of points generated as \\(\\theta\\) ranges over some set of values. There are many examples of parameterized curves that simplify what might be a complicated presentation in Cartesian coordinates.\nFor example, a circle has the form \\(x^2 + y^2 = R^2\\). Whereas, parameterized by polar coordinates, it is just \\(r(\\theta) = R\\), or a constant function.\nThe circle centered at \\((r_0, \\gamma)\\) (in polar coordinates) with radius \\(R\\) has a more involved description in polar coordinates:\n\\[\nr(\\theta) = r_0 \\cos(\\theta - \\gamma) + \\sqrt{R^2 - r_0^2\\sin^2(\\theta - \\gamma)}.\n\\]\nIn the case where \\(r_0 > R\\), the curve will not be defined for all values of \\(\\theta\\), only when \\(|\\sin(\\theta-\\gamma)| \\leq R/r_0\\).\n\nExamples\nThe Plots.jl package provides a means to visualize polar plots through plot(thetas, rs, proj=:polar). For example, to plot a circle with \\(r_0=1/2\\) and \\(\\gamma=\\pi/6\\) we would have:\n\nR, r0, gamma = 1, 1/2, pi/6\nr(theta) = r0 * cos(theta-gamma) + sqrt(R^2 - r0^2*sin(theta-gamma)^2)\nts = range(0, 2pi, length=100)\nrs = r.(ts)\nplot(ts, rs, proj=:polar, legend=false)\n\n\n\n\nTo avoid having to create values for \\(\\theta\\) and values for \\(r\\), the CalculusWithJulia package provides a helper function, plot_polar. To distinguish it from other functions provided by Plots, the calling pattern is different. It specifies an interval to plot over by a..b and puts that first (this notation for closed intervals is from IntervalSets), followed by r. 
Other keyword arguments are passed on to a plot call.\nWe will use this in the following, as the graphs are a bit more familiar and the calling pattern is similar to how we have plotted functions.\nAs Plots will make a parametric plot when called as plot(function, function, a,b), the above function creates two such functions using the relationship \\(x=r\\cos(\\theta)\\) and \\(y=r\\sin(\\theta)\\).\nUsing plot_polar, we can plot circles with the following. We have to be a bit careful for the general circle, as when the center is farther away from the origin than the radius (\\(R\\)), then not all angles will be acceptable and there are two functions needed to describe the radius, as this comes from a quadratic equation and both the “plus” and “minus” terms are used.\n\nR=4; r(t) = R;\n\nfunction plot_general_circle!(r0, gamma, R)\n # law of cosines has if gamma=0, |theta| <= asin(R/r0)\n # R^2 = a^2 + r^2 - 2a*r*cos(theta); solve for a\n r(t) = r0 * cos(t - gamma) + sqrt(R^2 - r0^2*sin(t-gamma)^2)\n l(t) = r0 * cos(t - gamma) - sqrt(R^2 - r0^2*sin(t-gamma)^2)\n\n if R < r0\n theta = asin(R/r0)-1e-6 # avoid round off issues\n plot_polar!((gamma-theta)..(gamma+theta), r)\n plot_polar!((gamma-theta)..(gamma+theta), l)\n else\n plot_polar!(0..2pi, r)\n end\nend\n\nplot_polar(0..2pi, r, aspect_ratio=:equal, legend=false)\nplot_general_circle!(2, 0, 2)\nplot_general_circle!(3, 0, 1)\n\n\n\n\nThere are many interesting examples of curves described by polar coordinates. An interesting compilation of famous curves is found at the MacTutor History of Mathematics archive, many of which have formulas in polar coordinates.\n\nExample\nThe rhodonea curve has\n\\[\nr(\\theta) = a \\sin(k\\theta)\n\\]\n\na, k = 4, 5\nr(theta) = a * sin(k * theta)\nplot_polar(0..pi, r)\n\n\n\n\nThis graph has radius \\(0\\) whenever \\(\\sin(k\\theta) = 0\\) or \\(k\\theta =n\\pi\\). Solving, this means it is \\(0\\) at integer multiples of \\(\\pi/k\\). 
In the above, with \\(k=5\\), there will \\(5\\) zeroes in \\([0,\\pi]\\). The entire curve is traced out over this interval, the values from \\(\\pi\\) to \\(2\\pi\\) yield negative value of \\(r\\), so are related to values within \\(0\\) to \\(\\pi\\) via the relation \\((r,\\pi +\\theta) = (-r, \\theta)\\).\n\n\nExample\nThe folium is a somewhat similar looking curve, but has this description:\n\\[\nr(\\theta) = -b \\cos(\\theta) + 4a \\cos(\\theta) \\sin(2\\theta)\n\\]\n\n𝒂, 𝒃 = 4, 2\n𝒓(theta) = -𝒃 * cos(theta) + 4𝒂 * cos(theta) * sin(2theta)\nplot_polar(0..2pi, 𝒓)\n\n\n\n\nThe folium has radial part \\(0\\) when \\(\\cos(\\theta) = 0\\) or \\(\\sin(2\\theta) = b/4a\\). This could be used to find out what values correspond to which loop. For our choice of \\(a\\) and \\(b\\) this gives \\(\\pi/2\\), \\(3\\pi/2\\) or, as \\(b/4a = 1/8\\), when \\(\\sin(2\\theta) = 1/8\\) which happens at \\(a_0=\\sin^{-1}(1/8)/2=0.0626...\\) and \\(\\pi/2 - a_0\\), \\(\\pi+a_0\\) and \\(3\\pi/2 - a_0\\). The first folium can be plotted with:\n\n𝒂0 = (1/2) * asin(1/8)\nplot_polar(𝒂0..(pi/2-𝒂0), 𝒓)\n\n\n\n\nThe second - which is too small to appear in the initial plot without zooming in - with\n\nplot_polar((pi/2 - 𝒂0)..(pi/2), 𝒓)\n\n\n\n\nThe third with\n\nplot_polar((pi/2)..(pi + 𝒂0), 𝒓)\n\n\n\n\nThe plot repeats from there, so the initial plot could have been made over \\([0, \\pi + a_0]\\).\n\n\nExample\nThe Limacon of Pascal has\n\\[\nr(\\theta) = b + 2a\\cos(\\theta)\n\\]\n\na,b = 4, 2\nr(theta) = b + 2a*cos(theta)\nplot_polar(0..2pi, r)\n\n\n\n\n\n\nExample\nSome curves require a longer parameterization, such as this where we plot over \\([0, 8\\pi]\\) so that the cosine term can range over an entire half period:\n\nr(theta) = sqrt(abs(cos(theta/8)))\nplot_polar(0..8pi, r)"
},
{
"objectID": "differentiable_vector_calculus/polar_coordinates.html#area-of-polar-graphs",
"href": "differentiable_vector_calculus/polar_coordinates.html#area-of-polar-graphs",
"title": "52  Polar Coordinates and Curves",
"section": "52.3 Area of polar graphs",
"text": "52.3 Area of polar graphs\nConsider the cardioid described by \\(r(\\theta) = 2(1 + \\cos(\\theta))\\):\n\nr(theta) = 2(1 + cos(theta))\nplot_polar(0..2pi, r)\n\n\n\n\nHow much area is contained in the graph?\nIn some cases it might be possible to translate back into Cartesian coordinates and compute from there. In practice, this is not usually the best solution.\nThe area can be approximated by wedges (not rectangles). For example, here we see that the area over a given angle is well approximated by the wedge for each of the sectors:\n\n\n\n\n\nAs well, see this part of a Wikipedia page for a figure.\nImagine we have \\(a < b\\) and a partition \\(a=t_0 < t_1 < \\cdots < t_n = b\\). Let \\(\\phi_i = (1/2)(t_{i-1} + t_{i})\\) be the midpoint. Then the wedge of radius \\(r(\\phi_i)\\) with angle between \\(t_{i-1}\\) and \\(t_i\\) will have area \\(\\pi r(\\phi_i)^2 (t_i-t_{i-1}) / (2\\pi) = (1/2) r(\\phi_i)^2(t_i-t_{i-1})\\), the ratio \\((t_i-t_{i-1}) / (2\\pi)\\) comparing the wedge's angle to the total angle of a circle. Summing the area of these wedges over the partition gives a Riemann sum approximation for the integral \\((1/2)\\int_a^b r(\\theta)^2 d\\theta\\). The limit of this sum defines the area in polar coordinates.\n\nArea of polar regions. Let \\(R\\) denote the region bounded by the curve \\(r(\\theta)\\) and by the rays \\(\\theta=a\\) and \\(\\theta=b\\) with \\(b-a \\leq 2\\pi\\), then the area of \\(R\\) is given by:\n\\(A = \\frac{1}{2}\\int_a^b r(\\theta)^2 d\\theta.\\)\n\nSo the area of the cardioid, which is parameterized over \\([0, 2\\pi]\\), is found by\n\nr(theta) = 2(1 + cos(theta))\n@syms theta\n(1//2) * integrate(r(theta)^2, (theta, 0, 2PI))\n\n \n\\[\n6 \\pi\n\\]\n\n\n\n\nExample\nThe folium has general formula \\(r(\\theta) = -b \\cos(\\theta) +4a\\cos(\\theta)\\sin(\\theta)^2\\). When \\(a=1\\) and \\(b=1\\) a leaf of the folium is traced out between \\(\\pi/6\\) and \\(\\pi/2\\). 
What is the area of that leaf?\nAn antiderivative exists for arbitrary \\(a\\) and \\(b\\):\n\n@syms 𝐚 𝐛 𝐭heta\n𝐫(theta) = -𝐛*cos(theta) + 4𝐚*cos(theta)*sin(theta)^2\nintegrate(𝐫(𝐭heta)^2, 𝐭heta) / 2\n\n \n\\[\n\\frac{𝐚^{2} 𝐭heta \\sin^{6}{\\left(𝐭heta \\right)}}{2} + \\frac{3 𝐚^{2} 𝐭heta \\sin^{4}{\\left(𝐭heta \\right)} \\cos^{2}{\\left(𝐭heta \\right)}}{2} + \\frac{3 𝐚^{2} 𝐭heta \\sin^{2}{\\left(𝐭heta \\right)} \\cos^{4}{\\left(𝐭heta \\right)}}{2} + \\frac{𝐚^{2} 𝐭heta \\cos^{6}{\\left(𝐭heta \\right)}}{2} + \\frac{𝐚^{2} \\sin^{5}{\\left(𝐭heta \\right)} \\cos{\\left(𝐭heta \\right)}}{2} - \\frac{4 𝐚^{2} \\sin^{3}{\\left(𝐭heta \\right)} \\cos^{3}{\\left(𝐭heta \\right)}}{3} - \\frac{𝐚^{2} \\sin{\\left(𝐭heta \\right)} \\cos^{5}{\\left(𝐭heta \\right)}}{2} - \\frac{𝐚 𝐛 𝐭heta \\sin^{4}{\\left(𝐭heta \\right)}}{2} - 𝐚 𝐛 𝐭heta \\sin^{2}{\\left(𝐭heta \\right)} \\cos^{2}{\\left(𝐭heta \\right)} - \\frac{𝐚 𝐛 𝐭heta \\cos^{4}{\\left(𝐭heta \\right)}}{2} - \\frac{𝐚 𝐛 \\sin^{3}{\\left(𝐭heta \\right)} \\cos{\\left(𝐭heta \\right)}}{2} + \\frac{𝐚 𝐛 \\sin{\\left(𝐭heta \\right)} \\cos^{3}{\\left(𝐭heta \\right)}}{2} + \\frac{𝐛^{2} 𝐭heta \\sin^{2}{\\left(𝐭heta \\right)}}{4} + \\frac{𝐛^{2} 𝐭heta \\cos^{2}{\\left(𝐭heta \\right)}}{4} + \\frac{𝐛^{2} \\sin{\\left(𝐭heta \\right)} \\cos{\\left(𝐭heta \\right)}}{4}\n\\]\n\n\n\nFor our specific values, the answer can be computed with:\n\nex = integrate(𝐫(𝐭heta)^2, (𝐭heta, PI/6, PI/2)) / 2\nex(𝐚 => 1, 𝐛=>1)\n\n \n\\[\n\\frac{\\pi}{12}\n\\]\n\n\n\n\nExample\nPascals limacon is like the cardioid, but contains an extra loop. When \\(a=1\\) and \\(b=1\\) we have this graph.\n\n\n\n\n\nWhat is the area contained in the outer loop, that is not in the inner loop?\nTo answer, we need to find out what range of values in \\([0, 2\\pi]\\) the inner and outer loops are traced. This will be when \\(r(\\theta) = 0\\), which for the choice of \\(a\\) and \\(b\\) solves \\(1 + 2\\cos(\\theta) = 0\\), or \\(\\cos(\\theta) = -1/2\\). 
This is \\(\\pi/2 + \\pi/6\\) and \\(3\\pi/2 - \\pi/6\\). The inner loop is traversed between those values and has area:\n\n@syms 𝖺 𝖻 𝗍heta\n𝗋(𝗍heta) = 𝖻 + 2𝖺*cos(𝗍heta)\n𝖾x = integrate(𝗋(𝗍heta)^2 / 2, (𝗍heta, PI/2 + PI/6, 3PI/2 - PI/6))\n𝗂nner = 𝖾x(𝖺=>1, 𝖻=>1)\n\n \n\\[\n\\pi - \\frac{3 \\sqrt{3}}{2}\n\\]\n\n\n\nThe outer area (including the inner loop) is the integral from \\(0\\) to \\(\\pi/2 + \\pi/6\\) plus that from \\(3\\pi/2 - \\pi/6\\) to \\(2\\pi\\). These areas are equal, so we double the first:\n\n𝖾x1 = 2 * integrate(𝗋(𝗍heta)^2 / 2, (𝗍heta, 0, PI/2 + PI/6))\n𝗈uter = 𝖾x1(𝖺=>1, 𝖻=>1)\n\n \n\\[\n\\frac{3 \\sqrt{3}}{2} + 2 \\pi\n\\]\n\n\n\nThe answer is the difference:\n\n𝗈uter - 𝗂nner\n\n \n\\[\n\\pi + 3 \\sqrt{3}\n\\]"
},
{
"objectID": "differentiable_vector_calculus/polar_coordinates.html#arc-length",
"href": "differentiable_vector_calculus/polar_coordinates.html#arc-length",
"title": "52  Polar Coordinates and Curves",
"section": "52.4 Arc length",
"text": "52.4 Arc length\nThe length of the arc traced by a polar graph can also be expressed using an integral. Again, we partition the interval \\([a,b]\\) and consider the wedge from \\((r(t_{i-1}), t_{i-1})\\) to \\((r(t_i), t_i)\\). The curve this wedge approximates will have its arc length approximated by the line segment connecting the points. Expressing the points in Cartesian coordinates and simplifying gives the distance squared as:\n\\[\n\\begin{align}\nd_i^2 &= (r(t_i) \\cos(t_i) - r(t_{i-1})\\cos(t_{i-1}))^2 + (r(t_i) \\sin(t_i) - r(t_{i-1})\\sin(t_{i-1}))^2\\\\\n&= r(t_i)^2 - 2r(t_i)r(t_{i-1}) \\cos(t_i - t_{i-1}) + r(t_{i-1})^2 \\\\\n&\\approx r(t_i)^2 - 2r(t_i)r(t_{i-1}) (1 - \\frac{(t_i - t_{i-1})^2}{2})+ r(t_{i-1})^2 \\quad(\\text{as} \\cos(x) \\approx 1 - x^2/2)\\\\\n&= (r(t_i) - r(t_{i-1}))^2 + r(t_i)r(t_{i-1}) (t_i - t_{i-1})^2.\n\\end{align}\n\\]\nAs was done with arc length we multiply \\(d_i\\) by \\((t_i - t_{i-1})/(t_i - t_{i-1})\\) and move the bottom factor under the square root:\n\\[\n\\begin{align}\nd_i\n&= d_i \\frac{t_i - t_{i-1}}{t_i - t_{i-1}} \\\\\n&\\approx \\sqrt{\\frac{(r(t_i) - r(t_{i-1}))^2}{(t_i - t_{i-1})^2} +\n\\frac{r(t_i)r(t_{i-1}) (t_i - t_{i-1})^2}{(t_i - t_{i-1})^2}} \\cdot (t_i - t_{i-1})\\\\\n&= \\sqrt{(r'(\\xi_i))^2 + r(t_i)r(t_{i-1})} \\cdot (t_i - t_{i-1}).\\quad(\\text{the mean value theorem})\n\\end{align}\n\\]\nAdding the approximations to the \\(d_i\\) looks like a Riemann sum approximation to the integral \\(\\int_a^b \\sqrt{(r'(\\theta)^2) + r(\\theta)^2} d\\theta\\) (with the extension to the Riemann sum formula needed to derive the arc length for a parameterized curve). That is:\n\nArc length of a polar curve. The arc length of the curve described in polar coordinates by \\(r(\\theta)\\) for \\(a \\leq \\theta \\leq b\\) is given by:\n\\(\\int_a^b \\sqrt{r'(\\theta)^2 + r(\\theta)^2} d\\theta.\\)\n\nWe test this out on a circle with \\(r(\\theta) = R\\), a constant. 
The integrand simplifies to just \\(\\sqrt{R^2}\\) and the integral is from \\(0\\) to \\(2\\pi\\), so the arc length is \\(2\\pi R\\), precisely the formula for the circumference.\n\nExample\nA cardioid is described by \\(r(\\theta) = 2(1 + \\cos(\\theta))\\). What is the arc length from \\(0\\) to \\(2\\pi\\)?\nThe integrand is integrable with antiderivative \\(4\\sqrt{2\\cos(\\theta) + 2} \\cdot \\tan(\\theta/2)\\), but SymPy isnt able to find the integral. Instead we give a numeric answer:\n\nr(theta) = 2*(1 + cos(theta))\nquadgk(t -> sqrt(r'(t)^2 + r(t)^2), 0, 2pi)[1]\n\n16.0\n\n\n\n\nExample\nThe equiangular spiral has polar representation\n\\[\nr(\\theta) = a e^{\\theta \\cot(b)}\n\\]\nWith \\(a=1\\) and \\(b=\\pi/4\\), find the arc length traced out from \\(\\theta=0\\) to \\(\\theta=1\\).\n\na, b = 1, PI/4\n@syms θ\nr(theta) = a * exp(theta * cot(b))\nds = sqrt(diff(r(θ), θ)^2 + r(θ)^2)\nintegrate(ds, (θ, 0, 1))\n\n \n\\[\n- \\sqrt{2} + \\sqrt{2} e\n\\]\n\n\n\n\n\nExample\nAn Archimedean spiral is defined in polar form by\n\\[\nr(\\theta) = a + b \\theta\n\\]\nThat is, the radius increases linearly. The crossings of the positive \\(x\\) axis occur at \\(a + b n 2\\pi\\), so are evenly spaced out by \\(2\\pi b\\). These could be a model for such things as coils of materials of uniform thickness.\nFor example, a roll of toilet paper promises \\(1000\\) sheets with the smaller \\(4.1 \\times 3.7\\) inch size. This \\(3700\\) inch long connected sheet of paper is wrapped around a paper tube in an Archimedean spiral with \\(r(\\theta) = d_{\\text{inner}}/2 + b\\theta\\). The entire roll must fit in a standard dimension, so the outer diameter will be \\(d_{\\text{outer}} = 5~1/4\\) inches. Can we figure out \\(b\\)?\nLet \\(n\\) be the number of windings and assume the starting and ending points are on the positive \\(x\\) axis, \\(r(2\\pi n) = d_{\\text{outer}}/2 = d_{\\text{inner}}/2 + b (2\\pi n)\\). 
Solving for \\(n\\) in terms of \\(b\\), we get: \\(n = ( d_{\\text{outer}} - d_{\\text{inner}})/2 / (2\\pi b)\\). With this, the following must hold as the total arc length is \\(3700\\) inches.\n\\[\n\\int_0^{n\\cdot 2\\pi} \\sqrt{r(\\theta)^2 + r'(\\theta)^2} d\\theta = 3700\n\\]\nNumerically then we have:\n\ndinner = 1 + 5/8\ndouter = 5 + 1/4\nr(b,t) = dinner/2 + b*t\nrp(b,t) = b\nintegrand(b,t) = sqrt((r(b,t))^2 + rp(b,t)^2) # sqrt(r^2 + r'^2)\nn(b) = (douter - dinner)/2/(2*pi*b)\nb = find_zero(b -> quadgk(t->integrand(b,t), 0, n(b)*2*pi)[1] - 3700, (1/100000, 1/100))\nb, b*25.4\n\n(0.0008419553488281331, 0.02138566586023458)\n\n\nThe first value is b in inches; the second is the same value converted to millimeters."
},
{
"objectID": "differentiable_vector_calculus/polar_coordinates.html#questions",
"href": "differentiable_vector_calculus/polar_coordinates.html#questions",
"title": "52  Polar Coordinates and Curves",
"section": "52.5 Questions",
"text": "52.5 Questions\n\nQuestion\nLet \\(r=3\\) and \\(\\theta=\\pi/8\\). In Cartesian coordinates what is \\(x\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is \\(y\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA point in Cartesian coordinates is given by \\((-12, -5)\\). In has a polar coordinate representation with an angle \\(\\theta\\) in \\([0,2\\pi]\\) and \\(r > 0\\). What is \\(r\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is \\(\\theta\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nDoes \\(r(\\theta) = a \\sec(\\theta - \\gamma)\\) describe a line for \\(0\\) when \\(a=3\\) and \\(\\gamma=\\pi/4\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIf yes, what is the \\(y\\) intercept\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is slope of the line?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nDoes this seem likely: the slope is \\(-1/\\tan(\\gamma)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe polar curve \\(r(\\theta) = 2\\cos(\\theta)\\) has tangent lines at most points. This differential representation of the chain rule\n\\[\n\\frac{dy}{dx} = \\frac{dy}{d\\theta} / \\frac{dx}{d\\theta},\n\\]\nallows the slope to be computed when \\(y\\) and \\(x\\) are the Cartesian form of the polar curve. 
For this curve, we have\n\\[\n\\frac{dy}{d\\theta} = \\frac{d}{d\\theta}(2\\cos(\\theta) \\cdot \\sin(\\theta)),~ \\text{ and }\n\\frac{dx}{d\\theta} = \\frac{d}{d\\theta}(2\\cos(\\theta) \\cdot \\cos(\\theta)).\n\\]\nNumerically, what is the slope of the tangent line when \\(\\theta = \\pi/4\\)?\n\nQuestion\nFor different values \\(k > 0\\) and \\(e > 0\\) the polar equation\n\\[\nr(\\theta) = \\frac{ke}{1 + e\\cos(\\theta)}\n\\]\nhas a familiar form. The value of \\(k\\) is just a scale factor, but different values of \\(e\\) yield different shapes.\nWhen \\(0 < e < 1\\) what is the shape of the curve? (Answer by making a plot and guessing; the choices are an ellipse, a parabola, a hyperbola, a circle, or a line.)\nWhen \\(e = 1\\) what is the shape of the curve?\nWhen \\(1 < e\\) what is the shape of the curve?\n\nQuestion\nFind the area of a lobe of the lemniscate curve traced out by \\(r(\\theta) = \\sqrt{\\cos(2\\theta)}\\) between \\(-\\pi/4\\) and \\(\\pi/4\\). What is the answer? (The choices are \\(1/2\\), \\(\\pi/2\\), and \\(1\\).)\n\nQuestion\nFind the area of a lobe of the eight curve traced out by \\(r(\\theta) = \\cos(2\\theta)\\sec(\\theta)^4\\) from \\(-\\pi/4\\) to \\(\\pi/4\\). 
Do this numerically.\n\nQuestion\nFind the arc length of a lobe of the lemniscate curve traced out by \\(r(\\theta) = \\sqrt{\\cos(2\\theta)}\\) between \\(-\\pi/4\\) and \\(\\pi/4\\). What is the answer (numerically)?\n\nQuestion\nFind the arc length of a lobe of the eight curve traced out by \\(r(\\theta) = \\cos(2\\theta)\\sec(\\theta)^4\\) from \\(-\\pi/4\\) to \\(\\pi/4\\). Do this numerically."
},
{
"objectID": "differentiable_vector_calculus/vectors.html",
"href": "differentiable_vector_calculus/vectors.html",
"title": "53  Vectors and matrices",
"section": "",
"text": "This section uses these add-on package:\nIn vectors we introduced the concept of a vector. For Julia, vectors are a useful storage container and are used to hold, for example, zeros of functions or the coefficients of a polynomial. This section is about their mathematical properties. A vector mathematically is a geometric object with two attributes a magnitude and a direction. (The direction is undefined in the case the magnitude is \\(0\\).) Vectors are typically visualized with an arrow, where the anchoring of the arrow is context dependent and is not particular to a given vector.\nVectors and points are related, but distinct. They are identified when the tail of the vector is taken to be the origin. Lets focus on \\(3\\) dimensions. Mathematically, the notation for a point is \\(p=(x,y,z)\\) while the notation for a vector is \\(\\vec{v} = \\langle x, y, z \\rangle\\). The \\(i\\)th component in a vector is referenced by a subscript: \\(v_i\\). With this, we may write a typical vector as \\(\\vec{v} = \\langle v_1, v_2, \\dots, v_n \\rangle\\) and a vector in \\(n=3\\) as \\(\\vec{v} =\\langle v_1, v_2, v_3 \\rangle\\). The different grouping notation distinguishes the two objects. As another example, the notation \\(\\{x, y, z\\}\\) indicates a set. Vectors and points may be identified by anchoring the vector at the origin. Sets are quite different from both, as the order of their entries is not unique.\nIn Julia, the notation to define a point and a vector would be identical, using square brackets to group like-type values: [x, y, z]. 
The notation (x,y,z) would form a tuple, which, though similar in many respects, is different, as tuples do not have the operations associated with a point or a vector defined for them.\nThe square bracket constructor has some subtleties:\n(A vector, mathematically, is a one-dimensional collection of numbers, a matrix a two-dimensional rectangular collection of numbers, and an array an \\(n\\)-dimensional rectangular-like collection of numbers. In Julia, a vector can hold a collection of objects of arbitrary type, though each will be promoted to a common type.)"
},
{
"objectID": "differentiable_vector_calculus/vectors.html#vector-addition-scalar-multiplication",
"href": "differentiable_vector_calculus/vectors.html#vector-addition-scalar-multiplication",
"title": "53  Vectors and matrices",
"section": "53.1 Vector addition, scalar multiplication",
"text": "53.1 Vector addition, scalar multiplication\nAs seen earlier, vectors have some arithmetic operations defined for them. As a typical use of vectors, mathematically, is to collect the \\(x\\), \\(y\\), and \\(z\\) (in \\(3\\)D) components together, operations like addition and subtraction operate component wise. With this, addition can be visualized geometrically: put the tail of \\(\\vec{v}\\) at the tip of \\(\\vec{u}\\) and draw a vector from the tail of \\(\\vec{u}\\) to the tip of \\(\\vec{v}\\) and you have \\(\\vec{u}+\\vec{v}\\). This is identical by \\(\\vec{v} + \\vec{u}\\) as vector addition is commutative. Unless \\(\\vec{u}\\) and \\(\\vec{v}\\) are parallel or one has \\(0\\) length, the addition will create a vector with a different direction from the two.\nAnother operation for vectors is scalar multiplication. Geometrically this changes the magnitude, but not the direction of a vector, when the scalar is positive. Scalar multiplication is defined component wise, like addition so the \\(i\\)th component of \\(c \\vec{v}\\) is \\(c\\) times the \\(i\\)th component of \\(\\vec{v}\\). When the scalar is negative, the direction is “reversed.”\nTo illustrate we define two \\(3\\)-dimensional vectors:\n\nu, v = [1, 2, 3], [4, 3, 2]\n\n([1, 2, 3], [4, 3, 2])\n\n\nThe sum is component-wise summation (1+4, 2+3, 3+2):\n\nu + v\n\n3-element Vector{Int64}:\n 5\n 5\n 5\n\n\nFor addition, as the components must pair off, the two vectors being added must be the same dimension.\nScalar multiplication by 2, say, multiplies each entry by 2:\n\n2 * u\n\n3-element Vector{Int64}:\n 2\n 4\n 6"
},
{
"objectID": "differentiable_vector_calculus/vectors.html#the-length-and-direction-of-a-vector",
"href": "differentiable_vector_calculus/vectors.html#the-length-and-direction-of-a-vector",
"title": "53  Vectors and matrices",
"section": "53.2 The length and direction of a vector",
"text": "53.2 The length and direction of a vector\nIf a vector \\(\\vec{v} = \\langle v_1, v_2, \\dots, v_n\\rangle\\) then the norm (also Euclidean norm or length) of \\(\\vec{v}\\) is defined by:\n\\[\n\\| \\vec{v} \\| = \\sqrt{ v_1^2 + v_2^2 + \\cdots + v_n^2}.\n\\]\nThe definition of a norm leads to a few properties. First, if \\(c\\) is a scalar, \\(\\| c\\vec{v} \\| = |c| \\| \\vec{v} \\|\\) - which says scalar multiplication by \\(c\\) changes the length by \\(|c|\\). (Sometimes, scalar multiplication is described as “scaling by….”) The other property is an analog of the triangle inequality, in which for any two vectors \\(\\| \\vec{v} + \\vec{w} \\| \\leq \\| \\vec{v} \\| + \\| \\vec{w} \\|\\). The right hand side is equal only when the two vectors are parallel.\nA vector with length \\(1\\) is called a unit vector. Dividing a non-zero vector by its norm will yield a unit vector, a consequence of the first property above. Unit vectors are often written with a “hat:” \\(\\hat{v}\\).\nThe direction indicated by \\(\\vec{v}\\) can be visualized as an angle in \\(2\\)- or \\(3\\)-dimensions, but in higher dimensions, visualization is harder. For \\(2\\)-dimensions, we might associate with a vector, its unit vector. This in turn may be identified with a point on the unit circle, which from basic trigonometry can be associated with an angle. Something similar, can be done in \\(3\\) dimensions, using two angles. However, the “direction” of a vector is best thought of in terms of its associated unit vector. With this, we have a decomposition of a non-zero vector \\(\\vec{v}\\) into a magnitude and a direction when we write \\(\\vec{v} = \\|\\vec{v}\\| \\cdot (\\vec{v} / \\|\\vec{v}\\|)=\\|\\vec{v}\\| \\hat{v}\\)."
},
{
"objectID": "differentiable_vector_calculus/vectors.html#visualization-of-vectors",
"href": "differentiable_vector_calculus/vectors.html#visualization-of-vectors",
"title": "53  Vectors and matrices",
"section": "53.3 Visualization of vectors",
"text": "53.3 Visualization of vectors\nVectors may be visualized in \\(2\\) or \\(3\\) dimensions using Plots. In \\(2\\) dimensions, the quiver function may be used. To graph a vector, it must have its tail placed at a point, so two values are needed.\nTo plot u=[1,2] from p=[0,0] we have the following usage:\n\nquiver([0],[0], quiver=([1],[2]))\n\n\n\n\nThe cumbersome syntax is typical here. We naturally describe vectors and points using [a,b,c] to combine them, but the plotting functions want to plot many such at a time and expect vectors containing just the x values, just the y values, etc. The above usage looks a bit odd, as these vectors of x and y values have only one entry. Converting from the one representation to the other requires reshaping the data. We will use the unzip function from CalculusWithJulia which in turn just uses the the invert function of the SplitApplyCombine package (“return a new nested container by reversing the order of the nested container”) for the bulk of its work.\nThis function takes a vector of vectors, and returns a vector containing the x values, the y values, etc. So if u=[1,2,3] and v=[4,5,6], then unzip([u,v]) becomes [[1,4],[2,5],[3,6]], etc. (The zip function in base does essentially the reverse operation, hence the name.) Notationally, A = [u,v] can have the third element of the first vector (u) accessed by A[1][3], where as unzip(A)[3][1] will do the same. We use unzip([u]) in the following, which for this u returns ([1],[2],[3]). 
(Note the [u] to make a vector of a vector.)\nWith unzip defined, we can plot a \\(2\\)-dimensional vector v anchored at point p through quiver(unzip([p])..., quiver=unzip([v])).\nTo illustrate, the following defines \\(3\\) vectors (the third through addition), then graphs all three, though in different starting points to emphasize the geometric interpretation of vector addition.\n\nu = [1, 2]\nv = [4, 2]\nw = u + v\np = [0,0]\nquiver(unzip([p])..., quiver=unzip([u]))\nquiver!(unzip([u])..., quiver=unzip([v]))\nquiver!(unzip([p])..., quiver=unzip([w]))\n\n\n\n\nPlotting a \\(3\\)-d vector is not supported in all toolkits with quiver. A line segment may be substituted and can be produced with plot(unzip([p,p+v])...). To avoid all these details, the CalculusWithJulia provides the arrow! function to add a vector to an existing plot. The function requires a point, p, and the vector, v:\nWith this, the above simplifies to:\n\nu = [1, 2]\nv = [4, 2]\nw = u + v\np = [0,0]\nplot(legend=false)\narrow!(p, u)\narrow!(u, v)\narrow!(p, w)\n\n\n\n\nThe distinction between a point and a vector within Julia is only mental. We use the same storage type. Mathematically, we can identify a point and a vector, by considering the vector with its tail placed at the origin. In this case, the tip of the arrow is located at the point. But this is only an identification, though a useful one. It allows us to “add” a point and a vector (e.g., writing \\(P + \\vec{v}\\)) by imagining the point as a vector anchored at the origin.\nTo see that a unit vector has the same “direction” as the vector, we might draw them with different widths:\n\nv = [2, 3]\nu = v / norm(v)\np = [0, 0]\nplot(legend=false)\narrow!(p, v)\narrow!(p, u, linewidth=5)\n\n\n\n\nThe norm function is in the standard library, LinearAlgebra, which must be loaded first through the command using LinearAlgebra. (Though here it is redundant, as that package is loaded and reexported when the CalculusWithJulia package is loaded.)"
},
{
"objectID": "differentiable_vector_calculus/vectors.html#aside-review-of-julias-use-of-dots-to-work-with-containers",
"href": "differentiable_vector_calculus/vectors.html#aside-review-of-julias-use-of-dots-to-work-with-containers",
"title": "53  Vectors and matrices",
"section": "53.4 Aside: review of Julias use of dots to work with containers",
"text": "53.4 Aside: review of Julias use of dots to work with containers\nJulia makes use of the dot, “.”, in a few ways to simplify usage when containers, such as vectors, are involved:\n\nSplatting. The use of three dots, “...”, to “splat” the values from a container like a vector (or tuple) into arguments of a function can be very convenient. It was used above in the definition for the arrow! function: essentially quiver!(unzip([p])..., quiver=unzip([v])). The quiver function expects \\(2\\) (or \\(3\\)) arguments describing the xs and ys (and sometimes zs). The unzip function returns these in a container, so splatting is used to turn the values in the container into distinct arguments of the function. Whereas the quiver argument expects a tuple of vectors, so no splatting is used for that part of the definition. Another use of splatting we will see is with functions of vectors. These can be defined in terms of the vectors components or the vector as a whole, as below:\n\n\nf(x, y, z) = x^2 + y^2 + z^2\nf(v) = v[1]^2 + v[2]^2 + v[3]^2\n\nf (generic function with 2 methods)\n\n\nThe first uses the components and is arguably, much easier to read. The second uses indexing in the function body to access the components. It has an advantage, as it can more easily handle different length vectors (e.g. using sum(v.^2)). 
Both uses have their merits, though the latter is more idiomatic throughout Julia.\nIf a function is easier to write in terms of its components, but an interface expects a vector of components as its argument, then splatting can be useful, to go from one style to another, similar to this:\n\ng(x, y, z) = x^2 + y^2 + z^2\ng(v) = g(v...)\n\ng (generic function with 2 methods)\n\n\nThe splatting will mean g(v) eventually calls g(x, y, z) through Julias multiple dispatch machinery when v = [x, y, z].\n(The three dots can also appear in the definition of the arguments to a function, but there the usage is not splatting but rather a specification of a variable number of arguments.)\n\nBroadcasting. For a univariate function, f, and vector, xs, the call f.(xs) broadcasts f over each value of xs and returns a container holding all the values. This is a compact alternative to a comprehension when a function is defined. When f depends on more than one value, broadcasting can still be used: f.(xs, ys) will broadcast f over values formed from both xs and ys. Broadcasting has the extra feature (over map) of attempting to match up the shapes of xs and ys when they are not identical. (See the help page for broadcast for more details.)\n\nFor example, if xs is a vector and ys a scalar, then the value in ys is repeated many times to match up with the values of xs. Or if xs and ys have different dimensions, the values of one will be repeated. Consider this:\n\n𝐟(x,y) = x + y\n\n𝐟 (generic function with 1 method)\n\n\nxs = ys = [0, 1]\n𝐟.(xs, ys)\n\n2-element Vector{Int64}:\n 0\n 2\n\n\nThis matches xs and ys to pass (0,0) and then (1,1) to 𝐟, returning 0 and 2. Now consider\n\nxs = [0, 1]; ys = [0 1] # xs is a column vector, ys a row vector\n𝐟.(xs, ys)\n\n2×2 Matrix{Int64}:\n 0 1\n 1 2\n\n\nThe two dimensions are different, so for each value of xs the vector of ys is broadcast. This returns a matrix now. 
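This shape matching is how a grid of function values over ranges of \(x\) and \(y\) can be formed with a single broadcast; a small sketch (the function h and the values are ours):

```julia
h(x, y) = x + y          # any two-argument function works the same way
xs = [0, 1, 2]           # a (column) vector
ys = [0 1 2]             # a 1×3 row matrix
h.(xs, ys)               # 3×3 grid: entry (i, j) is h(xs[i], ys[j])
```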
This will be important for some plotting usages where a grid (matrix) of values is needed.\nAt times, using the “apply” notation x |> f in place of f(x) is useful, as it moves the wrapping function to the right of the expression. To broadcast, .|> is available."
},
{
"objectID": "differentiable_vector_calculus/vectors.html#the-dot-product",
"href": "differentiable_vector_calculus/vectors.html#the-dot-product",
"title": "53  Vectors and matrices",
"section": "53.5 The dot product",
"text": "53.5 The dot product\nThere is no concept of multiplying two vectors, or for that matter dividing two vectors. However, there are two operations between vectors that are somewhat similar to multiplication, these being the dot product and the cross product. Each has an algebraic definition, but their geometric properties are what motivate their usage. We begin by discussing the dot product.\nThe dot product between two vectors can be viewed algebraically in terms of the following product. If \\(\\vec{v} = \\langle v_1, v_2, \\dots, v_n\\rangle\\) and \\(\\vec{w} = \\langle w_1, w_2, \\dots, w_n\\rangle\\), then the dot product of \\(\\vec{v}\\) and \\(\\vec{w}\\) is defined by:\n\\[\n\\vec{v} \\cdot \\vec{w} = v_1 w_1 + v_2 w_2 + \\cdots + v_n w_n.\n\\]\nFrom this, we can see the relationship between the norm, or Euclidean length of a vector: \\(\\vec{v} \\cdot \\vec{v} = \\| \\vec{v} \\|^2\\). We can also see that the dot product is commutative, that is \\(\\vec{v} \\cdot \\vec{w} = \\vec{w} \\cdot \\vec{v}\\).\nThe dot product has an important geometrical interpolation. Two (non-parallel) vectors will lie in the same “plane”, even in higher dimensions. Within this plane, there will be an angle between them within \\([0, \\pi]\\). Call this angle \\(\\theta\\). (This means the angle between the two vectors is the same regardless of their order of consideration.) Then\n\\[\n\\vec{v} \\cdot \\vec{w} = \\|\\vec{v}\\| \\|\\vec{w}\\| \\cos(\\theta).\n\\]\nIf we denoted \\(\\hat{v} = \\vec{v} / \\| \\vec{v} \\|\\), the unit vector in the direction of \\(\\vec{v}\\), then by dividing, we see that \\(\\cos(\\theta) = \\hat{v} \\cdot \\hat{w}\\). That is the angle does not depend on the magnitude of the vectors involved.\nThe dot product is computed in Julia by the dot function, which is in the LinearAlgebra package of the standard library. 
This must be loaded (as above) before its use either directly or through the CalculusWithJulia package:\n\n𝒖 = [1, 2]\n𝒗 = [2, 1]\ndot(𝒖, 𝒗)\n\n4\n\n\n\n\n\n\n\nNote\n\n\n\nIn Julia, the unicode operator entered by \\cdot[tab] can also be used to mirror the math notation:\n\n\n\n𝒖 ⋅ 𝒗 # u \\cdot[tab] v\n\n4\n\n\nContinuing, to find the angle between \\(\\vec{u}\\) and \\(\\vec{v}\\), we might do this:\n\n𝒄theta = dot(𝒖/norm(𝒖), 𝒗/norm(𝒗))\nacos(𝒄theta)\n\n0.6435011087932845\n\n\nThe cosine of \\(\\pi/2\\) is \\(0\\), so two vectors which are at right angles to each other will have a dot product of \\(0\\):\n\nu = [1, 2]\nv = [2, -1]\nu ⋅ v\n\n0\n\n\nIn two dimensions, we learn that a perpendicular line to a line with slope \\(m\\) will have slope \\(-1/m\\). From a \\(2\\)-dimensional vector, say \\(\\vec{u} = \\langle u_1, u_2 \\rangle\\), the slope is \\(u_2/u_1\\) so a perpendicular vector to \\(\\vec{u}\\) will be \\(\\langle u_2, -u_1 \\rangle\\), as above. For higher dimensions, where the angle is harder to visualize, the dot product defines perpendicularness, or orthogonality.\nFor example, these two vectors are orthogonal, as their dot product is \\(0\\), even though we cant readily visualize them:\n\nu = [1, 2, 3, 4, 5]\nv = [-30, 4, 3, 2, 1]\nu ⋅ v\n\n0\n\n\n\nProjection\nFrom right triangle trigonometry, we learn that \\(\\cos(\\theta) = \\text{adjacent}/\\text{hypotenuse}\\). If we use a vector, \\(\\vec{h}\\) for the hypotenuse, and \\(\\vec{a} = \\langle 1, 0 \\rangle\\), we have this picture:\n\nh = [2, 3]\na = [1, 0] # unit vector\nh_hat = h / norm(h)\ntheta = acos(h_hat ⋅ a)\n\nplot(legend=false)\narrow!([0,0], h)\narrow!([0,0], norm(h) * cos(theta) * a)\narrow!([0,0], a, linewidth=3)\n\n\n\n\nWe used vectors to find the angle made by h, and from there, as the length of the hypotenuse is norm(h), we can identify the length of the adjacent side, it being the length of the hypotenuse times the cosine of \\(\\theta\\). 
Geometrically, we call the vector norm(h) * cos(theta) * a the projection of \\(\\vec{h}\\) onto \\(\\vec{a}\\), the word coming from the shadow \\(\\vec{h}\\) would cast on the direction of \\(\\vec{a}\\) were there light coming perpendicular to \\(\\vec{a}\\).\nThe projection can be made for any pair of vectors, and in any dimension \\(n > 1\\). The projection of \\(\\vec{u}\\) on \\(\\vec{v}\\) would be a vector of length \\(\\|\\vec{u}\\|\\) (the hypotenuse) times the cosine of the angle, in the direction of \\(\\vec{v}\\). In dot-product notation:\n\\[\nproj_{\\vec{v}}(\\vec{u}) = \\| \\vec{u} \\| \\frac{\\vec{u}\\cdot\\vec{v}}{\\|\\vec{u}\\|\\|\\vec{v}\\|} \\frac{\\vec{v}}{\\|\\vec{v}\\|}.\n\\]\nThis can simplify. After cancelling, and expressing norms in terms of dot products, we have:\n\\[\nproj_{\\vec{v}}(\\vec{u}) = \\frac{\\vec{u} \\cdot \\vec{v}}{\\vec{v} \\cdot \\vec{v}} \\vec{v} = (\\vec{u} \\cdot \\hat{v}) \\hat{v},\n\\]\nwhere \\(\\hat{v}\\) is the unit vector in the direction of \\(\\vec{v}\\).\n\nExample\nA pendulum, a bob on a string, swings back and forth due to the force of gravity. When the bob is displaced from rest by an angle \\(\\theta\\), then the tension force of the string on the bob is directed along the string and has magnitude given by the projection of the force due to gravity.\nA force diagram is a useful visualization device of physics to illustrate the applied forces involved in a scenario. In this case the bob has two forces acting on it: a force due to tension in the string of unknown magnitude, but in the direction of the string; and a force due to gravity. 
The latter is in the downward direction and has magnitude \\(mg\\), \\(g=9.8m/sec^2\\) being the acceleration due to gravity.\n\n𝗍heta = pi/12\n𝗆ass, 𝗀ravity = 1/9.8, 9.8\n\n𝗅 = [-sin(𝗍heta), cos(𝗍heta)]\n𝗉 = -𝗅\n𝖥g = [0, -𝗆ass * 𝗀ravity]\nplot(legend=false)\narrow!(𝗉, 𝗅)\narrow!(𝗉, 𝖥g)\nscatter!(𝗉[1:1], 𝗉[2:2], markersize=5)\n\n\n\n\nThe magnitude of the tension force is exactly that of the force of gravity projected onto \\(\\vec{l}\\), as the bob is not accelerating in that direction. The component of the gravity force in the perpendicular direction is the part of the gravitational force that causes acceleration in the pendulum. Here we find the projection onto \\(\\vec{l}\\) and visualize the two components of the gravitational force.\n\nplot(legend=false, aspect_ratio=:equal)\narrow!(𝗉, 𝗅)\narrow!(𝗉, 𝖥g)\nscatter!(𝗉[1:1], 𝗉[2:2], markersize=5)\n\n𝗉roj = (𝖥g ⋅ 𝗅) / (𝗅 ⋅ 𝗅) * 𝗅 # force of gravity in direction of tension\n𝗉orth = 𝖥g - 𝗉roj # force of gravity perpendicular to tension\n\narrow!(𝗉, 𝗉roj)\narrow!(𝗉, 𝗉orth, linewidth=3)\n\n\n\n\n\n\nExample\nStarting with three vectors, we can create three orthogonal vectors using projection and subtraction. The creation of porth above is the pattern we will exploit.\nLets begin with three vectors in \\(R^3\\):\n\nu = [1, 2, 3]\nv = [1, 1, 2]\nw = [1, 2, 4]\n\n3-element Vector{Int64}:\n 1\n 2\n 4\n\n\nWe can find a vector from v orthogonal to u using:\n\nunit_vec(u) = u / norm(u)\nprojection(u, v) = (u ⋅ unit_vec(v)) * unit_vec(v)\n\nvₚ = v - projection(v, u)\nwₚ = w - projection(w, u) - projection(w, vₚ)\n\n3-element Vector{Float64}:\n -0.33333333333333265\n -0.3333333333333336\n 0.33333333333333354\n\n\nWe can verify the orthogonality through:\n\nu ⋅ vₚ, u ⋅ wₚ, vₚ ⋅ wₚ\n\n(-3.3306690738754696e-16, 8.881784197001252e-16, 3.677613769070831e-16)\n\n\nThis only works when the three vectors do not all lie in the same plane. 
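The project-and-subtract pattern above extends to any number of starting vectors; a minimal sketch (the helper name gram_schmidt is ours, not from a package):

```julia
using LinearAlgebra

# Orthogonalize the vectors in order: subtract from each one its projection
# onto every previously produced vector (no normalization is done here).
function gram_schmidt(vs)
    us = Vector{Float64}[]
    for v in vs
        u = float.(v)
        for w in us
            u = u - (u ⋅ w) / (w ⋅ w) * w
        end
        push!(us, u)
    end
    us
end

us = gram_schmidt([[1, 2, 3], [1, 1, 2], [1, 2, 4]])
us[1] ⋅ us[2], us[1] ⋅ us[3], us[2] ⋅ us[3]   # each is 0, up to floating point error
```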
In general, this is the beginning of the Gram-Schmidt process for creating orthogonal vectors from a collection of vectors.\n\n\nAlgebraic properties\nThe dot product is similar to multiplication, but different, as it is an operation defined between vectors of the same dimension. However, many algebraic properties carry over:\n\ncommutative: \\(\\vec{u} \\cdot \\vec{v} = \\vec{v} \\cdot \\vec{u}\\)\nscalar multiplication: \\((c\\vec{u})\\cdot\\vec{v} = c(\\vec{u}\\cdot\\vec{v})\\).\ndistributive: \\(\\vec{u} \\cdot (\\vec{v} + \\vec{w}) = \\vec{u} \\cdot \\vec{v} + \\vec{u} \\cdot \\vec{w}\\)\n\nThe last two can be combined: \\(\\vec{u}\\cdot(s \\vec{v} + t \\vec{w}) = s(\\vec{u}\\cdot\\vec{v}) + t (\\vec{u}\\cdot\\vec{w})\\).\nBut the associative property does not apply to the dot product: in \\((\\vec{u} \\cdot \\vec{v}) \\cdot \\vec{w}\\) the result of the first dot product is a scalar, not a vector, so it cannot be dotted with \\(\\vec{w}\\)."
},
{
"objectID": "differentiable_vector_calculus/vectors.html#matrices",
"href": "differentiable_vector_calculus/vectors.html#matrices",
"title": "53  Vectors and matrices",
"section": "53.6 Matrices",
"text": "53.6 Matrices\nAlgebraically, the dot product of two vectors - pair off by components, multiply these, then add - is a common operation. Take, for example, the general equation of a line, or a plane:\n\\[\nax + by = c, \\quad ax + by + cz = d.\n\\]\nThe left hand sides are in the form of a dot product, in this case \\(\\langle a,b \\rangle \\cdot \\langle x, y\\rangle\\) and \\(\\langle a,b,c \\rangle \\cdot \\langle x, y, z\\rangle\\) respectively. When there is a system of equations, something like:\n\\[\n\\begin{array}{}\n3x &+& 4y &- &5z &= 10\\\\\n5x &-& 5y &+ &7z &= 11\\\\\n-3x &+& 6y &+ &9z &= 12,\n\\end{array}\n\\]\nthen we might think of \\(3\\) vectors \\(\\langle 3,4,-5\\rangle\\), \\(\\langle 5,-5,7\\rangle\\), and \\(\\langle -3,6,9\\rangle\\) being dotted with \\(\\langle x,y,z\\rangle\\). Mathematically, matrices and their associated algebra are used to represent this. In this example, the system of equations above would be represented by a matrix and two vectors:\n\\[\nM = \\left[\n\\begin{array}{}\n3 & 4 & -5\\\\\n5 &-5 & 7\\\\\n-3& 6 & 9\n\\end{array}\n\\right],\\quad\n\\vec{x} = \\langle x, y , z\\rangle,\\quad\n\\vec{b} = \\langle 10, 11, 12\\rangle,\n\\]\nand the expression \\(M\\vec{x} = \\vec{b}\\). The matrix \\(M\\) is a rectangular collection of numbers or expressions arranged in rows and columns with certain algebraic definitions. There are \\(m\\) rows and \\(n\\) columns in an \\(m\\times n\\) matrix. In this example \\(m=n=3\\), and in such a case the matrix is called square. A vector, like \\(\\vec{x}\\), is usually identified with the \\(n \\times 1\\) matrix (a column vector). Were that done, the system of equations would be written \\(Mx=b\\).\nIf we refer to a matrix \\(M\\) by its components, a convention is to use \\((M)_{ij}\\) or \\(m_{ij}\\) to denote the entry in the \\(i\\)th row and \\(j\\)th column. 
Following Julias syntax, we would use \\(m_{i:}\\) to refer to all entries in the \\(i\\)th row, and \\(m_{:j}\\) to denote all entries in the \\(j\\)th column.\nIn addition to square matrices, there are some other common types of matrices worth naming: square matrices with \\(0\\) entries below the diagonal are called upper triangular; square matrices with \\(0\\) entries above the diagonal are called lower triangular matrices; square matrices which are \\(0\\) except possibly along the diagonal are diagonal matrices; and a diagonal matrix whose diagonal entries are all \\(1\\) is called an identity matrix.\nMatrices, like vectors, have scalar multiplication defined for them. Scalar multiplication of a matrix \\(M\\) by \\(c\\) just multiplies each entry by \\(c\\), so the new matrix would have components defined by \\(cm_{ij}\\).\nMatrices of the same size, like vectors, have addition defined for them. As with scalar multiplication, addition is defined component wise. So \\(A+B\\) is the matrix with \\(ij\\) entry \\(A_{ij} + B_{ij}\\).\n\n53.6.1 Matrix multiplication\nMatrix multiplication may be viewed as a collection of dot product operations. First, matrix multiplication is only defined between \\(A\\) and \\(B\\), as \\(AB\\), if the size of \\(A\\) is \\(m\\times n\\) and the size of \\(B\\) is \\(n \\times k\\). That is, the number of columns of \\(A\\) must match the number of rows of \\(B\\) for the left multiplication of \\(AB\\) to be defined. 
If this is so, then the \\(ij\\) entry of \\(AB\\) is:\n\\[\n(AB)_{ij} = A_{i:} \\cdot B_{:j}.\n\\]\nThat is, if we view the \\(i\\)th row of \\(A\\) and the \\(j\\)th column of \\(B\\) as vectors, then the \\(ij\\) entry is the dot product.\nThis is why \\(M\\), in the example above, has the coefficients for each equation in a row and not a column, and why \\(\\vec{x}\\) is thought of as an \\(n\\times 1\\) matrix (a column vector) and not as a row vector.\nMatrix multiplication between \\(A\\) and \\(B\\) is not, in general, commutative. Not only may the sizes not permit \\(BA\\) to be found when \\(AB\\) may be, there is just no guarantee when the sizes match that the components will be the same.\n\nMatrices have other operations defined on them. We mention three here:\n\nThe transpose of a matrix flips the difference between row and column, so the \\(ij\\) entry of the transpose is the \\(ji\\) entry of the matrix. This means the transpose will have size \\(n \\times m\\) when \\(M\\) has size \\(m \\times n\\). Mathematically, the transpose is denoted \\(M^t\\).\nThe determinant of a square matrix is a number that can be used to characterize the matrix. The determinant may be computed in different ways, but its definition by the Leibniz formula is common. Two special cases are all we need: the \\(2\\times 2\\) case and the \\(3 \\times 3\\) case:\n\n\\[\n\\left|\n\\begin{array}{}\na&b\\\\\nc&d\n\\end{array}\n\\right| =\nad - bc, \\quad\n\\left|\n\\begin{array}{}\na&b&c\\\\\nd&e&f\\\\\ng&h&i\n\\end{array}\n\\right| =\na \\left|\n\\begin{array}{}\ne&f\\\\\nh&i\n\\end{array}\n\\right|\n- b \\left|\n\\begin{array}{}\nd&f\\\\\ng&i\n\\end{array}\n\\right|\n+c \\left|\n\\begin{array}{}\nd&e\\\\\ng&h\n\\end{array}\n\\right|.\n\\]\nThe \\(3\\times 3\\) case shows how determinants may be computed recursively, using “cofactor” expansion.\n\nThe inverse of a square matrix. 
If \\(M\\) is a square matrix and its determinant is non-zero, then there is an inverse matrix, denoted \\(M^{-1}\\), with the properties that \\(MM^{-1} = M^{-1}M = I\\), where \\(I\\) is the diagonal matrix of all \\(1\\)s called the identify matrix.\n\n\n\n53.6.2 Matrices in Julia\nAs mentioned previously, a matrix in Julia is defined component by component with []. We separate row entries with spaces and columns with semicolons:\n\n = [3 4 -5; 5 -5 7; -3 6 9]\n\n3×3 Matrix{Int64}:\n 3 4 -5\n 5 -5 7\n -3 6 9\n\n\nSpace is the separator, which means computing a component during definition (i.e., writing 2 + 3 in place of 5) can be problematic, as no space can be used in the computation, lest it be parsed as a separator.\nVectors are defined similarly. As they are identified with column vectors, we use a semicolon (or a comma with simple numbers) to separate:\n\n𝒷 = [10, 11, 12] # not 𝒷 = [10 11 12], which would be a row vector.\n\n3-element Vector{Int64}:\n 10\n 11\n 12\n\n\nIn Julia, entries in a matrix (or a vector) are stored in a container with a type wide enough accomodate each entry. In this example, the type is SymPys Sym type:\n\n@syms x1 x2 x3\n𝓍 = [x1, x2, x3]\n\n3-element Vector{Sym}:\n x₁\n x₂\n x₃\n\n\nMatrices may also be defined from blocks. This example shows how to make two column vectors into a matrix:\n\n𝓊 = [10, 11, 12]\n𝓋 = [13, 14, 15]\n[𝓊 𝓋] # horizontally combine\n\n3×2 Matrix{Int64}:\n 10 13\n 11 14\n 12 15\n\n\nVertically combining the two will stack them:\n\n[𝓊; 𝓋]\n\n6-element Vector{Int64}:\n 10\n 11\n 12\n 13\n 14\n 15\n\n\nScalar multiplication will just work as expected:\n\n2 * \n\n3×3 Matrix{Int64}:\n 6 8 -10\n 10 -10 14\n -6 12 18\n\n\nMatrix addition is also straightforward:\n\n + \n\n3×3 Matrix{Int64}:\n 6 8 -10\n 10 -10 14\n -6 12 18\n\n\nMatrix addition expects matrices of the same size. An error will otherwise be thrown. However, if addition is broadcasted then the sizes need only be commensurate. 
For example, this will add 1 to each entry of M:\n\n .+ 1\n\n3×3 Matrix{Int64}:\n 4 5 -4\n 6 -4 8\n -2 7 10\n\n\nMatrix multiplication is defined by *:\n\n * \n\n3×3 Matrix{Int64}:\n 44 -38 -32\n -31 87 3\n -6 12 138\n\n\nWe can then see how the system of equations is represented with matrices:\n\n * 𝓍 - 𝒷\n\n3-element Vector{Sym}:\n 3⋅x₁ + 4⋅x₂ - 5⋅x₃ - 10\n 5⋅x₁ - 5⋅x₂ + 7⋅x₃ - 11\n -3⋅x₁ + 6⋅x₂ + 9⋅x₃ - 12\n\n\nHere we use SymPy to verify the above:\n\n𝒜 = [symbols(\"A$i$j\", real=true) for i in 1:3, j in 1:2]\n = [symbols(\"B$i$j\", real=true) for i in 1:2, j in 1:2]\n\n2×2 Matrix{Sym}:\n B₁₁ B₁₂\n B₂₁ B₂₂\n\n\nThe matrix product has the expected size: the number of rows of A (\\(3\\)) by the number of columns of B (\\(2\\)):\n\n𝒜 * \n\n3×2 Matrix{Sym}:\n A₁₁⋅B₁₁ + A₁₂⋅B₂₁ A₁₁⋅B₁₂ + A₁₂⋅B₂₂\n A₂₁⋅B₁₁ + A₂₂⋅B₂₁ A₂₁⋅B₁₂ + A₂₂⋅B₂₂\n A₃₁⋅B₁₁ + A₃₂⋅B₂₁ A₃₁⋅B₁₂ + A₃₂⋅B₂₂\n\n\nThis confirms how each entry ((A*B)[i,j]) is from a dot product (A[i,:] ⋅ B[:,j]):\n\n[ (𝒜 * )[i,j] == 𝒜[i,:] ⋅ [:,j] for i in 1:3, j in 1:2]\n\n3×2 Matrix{Bool}:\n 1 1\n 1 1\n 1 1\n\n\nWhen the multiplication is broadcasted though, with .*, the operation will be component wise:\n\n .* # component wise (Hadamard product)\n\n3×3 Matrix{Int64}:\n 9 16 25\n 25 25 49\n 9 36 81\n\n\n\nThe determinant is found by det provided by the LinearAlgebra package:\n\ndet()\n\n-600.0000000000001\n\n\n\nThe transpose of a matrix is found through transpose which doesnt create a new object, but rather an object which knows to switch indices when referenced:\n\ntranspose()\n\n3×3 transpose(::Matrix{Int64}) with eltype Int64:\n 3 5 -3\n 4 -5 6\n -5 7 9\n\n\nFor matrices with real numbers, the transpose can be performed with the postfix operation ':\n\n'\n\n3×3 adjoint(::Matrix{Int64}) with eltype Int64:\n 3 5 -3\n 4 -5 6\n -5 7 9\n\n\n(However, this is not true for matrices with complex numbers as ' is the “adjoint,” that is, the transpose of the matrix after taking complex conjugates.)\nWith u and v, vectors from 
above, we have:\n\n[𝓊' 𝓋'] # [𝓊 𝓋] was a 3 × 2 matrix, above\n\n1×6 adjoint(::Vector{Int64}) with eltype Int64:\n 10 11 12 13 14 15\n\n\nand\n\n[𝓊'; 𝓋']\n\n2×3 Matrix{Int64}:\n 10 11 12\n 13 14 15\n\n\n\n\nInfo: The adjoint is defined recursively in Julia. In the CalculusWithJulia package, we overload the ' notation for functions to yield a univariate derivative found with automatic differentiation. This can lead to problems: if we have a matrix of functions, M, and take the transpose with M', then the entries of M' would be the derivatives of the functions in M - not the original functions. This is very likely not what is desired. The CalculusWithJulia package commits type piracy here and abuses the generic idea for ' in Julia. In general, type piracy is very much frowned upon, as it can change expected behaviour. It is defined in CalculusWithJulia, as that package is intended only to act as a means to ease users into the wider package ecosystem of Julia.\n\n\n\n\n\nThe dot product and matrix multiplication are related, and mathematically identified through the relation: \\(\\vec{u} \\cdot \\vec{v} = u^t v\\), where the right hand side identifies \\(\\vec{u}\\) and \\(\\vec{v}\\) with an \\(n\\times 1\\) column matrix, and \\(u^t\\) is the transpose, or a \\(1\\times n\\) row matrix. However, mathematically the left side is a scalar, but the right side a \\(1\\times 1\\) matrix. While distinct, the two are identified as the same. This is similar to the useful identification of a point and a vector. Within Julia, these identifications are context dependent. Julia stores vectors as \\(1\\)-dimensional arrays, transposes as \\(1\\)-dimensional objects, and matrices as \\(2\\)-dimensional arrays. 
The product of a transpose and a vector is a scalar:\n\nu, v = [1,1,2], [3,5,8]\nu' * v # a scalar\n\n24\n\n\nBut if we make u a matrix (here by “reshaping” into a matrix with \\(1\\) row and \\(3\\) columns), we will get a matrix (actually a vector) in return:\n\nu, v = [1,1,2], [3,5,8]\nreshape(u,(1,3)) * v\n\n1-element Vector{Int64}:\n 24"
},
{
"objectID": "differentiable_vector_calculus/vectors.html#cross-product",
"href": "differentiable_vector_calculus/vectors.html#cross-product",
"title": "53  Vectors and matrices",
"section": "53.7 Cross product",
"text": "53.7 Cross product\nIn three dimensions, there is another operation between vectors that is similar to multiplication, though, as we will see, with many differences.\nLet \\(\\vec{u}\\) and \\(\\vec{v}\\) be two \\(3\\)-dimensional vectors, then the cross product, \\(\\vec{u} \\times \\vec{v}\\), is defined as a vector with length:\n\\[\n\\| \\vec{u} \\times \\vec{v} \\| = \\| \\vec{u} \\| \\| \\vec{v} \\| \\sin(\\theta),\n\\]\nwith \\(\\theta\\) being the angle in \\([0, \\pi]\\) between \\(\\vec{u}\\) and \\(\\vec{v}\\). Consequently, \\(\\sin(\\theta) \\geq 0\\).\nThe direction of the cross product is such that it is orthogonal to both \\(\\vec{u}\\) and \\(\\vec{v}\\). There are two such directions; to identify which is correct, the right-hand rule is used. This rule points the right hand fingers in the direction of \\(\\vec{u}\\) and curls them towards \\(\\vec{v}\\) (so that the angle between the two vectors is in \\([0, \\pi]\\)). The thumb will then point in the direction of the cross product. Call this direction \\(\\hat{n}\\), a normal unit vector. Then the cross product can be defined by:\n\\[\n\\vec{u} \\times \\vec{v} = \\| \\vec{u} \\| \\| \\vec{v} \\| \\sin(\\theta) \\hat{n}.\n\\]\n\n\nInfo: The right-hand rule is also useful to understand how standard household screws will behave when twisted with a screwdriver. If the right hand fingers curl in the direction of the twisting screwdriver, then the screw will go in or out following the direction pointed to by the thumb.\n\n\n\n\nThe right-hand rule depends on the order of consideration of the vectors. If they are reversed, the opposite direction is determined. 
A consequence is that the cross product is anti-commutative, unlike multiplication:\n\\[\n\\vec{u} \\times \\vec{v} = - \\vec{v} \\times \\vec{u}.\n\\]\nMathematically, the definition in terms of its components is a bit involved:\n\\[\n\\vec{u} \\times \\vec{v} = \\langle u_2 v_3 - u_3 v_2, u_3 v_1 - u_1 v_3, u_1 v_2 - u_2 v_1 \\rangle.\n\\]\nThere is a matrix notation that can simplify this computation. If we formally define \\(\\hat{i}\\), \\(\\hat{j}\\), and \\(\\hat{k}\\) to represent unit vectors in the \\(x\\), \\(y\\), and \\(z\\) direction, then a vector \\(\\langle u_1, u_2, u_3 \\rangle\\) could be written \\(u_1\\hat{i} + u_2\\hat{j} + u_3\\hat{k}\\). With this the cross product of \\(\\vec{u}\\) and \\(\\vec{v}\\) is the vector associated with the determinant of the matrix\n\\[\n\\left[\n\\begin{array}{}\n\\hat{i} & \\hat{j} & \\hat{k}\\\\\nu_1 & u_2 & u_3\\\\\nv_1 & v_2 & v_3\n\\end{array}\n\\right]\n\\]\nFrom the \\(\\sin(\\theta)\\) term in the definition, we see that \\(\\vec{u}\\times\\vec{u}=0\\). In fact, the cross product is \\(0\\) only if the two vectors involved are parallel or there is a zero vector.\nIn Julia, the cross function from the LinearAlgebra package implements the cross product. For example:\n\n𝓪 = [1, 2, 3]\n𝓫 = [4, 2, 1]\ncross(𝓪, 𝓫)\n\n3-element Vector{Int64}:\n -4\n 11\n -6\n\n\nThere is also the infix unicode operator \\times[tab] that can be used for similarity to traditional mathematical syntax.\n\n𝓪 × 𝓫\n\n3-element Vector{Int64}:\n -4\n 11\n -6\n\n\nWe can see the cross product is anti-commutative by comparing the last answer with:\n\n𝓫 × 𝓪\n\n3-element Vector{Int64}:\n 4\n -11\n 6\n\n\nUsing vectors of size different than \\(n=3\\) produces a dimension mismatch error:\n\n[1, 2] × [3, 4]\n\nLoadError: DimensionMismatch(\"cross product is only defined for vectors of length 3\")\n\n\n(It can prove useful to pad \\(2\\)-dimensional vectors into \\(3\\)-dimensional vectors by adding a \\(0\\) third component. 
We will see this in the discussion on curvature in the plane.)\nLets see that the matrix definition will be identical (after identifications) to cross:\n\n@syms î ĵ k̂\n𝓜 = [î ĵ k̂; 3 4 5; 3 6 7]\ndet(𝓜) |> simplify\n\n \n\\[\n6 k̂ - 2 î - 6 ĵ\n\\]\n\n\n\nCompare with\n\n𝓜[2,:] × 𝓜[3,:]\n\n3-element Vector{Sym}:\n -2\n -6\n 6\n\n\n\nConsider this extended picture involving two vectors \\(\\vec{u}\\) and \\(\\vec{v}\\) drawn in two dimensions:\n\nu₁ = [1, 2]\nv₁ = [2, 1]\np₁ = [0,0]\n\nplot(aspect_ratio=:equal)\narrow!(p₁, u₁)\narrow!(p₁, v₁)\narrow!(u₁, v₁)\narrow!(v₁, u₁)\n\npuv₁ = (u₁ ⋅ v₁) / (v₁ ⋅ v₁) * v₁\nporth₁ = u₁ - puv₁\narrow!(puv₁, porth₁)\n\n\n\n\nThe enclosed shape is a parallelogram. To this we added the projection of \\(\\vec{u}\\) onto \\(\\vec{v}\\) (puv) and then the orthogonal part (porth).\nThe area of a parallelogram is the length of one side times the perpendicular height. The perpendicular height could be found from norm(porth), so the area is:\n\nnorm(v₁) * norm(porth₁)\n\n3.0\n\n\nHowever, from trigonometry we have the height would also be the norm of \\(\\vec{u}\\) times \\(\\sin(\\theta)\\), a value that is given through the length of the cross product of \\(\\vec{u}\\) and \\(\\hat{v}\\), the unit vector, were these vectors viewed as \\(3\\) dimensional by adding a \\(0\\) third component. 
In formulas, this is also the case:\n\\[\n\\text{area of the parallelogram} = \\| \\vec{u} \\times \\hat{v} \\| \\| \\vec{v} \\| = \\| \\vec{u} \\times \\vec{v} \\|.\n\\]\nWe have, for our figure, after extending u and v to be three dimensional, the area of the parallelogram:\n\nu₂ = [1, 2, 0]\nv₂ = [2, 1, 0]\nnorm(u₂ × v₂)\n\n3.0\n\n\n\nThis analysis can be extended to the case of 3 vectors, which - when not co-planar - will form a parallelepiped.\n\nu₃, v₃, w₃ = [1,2,3], [2,1,0], [1,1,2]\nplot()\np₃ = [0,0,0]\n\nplot(legend=false)\narrow!(p₃, u₃); arrow!(p₃, v₃); arrow!(p₃, w₃)\narrow!(u₃, v₃); arrow!(u₃, w₃)\narrow!(v₃, u₃); arrow!(v₃, w₃)\narrow!(w₃, u₃); arrow!(w₃, v₃)\narrow!(u₃ + v₃, w₃); arrow!(u₃ + w₃, v₃); arrow!(v₃ + w₃, u₃)\n\n\n\n\nThe volume of a parallelepiped is the area of a base parallelogram times the height of a perpendicular. If \\(\\vec{u}\\) and \\(\\vec{v}\\) form the base parallelogram, then the perpendicular will have height \\(\\|\\vec{w}\\| \\cos(\\theta)\\) where the angle is the one made by \\(\\vec{w}\\) with the normal, \\(\\hat{n}\\). Since \\(\\vec{u} \\times \\vec{v} = \\| \\vec{u} \\times \\vec{v}\\| \\hat{n}\\), that is \\(\\hat{n}\\) times the area of the base parallelogram, we have, if we dot this answer with \\(\\vec{w}\\):\n\\[\n(\\vec{u} \\times \\vec{v}) \\cdot \\vec{w} =\n\\|\\vec{u} \\times \\vec{v}\\| (\\hat{n} \\cdot \\vec{w}) =\n\\|\\vec{u} \\times \\vec{v}\\| \\| \\vec{w}\\| \\cos(\\theta),\n\\]\nthat is, the volume of the parallelepiped. Wait, what about \\((\\vec{v}\\times\\vec{u})\\cdot\\vec{w}\\)? That will have an opposite sign. Yes, in the above, there is an assumption that \\(\\hat{n}\\) and \\(\\vec{w}\\) have an angle between them within \\([0, \\pi/2]\\), otherwise an absolute value must be used, as volume is non-negative.\n\n\n\n\n\n\nOrientation\n\n\n\nThe triple-scalar product, \\(\\vec{u}\\cdot(\\vec{v}\\times\\vec{w})\\), gives the volume of the parallelepiped up to sign. 
If the sign of this is positive, the \\(3\\) vectors are said to have a positive orientation; if the triple-scalar product is negative, the vectors have a negative orientation.\n\n\n\nAlgebraic properties\nThe cross product has many properties, some different from regular multiplication:\n\nscalar multiplication: \\((c\\vec{u})\\times\\vec{v} = c(\\vec{u}\\times\\vec{v})\\)\ndistributive over addition: \\(\\vec{u} \\times (\\vec{v} + \\vec{w}) = \\vec{u}\\times\\vec{v} + \\vec{u}\\times\\vec{w}\\).\nanti-commutative: \\(\\vec{u} \\times \\vec{v} = - \\vec{v} \\times \\vec{u}\\)\nnot associative: that is, there is no guarantee that \\((\\vec{u}\\times\\vec{v})\\times\\vec{w}\\) will be equivalent to \\(\\vec{u}\\times(\\vec{v}\\times\\vec{w})\\).\nThe triple cross product \\((\\vec{u}\\times\\vec{v}) \\times \\vec{w}\\) must be orthogonal to \\(\\vec{u}\\times\\vec{v}\\) so lies in a plane with this as a normal vector. But, \\(\\vec{u}\\) and \\(\\vec{v}\\) will generate this plane, so it should be possible to express this triple product in terms of a sum involving \\(\\vec{u}\\) and \\(\\vec{v}\\) and indeed:\n\n\\[\n(\\vec{u}\\times\\vec{v})\\times\\vec{w} = (\\vec{u}\\cdot\\vec{w})\\vec{v} - (\\vec{v}\\cdot\\vec{w})\\vec{u}.\n\\]\n\nThe following shows the algebraic properties stated above hold for symbolic vectors. 
First the linearity of the dot product:\n\n@syms s₄ t₄ u₄[1:3]::real v₄[1:3]::real w₄[1:3]::real\n\nu₄ ⋅ (s₄ * v₄ + t₄ * w₄) - (s₄ * (u₄ ⋅ v₄) + t₄ * (u₄ ⋅ w₄)) |> simplify\n\n \n\\[\n0\n\\]\n\n\n\nThis shows the dot product is commutative:\n\n(u₄ ⋅ v₄) - (v₄ ⋅ u₄) |> simplify\n\n \n\\[\n0\n\\]\n\n\n\nThis shows the linearity of the cross product over scalar multiplication and vector addition:\n\nu₄ × (s₄* v₄ + t₄ * w₄) - (s₄ * (u₄ × v₄) + t₄ * (u₄ × w₄)) .|> simplify\n\n3-element Vector{Sym}:\n 0\n 0\n 0\n\n\n(We use .|> to broadcast simplify over each component.)\nThe cross product is anti-commutative:\n\nu₄ × v₄ + v₄ × u₄ .|> simplify\n\n3-element Vector{Sym}:\n 0\n 0\n 0\n\n\nbut not associative:\n\nu₄ × (v₄ × w₄) - (u₄ × v₄) × w₄ .|> simplify\n\n3-element Vector{Sym}:\n u₄₁⋅v₄₂⋅w₄₂ + u₄₁⋅v₄₃⋅w₄₃ - u₄₂⋅v₄₂⋅w₄₁ - u₄₃⋅v₄₃⋅w₄₁\n -u₄₁⋅v₄₁⋅w₄₂ + u₄₂⋅v₄₁⋅w₄₁ + u₄₂⋅v₄₃⋅w₄₃ - u₄₃⋅v₄₃⋅w₄₂\n -u₄₁⋅v₄₁⋅w₄₃ - u₄₂⋅v₄₂⋅w₄₃ + u₄₃⋅v₄₁⋅w₄₁ + u₄₃⋅v₄₂⋅w₄₂\n\n\nFinally we verify the decomposition of the triple cross product:\n\n(u₄ × v₄) × w₄ - ( (u₄ ⋅ w₄) * v₄ - (v₄ ⋅ w₄) * u₄) .|> simplify\n\n3-element Vector{Sym}:\n 0\n 0\n 0\n\n\n\nThis table shows common usages of the symbols for various multiplication types: *, \\(\\cdot\\), and \\(\\times\\):\n\n\n\nSymbol\ninputs\noutput\ntype\n\n\n\n\n*\nscalar, scalar\nscalar\nregular multiplication\n\n\n*\nscalar, vector\nvector\nscalar multiplication\n\n\n*\nvector, vector\nundefined\n\n\n\n\\(\\cdot\\)\nscalar, scalar\nscalar\nregular multiplication\n\n\n\\(\\cdot\\)\nscalar, vector\nvector\nscalar multiplication\n\n\n\\(\\cdot\\)\nvector, vector\nscalar\ndot product\n\n\n\\(\\times\\)\nscalar, scalar\nscalar\nregular multiplication\n\n\n\\(\\times\\)\nscalar, vector\nundefined\n\n\n\n\\(\\times\\)\nvector, vector\nvector\ncross product (\\(3\\)D)\n\n\n\n\nExample: lines and planes\nA line in two dimensions satisfies the equation \\(ax + by = c\\). Suppose \\(a\\) and \\(b\\) are non-zero. 
This can be represented in vector form, as the collection of all points associated to the vectors: \\(p + t \\vec{v}\\) where \\(p\\) is a point on the line, say \\((0,c/b)\\), and \\(\\vec{v}\\) is the vector \\(\\langle -b, a \\rangle\\). We can verify this for values of \\(t\\) as follows:\n\n@syms a b c x y t\n\neq = c - (a*x + b*y)\n\np = [0, c/b]\nv = [-b, a]\nli = p + t * v\n\neq(x=>li[1], y=>li[2]) |> simplify\n\n \n\\[\n0\n\\]\n\n\n\nLet \\(\\vec{n} = \\langle a , b \\rangle\\), taken from the coefficients in the equation. We can see directly that \\(\\vec{n}\\) is orthogonal to \\(\\vec{v}\\). The line may then be seen as the collection of all vectors that are orthogonal to \\(\\vec{n}\\) that have their tail at the point \\(p\\).\nIn three dimensions, the equation of a plane is \\(ax + by + cz = d\\). Suppose \\(a\\), \\(b\\), and \\(c\\) are non-zero, for simplicity. Setting \\(\\vec{n} = \\langle a,b,c\\rangle\\) by comparison, it can be seen that the plane is identified with the set of all vectors orthogonal to \\(\\vec{n}\\) that are anchored at \\(p\\).\nFirst, let \\(p = (0, 0, d/c)\\) be a point on the plane. We find two vectors \\(u = \\langle -b, a, 0 \\rangle\\) and \\(v = \\langle 0, c, -b \\rangle\\). Then any point on the plane may be identified with the vector \\(p + s\\vec{u} + t\\vec{v}\\). We can verify this algebraically through:\n\n@syms a b c d x y z s t\n\neq = d - (a*x + b*y + c * z)\n\np = [0, 0, d/c]\nu, v = [-b, a, 0], [0, c, -b]\npl = p + t * u + s * v\n\nsubs(eq, x=>pl[1], y=>pl[2], z=>pl[3]) |> simplify\n\n \n\\[\n0\n\\]\n\n\n\nThe above viewpoint can be reversed:\n\na plane is determined by two (non-parallel) vectors and a point.\n\nThe parameterized version of the plane would be \\(p + t \\vec{u} + s \\vec{v}\\), as used above.\nThe equation of the plane can be given from \\(\\vec{u}\\) and \\(\\vec{v}\\). Let \\(\\vec{n} = \\vec{u} \\times \\vec{v}\\). 
Then \\(\\vec{n} \\cdot \\vec{u} = \\vec{n} \\cdot \\vec{v} = 0\\), from the properties of the cross product. As such, \\(\\vec{n} \\cdot (s \\vec{u} + t \\vec{v}) = 0\\). That is, the cross product is orthogonal to any linear combination of the two vectors. This figure shows one such linear combination:\n\nu = [1,2,3]\nv = [2,3,1]\nn = u × v\np = [0,0,1]\n\nplot(legend=false)\n\narrow!(p, u)\narrow!(p, v)\narrow!(p + u, v)\narrow!(p + v, u)\narrow!(p, n)\n\ns, t = 1/2, 1/4\narrow!(p, s*u + t*v)\n\n\n\n\nSo if \\(\\vec{n} \\cdot p = d\\) (identifying the point \\(p\\) with a vector so the dot product is defined), we will have for any point \\(\\langle x, y, z \\rangle = p + s \\vec{u} + t \\vec{v}\\) that\n\\[\n\\vec{n} \\cdot (p + s\\vec{u} + t \\vec{v}) = \\vec{n} \\cdot p + \\vec{n} \\cdot (s \\vec{u} + t \\vec{v}) = d + 0 = d.\n\\]\nBut if \\(\\vec{n} = \\langle a, b, c \\rangle\\), then this says \\(d = ax + by + cz\\), so from \\(\\vec{n}\\) and \\(p\\) the equation of the plane is given.\nIn summary:\n\n\n\nObject\nEquation\nVector equation\n\n\n\n\nLine\n\\(ax + by = c\\)\nline: \\(p + t\\vec{u}\\)\n\n\nPlane\n\\(ax + by + cz = d\\)\nplane: \\(p + s\\vec{u} + t\\vec{v}\\)\n\n\n\n\n\n\nExample\nYou are given that the vectors \\(\\vec{u} =\\langle 6, 3, 1 \\rangle\\) and \\(\\vec{v} = \\langle 3, 2, 1 \\rangle\\) describe a plane through the point \\(p=[1,1,2]\\). Find the equation of the plane.\nThe key is to find the normal vector to the plane, \\(\\vec{n} = \\vec{u} \\times \\vec{v}\\):\n\nu, v, p = [6,3,1], [3,2,1], [1,1,2]\nn = u × v\na, b, c = n\nd = n ⋅ p\n\"equation of plane: $a x + $b y + $c z = $d\"\n\n\"equation of plane: 1 x + -3 y + 3 z = 4\""
},
{
"objectID": "differentiable_vector_calculus/vectors.html#questions",
"href": "differentiable_vector_calculus/vectors.html#questions",
"title": "53  Vectors and matrices",
"section": "53.8 Questions",
"text": "53.8 Questions\n\nQuestion\nLet u=[1,2,3], v=[4,3,2], and w=[5,2,1].\nFind u ⋅ v:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nAre v and w orthogonal?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nFind the angle between u and w:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nFind u × v:\n\n\n\n \n \n \n \n \n \n \n \n \n [-1, 6, -7]\n \n \n\n\n \n \n \n \n [-4, 14, -8]\n \n \n\n\n \n \n \n \n [-5, 10, -5]\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nFind the area of the parallelogram formed by v and w\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nFind the volume of the parallelepiped formed by u, v, and w:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe dot product of two vectors may be described in words: pair off the corresponding values, multiply them, then add. In Julia the zip command will pair off two iterable objects, like vectors, so it seems like this command: sum(prod.(zip(u,v))) will find a dot product. Investigate if it is does or doesnt by testing the following command and comparing to the dot product:\nu,v = [1,2,3], [5,4,2]\nsum(prod.(zip(u,v)))\nDoes this return the same answer:\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat does command zip(u,v) return?\n\n\n\n \n \n \n \n \n \n \n \n \n A vector of values [(1, 5), (2, 4), (3, 2)]\n \n \n\n\n \n \n \n \n An object of type Base.Iterators.Zip that is only realized when used\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat does prod.(zip(u,v)) return?\n\n\n\n \n \n \n \n \n \n \n \n \n A vector of values [5, 8, 6]\n \n \n\n\n \n \n \n \n An object of type Base.Iterators.Zip that when realized will produce a vector of values\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(\\vec{u}\\) and \\(\\vec{v}\\) be 3-dimensional unit vectors. 
What is the value of\n\\[\n(\\vec{u} \\times \\vec{v}) \\cdot (\\vec{u} \\times \\vec{v}) + (\\vec{u} \\cdot \\vec{v})^2?\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n \\(0\\)\n \n \n\n\n \n \n \n \n Can't say in general\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the projection of \\(\\langle 1, 2, 3\\rangle\\) on \\(\\langle 3, 2, 1\\rangle\\). What is its length?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(\\vec{u} = \\langle 1, 2, 3 \\rangle\\) and \\(\\vec{v} = \\langle 3, 2, 1 \\rangle\\). Describe the plane created by these two non-parallel vectors going through the origin.\n\n\n\n \n \n \n \n \n \n \n \n \n \\(-4x + 8y - 4z = 0\\)\n \n \n\n\n \n \n \n \n \\(x + 2y + 3z = 6\\)\n \n \n\n\n \n \n \n \n \\(x + 2y + z = 0\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA plane \\(P_1\\) is orthogonal to \\(\\vec{n}_1\\), a plane \\(P_2\\) is orthogonal to \\(\\vec{n}_2\\). Explain why vector \\(\\vec{v} = \\vec{n}_1 \\times \\vec{n}_2\\) is parallel to the intersection of \\(P_1\\) and \\(P_2\\).\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\vec{v}\\) is in plane \\(P_1\\), as it is orthogonal to \\(\\vec{n}_1\\) and \\(P_2\\) as it is orthogonal to \\(\\vec{n}_2\\), hence it is parallel to both planes.\n \n \n\n\n \n \n \n \n \\(\\vec{n}_1\\) and \\(\\vec{n_2}\\) are unit vectors, so the cross product gives the projection, which must be orthogonal to each vector, hence in the intersection\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\n(From Strang). For an (analog) clock draw vectors from the center out to each of the 12 hours marked on the clock. 
What is the vector sum of these 12 vectors?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(12 \\langle 1, 0 \\rangle\\)\n \n \n\n\n \n \n \n \n \\(\\vec{0}\\)\n \n \n\n\n \n \n \n \n \\(\\langle 12, 12 \\rangle\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIf the vector to 3 oclock is removed, (call this \\(\\langle 1, 0 \\rangle\\)) what expresses the sum of all the remaining vectors?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\langle 1, 0 \\rangle\\)\n \n \n\n\n \n \n \n \n \\(\\langle -1, 0 \\rangle\\)\n \n \n\n\n \n \n \n \n \\(\\langle 11, 11 \\rangle\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(\\vec{u}\\) and \\(\\vec{v}\\) be unit vectors. Let \\(\\vec{w} = \\vec{u} + \\vec{v}\\). Then \\(\\vec{u} \\cdot \\vec{w} = \\vec{v} \\cdot \\vec{w}\\). What is the value?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\vec{u}\\cdot\\vec{v} + \\vec{v}\\cdot \\vec{v}\\)\n \n \n\n\n \n \n \n \n \\(1 + \\vec{u}\\cdot\\vec{v}\\)\n \n \n\n\n \n \n \n \n \\(\\vec{u} + \\vec{v}\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nAs the two are equal, which interpretation is true?\n\n\n\n \n \n \n \n \n \n \n \n \n The vector \\(\\vec{w}\\) must also be a unit vector\n \n \n\n\n \n \n \n \n the two are orthogonal\n \n \n\n\n \n \n \n \n The angle they make with \\(\\vec{w}\\) is the same\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose \\(\\| \\vec{u} + \\vec{v} \\|^2 = \\|\\vec{u}\\|^2 + \\|\\vec{v}\\|^2\\). What is \\(\\vec{u}\\cdot\\vec{v}\\)?\nWe have \\((\\vec{u} + \\vec{v})\\cdot(\\vec{u} + \\vec{v}) = \\vec{u}\\cdot \\vec{u} + 2 \\vec{u}\\cdot\\vec{v} + \\vec{v}\\cdot\\vec{v}\\). 
From this, we can infer that:\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\vec{u}\\cdot\\vec{v} = 0\\)\n \n \n\n\n \n \n \n \n \\(\\vec{u}\\cdot\\vec{v} = -(\\vec{u}\\cdot\\vec{u} \\vec{v}\\cdot\\vec{v})\\)\n \n \n\n\n \n \n \n \n \\(\\vec{u}\\cdot\\vec{v} = 2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nGive a geometric reason for this identity:\n\\[\n\\vec{u} \\cdot (\\vec{v} \\times \\vec{w}) =\n\\vec{v} \\cdot (\\vec{w} \\times \\vec{u}) =\n\\vec{w} \\cdot (\\vec{u} \\times \\vec{v})\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n The triple product describes a volume up to sign, this combination preserves the sign\n \n \n\n\n \n \n \n \n The vectors are orthogonal, so these are all zero\n \n \n\n\n \n \n \n \n The vectors are all unit lengths, so these are all 1\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSnells law in planar form is \\(n_1\\sin(\\theta_1) = n_2\\sin(\\theta_2)\\) where \\(n_i\\) is a constant depending on the medium.\n\n\n\n\n\nIn vector form, we can express it using unit vectors through:\n\n\n\n \n \n \n \n \n \n \n \n \n \\(n_1 (\\hat{v_1}\\times\\hat{N}) = -n_2 (\\hat{v_2}\\times\\hat{N})\\)\n \n \n\n\n \n \n \n \n \\(n_1 (\\hat{v_1}\\times\\hat{N}) = n_2 (\\hat{v_2}\\times\\hat{N})\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe Jacobi relationship show that for any \\(3\\) randomly chosen vectors:\n\\[\n\\vec{a}\\times(\\vec{b}\\times\\vec{c})+\n\\vec{b}\\times(\\vec{c}\\times\\vec{a})+\n\\vec{c}\\times(\\vec{a}\\times\\vec{b})\n\\]\nsimplifies. To what? (Use SymPy or randomly generated vectors to see.)\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\vec{a}\\)\n \n \n\n\n \n \n \n \n \\(\\vec{0}\\)\n \n \n\n\n \n \n \n \n \\(\\vec{a} + \\vec{b} + \\vec{c}\\)"
},
{
"objectID": "differentiable_vector_calculus/vector_valued_functions.html",
"href": "differentiable_vector_calculus/vector_valued_functions.html",
"title": "54  Vector-valued functions, \\(f:R \\rightarrow R^n\\)",
"section": "",
"text": "This section uses these add-on packages:\nand\nWe discuss functions of a single variable that return a vector in \\(R^n\\). There are many parallels to univariate functions (when \\(n=1\\)) and differences."
},
{
"objectID": "differentiable_vector_calculus/vector_valued_functions.html#definition",
"href": "differentiable_vector_calculus/vector_valued_functions.html#definition",
"title": "54  Vector-valued functions, \\(f:R \\rightarrow R^n\\)",
"section": "54.1 Definition",
"text": "54.1 Definition\nA function \\(\\vec{f}: R \\rightarrow R^n\\), \\(n > 1\\) is called a vector-valued function. Some examples:\n\\[\n\\vec{f}(t) = \\langle \\sin(t), 2\\cos(t) \\rangle, \\quad\n\\vec{g}(t) = \\langle \\sin(t), \\cos(t), t \\rangle, \\quad\n\\vec{h}(t) = \\langle 2, 3 \\rangle + t \\cdot \\langle 1, 2 \\rangle.\n\\]\nThe components themselves are also functions of \\(t\\), in this case univariate functions. Depending on the context, it can be useful to view vector-valued functions as a function that returns a vector, or a vector of the component functions.\nThe above example functions have \\(n\\) equal \\(2\\), \\(3\\), and \\(2\\) respectively. We will see that many concepts of calculus for univariate functions (\\(n=1\\)) have direct counterparts.\n(We use \\(\\vec{f}\\) above to emphasize the return value is a vector, but will quickly drop that notation and let context determine if \\(f\\) refers to a scalar- or vector-valued function.)"
},
{
"objectID": "differentiable_vector_calculus/vector_valued_functions.html#representation-in-julia",
"href": "differentiable_vector_calculus/vector_valued_functions.html#representation-in-julia",
"title": "54  Vector-valued functions, \\(f:R \\rightarrow R^n\\)",
"section": "54.2 Representation in Julia",
"text": "54.2 Representation in Julia\nIn Julia, the representation of a vector-valued function is straightforward: we define a function of a single variable that returns a vector. For example, the three functions above would be represented by:\n\nf(t) = [sin(t), 2*cos(t)]\ng(t) = [sin(t), cos(t), t]\nh(t) = [2, 3] + t * [1, 2]\n\nh (generic function with 1 method)\n\n\nFor a given t, these evaluate to a vector. For example:\n\nh(2)\n\n2-element Vector{Int64}:\n 4\n 7\n\n\nWe can create a vector of functions, e.g., F = [cos, sin, identity], but calling this object, as in F(t), would require some work, such as t = 1; [f(t) for f in F] or 1 .|> F.\n\nF = [cos, sin, identity]\n[f(1) for f in F]\n\n3-element Vector{Real}:\n 0.5403023058681398\n 0.8414709848078965\n 1\n\n\nor\n\n1 .|> F\n\n3-element Vector{Real}:\n 0.5403023058681398\n 0.8414709848078965\n 1"
},
{
"objectID": "differentiable_vector_calculus/vector_valued_functions.html#space-curves",
"href": "differentiable_vector_calculus/vector_valued_functions.html#space-curves",
"title": "54  Vector-valued functions, \\(f:R \\rightarrow R^n\\)",
"section": "54.3 Space curves",
"text": "54.3 Space curves\nA vector-valued function is typically visualized as a curve. That is, for some range, \\(a \\leq t \\leq b\\) the set of points \\(\\{\\vec{f}(t): a \\leq t \\leq b\\}\\) are plotted. If, say in \\(n=2\\), we have \\(x(t)\\) and \\(y(t)\\) as the component functions, then the graph would also be the parametric plot of \\(x\\) and \\(y\\). The term planar curve is common for the \\(n=2\\) case and space curve for the \\(n \\geq 3\\) case.\nThis plot represents the vectors with their tails at the origin.\nThere is a convention for plotting the component functions to yield a parametric plot within the Plots package (e.g., plot(x, y, a, b)). This can be used to make polar plots, where x is t -> r(t)*cos(t) and y is t -> r(t)*sin(t).\nHowever, we will use a different approach, as the component functions are not naturally produced from the vector-valued function.\nIn Plots, the command plot(xs, ys), where, say, xs=[x1, x2, ..., xn] and ys=[y1, y2, ..., yn], will make a connect-the-dot plot between corresponding pairs of points. As previously discussed, this can be used as an alternative to plotting a function through plot(f, a, b): first make a set of \\(x\\) values, say xs=range(a, b, length=100); then the corresponding \\(y\\) values, say ys = f.(xs); and then plotting through plot(xs, ys).\nSimilarly, were a third vector, zs, for \\(z\\) components used, plot(xs, ys, zs) will make a \\(3\\)-dimensional connect the dot plot\nHowever, our representation of vector-valued functions naturally generates a vector of points: [[x1,y1], [x2, y2], ..., [xn, yn]], as this comes from broadcasting f over some time values. That is, for a collection of time values, ts the command f.(ts) will produce a vector of points. (Technically a vector of vectors, but points if you identify the \\(2\\)-\\(d\\) vectors as points.)\nTo get the xs and ys from this is conceptually easy: just iterate over all the points and extract the corresponding component. 
For example, to get xs we would have a command like [p[1] for p in f.(ts)]. Similarly, the ys would use p[2] in place of p[1]. The unzip function from the CalculusWithJulia package does this for us. The name comes from how the zip function in base Julia takes two vectors and returns a vector of the values paired off. This is the reverse. As previously mentioned, unzip uses the invert function of the SplitApplyCombine package to invert the indexing (the \\(j\\)th component of the \\(i\\)th point can be referenced by vs[i][j] or invert(vs)[j][i]).\nVisually, we have unzip performing this reassociation:\n[[x1, y1, z1], (⌈x1⌉, ⌈y1⌉, ⌈z1⌉,\n [x2, y2, z2], |x2|, |y2|, |z2|,\n [x3, y3, z3], --> |x3|, |y3|, |z3|,\n ⋮ ⋮\n [xn, yn, zn]] ⌊xn⌋, ⌊yn⌋, ⌊zn⌋ )\nTo turn a collection of vectors into separate arguments for a function, splatting (the ...) is used.\n\nFinally, with these definitions, we can visualize the three functions we have defined.\nHere we show the plot of f over the values between \\(0\\) and \\(2\\pi\\) and also add a vector anchored at the origin defined by f(1).\n\nts = range(0, 2pi, length=200)\nxs, ys = unzip(f.(ts))\nplot(xs, ys)\narrow!([0, 0], f(1))\n\n\n\n\nThe trace of the plot is an ellipse. If we describe the components as \\(\\vec{f}(t) = \\langle x(t), y(t) \\rangle\\), then we have \\(x(t)^2 + y(t)^2/4 = 1\\). That is, for any value of \\(t\\), the resulting point satisfies the equation \\(x^2 + y^2/4 =1\\) for an ellipse.\nThe plot of \\(g\\) needs \\(3\\)-dimensions to render. For most plotting backends, the following should work with no differences, save the additional vector is anchored in \\(3\\) dimensions now:\n\nts = range(0, 6pi, length=200)\nplot(unzip(g.(ts))...) # use splatting to avoid xs,ys,zs = unzip(g.(ts))\narrow!([0, 0, 0], g(2pi))\n\n\n\n\nHere the graph is a helix; three turns are plotted. 
If we write \\(g(t) = \\langle x(t), y(t), z(t) \\rangle\\), as the \\(x\\) and \\(y\\) values trace out a circle, the \\(z\\) value increases. When the graph is viewed from above, as below, we see only \\(x\\) and \\(y\\) components, and the view is circular.\n\nts = range(0, 6pi, length=200)\nplot(unzip(g.(ts))..., camera=(0, 90))\n\n\n\n\nThe graph of \\(h\\) shows that this function parameterizes a line in space. The line segment for \\(-2 \\leq t \\leq 2\\) is shown below:\n\nts = range(-2, 2, length=200)\nplot(unzip(h.(ts))...)\n\n\n\n\n\n54.3.1 The plot_parametric function\nWhile the unzip function is easy to understand as a function that reshapes data from one format into one that plot can use, its usage is a bit cumbersome. The CalculusWithJulia package provides a function plot_parametric which hides the use of unzip and the splatting within a function definition.\nThe function borrows a calling style for Makie. The interval to plot over is specified first using a..b notation (which specifies a closed interval in the IntervalSets package), then the function is specified. Additional keyword arguments are passed along to plot.\n\nplot_parametric(-2..2, h)\n\n\n\n\n\n\n\n\n\n\nNote\n\n\n\nDefining plotting functions in Julia for Plots is facilitated by the RecipesBase package. There are two common choices: creating a new function for plotting, as is done with plot_parametric and plot_polar; or creating a new type so that plot can dispatch to an appropriate plotting method. The latter would also be a reasonable choice, but wasnt taken here. In any case, each can be avoided by creating the appropriate values for xs and ys (and possibly zs).\n\n\n\nExample\nFamiliarity with equations for lines, circles, and ellipses is important, as these fundamental geometric shapes are often building blocks in the description of other more complicated things.\nThe point-slope equation of a line, \\(y = y_0 + m \\cdot (x - x_0)\\) finds an analog. 
The slope, \\(m\\), is replaced with a vector \\(\\vec{v}\\), and the point, \\((x_0, y_0)\\), is replaced with a vector \\(\\vec{p}\\) identified with a point in the plane. A parameterization would then be \\(\\vec{f}(t) = \\vec{p} + (t - t_0) \\vec{v}\\). From this, we have \\(\\vec{f}(t_0) = \\vec{p}\\).\nThe unit circle is instrumental in introducing the trigonometric functions through the identification of an angle \\(t\\) with a point on the unit circle \\((x,y)\\) through \\(y = \\sin(t)\\) and \\(x=\\cos(t)\\). With this identification, certain properties of the trigonometric functions are immediately seen, such as the period of \\(\\sin\\) and \\(\\cos\\) being \\(2\\pi\\), or the angles for which \\(\\sin\\) and \\(\\cos\\) are positive or even increasing. Further, this gives a natural parameterization for a vector-valued function whose plot yields the unit circle, namely \\(\\vec{f}(t) = \\langle \\cos(t), \\sin(t) \\rangle\\). This parameterization starts (at \\(t=0\\)) at the point \\((1, 0)\\). More generally, we might have additional parameters \\(\\vec{f}(t) = \\vec{p} + R \\cdot \\langle \\cos(\\omega(t-t_0)), \\sin(\\omega(t-t_0)) \\rangle\\) to change the origin, \\(\\vec{p}\\); the radius, \\(R\\); the starting angle, \\(t_0\\); and the rotational frequency, \\(\\omega\\).\nAn ellipse has a slightly more general equation than a circle and in simplest form may satisfy the equation \\(x^2/a^2 + y^2/b^2 = 1\\), where, when \\(a=b\\), a circle is being described. A vector-valued function of the form \\(\\vec{f}(t) = \\langle a\\cdot\\cos(t), b\\cdot\\sin(t) \\rangle\\) will trace out an ellipse.\nThe above description of an ellipse is useful, but it can also be helpful to re-express the ellipse so that one of the foci is at the origin. 
With this, the ellipse can be given in polar coordinates through a description of the radius:\n\\[\nr(\\theta) = \\frac{a (1 - e^2)}{1 + e \\cos(\\theta)}.\n\\]\nHere, \\(a\\) is the semi-major axis (\\(a > b\\)); \\(e\\) is the eccentricity given by \\(b = a \\sqrt{1 - e^2}\\); and \\(\\theta\\) a polar angle.\nUsing the conversion to Cartesian equations, we have \\(\\vec{f}(\\theta) = \\langle r(\\theta) \\cos(\\theta), r(\\theta) \\sin(\\theta)\\rangle\\).\nFor example:\n\na, ecc = 20, 3/4\nf(t) = a*(1-ecc^2)/(1 + ecc*cos(t)) * [cos(t), sin(t)]\nplot_parametric(0..2pi, f, legend=false)\nscatter!([0],[0], markersize=4)\n\n\n\n\n\n\nExample\nThe Spirograph is “… a geometric drawing toy that produces mathematical roulette curves of the variety technically known as hypotrochoids and epitrochoids. It was developed by British engineer Denys Fisher and first sold in \\(1965\\).” These can be used to make interesting geometrical curves.\nFollowing Wikipedia: Consider a fixed outer circle \\(C_o\\) of radius \\(R\\) centered at the origin. A smaller inner circle \\(C_i\\) of radius \\(r < R\\) rolling inside \\(C_o\\) and is continuously tangent to it. \\(C_i\\) will be assumed never to slip on \\(C_o\\) (in a real Spirograph, teeth on both circles prevent such slippage). Now assume that a point \\(A\\) lying somewhere inside \\(C_{i}\\) is located a distance \\(\\rho < r\\) from \\(C_i\\)s center.\nThe center of the inner circle will move in a circular manner with radius \\(R-r\\). The fixed point on the inner circle will rotate about this center. The accumulated angle may be described by the angle the point of contact of the inner circle with the outer circle. Call this angle \\(t\\).\nSuppose the outer circle is centered at the origin and the inner circle starts (\\(t=0\\)) with center \\((R-r, 0)\\) and rotates around counterclockwise. Then if the point of contact makes angle \\(t\\), the arc length along the outer circle is \\(Rt\\). 
The inner circle will have moved a distance \\(r t'\\) in the opposite direction, so \\(Rt =-r t'\\) and solving the angle will be \\(t' = -(R/r)t\\).\nIf the initial position of the fixed point is at \\((\\rho, 0)\\) relative to the origin, then the following function will describe the motion:\n\\[\n\\vec{s}(t) = (R-r) \\cdot \\langle \\cos(t), \\sin(t) \\rangle +\n\\rho \\cdot \\langle \\cos(-\\frac{R}{r}t), \\sin(-\\frac{R}{r}t) \\rangle.\n\\]\nTo visualize this we first define a helper function to draw a circle at point \\(P\\) with radius \\(R\\):\n\ncircle!(P, R; kwargs...) = plot_parametric!(0..2pi, t -> P + R * [cos(t), sin(t)]; kwargs...)\n\ncircle! (generic function with 1 method)\n\n\nThen we have this function to visualize the spirograph for different \\(t\\) values:\n\nfunction spiro(t; r=2, R=5, rho=0.8*r)\n\n cent(t) = (R-r) * [cos(t), sin(t)]\n\n p = plot(legend=false, aspect_ratio=:equal)\n circle!([0,0], R, color=:blue)\n circle!(cent(t), r, color=:black)\n\n tp(t) = -R/r * t\n\n s(t) = cent(t) + rho * [cos(tp(t)), sin(tp(t))]\n plot_parametric!(0..t, s, color=:red)\n\n p\nend\n\nspiro (generic function with 1 method)\n\n\nAnd we can see the trace for \\(t=\\pi\\):\n\nspiro(pi)\n\n\n\n\nThe point of contact is at \\((-R, 0)\\), as expected. Carrying this forward to a full circles worth is done through:\n\nspiro(2pi)\n\n\n\n\nThe curve does not match up at the start. For that, a second time around the outer circle is needed:\n\nspiro(4pi)\n\n\n\n\nWhether the curve will have a period or not is decided by the ratio of \\(R/r\\) being rational or irrational.\n\n\nExample\nIn 1935 Marcel Duchamp showed a collection of “Rotorelief” discs at a French fair for inventors. Disk number 10 is comprised of several nested, off-center circles on disk that would be rotated to give a sense of movement. 
To mimic the effect:\n\nfor each circle, \\(3\\) points where selected using a mouse from an image and their pixels recorded;\nas \\(3\\) points determine a circle, the center and radius of each circle can be solved for\nthe exterior of the disc is drawn (the last specified circle below);\neach nested circle is drawn after its center is rotated by \\(\\theta\\) radian;\nan animation captures the movement for display.\n\n\nlet\n# https://exploratorium.tumblr.com/post/33140874462/marcel-duchamp-rotoreliefs-duchamp-recognized\n\n# coordinates and colors selected by gimp from\n# https://arthur.io/art/marcel-duchamp/rotorelief-no-10-cage-modele-depose-verso\n circs = [466 548 513 505 556 554 # x₁,y₁,x₂,y₂,x₂,y₃\n 414 549 511 455 595 549\n 365 545 507 408 635 548\n 319 541 506 361 673 546\n 277 543 509 317 711 546\n 236 539 507 272 747 551\n 201 541 504 230 781 550\n 166 541 503 189 816 544\n 140 542 499 153 848 538\n 116 537 496 119 879 538\n 96 539 501 90 905 534\n 81 530 500 67 930 530\n 72 525 498 51 949 529\n 66 520 500 36 966 527\n 60 515 499 25 982 526\n 35 509 499 11 1004 525 # outer edge, c₀\n ]\n\n greenblue= RGB(8/100, 58/100, 53/100)\n grey = RGB(76/100, 74/100, 72/100)\n white = RGB(88/100, 85/100, 81/100)\n\n # solve for center of circle, radius for each\n @syms h::positive k::positive r::positive\n function solve_i(i)\n eqs = [(p[1] - h)^2 + (p[2]-k)^2 ~ r^2 for\n p ∈ (circs[i,1:2], circs[i,3:4], circs[i,5:6])]\n d = solve(eqs)[1]\n (x=float(d[h]), y=float(d[k]), r=float(d[r]))\n end\n c₀, cs... 
= solve_i.(16:-1:1) # c₀ is centered\n\n function duchamp_rotorelief_10(θ)\n p = plot(legend=false,\n axis=nothing, xaxis=false, yaxis=false,\n aspect_ratio=:equal)\n\n O = [c₀.x, c₀.y]\n θ̂ = [cos(θ), sin(θ)]\n\n circle!(O, c₀.r, # outer ring is c₀\n linewidth=2,\n color=grey, fill=white,\n seriestype=:shape)\n\n for (i,c) ∈ enumerate(cs) # add nested rings\n rᵢ = sqrt((c₀.x - c.x)^2+(c₀.y - c.y)^2)\n P = O + rᵢ * θ̂ # rotate about origin by θ\n circle!(P, c.r,\n linewidth = i == 1 ? 1 : i <= 3 ? 2 : 3,\n color=greenblue)\n end\n\n p\n\n end\n\n # animate using Plots.@animate macro\n anim = @animate for θ ∈ range(0, -2π, length=60)\n duchamp_rotorelief_10(θ)\n end\n\n fname = tempname() * \".gif\"\n gif(anim, fname, fps = 40)\nend\n\n\n\n\n\n\n\nExample\nIvars Peterson described the carnival ride “tilt-a-whirl” as a chaotic system, whose equations of motion are presented in American Journal of Physics by Kautz and Huggard. The tilt-a-whirl has a platform that moves in a circle that also moves up and down. To describe the motion of a point on the platform assuming it has radius \\(R\\) and period \\(T\\) and rises twice in that period could be done with the function:\n\\[\n\\vec{u}(t) = \\langle R \\sin(2\\pi t/T), R \\cos(2\\pi t/T), h + h \\cdot \\sin(2\\pi t/ T) \\rangle.\n\\]\nA passenger sits on a circular platform with radius \\(r\\) attached at some point on the larger platform. The dynamics of the person on the tilt-a-whirl depend on physics, but for simplicity, lets assume the platform moves at a constant rate with period \\(S\\) and has no relative \\(z\\) component. 
The motion of the platform in relation to the point it is attached would be modeled by:\n\\[\n\\vec{v}(t) = \\langle r \\sin(2\\pi t/S), r \\sin(2\\pi t/S), 0 \\rangle.\n\\]\nAnd the motion relative to the origin would be the vector sum, or superposition:\n\\[\n\\vec{f}(t) = \\vec{u}(t) + \\vec{v}(t).\n\\]\nTo visualize for some parameters, we have:\n\nM, m = 25, 5\nheight = 5\nS, T = 8, 2\nouter(t) = [M * sin(2pi*t/T), M * cos(2pi*t/T), height*(1 +sin(2pi * (t-pi/2)/T))]\ninner(t) = [m * sin(2pi*t/S), m * cos(2pi*t/S), 0]\nf(t) = outer(t) + inner(t)\nplot_parametric(0..8, f)"
},
{
"objectID": "differentiable_vector_calculus/vector_valued_functions.html#limits-and-continuity",
"href": "differentiable_vector_calculus/vector_valued_functions.html#limits-and-continuity",
"title": "54  Vector-valued functions, \\(f:R \\rightarrow R^n\\)",
"section": "54.4 Limits and continuity",
"text": "54.4 Limits and continuity\nThe definition of a limit for a univariate function is: For every \\(\\epsilon > 0\\) there exists a \\(\\delta > 0\\) such that if \\(0 < |x-c| < \\delta\\) then \\(|f(x) - L | < \\epsilon\\).\nIf the notion of “\\(\\vec{f}\\) is close to \\(L\\)” is replaced by close in the sense of a norm, or vector distance, then the same limit definition can be used, with the new wording “… \\(\\| \\vec{f}(x) - L \\| < \\epsilon\\)”.\nThe notion of continuity is identical: \\(\\vec{f}(t)\\) is continuous at \\(t_0\\) if \\(\\lim_{t \\rightarrow t_0}\\vec{f}(t) = \\vec{f}(t_0)\\). More informally \\(\\| \\vec{f}(t) - \\vec{f}(t_0)\\| \\rightarrow 0\\).\nA consequence of the triangle inequality is that a vector-valued function is continuous or has a limit if and only it its component functions do.\n\n54.4.1 Derivatives\nIf \\(\\vec{f}(t)\\) is vector valued, and \\(\\Delta t > 0\\) then we can consider the vector:\n\\[\n\\vec{f}(t + \\Delta t) - \\vec{f}(t)\n\\]\nFor example, if \\(\\vec{f}(t) = \\langle 3\\cos(t), 2\\sin(t) \\rangle\\) and \\(t=\\pi/4\\) and \\(\\Delta t = \\pi/16\\) we have this picture:\n\nf(t) = [3cos(t), 2sin(t)]\nt, Δt = pi/4, pi/16\ndf = f(t + Δt) - f(t)\n\nplot(legend=false)\narrow!([0,0], f(t))\narrow!([0,0], f(t + Δt))\narrow!(f(t), df)\n\n\n\n\nThe length of the difference appears to be related to the length of \\(\\Delta t\\), in a similar manner as the univariate derivative. The following limit defines the derivative of a vector-valued function:\n\\[\n\\vec{f}'(t) = \\lim_{\\Delta t \\rightarrow 0} \\frac{f(t + \\Delta t) - f(t)}{\\Delta t}.\n\\]\nThe limit exists if the component limits do. The component limits are just the derivatives of the component functions. So, if \\(\\vec{f}(t) = \\langle x(t), y(t) \\rangle\\), then \\(\\vec{f}'(t) = \\langle x'(t), y'(t) \\rangle\\).\nIf the derivative is never \\(\\vec{0}\\), the curve is called regular. 
For a regular curve the derivative is a tangent vector to the parameterized curve, akin to the case for a univariate function. We can use ForwardDiff to compute the derivative in the exact same manner as was done for univariate functions:\nusing ForwardDiff\nD(f,n=1) = n > 1 ? D(D(f),n-1) : x -> ForwardDiff.derivative(f, float(x))\nBase.adjoint(f::Function) = D(f) # allow f' to compute derivative\n(This is already done by the CalculusWithJulia package.)\nWe can visualize the tangential property through a graph:\n\nf(t) = [3cos(t), 2sin(t)]\np = plot_parametric(0..2pi, f, legend=false, aspect_ratio=:equal)\nfor t in [1,2,3]\n arrow!(f(t), f'(t)) # add arrow with tail on curve, in direction of derivative\nend\np\n\n\n\n\n\n\n54.4.2 Symbolic representation\nWere symbolic expressions used in place of functions, the vector-valued function would naturally be represented as a vector of expressions:\n\n@syms 𝒕\n𝒗vf = [cos(𝒕), sin(𝒕), 𝒕]\n\n3-element Vector{Sym}:\n cos(𝒕)\n sin(𝒕)\n 𝒕\n\n\nWe will see working with these expressions is not identical to working with a vector-valued function.\nTo plot, we can avail ourselves of the parametric plot syntax. 
The following expands to plot(cos(t), sin(t), t, 0, 2pi):\n\nplot(𝒗vf..., 0, 2pi)\n\n\n\n\nThe unzip approach, as was done above, could also be used, but it would be more trouble in this case.\nTo evaluate the function at a given value, say \\(t=2\\), we can use subs with broadcasting to substitute into each component:\n\nsubs.(𝒗vf, 𝒕=>2)\n\n3-element Vector{Sym}:\n cos(2)\n sin(2)\n 2\n\n\nLimits are performed component by component, and can also be defined by broadcasting, again with the need to adjust the values:\n\n@syms Δ\nlimit.((subs.(𝒗vf, 𝒕 => 𝒕 + Δ) - 𝒗vf) / Δ, Δ => 0)\n\n3-element Vector{Sym}:\n -sin(𝒕)\n cos(𝒕)\n 1\n\n\nDerivatives, as was just done through a limit, are a bit more straightforward than evaluation or limit taking, as we won't bump into the shape mismatch when broadcasting:\n\ndiff.(𝒗vf, 𝒕)\n\n3-element Vector{Sym}:\n -sin(𝒕)\n cos(𝒕)\n 1\n\n\nThe second derivative can be found through:\n\ndiff.(𝒗vf, 𝒕, 𝒕)\n\n3-element Vector{Sym}:\n -cos(𝒕)\n -sin(𝒕)\n 0\n\n\n\n\n54.4.3 Applications of the derivative\nHere are some sample applications of the derivative.\n\nExample: equation of the tangent line\nThe derivative of a vector-valued function is similar to that of a univariate function, in that it indicates a direction tangent to a curve. The point-slope form offers a straightforward parameterization. We have a point given through the vector-valued function and a direction given by its derivative. (After identifying a vector with its tail at the origin with the point that is the head of the vector.)\nWith this, the equation is simply \\(\\vec{tl}(t) = \\vec{f}(t_0) + \\vec{f}'(t_0) \\cdot (t - t_0)\\), where the dot indicates scalar multiplication.\n\n\nExample: parabolic motion\nIn physics, we learn that the equation \\(F=ma\\) can be used to derive a formula for position when the acceleration, \\(a\\), is a constant. The resulting equation of motion is \\(x = x_0 + v_0t + (1/2) at^2\\). 
Similarly, if \\(x(t)\\) is a vector-valued position vector and the second derivative, \\(x''(t) = \\vec{a}\\), is a constant vector, then we have \\(x(t) = \\vec{x_0} + \\vec{v_0}t + (1/2) \\vec{a} t^2\\).\nIn two dimensions, the force due to gravity acts downward, only in the \\(y\\) direction. The acceleration is then \\(\\vec{a} = \\langle 0, -g \\rangle\\). If we start at the origin, with initial velocity \\(\\vec{v_0} = \\langle 2, 3\\rangle\\), then we can plot the trajectory until the object returns to ground (\\(y=0\\)) as follows:\n\ngravity = 9.8\nx0, v0, a = [0,0], [2, 3], [0, -gravity]\nxpos(t) = x0 + v0*t + (1/2)*a*t^2\n\nt_0 = find_zero(t -> xpos(t)[2], (1/10, 100)) # find when y=0\n\nplot_parametric(0..t_0, xpos)\n\n\n\n\n\n\nExample: a tractrix\nA tractrix, studied by Perrault, Newton, Huygens, and many others, is the curve along which an object moves when pulled in a horizontal plane by a line segment attached to a pulling point (Wikipedia). If the object is placed at \\((a,0)\\) and the puller at the origin, and the puller moves along the positive \\(x\\) axis, then the line will always be tangent to the curve and of fixed length, so determinable from the motion of the puller. In this example \\(dy/dx = -\\sqrt{a^2-x^2}/x\\).\nThis is the key property: “Due to the geometrical way it was defined, the tractrix has the property that the segment of its tangent, between the asymptote and the point of tangency, has constant length \\(a\\).”\nThe tracks made by the front and rear bicycle wheels also have this same property and similarly afford a mathematical description. We follow Dunbar, Bosman, and Nooij from The Track of a Bicycle Back Tire below, though Levi and Tabachnikov and Foote, Levi, and Tabachnikov were also consulted. Let \\(a\\) be the distance between the front and back wheels, whose positions are parameterized by \\(\\vec{F}(t)\\) and \\(\\vec{B}(t)\\), respectively. 
The key property is that the distance between the two is always \\(a\\) and, as the back wheel is always moving in the direction of the front wheel, \\(\\vec{B}'(t)\\) points in the direction of \\(\\vec{F}(t) - \\vec{B}(t)\\); that is, the vector \\((\\vec{F}(t)-\\vec{B}(t))/a\\) is a unit vector in the direction of the derivative of \\(\\vec{B}\\). How long is the derivative vector? That is answered by the speed of the back wheel, which is related to the velocity of the front wheel. Only the component of the front wheel's velocity in the direction of \\(\\vec{F}(t)-\\vec{B}(t)\\) matters, so the speed of the back wheel is the length of the projection of \\(\\vec{F}'(t)\\) onto the unit vector \\((\\vec{F}(t)-\\vec{B}(t))/a\\), which is identified through the dot product.\nCombined, this gives the following equations relating \\(\\vec{F}(t)\\) to \\(\\vec{B}(t)\\):\n\\[\ns_B(t) = \\vec{F}'(t) \\cdot \\frac{\\vec{F}(t)-\\vec{B}(t)}{a}, \\quad\n\\vec{B}'(t) = s_B(t) \\frac{\\vec{F}(t)-\\vec{B}(t)}{a}.\n\\]\nThis is a differential equation describing the motion of the back wheel in terms of the front wheel.\nIf the back wheel trajectory is known, the relationship is much easier, as the two differ by a vector of length \\(a\\) in the direction of \\(\\vec{B}'(t)\\), or:\n\\[\n\\vec{F}(t) = \\vec{B}(t) + a \\frac{\\vec{B}'(t)}{\\|\\vec{B}'(t)\\|}.\n\\]\nWe don't discuss when a differential equation has a solution, or if it is unique when it does, but note that the differential equation above may be solved numerically, in a manner somewhat similar to what was discussed in ODEs. 
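Before reaching for a packaged solver, the back-wheel equation can be stepped by hand with Euler's method. This is only a sketch: the step size, time span, and circular front-wheel path below are illustrative choices, not from the original.

```julia
using LinearAlgebra  # dot, norm

# Euler stepping of the back-wheel equation B′(t) = s_B(t) (F(t) - B)/a
function euler_back_wheel(F, dF, a; h=1e-4, T=2pi)
    B = F(0) - [0.0, a]          # an initial rear-wheel position
    for t in 0:h:(T - h)
        u = (F(t) - B) / a       # unit vector from back wheel to front wheel
        sB = dot(dF(t), u)       # rear speed: projected front-wheel velocity
        B = B + h * sB * u       # Euler update
    end
    B
end

F(t) = 3 * [cos(t), sin(t)]      # front wheel on a circle of radius 3
dF(t) = 3 * [-sin(t), cos(t)]    # its derivative, written out by hand

B = euler_back_wheel(F, dF, 1.0)
norm(F(2pi) - B)                 # ≈ 1: the wheelbase length is preserved
```

The exact dynamics keep the wheelbase ‖F(t) − B(t)‖ constant; the small drift seen here is Euler-method error of the kind an adaptive solver controls automatically.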
Here we will use the DifferentialEquations package for finding numeric solutions.\nWe can define our equation as follows, using p to pass in the two parameters: the wheel-base length \\(a\\), and \\(F(t)\\), the parameterization of the front wheel in time:\n\nfunction bicycle(dB, B, p, t)\n\n a, F = p # unpack parameters\n\n speed = F'(t) ⋅ (F(t) - B) / a\n dB[1], dB[2] = speed * (F(t) - B) / a\n\nend\n\nbicycle (generic function with 1 method)\n\n\nLets consider a few simple cases first. We suppose \\(a=1\\) and the front wheel moves in a circle of radius \\(3\\). Here is how we can plot two loops:\n\nt₀, t₁ = 0.0, 4pi\n\ntspan₁ = (t₀, t₁) # time span to consider\n\na₁ = 1\nF₁(t) = 3 * [cos(t), sin(t)]\np₁ = (a₁, F₁) # combine parameters\n\nB₁0 = F₁(0) - [0, a₁] # some initial position for the back\nprob₁ = ODEProblem(bicycle, B₁0, tspan₁, p₁)\n\nout₁ = solve(prob₁, reltol=1e-6, Tsit5())\n\nretcode: Success\nInterpolation: specialized 4th order \"free\" interpolation\nt: 141-element Vector{Float64}:\n 0.0\n 0.024824742235786373\n 0.06751397117038993\n 0.1225775744732509\n 0.1929915124073604\n 0.2711129719090575\n 0.3857318027998987\n 0.46740132244848365\n 0.5643259884408179\n 0.6554327124582173\n 0.751828610060814\n 0.8474585646116065\n 0.9451683737988007\n ⋮\n 11.626710544395511\n 11.715968802190137\n 11.806662802491857\n 11.898926271619045\n 11.99240771723366\n 12.086815783410266\n 12.181697617359077\n 12.276652206566792\n 12.37129130733533\n 12.465296916090319\n 12.558395042971764\n 12.566370614359172\nu: 141-element Vector{Vector{Float64}}:\n [3.0, -1.0]\n [2.9999774774962056, -0.9255330157507721]\n [2.9995614415250142, -0.7975914654455536]\n [2.997483559821272, -0.6329875803722077]\n [2.990691054485171, -0.4235364001967119]\n [2.975659153761233, -0.1929487825725642]\n [2.935503829312559, 0.14094436765464505]\n [2.8916659543917227, 0.3747473361066778]\n [2.8215694921803482, 0.6465254569928282]\n [2.736905463531427, 0.8949976598137234]\n [2.6270391429970594, 
1.1488425231948574]\n [2.4974132664632624, 1.3896631861459474]\n [2.344143370381871, 1.622435084495964]\n ⋮\n [0.8123161465358226, -2.7092697598779267]\n [1.0505861468263447, -2.6260748066160198]\n [1.284111227882155, -2.520130648758621]\n [1.510835839600701, -2.3911033418366356]\n [1.72743761116193, -2.2396337696631425]\n [1.9308706574010548, -2.066818479716193]\n [2.1179952135647127, -1.8745923300524743]\n [2.2861878502648825, -1.66533637730947]\n [2.433328078249219, -1.4418441695389987]\n [2.5579261636905746, -1.2070683064035401]\n [2.6590625968276593, -0.9640468194329426]\n [2.666666766573229, -0.9428088491846468]\n\n\nThe object out holds the answer. This object is callable, in that out(t) will return the numerically computed value for the answer to our equation at time point t.\nTo plot the two trajectories, we could use that out.u holds the \\(x\\) and \\(y\\) components of the computed trajectory, but more simply, we can just call out like a function.\n\nplt₁ = plot_parametric(t₀..t₁, F₁, legend=false)\nplot_parametric!(t₀..t₁, out₁, linewidth=3)\n\n## add the bicycle as a line segment at a few times along the path\nfor t in range(t₀, t₁, length=11)\n plot!(unzip([out₁(t), F₁(t)])..., linewidth=3, color=:black)\nend\nplt₁\n\n\n\n\nThat the rear wheel track appears shorter, despite the rear wheel starting outside the circle, is typical of bicycle tracks and also a reason to rotate tires on car, as the front ones move a bit more than the rear, so presumably wear faster.\nLets look what happens if the front wheel wobbles back and forth following a sine curve. 
Repeating the above, only with \\(F\\) redefined, we have:\n\na₂ = 1\nF₂(t) = [t, 2sin(t)]\np₂ = (a₂, F₂)\n\nB₂0 = F₂(0) - [0, a₂] # some initial position for the back\nprob₂ = ODEProblem(bicycle, B₂0, tspan₁, p₂)\n\nout₂ = solve(prob₂, reltol=1e-6, Tsit5())\n\nplot_parametric(t₀..t₁, F₂, legend=false)\nplot_parametric!(t₀..t₁, t -> out₂(t), linewidth=3)\n\n\n\n\nAgain, the back wheel moves less than the front.\nThe motion of the back wheel need not be smooth, even if the motion of the front wheel is, as this curve illustrates:\n\na₃ = 1\nF₃(t) = [cos(t), sin(t)] + [cos(2t), sin(2t)]\np₃ = (a₃, F₃)\n\nB₃0 = F₃(0) - [0,a₃]\nprob₃ = ODEProblem(bicycle, B₃0, tspan₁, p₃)\n\nout₃ = solve(prob₃, reltol=1e-6, Tsit5())\nplot_parametric(t₀..t₁, F₃, legend=false)\nplot_parametric!(t₀..t₁, t -> out₃(t), linewidth=3)\n\n\n\n\nThe back wheel is moving backwards for part of the above trajectory.\nThis effect can happen even for a front wheel motion as simple as a circle when the front wheel radius is less than the wheelbase:\n\na₄ = 1\nF₄(t) = a₄/3 * [cos(t), sin(t)]\np₄ = (a₄, F₄)\n\nt₀₄, t₁₄ = 0.0, 25pi\ntspan₄ = (t₀₄, t₁₄)\n\nB₄0 = F₄(0) - [0, a₄]\nprob₄ = ODEProblem(bicycle, B₄0, tspan₄, p₄)\n\nout₄ = solve(prob₄, reltol=1e-6, Tsit5())\nplot_parametric(t₀₄..t₁₄, F₄, legend=false, aspect_ratio=:equal)\nplot_parametric!(t₀₄..t₁₄, t -> out₄(t), linewidth=3)\n\n\n\n\nLater we will characterize when there are cusps in the rear-wheel trajectory."
},
{
"objectID": "differentiable_vector_calculus/vector_valued_functions.html#derivative-rules",
"href": "differentiable_vector_calculus/vector_valued_functions.html#derivative-rules",
"title": "54  Vector-valued functions, \\(f:R \\rightarrow R^n\\)",
"section": "54.5 Derivative rules",
"text": "54.5 Derivative rules\nFrom the definition, as it is for univariate functions, for vector-valued functions \\(\\vec{f}, \\vec{g}: R \\rightarrow R^n\\):\n\\[\n[\\vec{f} + \\vec{g}]'(t) = \\vec{f}'(t) + \\vec{g}'(t), \\quad\\text{and }\n[a\\vec{f}]'(t) = a \\vec{f}'(t).\n\\]\nIf \\(a(t)\\) is a univariate (scalar) function of \\(t\\), then a product rule holds:\n\\[\n[a(t) \\vec{f}(t)]' = a'(t)\\vec{f}(t) + a(t)\\vec{f}'(t).\n\\]\nIf \\(s\\) is a univariate function, then the composition \\(\\vec{f}(s(t))\\) can be differentiated. Each component would satisfy the chain rule, and consequently:\n\\[\n\\frac{d}{dt}\\left(\\vec{f}(s(t))\\right) = \\vec{f}'(s(t)) \\cdot s'(t),\n\\]\nthe dot being scalar multiplication by the derivative of the univariate function \\(s\\).\nVector-valued functions do not have multiplication or division defined for them, so there are no ready analogues of the product and quotient rules. However, the dot product and the cross product produce new functions that may have derivative rules available.\nFor the dot product, the combination \\(\\vec{f}(t) \\cdot \\vec{g}(t)\\) is a univariate function of \\(t\\), so we know a derivative is well defined. Can it be represented in terms of the vector-valued functions? 
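It can; before deriving the formula, a quick numeric spot check is reassuring. (This is only a sketch, assuming the CalculusWithJulia setup in which the adjoint notation f' computes derivatives through ForwardDiff; fcheck and gcheck are hypothetical example functions, not ones used elsewhere.)\n\nfcheck(t) = [cos(t), sin(t)]\ngcheck(t) = [t, t^2]\nlhs = (t -> fcheck(t) ⋅ gcheck(t))'(1) # derivative of the dot product at t=1\nrhs = fcheck'(1) ⋅ gcheck(1) + fcheck(1) ⋅ gcheck'(1) # candidate product rule\nlhs ≈ rhs # expected to agree up to roundoff\n\n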
In terms of the component functions, we have this calculation specific to \\(n=2\\), but one which can be generalized:\n\\[\n\\begin{align*}\n\\frac{d}{dt}(\\vec{f}(t) \\cdot \\vec{g}(t)) &=\n\\frac{d}{dt}(f_1(t) g_1(t) + f_2(t) g_2(t))\\\\\n&= f_1'(t) g_1(t) + f_1(t) g_1'(t) + f_2'(t) g_2(t) + f_2(t) g_2'(t)\\\\\n&= f_1'(t) g_1(t) + f_2'(t) g_2(t) + f_1(t) g_1'(t) + f_2(t) g_2'(t)\\\\\n&= \\vec{f}'(t)\\cdot \\vec{g}(t) + \\vec{f}(t) \\cdot \\vec{g}'(t).\n\\end{align*}\n\\]\nThis suggests that a product-rule-like formula applies for dot products.\nFor the cross product, we let SymPy derive a formula for us.\n\n@syms tₛ us()[1:3] vs()[1:3]\nuₛ = tₛ .|> us # evaluate each of us at t\nvₛ = tₛ .|> vs\n\n3-element Vector{Sym}:\n vs₁(tₛ)\n vs₂(tₛ)\n vs₃(tₛ)\n\n\nThen the cross product has a derivative:\n\ndiff.(uₛ × vₛ, tₛ)\n\n3-element Vector{Sym}:\n us₂(tₛ)*Derivative(vs₃(tₛ), tₛ) - us₃(tₛ)*Derivative(vs₂(tₛ), tₛ) - vs₂(tₛ)*Derivative(us₃(tₛ), tₛ) + vs₃(tₛ)*Derivative(us₂(tₛ), tₛ)\n -us₁(tₛ)*Derivative(vs₃(tₛ), tₛ) + us₃(tₛ)*Derivative(vs₁(tₛ), tₛ) + vs₁(tₛ)*Derivative(us₃(tₛ), tₛ) - vs₃(tₛ)*Derivative(us₁(tₛ), tₛ)\n us₁(tₛ)*Derivative(vs₂(tₛ), tₛ) - us₂(tₛ)*Derivative(vs₁(tₛ), tₛ) - vs₁(tₛ)*Derivative(us₂(tₛ), tₛ) + vs₂(tₛ)*Derivative(us₁(tₛ), tₛ)\n\n\nAdmittedly, that isnt very clear. With a peek at the answer, we show that the derivative is the same as the product rule would suggest (\\(\\vec{u}' \\times \\vec{v} + \\vec{u} \\times \\vec{v}'\\)):\n\ndiff.(uₛ × vₛ, tₛ) - (diff.(uₛ, tₛ) × vₛ + uₛ × diff.(vₛ, tₛ))\n\n3-element Vector{Sym}:\n 0\n 0\n 0\n\n\nIn summary, these two derivative formulas hold for vector-valued functions \\(R \\rightarrow R^n\\):\n\\[\n\\begin{align}\n(\\vec{u} \\cdot \\vec{v})' &= \\vec{u}' \\cdot \\vec{v} + \\vec{u} \\cdot \\vec{v}',\\\\\n(\\vec{u} \\times \\vec{v})' &= \\vec{u}' \\times \\vec{v} + \\vec{u} \\times \\vec{v}'.\n\\end{align}\n\\]\n\nApplication. 
Circular motion and the tangent vector.\nThe parameterization \\(\\vec{r}(t) = \\langle \\cos(t), \\sin(t) \\rangle\\) describes a circle. Characteristic of this motion is a constant radius, or in terms of a norm: \\(\\| \\vec{r}(t) \\| = c\\). The norm squared, can be expressed in terms of the dot product:\n\\[\n\\| \\vec{r}(t) \\|^2 = \\vec{r}(t) \\cdot \\vec{r}(t).\n\\]\nDifferentiating this for the case of a constant radius yields the equation \\(0 = [\\vec{r}\\cdot\\vec{r}]'(t)\\), which simplifies through the product rule and commutativity of the dot product to \\(0 = 2 \\vec{r}(t) \\cdot \\vec{r}'(t)\\). That is, the two vectors are orthogonal to each other. This observation proves to be very useful, as will be seen.\n\n\nExample: Keplers laws\nKeplers laws of planetary motion are summarized by:\n\nThe orbit of a planet is an ellipse with the Sun at one of the two foci.\nA line segment joining a planet and the Sun sweeps out equal areas during equal intervals of time.\nThe square of the orbital period of a planet is directly proportional to the cube of the semi-major axis of its orbit.\n\nKepler was a careful astronomer, and derived these laws empirically. We show next how to derive these laws using vector calculus assuming some facts on Newtonian motion, as postulated by Newton. This approach is borrowed from Joyce.\nWe adopt a sun-centered view of the universe, placing the sun at the origin and letting \\(\\vec{x}(t)\\) be the position of a planet relative to this origin. We can express this in terms of a magnitude and direction through \\(r(t) \\hat{x}(t)\\).\nNewtons law of gravitational force between the sun and this planet is then expressed by:\n\\[\n\\vec{F} = -\\frac{G M m}{r^2} \\hat{x}(t).\n\\]\nNewtons famous law relating force and acceleration is\n\\[\n\\vec{F} = m \\vec{a} = m \\ddot{\\vec{x}}.\n\\]\nCombining, Newton states \\(\\vec{a} = -(GM/r^2) \\hat{x}\\).\nNow to show the first law. Consider \\(\\vec{x} \\times \\vec{v}\\). 
It is constant, as:\n\\[\n\\begin{align}\n(\\vec{x} \\times \\vec{v})' &= \\vec{x}' \\times \\vec{v} + \\vec{x} \\times \\vec{v}'\\\\\n&= \\vec{v} \\times \\vec{v} + \\vec{x} \\times \\vec{a}.\n\\end{align}\n\\]\nBoth terms are \\(\\vec{0}\\), as \\(\\vec{a}\\) is parallel to \\(\\vec{x}\\) by the above, and clearly \\(\\vec{v}\\) is parallel to itself.\nThis says \\(\\vec{x} \\times \\vec{v} = \\vec{c}\\) is a constant vector, meaning the motion of \\(\\vec{x}\\) must lie in a plane, as \\(\\vec{x}\\) is always orthogonal to the fixed vector \\(\\vec{c}\\).\nNow, by differentiating \\(\\vec{x} = r \\hat{x}\\) we have:\n\\[\n\\begin{align}\n\\vec{v} &= \\vec{x}'\\\\\n&= (r\\hat{x})'\\\\\n&= r' \\hat{x} + r \\hat{x}',\n\\end{align}\n\\]\nand so\n\\[\n\\begin{align}\n\\vec{c} &= \\vec{x} \\times \\vec{v}\\\\\n&= (r\\hat{x}) \\times (r'\\hat{x} + r \\hat{x}')\\\\\n&= r^2 (\\hat{x} \\times \\hat{x}').\n\\end{align}\n\\]\nFrom this, we can compute \\(\\vec{a} \\times \\vec{c}\\):\n\\[\n\\begin{align}\n\\vec{a} \\times \\vec{c} &= (-\\frac{GM}{r^2})\\hat{x} \\times r^2(\\hat{x} \\times \\hat{x}')\\\\\n&= -GM \\hat{x} \\times (\\hat{x} \\times \\hat{x}') \\\\\n&= GM (\\hat{x} \\times \\hat{x}')\\times \\hat{x}.\n\\end{align}\n\\]\nThe last line follows from anti-commutativity.\nBut the triple cross product can be simplified through the identity \\((\\vec{u}\\times\\vec{v})\\times\\vec{w} = (\\vec{u}\\cdot\\vec{w})\\vec{v} - (\\vec{v}\\cdot\\vec{w})\\vec{u}\\). 
So, the above becomes:\n\\[\n\\begin{align}\n\\vec{a} \\times \\vec{c} &= GM ((\\hat{x}\\cdot\\hat{x})\\hat{x}' - (\\hat{x} \\cdot \\hat{x}')\\hat{x})\\\\\n&= GM (1 \\hat{x}' - 0 \\hat{x}).\n\\end{align}\n\\]\nNow, since \\(\\vec{c}\\) is constant, we have:\n\\[\n\\begin{align}\n(\\vec{v} \\times \\vec{c})' &= (\\vec{a} \\times \\vec{c})\\\\\n&= GM \\hat{x}'\\\\\n&= (GM\\hat{x})'.\n\\end{align}\n\\]\nThe two sides have the same derivative, hence differ by a constant:\n\\[\n\\vec{v} \\times \\vec{c} = GM \\hat{x} + \\vec{d}.\n\\]\nAs \\(\\hat{x}\\) and \\(\\vec{v}\\times\\vec{c}\\) lie in the same plane - orthogonal to \\(\\vec{c}\\) - so does \\(\\vec{d}\\). With a suitable re-orientation, so that \\(\\vec{d}\\) is along the \\(x\\) axis, \\(\\vec{c}\\) is along the \\(z\\)-axis, then we have \\(\\vec{c} = \\langle 0,0,c\\rangle\\) and \\(\\vec{d} = \\langle d ,0,0 \\rangle\\), and \\(\\vec{x} = \\langle x, y, 0 \\rangle\\). Set \\(\\theta\\) to be the angle \\(\\hat{x}\\) makes with the \\(x\\) axis; then \\(\\hat{x} = \\langle \\cos(\\theta), \\sin(\\theta), 0\\rangle\\).\nNow\n\\[\n\\begin{align}\nc^2 &= \\|\\vec{c}\\|^2 \\\\\n&= \\vec{c} \\cdot \\vec{c}\\\\\n&= (\\vec{x} \\times \\vec{v}) \\cdot \\vec{c}\\\\\n&= \\vec{x} \\cdot (\\vec{v} \\times \\vec{c})\\\\\n&= r\\hat{x} \\cdot (GM\\hat{x} + \\vec{d})\\\\\n&= GMr + r \\hat{x} \\cdot \\vec{d}\\\\\n&= GMr + rd \\cos(\\theta).\n\\end{align}\n\\]\nSolving, this gives the first law. That is, the radial distance is in the form of an ellipse:\n\\[\nr = \\frac{c^2}{GM + d\\cos(\\theta)} =\n\\frac{c^2/(GM)}{1 + (d/GM) \\cos(\\theta)}.\n\\]\n\nKeplers second law can also be derived from vector calculus. This derivation follows that given at MIT OpenCourseWare and OpenCourseWare.\nThe second law states that the area swept out during a time duration depends only on the duration, not on the starting time. Let \\(\\Delta t\\) be this duration. 
Then if \\(\\vec{x}(t)\\) is the position vector, as above, the area swept out between \\(t\\) and \\(t + \\Delta t\\) can be visualized along the lines of:\n\nx1(t) = [cos(t), 2 * sin(t)]\nt0, t1, Delta = 1.0, 2.0, 1/10\nplot_parametric(0..pi/2, x1)\n\narrow!([0,0], x1(t0)); arrow!([0,0], x1(t0 + Delta))\narrow!(x1(t0), x1(t0+Delta)- x1(t0), linewidth=5)\n\n\n\n\nThe area swept out is basically half the area of the parallelogram formed by \\(\\vec{x}(t)\\) and \\(\\Delta \\vec{x}(t) = \\vec{x}(t + \\Delta t) - \\vec{x}(t)\\). This area is \\((1/2) \\|\\vec{x} \\times \\Delta\\vec{x}(t)\\|\\).\nIf we divide through by \\(\\Delta t\\), and take a limit we have:\n\\[\n\\frac{dA}{dt} = \\frac{1}{2}\\|\\lim_{\\Delta t \\rightarrow 0} (\\vec{x} \\times \\frac{\\vec{x}(t + \\Delta t) - \\vec{x}(t)}{\\Delta t})\\| =\n\\frac{1}{2}\\|\\vec{x} \\times \\vec{v}\\|.\n\\]\nBut we saw above that for the motion of a planet \\(\\vec{x} \\times \\vec{v} = \\vec{c}\\), a constant. This says \\(dA/dt\\) is a constant independent of \\(t\\), and consequently, the area swept out over a duration of time will not depend on the particular times involved, just the duration.\n\nThe third law relates the period to a parameter of the ellipse. We have from the above a strong suggestion that the area of the ellipse can be found by integrating \\(dA/dt\\) over the period, say \\(T\\). Assuming that is the case and letting \\(a\\) be the semi-major axis length, and \\(b\\) the semi-minor axis length, then\n\\[\n\\pi a b = \\int_0^T \\frac{dA}{dt} dt = \\int_0^T \\frac{1}{2} \\|\\vec{x} \\times \\vec{v}\\| dt = \\| \\vec{x} \\times \\vec{v}\\| \\frac{T}{2}.\n\\]\nAs \\(c = \\|\\vec{x} \\times \\vec{v}\\|\\) is a constant, this allows us to express \\(c\\) by: \\(2\\pi a b/T\\).\nBut, we have\n\\[\nr(\\theta) = \\frac{c^2}{GM + d\\cos(\\theta)} = \\frac{c^2/(GM)}{1 + d/(GM) \\cos(\\theta)}.\n\\]\nSo, \\(e = d/(GM)\\) and \\(a (1 - e^2) = c^2/(GM)\\). 
Using \\(b = a \\sqrt{1-e^2}\\) we have:\n\\[\na(1-e^2) = c^2/(GM) = (\\frac{2\\pi a b}{T})^2 \\frac{1}{GM} =\n\\frac{(2\\pi)^2}{GM} \\frac{a^2 (a^2(1-e^2))}{T^2},\n\\]\nor after cancelling \\((1-e^2)\\) from each side:\n\\[\nT^2 = \\frac{(2\\pi)^2}{GM} \\frac{a^4}{a} = \\frac{(2\\pi)^2}{GM} a^3.\n\\]\n\nThe above shows how Newton might have derived Keplers observational facts. Next we show that, assuming Keplers laws, one can anticipate Newtons equation for gravitational force. This follows Wikipedia.\nNow let \\(\\vec{r}(t)\\) be the position of the planet relative to the Sun at the origin, in two dimensions (we used \\(\\vec{x}(t)\\) above). Assume \\(\\vec{r}(0)\\) points in the \\(x\\) direction. Write \\(\\vec{r} = r \\hat{r}\\). Define \\(\\theta(t)\\) to be the angle \\(\\hat{r}(t)\\) makes with the \\(x\\) axis, and \\(\\hat{\\theta}\\) the unit vector orthogonal to \\(\\hat{r}\\), obtained by rotating \\(\\hat{r}\\) a quarter turn counterclockwise.\nThen we express the velocity (\\(\\dot{\\vec{r}}\\)) and acceleration (\\(\\ddot{\\vec{r}}\\)) in terms of the orthogonal vectors \\(\\hat{r}\\) and \\(\\hat{\\theta}\\), as follows:\n\\[\n\\frac{d}{dt}(r \\hat{r}) = \\dot{r} \\hat{r} + r \\dot{\\hat{r}} = \\dot{r} \\hat{r} + r \\dot{\\theta}\\hat{\\theta}.\n\\]\nThe last equality from expressing \\(\\hat{r}(t) = \\hat{r}(\\theta(t))\\) and using the chain rule, noting \\(d(\\hat{r}(\\theta))/d\\theta = \\hat{\\theta}\\).\nContinuing,\n\\[\n\\frac{d^2}{dt^2}(r \\hat{r}) =\n(\\ddot{r} \\hat{r} + \\dot{r} \\dot{\\hat{r}}) +\n(\\dot{r} \\dot{\\theta}\\hat{\\theta} + r \\ddot{\\theta}\\hat{\\theta} + r \\dot{\\theta}\\dot{\\hat{\\theta}}).\n\\]\nNoting, similar to above, \\(\\dot{\\hat{\\theta}} = d\\hat{\\theta}/dt = d\\hat{\\theta}/d\\theta \\cdot d\\theta/dt = -\\dot{\\theta} \\hat{r}\\) we can express the above in terms of \\(\\hat{r}\\) and \\(\\hat{\\theta}\\) as:\n\\[\n\\vec{a} = \\frac{d^2}{dt^2}(r \\hat{r}) = (\\ddot{r} - r (\\dot{\\theta})^2) \\hat{r} + (r\\ddot{\\theta} + 2\\dot{r}\\dot{\\theta}) \\hat{\\theta}.\n\\]\nThat is, in general, the 
acceleration has a radial component and a transversal component.\nKeplers second law says that the area increment over time, \\(dA/dt\\), is constant; this area increment is approximated by the following wedge in polar coordinates: \\(dA = (1/2) r \\cdot rd\\theta\\). We have then \\(dA/dt = (1/2) r^2 \\dot{\\theta}\\), so \\(r^2 \\dot{\\theta}\\) is constant.\nDifferentiating, we have:\n\\[\n0 = \\frac{d(r^2 \\dot{\\theta})}{dt} = 2r\\dot{r}\\dot{\\theta} + r^2 \\ddot{\\theta},\n\\]\nwhich is the transversal component of the acceleration times \\(r\\), as decomposed above. This means that the acceleration of the planet is completely towards the Sun at the origin.\nKeplers first law relates \\(r\\) and \\(\\theta\\) through the polar equation of an ellipse:\n\\[\nr = \\frac{p}{1 + \\epsilon \\cos(\\theta)}.\n\\]\nExpressing in terms of \\(p/r\\) and differentiating in \\(t\\) gives:\n\\[\n-\\frac{p \\dot{r}}{r^2} = -\\epsilon\\sin(\\theta) \\dot{\\theta}.\n\\]\nOr\n\\[\np\\dot{r} = \\epsilon\\sin(\\theta) r^2 \\dot{\\theta} = \\epsilon \\sin(\\theta) C,\n\\]\nfor a constant \\(C\\), as the second law implies \\(r^2 \\dot{\\theta}\\) is constant. (This constant can be expressed in terms of parameters describing the ellipse.)\nDifferentiating again in \\(t\\) gives:\n\\[\np \\ddot{r} = C\\epsilon \\cos(\\theta) \\dot{\\theta} = C\\epsilon \\cos(\\theta)\\frac{C}{r^2}.\n\\]\nSo \\(\\ddot{r} = (C^2 \\epsilon / p) \\cos{\\theta} (1/r^2)\\).\nThe radial acceleration from above is:\n\\[\n\\ddot{r} - r (\\dot{\\theta})^2 =\n(C^2 \\epsilon/p) \\cos{\\theta} \\frac{1}{r^2} - r\\frac{C^2}{r^4} = \\frac{C^2}{pr^2}(\\epsilon \\cos(\\theta) - \\frac{p}{r}).\n\\]\nUsing \\(p/r = 1 + \\epsilon\\cos(\\theta)\\), we have the radial acceleration is \\(-C^2/p \\cdot (1/r^2)\\). That is, the acceleration is proportional to the inverse square of the distance and directed towards the origin; using the relation between force and acceleration, we see the force on the planet follows the inverse-square law of Newton."
},
{
"objectID": "differentiable_vector_calculus/vector_valued_functions.html#moving-frames-of-reference",
"href": "differentiable_vector_calculus/vector_valued_functions.html#moving-frames-of-reference",
"title": "54  Vector-valued functions, \\(f:R \\rightarrow R^n\\)",
"section": "54.6 Moving frames of reference",
"text": "54.6 Moving frames of reference\nIn the last example, it proved useful to represent vectors in terms of other unit vectors, in that case \\(\\hat{r}\\) and \\(\\hat{\\theta}\\). Here we discuss a coordinate system defined intrinsically by the motion along the trajectory of a curve.\nLet \\(\\vec{r}(t)\\) be a smooth vector-valued function in \\(R^3\\). It gives rise to a space curve, through its graph. This curve has tangent vector \\(\\vec{r}'(t)\\), indicating the direction of travel along \\(\\vec{r}\\) as \\(t\\) increases. The length of \\(\\vec{r}'(t)\\) depends on the parameterization of \\(\\vec{r}\\), as for any increasing, differentiable function \\(s(t)\\), the composition \\(\\vec{r}(s(t))\\) will have derivative \\(\\vec{r}'(s(t)) s'(t)\\), having the same direction as \\(\\vec{r}'(t)\\) (at suitably calibrated points), but not the same magnitude, the factor \\(s'(t)\\) being involved.\nTo discuss properties intrinsic to the curve, the unit vector is considered:\n\\[\n\\hat{T}(t) = \\frac{\\vec{r}'(t)}{\\|\\vec{r}'(t)\\|}.\n\\]\nThe function \\(\\hat{T}(t)\\) is the unit tangent vector. An assumption of regularity ensures the denominator is never \\(0\\).\nNow define the unit normal, \\(\\hat{N}(t)\\), by:\n\\[\n\\hat{N}(t) = \\frac{\\hat{T}'(t)}{\\| \\hat{T}'(t) \\|}.\n\\]\nSince \\(\\|\\hat{T}(t)\\| = 1\\), a constant, it must be that \\(\\hat{T}'(t) \\cdot \\hat{T}(t) = 0\\), that is, \\(\\hat{N}\\) and \\(\\hat{T}\\) are orthogonal.\nFinally, define the binormal, \\(\\hat{B}(t) = \\hat{T}(t) \\times \\hat{N}(t)\\). At each time \\(t\\), the three unit vectors are orthogonal to each other. 
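This mutual orthogonality can be spot checked numerically at a point. (A sketch, assuming the CalculusWithJulia setup for the adjoint notation and norm; rcheck is a hypothetical helix, and each dot product below should be 0 up to roundoff:)\n\nrcheck(t) = [cos(t), sin(t), t]\nTcheck(t) = rcheck'(t)/norm(rcheck'(t))\nNcheck(t) = Tcheck'(t)/norm(Tcheck'(t))\nBcheck(t) = Tcheck(t) × Ncheck(t)\nTcheck(1) ⋅ Ncheck(1), Tcheck(1) ⋅ Bcheck(1), Ncheck(1) ⋅ Bcheck(1)\n\n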
These three vectors form a moving coordinate system for the motion along the curve that does not depend on the parameterization.\nWe can visualize this, for example along a Viviani curve, as is done in a Wikipedia animation:\n\nfunction viviani(t, a=1)\n [a*(1-cos(t)), a*sin(t), 2a*sin(t/2)]\nend\n\n\nTangent(t) = viviani'(t)/norm(viviani'(t))\nNormal(t) = Tangent'(t)/norm(Tangent'(t))\nBinormal(t) = Tangent(t) × Normal(t)\n\np = plot(legend=false)\nplot_parametric!(-2pi..2pi, viviani)\n\nt0, t1 = -pi/3, pi/2 + 2pi/5\nr0, r1 = viviani(t0), viviani(t1)\narrow!(r0, Tangent(t0)); arrow!(r0, Binormal(t0)); arrow!(r0, Normal(t0))\narrow!(r1, Tangent(t1)); arrow!(r1, Binormal(t1)); arrow!(r1, Normal(t1))\np\n\n\n\n\n\nThe curvature of a \\(3\\)-dimensional space curve is defined by:\n\nThe curvature: For a \\(3-D\\) curve the curvature is defined by:\n\\(\\kappa = \\frac{\\| r'(t) \\times r''(t) \\|}{\\| r'(t) \\|^3}.\\)\n\nFor \\(2\\)-dimensional space curves, the same formula applies after embedding a \\(0\\) third component. It can also be expressed directly as\n\\[\n\\kappa = (x'y''-x''y')/\\|r'\\|^3. \\quad (r(t) =\\langle x(t), y(t) \\rangle)\n\\]\nCurvature can also be defined as the derivative of the tangent vector, \\(\\hat{T}\\), when the curve is parameterized by arc length, a topic still to be taken up. The vector \\(\\vec{r}'(t)\\) is the direction of motion, whereas \\(\\vec{r}''(t)\\) indicates how fast and in what direction this is changing. For curves with little curve in them, the two will be nearly parallel and the cross product small (reflecting the presence of \\(\\sin(\\theta)\\) in its magnitude). For “curvy” curves, \\(\\vec{r}''\\) will have a sizeable component orthogonal to \\(\\vec{r}'\\), so the \\(\\sin(\\theta)\\) term in the cross product will be closer to \\(1\\).\nLet \\(\\vec{r}(t) = k \\cdot \\langle \\cos(t), \\sin(t), 0 \\rangle\\). 
This will have curvature:\n\n@syms k::positive t::real\nr1 = k * [cos(t), sin(t), 0]\nnorm(diff.(r1,t) × diff.(r1,t,t)) / norm(diff.(r1,t))^3 |> simplify\n\n \n\\[\n\\frac{1}{k}\n\\]\n\n\n\nFor larger circles (bigger \\(k\\)) there is less curvature. The limiting case is a line, with curvature \\(0\\).\nIf a curve is imagined to have a tangent “circle” (second order Taylor series approximation), then the curvature of that circle matches the curvature of the curve.\nThe torsion, \\(\\tau\\), of a space curve (\\(n=3\\)), is a measure of how sharply the curve is twisting out of the plane of curvature.\nThe torsion is defined for smooth curves by\n\nThe torsion:\n\\(\\tau = \\frac{(\\vec{r}' \\times \\vec{r}'') \\cdot \\vec{r}'''}{\\|\\vec{r}' \\times \\vec{r}''\\|^2}.\\)\n\nFor the torsion to be defined, the cross product \\(\\vec{r}' \\times \\vec{r}''\\) must be non-zero; that is, the two must not be parallel or zero.\n\nExample: Tubular surface\nThis last example comes from a collection of several examples provided by Discourse user @empet to illustrate plotlyjs. We adapt it to Plots with some minor changes below.\nThe task is to illustrate a space curve, \\(c(t)\\), using a tubular surface. At each time point \\(t\\), assume the curve has tangent, \\(e_1\\); normal, \\(e_2\\); and binormal, \\(e_3\\). (This assumes the defining derivatives exist and are non-zero and the cross product in the torsion is non-zero.) The tubular surface is swept out by a circle of radius \\(\\epsilon\\) in the plane determined by the normal and binormal; it is parameterized by \\(r(t,u) = c(t) + \\epsilon (e_2(t) \\cdot \\cos(u) + e_3(t) \\cdot \\sin(u))\\) for varying \\(u\\).\nThe Frenet-Serret equations set up a system of differential equations driven by the curvature and torsion. We use the DifferentialEquations package to solve this equation for two specific functions and a given initial condition. 
The equations when expanded into coordinates become \\(12\\) different equations:\n\n# e₁, e₂, e₃, (x,y,z)\nfunction Frenet_eq!(du, u, p, s) #system of ODEs\n κ, τ = p\n du[1] = κ(s) * u[4] # e₁ = κ ⋅ e₂\n du[2] = κ(s) * u[5]\n du[3] = κ(s) * u[6]\n du[4] = -κ(s) * u[1] + τ(s) * u[7] # e₂ = - κ ⋅ e₁ + τ ⋅ e₃\n du[5] = -κ(s) * u[2] + τ(s) * u[8]\n du[6] = -κ(s) * u[3] + τ(s) * u[9]\n du[7] = -τ(s) * u[4] # e₃ = - τ ⋅ e₂\n du[8] = -τ(s) * u[5]\n du[9] = -τ(s) * u[6]\n du[10] = u[1] # c = e₁\n du[11] = u[2]\n du[12] = u[3]\nend\n\nFrenet_eq! (generic function with 1 method)\n\n\nThe last set of equations describe the motion of the spine. It follows from specifying the tangent to the curve is \\(e_1\\), as desired; it is parameterized by arc length, as \\(\\mid c'(t) \\mid = 1\\).\nFollowing the example of @empet, we define a curvature function and torsion function, the latter a constant:\n\nκ(s) = 3 * sin(s/10) * sin(s/10)\nτ(s) = 0.35\n\nτ (generic function with 1 method)\n\n\nThe initial condition and time span are set with:\n\ne₁₀, e₂₀, e₃₀ = [1,0,0], [0,1,0], [0,0,1]\nu₀ = [0, 0, 0]\nu0 = vcat(e₁₀, e₂₀, e₃₀, u₀) # initial condition for the system of ODE\nt_span = (0.0, 150.0) # time interval for solution\n\n(0.0, 150.0)\n\n\nWith this set up, the problem can be solved:\nprob = ODEProblem(Frenet_eq!, u0, t_span, (κ, τ))\nsol = solve(prob, Tsit5());\nThe “spine” is the center axis of the tube and is the \\(10\\)th, \\(11\\)th, and \\(12\\)th coordinates:\n\nspine(t) = sol(t)[10:12]\n\nspine (generic function with 1 method)\n\n\nThe tangent, normal, and binormal can be similarly defined using the other \\(9\\) indices:\n\ne₁(t) = sol(t)[1:3]\ne₂(t) = sol(t)[4:6]\ne₃(t) = sol(t)[7:9]\n\ne₃ (generic function with 1 method)\n\n\nWe fix a small time range and show the trace of the spine and the frame at a single point in time:\n\na_0, b_0 = 50, 60\nts_0 = range(a_0, b_0, length=251)\n\nt_0 = (a_0 + b_0) / 2\nϵ = 1/5\n\nplot_parametric(a_0..b_0, 
spine)\n\narrow!(spine(t_0), e₁(t_0))\narrow!(spine(t_0), e₂(t_0))\narrow!(spine(t_0), e₃(t_0))\n\nr_0(t, θ) = spine(t) + ϵ * (e₂(t)*cos(θ) + e₃(t)*sin(θ))\nplot_parametric!(0..2pi, θ -> r_0(t_0, θ))\n\n\n\n\nThe ϵ value determines the radius of the tube; we see it above as the radius of the drawn circle. The function r for a fixed t traces out such a circle centered at a point on the spine. For a fixed θ, the function r describes a line on the surface of the tube paralleling the spine.\nThe tubular surface is now ready to be rendered along the entire time span using a pattern for parametrically defined surfaces:\n\nts = range(t_span..., length=1001)\nθs = range(0, 2pi, length=100)\nsurface(unzip(r_0.(ts, θs'))...)"
},
{
"objectID": "differentiable_vector_calculus/vector_valued_functions.html#arc-length",
"href": "differentiable_vector_calculus/vector_valued_functions.html#arc-length",
"title": "54  Vector-valued functions, \\(f:R \\rightarrow R^n\\)",
"section": "54.7 Arc length",
"text": "54.7 Arc length\nIn Arc length there is a discussion of how to find the arc length of a parameterized curve in \\(2\\) dimensions. The general case is discussed by Destafano who shows:\n\nArc-length: if a curve \\(C\\) is parameterized by a smooth function \\(\\vec{r}(t)\\) over an interval \\(I\\), then the arc length of \\(C\\) is:\n\\[\n\\int_I \\| \\vec{r}'(t) \\| dt.\n\\]\n\nIf we associate \\(\\vec{r}'(t)\\) with the velocity, then this is the integral of the speed (the magnitude of the velocity).\nLet \\(I=[a,b]\\) and \\(s(t): [v,w] \\rightarrow [a,b]\\) such that \\(s\\) is increasing and differentiable. Then \\(\\vec{\\phi} = \\vec{r} \\circ s\\) will have\n\\[\n\\text{arc length} =\n\\int_v^w \\| \\vec{\\phi}'(t)\\| dt =\n\\int_v^w \\| \\vec{r}'(s(t))\\| s'(t) dt =\n\\int_a^b \\| \\vec{r}'(u) \\| du,\n\\]\nby a change of variable \\(u=s(t)\\). As such, the arc length is a property of the curve and not the parameterization of the curve.\nFor some parameterization, we can define\n\\[\ns(t) = \\int_0^t \\| \\vec{r}'(u) \\| du.\n\\]\nThen by the fundamental theorem of calculus, \\(s(t)\\) is non-decreasing. If \\(\\vec{r}'\\) is assumed to be non-zero and continuous (regular), then \\(s(t)\\) has a derivative and an inverse which is monotonic. Using the inverse function \\(s^{-1}\\) to change variables (\\(\\vec{\\phi} = \\vec{r} \\circ s^{-1}\\)) gives\n\\[\n\\int_0^c \\| \\vec{\\phi}'(t) \\| dt =\n\\int_{s^{-1}(0)}^{s^{-1}(c)} \\| \\vec{r}'(u) \\| du =\ns(s^{-1}(c)) - s(s^{-1}(0)) =\nc.\n\\]\nThat is, the arc length from \\([0,c]\\) for \\(\\vec{\\phi}\\) is just \\(c\\); the curve \\(C\\) is parameterized by arc length.\n\nExample\nVivianis curve is the intersection of a sphere of radius \\(a\\) with a cylinder of radius \\(a\\). A parameterization was given previously by:\n\nfunction viviani(t, a=1)\n [a*(1-cos(t)), a*sin(t), 2a*sin(t/2)]\nend\n\nviviani (generic function with 2 methods)\n\n\nThe curve is traced out over the interval \\([0, 4\\pi]\\). 
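Before any symbolic work, the arc length can be estimated by summing the lengths of many small secant segments along the curve. (A sketch; norm comes from LinearAlgebra, which the CalculusWithJulia setup loads:)\n\nts = range(0, 4pi, length=10_001)\npts = viviani.(ts)\nsum(norm(q - p) for (p, q) in zip(pts, pts[2:end])) # ≈ 15.2808\n\n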
We try to find the arc-length:\n\n@syms t::positive a::positive\nspeed = simplify(norm(diff.(viviani(t, a), t)))\nintegrate(speed, (t, 0, 4*PI))\n\n \n\\[\n\\frac{\\sqrt{2} a \\int\\limits_{0}^{4 \\pi} \\sqrt{\\cos{\\left(t \\right)} + 3}\\, dt}{2}\n\\]\n\n\n\nWe see that the answer depends linearly on \\(a\\), but otherwise is a constant expressed as an integral. We use QuadGK to provide a numeric answer for the case \\(a=1\\):\n\nquadgk(t -> norm(viviani'(t)), 0, 4pi)\n\n(15.280791156110851, 3.750801136348514e-8)\n\n\n\n\nExample\nVery few parameterized curves admit a closed-form expression for parameterization by arc-length. Lets consider the helix expressed by \\(\\langle a\\cos(t), a\\sin(t), bt\\rangle\\), as this does allow such a parameterization.\n\n@syms aₕ::positive bₕ::positive tₕ::positive alₕ::positive\nhelix = [aₕ * cos(tₕ), aₕ * sin(tₕ), bₕ * tₕ]\nspeed = simplify( norm(diff.(helix, tₕ)) )\ns = integrate(speed, (tₕ, 0, alₕ))\n\n \n\\[\nalₕ \\sqrt{aₕ^{2} + bₕ^{2}}\n\\]\n\n\n\nSo s is a linear function. We can re-parameterize by:\n\neqnₕ = subs.(helix, tₕ => alₕ/sqrt(aₕ^2 + bₕ^2))\n\n3-element Vector{Sym}:\n aₕ*cos(alₕ/sqrt(aₕ^2 + bₕ^2))\n aₕ*sin(alₕ/sqrt(aₕ^2 + bₕ^2))\n alₕ*bₕ/sqrt(aₕ^2 + bₕ^2)\n\n\nTo see that the speed, \\(\\| \\vec{\\phi}' \\|\\), is constantly \\(1\\):\n\nsimplify(norm(diff.(eqnₕ, alₕ)))\n\n \n\\[\n1\n\\]\n\n\n\nFrom this, we have the arc length is:\n\\[\n\\int_0^t \\| \\vec{\\phi}'(u) \\| du = \\int_0^t 1 du = t.\n\\]\n\nParameterizing by arc-length is only explicitly possible for a few examples; however, knowing it can be done in theory is important. Some formulas are simplified, such as the tangent, normal, and binormal. 
Let \\(\\vec{r}(s)\\) be parameterized by arc length, then:\n\\[\n\\hat{T}(s)= \\vec{r}'(s) / \\| \\vec{r}'(s) \\| = \\vec{r}'(s),\\quad\n\\hat{N}(s) = \\hat{T}'(s) / \\| \\hat{T}'(s)\\| = \\hat{T}'(s)/\\kappa,\\quad\n\\hat{B} = \\hat{T} \\times \\hat{N},\n\\]\nAs before, but further, we have if \\(\\kappa\\) is the curvature and \\(\\tau\\) the torsion, these relationships expressing the derivatives with respect to \\(s\\) in terms of the components in the frame:\n\\[\n\\begin{align*}\n\\hat{T}'(s) &= &\\kappa \\hat{N}(s) &\\\\\n\\hat{N}'(s) &= -\\kappa \\hat{T}(s) & &+ \\tau \\hat{B}(s)\\\\\n\\hat{B}'(s) &= &-\\tau \\hat{N}(s) &\n\\end{align*}\n\\]\nThese are the Frenet-Serret formulas.\n\n\nExample\nContinuing with our parameterization of a helix by arc length, we can compute the curvature and torsion by differentiation:\n\ngammaₕ = subs.(helix, tₕ => alₕ/sqrt(aₕ^2 + bₕ^2)) # gamma parameterized by arc length\n@syms uₕ::positive\ngammaₕ₁ = subs.(gammaₕ, alₕ .=> uₕ) # u is arc-length parameterization\n\n3-element Vector{Sym}:\n aₕ*cos(uₕ/sqrt(aₕ^2 + bₕ^2))\n aₕ*sin(uₕ/sqrt(aₕ^2 + bₕ^2))\n bₕ*uₕ/sqrt(aₕ^2 + bₕ^2)\n\n\n\nTₕ = diff.(gammaₕ₁, uₕ)\nnorm(Tₕ) |> simplify\n\n \n\\[\n1\n\\]\n\n\n\nThe length is one, as the speed of a curve parameterized by arc-length is 1.\n\noutₕ = diff.(Tₕ, uₕ)\n\n3-element Vector{Sym}:\n -aₕ*cos(uₕ/sqrt(aₕ^2 + bₕ^2))/(aₕ^2 + bₕ^2)\n -aₕ*sin(uₕ/sqrt(aₕ^2 + bₕ^2))/(aₕ^2 + bₕ^2)\n 0\n\n\nThis should be \\(\\kappa \\hat{N}\\), so we do:\n\nκₕ = norm(outₕ) |> simplify\nNormₕ = outₕ / κₕ\nκₕ, Normₕ\n\n(aₕ/(aₕ^2 + bₕ^2), Sym[-cos(uₕ/sqrt(aₕ^2 + bₕ^2)), -sin(uₕ/sqrt(aₕ^2 + bₕ^2)), 0])\n\n\nInterpreting, \\(a\\) is the radius of the circle and \\(b\\) how tight the coils are. If \\(a\\) gets much larger than \\(b\\), then the curvature is like \\(1/a\\), just as with a circle. 
If \\(b\\) gets very big, then the trajectory looks more stretched out and the curvature gets smaller.\nTo find the torsion, we find, \\(\\hat{B}\\) then differentiate:\n\nBₕ = Tₕ × Normₕ\noutₕ₁ = diff.(Bₕ, uₕ)\nτₕ = norm(outₕ₁)\n\n \n\\[\n\\sqrt{\\frac{bₕ^{2} \\sin^{2}{\\left(\\frac{uₕ}{\\sqrt{aₕ^{2} + bₕ^{2}}} \\right)}}{\\left(aₕ^{2} + bₕ^{2}\\right)^{2}} + \\frac{bₕ^{2} \\cos^{2}{\\left(\\frac{uₕ}{\\sqrt{aₕ^{2} + bₕ^{2}}} \\right)}}{\\left(aₕ^{2} + bₕ^{2}\\right)^{2}}}\n\\]\n\n\n\nThis looks complicated, as does Norm:\n\nNormₕ\n\n3-element Vector{Sym}:\n -cos(uₕ/sqrt(aₕ^2 + bₕ^2))\n -sin(uₕ/sqrt(aₕ^2 + bₕ^2))\n 0\n\n\nHowever, the torsion, up to a sign, simplifies nicely:\n\nτₕ |> simplify\n\n \n\\[\n\\frac{bₕ}{aₕ^{2} + bₕ^{2}}\n\\]\n\n\n\nHere, when \\(b\\) gets large, the curve looks more and more “straight” and the torsion decreases. Similarly, if \\(a\\) gets big, the torsion decreases.\n\n\nExample\nLevi and Tabachnikov consider the trajectories of the front and rear bicycle wheels. Recall the notation previously used: \\(\\vec{F}(t)\\) for the front wheel, and \\(\\vec{B}(t)\\) for the rear wheel trajectories. Consider now their parameterization by arc length, using \\(u\\) for the arc-length parameter for \\(\\vec{F}\\) and \\(v\\) for \\(\\vec{B}\\). We define \\(\\alpha(u)\\) to be the steering angle of the bicycle. This can be found as the angle between the tangent vector of the path of \\(\\vec{F}\\) with the vector \\(\\vec{B} - \\vec{F}\\). Let \\(\\kappa\\) be the curvature of the front wheel and \\(k\\) the curvature of the back wheel.\n\n\n\n\n\nLevi and Tabachnikov prove in their Proposition 2.4:\n\\[\n\\begin{align*}\n\\kappa(u) &= \\frac{d\\alpha(u)}{du} + \\frac{\\sin(\\alpha(u))}{a},\\\\\n|\\frac{du}{dv}| &= |\\cos(\\alpha)|, \\quad \\text{and}\\\\\nk &= \\frac{\\tan(\\alpha)}{a}.\n\\end{align*}\n\\]\nThe first equation relates the steering angle with the curvature. 
If the steering angle is not changed (\\(d\\alpha/du=0\\)) then the curvature is constant and the motion is circular. It will be greater for larger angles (up to \\(\\pi/2\\)). As the curvature is the reciprocal of the radius, this means the radius of the circular trajectory will be smaller. For the same constant steering angle, the curvature will be smaller for longer wheelbases, meaning the circular trajectory will have a larger radius. For cars, which have similar dynamics, this means longer wheelbase cars will take more room to make a U-turn.\nThe second equation may be interpreted as a ratio of arc lengths. The infinitesimal arc length of the rear wheel is proportional to that of the front wheel, only scaled down by \\(\\cos(\\alpha)\\). When \\(\\alpha=0\\) - the bike is moving in a straight line - the two are the same. At the other extreme - when \\(\\alpha=\\pi/2\\) - the bike must be pivoting on its rear wheel and the rear wheel has no arc length. This cosine is related to the speed of the back wheel relative to the speed of the front wheel, which was used in the initial differential equation.\nThe last equation relates the curvature of the back wheel track to the steering angle of the front wheel. When \\(\\alpha=\\pm\\pi/2\\), the rear-wheel curvature, \\(k\\), is infinite, resulting in a cusp (no circle with non-zero radius will approximate the trajectory). This occurs when the front wheel is steered orthogonal to the direction of motion. As was seen in previous graphs of the trajectories, a cusp can happen for quite regular front wheel trajectories.\nTo derive the first equation, we have previously noted that when a curve is parameterized by arc length, the curvature is more directly computed: it is the magnitude of the derivative of the tangent vector. The tangent vector is of unit length when the curve is parameterized by arc length. This implies its derivative will be orthogonal. 
If \\(\\vec{r}(s)\\) is a parameterization by arc length, then the curvature formula simplifies as:\n\\[\n\\begin{align*}\n\\kappa(s) &= \\frac{\\| \\vec{r}'(s) \\times \\vec{r}''(s) \\|}{\\|\\vec{r}'(s)\\|^3} \\\\\n&= \\frac{\\| \\vec{r}'(s) \\times \\vec{r}''(s) \\|}{1} \\\\\n&= \\| \\vec{r}'(s) \\| \\| \\vec{r}''(s) \\| \\sin(\\theta) \\\\\n&= 1 \\| \\vec{r}''(s) \\| 1 = \\| \\vec{r}''(s) \\|.\n\\end{align*}\n\\]\nSo in the above, the curvature is \\(\\kappa = \\| \\vec{F}''(u) \\|\\) and \\(k = \\|\\vec{B}''(v)\\|\\).\nOn the figure, the tangent vector \\(\\vec{F}'(u)\\) is drawn, along with this unit vector rotated by \\(\\pi/2\\). We call these, for convenience, \\(\\vec{U}\\) and \\(\\vec{V}\\). We have \\(\\vec{U} = \\vec{F}'(u)\\) and \\(\\vec{V} = -(1/\\kappa) \\vec{F}''(u)\\).\nThe key decomposition is to express a unit vector in the direction of the line segment as the vector \\(\\vec{U}\\) rotated by the angle \\(\\alpha\\). Mathematically, this is usually expressed in matrix notation, but more explicitly by\n\\[\n\\langle \\cos(\\alpha) \\vec{U}_1 - \\sin(\\alpha) \\vec{U}_2,\n\\sin(\\alpha) \\vec{U}_1 + \\cos(\\alpha) \\vec{U}_2 \\rangle =\n\\vec{U} \\cos(\\alpha) - \\vec{V} \\sin(\\alpha).\n\\]\nWith this, the mathematical relationship between \\(F\\) and \\(B\\) is just a multiple of this unit vector:\n\\[\n\\vec{B}(u) = \\vec{F}(u) - a \\vec{U} \\cos(\\alpha) + a \\vec{V} \\sin(\\alpha).\n\\]\nIt must be that the tangent line of \\(\\vec{B}\\) is parallel to \\(\\vec{U} \\cos(\\alpha) - \\vec{V} \\sin(\\alpha)\\). To utilize this, we differentiate \\(\\vec{B}\\) using the facts that \\(\\vec{U}' = \\kappa \\vec{V}\\) and \\(\\vec{V}' = -\\kappa \\vec{U}\\). 
These come from \\(\\vec{U} = \\vec{F}'\\), so its derivative in \\(u\\) has magnitude equal to the curvature, \\(\\kappa\\), and direction orthogonal to \\(\\vec{U}\\).\n\\[\n\\begin{align}\n\\vec{B}'(u) &= \\vec{F}'(u)\n-a \\vec{U}' \\cos(\\alpha) -a \\vec{U} (-\\sin(\\alpha)) \\alpha'\n+a \\vec{V}' \\sin(\\alpha) + a \\vec{V} \\cos(\\alpha) \\alpha'\\\\\n& = \\vec{U}\n-a (\\kappa) \\vec{V} \\cos(\\alpha) + a \\vec{U} \\sin(\\alpha) \\alpha' +\na (-\\kappa) \\vec{U} \\sin(\\alpha) + a \\vec{V} \\cos(\\alpha) \\alpha' \\\\\n&= \\vec{U}\n+ a(\\alpha' - \\kappa) \\sin(\\alpha) \\vec{U}\n+ a(\\alpha' - \\kappa) \\cos(\\alpha)\\vec{V}.\n\\end{align}\n\\]\nExtend the \\(2\\)-dimensional vectors to \\(3\\) dimensions, by adding a zero \\(z\\) component, then:\n\\[\n\\begin{align}\n\\vec{0} &= (\\vec{U}\n+ a(\\alpha' - \\kappa) \\sin(\\alpha) \\vec{U}\n+ a(\\alpha' - \\kappa) \\cos(\\alpha)\\vec{V}) \\times\n(-\\vec{U} \\cos(\\alpha) + \\vec{V} \\sin(\\alpha)) \\\\\n&= (\\vec{U} \\times \\vec{V}) \\sin(\\alpha) +\na(\\alpha' - \\kappa) \\sin(\\alpha) \\vec{U} \\times (\\vec{V} \\sin(\\alpha)) -\na(\\alpha' - \\kappa) \\cos(\\alpha)\\vec{V} \\times (\\vec{U} \\cos(\\alpha)) \\\\\n&= (\\sin(\\alpha) + a(\\alpha'-\\kappa) \\sin^2(\\alpha) +\na(\\alpha'-\\kappa) \\cos^2(\\alpha)) \\vec{U} \\times \\vec{V} \\\\\n&= (\\sin(\\alpha) + a (\\alpha' - \\kappa)) \\vec{U} \\times \\vec{V}.\n\\end{align}\n\\]\nThe terms \\(\\vec{U} \\times\\vec{U}\\) and \\(\\vec{V}\\times\\vec{V}\\) are \\(\\vec{0}\\), by properties of the cross product. 
This says the scalar part must be \\(0\\), or\n\\[\n\\frac{\\sin(\\alpha)}{a} + \\alpha' = \\kappa.\n\\]\nAs for the second equation, from the expression for \\(\\vec{B}'(u)\\), after setting \\(a(\\alpha'-\\kappa) = -\\sin(\\alpha)\\):\n\\[\n\\begin{align}\n\\|\\vec{B}'(u)\\|^2\n&= \\| (1 -\\sin(\\alpha)\\sin(\\alpha)) \\vec{U} -\\sin(\\alpha)\\cos(\\alpha) \\vec{V} \\|^2\\\\\n&= \\| \\cos^2(\\alpha) \\vec{U} -\\sin(\\alpha)\\cos(\\alpha) \\vec{V} \\|^2\\\\\n&= (\\cos^2(\\alpha))^2 + (\\sin(\\alpha)\\cos(\\alpha))^2\\quad\\text{using } \\vec{U}\\cdot\\vec{V}=0\\\\\n&= \\cos^2(\\alpha)(\\cos^2(\\alpha) + \\sin^2(\\alpha))\\\\\n&= \\cos^2(\\alpha).\n\\end{align}\n\\]\nFrom this, \\(\\|\\vec{B}'(u)\\| = |\\cos(\\alpha)|\\). But \\(1 = \\|d\\vec{B}/dv\\| = \\|d\\vec{B}/du \\| \\cdot |du/dv|\\) and \\(|dv/du|=|\\cos(\\alpha)|\\) follows."
},
{
"objectID": "differentiable_vector_calculus/vector_valued_functions.html#evolutes-and-involutes",
"href": "differentiable_vector_calculus/vector_valued_functions.html#evolutes-and-involutes",
"title": "54  Vector-valued functions, \\(f:R \\rightarrow R^n\\)",
"section": "54.8 Evolutes and involutes",
"text": "54.8 Evolutes and involutes\nFollowing Fuchs we discuss a geometric phenomenon known and explored by Huygens, and likely earlier. We stick to the two-dimensional case; Fuchs extends this to three dimensions. The following figure\n\n\n\n\n\nis that of an ellipse with many normal lines drawn to it. The normal lines appear to intersect in a somewhat diamond-shaped curve. This curve is the evolute of the ellipse. We can characterize this using the language of planar curves.\nConsider a parameterization of a curve by arc-length, \\(\\vec\\gamma(s) = \\langle u(s), v(s) \\rangle\\). The unit tangent to this curve is \\(\\vec\\gamma'(s) = \\hat{T}(s) = \\langle u'(s), v'(s) \\rangle\\) and by simple geometry the unit normal will be \\(\\hat{N}(s) = \\langle -v'(s), u'(s) \\rangle\\). At a time \\(t\\), the normal line through \\(\\vec\\gamma(t)\\) is given by \\(l_t(a) = \\vec\\gamma(t) + a \\hat{N}(t)\\).\nConsider two nearby points \\(t\\) and \\(t+\\epsilon\\) and the intersection of \\(l_t\\) and \\(l_{t+\\epsilon}\\). That is, we need points \\(a\\) and \\(b\\) with: \\(l_t(a) = l_{t+\\epsilon}(b)\\). Setting the components equal, this is:\n\\[\n\\begin{align}\nu(t) - av'(t) &= u(t+\\epsilon) - bv'(t+\\epsilon) \\\\\nv(t) + au'(t) &= v(t+\\epsilon) + bu'(t+\\epsilon).\n\\end{align}\n\\]\nThis is a linear system of equations in two unknowns (\\(a\\) and \\(b\\)) which can be solved. Here is the value for \\(a\\):\n\n@syms u() v() t epsilon w\n@syms a b\nγ(t) = [u(t),v(t)]\nn(t) = subs.(diff.([-v(w), u(w)], w), w.=>t)\nl(a, t) = γ(t) + a * n(t)\nout = solve(l(a, t) - l(b, t+epsilon), [a,b])\nout[a]\n\n \n\\[\n- \\frac{u{\\left(t \\right)} \\left. \\frac{d}{d w} u{\\left(w \\right)} \\right|_{\\substack{ w=\\epsilon + t }}}{\\frac{d}{d t} u{\\left(t \\right)} \\left. \\frac{d}{d w} v{\\left(w \\right)} \\right|_{\\substack{ w=\\epsilon + t }} - \\frac{d}{d t} v{\\left(t \\right)} \\left. 
\\frac{d}{d w} u{\\left(w \\right)} \\right|_{\\substack{ w=\\epsilon + t }}} + \\frac{u{\\left(\\epsilon + t \\right)} \\left. \\frac{d}{d w} u{\\left(w \\right)} \\right|_{\\substack{ w=\\epsilon + t }}}{\\frac{d}{d t} u{\\left(t \\right)} \\left. \\frac{d}{d w} v{\\left(w \\right)} \\right|_{\\substack{ w=\\epsilon + t }} - \\frac{d}{d t} v{\\left(t \\right)} \\left. \\frac{d}{d w} u{\\left(w \\right)} \\right|_{\\substack{ w=\\epsilon + t }}} - \\frac{v{\\left(t \\right)} \\left. \\frac{d}{d w} v{\\left(w \\right)} \\right|_{\\substack{ w=\\epsilon + t }}}{\\frac{d}{d t} u{\\left(t \\right)} \\left. \\frac{d}{d w} v{\\left(w \\right)} \\right|_{\\substack{ w=\\epsilon + t }} - \\frac{d}{d t} v{\\left(t \\right)} \\left. \\frac{d}{d w} u{\\left(w \\right)} \\right|_{\\substack{ w=\\epsilon + t }}} + \\frac{v{\\left(\\epsilon + t \\right)} \\left. \\frac{d}{d w} v{\\left(w \\right)} \\right|_{\\substack{ w=\\epsilon + t }}}{\\frac{d}{d t} u{\\left(t \\right)} \\left. \\frac{d}{d w} v{\\left(w \\right)} \\right|_{\\substack{ w=\\epsilon + t }} - \\frac{d}{d t} v{\\left(t \\right)} \\left. \\frac{d}{d w} u{\\left(w \\right)} \\right|_{\\substack{ w=\\epsilon + t }}}\n\\]\n\n\n\nLetting \\(\\epsilon \\rightarrow 0\\) we get an expression for \\(a\\) that will describe the evolute at time \\(t\\) in terms of the function \\(\\gamma\\). Looking at the expression above, we can see that dividing the numerator by \\(\\epsilon\\) and taking a limit will yield \\(u'(t)^2 + v'(t)^2\\). If the denominator has a limit after dividing by \\(\\epsilon\\), then we can find the description sought. 
Pursuing this leads to:\n\\[\n\\begin{align*}\n\\frac{u'(t) v'(t+\\epsilon) - v'(t) u'(t+\\epsilon)}{\\epsilon}\n&= \\frac{u'(t) v'(t+\\epsilon) -u'(t)v'(t) + u'(t)v'(t)- v'(t) u'(t+\\epsilon)}{\\epsilon} \\\\\n&= \\frac{u'(t)(v'(t+\\epsilon) -v'(t))}{\\epsilon} + \\frac{(u'(t)- u'(t+\\epsilon))v'(t)}{\\epsilon},\n\\end{align*}\n\\]\nwhich in the limit will give \\(u'(t)v''(t) - u''(t) v'(t)\\). All told, in the limit as \\(\\epsilon \\rightarrow 0\\) we get\n\\[\n\\begin{align*}\na &= \\frac{u'(t)^2 + v'(t)^2}{u'(t)v''(t) - v'(t) u''(t)} \\\\\n&= 1/(\\|\\vec\\gamma'\\|\\kappa) \\\\\n&= 1/(\\|\\hat{T}\\|\\kappa) \\\\\n&= 1/\\kappa,\n\\end{align*}\n\\]\nwith \\(\\kappa\\) being the curvature of the planar curve. That is, the evolute of \\(\\vec\\gamma\\) is described by:\n\\[\n\\vec\\beta(s) = \\vec\\gamma(s) + \\frac{1}{\\kappa(s)}\\hat{N}(s).\n\\]\nRevisualizing:\n\nrₑ₃(t) = [2cos(t), sin(t), 0]\nTangent(r, t) = unit_vec(r'(t))\nNormal(r, t) = unit_vec((𝒕 -> Tangent(r, 𝒕))'(t))\ncurvature(r, t) = norm(r'(t) × r''(t) ) / norm(r'(t))^3\n\nplot_parametric(0..2pi, t -> rₑ₃(t)[1:2], legend=false, aspect_ratio=:equal)\nplot_parametric!(0..2pi, t -> (rₑ₃(t) + Normal(rₑ₃, t)/curvature(rₑ₃, t))[1:2])\n\n\n\n\nWe computed the above illustration using \\(3\\) dimensions (hence the use of [1:2]...) as the curvature formula is easier to express. Recall, the curvature also appears in the Frenet-Serret formulas: \\(d\\hat{T}/ds = \\kappa \\hat{N}\\) and \\(d\\hat{N}/ds = -\\kappa \\hat{T}+ \\tau \\hat{B}\\). For a planar curve, as under consideration, the torsion \\(\\tau\\) is \\(0\\). 
This allows the computation of \\(\\vec\\beta'(s)\\):\n\\[\n\\begin{align}\n\\vec{\\beta}' &= \\frac{d(\\vec\\gamma + (1/\\kappa) \\hat{N})}{ds}\\\\\n&= \\hat{T} + (-\\frac{\\kappa'}{\\kappa^2}\\hat{N} + \\frac{1}{\\kappa} \\hat{N}')\\\\\n&= \\hat{T} - \\frac{\\kappa'}{\\kappa^2}\\hat{N} + \\frac{1}{\\kappa} (-\\kappa \\hat{T})\\\\\n&= - \\frac{\\kappa'}{\\kappa^2}\\hat{N}.\n\\end{align}\n\\]\nWe see \\(\\vec\\beta'\\) is zero (the curve is non-regular) when \\(\\kappa'(s) = 0\\). The curvature changes from increasing to decreasing, or vice versa, at each of the \\(4\\) crossings of the major and minor axes - there are \\(4\\) non-regular points, and we see \\(4\\) cusps in the evolute.\nThe curve parameterized by \\(\\vec{r}(t) = 2(1 - \\cos(t)) \\langle \\cos(t), \\sin(t)\\rangle\\) over \\([0,2\\pi]\\) is a cardioid. It is formed by rolling a circle of radius \\(r\\) around another similarly sized circle. The following graphically shows the evolute is a smaller cardioid (one-third the size). For fun, the evolute of the evolute is drawn:\n\nfunction evolute(r)\n t -> r(t) + 1/curvature(r, t) * Normal(r, t)\nend\n\nevolute (generic function with 1 method)\n\n\n\nr(t) = 2*(1 - cos(t)) * [cos(t), sin(t), 0]\n\nplot(legend=false, aspect_ratio=:equal)\nplot_parametric!(0..2pi, t -> r(t)[1:2])\nplot_parametric!(0..2pi, t -> evolute(r)(t)[1:2])\nplot_parametric!(0..2pi, t -> ((evolute∘evolute)(r)(t))[1:2])\n\n\n\n\n\nIf \\(\\vec\\beta\\) is the evolute of \\(\\vec\\gamma\\), then \\(\\vec\\gamma\\) is an involute of \\(\\vec\\beta\\). For a given curve, there is a parameterized family of involutes. While this definition has a pleasing self-referentialness, it doesn't have an immediately clear geometric interpretation. For that, consider the image of a string of fixed length \\(a\\) attached to the curve \\(\\vec\\gamma\\) at some point \\(t_0\\). As this string wraps around the curve traced by \\(\\vec\\gamma\\) it is held taut so that it makes a tangent at the point of contact. 
The end of the string will trace out a curve, and this is the trace of an involute.\n\nr(t) = [t, cosh(t)]\nt0, t1 = -2, 0\na = t1\n\nbeta(r, t) = r(t) - Tangent(r, t) * quadgk(t -> norm(r'(t)), a, t)[1]\n\np = plot_parametric(-2..2, r, legend=false)\nplot_parametric!(t0..t1, t -> beta(r, t))\nfor t in range(t0,-0.2, length=4)\n arrow!(r(t), -Tangent(r, t) * quadgk(t -> norm(r'(t)), a, t)[1])\n scatter!(unzip([r(t)])...)\nend\np\n\n\n\n\nThis lends itself to a mathematical description: if \\(\\vec\\gamma(t)\\) parameterizes the planar curve, then an involute for \\(\\vec\\gamma(t)\\) is described by:\n\\[\n\\vec\\beta(t) = \\vec\\gamma(t) + \\left(a - \\int_{t_0}^t \\| \\vec\\gamma'(u)\\| du\\right) \\hat{T}(t),\n\\]\nwhere \\(\\hat{T}(t) = \\vec\\gamma'(t)/\\|\\vec\\gamma'(t)\\|\\) is the unit tangent vector. The above uses two parameters (\\(a\\) and \\(t_0\\)), but only one is needed, as there is an obvious redundancy (a point can also be expressed by \\(t\\) and the shortened length of string). 
Wikipedia uses this definition for \\(a\\) and \\(t\\) values in an interval \\([t_0, t_1]\\):\n\\[\n\\vec\\beta_a(t) = \\vec\\gamma(t) - \\frac{\\vec\\gamma'(t)}{\\|\\vec\\gamma'(t)\\|}\\int_a^t \\|\\vec\\gamma'(u)\\| du.\n\\]\nIf \\(\\vec\\gamma(s)\\) is parameterized by arc length, then this simplifies quite a bit, as the unit tangent is just \\(\\vec\\gamma'(s)\\) and the remaining arc length just \\((s-a)\\):\n\\[\n\\begin{align*}\n\\vec\\beta_a(s) &= \\vec\\gamma(s) - \\vec\\gamma'(s) (s-a) \\\\\n&=\\vec\\gamma(s) - \\hat{T}_{\\vec\\gamma}(s)(s-a).\\quad (s \\text{ is the arc-length parameter})\n\\end{align*}\n\\]\nWith this characterization, we see several properties:\n\nFrom \\(\\vec\\beta_a'(s) = \\hat{T}(s) - (\\kappa(s) \\hat{N}(s) (s-a) + \\hat{T}(s)) = -\\kappa_{\\vec\\gamma}(s) \\cdot (s-a) \\cdot \\hat{N}_{\\vec\\gamma}(s)\\), the involute is not regular at \\(s=a\\), as its derivative is zero.\nAs \\(\\vec\\beta_a(s) = \\vec\\beta_0(s) + a\\hat{T}(s)\\), the family of curves is parallel.\nThe evolute of \\(\\vec\\beta_a(s)\\), \\(s\\) the arc-length parameter of \\(\\vec\\gamma\\), can be shown to be \\(\\vec\\gamma\\). This requires more work:\n\nThe evolute for \\(\\vec\\beta_a(s)\\) is:\n\\[\n\\vec\\beta_a(s) + \\frac{1}{\\kappa_{\\vec\\beta_a}(s)}\\hat{N}_{\\vec\\beta_a}(s).\n\\]\nIn the following we show that:\n\\[\n\\begin{align}\n\\kappa_{\\vec\\beta_a}(s) &= 1/(s-a),\\\\\n\\hat{N}_{\\vec\\beta_a}(s) &= \\hat{T}_{\\vec\\beta_a}'(s)/\\|\\hat{T}_{\\vec\\beta_a}'(s)\\| = -\\hat{T}_{\\vec\\gamma}(s).\n\\end{align}\n\\]\nThe first shows in a different way that when \\(s=a\\) the curve is not regular, as the curvature fails to exist. 
In the above figure, when the involute touches \\(\\vec\\gamma\\), there will be a cusp.\nWith these two identifications, and using \\(\\vec\\gamma'(s) = \\hat{T}_{\\vec\\gamma}(s)\\), the evolute simplifies to\n\\[\n\\begin{align*}\n\\vec\\beta_a(s) + \\frac{1}{\\kappa_{\\vec\\beta_a}(s)}\\hat{N}_{\\vec\\beta_a}(s)\n&=\n\\vec\\gamma(s) + \\vec\\gamma'(s)(s-a) + \\frac{1}{\\kappa_{\\vec\\beta_a}(s)}\\hat{N}_{\\vec\\beta_a}(s) \\\\\n&=\n\\vec\\gamma(s) + \\hat{T}_{\\vec\\gamma}(s)(s-a) + \\frac{1}{1/(s-a)} (-\\hat{T}_{\\vec\\gamma}(s)) \\\\\n&= \\vec\\gamma(s).\n\\end{align*}\n\\]\nThat is, the evolute of an involute of \\(\\vec\\gamma(s)\\) is \\(\\vec\\gamma(s)\\).\nWe have:\n\\[\n\\begin{align}\n\\vec\\beta_a(s) &= \\vec\\gamma(s) - \\vec\\gamma'(s)(s-a)\\\\\n\\vec\\beta_a'(s) &= -\\kappa_{\\vec\\gamma}(s)(s-a)\\hat{N}_{\\vec\\gamma}(s)\\\\\n\\vec\\beta_a''(s) &= (-\\kappa_{\\vec\\gamma}(s)(s-a))' \\hat{N}_{\\vec\\gamma}(s) + (-\\kappa_{\\vec\\gamma}(s)(s-a))(-\\kappa_{\\vec\\gamma}\\hat{T}_{\\vec\\gamma}(s)),\n\\end{align}\n\\]\nthe last line by the Frenet-Serret formulas for planar curves, which show \\(\\hat{T}'(s) = \\kappa(s) \\hat{N}\\) and \\(\\hat{N}'(s) = -\\kappa(s)\\hat{T}(s)\\).\nTo compute the curvature of \\(\\vec\\beta_a\\), we need to compute both:\n\\[\n\\begin{align}\n\\| \\vec\\beta' \\|^3 &= |\\kappa^3 (s-a)^3|\\\\\n\\| \\vec\\beta' \\times \\vec\\beta'' \\| &= |\\kappa(s)^3 (s-a)^2|,\n\\end{align}\n\\]\nthe last line using both \\(\\hat{N}\\times\\hat{N} = \\vec{0}\\) and \\(\\|\\hat{N}\\times\\hat{T}\\| = 1\\). 
The curvature then is \\(\\kappa_{\\vec\\beta_a}(s) = 1/(s-a)\\).\nUsing the formula for \\(\\vec\\beta'\\) above, we get \\(\\hat{T}_\\beta(s)=\\hat{N}_{\\vec\\gamma}(s)\\) so \\(\\hat{T}_\\beta(s)' = -\\kappa_{\\vec\\gamma}(s) \\hat{T}_{\\vec\\gamma}(s)\\) with unit vector just \\(\\hat{N}_{\\vec\\beta_a} = -\\hat{T}_{\\vec\\gamma}(s)\\).\n\nShow that an involute of the cycloid \\(\\vec{r}(t) = \\langle t - \\sin(t), 1 - \\cos(t) \\rangle\\) is also a cycloid. We do so graphically:\n\nr(t) = [t - sin(t), 1 - cos(t)]\n## find *involute*: r - r'/|r'| * int(|r'|, a, t)\nt0, t1, a = 2PI, PI, PI\n@syms t::real\nrp = diff.(r(t), t)\nspeed = 2sin(t/2)\n\nex = r(t) - rp/speed * integrate(speed, a, t)\n\nplot_parametric(0..4pi, r, legend=false)\nplot_parametric!(0..4pi, u -> SymPy.N.(subs.(ex, t .=> u)))\n\n\n\n\nThe expression ex is secretly [t + sin(t), 3 + cos(t)], another cycloid.\n\nExample: goats\nAn old problem of calculus is called the goat problem. This formulation with horses is from \\(1748\\):\n\nObserving a horse tied to feed in a gentlemens park, with one end of a rope to his fore foot, and the other end to one of the circular iron rails, inclosing a pond, the circumference of which rails being \\(160\\) yards, equal to the length of the rope, what quantity of ground at most, could the horse feed?\n\nLet \\(r\\) be the radius of a circle and for concreteness we position it at \\((-r, 0)\\). Let \\(R\\) be the length of a rope, and suppose \\(R \\ge \\pi r\\). (It is equal in the problem). 
Then the question can be rephrased as: what is twice the area suggested by this graphic, which is drawn in pieces:\n\nBetween angles \\(0\\) and \\(\\pi/2\\) the horse has unconstrained access, so it can graze a wedge of radius \\(R\\).\nBetween the angle \\(\\pi/2\\) and the angle where the horse's \\(y\\) position is \\(0\\) when the tether is taut, the boundary of what can be eaten is described by the involute.\nThe horse can't eat from within the circle of radius \\(r\\).\n\n\n\n\n\n\nTo solve for the area we parameterize the circle of radius \\(r\\) between \\(\\pi/2\\) and when the involute would cross the \\(x\\) axis. We use find_zero to identify the value.\n\nlet\nr,R = 160/(2π), 160\nR = max(R, pi*r) # R ≥ 1/2 circumference\nγ(θ) = -2r*cos(θ) * [cos(θ), sin(θ)]\n## find *involute*: r - r'/|r'| * int(|r'|, a, t)\ninvolute(t) = γ(t) + γ'(t)/norm(γ'(t))* (R - quadgk(u -> norm(γ'(u)), pi/2, t)[1])\n\nt₀ = find_zero(t -> round(involute(t)[2], digits=4), (3pi/4, pi))\n\nA₁ = π * R^2 / 4\ny(t) = involute(t)[2]\nx(t) = (h=1e-4; (involute(t+h)[1]-involute(t-h)[1])/(2h))\nA₂ = quadgk(t -> -y(t)*x(t), pi/2, t₀)[1] # A₂ = -∫ y dx, as counterclockwise parameterization\nA₃ = (1/2) * π * r^2\n2 * (A₁ + A₂ - A₃)\nend\n\n76255.66077719798\n\n\nThe calculations for \\(A_1\\) and \\(A_3\\) come from the familiar formula for the area of a circle. However, \\(A_2\\) requires the formula for area above the \\(x\\) axis when the curve is parameterized: \\(A = -\\int_a^b y(t) x'(t) dt\\), given how the curve is parameterized. As written, the automatic derivative of the numeric integral gives an error, so a central-difference approximation is used for \\(x'(t)\\)."
},
{
"objectID": "differentiable_vector_calculus/vector_valued_functions.html#questions",
"href": "differentiable_vector_calculus/vector_valued_functions.html#questions",
"title": "54  Vector-valued functions, \\(f:R \\rightarrow R^n\\)",
"section": "54.9 Questions",
"text": "54.9 Questions\n\nQuestion\nA cycloid is formed by pushing a wheel on a surface without slipping. The position of a fixed point on the outer rim of the wheel traces out the cycloid. Suppose the wheel has radius \\(R\\) and the initial position of the point is at the bottom, \\((0,0)\\). Let \\(t\\) measure angle measurement, in radians. Then the point of contact of the wheel will be at \\(Rt\\), as that is the distance the wheel will have rotated. That is, the hub of the wheel will move according to \\(\\langle Rt,~ R\\rangle\\). Relative to the hub, the point on the rim will have coordinates \\(\\langle -R\\sin(t), -R\\cos(t) \\rangle\\), so the superposition gives:\n\\[\n\\vec{r}(t) = \\langle Rt - R\\sin(t), R - R\\cos(t) \\rangle.\n\\]\nWhat is the position at \\(t=\\pi/4\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n [0.570796, 1.0]\n \n \n\n\n \n \n \n \n [0.0782914, 0.292893 ]\n \n \n\n\n \n \n \n \n [0.181172, 0.5]\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nAnd the position at \\(\\pi/2\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n [0.181172, 0.5]\n \n \n\n\n \n \n \n \n [0.570796, 1.0]\n \n \n\n\n \n \n \n \n [0.0782914, 0.292893 ]\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose instead of keeping track of a point on the outer rim of the wheel, a point a distance \\(r < R\\) from the hub is chosen in the above description of a cycloid (a Curtate cycloid). 
If we start at \\(\\langle 0,~ R-r \\rangle\\), what will be the position at \\(t\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\langle -r\\sin(t),~ -r\\cos(t) \\rangle\\)\n \n \n\n\n \n \n \n \n \\(\\langle Rt - r\\sin(t),~ R - r\\cos(t) \\rangle\\)\n \n \n\n\n \n \n \n \n \\(\\langle Rt - R\\sin(t),~ R - R\\cos(t) \\rangle\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor the cycloid \\(\\vec{r}(t) = \\langle t - \\sin(t),~ 1 - \\cos(t) \\rangle\\), find a simplified expression for \\(\\| \\vec{r}'(t)\\|\\).\n\n\n\n \n \n \n \n \n \n \n \n \n \\(1 - \\cos(t)\\)\n \n \n\n\n \n \n \n \n \\(\\sqrt{2 - 2\\cos(t)}\\)\n \n \n\n\n \n \n \n \n \\(1 + \\cos(t) + \\cos(2t)\\)\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe cycloid \\(\\vec{r}(t) = \\langle t - \\sin(t),~ 1 - \\cos(t) \\rangle\\) has a formula for the arc length from \\(0\\) to \\(t\\) given by: \\(l(t) = 4 - 4\\cos(t/2)\\).\nPlot the following two equations over \\([0,8]\\) which are a reparameterization of the cycloid by \\(l^{-1}(t)\\).\n\nγ(s) = 2 * acos(1-s/4)\nx1(s) = γ(s) - sin(γ(s))\ny1(s) = 1 - cos(γ(s))\n\ny1 (generic function with 1 method)\n\n\nHow many arches of the cycloid are traced out?\n\n\n\n \n \n \n \n \n \n \n \n \n 1\n \n \n\n\n \n \n \n \n 2\n \n \n\n\n \n \n \n \n 3\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the cycloid \\(\\vec{r}(t) = \\langle t - \\sin(t),~ 1 - \\cos(t) \\rangle\\)\nWhat is the derivative at \\(t=\\pi/2\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n [2,0]\n \n \n\n\n \n \n \n \n [1,1]\n \n \n\n\n \n \n \n \n [0,0]\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the derivative at \\(t=\\pi\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n [0,0]\n \n \n\n\n \n \n \n \n [2,0]\n \n \n\n\n \n \n \n \n [1,1]\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the circle \\(\\vec{r}(t) = R \\langle \\cos(t),~ \\sin(t) \\rangle\\), \\(R > 0\\). 
Find the norm of \\(\\vec{r}'(t)\\):\n\n\n\n \n \n \n \n \n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \\(1/R\\)\n \n \n\n\n \n \n \n \n \\(R\\)\n \n \n\n\n \n \n \n \n \\(R^2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe curve described by \\(\\vec{r}(t) = \\langle 10t,~ 10t - 16t^2\\rangle\\) models the flight of an arrow. Compute the length traveled from when it is launched to when it returns to the ground.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(\\vec{r}(t) = \\langle t, t^2 \\rangle\\) describe a parabola. What is the arc length between \\(0 \\leq t \\leq 1\\)? First, what is a formula for the speed (\\(\\| \\vec{r}'(t)\\|\\))?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(1 + 4t^2\\)\n \n \n\n\n \n \n \n \n \\(\\sqrt{1 + 4t^2}\\)\n \n \n\n\n \n \n \n \n \\(t + t^2\\)\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nNumerically find the arc length.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(\\vec{r}(t) = \\langle t, t^2 \\rangle\\) describe a parabola. 
What is the curvature of \\(\\vec{r}(t)\\) at \\(t=0\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe curvature at \\(1\\) will be\n\n\n\n \n \n \n \n \n \n \n \n \n less than the curvature at \\(t=0\\)\n \n \n\n\n \n \n \n \n greater than the curvature at \\(t=0\\)\n \n \n\n\n \n \n \n \n the same as the curvature at \\(t=0\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe curvature as \\(t\\rightarrow \\infty\\) will be\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\infty\\)\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \\(0\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nNow, if we have a more general parabola by introducing a parameter \\(a>0\\): \\(\\vec{r}(t) = \\langle t, a\\cdot t^2 \\rangle\\), what is the curvature of \\(\\vec{r}(t)\\) at \\(t=0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(2\\)\n \n \n\n\n \n \n \n \n \\(2a\\)\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \\(2/a\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nQuestion\nProjectile motion with constant acceleration is expressed parametrically by \\(\\vec{x}(t) = \\vec{x}_0 + \\vec{v}_0 t + (1/2) \\vec{a} t^2\\), where \\(\\vec{x}_0\\) and \\(\\vec{v}_0\\) are the initial position and velocity, respectively. In Strang p451, we find an example utilizing this formula to study the curve of a baseball. Place the pitcher at the origin and the batter along the \\(x\\) axis; then a baseball thrown with spin around its \\(z\\) axis will have acceleration in the \\(y\\) direction in addition to the acceleration due to gravity in the \\(z\\) direction. Suppose the ball starts \\(5\\) feet above the ground when pitched (\\(\\vec{x}_0 = \\langle 0,0, 5\\rangle\\)), and has initial velocity \\(\\vec{v}_0 = \\langle 120, -2, 2 \\rangle\\). (\\(120\\) feet per second is about \\(80\\) miles per hour). Suppose the pitcher can produce an acceleration in the \\(y\\) direction of \\(16ft/sec^2\\), then \\(\\vec{a} = \\langle 0, 16, -32\\rangle\\) in these units. 
(Gravity is \\(9.8m/s^2\\) or \\(32ft/s^2\\).)\nThe plate is \\(60\\) feet away. How long will it take for the ball to reach the batter? (When the first component is \\(60\\)?)\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nAt \\(t=1/4\\) the ball is half-way to home. If the batter reads the ball at this point, where in the \\(y\\) direction is the ball?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nAt \\(t=1/2\\) has the ball moved more than \\(1/2\\) foot in the \\(y\\) direction?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIn Strang we see this picture describing a curve:\n\n\n\n\n\nStrang notes that the curve is called the “witch of Agnesi” after Maria Agnesi, the author of the first three-semester calculus book. (L'Hopital's book did not contain integration.)\nWe wish to identify the parameterization. Using \\(\\theta\\), an angle in standard position, we can see that the component functions \\(x(\\theta)\\) and \\(y(\\theta)\\) may be found using trigonometric analysis.\nWhat is the \\(x\\) coordinate of point \\(A\\)? (Also the \\(x\\) coordinate of \\(P\\).)\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\cot(\\theta)\\)\n \n \n\n\n \n \n \n \n \\(2\\tan(\\theta)\\)\n \n \n\n\n \n \n \n \n \\(2\\cot(\\theta)\\)\n \n \n\n\n \n \n \n \n \\(\\tan(\\theta)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nUsing the polar form of a circle, the length between the origin and \\(B\\) is given by \\(2\\cos(\\theta-\\pi/2) = 2\\sin(\\theta)\\). Using this, what is the \\(y\\) coordinate of \\(B\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(2\\sin^2(\\theta)\\)\n \n \n\n\n \n \n \n \n \\(2\\)\n \n \n\n\n \n \n \n \n \\(2\\sin(\\theta)\\)\n \n \n\n\n \n \n \n \n \\(\\sin(\\theta)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(n > 0\\), \\(\\vec{r}(t) = \\langle t^{n+1}, t^n\\rangle\\). 
Find the speed, \\(\\|\\vec{r}'(t)\\|\\).\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\sqrt{n^2 + t^2}\\)\n \n \n\n\n \n \n \n \n \\(t^n + t^{n+1}\\)\n \n \n\n\n \n \n \n \n \\(\\frac{\\sqrt{n^{2} t^{2 n} + t^{2 n + 2} \\left(n + 1\\right)^{2}}}{t}\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nFor \\(n=2\\), the arc length of \\(\\vec{r}\\) can be found exactly. What is the arc-length between \\(0 \\leq t \\leq a\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\frac{2 a^{\\frac{5}{2}}}{5}\\)\n \n \n\n\n \n \n \n \n \\(\\frac{a^{2} \\sqrt{9 a^{2} + 4}}{3} + \\frac{4 \\sqrt{9 a^{2} + 4}}{27} - \\frac{8}{27}\\)\n \n \n\n\n \n \n \n \n \\(\\sqrt{a^2 + 4}\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe astroid is one of the few curves with an exactly computable arc-length. The curve is parametrized by \\(\\vec{r}(t) = \\langle a\\cos^3(t), a\\sin^3(t)\\rangle\\). For \\(a=1\\) find the arc-length between \\(0 \\leq t \\leq \\pi/2\\).\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\sqrt{2}\\)\n \n \n\n\n \n \n \n \n \\(3/2\\)\n \n \n\n\n \n \n \n \n \\(\\pi/2\\)\n \n \n\n\n \n \n \n \n \\(2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\n\n\n\n\n\nLet \\(F\\) and \\(B\\) be pictured above. Which is the red curve?\n\n\n\n \n \n \n \n \n \n \n \n \n The back wheel\n \n \n\n\n \n \n \n \n The front wheel\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\n\n\n\n\n\nLet \\(F\\) and \\(B\\) be pictured above. Which is the red curve?\n\n\n\n \n \n \n \n \n \n \n \n \n The back wheel\n \n \n\n\n \n \n \n \n The front wheel\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(\\vec{\\gamma}(s)\\) be a parameterization of a curve by arc length and \\(s(t)\\) some continuous increasing function of \\(t\\). Then \\(\\vec{\\gamma} \\circ s\\) also parameterizes the curve. 
We have\n\\[\n\\text{velocity} = \\frac{d (\\vec{\\gamma} \\circ s)}{dt} = \\frac{d\\vec{\\gamma}}{ds} \\frac{ds}{dt} = \\hat{T} \\frac{ds}{dt}.\n\\]\nContinuing with a second derivative,\n\\[\n\\text{acceleration} = \\frac{d^2(\\vec{\\gamma}\\circ s)}{dt^2} =\n\\frac{d\\hat{T}}{ds} \\frac{ds}{dt} \\frac{ds}{dt} + \\hat{T} \\frac{d^2s}{dt^2} = \\frac{d^2s}{dt^2}\\hat{T} + \\kappa (\\frac{ds}{dt})^2 \\hat{N},\n\\]\nusing \\(d\\hat{T}/ds = \\kappa\\hat{N}\\) when parameterized by arc length.\nThis expresses the acceleration in terms of the tangential part and the normal part. Strang views this in terms of driving, where the car motion is determined by the gas pedal and the brake pedal (only giving acceleration in the \\(\\hat{T}\\) direction) and the steering wheel (giving acceleration in the \\(\\hat{N}\\) direction).\nIf a car is on a straight road, then \\(\\kappa=0\\). Is the acceleration along the \\(\\hat{T}\\) direction or the \\(\\hat{N}\\) direction?\n\n\n\n \n \n \n \n \n \n \n \n \n The \\(\\hat{N}\\) direction\n \n \n\n\n \n \n \n \n The \\(\\hat{T}\\) direction\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nSuppose no gas or brake is applied for a duration of time. The tangential acceleration will be \\(0\\). 
During this time, which of these must be \\(0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\vec{\\gamma} \\circ s\\)\n \n \n\n\n \n \n \n \n \\(ds/dt\\)\n \n \n\n\n \n \n \n \n \\(d^2s/dt^2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIn going around a corner (with non-zero curvature), which is true?\n\n\n\n \n \n \n \n \n \n \n \n \n The acceleration in the normal direction depends on both the curvature and the speed (\\(ds/dt\\))\n \n \n\n\n \n \n \n \n The acceleration in the normal direction depends only on the curvature and not the speed (\\(ds/dt\\))\n \n \n\n\n \n \n \n \n The acceleration in the normal direction depends only on the speed (\\(ds/dt\\)) and not the curvature\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe evolute comes from the formula \\(\\vec\\gamma(t) + (1/\\kappa(t)) \\hat{N}(t)\\). For hand computation, this formula can be explicitly given by two components \\(\\langle X(t), Y(t) \\rangle\\) through:\n\\[\n\\begin{align}\nr(t) &= x'(t)^2 + y'(t)^2\\\\\nk(t) &= x'(t)y''(t) - x''(t) y'(t)\\\\\nX(t) &= x(t) - y'(t) r(t)/k(t)\\\\\nY(t) &= y(t) + x'(t) r(t)/k(t)\n\\end{align}\n\\]\nLet \\(\\vec\\gamma(t) = \\langle t, t^2 \\rangle = \\langle x(t), y(t)\\rangle\\) be a parameterization of a parabola.\n\nCompute \\(r(t)\\)\n\n\n\n\n \n \n \n \n \n \n \n \n \n \\(1 - 2t\\)\n \n \n\n\n \n \n \n \n \\(1 + 2t\\)\n \n \n\n\n \n \n \n \n \\(1 - 4t^2\\)\n \n \n\n\n \n \n \n \n \\(1 + 4t^2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nCompute \\(k(t)\\)\n\n\n\n\n \n \n \n \n \n \n \n \n \n \\(-8t\\)\n \n \n\n\n \n \n \n \n \\(8t\\)\n \n \n\n\n \n \n \n \n \\(-2\\)\n \n \n\n\n \n \n \n \n \\(2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nCompute \\(X(t)\\)\n\n\n\n\n \n \n \n \n \n \n \n \n \n \\(t - 2t(1 + 4t^2)/2\\)\n \n \n\n\n \n \n \n \n \\(t - 4t(1+2t)/2\\)\n \n \n\n\n \n \n \n \n \\(t - 2(8t)/(1-2t)\\)\n \n \n\n\n \n \n \n \n \\(t - 1(1+4t^2)/2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nCompute \\(Y(t)\\)\n\n\n\n\n \n \n \n \n \n \n \n \n \n 
\\(t^2 + 1(1 + 4t^2)/2\\)\n\n\\(t^2 + 2t(1+4t^2)/2\\)\n\n\\(t^2 - 1(1+4t^2)/2\\)\n\n\\(t^2 - 2t(1+4t^2)/2\\)\n\n\nQuestion\nThe following will compute the evolute of an ellipse:\n@syms t a b\nx = a * cos(t)\ny = b * sin(t)\nxp, xpp, yp, ypp = diff(x, t), diff(x,t,t), diff(y,t), diff(y,t,t)\nr2 = xp^2 + yp^2\nk = xp * ypp - xpp * yp\nX = x - yp * r2 / k |> simplify\nY = y + xp * r2 / k |> simplify\n[X, Y]\nWhat is the resulting curve?\n\nAn astroid of the form \\(c \\langle \\cos^3(t), \\sin^3(t) \\rangle\\)\n\nA cubic parabola of the form \\(\\langle ct^3, dt^2\\rangle\\)\n\nAn ellipse of the form \\(\\langle a\\cos(t), b\\sin(t)\\rangle\\)\n\nA cycloid of the form \\(c\\langle t + \\sin(t), 1 - \\cos(t)\\rangle\\)"
},
{
"objectID": "differentiable_vector_calculus/scalar_functions.html",
"href": "differentiable_vector_calculus/scalar_functions.html",
"title": "55  Scalar functions",
"section": "",
"text": "This section uses these add-on packages:\nAlso, these methods from the Contour package:\nConsider a function \\(f: R^n \\rightarrow R\\). It has multiple arguments for its input (an \\(x_1, x_2, \\dots, x_n\\)) and only one, scalar, value for an output. Some simple examples might be:\n\\[\n\\begin{align}\nf(x,y) &= x^2 + y^2\\\\\ng(x,y) &= x \\cdot y\\\\\nh(x,y) &= \\sin(x) \\cdot \\sin(y)\n\\end{align}\n\\]\nFor two examples from real life, consider first the elevation Point Query Service (of the USGS), which returns the elevation in international feet or meters for a specific latitude/longitude within the United States. The longitude can be associated to an \\(x\\) coordinate, the latitude to a \\(y\\) coordinate, and the elevation a \\(z\\) coordinate, and as long as the region is small enough, the \\(x\\)-\\(y\\) coordinates can be thought to lie on a plane. (A flat earth assumption.)\nSimilarly, a weather map, say of the United States, may show the maximum predicted temperature for a given day. This describes a function that takes a position (\\(x\\), \\(y\\)) and returns a predicted temperature (\\(z\\)).\nMathematically, we may describe the values \\((x,y)\\) in terms of a point, \\(P=(x,y)\\), or a vector \\(\\vec{v} = \\langle x, y \\rangle\\), using the identification of a point with a vector. As convenient, we may write any of \\(f(x,y)\\), \\(f(P)\\), or \\(f(\\vec{v})\\) to describe the evaluation of \\(f\\) at the values \\(x\\) and \\(y\\).\nReturning to the task at hand, in Julia, defining a scalar function is straightforward, the syntax following mathematical notation:\nTo call a scalar function for specific values of \\(x\\) and \\(y\\) is also similar to the mathematical case:\nIt may be advantageous to have the values as a vector or a point, as in v=[x,y]. 
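A minimal sketch of defining and calling such a function (the name f here is illustrative, not fixed by the section):

```julia
# define a scalar function following the mathematical notation
f(x, y) = x^2 + y^2

f(1, 2)       # evaluate at x = 1, y = 2; returns 5
v = [1, 2]    # the same values, collected into a vector (a point)
```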
Splatting can be used to turn a vector or tuple into two arguments:\nAlternatively, the function may be defined using a vector argument:\nThis style is required by some other packages within the Julia ecosystem, as there are many advantages to passing containers of values: they can have arbitrary length, they can be modified inside a function, the functions can be more generic, etc.\nMore verbosely, but avoiding index notation, we can use multiline functions:\nThen we have\nMore elegantly, perhaps, and the approach we will use in this section, is to mirror the mathematical notation through multiple dispatch. If we define j for multiple variables, say with:\nThen we can define an alternative method with just a single variable and use splatting to turn it into multiple variables:\nThen we can call j with a vector or point:\nor by passing in the individual components:\nFollowing a calculus perspective, we take up the question of how to visualize scalar functions within Julia. Further, how to describe the change in the function between nearby values?"
},
{
"objectID": "differentiable_vector_calculus/scalar_functions.html#visualizing-scalar-functions",
"href": "differentiable_vector_calculus/scalar_functions.html#visualizing-scalar-functions",
"title": "55  Scalar functions",
"section": "55.1 Visualizing scalar functions",
"text": "55.1 Visualizing scalar functions\nSuppose for the moment that \\(f:R^2 \\rightarrow R\\). The equation \\(z = f(x,y)\\) may be visualized by the set of points in \\(3\\)-dimensions \\(\\{(x,y,z): z = f(x,y)\\}\\). This will render as a surface, and that surface will pass a “vertical line test”, in that each \\((x,y)\\) value corresponds to at most one \\(z\\) value. We will see alternatives for describing surfaces beyond through a function of the form \\(z=f(x,y)\\). These are similar to how a curve in the \\(x\\)-\\(y\\) plane can be described by a function of the form \\(y=f(x)\\) but also through an equation of the form \\(F(x,y) = c\\) or through a parametric description, such as is used for planar curves. For now though we focus on the case where \\(z=f(x,y)\\).\nIn Julia, plotting such a surface requires a generalization to plotting a univariate function where, typically, a grid of evenly spaced values is given between some \\(a\\) and \\(b\\), the corresponding \\(y\\) or \\(f(x)\\) values are found, and then the points are connected in a dot-to-dot manner.\nHere, a two-dimensional grid of \\(x\\)-\\(y\\) values needs specifying, and the corresponding \\(z\\) values found. As the grid will be assumed to be regular only the \\(x\\) and \\(y\\) values need specifying, the set of pairs can be computed. The \\(z\\) values, it will be seen, are easily computed. This cloud of points is plotted and each cell in the \\(x\\)-\\(y\\) plane is plotted with a surface giving the \\(x\\)-\\(y\\)-\\(z\\), \\(3\\)-dimensional, view. 
One way to plot such a surface is to tessellate the cell and then, for each triangle, represent a plane made up of the \\(3\\) boundary points.\nHere is an example:\n\n𝒇(x, y) = x^2 + y^2\n\n𝒇 (generic function with 1 method)\n\n\nxs = range(-2, 2, length=100)\nys = range(-2, 2, length=100)\n\nsurface(xs, ys, 𝒇)\n\n\n\n\nThe surface function will generate the surface.\n\n\n\n\n\n\nNote\n\n\n\nUsing surface as a function name is equivalent to plot(xs, ys, f, seriestype=:surface).\n\n\nWe can also use surface(xs, ys, zs) where zs is not a vector, but rather a matrix of values corresponding to a grid described by the xs and ys. A matrix is a rectangular collection of values indexed by row and column through indices i and j. Here the values in zs should satisfy: the \\(i\\)th row and \\(j\\)th column entry should be \\(z_{ij} = f(x_i, y_j)\\) where \\(x_i\\) is the \\(i\\)th entry from the xs and \\(y_j\\) the \\(j\\)th entry from the ys.\nWe can generate this using a comprehension:\n\nzs = [𝒇(x,y) for y in ys, x in xs]\nsurface(xs, ys, zs)\n\n\n\n\nIf remembering that the \\(y\\) values go first, and then the \\(x\\) values, in the above is too hard, then an alternative can be used. Broadcasting f.(xs,ys) may not make sense, were the xs and ys not of commensurate lengths, and when it does, this call pairs off xs and ys values and passes them to f. What is desired here is different, where for each xs value there are pairs for each of the ys values. The syntax xs' can be viewed as creating a row vector, where xs is a column vector. Broadcasting will create a matrix of values in this case. So the following is identical to the above:\n\nsurface(xs, ys, 𝒇.(xs', ys))\n\n\n\n\n(This is still subtle. Misplacing the adjoint - putting it on ys instead - will error if the lengths of xs and ys differ, but will silently produce an incorrect surface if they are equal. 
It would be best to simply pass the function and let Plots handle this detail, which for the alternative Makie is reversed.)\n\nAn alternative to surface is wireframe, which may not use shading in all backends. This displays a grid in the \\(x\\)-\\(y\\) plane mapped to the surface:\n\nxs = ys = range(-2,2, length=10) # downsample to see the frame\nwireframe(xs, ys, 𝒇) # gr() or pyplot() wireplots render better than plotly()\n\n\n\n\n\nExample\nThe surface \\(f(x,y) = x^2 - y^2\\) has a “saddle,” as this shows:\n\nf(x,y) = x^2 - y^2\nxs = ys = range(-2, 2, length=100)\nsurface(xs, ys, f)\n\n\n\n\n\n\nExample\nAs mentioned, in plots of univariate functions a dot-to-dot algorithm is followed. For surfaces, the two dots are replaced by four points, which overdetermine a plane. Some choice is made to partition that rectangle into two triangles, and for each triangle, the \\(3\\) resulting points determine a plane, which can be suitably rendered.\nWe can see this in the default gr toolkit by forcing the surface to show just one cell, as the xs and ys below only contain \\(2\\) values:\n\nxs = [-1,1]; ys = [-1,1]\nf(x,y) = x*y\nsurface(xs, ys, f)\n\n\n\n\nCompare this to the same region, but with many cells to represent the surface:\n\nxs = ys = range(-1, 1, length=100)\nf(x,y) = x*y\nsurface(xs, ys, f)\n\n\n\n\n\n\n55.1.1 Contour plots and heatmaps\nConsider the example of latitude, longitude, and elevation data describing a surface. The following graph is generated from such data, which was retrieved from the USGS website for a given area. The grid points are chosen about every \\(150\\)m, so this is not too fine grained.\n\nSC = JSON.parse(somocon) # defined in a hidden cell\nxsₛ, ysₛ, zsₛ = [float.(SC[i]) for i in (\"xs\", \"ys\",\"zs\")]\nzzsₛ = reshape(zsₛ, (length(xsₛ), length(ysₛ)))' # reshape to matrix\nsurface(xsₛ, ysₛ, zzsₛ)\n\n\n\n\nThis shows a bit of the topography. 
If we look at the region from directly above, the graph looks different:\n\nsurface(xsₛ, ysₛ, zzsₛ, camera=(0, 90))\n\n\n\n\nThe rendering uses different colors to indicate height. A more typical graph, one that is somewhat similar to the top-down view, is a contour map.\nFor a scalar function, define a level curve as the solutions to the equation \\(f(x,y) = c\\) for a given \\(c\\). (Or more generally \\(f(\\vec{x}) = c\\) for a vector of dimension \\(2\\) or more.) Plotting a selection of level curves yields a contour graph. These are produced with contour and called as above. For example, we have:\n\ncontour(xsₛ, ysₛ, zzsₛ)\n\n\n\n\nWere one to walk along one of the contour lines, then there would be no change in elevation. The areas of greatest change in elevation - basically the hills - occur where the different contour lines are closest. In this particular area, there is a river that runs from the upper right through to the lower left and this is flanked by hills.\nThe \\(c\\) values for the levels drawn may be specified through the levels argument:\n\ncontour(xsₛ, ysₛ, zzsₛ, levels=[50,75,100, 125, 150, 175])\n\n\n\n\nThat shows the \\(50\\)m, \\(75\\)m, … contours.\nIf a fixed number of evenly spaced levels is desirable, then the nlevels argument is available.\n\ncontour(xsₛ, ysₛ, zzsₛ, nlevels = 5)\n\n\n\n\nIf a function describes the surface, then the function may be passed as the third value:\n\nf(x, y) = sin(x) - cos(y)\nxs = range(0, 2pi, length=100)\nys = range(-pi, pi, length = 100)\ncontour(xs, ys, f)\n\n\n\n\n\nExample\nAn informative graphic mixes a surface plot with a contour plot. The PyPlot package can be used to generate one, but such graphs are not readily made within the Plots framework. Here is a workaround, where the contours are generated through the Contour package. At the beginning of this section several of its methods are imported.\nThis example shows how to add a contour at a fixed level (\\(0\\) below). 
As no hidden line algorithm is used to hide the contour line if the surface were to cover it, a transparency is specified through alpha=0.5:\n\nfunction surface_contour(xs, ys, f; offset=0)\n    p = surface(xs, ys, f, legend=false, fillalpha=0.5)\n\n    ## we add to the graphic p, then plot\n    zs = [f(x,y) for x in xs, y in ys] # reverse order for use with Contour package\n    for cl in levels(contours(xs, ys, zs))\n        lvl = level(cl) # the z-value of this contour level\n        for line in lines(cl)\n            _xs, _ys = coordinates(line) # coordinates of this line segment\n            _zs = offset .+ (0 .* _xs)\n            plot!(p, _xs, _ys, _zs, alpha=0.5) # add curve on x-y plane\n        end\n    end\n    p\nend\n\nxs = ys = range(-pi, stop=pi, length=100)\nf(x,y) = 2 + sin(x) - cos(y)\n\nsurface_contour(xs, ys, f)\n\n\n\n\nWe can see that at the minimum of the surface, the contour lines are nested closed loops with decreasing area.\n\n\nExample\nThe figure shows a weather map from \\(1943\\) with contour lines based on atmospheric pressure. These are also known as isolines.\n\n\n\nImage from weather.gov of a contour map showing atmospheric pressures from January 22, 1943 in Rapid City, South Dakota.\n\n\nThis day is highlighted as “The most notable temperature fluctuations occurred on January 22, 1943 when temperatures rose and fell almost 50 degrees in a few minutes. This phenomenon was caused when a frontal boundary separating extremely cold Arctic air from warmer Pacific air rolled like an ocean tide along the northern and eastern slopes of the Black Hills.”\nThis frontal boundary is marked with triangles and half circles along the thicker black line. The tight spacing of the contour lines above that marked line shows a big change in pressure in a short distance.\n\n\nExample\nSea surface temperature varies with latitude and other factors, such as water depth. The following figure shows average temperatures for January 1982 around Australia. 
The filled contours allow for an easier identification of the ranges represented.\n\n\n\nImage from IRI shows mean sea surface temperature near Australia in January 1982. IRI has zoomable graphs for this measurement from 1981 to the present. The contour lines are in 2 degree Celsius increments.\n\n\n\nExample\nThe filled contour and the heatmap are figures related to a simple contour graph. The heatmap uses a color gradient to indicate the value at \\((x,y)\\):\n\nf(x,y) = exp(-(x^2 + y^2)/5) * sin(x) * cos(y)\nxs= ys = range(-pi, pi, length=100)\nheatmap(xs, ys, f)\n\n\n\n\nThe filled contour layers the contour lines onto a heatmap:\n\nf(x,y) = exp(-(x^2 + y^2)/5) * sin(x) * cos(y)\nxs= ys = range(-pi, pi, length=100)\ncontourf(xs, ys, f)\n\n\n\n\nThis function has a prominent peak and a prominent valley around the middle of the viewing window. The nested contour lines indicate this, and the color key can be used to identify which is the peak and which the valley."
},
{
"objectID": "differentiable_vector_calculus/scalar_functions.html#limits",
"href": "differentiable_vector_calculus/scalar_functions.html#limits",
"title": "55  Scalar functions",
"section": "55.2 Limits",
"text": "55.2 Limits\nThe notion of a limit for a univariate function: as \\(x\\) gets close to \\(c\\) then \\(f(x)\\) gets close to \\(L\\), needs some modification:\n\nLet \\(f: R^n \\rightarrow R\\) and \\(C\\) be a point in \\(R^n\\). Then \\(\\lim_{P \\rightarrow C}f(P) = L\\) if for every \\(\\epsilon > 0\\) there exists a \\(\\delta > 0\\) such that \\(|f(P) - L| < \\epsilon\\) whenever \\(0 < \\| P - C \\| < \\delta\\).\n\n(If \\(P=(x_1, x_2, \\dots, x_n)\\) we use \\(f(P) = f(x_1, x_2, \\dots, x_n)\\).)\nThis says, informally, for any scale about \\(L\\) there is a “ball” about \\(C\\) (not including \\(C\\)) for which the images of \\(f\\) always sit in the ball. Formally we define a ball of radius \\(r\\) about a point \\(C\\) to be all points \\(P\\) with distance between \\(P\\) and \\(C\\) less than \\(r\\). A ball is an open set. An open set is a set \\(U\\) such that for any \\(x\\) in \\(U\\), there is a radius \\(r\\) such that the ball of radius \\(r\\) about \\(x\\) is still within \\(U\\). An open set generalizes an open interval. A closed set generalizes a closed interval; a closed set is one that contains its boundary points. Boundary points are any points that can be approached in the limit by points within the set.\nIn the univariate case, it can be useful to characterize a limit at \\(x=c\\) existing if both the left and right limits exist and the two are equal. Generalizing to getting close in \\(R^n\\) leads to the intuitive idea of a limit existing in terms of paths: any continuous “path” that approaches \\(C\\) in the \\(x\\)-\\(y\\) plane has a limit along the path, and all these limits are equal. Let \\(\\gamma\\) describe the path, and \\(\\lim_{s \\rightarrow t}\\gamma(s) = C\\). Then \\(f \\circ \\gamma\\) will be a univariate function. If there is a limit, \\(L\\), then this composition will also have the same limit as \\(s \\rightarrow t\\). 
Conversely, if for every path this composition has the same limit, then \\(f\\) will have a limit.\nThe “two path corollary” is a trick to show a limit does not exist: find two paths along which the limits exist but differ; then the limit does not exist in general.\n\n55.2.1 Continuity of scalar functions\nContinuity is defined in a familiar manner: \\(f(P)\\) is continuous at \\(C\\) if \\(\\lim_{P \\rightarrow C} f(P) = f(C)\\), where we interpret \\(P \\rightarrow C\\) in the sense of a ball about \\(C\\).\nAs with univariate functions, continuity will be preserved under function addition, subtraction, multiplication, and division (provided there is no dividing by \\(0\\)). With this, all these functions are continuous everywhere and so have limits everywhere:\n\\[\nf(x,y) = \\sin(x + y), \\quad\ng(x,y,z) = x^2 + y^2 + z^2, \\quad\nh(w, x,y,z) = \\sqrt{w^2 + x^2 + y^2 + z^2}.\n\\]\nNot all functions will have a limit though. Consider \\(f(x,y) = 2x^2/(x^2+y^2)\\) and \\(C=(0,0)\\). It is not defined at \\(C\\) (dividing by \\(0\\)), but may have a limit at \\(C\\). Consider the path \\(x=0\\) (the \\(y\\)-axis) parameterized by \\(\\vec\\gamma(t) = \\langle 0, t\\rangle\\). Along this path \\((f\\circ \\vec\\gamma)(t) = 0/t^2 = 0\\) so will have a limit of \\(0\\). If the limit of \\(f\\) exists it must be \\(0\\). But, along the line \\(y=0\\) (the \\(x\\) axis) parameterized by \\(\\vec{\\gamma}(t) = \\langle t, 0 \\rangle\\), the function simplifies to \\((f\\circ\\vec\\gamma)(t)=2\\), so would have a limit of \\(2\\). As the limit along different paths is different, this function has no limit in general.\n\nExample\nIt is not enough that a limit exists along many paths to say a limit exists in general. The limit must exist along all paths and the values must agree. An example might be this function:\n\\[\nf(x,y) =\n\\begin{cases}\n(x + y)/(x-y) & x \\neq y,\\\\\n0 & x = y\n\\end{cases}\n\\]\nAt \\(\\vec{0}\\) this will not have a limit. 
However, along any line \\(y=mx\\) we have a limit. If \\(m=1\\), the function is constantly \\(0\\), and so has a limit of \\(0\\). If \\(m \\neq 1\\), then we get \\(f(x, y) = f(x, mx) = (1 + m)/(1-m)\\), a constant. So for each \\(m\\) there is a different limit. Consequently, the scalar function does not have a limit."
},
{
"objectID": "differentiable_vector_calculus/scalar_functions.html#partial-derivatives-and-the-gradient",
"href": "differentiable_vector_calculus/scalar_functions.html#partial-derivatives-and-the-gradient",
"title": "55  Scalar functions",
"section": "55.3 Partial derivatives and the gradient",
"text": "55.3 Partial derivatives and the gradient\nThe behaviour of a scalar function along a path is described mathematically through composition. If \\(\\vec\\gamma(t)\\) is a path in \\(R^n\\), then the composition \\(f \\circ \\vec\\gamma\\) will be a univariate function. When \\(n=2\\), we can visualize this composition directly, or as a \\(3\\)-D path on the surface given by \\(\\vec{r}(t) = \\langle \\gamma_1(t), \\gamma_2(t), \\dots, \\gamma_n(t), (f \\circ \\vec\\gamma)(t) \\rangle\\).\n\nf₁(x,y) = 2 - x^2 - 3y^2\nf₁(x) = f₁(x...)\nγ₁(t) = 2 * [t, -t^2] # use \\gamma[tab]\nx₁s = y₁s = range(-1, 1, length=100)\nsurface(x₁s, y₁s, f₁)\nr3₁(t) = [γ₁(t)..., f₁(γ₁(t))] # to plot the path on the surface\nplot_parametric!(0..1/2, r3₁, linewidth=5, color=:black)\n\nr2₁(t) = [γ₁(t)..., 0]\nplot_parametric!(0..1/2, r2₁, linewidth=5, color=:black) # in the $x$-$y$ plane\n\n\n\n\nThe vector-valued function r3(t) = [γ(t)..., f(γ(t))] takes the \\(2\\)-dimensional path specified by \\(\\vec\\gamma(t)\\) and adds a third, \\(z\\), direction by composing the position with f. In this way, a \\(2\\)-D path is visualized with a \\(3\\)-D path. This viewpoint can be reversed, as desired.\nHowever, the composition, \\(f\\circ\\vec\\gamma\\), is a univariate function, so this can also be visualized by\n\nplot(f₁ ∘ γ₁, 0, 1/2)\n\n\n\n\nWith this graph, we might be led to ask about derivatives or rates of change. For this example, we can algebraically compute the composition:\n\\[\n(f \\circ \\vec\\gamma)(t) = 2 - (2t)^2 - 3(-2t^2)^2 = 2 - 4t^2 - 12t^4\n\\]\nFrom here we clearly have \\((f \\circ \\vec\\gamma)'(t) = -8t - 48t^3\\). 
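The hand computation can be checked numerically with automatic differentiation; a sketch, assuming the ForwardDiff package these notes load elsewhere:

```julia
using ForwardDiff

f(x, y) = 2 - x^2 - 3y^2        # the scalar function above
γ(t) = 2 * [t, -t^2]            # the path above
g(t) = f(γ(t)...)               # the univariate composition f ∘ γ

# compare with the analytic derivative of 2 - 4t^2 - 12t^4, namely -8t - 48t^3
ForwardDiff.derivative(g, 1/4)  # -8(1/4) - 48(1/4)^3 = -2.75
```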
But could this be computed in terms of a “derivative” of \\(f\\) and the derivative of \\(\\vec\\gamma\\)?\nBefore answering this, we discuss directional derivatives along the simplified paths \\(\\vec\\gamma_x(t) = \\langle t, c\\rangle\\) or \\(\\vec\\gamma_y(t) = \\langle c, t\\rangle\\).\nIf we compose \\(f \\circ \\vec\\gamma_x\\), we can visualize this as a curve on the surface from \\(f\\) that moves in the \\(x\\)-\\(y\\) plane along the line \\(y=c\\). The derivative of this curve will satisfy:\n\\[\n\\begin{align}\n(f \\circ \\vec\\gamma_x)'(x) &=\n\\lim_{t \\rightarrow x} \\frac{(f\\circ\\vec\\gamma_x)(t) - (f\\circ\\vec\\gamma_x)(x)}{t-x}\\\\\n&= \\lim_{t\\rightarrow x} \\frac{f(t, c) - f(x,c)}{t-x}\\\\\n&= \\lim_{h \\rightarrow 0} \\frac{f(x+h, c) - f(x, c)}{h}.\n\\end{align}\n\\]\nThe latter expresses this to be the derivative of the function that holds the \\(y\\) value fixed, but lets the \\(x\\) value vary. It is the rate of change in the \\(x\\) direction. There is special notation for this:\n\\[\n\\begin{align}\n\\frac{\\partial f(x,y)}{\\partial x} &=\n\\lim_{h \\rightarrow 0} \\frac{f(x+h, y) - f(x, y)}{h},\\quad\\text{and analogously}\\\\\n\\frac{\\partial f(x,y)}{\\partial y} &=\n\\lim_{h \\rightarrow 0} \\frac{f(x, y+h) - f(x, y)}{h}.\n\\end{align}\n\\]\nThese are called the partial derivatives of \\(f\\). The symbol \\(\\partial\\), read as “partial”, is reminiscent of “\\(d\\)”, but indicates the derivative is only in a given direction. 
Other notations exist for this:\n\\[\n\\frac{\\partial f}{\\partial x}, \\quad f_x, \\quad \\partial_x f,\n\\]\nand more generally, when \\(n\\) may be \\(2\\) or more,\n\\[\n\\frac{\\partial f}{\\partial x_i}, \\quad f_{x_i}, \\quad f_i, \\quad \\partial_{x_i} f, \\quad \\partial_i f.\n\\]\nThe gradient of a scalar function \\(f\\) is the vector comprised of the partial derivatives:\n\\[\n\\nabla f(x_1, x_2, \\dots, x_n) = \\langle\n\\frac{\\partial f}{\\partial x_1},\n\\frac{\\partial f}{\\partial x_2}, \\dots,\n\\frac{\\partial f}{\\partial x_n} \\rangle.\n\\]\nAs seen, the gradient is a vector-valued function, but has, also, multivariable inputs. It is a function from \\(R^n \\rightarrow R^n\\).\n\nExample\nLet \\(f(x,y) = x^2 - 2xy\\), then to compute the partials, we just treat the other variables like a constant. (This is consistent with the view that the partial derivative is just a regular derivative along a line where all other variables are constant.)\nThen\n\\[\n\\begin{align}\n\\frac{\\partial (x^2 - 2xy)}{\\partial x} &= 2x - 2y\\\\\n\\frac{\\partial (x^2 - 2xy)}{\\partial y} &= 0 - 2x = -2x.\n\\end{align}\n\\]\nCombining, gives \\(\\nabla{f} = \\langle 2x -2y, -2x \\rangle\\).\nIf \\(g(x,y,z) = \\sin(x) + z\\cos(y)\\), then\n\\[\n\\begin{align}\n\\frac{\\partial g }{\\partial x} &= \\cos(x) + 0 = \\cos(x),\\\\\n\\frac{\\partial g }{\\partial y} &= 0 + z(-\\sin(y)) = -z\\sin(y),\\\\\n\\frac{\\partial g }{\\partial z} &= 0 + \\cos(y) = \\cos(y).\n\\end{align}\n\\]\nCombining, gives \\(\\nabla{g} = \\langle \\cos(x), -z\\sin(y), \\cos(y) \\rangle\\).\n\n\n55.3.1 Finding partial derivatives in Julia\nTwo different methods are described, one for working with functions, the other symbolic expressions. This mirrors our treatment for vector-valued functions, where ForwardDiff.derivative was used for functions, and SymPys diff function for symbolic expressions.\nSuppose, we consider \\(f(x,y) = x^2 - 2xy\\). 
We may define it with Julia through:\n\nf₂(x,y) = x^2 - 2x*y\nf₂(v) = f₂(v...) # to handle vectors. Need not be defined each time\n\nf₂ (generic function with 2 methods)\n\n\nThe numeric gradient at a point can be found from the function ForwardDiff.gradient through:\n\npt₂ = [1, 2]\nForwardDiff.gradient(f₂, pt₂) # uses the f(v) call above\n\n2-element Vector{Int64}:\n -2\n -2\n\n\nThis, of course, matches the computation above, where \\(\\nabla f = \\langle 2x -2y, -2x \\rangle\\), so at \\((1,2)\\) is \\((-2, -2)\\), as a point in \\(R^2\\).\nThe ForwardDiff.gradient function expects a function that accepts a vector of values, so the method for f(v) is needed for the computation.\nTo go from a function that takes a point to a function of that point, we have the following definition. This takes advantage of Julias multiple dispatch to add a new method for the gradient generic. This is done in the CalculusWithJulia package along the lines of:\nForwardDiff.gradient(f::Function) = x -> ForwardDiff.gradient(f, x)\nIt works as follows, where a vector of values is passed in for the point in question:\n\ngradient(f₂)([1,2]), gradient(f₂)([3,4])\n\n([-2, -2], [-2, -6])\n\n\nThis expects a point or vector for its argument, and not the expanded values. Were that desired, something like this would work:\nForwardDiff.gradient(f::Function) = (x, xs...) 
 -> ForwardDiff.gradient(f, vcat(x, xs...))\n\ngradient(f₂)([1,2]), gradient(f₂)(3,4)\n\n([-2, -2], [-2, -6])\n\n\nFrom the gradient, finding the partial derivatives involves extraction of the corresponding component.\nFor example, were it desirable, this function could be used to find the partial in \\(x\\) for some constant \\(y\\):\n\npartial_x(f, y) = x -> ForwardDiff.gradient(f,[x,y])[1] # first component of gradient\n\npartial_x (generic function with 1 method)\n\n\nAnother alternative would be to hold one variable constant, and use the derivative function, as in:\n\npartial_x(f, y) = x -> ForwardDiff.derivative(u -> f(u,y), x)\n\npartial_x (generic function with 1 method)\n\n\n\n\n\n\n\nNote\n\n\n\nFor vector-valued functions, we can override the syntax ' using Base.adjoint, as ' is treated as a postfix operator in Julia for the adjoint operation. The symbol \\\\nabla is also available in Julia, but it is not an operator, so cant be used as mathematically written ∇f (this could be used as a name though). In CalculusWithJulia a definition is made so essentially ∇(f) = x -> ForwardDiff.gradient(f, x). It does require parentheses to be called, as in ∇(f).\n\n\nSymbolic expressions\nThe partial derivatives are more directly found with SymPy. 
As with univariate functions, the diff function is used by simply passing in the variable in which to find the partial derivative:\n\n@syms x y\nex = x^2 - 2x*y\ndiff(ex, x)\n\n \n\\[\n2 x - 2 y\n\\]\n\n\n\nAnd evaluation:\n\ndiff(ex,x)(x=>1, y=>2)\n\n \n\\[\n-2\n\\]\n\n\n\nOr\n\ndiff(ex, y)(x=>1, y=>2)\n\n \n\\[\n-2\n\\]\n\n\n\nThe gradient would be found by combining the two:\n\n[diff(ex, x), diff(ex, y)]\n\n2-element Vector{Sym}:\n 2⋅x - 2⋅y\n -2⋅x\n\n\nThis can be simplified through broadcasting:\n\ngrad_ex = diff.(ex, [x,y])\n\n2-element Vector{Sym}:\n 2⋅x - 2⋅y\n -2⋅x\n\n\nTo evaluate at a point we have:\n\nsubs.(grad_ex, x=>1, y=>2)\n\n2-element Vector{Sym}:\n -2\n -2\n\n\nThe above relies on broadcasting treating the pair as a single value so the substitution is repeated for each entry of grad_ex.\nThe gradient function from CalculusWithJulia is defined to find the symbolic gradient. It uses free_symbols to specify the number and order of the variables, but that may be wrong, so they are specified below:\n\ngradient(ex, [x, y]) # [∂f/∂x, ∂f/∂y]\n\n2-element Vector{Sym}:\n 2⋅x - 2⋅y\n -2⋅x\n\n\nTo use ∇ and specify the variables, a tuple (grouping parentheses) is used:\n\n∇((ex, [x,y]))\n\n2-element Vector{Sym}:\n 2⋅x - 2⋅y\n -2⋅x\n\n\n\nIn computer science there are two related concepts, Currying and Partial application. For a function \\(f(x,y)\\), say, partial application is the process of fixing one of the variables, producing a new function of fewer variables. For example, fixing \\(y=c\\), we get a new function (of just \\(x\\) and not \\((x,y)\\)) \\(g(x) = f(x,c)\\). 
In partial derivatives the partial derivative of \\(f(x,y)\\) with respect to \\(x\\) is the derivative of the function \\(g\\), as defined above.\nCurrying, is related, but technically returns a function, so we think of the curried version of \\(f\\) as a function, \\(h\\), which takes \\(x\\) and returns the function \\(y \\rightarrow f(x,y)\\) so that \\(h(x)(y) = f(x, y)\\).\n\n\n\n55.3.2 Visualizing the gradient\nThe gradient is not a univariate function, a simple vector-valued function, or a scalar function, but rather a vector field (which will be discussed later). For the case, \\(f: R^2 \\rightarrow R\\), the gradient will be a function which takes a point \\((x,y)\\) and returns a vector , \\(\\langle \\partial{f}/\\partial{x}(x,y), \\partial{f}/\\partial{y}(x,y) \\rangle\\). We can visualize this by plotting a vector at several points on a grid. This task is made easier with a function like the following, which handles the task of vectorizing the values. It is provided within the CalculusWithJulia package:\nfunction vectorfieldplot!(V; xlim=(-5,5), ylim=(-5,5), nx=10, ny=10, kwargs...)\n\n dx, dy = (xlim[2]-xlim[1])/nx, (ylim[2]-ylim[1])/ny\n xs, ys = xlim[1]:dx:xlim[2], ylim[1]:dy:ylim[2]\n\n ps = [[x,y] for x in xs for y in ys]\n vs = V.(ps)\n λ = 0.9 * minimum([u/maximum(getindex.(vs,i)) for (i,u) in enumerate((dx,dy))])\n\n quiver!(unzip(ps)..., quiver=unzip(λ * vs))\n\nend\nHere we show the gradient for the scalar function \\(f(x,y) = 2 - x^2 - 3y^2\\) over the region \\([-2, 2]\\times[-2,2]\\) along with a contour plot:\n\nf(x,y) = 2 - x^2 - 3y^2\nf(v) = f(v...)\n\nxs = ys = range(-2,2, length=50)\n\np = contour(xs, ys, f, nlevels=12)\nvectorfieldplot!(p, gradient(f), xlim=(-2,2), ylim=(-2,2), nx=10, ny=10)\n\np\n\n\n\n\nThe figure suggests a potential geometric relationship between the gradient and the contour line to be explored later."
},
{
"objectID": "differentiable_vector_calculus/scalar_functions.html#differentiable",
"href": "differentiable_vector_calculus/scalar_functions.html#differentiable",
"title": "55  Scalar functions",
"section": "55.4 Differentiable",
"text": "55.4 Differentiable\nWe see here how the gradient of \\(f\\), \\(\\nabla{f} = \\langle f_{x_1}, f_{x_2}, \\dots, f_{x_n} \\rangle\\), plays a similar role as the derivative does for univariate functions.\nFirst, we consider the role of the derivative for univariate functions. The main characterization - the derivative is the slope of the line that best approximates the function at a point - is quantified by Taylors theorem. For a function \\(f\\) with a continuous second derivative:\n\\[\nf(c+h) = f(c) + f'(c)h + \\frac{1}{2} f''(\\xi) h^2,\n\\]\nfor some \\(\\xi\\) between \\(c\\) and \\(c+h\\).\nWe re-express this through:\n\\[\n(f(c+h) - f(c)) - f'(c)h =\\frac{1}{2} f''(\\xi) h^2.\n\\]\nThe right hand side is the error term between the function value at \\(c+h\\) and, in this case, the linear approximation at the same value.\nIf the assumptions are relaxed, and \\(f\\) is just assumed to be differentiable at \\(x=c\\), then only this is known:\n\\[\n(f(c+h) - f(c)) - f'(c)h = \\epsilon(h) h,\n\\]\nwhere \\(\\epsilon(h) \\rightarrow 0\\) as \\(h \\rightarrow 0\\).\nIt is this characterization of differentiable that is generalized to define when a scalar function is differentiable.\n\nDifferentiable: Let \\(f\\) be a scalar function. Then \\(f\\) is differentiable at a point \\(C\\) if the first order partial derivatives exist at \\(C\\) and for \\(\\vec{h}\\) going to \\(\\vec{0}\\):\n\\(\\|f(C + \\vec{h}) - f(C) - \\nabla{f}(C) \\cdot \\vec{h}\\| = \\mathcal{o}(\\|\\vec{h}\\|),\\)\nwhere \\(\\mathcal{o}(\\|\\vec{h}\\|)\\) means that dividing the left hand side by \\(\\|\\vec{h}\\|\\) and taking a limit as \\(\\vec{h}\\rightarrow 0\\) the limit will be \\(0\\).\n\nThe limits here are for limits of scalar functions, which means along any path going to \\(\\vec{0}\\), not just straight line paths, as are used to define the partial derivatives. 
Hidden above is an assumption that there is some open set around \\(C\\) for which \\(f\\) is defined for \\(f(C + \\vec{h})\\) when \\(C+\\vec{h}\\) is in this open set.\nThe role of the derivative in the univariate case is played by the gradient in the scalar case, where \\(f'(c)h\\) is replaced by \\(\\nabla{f}(C) \\cdot \\vec{h}\\). For the univariate case, differentiable is simply the derivative existing, but saying a scalar function is differentiable at \\(C\\) is a stronger statement than saying it has a gradient or, equivalently, that it has partial derivatives at \\(C\\), as this is assumed in the statement along with the other condition.\nLater we will see how Taylor's theorem generalizes for scalar functions and interpret the gradient geometrically, as was done for the derivative (it being the slope of the tangent line)."
},
{
"objectID": "differentiable_vector_calculus/scalar_functions.html#the-chain-rule-to-evaluate-fcircvecgamma",
"href": "differentiable_vector_calculus/scalar_functions.html#the-chain-rule-to-evaluate-fcircvecgamma",
"title": "55  Scalar functions",
"section": "55.5 The chain rule to evaluate \\(f\\circ\\vec\\gamma\\)",
"text": "55.5 The chain rule to evaluate \\(f\\circ\\vec\\gamma\\)\nIn finding a partial derivative, we restricted the surface along a curve in the \\(x\\)-\\(y\\) plane, in this case the curve \\(\\vec{\\gamma}(t)=\\langle t, c\\rangle\\). In general if we have a curve in the \\(x\\)-\\(y\\) plane, \\(\\vec{\\gamma}(t)\\), we can compose the scalar function \\(f\\) with \\(\\vec{\\gamma}\\) to create a univariate function. If the functions are “smooth” then this composed function should have a derivative, and some version of a “chain rule” should provide a means to compute the derivative in terms of the “derivative” of \\(f\\) (the gradient) and the derivative of \\(\\vec{\\gamma}\\) (\\(\\vec{\\gamma}'\\)).\n\nChain rule: Suppose \\(f\\) is differentiable at \\(C\\), and \\(\\vec{\\gamma}(t)\\) is differentiable at \\(c\\) with \\(\\vec{\\gamma}(c) = C\\). Then \\(f\\circ\\vec{\\gamma}\\) is differentiable at \\(c\\) with derivative \\(\\nabla f(\\vec{\\gamma}(c)) \\cdot \\vec{\\gamma}'(c)\\).\n\nThis is similar to the chain rule for univariate functions \\((f\\circ g)'(u) = f'(g(u)) g'(u)\\) or \\(df/dx = df/du \\cdot du/dx\\). However, when we write out in components there are more terms. For example, for \\(n=2\\) we have with \\(\\vec{\\gamma} = \\langle x(t), y(t) \\rangle\\):\n\\[\n\\frac{d(f\\circ\\vec{\\gamma})}{dt} =\n\\frac{\\partial f}{\\partial x} \\frac{dx}{dt} +\n\\frac{\\partial f}{\\partial y} \\frac{dy}{dt}.\n\\]\nThe proof is a consequence of the definition of differentiability and will be shown in more generality later.\n\nExample\nConsider the function \\(f(x,y) = 2 - x^2 - y^2\\) and the curve \\(\\vec\\gamma(t) = t\\langle \\cos(t), -\\sin(t) \\rangle\\) at \\(t=\\pi/6\\). 
We visualize this below:\n\nf₃(x,y) = 2 - x^2 - y^2\nf₃(x) = f₃(x...)\nγ₃(t) = t*[cos(t), -sin(t)]\nt0₃ = pi/6\n\n0.5235987755982988\n\n\n\nxs = ys = range(-3/2, 3/2, length=100)\nsurface(xs, ys, f₃, legend=false)\n\nr(t) = [γ₃(t)..., (f₃∘γ₃)(t)]\nplot_parametric!(0..1/2, r, linewidth=5, color=:black)\n\narrow!(r(t0₃), r'(t0₃), linewidth=5, color=:black)\n\n\n\n\nIn three dimensions, the tangent line is seen, but the univariate function \\(f \\circ \\vec\\gamma\\) looks like:\n\nplot(f₃ ∘ γ₃, 0, pi/2)\nplot!(t -> (f₃ ∘ γ₃)(t0₃) + (f₃ ∘ γ₃)'(t0₃)*(t - t0₃), 0, pi/2)\n\n\n\n\nFrom the graph, the slope of the tangent line looks to be about \\(-1\\); using the chain rule gives the exact value:\n\nForwardDiff.gradient(f₃, γ₃(t0₃)) ⋅ γ₃'(t0₃)\n\n-1.0471975511965976\n\n\nWe can compare this to taking the derivative after composition:\n\n(f₃ ∘ γ₃)'(t0₃)\n\n-1.0471975511965976\n\n\n\n\nExample\nConsider the following plot showing a hiking trail on a surface:\n\n\n\n\n\nThough here it is hard to see the trail rendered on the surface, for the hiker such rendering issues are far from the mind. Rather, questions such as what is the steepest part of the trail may come to mind.\nFor this question, we can answer it in terms of the sampled data in the lenape variable. The steepness is the change in elevation with respect to distance in the \\(x\\)-\\(y\\) direction. 
Treating latitude and longitude coordinates as describing motion in a plane (as opposed to a very big sphere), we can compute the maximum steepness:\n\nxs, ys, zs = lenape.longitude, lenape.latitude, lenape.elevation\ndzs = zs[2:end] - zs[1:end-1]\ndxs, dys = xs[2:end] - xs[1:end-1], ys[2:end] - ys[1:end-1]\ndeltas = sqrt.(dxs.^2 + dys.^2) * 69 / 1.6 * 1000 # in meters now\nglobal slopes = abs.(dzs ./ deltas) # to re-use\nm = maximum(slopes)\natand(maximum(slopes)) # in degrees due to the `d`\n\n58.377642682886105\n\n\nThis is certainly too steep for a trail, which should be at most \\(10\\) to \\(15\\) degrees or so, not \\(58\\). This is due to the inaccuracy in the measurements. An average might be better:\n\nimport Statistics: mean\natand(mean(slopes))\n\n8.817002448325248\n\n\nThis seems about right for a generally uphill trail section, as this is.\nIn the above example, the data is given in terms of a sample, not a functional representation. Suppose instead the surface was generated by \\(f\\) and the path - in the \\(x\\)-\\(y\\) plane - by \\(\\gamma\\). Then we could estimate the maximum and average steepness by a process like this:\n\nf₄(x,y) = 2 - x^2 - y^2\nf₄(x) = f₄(x...)\nγ₄(t) = t*[cos(t), -sin(t)]\n\nγ₄ (generic function with 1 method)\n\n\n\nxs = ys = range(-3/2, 3/2, length=100)\n\nsurface(xs, ys, f₄, legend=false)\nr(t) = [γ₄(t)..., (f₄ ∘ γ₄)(t)]\nplot_parametric!(0..1/2, r, linewidth=5, color=:black)\n\n\n\n\n\nplot(f₄ ∘ γ₄, 0, pi/2)\nslope(t) = abs((f₄ ∘ γ₄)'(t))\n\n1/(pi/2 - 0) * quadgk(t -> atand(slope(t)), 0, pi/2)[1] # the average\n\n50.585772642502285\n\n\nThe average is about \\(50\\) degrees. As for the maximum slope:\n\ncps = find_zeros(slope, 0, pi/2) # critical points\n\nappend!(cps, (0, pi/2)) # add end points\nunique!(cps)\n\nM, i = findmax(slope.(cps)) # max, index\n\ncps[i], slope(cps[i])\n\n(1.5707963267948966, 3.1415926535897927)\n\n\nThe maximum slope occurs at an endpoint."
},
{
"objectID": "differentiable_vector_calculus/scalar_functions.html#directional-derivatives",
"href": "differentiable_vector_calculus/scalar_functions.html#directional-derivatives",
"title": "55  Scalar functions",
"section": "55.6 Directional Derivatives",
"text": "55.6 Directional Derivatives\nThe last example, how steep is the direction we are walking, is a question that can be asked when walking in a straight line in the \\(x\\)-\\(y\\) plane. The answer has a simplified answer:\nLet \\(\\vec\\gamma(t) = C + t \\langle a, b \\rangle\\) be a line that goes through the point \\(C\\) parallel, or in the direction of, to \\(\\vec{v} = \\langle a , b \\rangle\\).\nThen the function \\(f \\circ \\vec\\gamma(t)\\) will have a derivative when \\(f\\) is differentiable and by the chain rule will be:\n\\[\n(f\\circ\\vec\\gamma)'(\\vec\\gamma(t)) = \\nabla{f}(\\vec\\gamma(t)) \\cdot \\vec\\gamma'(t) =\n\\nabla{f}(\\vec\\gamma(t)) \\cdot \\langle a, b\\rangle =\n\\vec{v} \\cdot \\nabla{f}(\\vec\\gamma(t)).\n\\]\nAt \\(t=0\\), we see that \\((f\\circ\\vec\\gamma)'(C) = \\nabla{f}(C)\\cdot \\vec{v}\\).\nThis defines the directional derivative at \\(C\\) in the direction \\(\\vec{v}\\):\n\\[\n\\text{Directional derivative} = \\nabla_{\\vec{v}}(f) = \\nabla{f} \\cdot \\vec{v}.\n\\]\nIf \\(\\vec{v}\\) is a unit vector, then the value of the directional derivative is the rate of increase in \\(f\\) in the direction of \\(\\vec{v}\\).\nThis is a natural generalization of the partial derivatives, which, in two dimensions, are the directional derivative in the \\(x\\) direction and the directional derivative in the \\(y\\) direction.\nThe following figure shows \\(C = (1/2, -1/2)\\) and the two curves. Planes are added, as it can be easiest to visualize these curves as the intersection of the surface generated by \\(f\\) and the vertical planes \\(x=C_x\\) and \\(y=C_y\\)\n\n\n\n\n\nWe can then visualize the directional derivative by a plane through \\(C\\) in the direction \\(\\vec{v}\\). 
Here we take \\(C=(1/2, -1/2)\\), as before, and \\(\\vec{v} = \\langle 1, 1\\rangle\\):\n\n\n\n\n\nIn this figure, we see that the directional derivative appears to be \\(0\\), unlike the partial derivatives in \\(x\\) and \\(y\\), which are negative and positive, respectively.\n\nExample\nLet \\(f(x,y) = \\sin(x+2y)\\) and \\(\\vec{v} = \\langle 2, 1\\rangle\\). The directional derivative of \\(f\\) in the direction of \\(\\vec{v}\\) at \\((x,y)\\) is:\n\\[\n\\nabla{f}\\cdot \\frac{\\vec{v}}{\\|\\vec{v}\\|} = \\langle \\cos(x + 2y), 2\\cos(x + 2y)\\rangle \\cdot \\frac{\\langle 2, 1 \\rangle}{\\sqrt{5}} = \\frac{4}{\\sqrt{5}} \\cos(x + 2y).\n\\]\n\n\nExample\nSuppose \\(f(x,y)\\) describes a surface, and \\(\\vec\\gamma(t)\\) parameterizes a path in the \\(x\\)-\\(y\\) plane. Then the vector valued function \\(\\vec{r}(t) = \\langle \\vec\\gamma_1(t), \\vec\\gamma_2(t), (f\\circ\\vec\\gamma)(t)\\rangle\\) describes a path on the surface. The maximum steepness of this path is found by maximizing the slope of the directional derivative in the direction of the tangent line. This would be the function of \\(t\\):\n\\[\n\\nabla{f}(\\vec\\gamma(t)) \\cdot \\vec{T}(t),\n\\]\nwhere \\(\\vec{T}(t) = \\vec\\gamma'(t)/\\|\\vec\\gamma'(t)\\|\\) is the unit tangent vector to \\(\\vec\\gamma\\).\nLet \\(f(x,y) = 2 - x^2 - y^2\\) and \\(\\vec\\gamma(t) = (\\pi-t) \\langle \\cos(t), \\sin(t) \\rangle\\). What is the maximum steepness?\nWe have \\(\\nabla{f} = \\langle -2x, -2y \\rangle\\) and \\(\\vec\\gamma'(t) = -\\langle \\cos(t), \\sin(t)\\rangle + (\\pi-t) \\langle -\\sin(t), \\cos(t)\\rangle\\). 
We maximize this over \\([0, \\pi]\\):\n\nf(x,y) = 2 - x^2 - y^2\nf(v) = f(v...)\ngamma(t) = (pi-t) * [cos(t), sin(t)]\ndd(t) = gradient(f)(gamma(t)) ⋅ gamma'(t)\n\ncps = find_zeros(dd, 0, pi)\nunique!(append!(cps, (0, pi))) # add endpoints\nM,i = findmax(dd.(cps))\nM\n\n6.283185307179586\n\n\n\n\nExample: The gradient indicates the direction of steepest ascent\nConsider this figure showing a surface and a level curve along with a contour line:\n\n\n\n\n\nWe have the level curve for \\(f(x,y) = c\\) represented, and a point \\((x,y, f(x,y))\\) drawn. At the point \\((x,y)\\) which sits on the level curve, we have indicated the gradient and the tangent curve to the level curve, or contour line. Worth reiterating, the gradient is not on the surface, but rather is a \\(2\\)-dimensional vector, but it does indicate a direction that can be taken on the surface. We will see that this direction indicates the path of steepest ascent.\nThe figure suggests a relationship between the gradient and the tangents to the contour lines. Let's parameterize the contour line by \\(\\vec\\gamma(t)\\), assuming such a parameterization exists, let \\(C = (x,y) = \\vec\\gamma(t)\\), for some \\(t\\), be a point on the level curve, and \\(\\vec{T} = \\vec\\gamma'(t)/\\|\\vec\\gamma'(t)\\|\\) be the tangent to the level curve at \\(C\\). Then the directional derivative at \\(C\\) in the direction of \\(\\vec{T}\\) must be \\(0\\), as along the level curve, the function \\(f\\circ \\vec\\gamma = c\\), a constant. But by the chain rule, this says:\n\\[\n0 = (c)' = (f\\circ\\vec\\gamma)'(t) = \\nabla{f}(\\vec\\gamma(t)) \\cdot \\vec\\gamma'(t)\n\\]\nThat is, the gradient is orthogonal to \\(\\vec{\\gamma}'(t)\\). As well, it is orthogonal to the tangent vector \\(\\vec{T}\\) and hence to the level curve at any point. (As the dot product is \\(0\\).)\nNow, consider a unit vector \\(\\vec{v}\\) in the direction of steepest ascent at \\(C\\). 
Since \\(\\nabla{f}(C)\\) and \\(\\vec{T}\\) are orthogonal, we can express the unit vector uniquely as \\(a\\nabla{f}(C) + b \\vec{T}\\) with \\(a^2 + b^2 = 1\\). The directional derivative is then\n\\[\n\\nabla{f} \\cdot \\vec{v} = \\nabla{f} \\cdot (a\\nabla{f}(C) + b \\vec{T}) = a \\| \\nabla{f} \\|^2 + b \\nabla{f} \\cdot \\vec{T} = a \\| \\nabla{f} \\|^2.\n\\]\nThis will be largest when \\(a=1\\) and \\(b=0\\). That is, the direction of greatest ascent is indicated by the gradient. (It is smallest when \\(a=-1\\) and \\(b=0\\), the direction opposite the gradient.)\nIn practical terms, if standing on a hill, walking in the direction of the gradient will go uphill the fastest possible way, while walking along a contour will not gain any elevation. The two directions are orthogonal."
},
{
"objectID": "differentiable_vector_calculus/scalar_functions.html#other-types-of-compositions-and-the-chain-rule",
"href": "differentiable_vector_calculus/scalar_functions.html#other-types-of-compositions-and-the-chain-rule",
"title": "55  Scalar functions",
"section": "55.7 Other types of compositions and the chain rule",
"text": "55.7 Other types of compositions and the chain rule\nThe chain rule we discussed was for a composition of \\(f:R^n \\rightarrow R\\) with \\(\\vec\\gamma:R \\rightarrow R^n\\) resulting in a function \\(f\\circ\\vec\\gamma:R \\rightarrow R\\). There are other possible compositions.\nFor example, suppose we have an economic model for beverage consumption based on temperature given by \\(c(T)\\). But temperature depends on geographic location, so may be modeled through a function \\(T(x,y)\\). The composition \\(c \\circ T\\) would be a function from \\(R^2 \\rightarrow R\\), so should have partial derivatives with respect to \\(x\\) and \\(y\\) which should be expressible in terms of the derivative of \\(c\\) and the partial derivatives of \\(T\\).\nConsider a different situation, say we have \\(f(x,y)\\) a scalar function, but want to consider the position in polar coordinates involving \\(r\\) and \\(\\theta\\). We can think directly of \\(F(r,\\theta) = f(r\\cdot\\cos(\\theta), r\\cdot\\sin(\\theta))\\), but more generally, we have a function \\(G(r, \\theta)\\) that is vector valued: \\(G(r,\\theta) = \\langle r\\cdot\\cos(\\theta), r\\cdot\\sin(\\theta) \\rangle\\) (\\(G:R^2 \\rightarrow R^2\\)). The composition \\(F=f\\circ G\\) is a scalar function of \\(r\\) and \\(\\theta\\) and the partial derivatives with respect to these should be expressible in terms of the partial derivatives of \\(f\\) and the partial derivatives of \\(G\\).\nFinding the derivative of a composition in terms of the individual pieces involves some form of the chain rule, which will differ depending on the exact circumstances.\n\n55.7.1 Chain rule for a univariate function composed with a scalar function\nIf \\(f(t)\\) is a univariate function and \\(G(x,y)\\) a scalar function, the \\(F(x,y) = f(G(x,y))\\) will be a scalar function and may have partial derivatives. 
If \\(f\\) and \\(G\\) are differentiable at a point \\(P\\), then\n\\[\n\\frac{\\partial F}{\\partial x} = f'(G(x,y)) \\frac{\\partial G}{\\partial x}, \\quad\n\\frac{\\partial F}{\\partial y} = f'(G(x,y)) \\frac{\\partial G}{\\partial y},\n\\]\nand\n\\[\n\\nabla{F} = \\nabla{f \\circ G} = f'(G(x,y)) \\nabla{G}(x,y).\n\\]\nThe result is an immediate application of the univariate chain rule, when the partial functions are considered.\n\nExample\nImagine a scenario where sales of some commodity (say ice) depend on the temperature which in turn depends on location. Formally, we might have functions \\(S(T)\\) and \\(T(x,y)\\) and then sales would be the composition \\(S(T(x,y))\\). How might sales go up or down if one moved west, or one moved in the northwest direction? These would be directional derivatives, answered by \\(\\nabla{S}\\cdot \\hat{v}\\), where \\(\\hat{v}\\) is a unit vector in the given direction. Of importance would be to compute \\(\\nabla{S}\\), which might best be done through the chain rule.\nFor example, if \\(S(T) = \\exp((T - 70)/10)\\) and \\(T(x,y) = (1-x^2)\\cdot y\\), the gradient of \\(S(T(x,y))\\) would be given by:\n\\[\nS'(T(x,y)) \\nabla{T}(x,y) = (S(T(x,y))/10) \\langle -2xy, 1-x^2 \\rangle.\n\\]\n\n\n\n55.7.2 Chain rule for a scalar function, \\(f\\), composed with a function \\(G: R^m \\rightarrow R^n\\).\nIf \\(G(u_1, \\dots, u_m) = \\langle G_1, G_2,\\dots, G_n\\rangle\\) is a function of \\(m\\) inputs that returns \\(n\\) outputs we may view it as \\(G: R^m \\rightarrow R^n\\). The composition with a scalar function \\(f(v_1, v_2, \\dots, v_n)=z\\) from \\(R^n \\rightarrow R\\) creates a scalar function from \\(R^m \\rightarrow R\\), so the question of partial derivatives is of interest. 
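Before stating the general rule, a numeric check can make it concrete. A sketch in plain Julia (central differences; the polar-coordinate \\(G\\) and \\(f(x,y)=x^2+y^2\\) are choices made for illustration) confirms the composition has the expected partial derivatives:

```julia
# Illustration: f(x,y) = x² + y² composed with polar coordinates G(r,θ)
f(x, y) = x^2 + y^2
G(r, θ) = (r*cos(θ), r*sin(θ))
F(r, θ) = f(G(r, θ)...)          # the composition R² → R; here r² exactly

r0, θ0, δ = 2.0, pi/3, 1e-6
Fr = (F(r0 + δ, θ0) - F(r0 - δ, θ0)) / (2δ)   # ∂F/∂r ≈ 2r = 4
Fθ = (F(r0, θ0 + δ) - F(r0, θ0 - δ)) / (2δ)   # ∂F/∂θ ≈ 0
println((Fr, Fθ))
```

This same polar-coordinate composition is worked out symbolically with the chain rule in the example below.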
We have:\n\\[\n\\frac{\\partial (f \\circ G)}{\\partial u_i} =\n\\frac{\\partial f}{\\partial v_1} \\frac{\\partial G_1}{\\partial u_i} +\n\\frac{\\partial f}{\\partial v_2} \\frac{\\partial G_2}{\\partial u_i} + \\dots +\n\\frac{\\partial f}{\\partial v_n} \\frac{\\partial G_n}{\\partial u_i}.\n\\]\nThe gradient is then:\n\\[\n\\nabla(f\\circ G) =\n\\frac{\\partial f}{\\partial v_1} \\nabla{G_1} +\n\\frac{\\partial f}{\\partial v_2} \\nabla{G_2} + \\dots +\n\\frac{\\partial f}{\\partial v_n} \\nabla{G_n} = \\nabla(f) \\cdot \\langle \\nabla{G_1}, \\nabla{G_2}, \\dots, \\nabla{G_n} \\rangle,\n\\]\nThe last expression is a suggestion, as it is an abuse of previously used notation: the dot product isn't between vectors of the same type, as the rightmost vector is representing a vector of vectors. The Jacobian matrix combines these vectors into a rectangular array, though with the vectors written as row vectors. If \\(G: R^m \\rightarrow R^n\\), then the Jacobian is the \\(n \\times m\\) matrix with \\((i,j)\\) entry given by \\(\\partial G_i/\\partial u_j\\):\n\\[\nJ = \\left[\n\\begin{align}\n\\frac{\\partial G_1}{\\partial u_1} & \\frac{\\partial G_1}{\\partial u_2} & \\dots & \\frac{\\partial G_1}{\\partial u_m}\\\\\n\\frac{\\partial G_2}{\\partial u_1} & \\frac{\\partial G_2}{\\partial u_2} & \\dots & \\frac{\\partial G_2}{\\partial u_m}\\\\\n& \\vdots & \\\\\n\\frac{\\partial G_n}{\\partial u_1} & \\frac{\\partial G_n}{\\partial u_2} & \\dots & \\frac{\\partial G_n}{\\partial u_m}\n\\end{align}\n\\right].\n\\]\nWith this notation and matrix multiplication, we have \\((\\nabla(f\\circ G))^t = \\nabla(f)^t J\\).\n(Later, we will see that the chain rule in general has a familiar form using matrices, not vectors, which will avoid the need for a transpose.)\n\nExample\nLet \\(f(x,y) = x^2 + y^2\\) be a scalar function. If \\(G(r, \\theta) = \\langle r\\cos(\\theta), r\\sin(\\theta) \\rangle\\), then after simplification, we have \\((f \\circ G)(r, \\theta) = r^2\\). 
Clearly then \\(\\partial(f\\circ G)/\\partial r = 2r\\) and \\(\\partial(f\\circ G)/\\partial \\theta = 0\\).\nWere this computed through the chain rule, we have:\n\\[\n\\begin{align}\n\\nabla G_1 &= \\langle \\frac{\\partial r\\cos(\\theta)}{\\partial r}, \\frac{\\partial r\\cos(\\theta)}{\\partial \\theta} \\rangle=\n\\langle \\cos(\\theta), -r \\sin(\\theta) \\rangle,\\\\\n\\nabla G_2 &= \\langle \\frac{\\partial r\\sin(\\theta)}{\\partial r}, \\frac{\\partial r\\sin(\\theta)}{\\partial \\theta} \\rangle=\n\\langle \\sin(\\theta), r \\cos(\\theta) \\rangle.\n\\end{align}\n\\]\nWe have \\(\\partial f/\\partial x = 2x\\) and \\(\\partial f/\\partial y = 2y\\), which at \\(G\\) are \\(2r\\cos(\\theta)\\) and \\(2r\\sin(\\theta)\\), so by the chain rule, we should have\n\\[\n\\begin{align}\n\\frac{\\partial (f\\circ G)}{\\partial r} &=\n\\frac{\\partial{f}}{\\partial{x}}\\frac{\\partial G_1}{\\partial r} +\n\\frac{\\partial{f}}{\\partial{y}}\\frac{\\partial G_2}{\\partial r} =\n2r\\cos(\\theta) \\cos(\\theta) + 2r\\sin(\\theta) \\sin(\\theta) =\n2r (\\cos^2(\\theta) + \\sin^2(\\theta)) = 2r, \\\\\n\\frac{\\partial (f\\circ G)}{\\partial \\theta} &=\n\\frac{\\partial f}{\\partial x}\\frac{\\partial G_1}{\\partial \\theta} +\n\\frac{\\partial f}{\\partial y}\\frac{\\partial G_2}{\\partial \\theta} =\n2r\\cos(\\theta)(-r\\sin(\\theta)) + 2r\\sin(\\theta)(r\\cos(\\theta)) = 0.\n\\end{align}\n\\]"
},
{
"objectID": "differentiable_vector_calculus/scalar_functions.html#higher-order-partial-derivatives",
"href": "differentiable_vector_calculus/scalar_functions.html#higher-order-partial-derivatives",
"title": "55  Scalar functions",
"section": "55.8 Higher order partial derivatives",
"text": "55.8 Higher order partial derivatives\nIf \\(f:R^n \\rightarrow R\\), the \\(\\partial f/\\partial x_i\\) takes \\(R^n \\rightarrow R\\) too, so may also have a partial derivative.\nConsider the case \\(f: R^2 \\rightarrow R\\), then there are \\(4\\) possible partial derivatives of order 2: partial in \\(x\\) then \\(x\\), partial in \\(x\\) then \\(y\\), partial in \\(y\\) and then \\(x\\), and, finally, partial in \\(y\\) and then \\(y\\).\nThe notation for the partial in \\(y\\) of the partial in \\(x\\) is:\n\\[\n\\frac{\\partial^2 f}{\\partial{y}\\partial{x}} = \\frac{\\partial{\\frac{\\partial{f}}{\\partial{x}}}}{\\partial{y}} = \\frac{\\partial f_x}{\\partial{y}} = f_{xy}.\n\\]\nThe placement of \\(x\\) and \\(y\\) indicating the order is different in the two notations.\nWe can compute these for an example easily enough:\n\n@syms x y\nf(x, y) = exp(x) * cos(y)\nex = f(x,y)\ndiff(ex, x, x), diff(ex, x, y), diff(ex, y, x), diff(ex, y, y)\n\n(exp(x)*cos(y), -exp(x)*sin(y), -exp(x)*sin(y), -exp(x)*cos(y))\n\n\nIn SymPy the variable to differentiate by is taken from left to right, so diff(ex, x, y, x) would first take the partial in \\(x\\), then \\(y\\), and finally \\(x\\).\nWe see that diff(ex, x, y) and diff(ex, y, x) are identical. This is not a coincidence, as by Schwarzs Theorem (also known as Clairauts theorem) this will always be the case under typical assumptions:\n\nTheorem on mixed partials. If the mixed partials \\(\\partial^2 f/\\partial x \\partial y\\) and \\(\\partial^2 f/\\partial y \\partial x\\) exist and are continuous, then they are equal.\n\nFor higher order mixed partials, something similar to Schwarzs theorem still holds. Say \\(f:R^n \\rightarrow R\\) is \\(C^k\\) if \\(f\\) is continuous and all partial derivatives of order \\(j \\leq k\\) are continous. 
If \\(f\\) is \\(C^k\\), and \\(k=k_1+k_2+\\cdots+k_n\\) (\\(k_i \\geq 0\\)) then\n\\[\n\\frac{\\partial^k f}{\\partial x_1^{k_1} \\partial x_2^{k_2} \\cdots \\partial x_n^{k_n}},\n\\]\nis uniquely defined. That is, which order the partial derivatives are taken is unimportant if the function is sufficiently smooth.\n\nThe Hessian matrix is the matrix of mixed partials defined (for \\(n=2\\)) by:\n\\[\nH = \\left[\n\\begin{align}\n\\frac{\\partial^2 f}{\\partial x \\partial x} & \\frac{\\partial^2 f}{\\partial x \\partial y}\\\\\n\\frac{\\partial^2 f}{\\partial y \\partial x} & \\frac{\\partial^2 f}{\\partial y \\partial y}\n\\end{align}\n\\right].\n\\]\nFor symbolic expressions, the Hessian may be computed directly in SymPy with its hessian function:\n\nex\n\n \n\\[\ne^{x} \\cos{\\left(y \\right)}\n\\]\n\n\n\n\nhessian(ex, (x, y))\n\n2×2 Matrix{Sym}:\n exp(x)*cos(y) -exp(x)*sin(y)\n -exp(x)*sin(y) -exp(x)*cos(y)\n\n\nWhen the mixed partials are continuous, this will be a symmetric matrix. The Hessian matrix plays the role of the second derivative in the multivariate Taylor theorem.\nFor numeric use, ForwardDiff has a hessian function. It expects a scalar function and a point and returns the Hessian matrix. We have for \\(f(x,y) = e^x\\cos(y)\\) at the point \\((1,2)\\), the Hessian matrix is:\n\nf(x,y) = exp(x) * cos(y)\nf(v) = f(v...)\npt = [1, 2]\n\nForwardDiff.hessian(f, pt) # symmetric\n\n2×2 Matrix{Float64}:\n -1.1312 -2.47173\n -2.47173 1.1312"
},
{
"objectID": "differentiable_vector_calculus/scalar_functions.html#questions",
"href": "differentiable_vector_calculus/scalar_functions.html#questions",
"title": "55  Scalar functions",
"section": "55.9 Questions",
"text": "55.9 Questions\n\nQuestion\nConsider the graph of a function \\(z= f(x,y)\\) presented below:\n\n\n\n\n\nFrom the graph, is the value of \\(f(1/2, 1)\\) positive or negative?\n\n\n\n \n \n \n \n \n \n \n \n \n positive\n \n \n\n\n \n \n \n \n negative\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nOn which line is the function \\(0\\):\n\n\n\n \n \n \n \n \n \n \n \n \n The line \\(x=0\\)\n \n \n\n\n \n \n \n \n The line \\(y=0\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nConsider the contour plot\n\n\n\n\n\nWhat is the value of \\(f(1, 0)\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nFrom this graph, the minimum value over this region is\n\n\n\n \n \n \n \n \n \n \n \n \n is around \\((0.7, 0)\\) and with a value less than \\(-0.4\\)\n \n \n\n\n \n \n \n \n is around \\((2.0, 0)\\) and with a value less than \\(-0.4\\)\n \n \n\n\n \n \n \n \n is around \\((-2.0, 0)\\) and with a value less than \\(-0.4\\)\n \n \n\n\n \n \n \n \n is around \\((-0.7, 0)\\) and with a value less than \\(-0.4\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nFrom this graph, where is the surface steeper?\n\n\n\n \n \n \n \n \n \n \n \n \n near \\((1/4, 0)\\)\n \n \n\n\n \n \n \n \n near \\((1/2, 0)\\)\n \n \n\n\n \n \n \n \n near \\((3/4, 0)\\)\n \n \n\n\n \n \n \n \n near \\((1, 0)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the contour graph of a function below:\n\n\n\n\n\nAre there any peaks or valleys (local extrema) indicated?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes, the closed loops near \\((-1.5, 0)\\) and \\((1.5, 0)\\) will contain these\n \n \n\n\n \n \n \n \n No, the vertical lines parallel to \\(x=0\\) show this function to be flat\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nImagine hiking on this surface within this region. Could you traverse from left to right without having to go up or down?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nImagine hiking on this surface within this region. 
Could you traverse from top to bottom without having to go up or down?\n\nYes\n\nNo\n\n\nQuestion\nThe figure (taken from openstreetmap.org) shows the Stelvio Pass in Northern Italy near the Swiss border.\n\nStelvio Pass\n\nThe road through the pass (on the right) makes a series of switchbacks.\nAre these\n\nrunning essentially perpendicular to the contour lines\n\nrunning essentially parallel to the contour lines\n\n\nWhy?\n\nBy being essentially parallel, the steepness of the roadway can be kept to a passable level\n\nBy being essentially perpendicular, the road can more quickly climb up the mountain\n\n\nThe pass is at about 2700 meters. As shown towards the top and bottom of the figure the contour lines show increasing heights, and to the left and right decreasing heights. The shape of the pass would look like:\n\nA saddle-like shape, called a col or gap\n\nAn upside down bowl-like shape like the top of a mountain\n\n\n\nQuestion\nLimits of scalar functions have the same set of rules as limits of univariate functions. These include limits of constants; limits of sums, differences, and scalar multiples; limits of products; and limits of ratios. 
The latter with the provision that division by \\(0\\) does not occur at the point in question.\nUsing these, identify any points where the following limit may not exist, knowing the limits of the individual functions exist at \\(\\vec{c}\\):\n\\[\n\\lim_{\\vec{x} \\rightarrow \\vec{c}} \\frac{af(\\vec{x})g(\\vec{x}) + bh(\\vec{x})}{ci(\\vec{x})}.\n\\]\n\nWhen \\(i(\\vec{x}) = 0\\)\n\nWhen any of \\(f(\\vec{x})\\), \\(g(\\vec{x})\\), or \\(i(\\vec{x})\\) are zero\n\nThe limit exists everywhere, as the functions \\(f\\), \\(g\\), \\(h\\), and \\(i\\) have limits at \\(\\vec{c}\\) by assumption\n\n\n\nQuestion\nLet \\(f(x,y) = (x^2 - y^2) /(x^2 + y^2)\\).\nFix \\(y=0\\). What is \\(\\lim_{x \\rightarrow 0} f(x,0)\\)?\n\nFix \\(x=0\\). What is \\(\\lim_{y \\rightarrow 0} f(0, y)\\)?\n\nThe two paths technique shows a limit does not exist by finding two paths with different limits as \\(\\vec{x}\\) approaches \\(\\vec{c}\\). 
Does this apply to \\(\\lim_{\\langle x,y\\rangle \\rightarrow\\langle 0, 0 \\rangle}f(x,y)\\)?\n\nYes\n\nNo\n\n\nQuestion\nLet \\(f(x,y) = \\langle \\sin(x)\\cos(2y), \\sin(2x)\\cos(y) \\rangle\\).\nCompute \\(f_x\\)\n\n\\(\\langle \\cos(x)\\cos(2y), 2\\cos(2x)\\cos(y)\\rangle\\)\n\n\\(\\langle \\sin(x), \\sin(2x) \\rangle\\)\n\n\\(\\langle \\cos(2y), \\cos(y) \\rangle\\)\n\n\\(\\sin(x)\\cos(2y)\\)\n\n\nCompute \\(f_y\\)\n\n\\(- \\sin(2x)\\sin(y)\\)\n\n\\(\\langle -2\\sin(2y), -\\sin(y) \\rangle\\)\n\n\\(\\langle 2\\sin(x), \\sin(2x) \\rangle\\)\n\n\\(\\langle -2\\sin(x)\\sin(2y), -\\sin(2x)\\sin(y) \\rangle\\)\n\n\n\nQuestion\nLet \\(f(x,y) = x^{y\\sin(xy)}\\). Using ForwardDiff, at the point \\((1/2, 1/2)\\), compute the following.\nThe value of \\(f_x\\):\n\nThe value of \\(\\partial{f}/\\partial{y}\\):\n\n\nQuestion\nLet \\(z = f(x,y)\\) have gradient \\(\\langle f_x, f_y \\rangle\\).\nThe gradient is:\n\ntwo dimensional\n\nthree dimensional\n\n\nThe surface is:\n\ntwo dimensional\n\nthree dimensional\n\n\nThe gradient points in the direction of greatest increase of \\(f\\). 
If a person were on a hill described by \\(z=f(x,y)\\), what three dimensional vector would they follow to go the steepest way up the hill?\n\n\\(\\langle f_x, f_y, -1 \\rangle\\)\n\n\\(\\langle -f_x, -f_y, 1 \\rangle\\)\n\n\\(\\langle f_x, f_y \\rangle\\)\n\n\n\nQuestion\nThe figure shows climbers on their way to summit Mt. Everest:\n\nClimbers en route to the summit of Mt. Everest\n\nIf the surface of the mountain is given by a function \\(z=f(x,y)\\) then the climbers move along a single path parameterized, say, by \\(\\vec{\\gamma}(t) = \\langle x(t), y(t)\\rangle\\), as set up by the Sherpas.\nConsider the composition \\((f\\circ\\vec\\gamma)(t)\\).\nFor a climber with GPS coordinates \\((x,y)\\), what describes her elevation?\n\n\\(\\vec\\gamma(x,y)\\)\n\n\\((f\\circ\\vec\\gamma)(x,y)\\)\n\n\\(f(x,y)\\)\n\n\nA climber leaves base camp at \\(t_0\\). At time \\(t > t_0\\), what describes her elevation?\n\n\\(f(t)\\)\n\n\\((f\\circ\\vec\\gamma)(t)\\)\n\n\\(\\vec\\gamma(t)\\)\n\n\nWhat does the vector-valued function \\(\\vec{r}(t) = \\langle x(t), y(t), (f\\circ\\vec\\gamma)(t)\\rangle\\) describe:\n\nThe climber's gradient, pointing in the direction of greatest ascent\n\nThe three dimensional position of the climber\n\n\nIn the figure, the climbers are making a switchback, so as to avoid the steeper direct ascent. Mathematically \\(\\nabla{f}(\\vec\\gamma(t)) \\cdot \\vec\\gamma'(t)\\) describes the directional derivative that they follow. 
Using \\(\\vec{u}\\cdot\\vec{v} = \\|\\vec{u}\\|\\|\\vec{v}\\|\\cos(\\theta)\\), does this route:\n\n\n\n \n \n \n \n \n \n \n \n \n Keep \\(\\cos(\\theta)\\) as close to \\(1\\) as possible, so the slope taken is as big as possible\n \n \n\n\n \n \n \n \n Keep \\(\\cos(\\theta)\\) as close to \\(0\\) as possible, so that the climbers don't waste energy going up and down\n \n \n\n\n \n \n \n \n Keep \\(\\cos(\\theta)\\) smaller than \\(1\\), so that the slope taken is not too great\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nSuppose our climber reaches the top at time \\(t\\). What would be \\((f\\circ\\vec\\gamma)'(t)\\), assuming the derivative exists?\n\n\n\n \n \n \n \n \n \n \n \n \n It would not exist, as there would not be enough oxygen to compute it\n \n \n\n\n \n \n \n \n It would be \\(\\langle f_x, f_y\\rangle\\) and point towards the sky, the direction of greatest ascent\n \n \n\n\n \n \n \n \n It would be \\(0\\), as the top would be maximum for \\(f\\circ\\vec\\gamma\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\nQuestion\nBuilding sustainable hiking trails involves proper water management. Two rules of thumb are 1) the trail should not be steeper than 10 degrees 2) the outward slope (in the steepest downhill direction) should be around 5%. (A trail tread is not flat, but rather sloped downward, similar to the crown on a road, so that water will run to the downhill side of the tread, not along the tread, which would cause erosion. In the best possible world, the outslope will exceed the downward slope.)\nSuppose a trail height is described parametrically by a composition \\((f \\circ \\vec\\gamma)(t)\\), where \\(\\vec\\gamma(t) = \\langle x(t),y(t)\\rangle\\). The vector \\(\\vec{T}(t) = \\langle x(t), y(t), \\nabla{f}(\\vec\\gamma(t)) \\rangle\\) describes the tangent to the trail at a point (\\(\\vec\\gamma(t)\\)). 
Let \\(\\hat{T}(t)\\) be the unit tangent, and \\(\\hat{P}(t)\\) be a unit vector in the direction of the projection of \\(\\vec{T}\\) onto the \\(x\\)-\\(y\\) plane. (Make the third component of \\(\\vec{T}\\) \\(0\\), and then form a unit vector from that.)\nWhat expression below captures point 1 that the steepness should be no more than 10 degrees (\\(\\pi/18\\) radians):\n\n\n\n \n \n \n \n \n \n \n \n \n \\(|\\hat{T} \\cdot \\hat{P}| \\leq \\pi/18\\)\n \n \n\n\n \n \n \n \n \\(|\\hat{T} \\cdot \\hat{P}| \\leq \\sin(\\pi/18)\\)\n \n \n\n\n \n \n \n \n \\(|\\hat{T} \\cdot \\hat{P}| \\leq \\cos(π/18)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe normal to the surface \\(z=f(x,y)\\) is not the normal to the trail tread. Suppose \\(\\vec{N}(t)\\) is a function that returns this. At the same point \\(\\vec\\gamma(t)\\), let \\(\\vec{M} = \\langle -f_x, -f_y, 0\\rangle\\) be a vector in 3 dimensions pointing downhill. Let “hats” indicate unit vectors. The outward slope is \\(\\pi/2\\) minus the angle between \\(\\hat{N}\\) and \\(\\hat{M}\\). What condition will ensure this angle is \\(5\\) degrees (\\(\\pi/36\\) radians)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(|\\hat{N} \\cdot \\hat{M}| \\leq \\sin(\\pi/2 - \\pi/18)\\)\n \n \n\n\n \n \n \n \n \\(|\\hat{N} \\cdot \\hat{M}| \\leq \\pi/2 - \\pi/18\\)\n \n \n\n\n \n \n \n \n \\(|\\hat{N} \\cdot \\hat{M}| \\leq \\cos(\\pi/2 - π/36)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x,y) = x^2 \\cdot\\cos(x - y^2)\\). Let \\(\\vec{v} = \\langle 1, 2\\rangle\\). 
Find the directional derivative in the direction of \\(\\vec{v}\\).\n\n\n\n \n \n \n \n \n \n \n \n \n \\(2 \\cos{\\left (3 \\right )} - 7 \\sin{\\left (3 \\right )}\\)\n \n \n\n\n \n \n \n \n \\(4 x^{2} y \\sin{\\left (x - y^{2} \\right )} - x^{2} \\sin{\\left (x - y^{2} \\right )} + 2 x \\cos{\\left (x - y^{2} \\right )}\\)\n \n \n\n\n \n \n \n \n \\(\\frac{\\sqrt{5}}{5}\\left(2 \\cos{\\left (3 \\right )} - 7 \\sin{\\left (3 \\right )}\\right)\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(\\vec{v}\\) be any non-zero vector. Does \\(\\nabla{f}(\\vec{x})\\cdot\\vec{v}\\) give the rate of increase of \\(f\\) per unit of distance in the direction of \\(\\vec{v}\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n No, not unless \\(\\vec{v}\\) were a unit vector\n \n \n\n\n \n \n \n \n Yes, by definition\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x,y,z) = x^4 + 2xz + 2xy + y^4\\) and \\(\\vec\\gamma(t) = \\langle t, t^2, t^3\\rangle\\). Using the chain rule, compute \\((f\\circ\\vec\\gamma)'(t)\\).\nThe value of \\(\\nabla{f}(x,y,z)\\) is\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\langle 4x^3, 2z, 2y\\rangle\\)\n \n \n\n\n \n \n \n \n \\(\\langle x^3 + 2x + 2x, 2y+ y^3, 2x\\rangle\\)\n \n \n\n\n \n \n \n \n \\(\\langle 4x^3 + 2x + 2y, 2x + 4y^3, 2x \\rangle\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe value of \\(\\vec\\gamma'(t)\\) is:\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\langle 1, 2t, 3t^2\\rangle\\)\n \n \n\n\n \n \n \n \n \\(1 + 2y + 3t^2\\)\n \n \n\n\n \n \n \n \n \\(\\langle 1,2, 3 \\rangle\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe value of \\((f\\circ\\vec\\gamma)'(t)\\) is found by:\n\n\n\n \n \n \n \n \n \n \n \n \n Taking the dot product of \\(\\nabla{f}(x,y,z)\\) and \\(\\vec\\gamma'(t)\\)\n \n \n\n\n \n \n \n \n Taking the dot product of \\(\\nabla{f}(\\vec\\gamma'(t))\\) and \\(\\vec\\gamma(t)\\)\n \n \n\n\n \n \n \n \n Taking the dot product of \\(\\nabla{f}(\\vec\\gamma(t))\\) and \\(\\vec\\gamma'(t)\\)\n \n \n\n\n \n \n \n \n 
\n \n\n\n\n\n\n\n\nQuestion\nLet \\(z = f(x,y)\\) be some unknown function,\nFrom the figure, which drawn vector is the gradient at \\((1/2, -3/4)\\)?\n\n\n\n\n\n\n\n\n \n \n \n \n \n \n \n \n \n The red one\n \n \n\n\n \n \n \n \n The green one\n \n \n\n\n \n \n \n \n The blue one\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nFrom the figure, which drawn vector is the gradient as \\((1/2, -3/4)\\)?\n\n\n\n\n\n\n\n\n \n \n \n \n \n \n \n \n \n The blue one\n \n \n\n\n \n \n \n \n The red one\n \n \n\n\n \n \n \n \n The green one\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor a function \\(f(x,y)\\) and a point (as a vector, \\(\\vec{c}\\)) we consider this derived function:\n\\[\ng(\\vec{x}) = f(\\vec{c}) + \\nabla{f}(\\vec{c}) \\cdot(\\vec{x} - \\vec{c}) + \\frac{1}{2}(\\vec{x} - \\vec{c})^tH(\\vec{c})(\\vec{x} - \\vec{c}),\n\\]\nwhere \\(H(\\vec{c})\\) is the Hessian.\nFurther, suppose \\(\\nabla{f}(\\vec{c}) = \\vec{0}\\), so in fact:\n\\[\ng(\\vec{x}) = f(\\vec{c}) + \\frac{1}{2}(\\vec{x} - \\vec{c})^tH(\\vec{c})(\\vec{x} - \\vec{c}).\n\\]\nIf \\(f\\) is a linear function at \\(\\vec{c}\\), what does this say about \\(g\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Linear means \\(H\\) is the \\(0\\) matrix, so the gradient couldn't have been \\(\\vec{0}\\)\n \n \n\n\n \n \n \n \n Linear means \\(H\\) is the \\(0\\) matrix, so \\(g(\\vec{x})\\) is the constant \\(f(\\vec{c})\\)\n \n \n\n\n \n \n \n \n Linear means \\(H\\) is linear, so \\(g(\\vec{x})\\) describes a plane\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nSuppose, \\(H\\) has the magic property that for any vector \\(\\vec{v}^tH\\vec{v} < 0\\). What does this imply:\n\n\n\n \n \n \n \n \n \n \n \n \n That \\(g(\\vec{x}) \\geq f(\\vec{c})\\)\n \n \n\n\n \n \n \n \n That \\(g(\\vec{x}) = f(\\vec{c})\\)\n \n \n\n\n \n \n \n \n That \\(g(\\vec{x}) \\leq f(\\vec{c})\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x,y) = x^3y^3\\). 
Which partial derivative is identically \\(0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\partial^4{f}/\\partial{x^4}\\)\n \n \n\n\n \n \n \n \n \\(\\partial^4{f}/\\partial{x^3}\\partial{y}\\)\n \n \n\n\n \n \n \n \n \\(\\partial^4{f}/\\partial{x^2}\\partial{y^2}\\)\n \n \n\n\n \n \n \n \n \\(\\partial^4{f}/\\partial{x^1}\\partial{y^3}\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x,y) = 3x^2 y\\).\nWhich value is greater at the point \\((1/2,2)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(f_x\\)\n \n \n\n\n \n \n \n \n \\(f_y\\)\n \n \n\n\n \n \n \n \n \\(f_{xx}\\)\n \n \n\n\n \n \n \n \n \\(f_{xy}\\)\n \n \n\n\n \n \n \n \n \\(f_{yy}\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe order of partial derivatives matters if the mixed partials are not continuous. Take\n\\[\nf(x,y) = \\frac{xy ( x^2 - y^2)}{x^2 + y^2}, \\quad f(0,0) = 0\n\\]\nUsing the definition of the derivative from a limit, we have\n\\[\n\\frac{\\partial \\frac{\\partial f}{\\partial x}}{ \\partial y} =\n\\lim_{\\Delta y \\rightarrow 0} \\lim_{\\Delta x \\rightarrow 0}\n\\frac{f(x+\\Delta x, y + \\Delta y) - f(x, y+\\Delta{y}) - f(x+\\Delta x,y) + f(x,y)}{\\Delta x \\Delta y}.\n\\]\nWhereas,\n\\[\n\\frac{\\partial \\frac{\\partial f}{\\partial y}}{ \\partial x} =\n\\lim_{\\Delta x \\rightarrow 0} \\lim_{\\Delta y \\rightarrow 0}\n\\frac{f(x+\\Delta x, y + \\Delta y) - f(x, y+\\Delta{y}) - f(x+\\Delta x,y) + f(x,y)}{\\Delta x \\Delta y}.\n\\]\nAt \\((0,0)\\) what is \\(\\frac{\\partial \\frac{\\partial f}{\\partial x}}{\\partial y}\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nAt \\((0,0)\\) what is \\(\\frac{\\partial \\frac{\\partial f}{\\partial y}}{\\partial x}\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nAway from \\((0,0)\\) the mixed partial is \\(\\frac{x^{6} + 9 x^{4} y^{2} - 9 x^{2} y^{4} - y^{6}}{x^{6} + 3 x^{4} y^{2} + 3 x^{2} y^{4} + y^{6}}\\).\n\n\n\n \n \n \n \n \n \n \n \n \n This is not continuous at \\((0,0)\\), still the limit along the two paths \\(x=0\\) and \\(y=0\\) are equivalent.\n \n \n\n\n \n \n \n \n This is not continuous at \\((0,0)\\), 
as the limit along the two paths \\(x=0\\) and \\(y=0\\) are not equivalent.\n \n \n\n\n \n \n \n \n As this is the ratio of continuous functions, it is continuous at the origin\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nKnill. Clairauts theorem is the name given to the fact that if the partial derivatives are continuous, the mixed partials are equal, \\(f_{xy} = f_{yx}\\).\nConsider the following code which computes the mixed partials for the discrete derivative:\n\n@syms x::real y::real Δ::real G()\n\nDx(f,h) = (subs(f, x=>x+h) - f)/h\nDy(f,h) = (subs(f, y=>y+h) - f)/h\n\nDy(Dx(G(x,y), Δ), Δ) - Dx(Dy(G(x,y), Δ), Δ)\n\n \n\\[\n- \\frac{- \\frac{- G{\\left(x,y \\right)} + G{\\left(x,y + Δ \\right)}}{Δ} + \\frac{- G{\\left(x + Δ,y \\right)} + G{\\left(x + Δ,y + Δ \\right)}}{Δ}}{Δ} + \\frac{- \\frac{- G{\\left(x,y \\right)} + G{\\left(x + Δ,y \\right)}}{Δ} + \\frac{- G{\\left(x,y + Δ \\right)} + G{\\left(x + Δ,y + Δ \\right)}}{Δ}}{Δ}\n\\]\n\n\n\nWhat does this simplify to?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIs continuity required for this to be true?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\n(Examples and descriptions from Krill)\nWhat equation does the function \\(f(x,y) = x^3 - 3xy^2\\) satisfy?\n\n\n\n \n \n \n \n \n \n \n \n \n The wave equation: \\(f_{tt} = f_{xx}\\); governs motion of light or sound\n \n \n\n\n \n \n \n \n The heat equation: \\(f_t = f_{xx}\\); describes diffusion of heat\n \n \n\n\n \n \n \n \n The Laplace equation: \\(f_{xx} + f_{yy} = 0\\); determines shape of a membrane\n \n \n\n\n \n \n \n \n The advection equation: \\(f_t = f_x\\); is used to model transport in a wire\n \n \n\n\n \n \n \n \n The eiconal equation: \\(f_x^2 + f_y^2 = 1\\); is used to model evolution of a wave front in optics\n \n \n\n\n \n \n \n \n The Burgers equation: \\(f_t + ff_x = f_{xx}\\); describes waves at the beach which break\n \n \n\n\n \n \n \n \n The 
KdV equation: \\(f_t + 6ff_x+ f_{xxx} = 0\\); models water waves in a narrow channel\n \n \n\n\n \n \n \n \n The Schrodinger equation: \\(f_t = (i\\hbar/(2m))f_xx\\); used to describe a quantum particle of mass \\(m\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat equation does the function \\(f(t, x) = sin(x-t) + sin(x+t)\\) satisfy?\n\n\n\n \n \n \n \n \n \n \n \n \n The wave equation: \\(f_{tt} = f_{xx}\\); governs motion of light or sound\n \n \n\n\n \n \n \n \n The heat equation: \\(f_t = f_{xx}\\); describes diffusion of heat\n \n \n\n\n \n \n \n \n The Laplace equation: \\(f_{xx} + f_{yy} = 0\\); determines shape of a membrane\n \n \n\n\n \n \n \n \n The advection equation: \\(f_t = f_x\\); is used to model transport in a wire\n \n \n\n\n \n \n \n \n The eiconal equation: \\(f_x^2 + f_y^2 = 1\\); is used to model evolution of a wave front in optics\n \n \n\n\n \n \n \n \n The Burgers equation: \\(f_t + ff_x = f_{xx}\\); describes waves at the beach which break\n \n \n\n\n \n \n \n \n The KdV equation: \\(f_t + 6ff_x+ f_{xxx} = 0\\); models water waves in a narrow channel\n \n \n\n\n \n \n \n \n The Schrodinger equation: \\(f_t = (i\\hbar/(2m))f_xx\\); used to describe a quantum particle of mass \\(m\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat equation does the function \\(f(t, x) = e^{-(x+t)^2}\\) satisfy?\n\n\n\n \n \n \n \n \n \n \n \n \n The wave equation: \\(f_{tt} = f_{xx}\\); governs motion of light or sound\n \n \n\n\n \n \n \n \n The heat equation: \\(f_t = f_{xx}\\); describes diffusion of heat\n \n \n\n\n \n \n \n \n The Laplace equation: \\(f_{xx} + f_{yy} = 0\\); determines shape of a membrane\n \n \n\n\n \n \n \n \n The advection equation: \\(f_t = f_x\\); is used to model transport in a wire\n \n \n\n\n \n \n \n \n The eiconal equation: \\(f_x^2 + f_y^2 = 1\\); is used to model evolution of a wave front in optics\n \n \n\n\n \n \n \n \n The Burgers equation: \\(f_t + ff_x = f_{xx}\\); describes waves at the beach which break\n \n \n\n\n \n \n 
\n \n The KdV equation: \\(f_t + 6ff_x+ f_{xxx} = 0\\); models water waves in a narrow channel\n \n \n\n\n \n \n \n \n The Schrodinger equation: \\(f_t = (i\\hbar/(2m))f_xx\\); used to describe a quantum particle of mass \\(m\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat equation does the function \\(f(x,y) = \\cos(x) + \\sin(y)\\) satisfy?\n\n\n\n \n \n \n \n \n \n \n \n \n The wave equation: \\(f_{tt} = f_{xx}\\); governs motion of light or sound\n \n \n\n\n \n \n \n \n The heat equation: \\(f_t = f_{xx}\\); describes diffusion of heat\n \n \n\n\n \n \n \n \n The Laplace equation: \\(f_{xx} + f_{yy} = 0\\); determines shape of a membrane\n \n \n\n\n \n \n \n \n The advection equation: \\(f_t = f_x\\); is used to model transport in a wire\n \n \n\n\n \n \n \n \n The eiconal equation: \\(f_x^2 + f_y^2 = 1\\); is used to model evolution of a wave front in optics\n \n \n\n\n \n \n \n \n The Burgers equation: \\(f_t + ff_x = f_{xx}\\); describes waves at the beach which break\n \n \n\n\n \n \n \n \n The KdV equation: \\(f_t + 6ff_x+ f_{xxx} = 0\\); models water waves in a narrow channel\n \n \n\n\n \n \n \n \n The Schrodinger equation: \\(f_t = (i\\hbar/(2m))f_xx\\); used to describe a quantum particle of mass \\(m\\)"
},
{
"objectID": "differentiable_vector_calculus/scalar_functions_applications.html",
"href": "differentiable_vector_calculus/scalar_functions_applications.html",
"title": "56  Applications with scalar functions",
"section": "",
"text": "This section uses these add-on packages:\nAnd the following from the Contour package:\nThis section presents different applications of scalar functions."
},
{
"objectID": "differentiable_vector_calculus/scalar_functions_applications.html#tangent-planes-linearization",
"href": "differentiable_vector_calculus/scalar_functions_applications.html#tangent-planes-linearization",
"title": "56  Applications with scalar functions",
"section": "56.1 Tangent planes, linearization",
"text": "56.1 Tangent planes, linearization\nConsider the case \\(f:R^2 \\rightarrow R\\). We visualize \\(z=f(x,y)\\) through a surface. At a point \\((a, b)\\), this surface, if \\(f\\) is sufficiently smooth, can be approximated by a flat area, or a plane. For example, the Northern hemisphere of the earth might be modeled simplistically by \\(z = \\sqrt{R^2 - (x^2 + y^2)}\\) for some \\(R\\) and with the origin at the earths core. The ancient view of a “flat earth,” can be more generously seen as identifying this tangent plane with the sphere. More apt for current times is the use of GPS coordinates to describe location. The difference between any two coordinates is technically a distance on a curved, nearly spherical, surface. But if the two points are reasonably close (miles, not tens of miles) and accuracy isnt of utmost importance (i.e., not used for self-driving cars), then the distance can be found from the Euclidean distance formula, \\(\\sqrt{(\\Delta\\text{latitude})^2 + (\\Delta\\text{longitude})^2}\\). That is, as if the points were on a plane, not a curved surface.\nFor the univariate case, the tangent line has many different uses. Here we see the tangent plane also does.\n\n56.1.1 Equation of the tangent plane\nThe partial derivatives have the geometric view of being the derivative of the univariate functions \\(f(\\vec\\gamma_x(t))\\) and \\(f(\\vec\\gamma_y(t))\\), where \\(\\vec\\gamma_x\\) moves just parallel to the \\(x\\) axis (e.g. \\(\\langle t + a, b\\rangle\\)) and \\(\\vec\\gamma_y\\) moves just parallel to the \\(y\\) axis. The partial derivatives then are slopes of tangent lines to each curve. The tangent plane, should it exist, should match both slopes at a given point. With this observation, we can identify it.\nConsider \\(f(\\vec\\gamma_x)\\) at a point \\((a,b)\\). The path has a tangent vector, which has “slope” \\(\\frac{\\partial f}{\\partial x}\\) 
and in the direction of the \\(x\\) axis, but not the \\(y\\) axis, as does this vector: \\(\\langle 1, 0, \\frac{\\partial f}{\\partial x} \\rangle\\). Similarly, this vector \\(\\langle 0, 1, \\frac{\\partial f}{\\partial y} \\rangle\\) describes the tangent line to \\(f(\\vec\\gamma_y)\\) at the point.\nThese two vectors will lie in the plane. The normal vector is found by their cross product:\n\n@syms f_x f_y\nn = [1, 0, f_x] × [0, 1, f_y]\n\n3-element Vector{Sym}:\n -fₓ\n -f_y\n 1\n\n\nLet \\(\\vec{x} = \\langle a, b, f(a,b)\\rangle\\). The tangent plane at \\(\\vec{x}\\) then is described by all vectors \\(\\vec{v}\\) with \\(\\vec{n}\\cdot(\\vec{v} - \\vec{x}) = 0\\). Using \\(\\vec{v} = \\langle x,y,z\\rangle\\), we have:\n\\[\n[-\\frac{\\partial f}{\\partial x}, -\\frac{\\partial f}{\\partial y}, 1] \\cdot [x-a, y-b, z - f(a,b)] = 0,\n\\]\nor,\n\\[\nz = f(a,b) + \\frac{\\partial f}{\\partial x} (x-a) + \\frac{\\partial f}{\\partial y} (y-b),\n\\]\nwhich is more compactly expressed as\n\\[\nz = f(a,b) + \\nabla(f) \\cdot \\langle x-a, y-b \\rangle.\n\\]\nThis form would then generalize to scalar functions from \\(R^n \\rightarrow R\\). 
This is consistent with the definition of \\(f\\) being differentiable, where \\(\\nabla{f}\\) plays the role of the slope in the formulas.\nThe following figure illustrates the above for the function \\(f(x,y) = 6 - x^2 - y^2\\):\n\nf(x,y) = 6 - x^2 -y^2\nf(x)= f(x...)\n\na,b = 1, -1/2\n\n\n# draw surface\nxr = 7/4\nxs = ys = range(-xr, xr, length=100)\nsurface(xs, ys, f, legend=false)\n\n# visualize tangent plane as 3d polygon\npt = [a,b]\ntplane(x) = f(pt) + gradient(f)(pt) ⋅ (x - [a,b])\n\npts = [[a-1,b-1], [a+1, b-1], [a+1, b+1], [a-1, b+1], [a-1, b-1]]\nplot!(unzip([[pt..., tplane(pt)] for pt in pts])...)\n\n# plot paths in x and y direction through (a,b)\nγ_x(t) = pt + t*[1,0]\nγ_y(t) = pt + t*[0,1]\n\nplot_parametric!((-xr-a)..(xr-a), t -> [γ_x(t)..., (f∘γ_x)(t)], linewidth=3)\nplot_parametric!((-xr-b)..(xr-b), t -> [γ_y(t)..., (f∘γ_y)(t)], linewidth=3)\n\n# draw directional derivatives in 3d and normal\npt = [a, b, f(a,b)]\nfx, fy = gradient(f)(a,b)\narrow!(pt, [1, 0, fx], linewidth=3)\narrow!(pt, [0, 1, fy], linewidth=3)\narrow!(pt, [-fx, -fy, 1], linewidth=3) # normal\n\n# draw point in base, x-y, plane\npt = [a, b, 0]\nscatter!(unzip([pt])...)\narrow!(pt, [1,0,0], linestyle=:dash)\narrow!(pt, [0,1,0], linestyle=:dash)\n\n\n\n\n\nAlternate forms\nThe equation for the tangent plane is often expressed in a more explicit form. For \\(n=2\\), if we set \\(dx = x-a\\) and \\(dy=y-a\\), then the equation for the plane becomes:\n\\[\nf(a,b) + \\frac{\\partial f}{\\partial x} dx + \\frac{\\partial f}{\\partial y} dy,\n\\]\nwhich is a common form for the equation, though possibly confusing, as \\(\\partial x\\) and \\(dx\\) need to be distinguished. For \\(n > 2\\), additional terms follow this pattern. This explicit form is helpful when doing calculations by hand, but much less so when working on the computer, say with Julia, as the representations using vectors (or matrices) can be readily implemented and their representation much closer to the formulas. 
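As a quick numeric check of the vector-based formula (a sketch reusing the surface \\(f(x,y) = 6 - x^2 - y^2\\) from the figure above; the point and step sizes are chosen arbitrarily), the error of the tangent-plane approximation should decay quadratically in the distance from the point of tangency:

```julia
using ForwardDiff, LinearAlgebra

# tangent plane to z = f(x,y) at `pt`, returned as a function of a vector
function tplane(f, pt)
    ∇f = ForwardDiff.gradient(f, pt)
    x -> f(pt) + ∇f ⋅ (x - pt)
end

f(v) = 6 - v[1]^2 - v[2]^2
pt = [1.0, -1/2]
T = tplane(f, pt)

# |f - T| at distance h from pt shrinks like h^2
err(h) = abs(f(pt .+ h) - T(pt .+ h))
err(0.1) / err(0.01)   # ≈ 100: shrinking h tenfold shrinks the error a hundredfold
```

Here the quadratic decay is exact, as this \\(f\\) is itself quadratic; for a general smooth \\(f\\) it holds as \\(h \\rightarrow 0\\).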
For example, consider these two possible functions to find the tangent plane (returned as a function) at a point in \\(2\\) dimensions\n\nfunction tangent_plane_1st_crack(f, pt)\n fx, fy = ForwardDiff.gradient(f, pt)\n x -> f(x...) + fx * (x[1]-pt[1]) + fy * (x[2]-pt[2])\nend\n\ntangent_plane_1st_crack (generic function with 1 method)\n\n\nIt isnt so bad, but as written, we specialized to the number of dimensions, used indexing, and with additional dimensions, it clearly would get tedious to generalize. Using vectors, we might have:\n\nfunction tangent_plane(f, pt)\n ∇f = ForwardDiff.gradient(f, pt) # using a variable ∇f\n x -> f(pt) + ∇f ⋅ (x - pt)\nend\n\ntangent_plane (generic function with 1 method)\n\n\nThis is much more like the compact formula and able to handle higher dimensions without rewriting.\n\n\n\n56.1.2 Tangent plane for level curves\nConsider the surface described by \\(f(x,y,z) = c\\), a constant. This is more general than surfaces described by \\(z = f(x,y)\\). The concept of a tangent plane should still be applicable though. Suppose, \\(\\vec{\\gamma}(t)\\) is a curve in the \\(x-y-z\\) plane, then we have \\((f\\circ\\vec\\gamma)(t)\\) is a curve on the surface and its derivative is given by the chain rule through: \\(\\nabla{f}(\\vec\\gamma(t))\\cdot \\vec\\gamma'(t)\\). But this composition is constantly the same value, so the derivative is \\(0\\). This says that \\(\\nabla{f}(\\vec\\gamma(t))\\) is orthogonal to \\(\\vec\\gamma'(t)\\) for any curve. As these tangential vectors to \\(\\vec\\gamma\\) lie in the tangent plane, the tangent plane can be characterized by having \\(\\nabla{f}\\) as the normal.\nThis computation was previously done in two dimensions, and showed the gradient is orthogonal to the contour lines (and points in the direction of greatest ascent). 
It can be generalized to higher dimensions.\nThe surface \\(F(x,y,z) = z - f(x,y) = 0\\) has gradient given by \\(\\langle -\\partial{f}/\\partial{x}, -\\partial{f}/\\partial{y}, 1\\rangle\\), and as seen above, this vector is normal to the tangent plane, so this generalization agrees on the easier case.\nFor clarity:\n\nThe scalar function \\(z = f(x,y)\\) describes a surface, \\((x,y,f(x,y))\\); the gradient, \\(\\nabla{f}\\), is \\(2\\) dimensional and points in the direction of greatest ascent for the surface.\nThe scalar function \\(f(x,y,z)\\) also describes a surface, through level curves \\(f(x,y,z) = c\\), for some constant \\(c\\). The gradient \\(\\nabla{f}\\) is \\(3\\) dimensional and orthogonal to the surface.\n\n\nExample\nLet \\(z = f(x,y) = \\sin(x)\\cos(x-y)\\). Find an equation for the tangent plane at \\((\\pi/4, \\pi/3)\\).\nWe have many possible forms to express this in, but we will use the functional description:\n\n@syms x, y\n\n(x, y)\n\n\n\nf(x,y) = sin(x) * cos(x-y)\nf(x) = f(x...)\nvars = [x, y]\n\ngradf = diff.(f(x,y), vars) # or use gradient(f, vars) or ∇((f,vars))\n\npt = [PI/4, PI/3]\ngradfa = subs.(gradf, x=>pt[1], y=>pt[2])\n\nf(pt) + gradfa ⋅ (vars - pt)\n\n \n\\[\n\\left(x - \\frac{\\pi}{4}\\right) \\left(- \\frac{\\sqrt{2} \\left(- \\frac{\\sqrt{6}}{4} + \\frac{\\sqrt{2}}{4}\\right)}{2} + \\frac{\\sqrt{2} \\left(\\frac{\\sqrt{2}}{4} + \\frac{\\sqrt{6}}{4}\\right)}{2}\\right) + \\frac{\\sqrt{2} \\left(- \\frac{\\sqrt{6}}{4} + \\frac{\\sqrt{2}}{4}\\right) \\left(y - \\frac{\\pi}{3}\\right)}{2} + \\frac{\\sqrt{2} \\left(\\frac{\\sqrt{2}}{4} + \\frac{\\sqrt{6}}{4}\\right)}{2}\n\\]\n\n\n\n\n\nExample\nA cylinder \\(f(x,y,z) = (x-a)^2 + y^2 = (2a)^2\\) is intersected with a sphere \\(g(x,y,z) = x^2 + y^2 + z^2 = a^2\\). Let \\(V\\) be the line of intersection. (Vivianis curve). Let \\(P\\) be a point on the curve. 
Describe the tangent to the curve.\nThe tangent line to the curve of intersection will lie in the tangent plane to both surfaces. These two surfaces have normal vectors given by the gradient, or \\(\\vec{n}_1 = \\langle 2(x-a), 2y, 0 \\rangle\\) and \\(\\vec{n}_2 = \\langle 2x, 2y, 2z \\rangle\\). The cross product of these two vectors will lie in both tangent planes, so we have that\n\\[\nP + t (\\vec{n}_1 \\times \\vec{n}_2)\n\\]\nwill describe the tangent.\nThe curve may be described parametrically by \\(\\vec\\gamma(t) = a \\langle 1 + \\cos(t), \\sin(t), 2\\sin(t/2) \\rangle\\). Lets see that the above is correct by verifying that the cross product of the tangent vector computed two ways is \\(0\\):\n\na = 1\ngamma(t) = a * [1 + cos(t), sin(t), 2sin(t/2) ]\nP = gamma(1/2)\nn1(x,y,z)= [2*(x-a), 2y, 0]\nn2(x,y,z) = [2x,2y,2z]\nn1(x) = n1(x...)\nn2(x) = n2(x...)\n\nt = 1/2\n(n1(gamma(t)) × n2(gamma(t))) × gamma'(t)\n\n3-element Vector{Float64}:\n 0.0\n 0.0\n 0.0\n\n\n\n\nPlotting level curves of \\(F(x,y,z) = c\\)\nThe wireframe plot can be used to visualize a surface of the type z=f(x,y), as previously illustrated. However we have no way of plotting \\(3\\)-dimensional implicit surfaces (of the type \\(F(x,y,z)=c\\)) as we do for \\(2\\)-dimensional implicit surfaces with Plots. (The MDBM or IntervalConstraintProgramming packages can be used along with the Makie plotting package to produce one.)\nThe CalculusWithJulia package provides a stop-gap function, plot_implicit_surface, for this task. The basic idea is to slice along an axis, by default the \\(z\\) axis, and for each level plot the contours of \\((x,y) \\rightarrow f(x,y,z)-c\\), which becomes a \\(2\\)-dimensional problem. The function allows any of 3 different axes to be chosen to slice over, the default being just the \\(z\\) axis.\nWe demonstrate with an example from a February 14, 2019 article in the New York Times. 
It shows an equation for a “heart,” as the graphic will illustrate:\n\na, b = 1, 3\nf(x,y,z) = (x^2 + ((1+b) * y)^2 + z^2 - 1)^3 - x^2 * z^3 - a * y^2 * z^3\n\nCalculusWithJulia.plot_implicit_surface(f, xlim=-2..2, ylim=-1..1, zlim=-1..2)"
},
{
"objectID": "differentiable_vector_calculus/scalar_functions_applications.html#linearization",
"href": "differentiable_vector_calculus/scalar_functions_applications.html#linearization",
"title": "56  Applications with scalar functions",
"section": "56.2 Linearization",
"text": "56.2 Linearization\nThe tangent plane is the best “linear approximation” to a function at a point. “Linear” refers to mathematical properties of the tangent plane, but at a practical level it means easy to compute, as it will involve only multiplication and addition. “Approximation” is useful in that if a bit of error is an acceptable tradeoff for computational ease, the tangent plane may be used in place of the function. In the univariate case, this is known as linearization, and the tradeoff is widely used in the derivation of theoretical relationships, as well as in practice to get reasonable numeric values.\nFormally, this is saying:\n\\[\nf(\\vec{x}) \\approx f(\\vec{a}) + ∇f(\\vec{a}) ⋅ (\\vec{x} - \\vec{a}).\n\\]\nThe explicit meaning of \\(\\approx\\) will be made clear when the generalization of Taylors theorem is to be stated.\n\nExample: Linear approximation\nThe volume of a cylinder is \\(V=\\pi r^2 h\\). It is thought a cylinder has \\(r=1\\) and \\(h=2\\). If instead, the amounts are \\(r=1.01, h=2.01\\), what is the difference in volume?\nThat is, if \\(V(r,h) = \\pi r^2 h\\), what is \\(V(1.01, 2.01) - V(1,2)\\)?\nWe can use linear approximation to see that this difference is approximately \\(\\nabla{V} \\cdot \\langle 0.01, 0.01 \\rangle\\). This is:\n\nV(r, h) = pi * r^2 * h\nV(v) = V(v...)\na₁ = [1,2]\ndx₁ = [0.01, 0.01]\nForwardDiff.gradient(V, a₁) ⋅ dx₁ # or use ∇(V)(a)\n\n0.15707963267948966\n\n\nThe exact difference can be computed:\n\nV(a₁ + dx₁) - V(a₁)\n\n0.15833941133357854\n\n\n\n\nExample\nLet \\(f(x,y) = \\sin(\\pi x y^2)\\). Estimate \\(f(1.1, 0.9)\\).\nUsing linear approximation with \\(dx=0.1\\) and \\(dy=-0.1\\), this is\n\\[\nf(1,1) + \\nabla{f}(1,1) \\cdot \\langle 0.1, -0.1\\rangle,\n\\]\nwhere \\(f(1,1) = \\sin(\\pi) = 0\\) and \\(\\nabla{f} = \\langle y^2\\cos(\\pi x y^2), \\cos(\\pi x y^2) 2y\\rangle = \\cos(\\pi x y^2)\\langle x,2y\\rangle\\). 
So, the answer is:\n\\[\n0 + \\cos(\\pi) \\langle 1,2\\rangle\\cdot \\langle 0.1, -0.1 \\rangle =\n(-1)(0.1 - 2(0.1)) = 0.1.\n\\]\n\n\nExample\nA piriform is described by the quartic surface \\(f(x,y,z) = x^4 -x^3 + y^2+z^2 = 0\\). Find the tangent plane at the point \\(\\langle 2,2,2 \\rangle\\).\nHere, \\(\\nabla{f}\\) describes a normal to the tangent plane. The plane may be described by \\(\\hat{N}\\cdot(\\vec{x} - \\vec{x}_0) = 0\\), where \\(\\vec{x}_0\\) is identified with a point on the plane (the point \\((2,2,2)\\) here). With this, we have \\(\\hat{N}\\cdot\\vec{x} = ax + by + cz = \\hat{N}\\cdot\\langle 2,2,2\\rangle = 2(a+b+c)\\). For this problem, \\(\\nabla{f}(2,2,2) = \\langle a, b, c\\rangle\\) is given by:\n\nf(x,y,z) = x^4 -x^3 + y^2 + z^2\nf(v) = f(v...)\na, b,c = ∇(f)(2,2,2)\n\"$a x + $b y + $c z = $([a,b,c] ⋅ [2,2,2])\"\n\n\"20 x + 4 y + 4 z = 56\"\n\n\n\n\n56.2.1 Newtons method to solve \\(f(x,y) = 0\\) and \\(g(x,y)=0\\).\nThe level curve \\(f(x,y)=0\\) and the level curve \\(g(x,y)=0\\) may intersect. Solving algebraically for the intersection may be difficult in most cases, though the linear case is not. (The linear case being the intersection of two lines.)\nTo elaborate, consider two linear equations written in a general form:\n\\[\n\\begin{align}\nax + by &= u\\\\\ncx + dy &= v\n\\end{align}\n\\]\nA method to solve this by hand would be to solve for \\(y\\) from one equation, substitute this expression into the second equation and then solve for \\(x\\). From there, \\(y\\) can be found. A more advanced method expresses the problem in a matrix formulation of the form \\(Mx=b\\) and solves that equation. This form of solving is implemented in Julia, through the “backslash” operator. 
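To make the backslash operator concrete before the symbolic solution, here is a small numeric illustration (the matrix and right-hand side are made up for this example):

```julia
M = [2.0 1.0;
     1.0 3.0]
b = [3.0, 4.0]

x = M \ b      # solve M*x == b (an LU factorization is used for a square dense M)
M * x ≈ b      # the residual is zero up to floating point
```

Solving by hand gives \\(x = 1, y = 1\\), matching \\(2\\cdot 1 + 1 = 3\\) and \\(1 + 3\\cdot 1 = 4\\).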
Here is the general solution:\n\n@syms a b c d u v\nM = [a b; c d]\nB = [u, v]\nM \\ B .|> simplify\n\n2-element Vector{Sym}:\n (-b*v + d*u)/(a*d - b*c)\n (a*v - c*u)/(a*d - b*c)\n\n\nThe term \\(\\det(M) = ad-bc\\) is important, as evidenced by its appearance in the denominator of each term. When this is zero there is not a unique solution, as there is in the typical case.\nNewtons method solves for intersection points by using linearization of the surfaces to replace the problem with the intersection of level curves of tangent planes. This is the linear case that can be readily solved. As with Newtons method for the univariate case, the new answer is generally a better approximation to the answer, and the process is iterated to get a good enough approximation, as defined through some tolerance.\nConsider the functions \\(f(x,y) =2 - x^2 - y^2\\) and \\(g(x,y) = 3 - 2x^2 - (1/3)y^2\\). These graphs show their surfaces with the level sets for \\(c=0\\) drawn, and just the level sets, showing they intersect in \\(4\\) places.\n\n\n\n\n\nWe look to find the intersection point near \\((1,1)\\) using Newtons method.\nWe have by linearization:\n\\[\n\\begin{align}\nf(x,y) &\\approx f(x_n, y_n) + \\frac{\\partial f}{\\partial x}\\Delta x + \\frac{\\partial f}{\\partial y}\\Delta y \\\\\ng(x,y) &\\approx g(x_n, y_n) + \\frac{\\partial g}{\\partial x}\\Delta x + \\frac{\\partial g}{\\partial y}\\Delta y,\n\\end{align}\n\\]\nwhere \\(\\Delta x = x- x_n\\) and \\(\\Delta y = y-y_n\\). 
Setting \\(f(x,y)=0\\) and \\(g(x,y)=0\\), leaves these two linear equations in \\(\\Delta x\\) and \\(\\Delta y\\):\n\\[\n\\begin{align}\n\\frac{\\partial f}{\\partial x} \\Delta x + \\frac{\\partial f}{\\partial y} \\Delta y &= -f(x_n, y_n)\\\\\n\\frac{\\partial g}{\\partial x} \\Delta x + \\frac{\\partial g}{\\partial y} \\Delta y &= -g(x_n, y_n).\n\\end{align}\n\\]\nOne step of Newtons method defines \\((x_{n+1}, y_{n+1})\\) to be the values \\((x,y)\\) that make the linearized functions about \\((x_n, y_n)\\) both equal to \\(\\vec{0}\\).\nAs just described, we can use Julias \\ operation to solve the above system of equations, if we express them in matrix form. With this, one step of Newtons method can be coded as follows:\n\nfunction newton_step(f, g, xn)\n M = [ForwardDiff.gradient(f, xn)'; ForwardDiff.gradient(g, xn)']\n b = -[f(xn), g(xn)]\n Delta = M \\ b\n xn + Delta\nend\n\nnewton_step (generic function with 1 method)\n\n\nWe investigate what happens starting at \\((1,1)\\) after one step:\n\n𝒇(x,y) = 2 - x^2 - y^2\n𝒈(x,y) = 3 - 2x^2 - (1/3)y^2\n𝒇(v) = 𝒇(v...); 𝒈(v) = 𝒈(v...)\n𝒙₀ = [1,1]\n𝒙₁ = newton_step(𝒇, 𝒈, 𝒙₀)\n\n2-element Vector{Float64}:\n 1.2\n 0.8\n\n\nThe new function values are\n\n𝒇(𝒙₁), 𝒈(𝒙₁)\n\n(-0.08000000000000007, -0.09333333333333327)\n\n\nWe can get better approximations by iterating. 
Here we hard code \\(4\\) more steps:\n\n𝒙₂ = newton_step(𝒇, 𝒈, 𝒙₁)\n𝒙₃ = newton_step(𝒇, 𝒈, 𝒙₂)\n𝒙₄ = newton_step(𝒇, 𝒈, 𝒙₃)\n𝒙₅ = newton_step(𝒇, 𝒈, 𝒙₄)\n𝒙₅, 𝒇(𝒙₅), 𝒈(𝒙₅)\n\n([1.1832159566199232, 0.7745966692414834], 0.0, 1.6653345369377348e-16)\n\n\nWe see that at the new point, 𝒙₅, both functions are essentially \\(0\\), so we have approximated the intersection point.\nFor nearby initial guesses and reasonable functions, Newtons method converges quadratically, so it should take few steps for convergence, as above.\nHere is a simplistic method to iterate \\(n\\) steps:\n\nfunction nm(f, g, x, n=5)\n for i in 1:n\n x = newton_step(f, g, x)\n end\n x\nend\n\nnm (generic function with 2 methods)\n\n\n\nExample\nConsider the bicylinder, the intersection of two perpendicular cylinders of the same radius. If the radius is \\(1\\), we might express these by the functions:\n\\[\nf(x,y) = \\sqrt{1 - y^2}, \\quad g(x,y) = \\sqrt{1 - x^2}.\n\\]\nWe see that \\((1,1)\\), \\((-1,1)\\), \\((1,-1)\\) and \\((-1,-1)\\) are solutions to \\(f(x,y)=0\\), \\(g(x,y)=0\\) and \\((0,0)\\) is a solution to \\(f(x,y)=1\\) and \\(g(x,y)=1\\). What about a level like \\(1/2\\), say?\nRather than work with \\(f(x,y) = c\\) we solve \\(f(x,y)^2 = c^2\\), as that will avoid issues with the square root not being defined. Here is one way to solve:\n\nc = 1/2\nf(x,y) = 1 - y^2 - c^2\ng(x,y) = (1 - x^2) - c^2\nf(v) = f(v...); g(v) = g(v...)\nnm(f, g, [1/2, 1/3])\n\n2-element Vector{Float64}:\n 0.8660254037844386\n 0.8660254037935468\n\n\nThat \\(x=y\\) is not so surprising, and in fact, this problem can more easily be solved analytically through \\(x^2 = y^2 = 1 - c^2\\)."
},
{
"objectID": "differentiable_vector_calculus/scalar_functions_applications.html#implicit-differentiation",
"href": "differentiable_vector_calculus/scalar_functions_applications.html#implicit-differentiation",
"title": "56  Applications with scalar functions",
"section": "56.3 Implicit differentiation",
"text": "56.3 Implicit differentiation\nImplicit differentiation of an equation of two variables (say \\(x\\) and \\(y\\)) is performed by assuming \\(y\\) is a function of \\(x\\) and when differentiating an expression with \\(y\\), use the chain rule. For example, the slope of the tangent line, \\(dy/dx\\), for the general ellipse \\(x^2/a + y^2/b = 1\\) can be found through this calculation:\n\\[\n\\frac{d}{dx}(\\frac{x^2}{a} + \\frac{y^2}{b}) =\n\\frac{d}{dx}(1),\n\\]\nor, using \\(d/dx(y^2) = 2y dy/dx\\):\n\\[\n\\frac{2x}{a} + \\frac{2y \\frac{dy}{dx}}{b} = 0.\n\\]\nFrom this, solving for \\(dy/dx\\) is routine, as the equation is linear in that unknown: \\(dy/dx = -(b/a)(x/y)\\)\nWith more variables, the same technique may be used. Say we have variables \\(x\\), \\(y\\), and \\(z\\) in a relation like \\(F(x,y,z) = 0\\). If we assume \\(z=z(x,y)\\) for some differentiable function (we mention later what conditions will ensure this assumption is valid for some open set), then we can proceed as before, using the chain rule as necessary.\nFor example, consider the ellipsoid: \\(x^2/a + y^2/b + z^2/c = 1\\). What is \\(\\partial z/\\partial x\\) and \\(\\partial{z}/\\partial{y}\\), as needed to describe the tangent plane as above?\nTo find \\(\\partial/\\partial{x}\\) we have:\n\\[\n\\frac{\\partial}{\\partial{x}}(x^2/a + y^2/b + z^2/c) =\n\\frac{\\partial}{\\partial{x}}1,\n\\]\nor\n\\[\n\\frac{2x}{a} + \\frac{0}{b} + \\frac{2z\\frac{\\partial{z}}{\\partial{x}}}{c} = 0.\n\\]\nAgain the desired unknown is within a linear equation so can readily be solved:\n\\[\n\\frac{\\partial{z}}{\\partial{x}} = -\\frac{c}{a} \\frac{x}{z}.\n\\]\nA similar approach can be used for \\(\\partial{z}/\\partial{y}\\).\n\nExample\nLet \\(f(x,y,z) = x^4 -x^3 + y^2 + z^2 = 0\\) be a surface with point \\((2,2,2)\\). 
Find \\(\\partial{z}/\\partial{x}\\) and \\(\\partial{z}/\\partial{y}\\).\nTo find \\(\\partial{z}/\\partial{x}\\) and \\(\\partial{z}/\\partial{y}\\) we have:\n\n@syms x, y, Z()\n∂x = solve(diff(x^4 -x^3 + y^2 + Z(x,y)^2, x), diff(Z(x,y),x))\n∂y = solve(diff(x^4 -x^3 + y^2 + Z(x,y)^2, y), diff(Z(x,y),y))\n∂x, ∂y\n\n(Sym[x^2*(3 - 4*x)/(2*Z(x, y))], Sym[-y/Z(x, y)])"
},
{
"objectID": "differentiable_vector_calculus/scalar_functions_applications.html#optimization",
"href": "differentiable_vector_calculus/scalar_functions_applications.html#optimization",
"title": "56  Applications with scalar functions",
"section": "56.4 Optimization",
"text": "56.4 Optimization\nFor a continuous univariate function \\(f:R \\rightarrow R\\) over an interval \\(I\\) the question of finding a maximum or minimum value is aided by two theorems:\n\nThe Extreme Value Theorem, which states that if \\(I\\) is closed (e.g., \\(I=[a,b]\\)) then \\(f\\) has a maximum (minimum) value \\(M\\) and there is at least one value \\(c\\) with \\(a \\leq c \\leq b\\) with \\(M = f(c)\\).\nFermats theorem on critical points, which states that if \\(f:(a,b) \\rightarrow R\\), \\(x_0\\) satisfies \\(a < x_0 < b\\), \\(f(x_0)\\) is a local extremum, and \\(f\\) is differentiable at \\(x_0\\), then \\(f'(x_0) = 0\\). That is, local extrema of \\(f\\) happen at points where the derivative does not exist or is \\(0\\) (critical points).\n\nThese two theorems provide an algorithm to find the extreme values of a continuous function over a closed interval: find the critical points, check these and the end points for the maximum and minimum value.\nThese checks can be reduced by two theorems that can classify critical points as local extrema, the first and second derivative tests.\nThese theorems have generalizations to scalar functions, allowing a similar study of extrema.\nFirst, we define a local maximum for \\(f:R^n \\rightarrow R\\) over a region \\(U\\): a point \\(\\vec{a}\\) in \\(U\\) is a local maximum if \\(f(\\vec{a}) \\geq f(\\vec{u})\\) for all \\(\\vec{u}\\) in some ball about \\(\\vec{a}\\). A local minimum would have \\(\\leq\\) instead.\nAn absolute maximum over \\(U\\), should it exist, would be \\(f(\\vec{a})\\) if there exists a value \\(\\vec{a}\\) in \\(U\\) with the property \\(f(\\vec{a}) \\geq f(\\vec{u})\\) for all \\(\\vec{u}\\) in \\(U\\).\nThe difference is the same as the one-dimensional case: local is a statement about nearby points only, absolute a statement about all the points in the specified set.\n\nThe Extreme Value Theorem Let \\(f:R^n \\rightarrow R\\) be continuous and defined on a closed and bounded set \\(V\\). 
Then \\(f\\) has a minimum value \\(m\\) and maximum value \\(M\\) over \\(V\\) and there exist at least two points \\(\\vec{a}\\) and \\(\\vec{b}\\) with \\(m = f(\\vec{a})\\) and \\(M = f(\\vec{b})\\).\n\n\nFermats theorem on critical points. Let \\(f:R^n \\rightarrow R\\) be a continuous function defined on an open set \\(U\\). If \\(x \\in U\\) is a point where \\(f\\) has a local extremum and \\(f\\) is differentiable, then the gradient of \\(f\\) at \\(x\\) is \\(\\vec{0}\\).\n\nCall a point in the domain of \\(f\\) where the function is differentiable and the gradient is zero a stationary point; a point in the domain where the function is either not differentiable or is a stationary point is called a critical point. The local extrema can only happen at critical points by Fermats theorem.\nConsider the function \\(f(x,y) = e^{-(x^2 + y^2)/5} \\cos(x^2 + y^2)\\).\n\nf(x,y)= exp(-(x^2 + y^2)/5) * cos(x^2 + y^2)\nxs = ys = range(-4, 4, length=100)\nsurface(xs, ys, f, legend=false)\n\n\n\n\nThis function is differentiable and the gradient is given by:\n\\[\n\\nabla{f} = -\\frac{2}{5}e^{-(x^2 + y^2)/5} (5\\sin(x^2 + y^2) + \\cos(x^2 + y^2)) \\langle x, y \\rangle.\n\\]\nThis is zero at the origin, or when \\(5\\sin(x^2 + y^2) = -\\cos(x^2 + y^2)\\). The latter holds on circles of radius \\(r\\) where \\(5\\sin(r^2) = -\\cos(r^2)\\), or \\(r^2 = \\tan^{-1}(-1/5) + k\\pi\\) for \\(k = 1, 2, \\dots\\). This matches the graph, where the extrema are on circles by symmetry. Imagine now, picking a value where the function takes a maximum and adding the tangent plane. As the gradient is \\(\\vec{0}\\), this will be flat. The point at the origin will have the surface fall off from the tangent plane in each direction, whereas the other points will have a circle where the tangent plane rests on the surface, but otherwise will fall off from the tangent plane. 
Characterizing this “falling off” will help to identify local maxima that are distinct.\n\nNow consider the differentiable function \\(f(x,y) = xy\\), graphed below with the projections of the \\(x\\) and \\(y\\) axes:\n\nf(x,y) = x*y\nxs = ys = range(-3, 3, length=100)\nsurface(xs, ys, f, legend=false)\n\nplot_parametric!(-4..4, t -> [t, 0, f(t, 0)], linewidth=5)\nplot_parametric!(-4..4, t -> [0, t, f(0, t)], linewidth=5)\n\n\n\n\nThe extrema happen at the edges of the region. The gradient is \\(\\nabla{f} = \\langle y, x \\rangle\\). This is \\(\\vec{0}\\) only at the origin. At the origin, were we to imagine a tangent plane, the surface falls below the tangent plane in one direction but rises above it in the other direction. Such a point is referred to as a saddle point. A saddle point for a continuous \\(f:R^n \\rightarrow R\\) would be a critical point, \\(\\vec{a}\\), where for any ball with non-zero radius about \\(\\vec{a}\\), there are values where the function is greater than \\(f(\\vec{a})\\) and values where the function is less.\nTo identify these through formulas, and not graphically, we could try to use the first derivative test along all paths through \\(\\vec{a}\\), but this approach is better at showing something isnt the case, like two paths to show non-continuity.\nThe generalization of the second derivative test is more concrete though. Recall, the second derivative test is about the concavity of the function at the critical point. When the concavity can be determined as non-zero, the test is conclusive; when the concavity is zero, the test is not conclusive. Similarly here:\n\nThe second Partial Derivative Test for \\(f:R^2 \\rightarrow R\\).\nAssume the first and second partial derivatives of \\(f\\) are defined and continuous; let \\(\\vec{a}\\) be a critical point of \\(f\\); \\(H\\) is the Hessian matrix, \\([f_{xx}\\quad f_{xy};f_{xy}\\quad f_{yy}]\\), and \\(d = \\det(H) = f_{xx} f_{yy} - f_{xy}^2\\) is the determinant of the Hessian matrix. 
Then:\n\nThe function \\(f\\) has a local minimum at \\(\\vec{a}\\) if \\(f_{xx} > 0\\) and \\(d>0\\),\nThe function \\(f\\) has a local maximum at \\(\\vec{a}\\) if \\(f_{xx} < 0\\) and \\(d>0\\),\nThe function \\(f\\) has a saddle point at \\(\\vec{a}\\) if \\(d < 0\\),\nNothing can be said if \\(d=0\\).\n\n\n\nThe intuition behind a proof follows. The case when \\(f_{xx} > 0\\) and \\(d > 0\\) uses a consequence of these assumptions that for any non-zero vector \\(\\vec{x}\\) it must be that \\(\\vec{x}\\cdot(H\\vec{x}) > 0\\) (positive definite) and the quadratic approximation \\(f(\\vec{a}+d\\vec{x}) \\approx f(\\vec{a}) + \\nabla{f}(\\vec{a}) \\cdot d\\vec{x} + \\frac{1}{2} d\\vec{x} \\cdot (Hd\\vec{x}) = f(\\vec{a}) + \\frac{1}{2} d\\vec{x} \\cdot (Hd\\vec{x})\\), so for any \\(d\\vec{x}\\) small enough, \\(f(\\vec{a}+d\\vec{x}) \\geq f(\\vec{a})\\). That is, \\(f(\\vec{a})\\) is a local minimum. Similarly, a proof for the local maximum follows by considering \\(-f\\). Finally, if \\(d < 0\\), then there are vectors, \\(d\\vec{x}\\), for which \\(d\\vec{x} \\cdot (Hd\\vec{x})\\) will have different signs, and along these vectors the function will be concave up/concave down.\nApplying this to \\(f(x,y) = xy\\) at \\(\\vec{a} = \\vec{0}\\), we have \\(f_{xx} = f_{yy} = 0\\) and \\(f_{xy} = 1\\), so the determinant of the Hessian is \\(-1\\). 
By the second partial derivative test, this critical point is a saddle point, as seen from a previous graph.\nApplying this to \\(f(x,y) = e^{-(x^2 + y^2)/5} \\cos(x^2 + y^2)\\), we will use SymPy to compute the derivatives, as they get a bit involved:\n\nfₖ(x,y) = exp(-(x^2 + y^2)/5) * cos(x^2 + y^2)\nHₖ = sympy.hessian(fₖ(x,y), (x,y))\n\n2×2 Matrix{Sym}:\n 8*x^2*exp(-x^2/5 - y^2/5)*sin(x^2 + y^2)/5 - 96*x^2*exp(-x^2/5 - y^2/5)*cos(x^2 + y^2)/25 - 2*exp(-x^2/5 - y^2/5)*sin(x^2 + y^2) - 2*exp(-x^2/5 - y^2/5)*cos(x^2 + y^2)/5 … 8*x*y*exp(-x^2/5 - y^2/5)*sin(x^2 + y^2)/5 - 96*x*y*exp(-x^2/5 - y^2/5)*cos(x^2 + y^2)/25\n 8*x*y*exp(-x^2/5 - y^2/5)*sin(x^2 + y^2)/5 - 96*x*y*exp(-x^2/5 - y^2/5)*cos(x^2 + y^2)/25 8*y^2*exp(-x^2/5 - y^2/5)*sin(x^2 + y^2)/5 - 96*y^2*exp(-x^2/5 - y^2/5)*cos(x^2 + y^2)/25 - 2*exp(-x^2/5 - y^2/5)*sin(x^2 + y^2) - 2*exp(-x^2/5 - y^2/5)*cos(x^2 + y^2)/5\n\n\nThis is messy, but we only consider it at critical points. The point \\((0,0)\\) is graphically a local maximum. We can see from the Hessian, that the second partial derivative test will give the same characterization:\n\nH₀₀ = subs.(Hₖ, x=>0, y=>0)\n\n2×2 Matrix{Sym}:\n -2/5 0\n 0 -2/5\n\n\nWhich satisfies:\n\nH₀₀[1,1] < 0 && det(H₀₀) > 0\n\ntrue\n\n\nNow consider \\(\\vec{a} = \\langle \\sqrt{2\\pi + \\tan^{-1}(-1/5)}, 0 \\rangle\\), a point on the first visible ring on the graph. 
The gradient vanishes here:\n\ngradfₖ = diff.(fₖ(x,y), [x,y])\na = [sqrt(2PI + atan(-Sym(1)//5)), 0]\nsubs.(gradfₖ, x => a[1], y => a[2])\n\n2-element Vector{Sym}:\n 0\n 0\n\n\nBut the test is inconclusive, as the determinant of the Hessian is \\(0\\):\n\na = [sqrt(2PI + atan(-Sym(1)//5)), 0]\nH_a = subs.(Hₖ, x => a[1], y => a[2])\ndet(H_a)\n\n \n\\[\n0\n\\]\n\n\n\n(The test is inconclusive, as it needs the function to “fall away” from the tangent plane in all directions; in this case, along a circular curve the function touches the tangent plane, so it doesnt fall away.)\n\nExample\nCharacterize the critical points of \\(f(x,y) = 4xy - x^4 - y^4\\).\nThe critical points may be found by solving when the gradient is \\(\\vec{0}\\):\n\nfⱼ(x,y) = 4x*y - x^4 - y^4\ngradfⱼ = diff.(fⱼ(x,y), [x,y])\n\n2-element Vector{Sym}:\n -4*x^3 + 4*y\n 4*x - 4*y^3\n\n\n\nall_ptsⱼ = solve(gradfⱼ, [x,y])\nptsⱼ = filter(u -> all(isreal.(u)), all_ptsⱼ)\n\n3-element Vector{Tuple{Sym, Sym}}:\n (-1, -1)\n (0, 0)\n (1, 1)\n\n\nThere are \\(3\\) real critical points. To classify them we need the sign of \\(f_{xx}\\) and the determinant of the Hessian. We make a simple function to compute these, then apply it to each point using a comprehension:\n\nHⱼ = sympy.hessian(fⱼ(x,y), (x,y))\nfunction classify(H, pt)\n Ha = subs.(H, x .=> pt[1], y .=> pt[2])\n (det=det(Ha), f_xx=Ha[1,1])\nend\n[classify(Hⱼ, pt) for pt in ptsⱼ]\n\n3-element Vector{NamedTuple{(:det, :f_xx), Tuple{Sym, Sym}}}:\n (det = 128, f_xx = -12)\n (det = -16, f_xx = 0)\n (det = 128, f_xx = -12)\n\n\nWe see the first and third points have positive determinant and negative \\(f_{xx}\\), so are relative maxima, and the second point has a negative determinant, so is a saddle point. 
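The symbolic classification at \\((1,1)\\) can also be cross-checked numerically; this is a sketch, not from the original notes, restating the function in vector form under the hypothetical name fj and assuming the ForwardDiff package and the LinearAlgebra standard library are available:

```julia
using ForwardDiff, LinearAlgebra

fj(v) = 4v[1]*v[2] - v[1]^4 - v[2]^4       # the same function, taking a vector argument
H11 = ForwardDiff.hessian(fj, [1.0, 1.0])  # numeric Hessian at the critical point (1, 1)
det(H11), H11[1, 1]                        # det > 0 and f_xx < 0, confirming a local maximum
```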
We graphically confirm this:\n\nxs = ys = range(-3/2, 3/2, length=100)\np = surface(xs, ys, fⱼ, legend=false)\nfor pt ∈ ptsⱼ\n scatter!(p, unzip([N.([pt...,fⱼ(pt...)])])...,\n markercolor=:black, markersize=5) # add each pt on surface\nend\np\n\n\n\n\n\n\nExample\nConsider the function \\(f(x,y) = x^2 + 2y^2 -x\\) over the region \\(x^2 + y^2 \\leq 1\\). This is a continuous function over a closed and bounded set, so will have both an absolute maximum and minimum. Find these from an investigation of the critical points and the boundary points.\nThe gradient is easily found: \\(\\nabla{f} = \\langle 2x - 1, 4y \\rangle\\), and is \\(\\vec{0}\\) only at \\(\\vec{a} = \\langle 1/2, 0 \\rangle\\). The Hessian is:\n\\[\nH = \\left[\n\\begin{array}{}\n2 & 0\\\\\n0 & 4\n\\end{array}\n\\right].\n\\]\nAt \\(\\vec{a}\\) this has positive determinant and \\(f_{xx} > 0\\), so \\(\\vec{a}\\) corresponds to a local minimum with value \\(f(\\vec{a}) = (1/2)^2 + 2(0)^2 - 1/2 = -1/4\\). The absolute maximum and minimum may occur here (well, not the maximum) or on the boundary, so that must be considered. In this case we can easily parameterize the boundary and turn this into the univariate case:\n\nfₗ(x,y) = x^2 + 2y^2 - x\nfₗ(v) = fₗ(v...)\ngammaₗ(t) = [cos(t), sin(t)] # traces out x^2 + y^2 = 1 over [0, 2pi]\ngₗ = fₗ ∘ gammaₗ\n\ncpsₗ = find_zeros(gₗ', 0, 2pi) # critical points of g\nappend!(cpsₗ, [0, 2pi])\nunique!(cpsₗ)\ngₗ.(cpsₗ)\n\n5-element Vector{Float64}:\n 0.0\n 2.25\n 2.0\n 2.25\n 0.0\n\n\nWe see that the maximum value is 2.25 and that the interior point, \\(\\vec{a}\\), will be where the minimum value occurs. To see exactly where the maximum occurs, we look at the corresponding angles:\n\ninds = [2,4]\ncpsₗ[inds]\n\n2-element Vector{Float64}:\n 2.0943951023931953\n 4.1887902047863905\n\n\nThese are multiples of \\(\\pi\\):\n\ncpsₗ[inds]/pi\n\n2-element Vector{Float64}:\n 0.6666666666666666\n 1.3333333333333333\n\n\nSo we have the maximum occurs at the angles \\(2\\pi/3\\) and \\(4\\pi/3\\). 
Here we visualize, using a hacky trick of assigning NaN values to the function to avoid plotting outside the circle:\n\nhₗ(x,y) = fₗ(x,y) * (x^2 + y^2 <= 1 ? 1 : NaN)\n\nhₗ (generic function with 1 method)\n\n\n\nxs = ys = range(-1,1, length=100)\nsurface(xs, ys, hₗ)\n\nts = cpsₗ # 2pi/3 and 4pi/3 by above\nxs, ys = cos.(ts), sin.(ts)\nscatter!(xs, ys, fₗ)\n\n\n\n\nA contour plot also shows that an extremum, and only one, happens on the interior:\n\nxs = ys = range(-1,1, length=100)\ncontour(xs, ys, hₗ)\n\n\n\n\nThe extrema are identified by the enclosing regions, in this case the one around the point \\((1/2, 0)\\).\n\n\nExample: Steiners problem\nThis is from Strang p 506.\nWe have three points in the plane, \\((x_1, y_1)\\), \\((x_2, y_2)\\), and \\((x_3,y_3)\\). A point \\(p=(p_x, p_y)\\) will have \\(3\\) distances \\(d_1\\), \\(d_2\\), and \\(d_3\\). Broadly speaking, we want to find the point \\(p\\) “nearest” the three fixed points within the triangle. Locating a facility so that it can service \\(3\\) separate cities might be one application. The answer depends on what measure of distance is used.\nIf the measure is the Euclidean distance, then \\(d_i^2 = (p_x - x_i)^2 + (p_y - y_i)^2\\). 
If we sought to minimize \\(d_1^2 + d_2^2 + d_3^2\\), then we would proceed as follows:\n\n@syms x1 y1 x2 y2 x3 y3\nd2(p,x) = (p[1] - x[1])^2 + (p[2]-x[2])^2\nd2_1, d2_2, d2_3 = d2((x,y), (x1, y1)), d2((x,y), (x2, y2)), d2((x,y), (x3, y3))\nexₛ = d2_1 + d2_2 + d2_3\n\n \n\\[\n\\left(x - x_{1}\\right)^{2} + \\left(x - x_{2}\\right)^{2} + \\left(x - x_{3}\\right)^{2} + \\left(y - y_{1}\\right)^{2} + \\left(y - y_{2}\\right)^{2} + \\left(y - y_{3}\\right)^{2}\n\\]\n\n\n\nWe then find the gradient, and solve for when it is \\(\\vec{0}\\):\n\ngradfₛ = diff.(exₛ, [x,y])\nxstarₛ = solve(gradfₛ, [x,y])\n\nDict{Any, Any} with 2 entries:\n x => x1/3 + x2/3 + x3/3\n y => y1/3 + y2/3 + y3/3\n\n\nThere is only one critical point, so it must be a minimum.\nWe confirm this by looking at the Hessian and noting \\(H_{11} > 0\\):\n\nHₛ = subs.(hessian(exₛ, [x,y]), x=>xstarₛ[x], y=>xstarₛ[y])\n\n2×2 Matrix{Sym}:\n 6 0\n 0 6\n\n\nAs it occurs at \\((\\bar{x}, \\bar{y})\\) where \\(\\bar{x} = (x_1 + x_2 + x_3)/3\\) and \\(\\bar{y} = (y_1+y_2+y_3)/3\\) - the averages of the three values - the critical point is an interior point of the triangle.\nAs mentioned by Strang, the real problem is to minimize \\(d_1 + d_2 + d_3\\). A direct approach with SymPy, just replacing d2 above with the square root, fails. Consider instead the gradient of \\(d_1\\), say. To avoid square roots, this is taken implicitly from \\(d_1^2\\):\n\\[\n\\frac{\\partial}{\\partial{x}}(d_1^2) = 2 d_1 \\frac{\\partial{d_1}}{\\partial{x}}.\n\\]\nBut computing directly from the expression yields \\(2(x - x_1)\\). Solving yields:\n\\[\n\\frac{\\partial{d_1}}{\\partial{x}} = \\frac{(x-x_1)}{d_1}, \\quad\n\\frac{\\partial{d_1}}{\\partial{y}} = \\frac{(y-y_1)}{d_1}.\n\\]\nThe gradient is then \\((\\vec{p} - \\vec{x}_1)/\\|\\vec{p} - \\vec{x}_1\\|\\), a unit vector, call it \\(\\hat{u}_1\\). Similarly for \\(\\hat{u}_2\\) and \\(\\hat{u}_3\\).\nLet \\(f = d_1 + d_2 + d_3\\). 
Then \\(\\nabla{f} = \\hat{u}_1 + \\hat{u}_2 + \\hat{u}_3\\). At the minimum, the gradient is \\(\\vec{0}\\), so the three unit vectors must cancel. This can only happen if the three make a “peace” sign with angles \\(120^\\circ\\) between them. To find the minimum then within the triangle, this point and the boundary must be considered, when this point falls outside the triangle.\nHere is a triangle, where the minimum would be within the triangle:\n\nusₛ = [[cos(t), sin(t)] for t in (0, 2pi/3, 4pi/3)]\npolygon(ps) = unzip(vcat(ps, ps[1:1])) # easier way to plot a polygon\n\npₛ = scatter([0],[0], markersize=2, legend=false, aspect_ratio=:equal)\n\nasₛ = (1,2,3)\nplot!(polygon([a*u for (a,u) in zip(asₛ, usₛ)])...)\n[arrow!([0,0], a*u, alpha=0.5) for (a,u) in zip(asₛ, usₛ)]\npₛ\n\n\n\n\nFor this triangle we find the Steiner point outside of the triangle.\n\nasₛ₁ = (1, -1, 3)\nscatter([0],[0], markersize=2, legend=false)\npsₛₗ = [a*u for (a,u) in zip(asₛ₁, usₛ)]\nplot!(polygon(psₛₗ)...)\n\n\n\n\nLets see where the minimum distance point is by constructing a plot. The minimum must be on the boundary, as the only point where the gradient vanishes is the origin, not in the triangle. The plot of the triangle has a contour plot of the distance function, so we see clearly that the minimum happens at the point [0.5, -0.866025]. On this plot, we drew the gradient at some points along the boundary. The gradient points in the direction of greatest increase - away from the minimum. 
That the gradient vectors have a non-zero projection onto the edges of the triangle in a direction pointing away from the point indicates that the function d would increase if moved along the boundary in that direction, as indeed it does.\n\neuclid_dist(x; ps=psₛₗ) = sum(norm(x-p) for p in ps)\neuclid_dist(x,y; ps=psₛₗ) = euclid_dist([x,y]; ps=ps)\n\neuclid_dist (generic function with 2 methods)\n\n\n\nxs = range(-1.5, 1.5, length=100)\nys = range(-3, 1.0, length=100)\n\np = plot(polygon(psₛₗ)..., linewidth=3, legend=false)\nscatter!(p, unzip(psₛₗ)..., markersize=3)\ncontour!(p, xs, ys, euclid_dist)\n\n# add some gradients along boundary\nli(t, p1, p2) = p1 + t*(p2-p1) # t in [0,1]\nfor t in range(1/100, 1/2, length=3)\n pt = li(t, psₛₗ[2], psₛₗ[3])\n arrow!(pt, ForwardDiff.gradient(euclid_dist, pt))\n pt = li(t, psₛₗ[2], psₛₗ[1])\n arrow!(pt, ForwardDiff.gradient(euclid_dist, pt))\nend\n\np\n\n\n\n\nThe following graph shows the distance along each edge:\n\nli(t, p1, p2) = p1 + t*(p2-p1)\np = plot(legend=false)\nfor i in 1:2, j in (i+1):3\n plot!(p, t -> euclid_dist(li(t, psₛₗ[i], psₛₗ[j]); ps=psₛₗ), 0, 1)\nend\np\n\n\n\n\nThe smallest value is when \\(t=0\\) or \\(t=1\\), so at one of the points, as li is defined above.\n\n\nExample: least squares\nWe know that two points determine a line. What happens when there are more than two points? This is common in statistics where a bivariate data set (pairs of points \\((x,y)\\)) is summarized through a linear model \\(\\mu_{y|x} = \\alpha + \\beta x\\). That is, the average value for \\(y\\) given a particular \\(x\\) value is given through the equation of a line. The data is used to identify what the slope and intercept are for this line. We consider a simple case of \\(3\\) points, the case of larger \\(n\\) being similar.\nWe have a line \\(l(x) = \\alpha + \\beta x\\) and three points \\((x_1, y_1)\\), \\((x_2, y_2)\\), and \\((x_3, y_3)\\). 
Unless these three points happen to be collinear, they cant possibly all lie on the same line. So to approximate a relationship by a line requires some inexactness. One measure of inexactness is the vertical distance to the line:\n\\[\nd1(\\alpha, \\beta) = |y_1 - l(x_1)| + |y_2 - l(x_2)| + |y_3 - l(x_3)|.\n\\]\nAnother might be the vertical squared distance to the line:\n\\[\n\\begin{align*}\nd2(\\alpha, \\beta) &= (y_1 - l(x_1))^2 + (y_2 - l(x_2))^2 + (y_3 - l(x_3))^2 \\\\\n&= (y_1 - (\\alpha + \\beta x_1))^2 + (y_2 - (\\alpha + \\beta x_2))^2 + (y_3 - (\\alpha + \\beta x_3))^2\n\\end{align*}\n\\]\nAnother might be the shortest distance to the line:\n\\[\nd3(\\alpha, \\beta) = \\frac{|\\beta x_1 - y_1 + \\alpha|}{\\sqrt{1 + \\beta^2}} + \\frac{|\\beta x_2 - y_2 + \\alpha|}{\\sqrt{1 + \\beta^2}} + \\frac{|\\beta x_3 - y_3 + \\alpha|}{\\sqrt{1 + \\beta^2}}.\n\\]\nThe method of least squares minimizes the second one of these. That is, it chooses \\(\\alpha\\) and \\(\\beta\\) that make the expression a minimum.\n\n@syms xₗₛ[1:3] yₗₛ[1:3] α β\nli(x, alpha, beta) = alpha + beta * x\nd₂(alpha, beta) = sum((y - li(x, alpha, beta))^2 for (y,x) in zip(yₗₛ, xₗₛ))\nd₂(α, β)\n\n \n\\[\n\\left(- xₗₛ₁ β + yₗₛ₁ - α\\right)^{2} + \\left(- xₗₛ₂ β + yₗₛ₂ - α\\right)^{2} + \\left(- xₗₛ₃ β + yₗₛ₃ - α\\right)^{2}\n\\]\n\n\n\nTo identify \\(\\alpha\\) and \\(\\beta\\) we find the gradient:\n\ngrad_d₂ = diff.(d₂(α, β), [α, β])\n\n2-element Vector{Sym}:\n 2⋅xₗₛ₁⋅β + 2⋅xₗₛ₂⋅β + 2⋅xₗₛ₃⋅β - 2⋅yₗₛ₁ - 2⋅yₗₛ₂ - 2⋅yₗₛ₃ + 6⋅α\n -2*xₗₛ₁*(-xₗₛ₁*β + yₗₛ₁ - α) - 2*xₗₛ₂*(-xₗₛ₂*β + yₗₛ₂ - α) - 2*xₗₛ₃*(-xₗₛ₃*β + yₗₛ₃ - α)\n\n\n\noutₗₛ = solve(grad_d₂, [α, β])\n\nDict{Any, Any} with 2 entries:\n β => (2*xₗₛ₁*yₗₛ₁ - xₗₛ₁*yₗₛ₂ - xₗₛ₁*yₗₛ₃ - xₗₛ₂*yₗₛ₁ + 2*xₗₛ₂*yₗₛ₂ - xₗₛ₂*yₗ…\n α => (xₗₛ₁^2*yₗₛ₂ + xₗₛ₁^2*yₗₛ₃ - xₗₛ₁*xₗₛ₂*yₗₛ₁ - xₗₛ₁*xₗₛ₂*yₗₛ₂ - xₗₛ₁*xₗₛ₃…\n\n\nAs found, the formulas arent pretty. If \\(x_1 + x_2 + x_3 = 0\\) they simplify. 
For example:\n\nsubs(outₗₛ[β], sum(xₗₛ) => 0)\n\n \n\\[\n\\frac{2 xₗₛ₁ yₗₛ₁ - xₗₛ₁ yₗₛ₂ - xₗₛ₁ yₗₛ₃ - xₗₛ₂ yₗₛ₁ + 2 xₗₛ₂ yₗₛ₂ - xₗₛ₂ yₗₛ₃ - xₗₛ₃ yₗₛ₁ - xₗₛ₃ yₗₛ₂ + 2 xₗₛ₃ yₗₛ₃}{2 xₗₛ₁^{2} - 2 xₗₛ₁ xₗₛ₂ - 2 xₗₛ₁ xₗₛ₃ + 2 xₗₛ₂^{2} - 2 xₗₛ₂ xₗₛ₃ + 2 xₗₛ₃^{2}}\n\\]\n\n\n\nLetting \\(\\vec{x} = \\langle x_1, x_2, x_3 \\rangle\\) and \\(\\vec{y} = \\langle y_1, y_2, y_3 \\rangle\\), this is simply \\((\\vec{x} \\cdot \\vec{y})/(\\vec{x}\\cdot \\vec{x})\\), a formula that will generalize to \\(n > 3\\). The assumption is not a restriction - it comes about by subtracting the mean, \\(\\bar{x} = (x_1 + x_2 + x_3)/3\\), from each \\(x\\) term (and similarly subtracting \\(\\bar{y}\\) from each \\(y\\) term), a process called “centering.”\nWith this observation, the formulas can be re-expressed through:\n\\[\n\\beta = \\frac{\\sum(x_i - \\bar{x})(y_i - \\bar{y})}{\\sum(x_i-\\bar{x})^2},\n\\quad\n\\alpha = \\bar{y} - \\beta \\bar{x}.\n\\]\nRelative to the centered values, this may be viewed as a line through \\((\\bar{x}, \\bar{y})\\) with slope given by \\((\\vec{x}-\\bar{x})\\cdot(\\vec{y}-\\bar{y}) / \\|\\vec{x}-\\bar{x}\\|^2\\).\nAs an example, if the points are \\((1,1), (2,3), (5,8)\\) we get:\n\n[k => subs(v, xₗₛ[1]=>1, yₗₛ[1]=>1, xₗₛ[2]=>2, yₗₛ[2]=>3,\n xₗₛ[3]=>5, yₗₛ[3]=>8) for (k,v) in outₗₛ]\n\n2-element Vector{Pair{Sym, Sym}}:\n β => 45/26\n α => -8/13\n\n\n\n\n56.4.1 Gradient descent\nAs seen in the examples above, extrema may be identified analytically by solving for when the gradient is \\(0\\). Here we discuss some numeric algorithms for finding extrema.\nAn algorithm to identify where a surface is at its minimum is gradient descent. The gradient points in the direction of the steepest ascent of the surface and the negative gradient the direction of the steepest descent. To move to a minimum then, it makes intuitive sense to move in the direction of the negative gradient. How far? That is a different question and one with different answers. 
Lets formulate the movement first, then discuss how far.\nLet \\(\\vec{x}_0\\), \\(\\vec{x}_1\\), \\(\\dots\\), \\(\\vec{x}_n\\) be the position of the algorithm for \\(n\\) steps starting from an initial point \\(\\vec{x}_0\\). Successive positions are related by:\n\\[\n\\vec{x}_{n+1} = \\vec{x}_n - \\gamma \\nabla{f}(\\vec{x}_n),\n\\]\nwhere \\(\\gamma\\) is some scaling factor for the gradient. The above quantifies the idea: to go from \\(\\vec{x}_n\\) to \\(\\vec{x}_{n+1}\\), move along \\(-\\nabla{f}\\) by a certain amount.\nLet \\(\\Delta_x =\\vec{x}_{n}- \\vec{x}_{n-1}\\) and \\(\\Delta_y = \\nabla{f}(\\vec{x}_{n}) - \\nabla{f}(\\vec{x}_{n-1})\\). A variant of the Barzilai-Borwein method is to take \\(\\gamma_n = | \\Delta_x \\cdot \\Delta_y | / (\\Delta_y \\cdot \\Delta_y)\\).\nTo illustrate, take \\(f(x,y) = -e^{-((x-1)^2 + 2(y-1/2)^2)}\\) and a starting point \\(\\langle 0, 0 \\rangle\\). Starting with \\(\\gamma_0 = 1\\), there are \\(5\\) steps taken:\n\nf₂(x,y) = -exp(-((x-1)^2 + 2(y-1/2)^2))\nf₂(x) = f₂(x...)\n\nxs₂ = [[0.0, 0.0]] # we store a vector\ngammas₂ = [1.0]\n\nfor n in 1:5\n xn = xs₂[end]\n gamma₀ = gammas₂[end]\n xn1 = xn - gamma₀ * gradient(f₂)(xn)\n dx, dy = xn1 - xn, gradient(f₂)(xn1) - gradient(f₂)(xn)\n gamman1 = abs( (dx ⋅ dy) / (dy ⋅ dy) )\n\n push!(xs₂, xn1)\n push!(gammas₂, gamman1)\nend\n\n[(x, f₂(x)) for x in xs₂]\n\n6-element Vector{Tuple{Vector{Float64}, Float64}}:\n ([0.0, 0.0], -0.22313016014842982)\n ([0.44626032029685964, 0.44626032029685964], -0.7316862045596354)\n ([0.5719399641782019, 0.4706543959065717], -0.8311394210020312)\n ([1.3127598757955443, 0.5722280351701136], -0.8974009578884475)\n ([0.9982224839581173, 0.4269509740243237], -0.9893813007474934)\n ([0.9996828943781475, 0.5469853998120562], -0.9955943772073014)\n\n\nWe now visualize, using the Contour package to draw the contour lines in the \\(x-y\\) plane:\n\nfunction surface_contour(xs, ys, f; offset=0)\n p = surface(xs, ys, f, 
legend=false, fillalpha=0.5)\n\n ## we add to the graphic p, then plot\n zs = [f(x,y) for x in xs, y in ys] # reverse order for use with Contour package\n for cl in levels(contours(xs, ys, zs))\n lvl = level(cl) # the z-value of this contour level\n for line in lines(cl)\n _xs, _ys = coordinates(line) # coordinates of this line segment\n _zs = offset * _xs\n plot!(p, _xs, _ys, _zs, alpha=0.5) # add curve on x-y plane\n end\n end\n p\nend\n\n\noffset = 0\nus = vs = range(-1, 2, length=100)\nsurface_contour(vs, vs, f₂, offset=offset)\npts = [[pt..., offset] for pt in xs₂]\nscatter!(unzip(pts)...)\nplot!(unzip(pts)..., linewidth=3)\n\n\n\n\n\n\n56.4.2 Newtons method for minimization\nA variant of Newtons method can be used to minimize a function \\(f:R^2 \\rightarrow R\\). We look for points where both partial derivatives of \\(f\\) vanish. Let \\(g(x,y) = \\partial f/\\partial x(x,y)\\) and \\(h(x,y) = \\partial f/\\partial y(x,y)\\). Then applying Newtons method, as above to solve simultaneously for when \\(g=0\\) and \\(h=0\\), we considered this matrix:\n\\[\nM = [\\nabla{g}'; \\nabla{h}'],\n\\]\nand had a step expressible in terms of the inverse of \\(M\\) as \\(M^{-1} [g; h]\\). In terms of the function \\(f\\), this step is \\(H^{-1}\\nabla{f}\\), where \\(H\\) is the Hessian matrix. 
Newtons method then becomes:\n\\[\n\\vec{x}_{n+1} = \\vec{x}_n - [H_f(\\vec{x}_n)]^{-1} \\nabla{f}(\\vec{x}_n).\n\\]\nThe Wikipedia page states that, where applicable, Newtons method converges much faster towards a local maximum or minimum than gradient descent.\nWe apply it to the task of characterizing the following function, which has a few different peaks over the region \\([-3,3] \\times [-2,2]\\):\n\nfunction peaks(x, y)\n z = 3 * (1 - x)^2 * exp(-x^2 - (y + 1)^2)\n z += -10 * (x / 5 - x^3 - y^5) * exp(-x^2 - y^2)\n z += -1/3 * exp(-(x+1)^2 - y^2)\n return z\nend\npeaks(v) = peaks(v...)\n\npeaks (generic function with 2 methods)\n\n\n\nxs = range(-3, stop=3, length=100)\nys = range(-2, stop=2, length=100)\nPs = surface(xs, ys, peaks, legend=false)\nPc = contour(xs, ys, peaks, legend=false)\nplot(Ps, Pc, layout=2) # combine plots\n\n\n\n\nAs we will solve for the critical points numerically, we consider the contour plot as well, as it shows better where the critical points are.\nOver this region we see clearly 5 peaks or valleys: near \\((0, 1.5)\\), near \\((1.2, 0)\\), near \\((0.2, -1.8)\\), near \\((-0.5, -0.8)\\), and near \\((-1.2, 0.2)\\). To classify the \\(5\\) critical points we need to first identify them, then compute the Hessian, and then, possibly, compute \\(f_{xx}\\) at the point. Here we do so for one of them using a numeric approach.\nFor concreteness, consider the peak or valley near \\((0,1.5)\\). We use Newtons method to numerically compute the critical point. 
The Newton step, specialized here is:\n\nfunction newton_stepₚ(f, x)\n M = ForwardDiff.hessian(f, x)\n b = ForwardDiff.gradient(f, x)\n x - M \\ b\nend\n\nnewton_stepₚ (generic function with 1 method)\n\n\nWe perform \\(3\\) steps of Newtons method, and see that it has found a critical point.\n\nxₚ = [0, 1.5]\nxₚ = newton_stepₚ(peaks, xₚ)\nxₚ = newton_stepₚ(peaks, xₚ)\nxₚ = newton_stepₚ(peaks, xₚ)\nxₚ, ForwardDiff.gradient(peaks, xₚ)\n\n([-0.009317581959954116, 1.5813679629389998], [1.734723475976807e-17, -6.9111383282915995e-15])\n\n\nThe Hessian at this point is given by:\n\nHₚ = ForwardDiff.hessian(peaks, xₚ)\n\n2×2 Matrix{Float64}:\n -16.2944 0.493227\n 0.493227 -32.413\n\n\nFrom which we see:\n\nfxx = Hₚ[1,1]\nd = det(Hₚ)\nfxx, d\n\n(-16.29442261058989, 527.9079596128478)\n\n\nConsequently we have a local maximum at this critical point.\n\n\n\n\n\n\nNote\n\n\n\n\n\n\nThe Optim.jl package provides efficient implementations of these two numeric methods, and others."
},
{
"objectID": "differentiable_vector_calculus/scalar_functions_applications.html#constrained-optimization-lagrange-multipliers",
"href": "differentiable_vector_calculus/scalar_functions_applications.html#constrained-optimization-lagrange-multipliers",
"title": "56  Applications with scalar functions",
"section": "56.5 Constrained optimization, Lagrange multipliers",
"text": "56.5 Constrained optimization, Lagrange multipliers\nWe considered the problem of maximizing a function over a closed region. This maximum is achieved at a critical point or a boundary point. Investigating the critical points isnt so difficult and the second partial derivative test can help characterize the points along the way, but characterizing the boundary points usually involves parameterizing the boundary, which is not always so easy. However, if we put this problem into a more general setting a different technique becomes available.\nThe different setting is: maximize \\(f(x,y)\\) subject to the constraint \\(g(x,y) = k\\). The constraint can be used to describe the boundary used previously.\nWhy does this help? The key is something we have seen before: if \\(g\\) is differentiable, then \\(\\nabla{g}\\) will point in directions orthogonal to the level curve \\(g(x,y) = 0\\). (Parameterize the curve, then \\((g\\circ\\vec{r})(t) = 0\\) and so the chain rule has \\(\\nabla{g}(\\vec{r}(t)) \\cdot \\vec{r}'(t) = 0\\).) For example, consider the function \\(g(x,y) = x^2 +2y^2 - 1\\). The level curve \\(g(x,y) = 0\\) is an ellipse. Here we plot the level curve, along with a few gradient vectors at points satisfying \\(g(x,y) = 0\\):\n\ng(x,y) = x^2 + 2y^2 -1\ng(v) = g(v...)\n\nxs = range(-3, 3, length=100)\nys = range(-1, 4, length=100)\n\np = plot(aspect_ratio=:equal, legend=false)\ncontour!(xs, ys, g, levels=[0])\n\ngi(x) = sqrt(1/2*(1-x^2)) # solve for y in terms of x\npts = [[x, gi(x)] for x in (-3/4, -1/4, 1/4, 3/4)]\n\nfor pt in pts\n arrow!(pt, ForwardDiff.gradient(g, pt) )\nend\n\np\n\n\n\n\nFrom the plot we see the key property that \\(\\nabla{g}\\) is orthogonal to the level curve.\nNow consider \\(f(x,y)\\), a function we wish to maximize. The gradient points in the direction of greatest increase, provided \\(f\\) is smooth. We are interested in the value of this gradient along the level curve of \\(g\\). 
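The orthogonality of the gradient to the level curve can also be checked numerically. This is a minimal sketch of our own: the parameterization `r(t) = (cos t, sin t/√2)` of the ellipse and the hand-computed gradient are assumptions made for this check, not taken from the text.

```julia
# Check numerically that ∇g ⋅ r'(t) = 0 along the level curve g(x,y) = 0
# for g(x,y) = x^2 + 2y^2 - 1, using an assumed parameterization r(t).
gradg(x, y) = [2x, 4y]               # hand-computed gradient of g
r(t) = [cos(t), sin(t) / sqrt(2)]    # parameterization of the ellipse
rp(t) = [-sin(t), cos(t) / sqrt(2)]  # tangent vector r'(t)

# the dot product ∇g(r(t)) ⋅ r'(t) should vanish for every t:
err = maximum(abs(sum(gradg(r(t)...) .* rp(t))) for t in range(0, 2pi, length=25))
```

Here `err` is zero up to floating-point roundoff, agreeing with the chain-rule argument above.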
Consider this figure representing a portion of the level curve, its tangent, normal, the gradient of \\(f\\), and the contours of \\(f\\):\n\n\n\n\n\nWe can identify the tangent, the normal, and subsequently the gradient of \\(f\\). Is the point drawn a maximum of \\(f\\) subject to the constraint \\(g\\)?\nThe answer is no, but why? By adding the contours of \\(f\\), we see that moving along the curve from this point will increase or decrease \\(f\\), depending on which direction we move in. As the gradient is the direction of greatest increase, we can see that the projection of the gradient on the tangent will point in a direction of increase.\nThis isnt just because the point was chosen to make a pretty picture rather than be a maximum; it is because \\(\\nabla{f}\\) has a non-trivial projection onto the tangent vector. What does it say if we move the point in the direction of this projection?\nThe gradient points in the direction of greatest increase. If we move in the direction of just one component of the gradient, \\(f\\) will still increase, just not as fast. This is because the directional derivative in the direction of the tangent will be non-zero. In the picture, if we were to move the point to the right along the curve \\(f(x,y)\\) will increase.\nNow consider this figure at a different point of the figure:\n\n\n\n\n\nWe can still identify the tangent and normal directions. What is different about this point is that local movement on the constraint curve is also local movement on the contour line of \\(f\\), so \\(f\\) doesnt increase or decrease here, as it would if this point were an extremum along the constraint. The key to seeing this is that the contour lines of \\(f\\) are tangent to the constraint. 
The respective gradients are orthogonal to their tangent lines, and in dimension \\(2\\), this implies they are parallel to each other.\n\nThe method of Lagrange multipliers: To optimize \\(f(x,y)\\) subject to a constraint \\(g(x,y) = k\\) we solve for all simultaneous solutions to\n\\[\n\\begin{align}\n\\nabla{f}(x,y) &= \\lambda \\nabla{g}(x,y), \\text{and}\\\\\ng(x,y) &= k.\n\\end{align}\n\\]\nThese possible points are evaluated to see if they are maxima or minima.\n\nThe method will not work if \\(\\nabla{g} = \\vec{0}\\) or if \\(f\\) and \\(g\\) are not differentiable.\n\n\nExample\nWe consider again the problem of maximizing the area of all rectangles subject to the perimeter being \\(20\\). We have seen this results in a square. This time we use the Lagrange multiplier technique. We have two equations:\n\\[\nA(x,y) = xy, \\quad P(x,y) = 2x + 2y = 20.\n\\]\nWe see \\(\\nabla{A} = \\lambda \\nabla{P}\\), or \\(\\langle y, x \\rangle = \\lambda \\langle 2, 2\\rangle\\). We see the solution has \\(x = y\\) and from the constraint \\(x=y = 5\\).\nThis is clearly the maximum for this problem, though the Lagrange technique does not show that; it only identifies possible extrema.\n\n\nExample\nWe can reverse the question: what are the ranges for the perimeter when the area is a fixed value of \\(25\\)? We have:\n\\[\nP(x,y) = 2x + 2y, \\quad A(x,y) = xy = 25.\n\\]\nNow we look for \\(\\nabla{P} = \\lambda \\nabla{A}\\) and will get, as in the last example, that \\(\\langle 2, 2 \\rangle = \\lambda \\langle y, x\\rangle\\). So \\(x=y\\) and from the constraint \\(x=y=5\\).\nHowever this is not the maximum perimeter, but rather the minimal perimeter. 
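The claim that the square is the minimal-perimeter rectangle of area \(25\) can be checked numerically. This is a small sketch of our own (the helper `P` with the constraint substituted in is not from the text):

```julia
# Among rectangles with area 25, substitute y = 25/x into the perimeter and
# scan a grid of widths x to confirm x = y = 5 (perimeter 20) is the minimum.
P(x) = 2x + 2 * (25 / x)   # perimeter with the area constraint built in

xs = 0.1:0.01:100.0
x_best = argmin(P, xs)     # grid point with the smallest perimeter
x_best, P(x_best)          # both should be near 5 and 20
```

The grid search agrees with the Lagrange solution; only the maximum is unbounded.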
The maximum is \\(\\infty\\), which comes about in the limit by considering long skinny rectangles.\n\n\nExample: A rephrasing\nA slightly different formulation of the Lagrange method is to combine the equation and the constraint into one equation:\n\\[\nL(x,y,\\lambda) = f(x,y) - \\lambda (g(x,y) - k).\n\\]\nThen we have\n\\[\n\\begin{align}\n\\frac{\\partial L}{\\partial{x}} &= \\frac{\\partial{f}}{\\partial{x}} - \\lambda \\frac{\\partial{g}}{\\partial{x}}\\\\\n\\frac{\\partial L}{\\partial{y}} &= \\frac{\\partial{f}}{\\partial{y}} - \\lambda \\frac{\\partial{g}}{\\partial{y}}\\\\\n\\frac{\\partial L}{\\partial{\\lambda}} &= -(g(x,y) - k).\n\\end{align}\n\\]\nBut if the Lagrange condition holds, each term is \\(0\\), so Lagranges method can be seen as solving for points where \\(\\nabla{L} = \\vec{0}\\). The optimization problem in two variables with a constraint becomes a problem of finding and classifying zeros of a function with three variables.\nApply this to the optimization problem:\nFind the extrema of \\(f(x,y) = x^2 - y^2\\) subject to the constraint \\(g(x,y) = x^2 + y^2 = 1\\).\nWe have:\n\\[\nL(x, y, \\lambda) = f(x,y) - \\lambda(g(x,y) - 1)\n\\]\nWe can solve for \\(\\nabla{L} = \\vec{0}\\) by hand, but we do so symbolically:\n\n@syms lambda\nfₗₐ(x, y) = x^2 - y^2\ngₗₐ(x, y) = x^2 + y^2\nLₗₐ(x, y, lambda) = fₗₐ(x,y) - lambda * (gₗₐ(x,y) - 1)\ndsₗₐ = solve(diff.(Lₗₐ(x, y, lambda), [x, y, lambda]))\n\n4-element Vector{Dict{Any, Any}}:\n Dict(lambda => -1, x => 0, y => -1)\n Dict(lambda => -1, x => 0, y => 1)\n Dict(lambda => 1, x => -1, y => 0)\n Dict(lambda => 1, x => 1, y => 0)\n\n\nThis has \\(4\\) easy solutions; here are the values at each point:\n\n[fₗₐ(d[x], d[y]) for d in dsₗₐ]\n\n4-element Vector{Sym}:\n -1\n -1\n 1\n 1\n\n\nSo \\(1\\) is a maximum value and \\(-1\\) a minimum value.\n\n\nExample: Didos problem\nConsider a slightly different problem: What shape should a rope (curve) of fixed length make to maximize the area between the rope and 
\\(x\\) axis?\nLet \\(L\\) be the length of the rope and suppose \\(y(x)\\) describes the curve. Then we wish to\n\\[\n\\text{Maximize } \\int y(x) dx, \\quad\\text{subject to }\n\\int \\sqrt{1 + y'(x)^2} dx = L.\n\\]\nThe latter is the formula for arc length. This is very much like an optimization problem that Lagranges method could help solve, but with one big difference: the answer is not a point but a function.\nThis is a variant of Didos problem, described by Bandle as\n\nDidos problem: The Roman poet Publius Vergilius Maro (70-19 B.C.) tells in his epic Aeneid the story of queen Dido, the daughter of the Phoenician king of the 9th century B.C. After the assassination of her husband by her brother she fled to a haven near Tunis. There she asked the local leader, Yarb, for as much land as could be enclosed by the hide of a bull. Since the deal seemed very modest, he agreed. Dido cut the hide into narrow strips, tied them together and encircled a large tract of land which became the city of Carthage. Dido faced the following mathematical problem, which is also known as the isoperimetric problem: Find among all curves of given length the one which encloses maximal area. Dido found intuitively the right answer.\n\nThe problem as stated above and method of solution follows notes by Wang, though Bandle attributes the ideas back to a 19-year-old Lagrange in a letter to Euler.\nThe method of solution will be to assume we have the function and then characterize this function in such a way that it can be identified.\nFollowing Lagrange, we generalize the problem to the following: maximize \\(\\int_{x_0}^{x_1} f(x, y(x), y'(x)) dx\\) subject to a constraint \\(\\int_{x_0}^{x_1} g(x,y(x), y'(x)) dx = K\\). Suppose \\(y(x)\\) is a solution.\nThe starting point is a perturbation: \\(\\hat{y}(x) = y(x) + \\epsilon_1 \\eta_1(x) + \\epsilon_2 \\eta_2(x)\\). 
There are two perturbation terms: were only one term added, the perturbation might make \\(\\hat{y}\\) fail to satisfy the constraint; the second term is used to ensure the constraint is not violated. If \\(\\hat{y}\\) is to be a possible solution to our problem, we would want \\(\\hat{y}(x_0) = \\hat{y}(x_1) = 0\\), as holds for \\(y(x)\\), so we assume \\(\\eta_1\\) and \\(\\eta_2\\) satisfy this boundary condition.\nWith this notation, and fixing \\(y\\), we can re-express the equations in terms of \\(\\epsilon_1\\) and \\(\\epsilon_2\\):\n\\[\n\\begin{align}\nF(\\epsilon_1, \\epsilon_2) &= \\int f(x, \\hat{y}, \\hat{y}') dx =\n\\int f(x, y + \\epsilon_1 \\eta_1 + \\epsilon_2 \\eta_2, y' + \\epsilon_1 \\eta_1' + \\epsilon_2 \\eta_2') dx,\\\\\nG(\\epsilon_1, \\epsilon_2) &= \\int g(x, \\hat{y}, \\hat{y}') dx =\n\\int g(x, y + \\epsilon_1 \\eta_1 + \\epsilon_2 \\eta_2, y' + \\epsilon_1 \\eta_1' + \\epsilon_2 \\eta_2') dx.\n\\end{align}\n\\]\nThen our problem is restated as:\n\\[\n\\text{Maximize } F(\\epsilon_1, \\epsilon_2) \\text{ subject to }\nG(\\epsilon_1, \\epsilon_2) = L.\n\\]\nNow, Lagranges method can be employed. This will be fruitful - even though we know the answer - it being \\(\\epsilon_1 = \\epsilon_2 = 0\\)!\nForging ahead, we compute \\(\\nabla{F}\\) and \\(\\lambda \\nabla{G}\\) and set \\(\\epsilon_1 = \\epsilon_2 = 0\\) where the two are equal. 
This will lead to a description of \\(y\\) in terms of \\(y'\\).\nLagranges method has:\n\\[\n\\frac{\\partial{F}}{\\partial{\\epsilon_1}}(0,0) - \\lambda \\frac{\\partial{G}}{\\partial{\\epsilon_1}}(0,0) = 0, \\text{ and }\n\\frac{\\partial{F}}{\\partial{\\epsilon_2}}(0,0) - \\lambda \\frac{\\partial{G}}{\\partial{\\epsilon_2}}(0,0) = 0.\n\\]\nComputing just the first one, using the chain rule and assuming the derivative and integral can be interchanged, we have:\n\\[\n\\begin{align}\n\\frac{\\partial{F}}{\\partial{\\epsilon_1}}\n&= \\int \\frac{\\partial}{\\partial{\\epsilon_1}}(\nf(x, y + \\epsilon_1 \\eta_1 + \\epsilon_2 \\eta_2, y' + \\epsilon_1 \\eta_1' + \\epsilon_2 \\eta_2')) dx\\\\\n&= \\int \\left(\\frac{\\partial{f}}{\\partial{y}} \\eta_1 + \\frac{\\partial{f}}{\\partial{y'}} \\eta_1'\\right) dx\\quad\\quad(\\text{from }\\nabla{f} \\cdot \\langle 0, \\eta_1, \\eta_1'\\rangle)\\\\\n&=\\int \\eta_1 \\left(\\frac{\\partial{f}}{\\partial{y}} - \\frac{d}{dx}\\frac{\\partial{f}}{\\partial{y'}}\\right) dx.\n\\end{align}\n\\]\nThe last line follows by integration by parts with \\(u = \\partial{f}/\\partial{y'}\\) and \\(v = \\eta_1\\): \\(\\int_{x_0}^{x_1} u(x) v'(x) dx = (u v)(x)\\big|_{x_0}^{x_1} - \\int_{x_0}^{x_1} u'(x) v(x) dx = - \\int_{x_0}^{x_1} u'(x) v(x) dx\\), the boundary term vanishing as \\(\\eta_1 = 0\\) at \\(x_0\\) and \\(x_1\\) by assumption. We get:\n\\[\n0 = \\int \\eta_1\\left(\\frac{\\partial{f}}{\\partial{y}} - \\frac{d}{dx}\\frac{\\partial{f}}{\\partial{y'}}\\right) dx.\n\\]\nWere \\(G\\) considered similarly, we would find an analogous statement. 
Setting \\(L(x, y, y') = f(x, y, y') - \\lambda g(x, y, y')\\), the combination of terms gives:\n\\[\n0 = \\int \\eta_1\\left(\\frac{\\partial{L}}{\\partial{y}} - \\frac{d}{dx}\\frac{\\partial{L}}{\\partial{y'}}\\right) dx.\n\\]\nSince \\(\\eta_1\\) is arbitrary save for its boundary conditions, under smoothness conditions on \\(L\\) this will imply the rest of the integrand must be \\(0\\).\nThat is, if \\(y(x)\\) is a maximizer of \\(\\int_{x_0}^{x_1} f(x, y, y')dx\\) and sufficiently smooth over \\([x_0, x_1]\\) and \\(y(x)\\) satisfies the constraint \\(\\int_{x_0}^{x_1} g(x, y, y')dx = K\\) then there exists a constant \\(\\lambda\\) such that \\(L = f -\\lambda g\\) will satisfy:\n\\[\n\\frac{d}{dx}\\frac{\\partial{L}}{\\partial{y'}} - \\frac{\\partial{L}}{\\partial{y}} = 0.\n\\]\nIf \\(\\partial{L}/\\partial{x} = 0\\), this simplifies to the Beltrami identity:\n\\[\nL - y' \\frac{\\partial{L}}{\\partial{y'}} = C.\\quad(\\text{Beltrami identity})\n\\]\n\nFor Didos problem, \\(f(x,y,y') = y\\) and \\(g(x, y, y') = \\sqrt{1 + y'^2}\\), so \\(L = y - \\lambda\\sqrt{1 + y'^2}\\) will have \\(0\\) partial derivative with respect to \\(x\\). Using the Beltrami identity we have:\n\\[\n(y - \\lambda\\sqrt{1 + y'^2}) + \\lambda y' \\frac{2y'}{2\\sqrt{1 + y'^2}} = C.\n\\]\nBy multiplying through by the denominator and squaring to remove the square root, a quadratic equation in \\(y'^2\\) can be found. 
This can be solved to give:\n\\[\ny' = \\frac{dy}{dx} = \\sqrt{\\frac{\\lambda^2 -(y + C)^2}{(y+C)^2}}.\n\\]\nHere is a snippet of SymPy code to verify the above, with the symbol y′ standing in for \\(y'\\):\n\n@vars y y′ λ C\nex = Eq(-λ*y′^2/sqrt(1 + y′^2) + λ*sqrt(1 + y′^2), C + y)\nΔ = sqrt(1 + y′^2) / (C+y)\nex1 = Eq(simplify(ex.lhs()*Δ), simplify(ex.rhs() * Δ))\nex2 = Eq(ex1.lhs()^2 - 1, simplify(ex1.rhs()^2) - 1)\n\n \n\\[\n\\frac{λ^{2}}{\\left(C + y\\right)^{2}} - 1 = y′^{2}\n\\]\n\n\n\nNow \\(y'\\) can be integrated using the substitution \\(y + C = \\lambda \\cos\\theta\\) to give: \\(-\\lambda\\int\\cos\\theta d\\theta = x + D\\), \\(D\\) some constant. That is:\n\\[\n\\begin{align}\nx + D &= - \\lambda \\sin\\theta\\\\\ny + C &= \\lambda\\cos\\theta.\n\\end{align}\n\\]\nSquaring gives the equation of a circle: \\((x +D)^2 + (y+C)^2 = \\lambda^2\\).\nWe center and rescale the problem so that \\(x_0 = -1, x_1 = 1\\). Then \\(L > 2\\) as otherwise the rope is too short. From here, we describe the radius and center of the circle.\nWe have \\(y=0\\) at \\(x=1\\) and \\(-1\\) giving:\n\\[\n\\begin{align}\n(-1 + D)^2 + (0 + C)^2 &= \\lambda^2\\\\\n(+1 + D)^2 + (0 + C)^2 &= \\lambda^2.\n\\end{align}\n\\]\nSquaring out and solving gives \\(D=0\\), \\(1 + C^2 = \\lambda^2\\). That is, an arc of circle with radius \\(\\sqrt{1+C^2}\\) and centered at \\((0, -C)\\).\n\\[\nx^2 + (y + C)^2 = 1 + C^2.\n\\]\nNow to identify \\(C\\) in terms of \\(L\\). \\(L\\) is the length of arc of circle of radius \\(r =\\sqrt{1 + C^2}\\) and angle \\(2\\theta\\), so \\(L = 2r\\theta\\). But using the boundary conditions in the equations for \\(x\\) and \\(y\\) gives \\(\\tan\\theta = 1/C\\), so \\(L = 2\\sqrt{1 + C^2}\\tan^{-1}(1/C)\\) which can be solved for \\(C\\) provided \\(L > 2\\).\n\n\nExample: more constraints\nConsider now the case of maximizing \\(f(x,y,z)\\) subject to \\(g(x,y,z)=c\\) and \\(h(x,y,z) = d\\). Can something similar be said to characterize potential values for this to occur? 
Trying to describe where \\(g(x,y,z) = c\\) and \\(h(x,y,z)=d\\) in general will prove difficult. The easy case would be if the two equations were linear, in which case they would describe planes. Two non-parallel planes would intersect in a line. In the general case, imagine the surfaces locally replaced by their tangent planes; then their intersection would be a line, and this line would point along the curve given by the intersection of the surfaces formed by the constraints. This line is similar to the tangent line in the \\(2\\)-variable case. Now if \\(\\nabla{f}\\), which points in the direction of greatest increase of \\(f\\), had a non-zero projection onto this line, then moving the point in that direction along the line would increase \\(f\\) and still leave the point following the constraints. That is, if there is a non-zero directional derivative the point is not a maximum.\nThe tangent planes are orthogonal to the vectors \\(\\nabla{g}\\) and \\(\\nabla{h}\\), so in this case the line of intersection is parallel to \\(\\nabla{g} \\times \\nabla{h}\\). The condition that \\(\\nabla{f}\\) be orthogonal to this vector means that \\(\\nabla{f}\\) must sit in the plane described by \\(\\nabla{g}\\) and \\(\\nabla{h}\\) - the plane of vectors orthogonal to \\(\\nabla{g} \\times \\nabla{h}\\). That is, this condition is needed:\n\\[\n\\nabla{f}(x,y,z) = \\lambda_1 \\nabla{g}(x,y,z) + \\lambda_2 \\nabla{h}(x,y,z).\n\\]\nAt a point satisfying the above, we would have the tangent “plane” of \\(f\\) is contained in the intersection of the tangent “planes” to \\(g\\) and \\(h\\).\n\nConsider a curve given through the intersection of two expressions: \\(g_1(x,y,z) = x^2 + y^2 - z^2 = 0\\) and \\(g_2(x,y,z) = x - 2z = 3\\). What is the minimum distance to the origin along this curve?\nWe have \\(f(x,y,z) = \\text{distance}(\\vec{x},\\vec{0}) = \\sqrt{x^2 + y^2 + z^2}\\), subject to the two constraints. 
As the square root is increasing, we can actually just consider \\(f(x,y,z) = x^2 + y^2 + z^2\\), ignoring the square root. The Lagrange multiplier technique instructs us to look for solutions to:\n\\[\n\\langle 2x, 2y, 2z \\rangle = \\lambda_1\\langle 2x, 2y, -2z\\rangle + \\lambda_2 \\langle 1, 0, -2 \\rangle.\n\\]\nHere we use SymPy:\n\n@syms z lambda1 lambda2\ng1(x, y, z) = x^2 + y^2 - z^2\ng2(x, y, z) = x - 2z - 3\nfₘ(x,y,z)= x^2 + y^2 + z^2\nLₘ(x,y,z,lambda1, lambda2) = fₘ(x,y,z) - lambda1*(g1(x,y,z) - 0) - lambda2*(g2(x,y,z) - 0)\n\n∇Lₘ = diff.(Lₘ(x,y,z,lambda1, lambda2), [x, y, z,lambda1, lambda2])\n\n5-element Vector{Sym}:\n -2⋅λ₁⋅x - λ₂ + 2⋅x\n -2⋅λ₁⋅y + 2⋅y\n 2⋅λ₁⋅z + 2⋅λ₂ + 2⋅z\n -x^2 - y^2 + z^2\n -x + 2⋅z + 3\n\n\nBefore trying to solve for \\(\\nabla{L} = \\vec{0}\\) we see from the second equation that either \\(\\lambda_1 = 1\\) or \\(y = 0\\). First we solve with \\(\\lambda_1 = 1\\):\n\nsolve(subs.(∇Lₘ, lambda1 .=> 1))\n\n2-element Vector{Dict{Any, Any}}:\n Dict(z => 0, x => 3, lambda2 => 0, y => -3*I)\n Dict(z => 0, x => 3, lambda2 => 0, y => 3*I)\n\n\nThere are no real solutions. Next when \\(y = 0\\) we get:\n\noutₘ = solve(subs.(∇Lₘ, y .=> 0))\n\n2-element Vector{Dict{Any, Any}}:\n Dict(z => -1, x => 1, lambda2 => 4/3, lambda1 => 1/3)\n Dict(z => -3, x => -3, lambda2 => 12, lambda1 => 3)\n\n\nThe two solutions have values yielding the extrema:\n\n[fₘ(d[x], 0, d[z]) for d in outₘ]\n\n2-element Vector{Sym}:\n 2\n 18"
},
{
"objectID": "differentiable_vector_calculus/scalar_functions_applications.html#taylors-theorem",
"href": "differentiable_vector_calculus/scalar_functions_applications.html#taylors-theorem",
"title": "56  Applications with scalar functions",
"section": "56.6 Taylors theorem",
"text": "56.6 Taylors theorem\nTaylors theorem for a univariate function states that if \\(f\\) has \\(k+1\\) derivatives in an open interval around \\(a\\), and \\(f^{(k)}\\) is continuous on the closed interval from \\(a\\) to \\(x\\), then:\n\\[\nf(x) = \\sum_{j=0}^k \\frac{f^{(j)}(a)}{j!} (x-a)^j + R_k(x),\n\\]\nwhere \\(R_k(x) = \\frac{f^{(k+1)}(\\xi)}{(k+1)!}(x-a)^{k+1}\\) for some \\(\\xi\\) between \\(a\\) and \\(x\\).\nThis theorem can be generalized to scalar functions, but the notation can be cumbersome. Following Folland we use multi-index notation. Suppose \\(f:R^n \\rightarrow R\\), and let \\(\\alpha=(\\alpha_1, \\alpha_2, \\dots, \\alpha_n)\\). Then define the following notation:\n\\[\n\\begin{align*}\n|\\alpha| &= \\alpha_1 + \\cdots + \\alpha_n, \\\\\n\\alpha! &= \\alpha_1!\\alpha_2!\\cdot\\cdots\\cdot\\alpha_n!, \\\\\n\\vec{x}^\\alpha &= x_1^{\\alpha_1}x_2^{\\alpha_2}\\cdots x_n^{\\alpha_n}, \\\\\n\\partial^\\alpha f &= \\partial_1^{\\alpha_1}\\partial_2^{\\alpha_2}\\cdots \\partial_n^{\\alpha_n} f \\\\\n& = \\frac{\\partial^{|\\alpha|}f}{\\partial x_1^{\\alpha_1} \\partial x_2^{\\alpha_2} \\cdots \\partial x_n^{\\alpha_n}}.\n\\end{align*}\n\\]\nThis notation makes many formulas from one dimension carry over to higher dimensions. 
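The multi-index operations can be made concrete with a few lines of plain Julia. This sketch uses helper names of our own (`αnorm`, `αfact`, `xpow`); a fuller, SymPy-based version appears later in this section.

```julia
# A plain-Julia sketch of the multi-index operations |α|, α!, and x^α
# for α = (α₁, …, αₙ). The helper names are our own choices.
αnorm(α) = sum(α)                                   # |α| = α₁ + ⋯ + αₙ
αfact(α) = prod(factorial.(α))                      # α! = α₁!⋅α₂!⋅⋯⋅αₙ!
xpow(x, α) = prod(xi^ai for (xi, ai) in zip(x, α))  # x^α = x₁^α₁⋅⋯⋅xₙ^αₙ

α = (1, 2, 1, 3)
αnorm(α), αfact(α), xpow((1, 2, 3, 4), α)  # → (7, 12, 768)
```

For instance, `|α| = 7` for `α = (1, 2, 1, 3)`, matching the example computed symbolically below.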
For example, the binomial theorem says:\n\\[\n(a+b)^n = \\sum_{k=0}^n \\frac{n!}{k!(n-k)!}a^kb^{n-k},\n\\]\nand this becomes:\n\\[\n(x_1 + x_2 + \\cdots + x_n)^k = \\sum_{|\\alpha|=k} \\frac{k!}{\\alpha!} \\vec{x}^\\alpha.\n\\]\nTaylors theorem then becomes:\nIf \\(f: R^n \\rightarrow R\\) is sufficiently smooth (\\(C^{k+1}\\)) on an open convex set \\(S\\) about \\(\\vec{a}\\) then if \\(\\vec{a}\\) and \\(\\vec{a}+\\vec{h}\\) are in \\(S\\),\n\\[\nf(\\vec{a} + \\vec{h}) = \\sum_{|\\alpha| \\leq k}\\frac{\\partial^\\alpha f(\\vec{a})}{\\alpha!}\\vec{h}^\\alpha + R_{\\vec{a},k}(\\vec{h}),\n\\]\nwhere \\(R_{\\vec{a},k} = \\sum_{|\\alpha|=k+1}\\frac{\\partial^\\alpha f(\\vec{a} + c\\vec{h})}{\\alpha!} \\vec{h}^\\alpha\\) for some \\(c\\) in \\((0,1)\\).\n\nExample\nThe elegant notation masks what can be complicated expressions. Consider the simple case \\(f:R^2 \\rightarrow R\\) and \\(k=2\\). Then this says:\n\\[\n\\begin{align*}\nf(x + dx, y+dy) &= f(x, y) + \\frac{\\partial f}{\\partial x} dx + \\frac{\\partial f}{\\partial y} dy \\\\\n&+ \\frac{\\partial^2 f}{\\partial x^2} \\frac{dx^2}{2} + 2\\frac{\\partial^2 f}{\\partial x\\partial y} \\frac{dx dy}{2}\\\\\n&+ \\frac{\\partial^2 f}{\\partial y^2} \\frac{dy^2}{2} + R_{\\langle x, y \\rangle, k}(\\langle dx, dy \\rangle).\n\\end{align*}\n\\]\nUsing \\(\\nabla\\) for the gradient, \\(H\\) for the Hessian, and \\(\\vec{x} = \\langle x, y \\rangle\\) and \\(d\\vec{x} = \\langle dx, dy \\rangle\\), this can be expressed as:\n\\[\nf(\\vec{x} + d\\vec{x}) = f(\\vec{x}) + \\nabla{f} \\cdot d\\vec{x} + \\frac{1}{2} d\\vec{x} \\cdot (H d\\vec{x}) + R_{\\vec{x}, k}(d\\vec{x}).\n\\]\nAs for \\(R\\), the full term involves terms for \\(\\alpha = (3,0), (2,1), (1,2)\\), and \\((0,3)\\). 
Using \\(\\vec{a} = \\langle x, y\\rangle\\) and \\(\\vec{h}=\\langle dx, dy\\rangle\\):\n\\[\n\\frac{\\partial^3 f(\\vec{a}+c\\vec{h})}{\\partial x^3} \\frac{dx^3}{3!}+\n\\frac{\\partial^3 f(\\vec{a}+c\\vec{h})}{\\partial x^2\\partial y} \\frac{dx^2 dy}{2!1!} +\n\\frac{\\partial^3 f(\\vec{a}+c\\vec{h})}{\\partial x\\partial y^2} \\frac{dxdy^2}{1!2!} +\n\\frac{\\partial^3 f(\\vec{a}+c\\vec{h})}{\\partial y^3} \\frac{dy^3}{3!}.\n\\]\nThe exact answer is usually not as useful as the bound: \\(|R| \\leq M/(k+1)! \\|\\vec{h}\\|^{k+1}\\), for some finite constant \\(M\\).\n\n\nExample\nWe can encode multiindices using SymPy. The basic definitions are fairly straightforward using zip to pair variables with components of \\(\\alpha\\). We define a new type so that we can overload the familiar notation:\n\nstruct MultiIndex\n alpha::Vector{Int}\n end\nBase.show(io::IO, α::MultiIndex) = println(io, \"α = ($(join(α.alpha, \", \")))\")\n\n## |α| = α_1 + ... + α_m\nBase.length(α::MultiIndex) = sum(α.alpha)\n\n## factorial(α) computes α!\nBase.factorial(α::MultiIndex) = prod(factorial(Sym(a)) for a in α.alpha)\n\n## x^α = x_1^α_1 * x_2^α^2 * ... * x_n^α_n\nimport Base: ^\n^(x, α::MultiIndex) = prod(u^a for (u,a) in zip(x, α.alpha))\n\n## ∂^α(ex) = ∂_1^α_1 ∘ ∂_2^α_2 ∘ ... ∘ ∂_n^α_n (ex)\npartial(ex::SymPy.SymbolicObject, α::MultiIndex, vars=free_symbols(ex)) = diff(ex, zip(vars, α.alpha)...)\n\npartial (generic function with 2 methods)\n\n\n\n@syms w\nalpha = MultiIndex([1,2,1,3])\nlength(alpha) # 1 + 2 + 1 + 3=7\n[1,2,3,4]^alpha\nexₜ = x^3 * cos(w*y*z)\npartial(exₜ, alpha, [w,x,y,z])\n\n \n\\[\n6 w^{2} x y^{2} \\left(- w^{2} y^{2} z^{2} \\sin{\\left(w y z \\right)} + 7 w y z \\cos{\\left(w y z \\right)} + 9 \\sin{\\left(w y z \\right)}\\right)\n\\]\n\n\n\nThe remainder term needs to know information about sets like \\(|\\alpha| =k\\). This is a combinatoric problem, even to identify the length. Here we define an iterator to iterate over all possible MultiIndexes. 
This is low level, and likely could be done in a much better style, so it need not be studied closely unless there is curiosity. It manually chains together iterators.\nstruct MultiIndices\n n::Int\n k::Int\nend\n\nfunction Base.length(as::MultiIndices)\n n,k = as.n, as.k\n n == 1 && return 1\n sum(length(MultiIndices(n-1, j)) for j in 0:k) # recursively identify length\nend\n\nfunction Base.iterate(alphas::MultiIndices)\n k, n = alphas.k, alphas.n\n n == 1 && return ([k],(0, MultiIndices(0,0), nothing))\n\n m = zeros(Int, n)\n m[1] = k\n betas = MultiIndices(n-1, 0)\n stb = iterate(betas)\n st = (k, MultiIndices(n-1, 0), stb)\n return (m, st)\nend\n\nfunction Base.iterate(alphas::MultiIndices, st)\n\n st == nothing && return nothing\n k,n = alphas.k, alphas.n\n k == 0 && return nothing\n n == 1 && return nothing\n\n # can we iterate the next one?\n bk, bs, stb = st\n\n if stb==nothing\n bk = bk-1\n bk < 0 && return nothing\n bs = MultiIndices(bs.n, bs.k+1)\n val, stb = iterate(bs)\n return (vcat(bk,val), (bk, bs, stb))\n end\n\n resp = iterate(bs, stb)\n if resp == nothing\n bk = bk-1\n bk < 0 && return nothing\n bs = MultiIndices(bs.n, bs.k+1)\n val, stb = iterate(bs)\n return (vcat(bk, val), (bk, bs, stb))\n end\n\n val, stb = resp\n return (vcat(bk, val), (bk, bs, stb))\n\nend\nThis returns a vector, not a MultiIndex. Here we get all multiindices in two variables of size \\(3\\):\n\ncollect(MultiIndices(2, 3))\n\n4-element Vector{Any}:\n [3, 0]\n [2, 1]\n [1, 2]\n [0, 3]\n\n\nTo get all of size \\(3\\) or less, we could do something like this:\n\nunion((collect(MultiIndices(2, i)) for i in 0:3)...)\n\n10-element Vector{Any}:\n [0, 0]\n [1, 0]\n [0, 1]\n [2, 0]\n [1, 1]\n [0, 2]\n [3, 0]\n [2, 1]\n [1, 2]\n [0, 3]\n\n\nTo gauge the computational complexity, consider an example. 
Suppose we had \\(3\\) variables and were interested in the error for order \\(4\\):\n\nk = 4\nlength(MultiIndices(3, k+1))\n\n21\n\n\nFinally, to see how compact the notation issue, suppose \\(f:R^3 \\rightarrow R\\), we have the third-order Taylor series expands to \\(20\\) terms as follows:\n\n@syms F() a[1:3] dx[1:3]\n\nsum(partial(F(a...), α, a) / factorial(α) * dx^α for k in 0:3 for α in MultiIndex.(MultiIndices(3, k))) # 3rd order\n\n \n\\[\n\\frac{dx₁^{3} \\frac{\\partial^{3}}{\\partial a₁^{3}} F{\\left(a₁,a₂,a₃ \\right)}}{6} + \\frac{dx₁^{2} dx₂ \\frac{\\partial^{3}}{\\partial a₂\\partial a₁^{2}} F{\\left(a₁,a₂,a₃ \\right)}}{2} + \\frac{dx₁^{2} dx₃ \\frac{\\partial^{3}}{\\partial a₃\\partial a₁^{2}} F{\\left(a₁,a₂,a₃ \\right)}}{2} + \\frac{dx₁^{2} \\frac{\\partial^{2}}{\\partial a₁^{2}} F{\\left(a₁,a₂,a₃ \\right)}}{2} + \\frac{dx₁ dx₂^{2} \\frac{\\partial^{3}}{\\partial a₂^{2}\\partial a₁} F{\\left(a₁,a₂,a₃ \\right)}}{2} + dx₁ dx₂ dx₃ \\frac{\\partial^{3}}{\\partial a₃\\partial a₂\\partial a₁} F{\\left(a₁,a₂,a₃ \\right)} + dx₁ dx₂ \\frac{\\partial^{2}}{\\partial a₂\\partial a₁} F{\\left(a₁,a₂,a₃ \\right)} + \\frac{dx₁ dx₃^{2} \\frac{\\partial^{3}}{\\partial a₃^{2}\\partial a₁} F{\\left(a₁,a₂,a₃ \\right)}}{2} + dx₁ dx₃ \\frac{\\partial^{2}}{\\partial a₃\\partial a₁} F{\\left(a₁,a₂,a₃ \\right)} + dx₁ \\frac{\\partial}{\\partial a₁} F{\\left(a₁,a₂,a₃ \\right)} + \\frac{dx₂^{3} \\frac{\\partial^{3}}{\\partial a₂^{3}} F{\\left(a₁,a₂,a₃ \\right)}}{6} + \\frac{dx₂^{2} dx₃ \\frac{\\partial^{3}}{\\partial a₃\\partial a₂^{2}} F{\\left(a₁,a₂,a₃ \\right)}}{2} + \\frac{dx₂^{2} \\frac{\\partial^{2}}{\\partial a₂^{2}} F{\\left(a₁,a₂,a₃ \\right)}}{2} + \\frac{dx₂ dx₃^{2} \\frac{\\partial^{3}}{\\partial a₃^{2}\\partial a₂} F{\\left(a₁,a₂,a₃ \\right)}}{2} + dx₂ dx₃ \\frac{\\partial^{2}}{\\partial a₃\\partial a₂} F{\\left(a₁,a₂,a₃ \\right)} + dx₂ \\frac{\\partial}{\\partial a₂} F{\\left(a₁,a₂,a₃ \\right)} + \\frac{dx₃^{3} \\frac{\\partial^{3}}{\\partial a₃^{3}} 
F{\\left(a₁,a₂,a₃ \\right)}}{6} + \\frac{dx₃^{2} \\frac{\\partial^{2}}{\\partial a₃^{2}} F{\\left(a₁,a₂,a₃ \\right)}}{2} + dx₃ \\frac{\\partial}{\\partial a₃} F{\\left(a₁,a₂,a₃ \\right)} + F{\\left(a₁,a₂,a₃ \\right)}\n\\]"
},
{
"objectID": "differentiable_vector_calculus/scalar_functions_applications.html#questions",
"href": "differentiable_vector_calculus/scalar_functions_applications.html#questions",
"title": "56  Applications with scalar functions",
"section": "56.7 Questions",
"text": "56.7 Questions\n\nQuestion\nLet \\(f(x,y) = \\sqrt{x + y}\\). Find the tangent plane approximation for \\(f(2.1, 2.2)\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x,y,z) = xy + yz + zx\\). Using a linear approximation estimate \\(f(1.1, 1.0, 0.9)\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x,y,z) = xy + yz + zx - 3\\). What equation describes the tangent approximation at \\((1,1,1)\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(x + y + z = 3\\)\n \n \n\n\n \n \n \n \n \\(2x + y - 2z = 1\\)\n \n \n\n\n \n \n \n \n \\(x + 2y + 3z = 6\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\n(Knill) Let \\(f(x,y) = xy + x^2y + xy^2\\).\nFind the gradient of \\(f\\):\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\langle 2xy + y^2 + y, 2xy + x^2 + x\\rangle\\)\n \n \n\n\n \n \n \n \n \\(\\langle 2y + y^2, 2x + x^2\\)\n \n \n\n\n \n \n \n \n \\(y^2 + y, x^2 + x\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nIs this the Hessian of \\(f\\)?\n\\[\n\\left[\\begin{matrix}2 y & 2 x + 2 y + 1\\\\2 x + 2 y + 1 & 2 x\\end{matrix}\\right]\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe point \\((-1/3, -1/3)\\) is a solution to the \\(\\nabla{f} = 0\\). 
What is the determinant, \\(d\\), of the Hessian at this point?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhich is true of \\(f\\) at \\((-1/3, 1/3)\\):\n\n\n\n \n \n \n \n \n \n \n \n \n The function \\(f\\) has a local minimum, as \\(f_{xx} > 0\\) and \\(d >0\\)\n \n \n\n\n \n \n \n \n The function \\(f\\) has a local maximum, as \\(f_{xx} < 0\\) and \\(d >0\\)\n \n \n\n\n \n \n \n \n The function \\(f\\) has a saddle point, as \\(d < 0\\)\n \n \n\n\n \n \n \n \n Nothing can be said, as \\(d=0\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\n(Knill) Let the Tutte polynomial be \\(f(x,y) = x + 2x^2 + x^3 + y + 2xy + y^2\\).\nDoes this accurately find the gradient of \\(f\\)?\n\nf(x,y) = x + 2x^2 + x^3 + y + 2x*y + y^2\n@syms x::real y::real\ngradf = gradient(f(x,y), [x,y])\n\n2-element Vector{Sym}:\n 3*x^2 + 4*x + 2*y + 1\n 2⋅x + 2⋅y + 1\n\n\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nHow many answers does this find to \\(\\nabla{f} = \\vec{0}\\)?\n\nf(x,y) = x + 2x^2 + x^3 + y + 2x*y + y^2\n@syms x::real y::real\ngradf = gradient(f(x,y), [x,y])\n\nsolve(gradf, [x,y])\n\n2-element Vector{Tuple{Sym, Sym}}:\n (-2/3, 1/6)\n (0, -1/2)\n\n\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe Hessian is found by\n\nf(x,y) = x + 2x^2 + x^3 + y + 2x*y + y^2\n@syms x::real y::real\ngradf = gradient(f(x,y), [x,y])\n\nsympy.hessian(f(x,y), [x,y])\n\n2×2 Matrix{Sym}:\n 6⋅x + 4 2\n 2 2\n\n\nWhich is true of \\(f\\) at \\((-1/3, 1/3)\\):\n\n\n\n \n \n \n \n \n \n \n \n \n The function \\(f\\) has a local minimum, as \\(f_{xx} > 0\\) and \\(d >0\\)\n \n \n\n\n \n \n \n \n The function \\(f\\) has a local maximum, as \\(f_{xx} < 0\\) and \\(d >0\\)\n \n \n\n\n \n \n \n \n The function \\(f\\) has a saddle point, as \\(d < 0\\)\n \n \n\n\n \n \n \n \n Nothing can be said, as \\(d=0\\)\n \n \n\n\n \n \n \n \n The test does not apply, as \\(\\nabla{f}\\) is not \\(0\\) at this point.\n \n 
\nWhich is true of \\(f\\) at \\((0, -1/2)\\):\n\nThe function \\(f\\) has a local minimum, as \\(f_{xx} > 0\\) and \\(d >0\\)\nThe function \\(f\\) has a local maximum, as \\(f_{xx} < 0\\) and \\(d >0\\)\nThe function \\(f\\) has a saddle point, as \\(d < 0\\)\nNothing can be said, as \\(d=0\\)\nThe test does not apply, as \\(\\nabla{f}\\) is not \\(0\\) at this point.\n\nWhich is true of \\(f\\) at \\((1/2, 0)\\):\n\nThe function \\(f\\) has a local minimum, as \\(f_{xx} > 0\\) and \\(d >0\\)\nThe function \\(f\\) has a local maximum, as \\(f_{xx} < 0\\) and \\(d >0\\)\nThe function \\(f\\) has a saddle point, as \\(d < 0\\)\nNothing can be said, as \\(d=0\\)\nThe test does not apply, as \\(\\nabla{f}\\) is not \\(0\\) at this point.\n\n\nQuestion\n(Strang p509) Consider the quadratic function \\(f(x,y) = ax^2 + bxy + cy^2\\). 
Since the second partial derivative test is essentially done by replacing the function at a critical point by a quadratic function, understanding this \\(f\\) is of some interest.\nIs this the Hessian of \\(f\\)?\n\\[\n\\left[\n\\begin{array}{}\n2a & 2b\\\\\n2b & 2c\n\\end{array}\n\\right]\n\\]\n\nYes\nNo\n\nOr is this the Hessian of \\(f\\)?\n\\[\n\\left[\n\\begin{array}{}\n2ax & by\\\\\nbx & 2cy\n\\end{array}\n\\right]\n\\]\n\nYes\nNo\n\nExplain why \\(ac - b^2\\) is of any interest here:\n\nIt isn't, \\(b^2-4ac\\) is from the quadratic formula\nIt is the determinant of the Hessian\n\nWhich condition on \\(a\\), \\(b\\), and \\(c\\) will ensure a local maximum:\n\nThat \\(a>0\\) and \\(ac-b^2 > 0\\)\nThat \\(a<0\\) and \\(ac-b^2 > 0\\)\nThat \\(ac-b^2 < 0\\)\n\nWhich condition on \\(a\\), \\(b\\), and \\(c\\) will ensure a saddle point?\n\nThat \\(a>0\\) and \\(ac-b^2 > 0\\)\nThat \\(a<0\\) and \\(ac-b^2 > 0\\)\nThat \\(ac-b^2 < 0\\)\n\n\nQuestion\nLet \\(f(x,y) = e^{-x^2 - y^2} (2x^2 + y^2)\\). 
Use Lagrange's method to find the absolute maximum and absolute minimum over \\(x^2 + y^2 = 3\\).\nIs \\(\\nabla{f}\\) given by the following?\n\\[\n\\nabla{f} = 2 e^{-x^2 - y^2} \\langle x(2 - 2x^2 - y^2), y(1 - 2x^2 - y^2)\\rangle.\n\\]\n\nYes\nNo\n\nWhich vector is orthogonal to the contour line \\(x^2 + y^2 = 3\\)?\n\n\\(\\langle 2x, 2y\\rangle\\)\n\\(\\langle 2x, y^2\\rangle\\)\n\\(\\langle x^2, 2y \\rangle\\)\n\nDue to the form of the gradient of the constraint, finding when \\(\\nabla{f} = \\lambda \\nabla{g}\\) is the same as identifying when the ratio \\(|f_x/f_y|\\) is \\(1\\). The following solves for this by checking each point on the constraint:\n\nf(x,y) = exp(-x^2-y^2) * (2x^2 + y^2)\nf(v) = f(v...)\nr(t) = 3*[cos(t), sin(t)]\nrat(x) = abs(x[1]/x[2]) - 1\nfn = rat ∘ ∇(f) ∘ r\nts = fzeros(fn, 0, 2pi)\n\n4-element Vector{Float64}:\n 0.7449781871982899\n 2.396614466391503\n 3.886570840788083\n 5.538207119981296\n\nUsing these points, what is the largest value on the boundary?"
},
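The second partial derivative test used in the questions above can be sketched numerically. This is a hypothetical helper, not from the notes: the Hessian entries of \(f(x,y) = x + 2x^2 + x^3 + y + 2xy + y^2\) are entered by hand from the symbolic output, and the sign of the determinant classifies each critical point.

```julia
# Hessian of f(x,y) = x + 2x^2 + x^3 + y + 2xy + y^2, entered by hand:
# [6x + 4  2; 2  2]
fxx(x, y) = 6x + 4
fxy(x, y) = 2.0
fyy(x, y) = 2.0
d(x, y) = fxx(x, y) * fyy(x, y) - fxy(x, y)^2   # Hessian determinant

# classify a critical point by the sign of d and, when d > 0, the sign of fxx
function classify(x, y)
    D = d(x, y)
    D < 0  && return "saddle"
    D == 0 && return "inconclusive"
    fxx(x, y) > 0 ? "local minimum" : "local maximum"
end

classify(-2/3, 1/6), classify(0, -1/2)  # → ("saddle", "local minimum")
```

At \((-2/3, 1/6)\) the determinant is \(0\cdot 2 - 4 = -4 < 0\), a saddle; at \((0, -1/2)\) it is \(4\cdot 2 - 4 = 4 > 0\) with \(f_{xx} = 4 > 0\), a local minimum.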
{
"objectID": "differentiable_vector_calculus/vector_fields.html",
"href": "differentiable_vector_calculus/vector_fields.html",
"title": "57  Functions \\(R^n \\rightarrow R^m\\)",
"section": "",
"text": "This section uses these add-on packages:\nFor a scalar function \\(f: R^n \\rightarrow R\\), the gradient of \\(f\\), \\(\\nabla{f}\\), is a function from \\(R^n \\rightarrow R^n\\). Specializing to \\(n=2\\), this is a function that, for each point \\((x,y)\\), assigns a vector \\(\\vec{v}\\). This is an example of a vector field. More generally, we could have a function \\(f: R^n \\rightarrow R^m\\), of which we have discussed many already:\nAfter an example where the use of a multivariable function is a necessity, we discuss differentiation in general for multivariable functions."
},
{
"objectID": "differentiable_vector_calculus/vector_fields.html#vector-fields",
"href": "differentiable_vector_calculus/vector_fields.html#vector-fields",
"title": "57  Functions \\(R^n \\rightarrow R^m\\)",
"section": "57.1 Vector fields",
"text": "57.1 Vector fields\nWe have seen that the gradient of a scalar function, \\(f:R^2 \\rightarrow R\\), takes a point in \\(R^2\\) and associates a vector in \\(R^2\\). As such, \\(\\nabla{f}:R^2 \\rightarrow R^2\\) is a vector field. A vector field can be visualized by sampling a region and representing the field at those points. The details, as previously mentioned, are in the vectorfieldplot function of CalculusWithJulia.\n\nF(u,v) = [-v, u]\nvectorfieldplot(F, xlim=(-5,5), ylim=(-5,5), nx=10, ny=10)\n\n\n\n\nThe optional arguments nx=10 and ny=10 determine the number of grid points at which a vector will be plotted. These vectors are scaled to not overlap.\nVector field plots are useful for visualizing velocity fields, where a velocity vector is associated to each point; or streamlines, curves whose tangents follow the velocity vector of a flow. Vector fields are used in physics to model the electric field and the magnetic field. These are used to describe forces on objects within the field.\nA plot of sampled vectors is one way to illustrate a vector field, but there is an alternative using field lines. Like Euler's method, imagine starting at some point, \\(\\vec{r}\\) in \\(R^3\\). The field at that point is a vector indicating a direction of motion. Follow that vector for some infinitesimal amount, \\(d\\vec{r}\\). From here repeat. The field curve would satisfy \\(\\vec{r}'(t) = F(\\vec{r}(t))\\). Field curves only show direction; to indicate magnitude at a point, the convention is to use denser lines when the field is stronger.\n\n\n\nIllustration of the magnetic field of the earth using field lines to indicate the field. From Wikipedia.\n\n\n\nVector fields are also useful for other purposes, such as transformations, examples of which are a rotation or the conversion from polar to rectangular coordinates.\nFor transformations, a useful visualization is to plot curves where one variable is fixed. 
Consider the transformation from polar coordinates to cartesian coordinates \\(F(r, \\theta) = r \\langle\\cos(\\theta),\\sin(\\theta)\\rangle\\). The following plot will show in blue fixed values of \\(r\\) (circles) and in red fixed values of \\(\\theta\\) (rays).\n\nF(r,theta) = r*[cos(theta), sin(theta)]\nF(v) = F(v...)\n\nrs = range(0, 2, length=5)\nthetas = range(0, pi/2, length=9)\n\nplot(legend=false, aspect_ratio=:equal)\nplot!(unzip(F.(rs, thetas'))..., color=:red)\nplot!(unzip(F.(rs', thetas))..., color=:blue)\n\npt = [1, pi/4]\nJ = ForwardDiff.jacobian(F, pt)\narrow!(F(pt...), J[:,1], linewidth=5, color=:red)\narrow!(F(pt...), J[:,2], linewidth=5, color=:blue)\n\n\n\n\nTo the plot, we added the partial derivatives with respect to \\(r\\) (in red) and with respect to \\(\\theta\\) (in blue). These are found with the soon-to-be discussed Jacobian. From the graph, you can see that these vectors are tangent vectors to the drawn curves."
},
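The claim that the Jacobian's columns are tangent to the fixed-\(\theta\) and fixed-\(r\) curves can be checked without any add-on packages. A minimal sketch (the helper names here are ours, not from the notes) compares a hand-computed Jacobian of the polar-to-Cartesian map against a centered finite-difference approximation:

```julia
# Polar-to-Cartesian map and its hand-computed Jacobian; the columns are
# the partial derivatives with respect to r and theta.
F(r, theta) = [r * cos(theta), r * sin(theta)]
J(r, theta) = [cos(theta)  -r*sin(theta);
               sin(theta)   r*cos(theta)]

# Centered finite differences, one column per variable.
function fd_jacobian(F, r, theta; h = 1e-6)
    col_r = (F(r + h, theta) - F(r - h, theta)) / (2h)
    col_t = (F(r, theta + h) - F(r, theta - h)) / (2h)
    [col_r col_t]
end

r0, t0 = 1.0, pi/4
err = maximum(abs.(J(r0, t0) - fd_jacobian(F, r0, t0)))  # should be near 0
```

The small value of `err` confirms the hand computation; in the notes the same matrix is produced by `ForwardDiff.jacobian`.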
{
"objectID": "differentiable_vector_calculus/vector_fields.html#parametrically-defined-surfaces",
"href": "differentiable_vector_calculus/vector_fields.html#parametrically-defined-surfaces",
"title": "57  Functions \\(R^n \\rightarrow R^m\\)",
"section": "57.2 Parametrically defined surfaces",
"text": "57.2 Parametrically defined surfaces\nFor a one-dimensional curve we have several descriptions. For example, as the graph of a function \\(y=f(x)\\); as a parametrically defined curve \\(\\vec{r}(t) = \\langle x(t), y(t)\\rangle\\); or as a level curve of a scalar function \\(f(x,y) = c\\).\nFor two-dimensional surfaces in three dimensions, we have discussed describing these in terms of a function \\(z = f(x,y)\\) and as level surfaces of scalar functions: \\(c = f(x,y,z)\\). They can also be described parametrically.\nWe pick a familiar case, to make this concrete: the unit sphere in \\(R^3\\). We have\n\nIt is described by two functions through \\(f(x,y) = \\pm \\sqrt{1 - (x^2 + y^2)}\\).\nIt is described by \\(f(x,y,z) = 1\\), where \\(f(x,y,z) = x^2 + y^2 + z^2\\).\nIt can be described in terms of spherical coordinates:\n\n\\[\n\\Phi(\\theta, \\phi) = \\langle \\sin(\\phi)\\cos(\\theta), \\sin(\\phi)\\sin(\\theta), \\cos(\\phi) \\rangle,\n\\]\nwith \\(\\theta\\) the azimuthal angle and \\(\\phi\\) the polar angle (measured down from the \\(z\\) axis).\nThe function \\(\\Phi\\) takes \\(R^2\\) into \\(R^3\\), so is a multivariable function.\nWhen a surface is described by a function, \\(z=f(x,y)\\), then the gradient points (in the \\(x-y\\) plane) in the direction of greatest increase of \\(f\\). The vector \\(\\langle -f_x, -f_y, 1\\rangle\\) is a normal.\nWhen a surface is described as a level surface, \\(f(x,y,z) = c\\), then the gradient is normal to the surface.\nWhen a surface is described parametrically, there is no “gradient.” The partial derivatives are of interest, e.g., \\(\\partial{F}/\\partial{\\theta}\\) and \\(\\partial{F}/\\partial{\\phi}\\), vectors defined componentwise. These will lie in the tangent plane of the surface, as they can be viewed as tangent vectors for parametrically defined curves on the surface. Their cross product will be normal to the surface. 
The magnitude of the cross product, which reflects the angle between the two partial derivatives, will be informative as to the surface area.\n\n57.2.1 Plotting parametrized surfaces in Julia\nConsider the parametrically described surface above. How would it be plotted? Using the Plots package, the process is quite similar to how a surface described by a function is plotted, but the \\(z\\) values must be computed prior to plotting.\nHere we define the parameterization using functions to represent each component:\n\nX(theta,phi) = sin(phi) * cos(theta)\nY(theta,phi) = sin(phi) * sin(theta)\nZ(theta,phi) = cos(phi)\n\nZ (generic function with 1 method)\n\n\nThen:\n\nthetas = range(0, stop=pi/2, length=50)\nphis = range(0, stop=pi, length=50)\n\nxs = [X(theta, phi) for theta in thetas, phi in phis]\nys = [Y(theta, phi) for theta in thetas, phi in phis]\nzs = [Z(theta, phi) for theta in thetas, phi in phis]\n\nsurface(xs, ys, zs) ## see note\n\n\n\n\n\n\n\n\n\n\nNote\n\n\n\n\n\n\nOnly some backends for Plots will produce this type of plot. 
Both plotly() and pyplot() will, but not gr().\n\nNote\n\nPyPlot can be used directly to make these surface plots: import PyPlot; PyPlot.plot_surface(xs, ys, zs).\n\n\nInstead of the comprehension, broadcasting can be used:\n\nsurface(X.(thetas, phis'), Y.(thetas, phis'), Z.(thetas, phis'))\n\n\nIf the parameterization is presented as a function, broadcasting can be used to succinctly plot:\n\nPhi(theta, phi) = [X(theta, phi), Y(theta, phi), Z(theta, phi)]\n\nsurface(unzip(Phi.(thetas, phis'))...)\n\n\nThe partial derivatives of each component, \\(\\partial{\\Phi}/\\partial{\\theta}\\) and \\(\\partial{\\Phi}/\\partial{\\phi}\\), can be computed directly:\n\\[\n\\begin{align*}\n\\partial{\\Phi}/\\partial{\\theta} &= \\langle -\\sin(\\phi)\\sin(\\theta), \\sin(\\phi)\\cos(\\theta), 0 \\rangle,\\\\\n\\partial{\\Phi}/\\partial{\\phi} &= \\langle \\cos(\\phi)\\cos(\\theta), \\cos(\\phi)\\sin(\\theta), -\\sin(\\phi) \\rangle.\n\\end{align*}\n\\]\nUsing SymPy, these can be computed through:\n\n@syms theta phi\nout = [diff.(Phi(theta, phi), theta) diff.(Phi(theta, phi), phi)]\n\n3×2 Matrix{Sym}:\n -sin(φ)⋅sin(θ) cos(φ)⋅cos(θ)\n sin(φ)⋅cos(θ) sin(θ)⋅cos(φ)\n 0 -sin(φ)\n\n\nAt the point \\((\\theta, \\phi) = (\\pi/12, \\pi/6)\\) this evaluates to the following.\n\nsubs.(out, theta .=> PI/12, phi .=> PI/6) .|> N\n\n3×2 Matrix{Real}:\n -0.12941 0.836516\n 0.482963 0.224144\n 0 -1//2\n\n\nWe found numeric values so that we can compare them to the numerically identical values computed by the jacobian function from ForwardDiff:\n\npt = [pi/12, pi/6]\nout₁ = ForwardDiff.jacobian(v -> Phi(v...), pt)\n\n3×2 Matrix{Float64}:\n -0.12941 0.836516\n 0.482963 0.224144\n -0.0 -0.5\n\n\nWhat this function computes exactly will be described next, but here we visualize the partial derivatives and see they lie in the tangent plane at the point:\n\nus, vs = range(0, pi/2, length=25), range(0, pi, length=25)\nxs, ys, zs = unzip(Phi.(us, vs'))\nsurface(xs, ys, zs, 
legend=false)\narrow!(Phi(pt...), out₁[:,1], linewidth=3)\narrow!(Phi(pt...), out₁[:,2], linewidth=3)"
},
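That the cross product of the two partials is normal to the sphere can be verified with only the standard library. In this sketch (the function names are ours) the partial derivatives are the hand-computed ones from the text; for the unit sphere the normal should be parallel to the position vector \(\Phi\) itself:

```julia
using LinearAlgebra  # cross, norm

Phi(theta, phi) = [sin(phi)*cos(theta), sin(phi)*sin(theta), cos(phi)]

# hand-computed partial derivatives from the text
dPhi_dtheta(theta, phi) = [-sin(phi)*sin(theta), sin(phi)*cos(theta), 0.0]
dPhi_dphi(theta, phi)   = [ cos(phi)*cos(theta), cos(phi)*sin(theta), -sin(phi)]

theta0, phi0 = pi/12, pi/6
n = cross(dPhi_dtheta(theta0, phi0), dPhi_dphi(theta0, phi0))

# n is parallel to Phi(theta0, phi0), so their cross product should vanish
resid = norm(cross(n, Phi(theta0, phi0)))
```

Indeed, carrying out the cross product symbolically gives \(n = -\sin(\phi)\,\Phi(\theta,\phi)\), so `resid` is zero up to floating-point error.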
{
"objectID": "differentiable_vector_calculus/vector_fields.html#the-total-derivative",
"href": "differentiable_vector_calculus/vector_fields.html#the-total-derivative",
"title": "57  Functions \\(R^n \\rightarrow R^m\\)",
"section": "57.3 The total derivative",
"text": "57.3 The total derivative\nInformally, the total derivative at \\(a\\) is the best linear approximation of the value of a function, \\(F\\), near \\(a\\) with respect to its arguments. If it exists, denote it \\(dF_a\\).\nFor a function \\(F: R^n \\rightarrow R^m\\), the total derivative at \\(\\vec{a}\\) (a point or vector in \\(R^n\\)) is a matrix \\(J\\) (a linear transformation) taking vectors in \\(R^n\\) and returning, under multiplication, vectors in \\(R^m\\) (this matrix will be \\(m \\times n\\)), such that for some neighborhood of \\(\\vec{a}\\), we have:\n\\[\n\\lim_{\\vec{x} \\rightarrow \\vec{a}} \\frac{\\|F(\\vec{x}) - F(\\vec{a}) - J\\cdot(\\vec{x}-\\vec{a})\\|}{\\|\\vec{x} - \\vec{a}\\|} = 0.\n\\]\n(That is, \\(\\|F(\\vec{x}) - F(\\vec{a}) - J\\cdot(\\vec{x}-\\vec{a})\\|=\\mathcal{o}(\\|\\vec{x}-\\vec{a}\\|)\\).)\nIf for some \\(J\\) the above holds, the function \\(F\\) is said to be totally differentiable, and the matrix \\(J = J_F = dF_a\\) is the total derivative.\nFor a multivariable function \\(F:R^n \\rightarrow R^m\\), we may express the function in vector-valued form \\(F(\\vec{x}) = \\langle f_1(\\vec{x}), f_2(\\vec{x}),\\dots,f_m(\\vec{x})\\rangle\\), each component a scalar function. 
Then, if the total derivative exists, it can be expressed by the Jacobian:\n\\[\nJ = \\left[\n\\begin{align*}\n\\frac{\\partial f_1}{\\partial x_1} &\\quad \\frac{\\partial f_1}{\\partial x_2} &\\dots&\\quad\\frac{\\partial f_1}{\\partial x_n}\\\\\n\\frac{\\partial f_2}{\\partial x_1} &\\quad \\frac{\\partial f_2}{\\partial x_2} &\\dots&\\quad\\frac{\\partial f_2}{\\partial x_n}\\\\\n&&\\vdots&\\\\\n\\frac{\\partial f_m}{\\partial x_1} &\\quad \\frac{\\partial f_m}{\\partial x_2} &\\dots&\\quad\\frac{\\partial f_m}{\\partial x_n}\n\\end{align*}\n\\right].\n\\]\nThis may also be viewed as:\n\\[\nJ = \\left[\n\\begin{align*}\n&\\nabla{f_1}'\\\\\n&\\nabla{f_2}'\\\\\n&\\quad\\vdots\\\\\n&\\nabla{f_m}'\n\\end{align*}\n\\right] =\n\\left[\n\\frac{\\partial{F}}{\\partial{x_1}}\\quad\n\\frac{\\partial{F}}{\\partial{x_2}} \\cdots\n\\frac{\\partial{F}}{\\partial{x_n}}\n\\right].\n\\]\nThe latter represents \\(J\\) either as a matrix of \\(m\\) row vectors, each with \\(n\\) components, or as a matrix of \\(n\\) column vectors, each with \\(m\\) components.\n\nAfter specializing the total derivative to the cases already discussed, we have:\n\nUnivariate functions. Here \\(f'(t)\\) is also univariate. Identifying \\(J\\) with the \\(1 \\times 1\\) matrix with component \\(f'(t)\\), the total derivative is just a restatement of the derivative existing.\nVector-valued functions \\(\\vec{f}(t) = \\langle f_1(t), f_2(t), \\dots, f_m(t) \\rangle\\), each component univariate. Then the derivative is \\(\\vec{f}'(t) = \\langle \\frac{df_1}{dt}, \\frac{df_2}{dt}, \\dots, \\frac{df_m}{dt} \\rangle\\). The total derivative in this case is an \\(m \\times 1\\) vector of partial derivatives, and since there is only \\(1\\) variable, would be written without partials. So the two agree.\nScalar functions \\(f(\\vec{x}) = a\\) of type \\(R^n \\rightarrow R\\). 
The definition of differentiability for \\(f\\) involved existence of the partial derivatives and, moreover, the fact that a limit like the above held with \\(\\nabla{f}(C) \\cdot \\vec{h}\\) in place of \\(J\\cdot(\\vec{x}-\\vec{a})\\). Here \\(\\vec{h}\\) and \\(\\vec{x}-\\vec{a}\\) are vectors in \\(R^n\\). Were the dot product in \\(\\nabla{f}(C) \\cdot \\vec{h}\\) expressed as matrix multiplication, we would have for this case a \\(1 \\times n\\) matrix of the correct form:\n\\[\nJ = [\\nabla{f}'].\n\\]\n\nFor \\(f:R^2 \\rightarrow R\\), the Hessian matrix was the matrix of \\(2\\)nd partial derivatives. This may be viewed as the total derivative of the gradient function, \\(\\nabla{f}\\):\n\n\\[\n\\text{Hessian} =\n\\left[\n\\begin{align*}\n\\frac{\\partial^2 f}{\\partial x^2} &\\quad \\frac{\\partial^2 f}{\\partial x \\partial y}\\\\\n\\frac{\\partial^2 f}{\\partial y \\partial x} &\\quad \\frac{\\partial^2 f}{\\partial y^2}\n\\end{align*}\n\\right]\n\\]\nThis is equivalent to:\n\\[\n\\left[\n\\begin{align*}\n\\frac{\\partial \\frac{\\partial f}{\\partial x}}{\\partial x} &\\quad \\frac{\\partial \\frac{\\partial f}{\\partial x}}{\\partial y}\\\\\n\\frac{\\partial \\frac{\\partial f}{\\partial y}}{\\partial x} &\\quad \\frac{\\partial \\frac{\\partial f}{\\partial y}}{\\partial y}\\\\\n\\end{align*}\n\\right].\n\\]\nAs such, the total derivative is a generalization of what we have previously discussed."
},
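The defining limit above can be probed numerically. A small sketch, with a function and point chosen by us for illustration: take \(F(x,y) = \langle x^2+y,\, xy\rangle\), whose total derivative at \((x,y)\) has rows \(\langle 2x, 1\rangle\) and \(\langle y, x\rangle\), and watch the ratio in the limit shrink with \(\|\vec{h}\|\):

```julia
using LinearAlgebra  # norm

F(v) = [v[1]^2 + v[2], v[1] * v[2]]
J(v) = [2v[1] 1.0;
        v[2]  v[1]]        # hand-computed total derivative (Jacobian)

a = [1.0, 2.0]
# the ratio ||F(a+h) - F(a) - J(a)h|| / ||h|| should go to 0 as h -> 0
ratios = [begin
              h = [1.0, -1.0] * 10.0^(-k)
              norm(F(a + h) - F(a) - J(a) * h) / norm(h)
          end
          for k in 1:6]
```

Here the remainder is exactly \(\langle h_1^2, h_1 h_2\rangle\), so the ratio decreases linearly in \(\|\vec{h}\|\), confirming \(J\) is the total derivative.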
{
"objectID": "differentiable_vector_calculus/vector_fields.html#the-chain-rule",
"href": "differentiable_vector_calculus/vector_fields.html#the-chain-rule",
"title": "57  Functions \\(R^n \\rightarrow R^m\\)",
"section": "57.4 The chain rule",
"text": "57.4 The chain rule\nIf \\(G:R^k \\rightarrow R^n\\) and \\(F:R^n \\rightarrow R^m\\), then the composition \\(F\\circ G\\) takes \\(R^k \\rightarrow R^m\\). If all three functions are totally differentiable, then a chain rule will hold (total derivative of \\(F\\circ G\\) at point \\(a\\)):\n\\[\nd(F\\circ G)_a = dF_{G(a)} \\cdot dG_a.\n\\]\nThis has the same formulation as the chain rule for the univariate case: the derivative of the outer at the inner times the derivative of the inner.\nFirst we check that the dimensions are correct: We have \\(dF_{G(a)}\\) (the total derivative of \\(F\\) at the point \\(G(a)\\)) is an \\(m \\times n\\) matrix and \\(dG_a\\) (the total derivative of \\(G\\) at the point \\(a\\)) is an \\(n \\times k\\) matrix. The product of an \\(m \\times n\\) matrix with an \\(n \\times k\\) matrix is defined, and is an \\(m \\times k\\) matrix, as is \\(d(F \\circ G)_a\\).\nThe proof that the formula is correct uses the definition of totally differentiable written as\n\\[\nF(b + \\vec{h}) - F(b) - dF_b\\cdot \\vec{h} = \\epsilon(\\vec{h}) \\vec{h},\n\\]\nwhere \\(\\epsilon(\\vec{h}) \\rightarrow \\vec{0}\\) as \\(\\vec{h} \\rightarrow \\vec{0}\\).\nWe have, using this for both \\(F\\) and \\(G\\):\n\\[\n\\begin{align*}\nF(G(a + \\vec{h})) - F(G(a)) &=\nF(G(a) + (dG_a \\cdot \\vec{h} + \\epsilon_G \\vec{h})) - F(G(a))\\\\\n&= F(G(a)) + dF_{G(a)} \\cdot (dG_a \\cdot \\vec{h} + \\epsilon_G \\vec{h}) \\\\\n&+ \\quad\\epsilon_F (dG_a \\cdot \\vec{h} + \\epsilon_G \\vec{h}) - F(G(a))\\\\\n&= dF_{G(a)} \\cdot (dG_a \\cdot \\vec{h}) + dF_{G(a)} \\cdot (\\epsilon_G \\vec{h}) + \\epsilon_F (dG_a \\cdot \\vec{h}) + (\\epsilon_F \\cdot \\epsilon_G\\vec{h})\n\\end{align*}\n\\]\nThe last line uses the linearity of \\(dF\\) to isolate \\(dF_{G(a)} \\cdot (dG_a \\cdot \\vec{h})\\). 
Factoring out \\(\\vec{h}\\) and taking norms gives:\n\\[\n\\begin{align*}\n\\frac{\\| F(G(a+\\vec{h})) - F(G(a)) - dF_{G(a)}dG_a \\cdot \\vec{h} \\|}{\\| \\vec{h} \\|} &=\n\\frac{\\| dF_{G(a)}\\cdot(\\epsilon_G\\vec{h}) + \\epsilon_F (dG_a\\cdot \\vec{h}) + (\\epsilon_F\\cdot\\epsilon_G\\vec{h}) \\|}{\\| \\vec{h} \\|} \\\\\n&\\leq \\| dF_{G(a)}\\cdot\\epsilon_G + \\epsilon_F (dG_a) + \\epsilon_F\\cdot\\epsilon_G \\|\\frac{\\|\\vec{h}\\|}{\\| \\vec{h} \\|}\\\\\n&\\rightarrow 0.\n\\end{align*}\n\\]\n\n57.4.1 Examples\nOur main use of the total derivative will be the change of variables in integration.\n\nExample: polar coordinates\nA point \\((a,b)\\) in the plane can be described in polar coordinates by a radius \\(r\\) and polar angle \\(\\theta\\). We can express this formally by \\(F:(a,b) \\rightarrow (r, \\theta)\\) with\n\\[\nr(a,b) = \\sqrt{a^2 + b^2}, \\quad\n\\theta(a,b) = \\tan^{-1}(b/a),\n\\]\nthe latter assuming the point is in quadrant I or IV (though atan(y,x) will properly handle the other quadrants). The Jacobian of this transformation may be found with\n\n@syms a::real b::real\n\nrⱼ = sqrt(a^2 + b^2)\nθⱼ = atan(b/a)\n\nJac = Sym[diff.(rⱼ, [a,b])'; # [∇f_1'; ∇f_2']\n diff.(θⱼ, [a,b])']\n\nsimplify.(Jac)\n\n2×2 Matrix{Sym}:\n a*conjugate(1/sqrt(a^2 + b^2)) b*conjugate(1/sqrt(a^2 + b^2))\n -b/(a^2 + b^2) a/(a^2 + b^2)\n\n\nSymPy array objects have a jacobian method to make this easier to do. The calling style is Python-like, using object.method(...):\n\n[rⱼ, θⱼ].jacobian([a, b])\n\n2×2 Matrix{Sym}:\n a/sqrt(a^2 + b^2) b/sqrt(a^2 + b^2)\n -b/(a^2*(1 + b^2/a^2)) 1/(a*(1 + b^2/a^2))\n\n\nThe determinant, of geometric interest, will be\n\ndet(Jac) |> simplify\n\n \n\\[\n\\overline{\\frac{1}{\\sqrt{a^{2} + b^{2}}}}\n\\]\n\n\n\nThe determinant is of interest, as the linear mapping represented by the Jacobian changes the area of the associated coordinate vectors. 
The determinant describes how this area changes, as a multiplying factor.\n\n\nExample Spherical Coordinates\nIn \\(3\\) dimensions a point can be described by (among other ways):\n\nCartesian coordinates: three coordinates relative to the \\(x\\), \\(y\\), and \\(z\\) axes as \\((a,b,c)\\).\nSpherical coordinates: a radius, \\(r\\), an azimuthal angle \\(\\theta\\), and a polar angle \\(\\phi\\) measured down from the \\(z\\) axis. (We use the mathematics naming convention; the physics one has \\(\\phi\\) and \\(\\theta\\) reversed.)\nCylindrical coordinates: a radius, \\(r\\), an azimuthal angle \\(\\theta\\), and height \\(z\\).\n\nSome mappings are:\n\nCartesian (x,y,z)\nSpherical (\\(r\\), \\(\\theta\\), \\(\\phi\\))\nCylindrical (\\(r\\), \\(\\theta\\), \\(z\\))\n\n(1, 1, 0)\n\\((\\sqrt{2}, \\pi/4, \\pi/2)\\)\n\\((\\sqrt{2},\\pi/4, 0)\\)\n\n(0, 1, 1)\n\\((\\sqrt{2}, \\pi/2, \\pi/4)\\)\n\\((1, \\pi/2, 1)\\)\n\n\nFormulas can be found to convert between the different systems; here are a few written as multivariable functions:\n\nfunction spherical_from_cartesian(x,y,z)\n r = sqrt(x^2 + y^2 + z^2)\n theta = atan(y/x)\n phi = acos(z/r)\n [r, theta, phi]\nend\n\nfunction cartesian_from_spherical(r, theta, phi)\n x = r*sin(phi)*cos(theta)\n y = r*sin(phi)*sin(theta)\n z = r*cos(phi)\n [x, y, z]\nend\n\nfunction cylindrical_from_cartesian(x, y, z)\n r = sqrt(x^2 + y^2)\n theta = atan(y/x)\n z = z\n [r, theta, z]\nend\n\nfunction cartesian_from_cylindrical(r, theta, z)\n x = r*cos(theta)\n y = r*sin(theta)\n z = z\n [x, y, z]\nend\n\nspherical_from_cartesian(v) = spherical_from_cartesian(v...)\ncartesian_from_spherical(v) = cartesian_from_spherical(v...)\ncylindrical_from_cartesian(v)= cylindrical_from_cartesian(v...)\ncartesian_from_cylindrical(v) = cartesian_from_cylindrical(v...)\n\ncartesian_from_cylindrical (generic function with 2 methods)\n\n\nThe Jacobian of a transformation can be found from these conversions. 
For example, the conversion from spherical to cartesian would have Jacobian computed by:\n\n@syms r::real\n\nex1 = cartesian_from_spherical(r, theta, phi)\nJ1 = ex1.jacobian([r, theta, phi])\n\n3×3 Matrix{Sym}:\n sin(φ)⋅cos(θ) -r⋅sin(φ)⋅sin(θ) r⋅cos(φ)⋅cos(θ)\n sin(φ)⋅sin(θ) r⋅sin(φ)⋅cos(θ) r⋅sin(θ)⋅cos(φ)\n cos(φ) 0 -r⋅sin(φ)\n\n\nThis has determinant:\n\ndet(J1) |> simplify\n\n \n\\[\n- r^{2} \\sin{\\left(\\phi \\right)}\n\\]\n\n\n\nThere is no function to convert from spherical to cylindrical above, but clearly one can be made by composition:\n\ncylindrical_from_spherical(r, theta, phi) =\n cylindrical_from_cartesian(cartesian_from_spherical(r, theta, phi)...)\ncylindrical_from_spherical(v) = cylindrical_from_spherical(v...)\n\ncylindrical_from_spherical (generic function with 2 methods)\n\n\nFrom this composition, we could compute the Jacobian directly, as with:\n\nex2 = cylindrical_from_spherical(r, theta, phi)\nJ2 = ex2.jacobian([r, theta, phi])\n\n3×3 Matrix{Sym}:\n (r*sin(phi)^2*sin(theta)^2 + r*sin(phi)^2*cos(theta)^2)/sqrt(r^2*sin(phi)^2*sin(theta)^2 + r^2*sin(phi)^2*cos(theta)^2) … (r^2*sin(phi)*sin(theta)^2*cos(phi) + r^2*sin(phi)*cos(phi)*cos(theta)^2)/sqrt(r^2*sin(phi)^2*sin(theta)^2 + r^2*sin(phi)^2*cos(theta)^2)\n 0 0\n cos(φ) -r⋅sin(φ)\n\n\nNow to see that this last expression could have been found by the chain rule. To do this we need to find the Jacobian of each function; evaluate them at the proper places; and, finally, multiply the matrices. The J1 object, found above, does one Jacobian. 
We now need to find that of cylindrical_from_cartesian:\n\n@syms x::real y::real z::real\nex3 = cylindrical_from_cartesian(x, y, z)\nJ3 = ex3.jacobian([x,y,z])\n\n3×3 Matrix{Sym}:\n x/sqrt(x^2 + y^2) y/sqrt(x^2 + y^2) 0\n -y/(x^2*(1 + y^2/x^2)) 1/(x*(1 + y^2/x^2)) 0\n 0 0 1\n\n\nThe chain rule is not simply J3 * J1 in the notation above, as the J3 matrix must be evaluated at “G(a)”, which is ex1 from above:\n\nJ3_Ga = subs.(J3, x => ex1[1], y => ex1[2], z => ex1[3]) .|> simplify # the dots are important\n\n3×3 Matrix{Sym}:\n r*sin(phi)*cos(theta)/(sqrt(sin(phi)^2)*Abs(r)) … 0\n -sin(theta)/(r*sin(phi)) 0\n 0 1\n\n\nThe chain rule now says this product should be equivalent to J2 above:\n\nJ3_Ga * J1\n\n3×3 Matrix{Sym}:\n r*sin(phi)^2*sin(theta)^2/(sqrt(sin(phi)^2)*Abs(r)) + r*sin(phi)^2*cos(theta)^2/(sqrt(sin(phi)^2)*Abs(r)) … r^2*sin(phi)*sin(theta)^2*cos(phi)/(sqrt(sin(phi)^2)*Abs(r)) + r^2*sin(phi)*cos(phi)*cos(theta)^2/(sqrt(sin(phi)^2)*Abs(r))\n 0 0\n cos(φ) -r⋅sin(φ)\n\n\nThe two are equivalent after simplification, as seen here:\n\nJ3_Ga * J1 - J2 .|> simplify\n\n3×3 Matrix{Sym}:\n 0 0 0\n 0 0 0\n 0 0 0\n\n\n\n\nExample\nThe above examples were done symbolically. Performing the calculation numerically is quite similar. The ForwardDiff package has a gradient function to find the gradient at a point. The CalculusWithJulia package extends this to take a gradient of a function and return a function, also called gradient. This is defined along the lines of:\ngradient(f::Function) = x -> ForwardDiff.gradient(f, x)\n(though more flexibly, as either a vector or separate arguments can be used.)\nWith this, defining a Jacobian function could be done like:\nfunction Jacobian(F, x)\n n = length(F(x...))\n grads = [gradient(x -> F(x...)[i])(x) for i in 1:n]\n vcat(grads'...)\nend\nBut, like SymPy, ForwardDiff provides a jacobian function directly, so we will use that; it requires a function definition where a vector is passed in, and is called by ForwardDiff.jacobian. 
(The ForwardDiff package does not export its methods; they are qualified using the module name.)\nUsing the above functions, we can verify the last example at a point:\n\nrtp = [1, pi/3, pi/4]\nForwardDiff.jacobian(cylindrical_from_spherical, rtp)\n\n3×3 Matrix{Float64}:\n 0.707107 0.0 0.707107\n 0.0 1.0 0.0\n 0.707107 0.0 -0.707107\n\n\nThe chain rule gives the same answer up to roundoff error:\n\nForwardDiff.jacobian(cylindrical_from_cartesian, cartesian_from_spherical(rtp)) * ForwardDiff.jacobian(cartesian_from_spherical, rtp)\n\n3×3 Matrix{Float64}:\n 0.707107 0.0 0.707107\n 5.55112e-17 1.0 5.55112e-17\n 0.707107 0.0 -0.707107\n\n\n\n\nExample: The Inverse Function Theorem\nFor a change of variable problem, \\(F:R^n \\rightarrow R^n\\), the determinant of the Jacobian quantifies how volumes get modified under the transformation. When this determinant is nonzero, then more can be said. The Inverse Function Theorem states\n\nif \\(F\\) is a continuously differentiable function from an open set of \\(R^n\\) into \\(R^n\\), and the total derivative is invertible at a point \\(p\\) (i.e., the Jacobian determinant of \\(F\\) at \\(p\\) is non-zero), then \\(F\\) is invertible near \\(p\\). That is, an inverse function to \\(F\\) is defined on some neighborhood of \\(q\\), where \\(q=F(p)\\). Further, \\(F^{-1}\\) will be continuously differentiable at \\(q\\) with \\(J_{F^{-1}}(q) = [J_F(p)]^{-1}\\), the latter being the matrix inverse. Taking determinants, \\(\\det(J_{F^{-1}}(q)) = 1/\\det(J_F(p))\\).\n\nAssuming \\(F^{-1}\\) exists, we can verify the last part from the chain rule, in an identical manner to the univariate case; starting with \\(F^{-1} \\circ F\\) being the identity, we would have:\n\\[\nJ_{F^{-1}\\circ F}(p) = I,\n\\]\nwhere \\(I\\) is the identity matrix with entry \\(a_{ij} = 1\\) when \\(i=j\\) and \\(0\\) otherwise.\nBut the chain rule then says \\(J_{F^{-1}}(F(p)) J_F(p) = I\\). 
This implies the two matrices are inverses to each other, and using the multiplicative mapping property of the determinant will also imply the determinant relationship.\nThe theorem is an existential theorem, in that it implies \\(F^{-1}\\) exists, but doesn't indicate how to find it. When we have an inverse though, we can verify the properties implied.\nThe transformation examples have inverses indicated. Using one of these we can verify things at a point, as done in the following:\n\np = [1, pi/3, pi/4]\nq = cartesian_from_spherical(p)\n\nA1 = ForwardDiff.jacobian(spherical_from_cartesian, q) # J_F⁻¹(q)\nA2 = ForwardDiff.jacobian(cartesian_from_spherical, p) # J_F(p)\n\nA1 * A2\n\n3×3 Matrix{Float64}:\n 1.0 0.0 0.0\n 5.55112e-17 1.0 5.55112e-17\n 0.0 0.0 1.0\n\n\nUp to roundoff error, this is the identity matrix. As for the relationship between the determinants, up to roundoff error the two are related, as expected:\n\ndet(A1), 1/det(A2)\n\n(-1.4142135623730956, -1.4142135623730951)\n\n\n\n\nExample: Implicit Differentiation, the Implicit Function Theorem\nThe technique of implicit differentiation is a useful one, as it allows derivatives of more complicated expressions to be found. The main idea, expressed here with three variables, is: if an equation may be viewed as \\(F(x,y,z) = c\\), \\(c\\) a constant, then \\(z=\\phi(x,y)\\) may be viewed as a function of \\(x\\) and \\(y\\). Hence, we can use the chain rule to find \\(\\partial z / \\partial x\\) and \\(\\partial z /\\partial y\\). 
Let \\(G(x,y) = \\langle x, y, \\phi(x,y) \\rangle\\) and then differentiate \\((F \\circ G)(x,y) = c\\):\n\\[\n\\begin{align*}\n0 &= dF_{G(x,y)} \\circ dG_{\\langle x, y\\rangle}\\\\\n&= [\\frac{\\partial F}{\\partial x}\\quad \\frac{\\partial F}{\\partial y}\\quad \\frac{\\partial F}{\\partial z}](G(x,y)) \\cdot\n\\left[\\begin{array}{}\n1 & 0\\\\\n0 & 1\\\\\n\\frac{\\partial \\phi}{\\partial x} & \\frac{\\partial \\phi}{\\partial y}\n\\end{array}\\right].\n\\end{align*}\n\\]\nSolving yields\n\\[\n\\frac{\\partial \\phi}{\\partial x} = -\\frac{\\partial F/\\partial x}{\\partial F/\\partial z},\\quad\n\\frac{\\partial \\phi}{\\partial y} = -\\frac{\\partial F/\\partial y}{\\partial F/\\partial z},\n\\]\nwhere the right hand side of each is evaluated at \\(G(x,y)\\).\nWhen can it be reasonably assumed that such a function \\(z= \\phi(x,y)\\) exists?\nThe Implicit Function Theorem provides a statement (slightly abridged here):\n\nLet \\(F:R^{n+m} \\rightarrow R^m\\) be a continuously differentiable function and let \\(R^{n+m}\\) have (compactly defined) coordinates \\(\\langle \\vec{x}, \\vec{y} \\rangle\\). Fix a point \\(\\langle \\vec{a}, \\vec{b} \\rangle\\) with \\(F(\\vec{a}, \\vec{b}) = \\vec{0}\\). Let \\(J_{F, \\vec{y}}(\\vec{a}, \\vec{b})\\) be the Jacobian restricted to just the \\(y\\) variables. (\\(J\\) is \\(m \\times m\\).) If this matrix has non-zero determinant (it is invertible), then there exists an open set \\(U\\) containing \\(\\vec{a}\\) and a unique continuously differentiable function \\(G: U \\subset R^n \\rightarrow R^m\\) such that \\(G(\\vec{a}) = \\vec{b}\\) and \\(F(\\vec{x}, G(\\vec{x})) = 0\\) for \\(\\vec{x}\\) in \\(U\\). 
Moreover, the partial derivatives of \\(G\\) are given by the matrix product:\n\\(\\frac{\\partial G}{\\partial x_j}(\\vec{x}) = - [J_{F, \\vec{y}}(x, G(\\vec{x}))]^{-1} \\left[\\frac{\\partial F}{\\partial x_j}(x, G(\\vec{x}))\\right].\\)\n\n\nSpecializing to our case above, we have \\(f:R^{2+1}\\rightarrow R^1\\) and \\(\\vec{x} = \\langle a, b\\rangle\\) and \\(\\phi:R^2 \\rightarrow R\\). Then\n\\[\n[J_{f, \\vec{y}}(\\vec{x}, \\phi(\\vec{x}))] = [\\frac{\\partial f}{\\partial z}(a, b, \\phi(a,b))],\n\\]\na \\(1\\times 1\\) matrix, identified as a scalar, so inversion is just the reciprocal. So the formula becomes, say for \\(x_1 = x\\):\n\\[\n\\frac{\\partial \\phi}{\\partial x}(a, b) = - \\frac{\\frac{\\partial{f}}{\\partial{x}}(a, b,\\phi(a,b))}{\\frac{\\partial{f}}{\\partial{z}}(a, b, \\phi(a,b))},\n\\]\nas expressed above. Here invertibility is simply a non-zero value, and is needed for the division. In general, we see the inverse (the \\(J^{-1}\\)) is necessary to express the answer.\nUsing this, we can answer questions like the following (as we did before) on a more solid ground:\nLet \\(x^2/a^2 + y^2/b^2 + z^2/c^2 = 1\\) be an equation describing an ellipsoid. Describe the tangent plane at a point on the ellipsoid.\nWe would like to express the tangent plane in terms of \\(\\partial{z}/\\partial{x}\\) and \\(\\partial{z}/\\partial{y}\\), which we can do through:\n\\[\n\\frac{2x}{a^2} + \\frac{2z}{c^2} \\frac{\\partial{z}}{\\partial{x}} = 0, \\quad\n\\frac{2y}{b^2} + \\frac{2z}{c^2} \\frac{\\partial{z}}{\\partial{y}} = 0.\n\\]\nSolving, we get\n\\[\n\\frac{\\partial{z}}{\\partial{x}} = -\\frac{2x}{a^2}\\frac{c^2}{2z},\n\\quad\n\\frac{\\partial{z}}{\\partial{y}} = -\\frac{2y}{b^2}\\frac{c^2}{2z},\n\\]\nprovided \\(z \\neq 0\\). At \\(z=0\\) the tangent plane exists, but we can't describe it in this manner, as it is vertical. 
However, the choice of variables to use is not fixed in the theorem, so if \\(x \\neq 0\\) we can express \\(x = x(y,z)\\) and express the tangent plane in terms of \\(\\partial{x}/\\partial{y}\\) and \\(\\partial{x}/\\partial{z}\\). The answer is similar to the above, and we won't repeat it. Similarly, should \\(x = z = 0\\), then \\(y \\neq 0\\) and we can use an implicit definition \\(y = y(x,z)\\) and express the tangent plane through \\(\\partial{y}/\\partial{x}\\) and \\(\\partial{y}/\\partial{z}\\).\n\n\nExample: Lagrange multipliers in more dimensions\nConsider now the problem of maximizing \\(f:R^n \\rightarrow R\\) subject to \\(p < n\\) constraints \\(g_1(\\vec{x}) = c_1, g_2(\\vec{x}) = c_2, \\dots, g_{p}(\\vec{x}) = c_{p}\\). For \\(p=1\\) and \\(2\\), we saw that if all derivatives exist, then a necessary condition to be at a maximum is that \\(\\nabla{f}\\) can be written as \\(\\lambda_1 \\nabla{g_1}\\) (\\(p=1\\)) or \\(\\lambda_1 \\nabla{g_1} + \\lambda_2 \\nabla{g_2}\\) (\\(p=2\\)). The key observation is that the gradient of \\(f\\) must have no projection on the intersection of the tangent planes found by linearizing \\(g_i\\).\nThe same thing holds in dimension \\(n > 2\\): Let \\(\\vec{x}_0\\) be a point where \\(f(\\vec{x})\\) is maximum subject to the \\(p\\) constraints. We want to show that \\(\\vec{x}_0\\) must satisfy:\n\\[\n\\nabla{f}(\\vec{x}_0) = \\sum \\lambda_i \\nabla{g_i}(\\vec{x}_0).\n\\]\nBy considering \\(-f\\), the same holds for a minimum.\nWe follow the sketch of Sawyer.\nUsing Taylor's theorem, we have \\(f(\\vec{x} + h \\vec{y}) = f(\\vec{x}) + h \\vec{y}\\cdot\\nabla{f} + h^2\\vec{c}\\), for some \\(\\vec{c}\\). If \\(h\\) is small enough, this term can be ignored.\nThe tangent “plane” for each constraint, \\(g_i(\\vec{x}) = c_i\\), is orthogonal to the gradient vector \\(\\nabla{g_i}(\\vec{x})\\). That is, \\(\\nabla{g_i}(\\vec{x})\\) is orthogonal to the level-surface formed by the constraint \\(g_i(\\vec{x}) = c_i\\). 
Let \\(A\\) be the set of all linear combinations of the \\(\\nabla{g_i}\\) that are possible: \\(\\lambda_1 \\nabla{g_1}(\\vec{x}) + \\lambda_2 \\nabla{g_2}(\\vec{x}) + \\cdots + \\lambda_p \\nabla{g_p}(\\vec{x})\\), as in the statement. Through projection, we can write \\(\\nabla{f}(\\vec{x}_0) = \\vec{a} + \\vec{b}\\), where \\(\\vec{a}\\) is in \\(A\\) and \\(\\vec{b}\\) is orthogonal to \\(A\\).\nLet \\(\\vec{r}(t)\\) be a parameterization of a path through the intersection of the \\(p\\) tangent planes that goes through \\(\\vec{x}_0\\) at \\(t_0\\) with \\(\\vec{b}\\) parallel to \\(\\vec{r}'(t_0)\\). (The implicit function theorem would guarantee this path.)\nIf we consider \\(f(\\vec{x}_0 + h \\vec{b})\\) for small \\(h\\), then unless \\(\\vec{b} \\cdot \\nabla{f} = 0\\), the function would increase in the direction of \\(\\vec{b}\\) due to the \\(h \\vec{b}\\cdot\\nabla{f}\\) term in the approximating Taylor series. That is, \\(\\vec{x}_0\\) would not be a maximum on the constraint. So at \\(\\vec{x}_0\\) this directional derivative is \\(0\\).\nThen the directional derivative in the direction of \\(\\vec{b}\\) satisfies\n\\[\n0 = \\vec{b} \\cdot \\nabla{f}(\\vec{x}_0) = \\vec{b} \\cdot (\\vec{a} + \\vec{b}) = \\vec{b}\\cdot \\vec{a} + \\vec{b}\\cdot\\vec{b} = \\vec{b}\\cdot\\vec{b},\n\\]\nso \\(\\| \\vec{b} \\| = 0\\) and \\(\\nabla{f}(\\vec{x}_0)\\) must lie in the plane \\(A\\).\n\nHow does the implicit function theorem guarantee a parameterization of a curve along the constraint in the direction of \\(\\vec{b}\\)?\nA formal proof requires a bit of linear algebra, but here we go. Let \\(G(\\vec{x}) = \\langle g_1(\\vec{x}), g_2(\\vec{x}), \\dots, g_p(\\vec{x}) \\rangle\\). Then \\(G(\\vec{x}) = \\vec{c}\\) encodes the constraint. The tangent planes are orthogonal to each \\(\\nabla{g_i}\\), so using matrix notation, the intersection of the tangent planes is any vector \\(\\vec{h}\\) satisfying \\(J_G(\\vec{x}_0) \\vec{h} = 0\\). 
Let \\(k = n - 1 - p\\). If \\(k > 0\\), there will be \\(k\\) vectors orthogonal to each of the \\(\\nabla{g_i}\\) and to \\(\\vec{b}\\). Call these \\(\\vec{v}_j\\). Then define additional constraints \\(h_j(\\vec{x}) = \\vec{v}_j \\cdot \\vec{x} = 0\\). Let \\(H(x_1, x_2, \\dots, x_n) = \\langle g_1, g_2, \\dots, g_p, h_1, \\dots, h_{n-1-p}\\rangle\\), so \\(H:R^{1 + (n-1)} \\rightarrow R^{n-1}\\). Write \\(H(x_1, \\dots, x_n) = H(x, \\vec{y})\\). Then \\(H\\) restricted to the \\(\\vec{y}\\) variables is a function from \\(R^{n-1}\\rightarrow R^{n-1}\\). If this restricted function has a Jacobian with non-zero determinant, then there exists a \\(\\vec\\phi(x): R \\rightarrow R^{n-1}\\) with \\(H(x, \\vec\\phi(x)) = \\vec{c}\\). Let \\(\\vec{r}(t) = \\langle t, \\phi_1(t), \\dots, \\phi_{n-1}(t)\\rangle\\). Then \\((H\\circ\\vec{r})(t) = \\vec{c}\\), so by the chain rule \\(dH(\\vec{r}) d\\vec{r} = \\vec{0}\\). But \\(dH = [\\nabla{g_1}'; \\nabla{g_2}' \\dots;\\nabla{g_p}'; v_1';\\dots;v_{n-1-p}']\\) (a matrix of row vectors). The condition \\(dH(\\vec{r}) d\\vec{r} = \\vec{0}\\) is equivalent to saying \\(d\\vec{r}\\) is orthogonal to the row vectors in \\(dH\\). A basis for \\(R^n\\) is formed by these vectors and \\(\\vec{b}\\), so \\(d\\vec{r}\\) and \\(\\vec{b}\\) must be parallel.\n\n\nExample\nWe apply this to two problems, also from Sawyer. First, let \\(n > 1\\) and \\(f(x_1, \\dots, x_n) = \\sum x_i^2\\). Minimize this subject to the constraint \\(\\sum x_i = 1\\). This one constraint means an answer must satisfy \\(\\nabla{L} = \\vec{0}\\) where\n\\[\nL(x_1, \\dots, x_n, \\lambda) = \\sum x_i^2 + \\lambda (\\sum x_i - 1).\n\\]\nTaking \\(\\partial/\\partial{x_i}\\) we have \\(2x_i + \\lambda = 0\\), so \\(x_i = -\\lambda/2\\), a constant. From the constraint, we see \\(x_i = 1/n\\). This does not correspond to a maximum, but a minimum. 
A maximum would be at a point on the constraint such as \\(\\langle 1, 0, \\dots, 0\\rangle\\), which gives a value of \\(1\\) for \\(f\\), not \\(n \\times 1/n^2 = 1/n\\).\n\n\nExample\nIn statistics, there are different ways to define the best estimate for a population parameter based on the data. That is, suppose \\(X_1, X_2, \\dots, X_n\\) are random variables. The population parameters of interest here are the mean \\(E(X_i) = \\mu\\) and the variance \\(Var(X_i) = \\sigma_i^2\\). (The mean is assumed to be the same for all, but the variance need not be.) What should someone use to estimate \\(\\mu\\) using just the sample values \\(X_1, X_2, \\dots, X_n\\)? The average, \\((X_1 + \\cdots + X_n)/n\\), is a well known estimate, but is it the “best” in some sense for this set up? Here some variables are more variable; should they count the same, more, or less in the weighting for the estimate?\nIn Sawyer, we see an example of applying the Lagrange multiplier method to find the best linear unbiased estimator (BLUE). The BLUE is a choice of coefficients \\(a_i\\) such that \\(Var(\\sum a_i X_i)\\) is smallest subject to the constraint \\(E(\\sum a_i X_i) = \\mu\\).\nThe BLUE minimizes the variance of the estimator. (This is the Best part of BLUE.) The estimator, \\(\\sum a_i X_i\\), is Linear. The constraint is that the estimator has theoretical mean given by \\(\\mu\\). (This is the Unbiased part of BLUE.)\nGoing from statistics to mathematics, we use formulas for independent random variables to restate this problem mathematically as:\n\\[\n\\text{Minimize } \\sum a_i^2 \\sigma_i^2 \\text{ subject to } \\sum a_i = 1.\n\\]\nThis problem is similar now to the last one, save the sum to minimize includes the sigmas. Set \\(L = \\sum a_i^2 \\sigma_i^2 + \\lambda(\\sum a_i - 1)\\).\nTaking \\(\\partial/\\partial{a_i}\\) gives equations \\(2a_i\\sigma_i^2 + \\lambda = 0\\), so \\(a_i = -\\lambda/(2\\sigma_i^2) = c/\\sigma_i^2\\). 
The constraint implies \\(c = 1/\\sum(1/\\sigma_i)^2\\). So variables with more variance get smaller weights.\nFor the special case of a common variance, \\(\\sigma_i=\\sigma\\), the above simplifies to \\(a_i = 1/n\\) and the estimator is \\(\\sum X_i/n\\), the familiar sample mean, \\(\\bar{X}\\)."
},
{
"objectID": "differentiable_vector_calculus/vector_fields.html#questions",
"href": "differentiable_vector_calculus/vector_fields.html#questions",
"title": "57  Functions \\(R^n \\rightarrow R^m\\)",
"section": "57.5 Questions",
"text": "57.5 Questions\n\nQuestion\nThe following plots a surface defined by a (hidden) function \\(F: R^2 \\rightarrow R^3\\):\n\n\n𝑭 (generic function with 1 method)\n\n\n\nus, vs = range(0, 1, length=25), range(0, 2pi, length=25)\nxs, ys, zs = unzip(𝑭.(us, vs'))\nsurface(xs, ys, zs)\n\n\n\n\nIs this the surface generated by \\(F(u,v) = \\langle u\\cos(v), u\\sin(v), 2v\\rangle\\)? This functions surface is termed a helicoid.\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe following plots a surface defined by a (hidden) function \\(F: R^2 \\rightarrow R^3\\) of the form \\(F(u,v) = \\langle r(u)\\cos(v), r(u)\\sin(v), u\\rangle\\)\n\n\n (generic function with 1 method)\n\n\n\nus, vs = range(-1, 1, length=25), range(0, 2pi, length=25)\nxs, ys, zs = unzip(.(us, vs'))\nsurface(xs, ys, zs)\n\n\n\n\nIs this the surface generated by \\(r(u) = 1+u^2\\)? This form of a function is for a surface of revolution about the \\(z\\) axis.\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe transformation \\(F(x, y) = \\langle 2x + 3y + 1, 4x + y + 2\\rangle\\) is an example of an affine transformation. 
Is this the Jacobian of \\(F\\)\n\\[\nJ = \\left[\n\\begin{array}{}\n2 & 4\\\\\n3 & 1\n\\end{array}\n\\right].\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No, it is the transpose\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nDoes the transformation \\(F(u,v) = \\langle u^2 - v^2, u^2 + v^2 \\rangle\\) have Jacobian\n\\[\nJ = \\left[\n\\begin{array}{}\n2u & -2v\\\\\n2u & 2v\n\\end{array}\n\\right]?\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No, it is the transpose\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFix constants \\(\\lambda_0\\) and \\(\\phi_0\\) and define a transformation\n\\[\nF(\\lambda, \\phi) = \\langle \\cos(\\phi)\\sin(\\lambda - \\lambda_0),\n\\cos(\\phi_0)\\sin(\\phi) - \\sin(\\phi_0)\\cos(\\phi)\\cos(\\lambda - \\lambda_0) \\rangle\n\\]\nWhat does the following SymPy code compute?\n\n@syms lambda lambda_0 phi phi_0\nF(lambda,phi) = [cos(phi)*sin(lambda-lambda_0), cos(phi_0)*sin(phi) - sin(phi_0)*cos(phi)*cos(lambda-lambda_0)]\n\nout = [diff.(F(lambda, phi), lambda) diff.(F(lambda, phi), phi)]\ndet(out) |> simplify\n\n \n\\[\n\\left(\\sin{\\left(\\phi \\right)} \\sin{\\left(\\phi_{0} \\right)} + \\cos{\\left(\\phi \\right)} \\cos{\\left(\\phi_{0} \\right)} \\cos{\\left(\\lambda - \\lambda_{0} \\right)}\\right) \\cos{\\left(\\phi \\right)}\n\\]\n\n\n\n\n\n\n \n \n \n \n \n \n \n \n \n The determinant of the Jacobian.\n \n \n\n\n \n \n \n \n The determinant of the Hessian.\n \n \n\n\n \n \n \n \n The determinant of the gradient.\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat would be a more direct method:\n\n\n\n \n \n \n \n \n \n \n \n \n det(F(lambda, phi).jacobian([lambda, phi]))\n \n \n\n\n \n \n \n \n det(hessian(F(lambda, phi), [lambda, phi]))\n \n \n\n\n \n \n \n \n det(gradient(F(lambda, phi), [lambda, phi]))\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(z\\sin(z) = x^3y^2 + z\\). 
Compute \\(\\partial{z}/\\partial{x}\\) implicitly.\n\n\n\n \n \n \n \n \n \n \n \n \n \\(3x^2y^2/(z\\cos(z) + \\sin(z) + 1)\\)\n \n \n\n\n \n \n \n \n \\(2x^3y/ (z\\cos(z) + \\sin(z) + 1)\\)\n \n \n\n\n \n \n \n \n \\(3x^2y^2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(x^4 + y^4 + z^4 + x^2y^2z^2 = 1\\). Compute \\(\\partial{z}/\\partial{y}\\) implicitly.\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\frac{y \\left(- x^{2} z^{2}{\\left (x,y \\right )} + 2 y^{2}\\right)}{\\left(x^{2} y^{2} - 2 z^{2}{\\left (x,y \\right )}\\right) z{\\left (x,y \\right )}}\\)\n \n \n\n\n \n \n \n \n \\(\\frac{x \\left(2 x^{2} - z^{2}{\\left (x,y \\right )}\\right)}{\\left(x^{2} - 2 z^{2}{\\left (x,y \\right )}\\right) z{\\left (x,y \\right )}}\\)\n \n \n\n\n \n \n \n \n \\(\\frac{x \\left(2 x^{2} - y^{2} z^{2}{\\left (x,y \\right )}\\right)}{\\left(x^{2} y^{2} - 2 z^{2}{\\left (x,y \\right )}\\right) z{\\left (x,y \\right )}}\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the vector field \\(R:R^2 \\rightarrow R^2\\) defined by \\(R(x,y) = \\langle x, y\\rangle\\) and the vector field \\(S:R^2\\rightarrow R^2\\) defined by \\(S(x,y) = \\langle -y, x\\rangle\\). Let \\(r = \\|R\\| = \\sqrt{x^2 + y^2}\\). \\(R\\) is a radial field, \\(S\\) a spin field.\nWhat is \\(\\nabla{r}\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(S/r\\)\n \n \n\n\n \n \n \n \n \\(R/r\\)\n \n \n\n\n \n \n \n \n \\(R\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nLet \\(\\phi = r^k\\). What is \\(\\nabla{\\phi}\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(k r^{k-2} S\\)\n \n \n\n\n \n \n \n \n \\(k r^{k-2} R\\)\n \n \n\n\n \n \n \n \n \\(kr^k R\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nBased on your last answer, are all radial fields \\(R/r^n\\), \\(n\\geq 0\\) gradients of scalar functions?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nLet \\(\\phi = \\tan^{-1}(y/x)\\). 
What is \\(\\nabla{\\phi}\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(S/r^2\\)\n \n \n\n\n \n \n \n \n \\(S\\)\n \n \n\n\n \n \n \n \n \\(S/r\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nExpress \\(S/r^n = \\langle F_x, F_y\\rangle\\). For which \\(n\\) is \\(\\partial{F_y}/\\partial{x} - \\partial{F_x}/\\partial{y} = 0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n All \\(n \\geq 0\\)\n \n \n\n\n \n \n \n \n As the left-hand side becomes \\((-n+2)r^{-n}\\), only \\(n=2\\).\n \n \n\n\n \n \n \n \n No values of \\(n\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n(The latter is of interest, as only when the expression is \\(0\\) will the vector field be the gradient of a scalar function.)"
},
{
"objectID": "differentiable_vector_calculus/plots_plotting.html",
"href": "differentiable_vector_calculus/plots_plotting.html",
"title": "58  2D and 3D plots in Julia with Plots",
"section": "",
"text": "This section uses these add-on packages:\nThis covers plotting the typical 2D and 3D plots in Julia with the Plots package.\nWe will make use of some helper functions that will simplify plotting provided by the CalculusWithJulia package. As well, we will need to manipulate contours directly, so pull in the Contours package, using import to avoid name collisions and explicitly listing the methods we will use."
},
{
"objectID": "differentiable_vector_calculus/plots_plotting.html#parametrically-described-curves-in-space",
"href": "differentiable_vector_calculus/plots_plotting.html#parametrically-described-curves-in-space",
"title": "58  2D and 3D plots in Julia with Plots",
"section": "58.1 Parametrically described curves in space",
"text": "58.1 Parametrically described curves in space\nLet \\(r(t)\\) be a vector-valued function with values in \\(R^d\\), \\(d\\) being \\(2\\) or \\(3\\). A familiar example is the equation for a line that travels in the direction of \\(\\vec{v}\\) and goes through the point \\(P\\): \\(r(t) = P + t \\cdot \\vec{v}\\). A parametric plot over \\([a,b]\\) is the collection of all points \\(r(t)\\) for \\(a \\leq t \\leq b\\).\nIn Plots, parameterized curves can be plotted through two interfaces, here illustrated for \\(d=2\\): plot(f1, f2, a, b) or plot(xs, ys). The former is convenient for some cases, but typically we will have a function r(t) which is vector-valued, as opposed to a vector of functions. As such, we only discuss the latter.\nAn example helps illustrate. Suppose \\(r(t) = \\langle \\sin(t), 2\\cos(t) \\rangle\\) and the goal is to plot the full ellipse by plotting over \\(0 \\leq t \\leq 2\\pi\\). As with plotting of curves, the goal would be to take many points between a and b and from there generate the \\(x\\) values and \\(y\\) values.\nLet's see this with 5 points, the first and last being identical, as the curve is closed:\n\nr₂(t) = [sin(t), 2cos(t)]\nts = range(0, stop=2pi, length=5)\n\n0.0:1.5707963267948966:6.283185307179586\n\n\nThen we can create the \\(5\\) points easily through broadcasting:\n\nvs = r₂.(ts)\n\n5-element Vector{Vector{Float64}}:\n [0.0, 2.0]\n [1.0, 1.2246467991473532e-16]\n [1.2246467991473532e-16, -2.0]\n [-1.0, -3.6739403974420594e-16]\n [-2.4492935982947064e-16, 2.0]\n\n\nThis returns a vector of points (stored as vectors). The plotting function wants two collections: the set of \\(x\\) values for the points and the set of \\(y\\) values. The data needs to be generated differently or reshaped. 
The function unzip above takes data in this style and returns the desired format, returning a tuple with the \\(x\\) values and \\(y\\) values pulled out:\n\nunzip(vs)\n\n([0.0, 1.0, 1.2246467991473532e-16, -1.0, -2.4492935982947064e-16], [2.0, 1.2246467991473532e-16, -2.0, -3.6739403974420594e-16, 2.0])\n\n\nTo plot this, we “splat” the tuple so that plot gets the arguments separately:\n\nplot(unzip(vs)...)\n\n\n\n\nThis basic plot is lacking, of course, as there are not enough points. Using more initially is a remedy.\n\nts = range(0, 2pi, length=100)\nplot(unzip(r₂.(ts))...)\n\n\n\n\nAs a convenience, CalculusWithJulia provides plot_parametric to produce this plot. The interval is specified with the a..b notation of IntervalSets (which is available when the CalculusWithJulia package is loaded), the points to plot are adaptively chosen:\n\nplot_parametric(0..2pi, r₂) # interval first\n\n\n\n\n\n58.1.1 Plotting a space curve in 3 dimensions\nA parametrically described curve in 3D is similarly created. For example, a helix is described mathematically by \\(r(t) = \\langle \\sin(t), \\cos(t), t \\rangle\\). Here we graph two turns:\n\nr₃(t) = [sin(t), cos(t), t]\nplot_parametric(0..4pi, r₃)\n\n\n\n\n\n\n58.1.2 Adding a vector\nThe tangent vector indicates the instantaneous direction one would travel were they walking along the space curve. We can add a tangent vector to the graph. The quiver! function would be used to add a 2D vector, but Plots does not currently have a 3D analog. In addition, quiver! has a somewhat cumbersome calling pattern when adding just one vector. The CalculusWithJulia package defines an arrow! function that uses quiver for 2D arrows and a simple line for 3D arrows. As a vector incorporates magnitude and direction, but not a position, arrow! 
needs both a point for the position and a vector.\nHere is how we can visualize the tangent vector at a few points on the helix:\nplot_parametric(0..4pi, r₃, legend=false)\nts = range(0, 4pi, length=5)\nfor t in ts\n arrow!(r₃(t), r₃'(t))\nend\n\n\nInfoAdding many arrows this way would be inefficient.\n\n\n\n\n\n\n58.1.3 Setting a viewing angle for 3D plots\nFor 3D plots, the viewing angle can make the difference in visualizing the key features. In Plots, some backends allow the viewing angle to be set with the mouse by clicking and dragging. Not all do. For such, the camera argument is used, as in camera(azimuthal, elevation) where the angles are given in degrees. If the \\(x\\)-\\(y\\)-\\(z\\) coordinates are given, then the elevation, or inclination, is the angle between the \\(z\\) axis and the \\(x-y\\) plane (so 90 is a top view) and azimuthal is the angle in the \\(x-y\\) plane from the \\(x\\) axis."
},
{
"objectID": "differentiable_vector_calculus/plots_plotting.html#visualizing-functions-from-r2-rightarrow-r",
"href": "differentiable_vector_calculus/plots_plotting.html#visualizing-functions-from-r2-rightarrow-r",
"title": "58  2D and 3D plots in Julia with Plots",
"section": "58.2 Visualizing functions from \\(R^2 \\rightarrow R\\)",
"text": "58.2 Visualizing functions from \\(R^2 \\rightarrow R\\)\nIf \\(f: R^2 \\rightarrow R\\), then a graph of \\((x,y,f(x,y))\\) can be represented in 3D. It will form a surface. Such graphs can be most simply made by specifying a set of \\(x\\) values, a set of \\(y\\) values and a function \\(f\\), as with:\n\nxs = range(-2, stop=2, length=100)\nys = range(-pi, stop=pi, length=100)\nf(x,y) = x*sin(y)\nsurface(xs, ys, f)\n\n\n\n\nRather than pass in a function, values can be passed in. Here they are generated with a list comprehension. The y values are innermost to match the graphic when passing in a function object:\n\nzs = [f(x,y) for y in ys, x in xs]\nsurface(xs, ys, zs)\n\n\n\n\nRemembering whether the ys or xs go first in the above can be hard. Alternatively, broadcasting can be used. The command f.(xs,ys) would return a vector, as the xs and ys match in shape; they are both column vectors. But the transpose of xs looks like a row vector and ys looks like a column vector, so broadcasting will create a matrix of values, as desired here:\n\nsurface(xs, ys, f.(xs', ys))\n\n\n\n\nThis graph shows the tessellation algorithm. Here the grid in the \\(x\\)-\\(y\\) plane is just one cell:\n\nxs = ys = range(-1, 1, length=2)\nf(x,y) = x*y\nsurface(xs, ys, f)\n\n\n\n\nA more accurate graph can be seen here:\n\nxs = ys = range(-1, 1, length=100)\nf(x,y) = x*y\nsurface(xs, ys, f)\n\n\n\n\n\n58.2.1 Contour plots\nThe contour plot of \\(f:R^2 \\rightarrow R\\) draws level curves, \\(f(x,y)=c\\), for different values of \\(c\\) in the \\(x-y\\) plane. They are produced in a similar manner as the surface plots:\n\nxs = ys = range(-2,2, length=100)\nf(x,y) = x*y\ncontour(xs, ys, f)\n\n\n\n\nThe cross in the middle corresponds to \\(c=0\\), as when \\(x=0\\) or \\(y=0\\) then \\(f(x,y)=0\\).\nSimilarly, computed values for \\(f(x,y)\\) can be passed in. 
Here we change the function:\n\nf(x,y) = 2 - (x^2 + y^2)\nxs = ys = range(-2,2, length=100)\n\nzs = [f(x,y) for y in ys, x in xs]\n\ncontour(xs, ys, zs)\n\n\n\n\nThe chosen levels can be specified by the user through the levels argument, as in:\n\nf(x,y) = 2 - (x^2 + y^2)\nxs = ys = range(-2,2, length=100)\n\nzs = [f(x,y) for y in ys, x in xs]\n\ncontour(xs, ys, zs, levels = [-1.0, 0.0, 1.0])\n\n\n\n\nIf only a single level is desired, a scalar value can be specified, though not with all backends for Plots. For example, this next graphic shows the \\(0\\)-level of the devil's curve.\n\na, b = -1, 2\nf(x,y) = y^4 - x^4 + a*y^2 + b*x^2\nxs = ys = range(-5, stop=5, length=100)\ncontour(xs, ys, f, levels=[0.0])\n\n\n\n\nContour plots are well known from the presence of contour lines on many maps. Contour lines indicate constant elevations. A peak is characterized by a series of nested closed paths. The following graph shows this for the peak at \\((x,y)=(0,0)\\).\n\nxs = ys = range(-pi/2, stop=pi/2, length=100)\nf(x,y) = sinc(sqrt(x^2 + y^2)) # sinc(x) is sin(x)/x\ncontour(xs, ys, f)\n\n\n\n\nContour plots can be filled with colors through the contourf function:\n\nxs = ys = range(-pi/2, stop=pi/2, length=100)\nf(x,y) = sinc(sqrt(x^2 + y^2))\n\ncontourf(xs, ys, f)\n\n\n\n\n\n\n58.2.2 Combining surface plots and contour plots\nIn PyPlot it is possible to add contour lines to the surface, or project them onto an axis. 
To replicate something similar, though not as satisfying, in Plots we use the Contour package.\n\nf(x,y) = 2 + x^2 + y^2\nxs = ys = range(-2, stop=2, length=100)\nzs = [f(x,y) for y in ys, x in xs]\n\np = surface(xs, ys, zs, legend=false, fillalpha=0.5)\n\n## we add to the graphic p, then plot\nfor cl in levels(contours(xs, ys, zs))\n lvl = level(cl) # the z-value of this contour level\n for line in lines(cl)\n _xs, _ys = coordinates(line) # coordinates of this line segment\n _zs = 0 * _xs\n plot!(p, _xs, _ys, lvl .+ _zs, alpha=0.5) # add on surface\n plot!(p, _xs, _ys, _zs, alpha=0.5) # add on x-y plane\n end\nend\np\n\n\n\n\nThere is no hidden-line calculation; in its place we give the contour lines a transparency through the argument alpha=0.5.\n\n\n58.2.3 Gradient and surface plots\nThe surface plot of \\(f: R^2 \\rightarrow R\\) plots \\((x, y, f(x,y))\\) as a surface. The gradient of \\(f\\) is \\(\\langle \\partial f/\\partial x, \\partial f/\\partial y\\rangle\\). It is a two-dimensional object indicating, at a point \\((x,y)\\), the direction in which the surface has the greatest ascent. Illustrating the gradient and the surface on the same plot requires embedding the 2D gradient into the 3D surface. This can be done by adding a constant \\(z\\) value to the gradient, such as \\(0\\).\n\nf(x,y) = 2 - (x^2 + y^2)\nxs = ys = range(-2, stop=2, length=100)\nzs = [f(x,y) for y in ys, x in xs]\n\nsurface(xs, ys, zs, camera=(40, 25), legend=false)\np = [-1, 1] # in the region graphed, [-2,2] × [-2, 2]\n\nf(x) = f(x...)\nv = ForwardDiff.gradient(f, p)\n\n\n# add 0 to p and v (two styles)\npush!(p, -15)\nscatter!(unzip([p])..., markersize=3)\n\nv = vcat(v, 0)\narrow!(p, v)\n\n\n\n\n\n\n58.2.4 The tangent plane\nLet \\(z = f(x,y)\\) describe a surface, and \\(F(x,y,z) = f(x,y) - z\\). 
The gradient of \\(F\\) at a point \\(p\\) on the surface, \\(\\nabla F(p)\\), will be normal to the surface, and \\(f(p) + \\nabla f(p) \\cdot (x-p)\\) describes the tangent plane. We can visualize each, as follows:\n\nf(x,y) = 2 - x^2 - y^2\nf(v) = f(v...)\nF(x,y,z) = z - f(x,y)\nF(v) = F(v...)\np = [1/10, -1/10]\nglobal p1 = vcat(p, f(p...)) # note F(p1) == 0\nglobal n⃗ = ForwardDiff.gradient(F, p1)\nglobal tl(x) = f(p) + ForwardDiff.gradient(f, p) ⋅ (x - p)\ntl(x,y) = tl([x,y])\n\nxs = ys = range(-2, stop=2, length=100)\nsurface(xs, ys, f)\nsurface!(xs, ys, tl)\narrow!(p1, 5n⃗)\n\n\n\n\nFrom some viewing angles, the normal does not look perpendicular to the tangent plane. This is a quick verification for a randomly chosen point in the \\(x-y\\) plane:\n\na, b = randn(2)\ndot(n⃗, (p1 - [a,b, tl(a,b)]))\n\n-2.220446049250313e-16\n\n\n\n\n58.2.5 Parameterized surface plots\nAs illustrated, we can plot surfaces of the form \\((x,y,f(x,y))\\). However, not all surfaces are so readily described. For example, if \\(F(x,y,z)\\) is a function from \\(R^3 \\rightarrow R\\), then \\(F(x,y,z)=c\\) is a surface of interest. For example, the sphere of radius one is a solution to \\(F(x,y,z)=1\\) where \\(F(x,y,z) = x^2 + y^2 + z^2\\).\nPlotting such generally described surfaces is not so easy, but parameterized surfaces can be represented. 
For example, the sphere as a surface is not represented as a surface of a function, but can be represented in spherical coordinates as parameterized by two angles, essentially an “azimuth” and an “elevation”, as used with the camera argument.\nHere we define functions that represent \\((x,y,z)\\) coordinates in terms of the corresponding spherical coordinates \\((r, \\theta, \\phi)\\).\n\n# spherical: (radius r, inclination θ, azimuth φ)\nX(r,theta,phi) = r * sin(theta) * sin(phi)\nY(r,theta,phi) = r * sin(theta) * cos(phi)\nZ(r,theta,phi) = r * cos(theta)\n\nZ (generic function with 1 method)\n\n\nWe can parameterize the sphere by plotting values for \\(x\\), \\(y\\), and \\(z\\) produced by a sequence of values for \\(\\theta\\) and \\(\\phi\\), holding \\(r=1\\):\n\nthetas = range(0, stop=pi, length=50)\nphis = range(0, stop=pi/2, length=50)\n\nxs = [X(1, theta, phi) for theta in thetas, phi in phis]\nys = [Y(1, theta, phi) for theta in thetas, phi in phis]\nzs = [Z(1, theta, phi) for theta in thetas, phi in phis]\n\nsurface(xs, ys, zs)\n\n\n\n\n\n\nInfoThe above may not work with all backends for Plots, even among those that support 3D graphics.\n\n\n\n\nFor convenience, the plot_parametric function from CalculusWithJulia can produce these plots using interval notation, a..b, and a function:\n\nF(theta, phi) = [X(1, theta, phi), Y(1, theta, phi), Z(1, theta, phi)]\nplot_parametric(0..pi, 0..pi/2, F)\n\n\n\n\n\n\n58.2.6 Plotting F(x,y, z) = c\nThere is no built-in functionality in Plots to create surfaces described by \\(F(x,y,z) = c\\). An example of how to provide some such functionality for PyPlot appears here. The non-exported plot_implicit_surface function can be used to approximate this.\nTo use it, we see what happens when a sphere is rendered:\n\nf(x,y,z) = x^2 + y^2 + z^2 - 25\nCalculusWithJulia.plot_implicit_surface(f)\n\n\n\n\nThis figure comes from a February 14, 2019 article in the New York Times. 
It shows an equation for a “heart,” as the graphic will illustrate:\n\na,b = 1,3\nf(x,y,z) = (x^2+((1+b)*y)^2+z^2-1)^3-x^2*z^3-a*y^2*z^3\nCalculusWithJulia.plot_implicit_surface(f, xlim=-2..2, ylim=-1..1, zlim=-1..2)"
},
{
"objectID": "integral_vector_calculus/double_triple_integrals.html",
"href": "integral_vector_calculus/double_triple_integrals.html",
"title": "59  Multi-dimensional integrals",
"section": "",
"text": "This section uses these add-on packages:\nThe definition of the definite integral, \\(\\int_a^b f(x)dx\\), is based on Riemann sums.\nWe review, using a more general form than previously. Consider a bounded function \\(f\\) over \\([a,b]\\). A partition, \\(P\\), is based on \\(a = x_0 < x_1 < \\cdots < x_n = b\\). For each subinterval \\([x_{i-1}, x_{i}]\\) take \\(m_i(f) = \\inf_{u \\text{ in } [x_{i-1},x_i]} f(u)\\) and \\(M_i(f) = \\sup_{u \\text{ in } [x_{i-1},x_i]} f(u)\\). (When \\(f\\) is continuous, \\(m_i\\) and \\(M_i\\) are realized at points of \\([x_{i-1},x_i]\\), though that isn't assumed here. The use of “\\(\\sup\\)” and “\\(\\inf\\)” is a mathematically formal means to replace this in general.) Let \\(\\Delta x_i = x_i - x_{i-1}\\). Form the sums \\(m(f, P) = \\sum_i m_i(f) \\Delta x_i\\) and \\(M(f, P) = \\sum_i M_i(f) \\Delta x_i\\). These are the lower and upper Riemann sums for a partition. A general Riemann sum would be formed by selecting \\(c_i\\) from \\([x_{i-1}, x_i]\\) and forming \\(S(f,P) = \\sum f(c_i) \\Delta x_i\\). It will be the case that \\(m(f,P) \\leq S(f,P) \\leq M(f,P)\\), as this is true for each sub-interval of the partition.\nIf, as the largest diameter (\\(\\Delta x_i\\)) of the partition \\(P\\) goes to \\(0\\), the upper and lower sums converge to the same limit, then \\(f\\) is called Riemann integrable over \\([a,b]\\). 
If \\(f\\) is Riemann integrable, any Riemann sum will converge to the definite integral as the partitioning shrinks.\nContinuous functions are known to be Riemann integrable, as are functions with only finitely many discontinuities, though this isn't the most general case of integrable functions, which will be stated below.\nIn practice, we don't typically compute integrals using a limit of a partition, though the approach may provide direction to numeric answers, as the Fundamental Theorem of Calculus relates the definite integral with an antiderivative of the integrand.\nThe multidimensional case will prove to be similar, where a Riemann sum is used to define the value being discussed, but a theorem of Fubini will allow the computation of integrals using the Fundamental Theorem of Calculus."
},
{
"objectID": "integral_vector_calculus/double_triple_integrals.html#integration-theory",
"href": "integral_vector_calculus/double_triple_integrals.html#integration-theory",
"title": "59  Multi-dimensional integrals",
"section": "59.1 Integration theory",
"text": "59.1 Integration theory\n\n\n\nHow to estimate the volume contained within the Chrysler Building? One way might be to break the building up into tall vertical blocks based on its skyline; compute the volume of each block using the formula of volume as area of the base times the height; and, finally, adding up the computed volumes This is the basic idea of finding volumes under surfaces using Riemann integration.\n\n\n\n\n\nComputing the volume of a nano-block construction of the Chrysler building is easier than trying to find an actual tree at the Chrysler building, as we can easily compute the volume of columns of equal-sized blocks. Riemann sums are similar.\n\n\nThe definition of the multi-dimensional integral is more involved then the one-dimensional case due to the possibly increased complexity of the region. This will require additional steps. The basic approach is as follows.\nFirst, let \\(R = [a_1, b_1] \\times [a_2, b_2] \\times \\cdots \\times [a_n, b_n]\\) be a closed rectangular region. If \\(n=2\\), this is a rectangle, and if \\(n=3\\), a box. We begin by defining integration over closed rectangular regions. For each side, a partition \\(P_i\\) is chosen based on \\(a_i = x_{i0} < x_{i1} < \\cdots < x_{ik} = b_i\\). Then a sub-rectangular region would be of the form \\(R' = P_{1j_1} \\times P_{2j_2} \\times \\cdots \\times P_{nj_n}\\), where \\(P_{ij_i}\\) is one of the partitioning sub intervals of \\([a_i, b_i]\\). Set \\(\\Delta R' = \\Delta P_{1j_1} \\cdot \\Delta P_{2j_2} \\cdot\\cdots\\cdot\\Delta P_{nj_n}\\) to be the \\(n\\)-dimensional volume of the sub-rectangular region.\nFor each sub-rectangular region, we can define \\(m(f,R')\\) to be \\(\\inf_{u \\text{ in } R'} f(u)\\) and \\(M(f, R') = \\sup_{u \\text{ in } R'} f(u)\\). If we enumerate all the sub-rectangular regions, we can define \\(m(f, P) = \\sum_i m(f, R_i) \\Delta R_i\\) and \\(M(f,P) = \\sum_i M(f, R_i)\\Delta R_i\\), as in the one-dimensional case. 
These are upper and lower sums, and, as before, would bound the Riemann sum formed by choosing any \\(c_i\\) in \\(R_i\\) and computing \\(S(f,P) = \\sum_i f(c_i) \\Delta R_i\\).\nAs with the one-dimensional case, \\(f\\) is Riemann integrable over \\(R\\) if the limits of \\(m(f,P)\\) and \\(M(f,P)\\) exist and are identical as the diameter of the partition (defined as the largest diameter of each side) goes to \\(0\\). If the limits are equal, then so is the limit of any Riemann sum.\nWhen \\(f\\) is Riemann integrable over a rectangular region \\(R\\), we denote the limit by any of:\n\\[\n\\iint_R f(x) dV, \\quad \\iint_R fdV, \\quad \\iint_R f(x_1, \\dots, x_n) dx_1 \\cdot\\cdots\\cdot dx_n, \\quad\\iint_R f(\\vec{x}) d\\vec{x}.\n\\]\nA key fact, requiring proof, is:\n\nAny continuous function, \\(f\\), is Riemann integrable over a closed, bounded rectangular region.\n\n\nAs with one-dimensional integrals, from the Riemann sum definition, several familiar properties for integrals follow. Let \\(V(R)\\) be the volume of \\(R\\) found by multiplying the side-lengths together.\nConstants:\n\nA constant is Riemann integrable and: \\(\\iint_R c dV = c V(R)\\).\n\nLinearity:\n\nFor integrable \\(f\\) and \\(g\\) and constants \\(a\\) and \\(b\\):\n\n\\[\n\\iint_R (af(x) + bg(x))dV = a\\iint_R f(x)dV + b\\iint_R g(x) dV.\n\\]\nDisjoint:\n\nIf \\(R\\) and \\(R'\\) are disjoint rectangular regions (possibly sharing a boundary), then the integral over the union is defined by linearity:\n\n\\[\n\\iint_{R \\cup R'} f(x) dV = \\iint_R f(x)dV + \\iint_{R'} f(x) dV.\n\\]\nMonotonicity:\n\nAs \\(f\\) is bounded, let \\(m \\leq f(x) \\leq M\\) for all \\(x\\) in \\(R\\). 
Then\n\n\\[\nm V(R) \\leq \\iint_R f(x) dV \\leq MV(R).\n\\]\n\nIf \\(f\\) and \\(g\\) are integrable and \\(f(x) \\leq g(x)\\), then the integrals have the same property, namely \\(\\iint_R f dV \\leq \\iint_R gdV\\).\nIf \\(S \\subset R\\), both closed rectangles, then if \\(f\\) is integrable over \\(R\\) it will be also over \\(S\\) and, when \\(f\\geq 0\\), \\(\\iint_S f dV \\leq \\iint_R fdV\\).\n\nTriangle inequality:\n\nIf \\(f\\) is bounded and integrable, then \\(|\\iint_R fdV| \\leq \\iint_R |f| dV\\).\n\n\n59.1.1 HCubature\nTo numerically compute multidimensional integrals over rectangular regions in Julia is efficiently done with the HCubature package. The hcubature function is defined for \\(n\\)-dimensional integrals, so the integrand is specified through a function which takes a vector as an input. The region to integrate over is of rectangular form. It is specified by a tuple of left endpoints and a tuple of right endpoints. The order is in terms of the order of the vector.\nTo elaborate, if we think of \\(f(\\vec{x}) = f(x_1, x_2, \\dots, x_n)\\) and we are integrating over \\([a_1, b_1] \\times \\cdots \\times [a_n, b_n]\\), then the region would be specified through two tuples: (a1, a2, ..., an) and (b1, b2, ..., bn).\nTo illustrate, to integrate the function \\(f(x,y) = x^2 + 5y^2\\) over the region \\([0,1] \\times [0,2]\\) using HCubatures hcubature function, we would proceed as follows:\n\nf(x,y) = x^2 + 5y^2\nf(v) = f(v...) # f accepts a vector\na0, b0 = 0, 1\na1, b1 = 0, 2\nhcubature(f, (a0, a1), (b0, b1))\n\n(14.0, 1.7763568394002505e-15)\n\n\nThe computed value and a worst case estimate for the error is returned, in a manner similar to the quadgk function (from the QuadGK package) used previously for one-dimensional numeric integrals.\nThe order above is x then y, which is clear from the first definition of f and as belabored in the tuples passed to hcubature. 
A more convenient use is to just put the constants into the function call, as in hcubature(f, (0,0), (1,2)).\n\nExample\nLets verify the numeric approach works for figures where an answer is known from the geometry of the problem.\n\nA constant function \\(c=f(x,y)\\). In this case, the volume is simply a box, so the volume will come from multiplying the three dimensions. Here is an example:\n\n\nf(x,y) = 3\nf(v) = f(v...)\na0, b0 = 0, 4\na1, b1 = 0, 5 # R is area 20, so V = 60 = 3 ⋅ 20\nhcubature(f, (a0, a1), (b0, b1))\n\n(60.0, 7.105427357601002e-15)\n\n\n\nA wedge. Let \\(f(x,y) = x\\) and \\(R= [0,1] \\times [0,1]\\). Then the volume is a wedge, and should be half the value of the unit cube, or simply \\(1/2\\):\n\n\nf(x,y) = x\nf(v) = f(v...)\na0, b0 = 0, 1\na1, b1 = 0, 1\nhcubature(f, (a0, a1), (b0, b1))\n\n(0.5, 0.0)\n\n\n\nThe volume of a right square pyramid is \\(V=(1/3)a^2 h\\), or a third of an enclosing box. We computed this volume previously using the method of slices. Here we do it thinking of the pyramid as the volume formed by the surface over the region \\([-a/2,a/2] \\times [-a/2,a/2]\\) generated by \\(f(x,y) = h \\cdot (l(x,y) - d(x,y))/l(x,y)\\) where \\(d(x,y)\\) is the distance to the origin, or \\(\\sqrt{x^2 + y^2}\\) and \\(l(x,y)\\) is the length of the line segment from the origin to the boundary of \\(R\\) that goes through \\((x,y)\\).\n\nIdentifying a formula for this is a bit tricky. Here we use a brute force approach; later we will simplify this. Using polar coordinates, we know \\(r\\cos(\\theta) = a/2\\) describes the line \\(x=a/2\\) and \\(r\\sin(\\theta)=a/2\\) describes the line \\(y=a/2\\). Using the square, we have to alternate between these depending on where \\(\\theta\\) is (e.g., between \\(-\\pi/4\\) and \\(\\pi/4\\) it would be \\(r\\cos(\\theta)=a/2\\), so \\((a/2)/\\cos(\\theta)\\) is \\(l(x,y)\\)). 
We write a function for this:\n\n𝒅(x, y) = sqrt(x^2 + y^2)\nfunction 𝒍(x, y, a)\n theta = atan(y,x)\n atheta = abs(theta)\n if (pi/4 <= atheta < 3pi/4) # this is the y=a/2 or y=-a/2 case\n (a/2)/sin(atheta)\n else\n (a/2)/abs(cos(atheta))\n end\nend\n\n𝒍 (generic function with 1 method)\n\n\nAnd then\n\n𝒇(x,y,a,h) = h * (𝒍(x,y,a) - 𝒅(x,y))/𝒍(x,y,a)\n𝒂, 𝒉 = 2, 3\n𝒇(x,y) = 𝒇(x, y, 𝒂, 𝒉) # fix a and h\n𝒇(v) = 𝒇(v...)\n\n𝒇 (generic function with 3 methods)\n\n\nWe can visualize the volume to be computed, as follows:\n\nxs = ys = range(-1, 1, length=20)\nsurface(xs, ys, 𝒇)\n\n\n\n\nTrying this, we have:\n\nhcubature(𝒇, (-𝒂/2, -𝒂/2), (𝒂/2, 𝒂/2))\n\n(4.000000009419327, 5.9590510310780554e-8)\n\n\nThe answer agrees with that known from the formula, \\(4 = (1/3)a^2 h\\), but the answer takes a long time to be produced. The hcubature function is slow with functions defined in terms of conditions. For this problem, volumes by slicing is more direct. But also symmetry can be used, if we were able to compute the volume above the triangular region formed by the \\(x\\)-axis, the line \\(x=a/2\\) and the line \\(y=x\\), which would be \\(1/8\\)th the total volume. (As then \\(l(x,y,a) = (a/2)/\\sin(\\tan^{-1}(y,x))\\).)\n\nThe volume of a sphere is \\(4/3 \\pi r^3\\). We could verify this by integrating \\(z = f(x,y) = \\sqrt{r^2 - (x^2 + y^2)}\\) over \\(R = \\{(x,y): x^2 + y^2 \\leq r^2\\}\\). However, this is not a rectangular region, so we couldnt directly proceed.\n\nWe might try integrating a function with a condition:\n\nfunction f(x, y, r)\n if x^2 + y^2 < r^2\n sqrt(r^2 - x^2 - y^2)\n else\n 0.0\n end\nend\n\nf (generic function with 3 methods)\n\n\nBut hcubature is very slow to integrate such functions. We will see our instincts are good, as this is the approach taken to discuss integrals over general regions, but it is not practical here. 
There are two alternative approaches to be discussed: approach the integral iteratively or transform the circular region into a rectangular region and integrate. Before doing so, we discuss how the integral is developed for more general regions.\n\n\n\n\n\n\nNote\n\n\n\nThe approach above takes a nice smooth function and makes it non smooth at the boundary. In general this is not a good idea for numeric solutions, as many algorithms work better with assumptions of smoothness.\n\n\n\n\n\n\n\n\nNote\n\n\n\nThe Quadrature package provides a uniform interface for QuadGK, HCubature, and other numeric integration routines available in Julia."
},
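The cell-by-cell Riemann sums that hcubature refines adaptively can also be sketched directly. The following is a minimal midpoint-rule sketch (written here in Python so it stands alone; the helper name `midpoint_double` is made up for this illustration, not part of any package) that reproduces the value \(14\) computed above for \(f(x,y)=x^2+5y^2\) over \([0,1]\times[0,2]\):

```python
# A minimal midpoint Riemann-sum sketch of the double integral of
# f(x, y) = x^2 + 5y^2 over [0,1] x [0,2]; the exact value is 14.
# (`midpoint_double` is a name made up for this illustration.)
def midpoint_double(f, a0, b0, a1, b1, n=400):
    """Sum f at the midpoint of each grid cell times the cell's area."""
    dx, dy = (b0 - a0) / n, (b1 - a1) / n
    total = 0.0
    for i in range(n):
        x = a0 + (i + 0.5) * dx
        for j in range(n):
            y = a1 + (j + 0.5) * dy
            total += f(x, y) * dx * dy
    return total

approx = midpoint_double(lambda x, y: x**2 + 5 * y**2, 0, 1, 0, 2)
```

Unlike hcubature, this fixed grid gives no error estimate; it simply illustrates the sum the definition describes.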
{
"objectID": "integral_vector_calculus/double_triple_integrals.html#integrals-over-more-general-regions",
"href": "integral_vector_calculus/double_triple_integrals.html#integrals-over-more-general-regions",
"title": "59  Multi-dimensional integrals",
"section": "59.2 Integrals over more general regions",
"text": "59.2 Integrals over more general regions\nTo proceed further, it is necessary to discuss certain types of sets that will be used to describe the boundaries of regions that can be integrated over, though we dont dig into the details.\nLet the measure of a rectangular region be its volume and for any subset of \\(S \\subset R^n\\), define the outer measure of \\(S\\) by \\(m^*(S) = \\inf\\sum_{j=1}^\\infty V(R_j)\\) where the infimum is taken over all closed, countable, rectangles with \\(S \\subset \\cup_{j=1}^\\infty R_j\\).\nIn two dimensions, if \\(S\\) is viewed on a grid, then this would be area of the smallest collection of cells that contain any part of \\(S\\). This is the smallest this value takes as the grid becomes infinite.\nFor the following graph, there are \\(100\\) cells each of area \\(8/100\\). Their are 58 cells covering the curve and its interior. So the outer measure is less than \\(58\\cdot 8/100\\), as this is just one possible covering.\n\n\n\n\n\nA set has measure \\(0\\) if the outer measure is \\(0\\). An alternate definition, among other characterizations, is a set has measure \\(0\\) if for any \\(\\epsilon > 0\\) there exists rectangular regions \\(R_1, R_2, \\dots, R_n\\) (for some \\(n\\)) with \\(\\sum V(R_i) < \\epsilon\\). Measure zero sets have many properties not discussed here.\nFor now, lets see that graph of \\(y=f(x)\\) over \\([a,b]\\), as a two dimensional set, has measure zero when \\(f(x)\\) has a bounded derivative (\\(|f'|\\) bounded by \\(M\\)). Fix some \\(\\epsilon>0\\). Take \\(n\\) with \\(2M(b-a)^2/n < \\epsilon\\), then divide \\([a,b]\\) into \\(n\\) equal length intervals (of length \\(\\delta = (b-a)/n)\\). For each interval, we consider the box \\([a_i, b_i] \\times [f(a_i)-\\delta M, f(a_i) + \\delta M]\\). By the mean value theorem, we have \\(|f(x) - f(a_i)| \\leq |b_i-a_i|M\\) so \\(f(a_i) - \\delta M \\leq f(x) \\leq f(a_i) + \\delta M\\), so the curve will stay in the boxes. 
These boxes have total area \\(n \\cdot \\delta \\cdot 2\\delta M = 2M(b-a)^2/n\\), an area less than \\(\\epsilon\\).\nThe above can be extended to any graph of a continuous function over \\([a,b]\\).\nFor a function \\(f\\) the set of discontinuities in \\(R\\) is all points where \\(f\\) is not continuous. A formal definition is often given in terms of oscillation. Let \\(o(f, \\vec{x}, \\delta) = \\sup_{\\{\\vec{y} : \\| \\vec{y}-\\vec{x}\\| < \\delta\\}}f(\\vec{y}) - \\inf_{\\{\\vec{y}: \\|\\vec{y}-\\vec{x}\\|<\\delta\\}}f(\\vec{y})\\). A function is discontinuous at \\(\\vec{x}\\) if the limit as \\(\\delta \\rightarrow 0+\\) (which must exist) is not \\(0\\).\nWith this, we can state the Riemann-Lebesgue theorem on integrable functions:\n\nLet \\(R\\) be a closed, rectangular region, and \\(f:R^n \\rightarrow R\\) a bounded function. Then \\(f\\) is Riemann integrable over \\(R\\) if and only if the set of discontinuities is a set of measure \\(0\\).\n\nIt was said at the outset we would generalize the regions we can integrate over, but this theorem generalizes the functions. We can tie the two together as follows. Define the integral over any bounded set \\(S\\) with boundary of measure \\(0\\). Bounded means \\(S\\) is contained in some bounded rectangle \\(R\\). Let \\(f\\) be defined on \\(S\\) and extend it to be \\(0\\) on points in \\(R\\) that are not in \\(S\\). If this extended function is integrable over \\(R\\), then we can define the integral over \\(S\\) in terms of that. This is why the boundary of \\(S\\) must have measure zero, as in general it is among the set of discontinuities of the extended function \\(f\\). Such regions are also called Jordan regions."
},
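The grid-covering picture behind outer measure can be sketched numerically: count the cells of a grid that touch a set, and watch their total area shrink toward the set's area as the grid refines. Below is a minimal sketch (in Python, with the made-up helper name `cover_area`) for the closed unit disk, whose area is \(\pi\):

```python
import math

# Count the cells of an n-by-n grid on [-1,1] x [-1,1] that touch the closed
# unit disk.  Their total area is one possible covering, so it bounds the
# disk's area (pi) from above, and it shrinks toward pi as the grid refines.
# (`cover_area` is a name made up for this illustration.)
def cover_area(n):
    side = 2.0 / n
    count = 0
    for i in range(n):
        for j in range(n):
            x0, y0 = -1 + i * side, -1 + j * side
            # the cell's closest point to the origin (clamp 0 into the cell)
            nx = min(max(x0, 0.0), x0 + side)
            ny = min(max(y0, 0.0), y0 + side)
            if nx * nx + ny * ny <= 1.0:
                count += 1
    return count * side * side

coarse, fine = cover_area(10), cover_area(400)
```

The coarse count overestimates \(\pi\) more than the fine one, mirroring how the infimum in the definition is approached by refining coverings.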
{
"objectID": "integral_vector_calculus/double_triple_integrals.html#fubinis-theorem",
"href": "integral_vector_calculus/double_triple_integrals.html#fubinis-theorem",
"title": "59  Multi-dimensional integrals",
"section": "59.3 Fubinis theorem",
"text": "59.3 Fubinis theorem\nConsider again this figure\n\n\n\n\n\nLet \\(C_i\\) enumerate all the cells shown, assume \\(f\\) is extended to be \\(0\\) outside the region, and let \\(c_i\\) be a point in the cell. Then the Riemann sum \\(\\sum_i f(c_i) V(C_i)\\) can be visualized three identical ways:\n\nas a linear sum over the indices \\(i\\), as written, leading to \\(\\iint_R f(x) dV\\).\nby indexing the cells by row (\\(i\\)) and column (\\(j\\)) and summing as \\(\\sum_i (\\sum_j f(x_{ij}, y_{ij}) \\Delta y_j) \\Delta x_i\\).\nby indexing the cells by row (\\(i\\)) and column (\\(j\\)) and summing as \\(\\sum_j (\\sum_i f(x_{ij}, y_{ij}) \\Delta x_i) \\Delta y_j\\).\n\nThe last two suggest that their limit will be iterated integrals of the form \\(\\int_{-1}^1 (\\int_{-2}^2 f(x,y) dy) dx\\) and \\(\\int_{-2}^2 (\\int_{-1}^1 f(x,y) dx) dy\\).\nBy “iterated” we mean performing two different definite integrals. For example, to compute \\(\\int_{-1}^1 (\\int_{-2}^2 f(x,y) dy) dx\\) the first task would be to compute \\(I(x) = \\int_{-2}^2 f(x,y) dy\\). Like partial derivatives, this integrates in \\(y\\) while treating \\(x\\) as a constant. Once the interior integral is computed, then the integral \\(\\int_{-1}^1 I(x) dx\\) would be computed to find the answer.\nThe question then: under what conditions will the three integrals be equal?\n\nFubini. Let \\(R \\times S\\) be a closed rectangular region in \\(R^n \\times R^m\\). Suppose \\(f\\) is bounded. Define \\(f_x(y) = f(x,y)\\) and \\(f^y(x) = f(x,y)\\) where \\(x\\) is in \\(R^n\\) and \\(y\\) in \\(R^m\\). 
If \\(f_x\\) and \\(f^y\\) are integrable then\n\\[\n\\iint_{R\\times S}fdV = \\iint_R \\left(\\iint_S f_x(y) dy\\right) dx\n= \\iint_S \\left(\\iint_R f^y(x) dx\\right) dy.\n\\]\n\nSimilarly, if \\(f^y\\) is integrable for all \\(y\\), then \\(\\iint_{R\\times S}fdV =\\iint_S \\iint_R f(x,y) dx dy\\).\nAn immediate corollary is that the above holds for continuous functions when \\(R\\) and \\(S\\) are bounded, the case described here.\nThe case of continuous functions was known to Euler; Lebesgue (1904) discussed bounded functions, as in our statement, and Fubini and Tonelli (1907 and 1909) generalized the statement to more general functions than continuous functions, thereby earning naming rights.\nIn Ferzola we can read a summary of Eulers thinking of 1769 when trying to understand the integral of a function \\(f(x,y)\\) over a bounded domain \\(R\\) enclosed by arcs in the \\(x\\)-\\(y\\) plane. (That is, the area below \\(g(x)\\) and above \\(h(x)\\) over the interval \\([a,b]\\).) Euler wrote the answer as \\(\\int_a^b dx (\\int_{g(x)}^{h(x)} f(x,y)dy)\\). Ferzola writes that Euler saw this integral yielding a volume as the integral \\(\\int_{g(x)}^{h(x)} f(x,y)dy\\) gives the area of a slice (parallel to the \\(y\\) axis) and integrating in \\(x\\) adds these slices to give a volume. This is the typical usage of Fubinis theorem today.\n\n\n\nFigure 14.2 of Strang illustrating the slice when either \\(x\\) is fixed or \\(y\\) is fixed. The inner integral computes the shaded area, the outer integral adds the areas up to compute volume.\n\n\nIn Volumes the formula for a volume with a known cross-sectional area is given by \\(V = \\int_a^b CA(x) dx\\). The inner integral, \\(\\int_{R_x} f(x,y) dy\\), is a function depending on \\(x\\) that yields the area of the slice (where \\(R_x\\) is the region sliced by the line of constant \\(x\\) value). 
This is consistent with Eulers view of the iterated integral.\nA domain, as described above, is known as a normal domain. Using Fubinis theorem to integrate iteratively, employing the fundamental theorem of calculus at each step, is the standard approach.\nFor example, we return to the problem of a square pyramid, only now, using symmetry, we integrate only over the triangular region between \\(0 \\leq x \\leq a/2\\) and \\(0 \\leq y \\leq x\\). The answer is then (the \\(8\\) by symmetry)\n\\[\nV = 8 \\int_0^{a/2} \\int_0^x h(l(x,y) - d(x,y))/l(x,y) dy dx.\n\\]\nBut, using similar triangles, we have \\(d/x = l/(a/2)\\) so \\((l-d)/l = 1 - 2x/a\\). Continuing, our answer becomes\n\\[\nV = 8 \\int_0^{a/2} (\\int_0^x h(1-\\frac{2x}{a}) dy) dx =\n8 \\int_0^{a/2} (h(1-2x/a) \\cdot x) dx =\n8h (\\frac{x^2}{2} \\big\\lvert_{0}^{a/2} - \\frac{2}{a}\\frac{x^3}{3}\\big\\lvert_0^{a/2})=\n8 h(\\frac{a^2}{8} - \\frac{2}{24}a^2) = \\frac{a^2h}{3}.\n\\]\n\n59.3.1 SymPys integrate\nThe integrate function of SymPy uses various algorithms to symbolically integrate definite (and indefinite) integrals. In the section on integrals its use for one-dimensional integrals was shown. For multi-dimensional integrals the usage is similar, the syntax following, somewhat, the Fubini-like notation.\nFor example, to perform the integral\n\\[\n\\int_a^b \\int_{h(x)}^{g(x)} f(x,y) dy dx\n\\]\nthe call would look like:\nintegrate(f(x,y), (y, h(x), g(x)), (x, a, b))\nThat is, the variable to integrate and the endpoints are passed as tuples. (Unlike hcubature which always uses two tuples to specify the bounds, integrate uses \\(n\\) tuples to specify an \\(n\\)-dimensional integral.) The iteration happens from left to right, so in the above the y integral is done (and, as seen, may depend on the variable x) and then the x integral is performed. 
The above uses f(x,y), h(x) and g(x), but these may be simple symbolic expressions and not function calls using symbolic variables.\nWe define x and y below for use throughout:\n\n@syms x::real y::real z::real\n\n(x, y, z)\n\n\n\nExample\nFor example, the last integral to compute the volume of a square pyramid, could be computed through\n\n@syms a height\n8 * integrate(height * (1 - 2x/a), (y, 0, x), (x, 0, a/2))\n\n \n\\[\n\\frac{a^{2} height}{3}\n\\]\n\n\n\n\n\nExample\nFind the integral \\(\\int_0^1\\int_{y^2}^1 y \\sin(x^2) dx dy\\).\nWithout concerning ourselves with what or why, we just translate:\n\nintegrate( y * sin(x^2), (x, y^2, 1), (y, 0, 1))\n\n \n\\[\n- \\frac{3 \\sqrt{2} \\sqrt{\\pi} \\left(\\frac{3 \\sqrt{2} \\cos{\\left(1 \\right)} \\Gamma\\left(\\frac{3}{4}\\right)}{16 \\sqrt{\\pi} \\Gamma\\left(\\frac{7}{4}\\right)} + \\frac{3 S\\left(\\frac{\\sqrt{2}}{\\sqrt{\\pi}}\\right) \\Gamma\\left(\\frac{3}{4}\\right)}{8 \\Gamma\\left(\\frac{7}{4}\\right)}\\right) \\Gamma\\left(\\frac{3}{4}\\right)}{8 \\Gamma\\left(\\frac{7}{4}\\right)} + \\frac{3 \\sqrt{2} \\sqrt{\\pi} S\\left(\\frac{\\sqrt{2}}{\\sqrt{\\pi}}\\right) \\Gamma\\left(\\frac{3}{4}\\right)}{16 \\Gamma\\left(\\frac{7}{4}\\right)} + \\frac{9 \\Gamma^{2}\\left(\\frac{3}{4}\\right)}{64 \\Gamma^{2}\\left(\\frac{7}{4}\\right)}\n\\]\n\n\n\n\n\nExample\nFind the volume enclosed by \\(y = x^2\\), \\(y = 5\\), \\(z = x^2\\), and \\(z = 0\\).\nThe limits on \\(z\\) say this is the volume under the surface \\(f(x,y) = x^2\\), over the region defined by \\(y=5\\) and \\(y = x^2\\). 
The region is bounded by a parabola, with \\(y\\) running from \\(x^2\\) to \\(5\\), while \\(x\\) ranges from \\(-\\sqrt{5}\\) to \\(\\sqrt{5}\\).\n\nf(x, y) = x^2\nh(x) = x^2\ng(x) = 5\nintegrate(f(x,y), (y, h(x), g(x)), (x, -sqrt(Sym(5)), sqrt(Sym(5))))\n\n \n\\[\n\\frac{20 \\sqrt{5}}{3}\n\\]\n\n\n\n\n\nExample\nFind the volume above the \\(x\\)-\\(y\\) plane when a cylinder, \\(x^2 + y^2 = 2^2\\), is intersected by a plane \\(3x + 4y + 5z = 6\\).\nWe solve for \\(z = (1/5)\\cdot(6 - 3x - 4y)\\) and take \\(R\\) as the disk centered at the origin of radius \\(2\\):\n\nf(x,y) = 6 - 3x - 4y\ng(x) = sqrt(2^2 - x^2)\nh(x) = -sqrt(2^2 - x^2)\n(1//5) * integrate(f(x,y), (y, h(x), g(x)), (x, -2, 2))\n\n \n\\[\n\\frac{24 \\pi}{5}\n\\]\n\n\n\n\n\nExample\nFind the volume:\n\nin the first octant\nbounded by \\(x+y+z = 10\\), \\(2x + 3y = 20\\), and \\(x + 3y = 10\\)\n\nThe first plane can be expressed as \\(z = f(x,y) = 10 - x - y\\) and the volume is that below the surface of \\(f\\) over the region \\(R\\) formed by the two lines and the \\(x\\) and \\(y\\) axes. Plotting that we have:\n\ng1(x) = (20 - 2x)/3\ng2(x) = (10 - x)/3\nplot(g1, 0, 20)\nplot!(g2, 0, 20)\n\n\n\n\nWe see the intersection is when \\(x=10\\), so this becomes\n\nf(x,y) = 10 - x - y\nh(x) = (10 - x)/3\ng(x) = (20 - 2x)/3\nintegrate(f(x,y), (y, h(x), g(x)), (x, 0, 10))\n\n \n\\[\n\\frac{500}{9}\n\\]\n\n\n\n\n\nExample\nLet \\(r=1\\) and define three cylinders along the \\(x\\), \\(y\\), and \\(z\\) axes by: \\(y^2+z^2 = r^2\\), \\(x^2 + z^2 = r^2\\), and \\(x^2 + y^2 = r^2\\). What is the enclosed volume?\nUsing the cylinder along the \\(z\\) axis, we have that the volume sits above and below the disk \\(R = x^2 + y^2 \\leq r^2\\). 
By symmetry, we can double the volume that sits above the disk to answer the question.\nUsing symmetry, we can tell that the wedge between \\(x=0\\), \\(y=x\\), and \\(x^2 + y^2 \\leq 1\\) (corresponding to a polar angle in \\([0,\\pi/4]\\) in \\(R\\)) contains \\(1/8\\) the volume of the top, so \\(1/16\\) of the total.\n\n\n\n\n\nOver this wedge the height is given by the cylinder along the \\(y\\) axis, \\(x^2 + z^2 = r^2\\). We could break this wedge into a triangle and a semicircle to integrate piece by piece. However, from the figure we can integrate in the \\(y\\) direction on the outside, and use only one integral:\n\nr = 1 # if using r as a symbolic variable specify `positive=true`\nf(x,y) = sqrt(r^2 - x^2)\n16 * integrate(f(x,y), (x, y, sqrt(r^2-y^2)), (y, 0, r*cos(PI/4)))\n\n \n\\[\n16 - 8 \\sqrt{2}\n\\]\n\n\n\n\n\nExample\nFind the volume under \\(f(x,y) = xy\\) in the cone swept out by \\(r(\\theta) = 1\\) as \\(\\theta\\) goes between \\([0, \\pi/4]\\).\nThe region \\(R\\) is the same as the last one. As seen, it can be described in two pieces as a function of \\(x\\), but needs only one piece as a function of \\(y\\), so we use that below:\n\nf(x,y) = x*y\ng(y) = sqrt(1 - y^2)\nh(y) = y\nintegrate(f(x,y), (x, h(y), g(y)), (y, 0, sin(PI/4)))\n\n \n\\[\n\\frac{1}{16}\n\\]\n\n\n\n\n\nExample: Average value\nThe average value of a function, \\(f(x,y)\\), over a region \\(R\\) is the integral of \\(f\\) over \\(R\\) divided by the area of \\(R\\). It can be computed through two integrals, as below.\nLet \\(R\\) be the triangular region in the first quadrant bounded by \\(x - y = 0\\) and \\(x = 1\\), and let \\(f(x,y) = x^2 + y^2\\). 
Find the average value.\n\nf(x,y) = x^2 + y^2\ng(x) = x # solve x - y = 0 for y\nh(x) = 0\nA = integrate(f(x,y), (y, h(x), g(x)), (x, 0, 1))\nB = integrate(Sym(1), (y, h(x), g(x)), (x, 0, 1))\nA/B\n\n \n\\[\n\\frac{2}{3}\n\\]\n\n\n\n(We integrate Sym(1) and not just 1, as we either need to have a symbolic value for the first argument or use the sympy.integrate method directly.)\n\n\nExample: Density\nThe area of a region \\(R\\) can be computed by \\(\\iint_R 1 dA\\). If the region is physical, say a disc, then its mass can be of interest. If the mass is uniform with density \\(\\rho\\), then the mass would be \\(\\iint_R \\rho dA\\). If the mass is non-uniform, say it is a function \\(\\rho(x,y)\\), then the integral to find the mass becomes \\(\\iint_R \\rho(x,y) dA\\). (In a Riemann sum, the term \\(\\rho(c_{ij}) \\Delta x_i\\Delta y_j\\) would be the mass of a constant-density solid; the integral just adds these up to find total mass.)\nFind the mass of the region bounded by the two parabolas \\(y=2 - x^2\\) and \\(y = -3 + 2x^2\\) with density function given by \\(\\rho(x,y) = x^2y^2\\).\nFirst we need the intersection points of the two parabolas. Solving \\(2-x^2 = -3 + 2x^2\\) for \\(x\\) yields \\(3x^2 = 5\\), or \\(x = \\pm\\sqrt{5/3}\\).\nSo we get a mass of:\n\nrho(x,y) = x^2*y^2\ng(x) = 2 - x^2\nh(x) = -3 + 2x^2\na = sqrt(Sym(5)/3)\nintegrate(rho(x,y), (y, h(x), g(x)), (x, -a, a))\n\n \n\\[\n\\frac{460 \\sqrt{15}}{729}\n\\]\n\n\n\n\n\nExample (Strang)\nIntegrate \\(\\int_0^1 \\int_y^1 \\cos(x^2) dx dy\\) avoiding the impossible integral of \\(\\cos(x^2)\\). As the integrand is continuous, Fubinis Theorem allows the interchange of the order of integration. The region, \\(R\\), is a triangle in the first quadrant below the line \\(y=x\\) and left of the line \\(x=1\\). 
So we have:\n\\[\n\\int_0^1 \\int_0^x \\cos(x^2) dy dx\n\\]\nWe can integrate this, as the interior integral leaves \\(x \\cos(x^2)\\) to integrate:\n\nintegrate(cos(x^2), (y, 0, x), (x, 0, 1))\n\n \n\\[\n\\frac{\\sin{\\left(1 \\right)}}{2}\n\\]\n\n\n\n\n\n\n59.3.2 A “Fubini” function\nThe computationally efficient way to perform multiple integrals numerically would be to use hcubature. However, this function is defined only for rectangular regions. In the event of non-rectangular regions, the suggested performant way would be to find a suitable transformation (below).\nHowever, for simple problems, where ease of expressing a region is preferred to computational efficiency, something can be implemented using repeated uses of quadgk. Again, this isnt recommended, save for its relationship to how iteration is approached algebraically.\nIn the CalculusWithJulia package, the fubini function is provided. For these notes, we define three operations using Unicode operators entered with \\int[tab], \\iint[tab], \\iiint[tab]. 
(Defining these directly better shows the mechanics involved.)\n\n# adjust endpoints when expressed as functions of outer variables\ncallf(f::Number, x) = f\ncallf(f, x) = f(x...)\nendpoints(ys, x) = callf.(ys, Ref(x))\n\n# integrate f(x) dx\n∫(@nospecialize(f), xs) = quadgk(f, xs...)[1] # @nospecialize is not necessary, but offers a speed boost\n\n# integrate int_a^b int_h(x)^g(x) f(x,y) dy dx\n∬(f, ys, xs) = ∫(x -> ∫(y -> f(x,y), endpoints(ys, x)), xs)\n\n# integrate f(x,y,z) dz dy dx\n∭(f, zs, ys, xs) = ∫(\n x -> ∫(\n y -> ∫(\n z -> f(x,y,z),\n endpoints(zs, (x,y))),\n endpoints(ys,x)),\n xs)\n\n∭ (generic function with 1 method)\n\n\nExample\nCompare the integral of \\(f(x,y) = \\exp(-x^2 -2y^2)\\) over the region \\(R=[0,3]\\times[0,3]\\) using hcubature and the above.\n\nf(x,y) = exp(-x^2 - 2y^2)\nf(v) = f(v...)\nhcubature(f, (0,0), (3,3)) # (a0, a1), (b0, b1)\n\n(0.5553480979840428, 8.155874399598429e-9)\n\n\n\nf(x,y) = exp(-x^2 - 2y^2)\n∬(f, (0,3), (0,3)) # (a1, b1), (a0, b0)\n\n0.5553480979874703\n\n\n\n\nExample\nShow the area of the unit circle is \\(\\pi\\) using the “Fubini” function.\n\nf(x,y) = 1\na = ∬(f, (x-> -sqrt(1-x^2), x-> sqrt(1-x^2)), (-1, 1))\na, a - pi # answer and error\n\n(3.1415926559132474, 2.3234543178318745e-9)\n\n\n(The error is similar to that returned by quadgk(x -> sqrt(1-x^2), -1, 1).)\n\nExample\nShow the volume of a sphere of radius \\(1\\) is \\(4/3\\pi = 4/3\\pi\\cdot 1^3\\) by doubling the integral of \\(f(x,y) = \\sqrt{1-x^2-y^2}\\) over \\(R\\), the unit disk.\n\nf(x,y) = sqrt(1 - x^2 - y^2)\na = 2 * ∬(f, (x-> -sqrt(1-x^2), x-> sqrt(1-x^2)), (-1, 1))\na, a - 4/3*pi\n\n(4.188790207884331, 3.0979405707398655e-9)\n\n\n\n\n\nExample\nNumeric integrals dont need to worry about integrands without antiderivatives. Their concern is highly oscillatory integrands. 
The limits are in a different order than the “Fubini” function expects, so we switch the variables:\n\n∬((y,x) -> cos(x^2), (y -> y, 1), (0, 1))\n\n0.4207354924039483\n\n\nCompare to\n\nsin(1)/2\n\n0.42073549240394825"
},
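The nesting used by the Unicode operators above is not specific to quadgk; any one-dimensional rule can be iterated the same way. As a minimal sketch (in Python, with a made-up `simpson` helper rather than any library routine), here is \(\int_0^1 \int_y^1 \cos(x^2)\, dx\, dy\) computed by nesting composite Simpson rules, which should reproduce \(\sin(1)/2\):

```python
import math

# Iterated one-dimensional quadrature: integrate cos(x^2) for x from y to 1,
# then integrate that result for y from 0 to 1.  Reversing the order of
# integration shows the exact value is sin(1)/2.
# (`simpson` is a made-up helper, not a library routine.)
def simpson(f, a, b, n=200):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * k * h) for k in range(1, n // 2))
    return s * h / 3

inner = lambda y: simpson(lambda x: math.cos(x**2), y, 1)  # x from y to 1
value = simpson(inner, 0, 1)                               # then y from 0 to 1
```

Note how the inner endpoints depend on the outer variable, exactly as in the `endpoints` helper above.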
{
"objectID": "integral_vector_calculus/double_triple_integrals.html#triple-integrals",
"href": "integral_vector_calculus/double_triple_integrals.html#triple-integrals",
"title": "59  Multi-dimensional integrals",
"section": "59.4 Triple integrals",
"text": "59.4 Triple integrals\nTriple integrals are identical in theory to double integrals, though the computations can be more involved and the regions more complicated to describe. The main regions (emphasized by Strang) to understand are: box, prism, cylinder, cone, tetrahedron, and sphere.\n\n\n\n\n\nHere we compute the volumes of these using a triple integral of the form \\(\\iint_R 1 dV\\).\n\nBox. Consider the box-like, or “rectangular,” region \\([0,a]\\times [0,b] \\times [0,c]\\). This has volume \\(abc\\) which we see here using Fubinis theorem:\n\n\n@syms a b c\nf(x,y,z) = Sym(1) # need to integrate a symbolic object in integrand or call `sympy.integrate`\nintegrate(f(x,y,z), (x, 0, a), (y, 0, b), (z, 0, c))\n\n \n\\[\na b c\n\\]\n\n\n\n\nPrism. Consider a prism or wedge formed by \\(ay + bz = 1\\) with \\(a,b > 0\\) and over the region in the first quadrant \\(0 \\leq x \\leq c\\). Find its area.\n\nThe function to integrate is \\(f(x,y) = (1 - ay)/b\\) over the region bounded by \\([0,c] \\times [0,1/a]\\):\n\n@syms a b c\nf(x,y,z) = Sym(1)\nintegrate(f(x,y,z), (z, 0, (1 - a*y)/b), (y, 0, 1/a), (x, 0, c))\n\n \n\\[\n\\frac{c}{2 a b}\n\\]\n\n\n\nWhich, as expected, is half the volume of the box \\([0,c] \\times [0, 1/a] \\times [0, 1/b]\\).\n\nTetrahedron. Consider the volume formed by \\(x,y,z \\geq 0\\) and bounded by \\(ax+by+cz = 1\\) where \\(a,b,c \\geq 0\\). The volume is a tetrahedron. The base in the \\(x\\)-\\(y\\) plane is a triangle with vertices \\((1/a, 0, 0)\\) and \\((0, 1/b, 0)\\).\n\n(The third easy-to-find point is \\((0, 0, 1/c)\\)). The line connecting the points in the \\(x\\)-\\(y\\) plane is \\(ax + by = 1\\). With this, the integral to compute the volume is\n\n@syms a b c\nf(x,y,z) = Sym(1)\nintegrate(f(x,y,z), (z, 0, (1 - a*x - b*y)/c), (y, 0, (1 - a*x)/b), (x, 0, 1/a))\n\n \n\\[\n\\frac{1}{6 a b c}\n\\]\n\n\n\nThis is \\(1/6\\)th the volume of the box.\n\nCone. 
Consider a cone formed by the function \\(z = f(x,y) = a - b(x^2+y^2)^{1/2}\\) (\\(a,b > 0\\)) and the \\(x\\)-\\(y\\) plane. This will have radius \\(r = a/b\\) and height \\(a\\). The volume is given by this integral:\n\n\\[\n\\int_{x=-r}^r \\int_{y=-\\sqrt{r^2 - x^2}}^{\\sqrt{r^2-x^2}} \\int_0^{a - b(x^2 + y^2)^{1/2}} 1 dz dy dx.\n\\]\nThis integral is doable, but SymPy has trouble with it. We will return to this when cylindrical coordinates are defined.\n\nSphere. The sphere \\(x^2 + y^2 + z^2 \\leq 1\\) has a known volume. Can we compute it using integration? In Cartesian coordinates, we can describe the region \\(x^2 + y^2 \\leq 1\\) and then the \\(z\\)-limits will follow:\n\n\\[\n\\int_{x=-1}^1 \\int_{y=-\\sqrt{1-x^2}}^{\\sqrt{1-x^2}} \\int_{z=-\\sqrt{1 - x^2 - y^2}}^{\\sqrt{1-x^2 - y^2}} 1 dz dy dx.\n\\]\nThis integral is doable, but SymPy has trouble with it. We will return to this when spherical coordinates are defined."
},
{
"objectID": "integral_vector_calculus/double_triple_integrals.html#change-of-variables",
"href": "integral_vector_calculus/double_triple_integrals.html#change-of-variables",
"title": "59  Multi-dimensional integrals",
"section": "59.5 Change of variables",
"text": "59.5 Change of variables\nThe change of variables, or substitution, formula from first-semester calculus is expressed, under assumptions, by:\n\\[\n\\int_{g(R)} f(x) dx = \\int_R (f\\circ g)(u)g'(u) du.\n\\]\nThe derivation comes from reversing the chain rule. When using it, we start on the right hand side and typically write \\(x = g(u)\\) and from here derive an expression involving differentials: \\(dx = g'(u) du\\) and the rest follows. In practice, this is used to simplify the integrand in the search for an antiderivative, as \\((f\\circ g)\\) is generally more complicated than \\(f\\) alone.\nIn higher dimensions, we will see that change of variables can not only simplify the integrand, but is also of great use to simplify the region to integrate over. We mentioned, for example, that to use hcubature efficiently over a non-rectangular region, a transformation-or change of variables-is needed. The key to the multi-dimensional formula is understanding what should replace \\(dx = g'(u) du\\). We take a bit of a circuitous route to get there.\nIn Katz a review of the history of “change of variables” from Euler to Cartan is given. We follow Lagranges formal analysis to derive the change of variable formula in two dimensions.\nWe view \\(R\\) in two coordinate systems \\((x,y)\\) and \\((u,v)\\). We have that\n\\[\n\\begin{align}\ndx &= A du + B dv\\\\\ndy &= C du + D dv,\n\\end{align}\n\\]\nwhere \\(A = \\partial{x}/\\partial{u}\\), \\(B = \\partial{x}/\\partial{v}\\), \\(C= \\partial{y}/\\partial{u}\\), and \\(D = \\partial{y}/\\partial{v}\\). Lagrange, following Euler, first sets \\(x\\) to be constant (as is done in iterated integration). Hence, \\(dx = 0\\) and so \\(du = -C(B/A) dv\\) and, after substitution, \\(dy = (D-C(B/A))dv\\). Then Lagrange set \\(y\\) to be a constant, so \\(dy = 0\\) and hence \\(dv=0\\) so \\(dx = Adu\\). The area “element” \\(dx dy = A du \\cdot (D - (B/A)) dv = (AD - BC) du dv\\). 
Since areas and volumes are non-negative, the absolute value is used. With this, we have “\\(dxdy = |AD-BC|du dv\\)” as the analog of \\(dx = g'(u) du\\).\nThe expression \\(AD - BC\\) was also derived by Euler, by related means. Lagrange extended the analysis to 3 dimensions. Before doing so, it is helpful to understand the problem from a geometric perspective. Euler was attempting to understand the effects of the following change of variable:\n\\[\n\\begin{align}\nx &= a + mt + \\sqrt{1-m^2} v\\\\\ny & = b + \\sqrt{1-m^2}t -mv\n\\end{align}\n\\]\nEuler knew this to be a clockwise rotation by an angle \\(\\theta\\) with \\(\\cos(\\theta) = m\\), a reflection through the \\(x\\) axis, and a translation by \\(\\langle a, b\\rangle\\). All these should preserve the area represented by \\(dx dy\\), so he was expecting \\(dx dy = dt dv\\).\n\n\n\nFigure from Katz showing rotation of Euler.\n\n\nThe figure, taken from Katz, shows the translation and rotation that should preserve area on a differential scale.\nHowever, Euler knew \\(dx = (\\partial{g_1}/\\partial{t}) dt + (\\partial{g_1}/\\partial{v}) dv\\) and \\(dy = (\\partial{g_2}/{\\partial{t}}) dt + (\\partial{g_2}/\\partial{v}) dv\\). Just multiplying gives \\(dx dy = m\\sqrt{1-m^2} dt dt + (1-m^2) dv dt -m^2 dt dv -m\\sqrt{1-m^2} dv dv\\), a result that didn't make sense physically, as \\(dt dt\\) and \\(dv dv\\) have no meaning in integration and \\(1 - m^2 - m^2\\) is not \\(1\\) as expected. 
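With hindsight, Lagrange's formal rules for products of differentials (\\(dtdt = 0 = dvdv\\) and \\(dv dt = -dt dv\\), made rigorous much later in Cartan's differential forms, and revisited in the questions below) resolve the puzzle. They can be encoded in a few lines; here is a sketch in Python rather than the Julia used in these notes, with \\(m = 0.3\\) an arbitrary sample value:

```python
import math

def wedge(w1, w2):
    """Coefficient of dt dv in the product of two 1-forms.

    A 1-form is a dict {'dt': coeff, 'dv': coeff}.  The rules
    dt dt = 0 = dv dv and dv dt = -dt dv leave the single term
    (A*D - B*C) dt dv."""
    return w1['dt'] * w2['dv'] - w1['dv'] * w2['dt']

# Euler's change of variable: dx = m dt + sqrt(1-m^2) dv,
# dy = sqrt(1-m^2) dt - m dv.
m = 0.3
dx = {'dt': m, 'dv': math.sqrt(1 - m**2)}
dy = {'dt': math.sqrt(1 - m**2), 'dv': -m}

# The coefficient is -m^2 - (1 - m^2) = -1: up to sign (the map
# includes a reflection), area is preserved, as Euler expected.
print(wedge(dx, dy))
```

The absolute value of the coefficient is \\(1\\), matching the geometric expectation that a rotation, reflection, and translation do not change areas.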
Euler, like Lagrange, used a formal trick to proceed, but the geometric insight that the incremental areas for a change of variable should be related and for this change of variable identical is correct.\nThe following illustrates the polar-coordinate transformation \\(\\langle x,y\\rangle = G(r, \\theta) = r \\langle \\cos\\theta, \\sin\\theta\\rangle\\).\n\nG(u, v) = u * [cos(v), sin(v)]\n\nG(v) = G(v...)\nJ(v) = ForwardDiff.jacobian(G, v) # [∇g1', ∇g2']\n\nn = 6\nus = range(0, 1, length=3n) # radius\nvs = range(0, 2pi, length=3n) # angle\n\nplot(unzip(G.(us', vs))..., legend = false, aspect_ratio=:equal) # plots constant u lines\nplot!(unzip(G.(us, vs'))...) # plots constant v lines\n\npt = [us[n],vs[n]]\n\n\narrow!(G(pt), J(pt)*[1,0], color=:blue)\narrow!(G(pt), J(pt)*[0,1], color=:blue)\n\n\n\n\nThis graphic shows the image of the box \\([0,1] \\times [0, 2\\pi]\\) under the transformation. The plot commands draw lines for values of constant u or constant v. If \\(G(u,v) = \\langle g_1(u,v), g_2(u,v)\\rangle\\), then the Taylor expansion for \\(g_i\\) is \\(g_i(u+du, v+dv) \\approx g_i(u,v) + (\\nabla{g_i})^T \\cdot \\langle du, dv \\rangle\\) and combining \\(G(u+du, v+dv) \\approx G(u,v) + J_G(u,v) \\langle du, dv \\rangle\\). The vectors added above represent the images when \\(u\\) is constant (so \\(du=0\\)) and when \\(v\\) is constant (so \\(dv=0\\)). The two arrows define a parallelogram whose area gives the change of area undergone by the unit square under the transformation. 
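That parallelogram area is easy to check numerically. A small sketch, in Python with forward differences standing in for the ForwardDiff.jacobian call of the notes (the evaluation point (0.5, 1.2) is an arbitrary choice):

```python
import math

def G(u, v):
    """Polar-coordinate transformation (u = radius, v = angle)."""
    return (u * math.cos(v), u * math.sin(v))

def jacobian_columns(u, v, h=1e-6):
    # Forward-difference approximations to J_G * [1, 0] and J_G * [0, 1],
    # the two arrows drawn in the figure.
    x0, y0 = G(u, v)
    x1, y1 = G(u + h, v)
    x2, y2 = G(u, v + h)
    return ((x1 - x0) / h, (y1 - y0) / h), ((x2 - x0) / h, (y2 - y0) / h)

u, v = 0.5, 1.2
(a1, a2), (b1, b2) = jacobian_columns(u, v)
area = abs(a1 * b2 - a2 * b1)  # parallelogram area spanned by the two columns
print(area)  # close to 0.5: for polar coordinates the factor is the radius u
```

The printed value agrees with the radius \\(u = 0.5\\), anticipating the general fact stated next.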
The area is \\(|\\det(J_G)|\\), the absolute value of the determinant of the Jacobian.\n\n\nshowG (generic function with 3 methods)\n\n\nThe transformation to elliptical coordinates, \\(G(u,v) = \\langle \\cosh(u)\\cos(v), \\sinh(u)\\sin(v)\\rangle\\), may be viewed similarly:\n\n\n\n\n\nThe transformation \\(G(u,v) = v \\langle e^u, e^{-u} \\rangle\\) uses hyperbolic coordinates:\n\n\n\n\n\nThe transformation \\(G(u,v) = \\langle u^2-v^2, u\\cdot v \\rangle\\) yields a partition of the plane:\n\n\n\n\n\nThe arrows are the images of the standard unit vectors. We see that some transformations leave these orthogonal and some change the respective lengths. The area of the associated parallelogram can be found using the determinant of an accompanying matrix. For two dimensions, using the cross product formulation on the embedded vectors, the area is\n\\[\n\\| \\det\\left(\\left[\n\\begin{array}{}\n\\hat{i} & \\hat{j} & \\hat{k}\\\\\nu_1 & u_2 & 0\\\\\nv_1 & v_2 & 0\n\\end{array}\n\\right]\n\\right) \\|\n=\n\\| \\hat{k} \\det\\left(\\left[\n\\begin{array}{}\nu_1 & u_2\\\\\nv_1 & v_2\n\\end{array}\n\\right]\n\\right) \\|\n= | \\det\\left(\\left[\n\\begin{array}{}\nu_1 & u_2\\\\\nv_1 & v_2\n\\end{array}\n\\right]\n\\right)|.\n\\]\nUsing the fact that the two vectors involved are columns in the Jacobian of the transformation, this is just \\(|\\det(J_G)|\\). For \\(3\\) dimensions, the determinant gives the volume of the 3-dimensional parallelepiped in the same manner. This holds for higher dimensions.\nThe absolute value of the determinant of the Jacobian is the multiplying factor that is seen in the change of variable formula for all dimensions:\n\nChange of variable: Let \\(U\\) be an open set in \\(R^n\\), \\(G:U \\rightarrow R^n\\) be an injective differentiable function with continuous partial derivatives. 
If \\(f\\) is continuous and compactly supported, then\n\\[\n\\iint_{G(S)} f(\\vec{x}) dV = \\iint_S (f \\circ G)(\\vec{u}) |\\det(J_G)(\\vec{u})| dU.\n\\]\n\nFor the one-dimensional case, there is no absolute value, but there a decreasing \\(g\\) reverses the interval, producing “negative” area. This is not the case here, where \\(S\\) is parameterized to give positive volume.\n\n\n\n\n\n\nNote\n\n\n\nThe term “functional determinant” is found for the value \\(\\det(J_G)\\), as is the notation \\(\\partial(x_1, x_2, \\dots x_n)/\\partial(u_1, u_2, \\dots, u_n)\\).\n\n\n\n59.5.1 Two dimensional change of variables\nNow we see several examples of two-dimensional transformations.\n\nPolar integrals\nWe have seen how to compute area in polar coordinates through the formula \\(A = \\int (1/2) r^2(\\theta) d\\theta\\). This formula can be derived as follows. Consider a region \\(R\\) parameterized in polar coordinates by \\(r(\\theta)\\) for \\(a \\leq \\theta \\leq b\\). The area of this region would be \\(\\iint_R 1 dA\\). Let \\(G(r, \\theta) = r \\langle \\cos\\theta, \\sin\\theta\\rangle\\). Then\n\\[\nJ_G = \\left[\n\\begin{array}{}\n\\cos(\\theta) & - r\\sin(\\theta)\\\\\n\\sin(\\theta) & r\\cos(\\theta)\n\\end{array}\n\\right],\n\\]\nwith determinant \\(r\\).\nThat is, for polar coordinates \\(dx dy = r dr d\\theta\\) (\\(r \\geq 0\\)).\nSo by the change of variable formula, we have:\n\\[\nA = \\iint_R 1 dx dy = \\int_a^b \\int_0^{r(\\theta)} 1 r dr d\\theta = \\int_a^b \\frac{r^2(\\theta)}{2} d\\theta.\n\\]\nThe key is noting that the region, \\(S\\), described by \\(\\theta\\) running from \\(a\\) to \\(b\\) and \\(r\\) running from \\(0\\) to \\(r(\\theta)\\), maps onto \\(R\\) through the change of variables. As polar coordinates are just a renaming, this is clear to see.\n\nNow consider finding the volume of a sphere using polar coordinates. 
We have, with \\(\\rho\\) being the radius:\n\\[\nV = 2 \\iint_R \\sqrt{\\rho^2 - x^2 - y^2} dy dx,\n\\]\nwhere \\(R\\) is the disc of radius \\(\\rho\\). Using polar coordinates, we have \\(x^2 + y^2 = r^2\\) and the expression becomes:\n\\[\nV = 2 \\int_0^{2\\pi} \\int_0^\\rho \\sqrt{\\rho^2 - r^2} r dr d\\theta = 2 \\int_0^{2\\pi} -(\\rho^2 - r^2)^{3/2}\\frac{1}{3} \\mid_0^\\rho d\\theta = 2\\int_0^{2\\pi} \\frac{\\rho^3}{3}d\\theta = \\frac{4\\pi\\rho^3}{3}.\n\\]\n\nLinear transformations\nSome transformations from \\(2\\)D computer graphics are represented in matrix notation:\n\\[\n\\left[\n\\begin{array}{}\nx\\\\\ny\n\\end{array}\n\\right] =\n\\left[\n\\begin{array}{}\na & b\\\\\nc & d\n\\end{array}\n\\right]\n\\left[\n\\begin{array}{}\nu\\\\\nv\n\\end{array}\n\\right],\n\\]\nor \\(G(u,v) = \\langle au+bv, cu+dv\\rangle\\). The Jacobian of this linear transformation is the matrix itself.\nSome common transformations are:\n\nStretching: \\(G(u,v) = \\langle ku, v \\rangle\\) or \\(G(u,v) = \\langle u, kv\\rangle\\) for some \\(k >0\\). The former stretches the \\(x\\) axis, the latter the \\(y\\). These have Jacobian determinant \\(k\\).\n\n\n\n\n\n\n\nRotation. Let \\(\\theta\\) be a clockwise rotation parameter, then \\(G(u,v) = \\langle\\cos\\theta u + \\sin\\theta v, -\\sin\\theta u + \\cos\\theta v\\rangle\\) will be the transform. The Jacobian is \\(1\\). This figure rotates by \\(\\pi/6\\):\n\n\n\n\n\n\n\nShearing. Let \\(k > 0\\) and \\(G(u,v) = \\langle u + kv, v \\rangle\\). This transformation is a shear parallel to the \\(x\\) axis. (Use \\(G(u,v) = \\langle u, ku+v\\rangle\\) for the \\(y\\) axis). A shear has Jacobian \\(1\\).\n\n\nk = 2\nG(u, v) = [u + k*v, v]\nshowG(G)\n\n\n\n\n\nReflection. Let \\(\\vec{l} = \\langle l_x, l_y \\rangle\\) be a vector with norm \\(\\|\\vec{l}\\|\\). 
The reflection through the line in the direction of \\(\\vec{l}\\) through the origin is defined, using a matrix, by:\n\n\\[\n\\frac{1}{\\| \\vec{l} \\|^2}\n\\left[\n\\begin{array}{}\nl_x^2 - l_y^2 & 2 l_x l_y\\\\\n2l_x l_y & l_y^2 - l_x^2\n\\end{array}\n\\right]\n\\]\nFor some simple cases: \\(\\langle l_x, l_y \\rangle = \\langle 1, 1\\rangle\\), the diagonal, this is \\(G(u,v) = (1/2) \\langle 2v, 2u \\rangle\\); \\(\\langle l_x, l_y \\rangle = \\langle 0, 1\\rangle\\) (the \\(y\\)-axis), this is \\(G(u,v) = \\langle -u, v\\rangle\\).\n\nA translation by \\(\\langle a, b \\rangle\\) would be given by \\(G(u,v) = \\langle u+a, v+b \\rangle\\) and would have Jacobian determinant \\(1\\).\n\nAs an example, consider the transformation of reflecting through the line \\(x = 1/2\\). Let \\(\\vec{ab} = \\langle 1/2, 0\\rangle\\). This would be found by translating by \\(-\\vec{ab}\\) then reflecting through the \\(y\\) axis, then translating by \\(\\vec{ab}\\):\n\nT(u, v, a, b) = [u+a, v+b]\nG(u, v) = [-u, v]\n@syms u v\na,b = 1//2, 0\nx1, y1 = T(u,v, -a, -b)\nx2, y2 = G(x1, y1)\nx, y = T(x2, y2, a, b)\n\n2-element Vector{Sym}:\n 1 - u\n v\n\n\n\n\nTriangle\nConsider the problem of integrating \\(f(x,y)\\) over the triangular region bounded by \\(y=x\\), \\(y=0\\), and \\(x=1\\). Such an integral may be computed through Fubini's theorem through \\(\\int_0^1 \\int_0^x f(x,y) dy dx\\) or \\(\\int_0^1 \\int_y^1 f(x,y) dx dy\\), but if these cannot be computed and a numeric option is needed, a transformation so that the integral is over a rectangle is preferred.\n\nFor this, the transformation \\(x = u\\), \\(y=uv\\) for \\((u,v)\\) in \\([0,1] \\times [0,1]\\) is possible:\n\n\n\n\n\nThe determinant of the Jacobian is\n\\[\n\\det(J_G) = \\det\\left(\n\\left[\n\\begin{array}{}\n1 & 0\\\\\nv & u\n\\end{array}\n\\right]\n\\right) = u.\n\\]\nSo, \\(\\iint_R f(x,y) dA = \\int_0^1\\int_0^1 f(u, uv) u du dv\\). 
Here we illustrate with a generic monomial:\n\n@syms x y n::positive m::positive\nmonomial(x,y) = x^n*y^m\nintegrate(monomial(x,y), (y, 0, x), (x, 0, 1))\n\n \n\\[\n\\frac{1}{\\left(m + 1\\right) \\left(m + n + 2\\right)}\n\\]\n\n\n\nAnd compare with:\n\n@syms u v\nintegrate(monomial(u, u*v)*u, (u,0,1), (v,0,1))\n\n \n\\[\n\\frac{1}{\\left(m + 1\\right) \\left(m + n + 2\\right)}\n\\]\n\n\n\n\nComposition of transformations\nWhat about other triangles, say the triangle bounded by \\(x=0\\), \\(y=0\\) and \\(x+y=1\\)?\nThis can be seen as a reflection through the line \\(x=1/2\\) of the triangle above. If \\(G_1\\) represents the mapping from \\(U = [0,1]\\times[0,1]\\) into the triangle of the last problem, and \\(G_2\\) represents the reflection through the line \\(x=1/2\\), then the transformation \\(G_2 \\circ G_1\\) will map the box \\(U\\) into the desired region. By the chain rule, we have:\n\\[\n\\begin{align*}\n\\int_{(G_2\\circ G_1)(U)} f dx &= \\int_U (f\\circ G_2 \\circ G_1) |\\det(J_{G_2 \\circ G_1})| du \\\\\n&=\n\\int_U (f\\circ G_2 \\circ G_1) |\\det(J_{G_2}(G_1(u)))||\\det(J_{G_1}(u))| du.\n\\end{align*}\n\\]\n(In Katz, it is mentioned that Jacobi showed this in 1841.)\nThe flip through the \\(x=1/2\\) line was done above and is \\(\\langle u, v\\rangle \\rightarrow \\langle 1-u, v\\rangle\\) which has Jacobian determinant \\(-1\\).\nWe compare now using hcubature and our “Fubini” function:\n\nG1(u,v) = [u, u*v]\nG1(v) = G1(v...)\nG2(u,v) = [1-u, v]\nG2(v) = G2(v...)\nf(x,y) = x^2*y^3\nf(v) = f(v...)\nA = ∬((y,x) -> f(x,y), (0, x -> 1 - x), (0, 1))\nB = hcubature(v -> (f∘G2∘G1)(v) * v[1] * 1, (0,0), (1, 1))\nA, B[1], A - B[1]\n\n(0.0023809523809523807, 0.0023809523809509895, 1.3912482277333993e-15)\n\n\n\n\n\nHyperbolic transformation\nConsider the region, \\(R\\), bounded by \\(y=0\\), \\(x=e^{-n}\\), \\(x=e^n\\), and \\(y=1/x\\). 
An integral over this region may be computed with the help of the transform \\(G(u,v) = v \\langle e^u, e^{-u}\\rangle\\) which takes the box \\([-n, n] \\times [0,1]\\) onto \\(R\\).\nWith this, we compute \\(\\iint_R x^2 y^3 dA\\) using SymPy to compute the Jacobian:\n\n@syms u v n\nG(u,v) = v * [exp(u), exp(-u)]\nJac = G(u,v).jacobian([u,v])\nf(x,y) = x^2 * y^3\nf(v) = f(v...)\nintegrate(f(G(u,v)) * abs(det(Jac)), (u, -n, n), (v, 0, 1))\n\n \n\\[\n\\frac{\\left(2 e^{2 n} - 2\\right) e^{- n}}{7}\n\\]\n\n\n\n\nThis collection shows a summary of the above \\(2\\)D transformations:\n\n\n\n\n\n\n\n\n\n59.5.2 Examples\n\nCentroid:\nThe center of mass is a balancing point of a region with density \\(\\rho(x,y)\\). In two dimensions it is a point \\(\\langle \\bar{x}, \\bar{y}\\rangle\\). These are found by the following formulas:\n\\[\nA = \\iint_R \\rho(x,y) dA, \\quad \\bar{x} = \\frac{1}{A} \\iint_R x \\rho(x,y) dA, \\quad\n\\bar{y} = \\frac{1}{A} \\iint_R y \\rho(x,y) dA.\n\\]\nThe \\(x\\) value can be seen in terms of Fubini by integrating in \\(y\\) first:\n\\[\n\\iint_R x \\rho(x,y) dA = \\int_{x=a}^b x (\\int_{y=h(x)}^{g(x)} \\rho(x,y) dy) dx.\n\\]\nThe inner integral is the mass of a slice at a value along the \\(x\\) axis. The outer integral then adds up this slice mass times the distance from the origin. The center of mass is a “balance” point, in the sense that \\(\\iint_R (x - \\bar{x})\\rho(x,y) dA = 0\\) and \\(\\iint_R (y-\\bar{y})\\rho(x,y) dA = 0\\).\nFor example, the center of mass of the upper half unit disc will have a centroid with \\(\\bar{x} = 0\\), by symmetry. 
We can see this by integrating in Cartesian coordinates, as follows:\n\\[\n\\iint_R x dA = \\int_{y=0}^1 \\int_{x=-\\sqrt{1-y^2}}^{\\sqrt{1 - y^2}} x dx dy.\n\\]\nThe inner integral is \\(0\\) as it is an integral of an odd function over an interval symmetric about \\(0\\).\nThe value of \\(\\bar{y}\\) is found using the polar coordinate transformation from:\n\\[\n\\iint_R y dA = \\int_{r=0}^1 \\int_{\\theta=0}^{\\pi} (r\\sin(\\theta))r d\\theta dr =\n\\int_{r=0}^1 r^2 dr \\int_{\\theta=0}^{\\pi}\\sin(\\theta) d\\theta = \\frac{1}{3} \\cdot 2.\n\\]\nThe second equals sign uses separability. The answer for \\(\\bar{y}\\) is this value divided by the area, or \\(2/(3\\pi)\\).\n\n\nExample: Moment of inertia\nThe moment of inertia of a point mass about an axis is \\(I = mr^2\\) where \\(m\\) is the mass and \\(r\\) the distance to the axis. The moment of inertia of a body is the sum of the moments of inertia of each piece. If \\(R\\) is a region in the \\(x\\)-\\(y\\) plane with density \\(\\rho(x,y)\\) and the axis is the \\(y\\) axis, then an approximate moment of inertia would be \\(\\sum (x_i)^2\\rho(x_i, y_i)\\Delta x_i \\Delta y_i\\) which would lead to \\(I = \\iint_R x^2\\rho(x,y) dA\\).\nLet \\(R\\) be the half disc contained by \\(x^2 + y^2 = 1\\) and \\(x \\geq 0\\). Let \\(\\rho(x,y) = xy^2\\). 
Find the moment of inertia.\nThe region \\(R\\) is best described in polar coordinates, so we try to compute\n\\[\n\\int_0^1 \\int_{-\\pi/2}^{\\pi/2} (r\\cos(\\theta))^2 (r\\cos(\\theta))(r\\sin(\\theta))^2 r d\\theta dr.\n\\]\nThat requires integrating \\(\\sin^2(\\theta)\\cos^3(\\theta)\\), a doable task, but best left to SymPy:\n\n@syms r theta\nx = r*cos(theta)\ny = r*sin(theta)\nrho(x,y) = x*y^2\nintegrate(x^2 * rho(x, y) * r, (theta, -PI/2, PI/2), (r, 0, 1))\n\n \n\\[\n\\frac{4}{105}\n\\]\n\n\n\n\n\nExample\n(Strang) Find the moment of inertia about the \\(y\\) axis of the unit square tilted counter-clockwise by an angle \\(0 \\leq \\alpha \\leq \\pi/2\\).\nThe counterclockwise rotation of the unit square is \\(G(u,v) = \\langle \\cos(\\alpha)u-\\sin(\\alpha)v, \\sin(\\alpha)u + \\cos(\\alpha) v\\rangle\\). This comes from the above formula for clockwise rotation using \\(-\\alpha\\). This transformation has Jacobian determinant \\(1\\), as the area is not deformed. With this, and \\(R = G(U)\\), we have\n\\[\n\\iint_R x^2 dA = \\iint_{U} (f\\circ G)(u) |\\det(J_G(u))| dU,\n\\]\nwhich is computed with:\n\n@syms u v alpha\nf(x,y) = x^2\nG(u,v) = [cos(alpha)*u - sin(alpha)*v, sin(alpha)*u + cos(alpha)*v]\nJac = det(G(u,v).jacobian([u,v])) |> simplify\nintegrate(f(G(u,v)...) * Jac , (u, 0, 1), (v, 0, 1))\n\n \n\\[\n\\frac{\\sin^{2}{\\left(\\alpha \\right)}}{3} - \\frac{\\sin{\\left(\\alpha \\right)} \\cos{\\left(\\alpha \\right)}}{2} + \\frac{\\cos^{2}{\\left(\\alpha \\right)}}{3}\n\\]\n\n\n\n\n\nExample\nLet \\(R\\) be a ring with inner radius \\(4\\) and outer radius \\(5\\). 
Find its moment of inertia about the \\(y\\) axis.\nThe integral to compute is:\n\\[\n\\iint_R x^2 dA,\n\\]\nwith a domain that is easy to describe in polar coordinates:\n\n@syms r theta\nx = r*cos(theta)\nintegrate(x^2 * r, (r, 4, 5), (theta, 0, 2PI))\n\n \n\\[\n\\frac{369 \\pi}{4}\n\\]\n\n\n\n\n\n\n59.5.3 Three dimensional change of variables\nThe change of variables formula is no different between dimensions \\(2\\) and \\(3\\) (or higher), but the question of a suitable transformation is more involved as the dimensions increase. We stick here to a few widely used ones.\n\nCylindrical coordinates\nPolar coordinates describe the \\(x\\)-\\(y\\) plane in terms of a radius \\(r\\) and angle \\(\\theta\\). Cylindrical coordinates describe three-dimensional space in terms of \\(r, \\theta\\), and \\(z\\). A transformation is:\n\\[\nG(r,\\theta, z) = \\langle r\\cos(\\theta), r\\sin(\\theta), z\\rangle.\n\\]\nThis has Jacobian determinant \\(r\\), similar to polar coordinates.\n\nExample\nReturning to the volume of a cone above the \\(x\\)-\\(y\\) plane under \\(z = a - b(x^2 + y^2)^{1/2}\\). This yielded the integral in Cartesian coordinates:\n\\[\n\\int_{x=-r}^r \\int_{y=-\\sqrt{r^2 - x^2}}^{\\sqrt{r^2-x^2}} \\int_0^{a - b(x^2 + y^2)^{1/2}} 1 dz dy dx,\n\\]\nwhere \\(r=a/b\\). This is much simpler in cylindrical coordinates, as the region is described by the rectangle in \\((r, \\theta)\\): \\([0, a/b] \\times [0, 2\\pi]\\) and the \\(z\\) range is from \\(0\\) to \\(a - b r\\).\nThe volume then is:\n\\[\n\\int_{\\theta=0}^{2\\pi} \\int_{r=0}^{a/b} \\int_{z=0}^{a - br} 1 r dz dr d\\theta =\n2\\pi \\int_{r=0}^{a/b} (a-br)r dr = \\frac{\\pi a^3}{3b^2}.\n\\]\nThis is in agreement with \\(\\pi r^2 h/3\\).\n\nFind the centroid for the cone. 
First in the \\(x\\) direction, \\(\\iint_R x dV\\) is found by:\n\n@syms r theta z a b\nf(x,y,z) = x\nx = r*cos(theta)\ny = r*sin(theta)\nJac = r\nintegrate(f(x,y,z) * Jac, (z, 0, a - b*r), (r, 0, a/b), (theta, 0, 2PI))\n\n \n\\[\n0\n\\]\n\n\n\nThat this is \\(0\\) is no surprise. The same will be true for the \\(y\\) direction, as the figure is symmetric about the planes \\(y=0\\) and \\(x=0\\). However, the \\(z\\) direction is different:\n\n@syms r theta z a b\nf(x,y,z) = z\nx = r*cos(theta)\ny = r*sin(theta)\nJac = r\nA = integrate(f(x,y,z) * Jac, (z, 0, a - b*r), (r, 0, a/b), (theta, 0, 2PI))\nB = integrate(1 * Jac, (z, 0, a - b*r), (r, 0, a/b), (theta, 0, 2PI))\nA, B, A/B\n\n(pi*a^4/(12*b^2), pi*a^3/(3*b^2), a/4)\n\n\nThe answer depends on the height through \\(a\\), but not on the size of the base, parameterized by \\(b\\). To finish, the centroid is \\(\\langle 0, 0, a/4\\rangle\\).\n\n\nExample\nA sphere of radius \\(2\\) is intersected by a cylinder of radius \\(1\\) along the \\(z\\) axis. Find the volume of the intersection.\nWe have \\(x^2 + y^2 + z^2 = 4\\) or \\(z^2 = 4 - r^2\\) in cylindrical coordinates. The integral then is:\n\n@syms r::real theta::real z::real\nintegrate(1 * r, (z, -sqrt(4-r^2), sqrt(4-r^2)), (r, 0, 1), (theta, 0, 2PI))\n\n \n\\[\n2 \\pi \\left(\\frac{16}{3} - 2 \\sqrt{3}\\right)\n\\]\n\n\n\nIf instead of a fixed radius of \\(1\\) we use \\(0 \\leq a \\leq 2\\) we have:\n\n@syms a r theta\nintegrate(1 * r, (z, -sqrt(4-r^2), sqrt(4-r^2)), (r, 0, a), (theta,0, 2PI))\n\n \n\\[\n2 \\pi \\left(\\frac{2 a^{2} \\sqrt{4 - a^{2}}}{3} - \\frac{8 \\sqrt{4 - a^{2}}}{3} + \\frac{16}{3}\\right)\n\\]\n\n\n\n\n\n\nSpherical integrals\nSpherical coordinates describe a point in space by a radius from the origin, \\(r\\) or \\(\\rho\\); an azimuthal angle \\(\\theta\\) in \\([0, 2\\pi]\\); and an inclination angle \\(\\phi\\) (also called polar angle) in \\([0, \\pi]\\). 
The \\(z\\) axis is the direction of the zenith and gives a reference line to define the inclination angle. The \\(x\\)-\\(y\\) plane is the reference plane, with the \\(x\\) axis giving a reference direction for the azimuth measurement.\nThe exact formula to relate \\((\\rho, \\theta, \\phi)\\) to \\((x,y,z)\\) is given by\n\\[\nG(\\rho, \\theta, \\phi) = \\rho \\langle\n\\sin(\\phi)\\cos(\\theta),\n\\sin(\\phi)\\sin(\\theta),\n\\cos(\\phi)\n\\rangle.\n\\]\n\n\n\nFigure showing the parameterization by spherical coordinates. (Wikipedia)\n\n\nThe Jacobian can be computed to be \\(\\rho^2\\sin(\\phi)\\).\n\n@syms ρ theta phi\nG(ρ, theta, phi) = ρ * [sin(phi)*cos(theta), sin(phi)*sin(theta), cos(phi)]\ndet(G(ρ, theta, phi).jacobian([ρ, theta, phi])) |> simplify |> abs\n\n \n\\[\n\\left|{ρ^{2} \\sin{\\left(\\phi \\right)}}\\right|\n\\]\n\n\n\n\nExample\nComputing the volume of a sphere is a challenge (for SymPy) in Cartesian coordinates, but a breeze in spherical coordinates. Using \\(r^2\\sin(\\phi)\\) as the multiplying factor, the volume is simply:\n\\[\n\\int_{\\theta=0}^{2\\pi} \\int_{\\phi=0}^{\\pi} \\int_{r=0}^R 1 \\cdot r^2 \\sin(\\phi) dr d\\phi d\\theta =\n\\int_{\\theta=0}^{2\\pi} d\\theta \\int_{\\phi=0}^{\\pi} \\sin(\\phi)d\\phi \\int_{r=0}^R r^2 dr = (2\\pi)(2)\\frac{R^3}{3} = \\frac{4\\pi R^3}{3}.\n\\]\n\n\nExample\nCompute the volume of the ellipsoid, \\(R\\), described by \\((x/a)^2 + (y/b)^2 + (z/c)^2 \\leq 1\\).\nWe first change variables via \\(G(u,v,w) = \\langle ua, vb, wc \\rangle\\). This maps the unit sphere, \\(S\\), given by \\(u^2 + v^2 + w^2 \\leq 1\\) into the ellipsoid. Then\n\\[\n\\iint_R 1 dV = \\iint_S 1 |\\det(J_G)| dU.\n\\]\nBut the Jacobian is a constant:\n\n@syms u v w a b c\nG(u,v,w) = [u*a, v*b, w*c]\ndet(G(u,v,w).jacobian([u,v,w]))\n\n \n\\[\na b c\n\\]\n\n\n\nSo the answer is \\(abc V(S) = 4\\pi abc/3\\)."
},
{
"objectID": "integral_vector_calculus/double_triple_integrals.html#questions",
"href": "integral_vector_calculus/double_triple_integrals.html#questions",
"title": "59  Multi-dimensional integrals",
"section": "59.6 Questions",
"text": "59.6 Questions\n\nQuestion\nSuppose \\(f(x,y) = f_1(x)f_2(y)\\) and \\(R = [a_1, b_1] \\times [a_2,b_2]\\) is a rectangular region. Is this true?\n\\[\n\\iint_R f dA = (\\int_{a_1}^{b_1} f_1(x) dx) \\cdot (\\int_{a_2}^{b_2} f_2(y) dy).\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n No.\n \n \n\n\n \n \n \n \n Yes. As an inner integral \\(\\int_{a^2}^{b_2} f(x,y) dy = f_1(x) \\int_{a_2}^{b_2} f_2(y) dy\\).\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nWhich integrals of the following are \\(0\\) by symmetry? Let \\(R\\) be the unit disc.\n\\[\na = \\iint_R x dA, \\quad b = \\iint_R (x^2 + y^2) dA, \\quad c = \\iint_R xy dA\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n Both \\(b\\) and \\(c\\)\n \n \n\n\n \n \n \n \n Both \\(a\\) and \\(b\\)\n \n \n\n\n \n \n \n \n Both \\(a\\) and \\(c\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(R\\) be the unit disc. Which integrals can be found from common geometric formulas (e.g., known formulas for the sphere, cone, pyramid, ellipse, …)\n\\[\na = \\iint_R (1 - (x^2+y2)) dA, \\quad\nb = \\iint_R (1 - \\sqrt{x^2 + y^2}) dA, \\quad\nc = \\iint_R (1 - (x^2 + y^2)^2 dA\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n Both \\(a\\) and \\(b\\)\n \n \n\n\n \n \n \n \n Both \\(a\\) and \\(c\\)\n \n \n\n\n \n \n \n \n Both \\(b\\) and \\(c\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet the region \\(R\\) be described by: in the first quadrant and bounded by \\(x^3 + y^3 = 1\\). What integral below will not find the area of \\(R\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\int_0^1 \\int_0^{(1-x^3)^{1/3}} 1\\cdot dy dx\\)\n \n \n\n\n \n \n \n \n \\(\\int_0^1 \\int_0^{(1-y^3)^{1/3}} 1\\cdot dx dy\\)\n \n \n\n\n \n \n \n \n \\(\\int_0^1 \\int_0^{(1-y^3)^{1/3}} 1\\cdot dy dx\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(R\\) be a triangular region with vertices \\((0,0), (2,0), (1, b)\\) where \\(b \\geq 0\\). 
What integral below computes the area of \\(R\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\int_0^2\\int_0^{bx} dy dx\\)\n \n \n\n\n \n \n \n \n \\(\\int_0^2 \\int_0^{2b - bx} dy dx\\)\n \n \n\n\n \n \n \n \n \\(\\int_0^b\\int_{y/b}^{2-y/b} dx dy\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x) \\geq 0\\) be an integrable function. The area under \\(f(x)\\) over \\([a,b]\\), \\(\\int_a^b f(x) dx\\), is equivalent to?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\int_a^b \\int_0^{f(x)} dx dy\\)\n \n \n\n\n \n \n \n \n \\(\\int_0^{f(x)} \\int_a^b dx dy\\)\n \n \n\n\n \n \n \n \n \\(\\int_a^b \\int_0^{f(x)} dy dx\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe region \\(R\\) contained within \\(|x| + |y| = 1\\) is square, but not rectangular (in the sense of integration). What transformation of \\(S = [-1/2,1/2] \\times [-1/2,1/2]\\) will have \\(G(S) = R\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(G(u,v) = \\langle u^2-v^2, u^2+v^2 \\rangle\\)\n \n \n\n\n \n \n \n \n \\(G(u,v) = \\langle u-v, u+v \\rangle\\)\n \n \n\n\n \n \n \n \n \\(G(u,v) = \\langle u-v, u \\rangle\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(G(u,v) = \\langle \\cosh(u)\\cos(v), \\sinh(u)\\sin(v) \\rangle\\). Using ForwardDiff, find the determinant of the Jacobian at \\([1,2]\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(G(u, v) = \\langle \\cosh(u)\\cos(v), \\sinh(u)\\sin(v) \\rangle\\). 
Compute the determinant of the Jacobian symbolically:\n\n\n\n \n \n \n \n \n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \\(\\sinh(u)\\cosh(v)\\)\n \n \n\n\n \n \n \n \n \\(\\sin^{2}{\\left (v \\right )} \\cosh^{2}{\\left (u \\right )} + \\cos^{2}{\\left (v \\right )} \\sinh^{2}{\\left (u \\right )}\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nCompute the determinant of the Jacobian of the composition of a clockwise rotation by \\(\\theta\\), a reflection through the \\(x\\) axis, and then a translation by \\(\\langle a,b\\rangle\\), using the fact that the Jacobian determinant of compositions can be written as product of determinants of the individual Jacobians.\n\n\n\n \n \n \n \n \n \n \n \n \n It is \\(r^2 \\sin(\\phi)\\), as the rotations use spherical coordinates\n \n \n\n\n \n \n \n \n It is \\(r\\), as the rotation uses polar coordinates\n \n \n\n\n \n \n \n \n It is \\(1\\), as each is area preserving\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nA wedge, \\(R\\), is specified by \\(0 \\leq r \\leq a\\), \\(0 \\leq \\theta \\leq b\\).\n\n\n \n\\[\n- \\frac{a^{3} \\cos{\\left(b \\right)}}{3} + \\frac{a^{3}}{3}\n\\]\n\n\n\nWhat does A compute?\n\n\n\n \n \n \n \n \n \n \n \n \n The area of \\(R\\)\n \n \n\n\n \n \n \n \n The value \\(\\bar{x}\\) of the centroid\n \n \n\n\n \n \n \n \n The value \\(\\bar{y}\\) of the centroid\n \n \n\n\n \n \n \n \n The moment of inertia of \\(R\\) about the \\(x\\) axis\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat does \\(B/A\\) compute?\n\n\n\n \n \n \n \n \n \n \n \n \n The area of \\(R\\)\n \n \n\n\n \n \n \n \n The value \\(\\bar{x}\\) of the centroid\n \n \n\n\n \n \n \n \n The value \\(\\bar{y}\\) of the centroid\n \n \n\n\n \n \n \n \n The moment of inertia of \\(R\\) about the \\(x\\) axis\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nAccording to Katz in 1899 Cartan formalized the subject of differential forms (elements such as \\(dx\\) or \\(du\\)). 
Using the rules \\(dtdt = 0 = dvdv\\) and \\(dv dt = - dt dv\\), what is the product of \\(dx=mdt + dv\\sqrt{1-m^2}\\) and \\(dy=dt\\sqrt{1-m^2}-mdv\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(m\\sqrt{1-m^2}dt^2+(1-2m^2)dtdv -m\\sqrt{1-m^2}dv^2\\)\n \n \n\n\n \n \n \n \n \\((1-2m^2)dt dv\\)\n \n \n\n\n \n \n \n \n \\(dtdv\\)"
},
{
"objectID": "integral_vector_calculus/line_integrals.html",
"href": "integral_vector_calculus/line_integrals.html",
"title": "60  Line and Surface Integrals",
"section": "",
"text": "This section uses these add-on packages:\nThis section discusses generalizations to the one- and two-dimensional definite integral. These two integrals integrate a function over a one or two dimensional region (e.g., \\([a,b]\\) or \\([a,b]\\times[c,d]\\)). The generalization is to change this region to a one-dimensional piece of path in \\(R^n\\) or a two-dimensional surface in \\(R^3\\).\nTo fix notation, consider \\(\\int_a^b f(x)dx\\) and \\(\\int_a^b\\int_c^d g(x,y) dy dx\\). In defining both, a Riemann sum is involved, these involve a partition of \\([a,b]\\) or \\([a,b]\\times[c,d]\\) and terms like \\(f(c_i) \\Delta{x_i}\\) and \\(g(c_i, d_j) \\Delta{x_i}\\Delta{y_j}\\). The \\(\\Delta\\)s the diameter of an intervals \\(I_i\\) or \\(J_j\\). Consider now two parameterizations: \\(\\vec{r}(t)\\) for \\(t\\) in \\([a,b]\\) and \\(\\Phi(u,v)\\) for \\((u,v)\\) in \\([a,b]\\times[c,d]\\). One is a parameterization of a space curve, \\(\\vec{r}:R\\rightarrow R^n\\); the other a parameterization of a surface, \\(\\Phi:R^2 \\rightarrow R^3\\). The image of \\(I_i\\) or \\(I_i\\times{J_j}\\) under \\(\\vec{r}\\) and \\(\\Phi\\), respectively, will look almost linear if the intervals are small enough, so, at least on the microscopic level. A Riemann term can be based around this fact, provided it is understood how much the two parameterizations change the interval \\(I_i\\) or region \\(I_i\\times{J_j}\\).\nThis chapter will quantify this change, describing it in terms of associated vectors to \\(\\vec{r}\\) and \\(\\Phi\\), yielding formulas for an integral of a scalar function along a path or over a surface. Furthermore, these integrals will be generalized to give meaning to physically useful interactions between the path or surface and a vector field."
},
{
"objectID": "integral_vector_calculus/line_integrals.html#line-integrals",
"href": "integral_vector_calculus/line_integrals.html#line-integrals",
"title": "60  Line and Surface Integrals",
"section": "60.1 Line integrals",
"text": "60.1 Line integrals\nIn arc length a formula to give the arc-length of the graph of a univariate function or parameterized curve in \\(2\\) dimensions is given in terms of an integral. The intuitive approximation involved segments of the curve. To review, let \\(\\vec{r}(t)\\), \\(a \\leq t \\leq b\\), describe a curve, \\(C\\), in \\(R^n\\), \\(n \\geq 2\\). Partition \\([a,b]\\) into \\(a=t_0 < t_1 < \\cdots < t_{n-1} < t_n = b\\).\nConsider the path segment connecting \\(\\vec{r}(t_{i-1})\\) to \\(\\vec{r}(t_i)\\). If the partition of \\([a,b]\\) is microscopically small, this path will be approximated by \\(\\vec{r}(t_i) - \\vec{r}(t_{i-1})\\). This difference in turn is approximately \\(\\vec{r}'(t_i) (t_i - t_{i-1}) = \\vec{r}'(t_i) \\Delta{t}_i\\), provided \\(\\vec{r}\\) is differentiable.\nIf \\(f:R^n \\rightarrow R\\) is a scalar function, then, taking right-hand end points, we can consider the Riemann sum \\(\\sum (f\\circ\\vec{r})(t_i) \\|\\vec{r}'(t_i)\\| \\Delta{t}_i\\). For integrable functions, this sum converges to the line integral defined as a one-dimensional integral for a given parameterization:\n\\[\n\\int_a^b f(\\vec{r}(t)) \\| \\vec{r}'(t) \\| dt.\n\\]\nThe weight \\(\\| \\vec{r}'(t) \\|\\) can be interpreted as how much the parameterization stretches (or contracts) an interval \\([t_{i-1},t_i]\\) when mapped to its corresponding path segment.\n\nThe curve \\(C\\) can be parameterized many different ways by introducing a function \\(s(t)\\) to change the time. 
If we use the arc-length parameterization with \\(\\gamma(0) = a\\) and \\(\\gamma(l) = b\\), where \\(l\\) is the arc-length of \\(C\\), then we have by change of variables \\(t = \\gamma(s)\\) that\n\\[\n\\int_a^b f(\\vec{r}(t)) \\| \\vec{r}'(t) \\| dt =\n\\int_0^l (f \\circ \\vec{r} \\circ \\gamma)(s) \\| \\frac{d\\vec{r}}{dt}\\mid_{t = \\gamma(s)}\\| \\gamma'(s) ds.\n\\]\nBut, by the chain rule:\n\\[\n\\frac{d(\\vec{r} \\circ\\gamma)}{ds}(s) = \\frac{d\\vec{r}}{dt}\\mid_{t=\\gamma(s)} \\frac{d\\gamma}{ds}.\n\\]\nSince \\(\\gamma\\) is increasing, \\(\\gamma' \\geq 0\\), so we get:\n\\[\n\\int_a^b f(\\vec{r}(t)) \\| \\vec{r}'(t) \\| dt =\n\\int_0^l (f \\circ \\vec{r} \\circ \\gamma)(s) \\|\\frac{d(\\vec{r}\\circ\\gamma)}{ds}\\| ds =\n\\int_0^l (f \\circ \\vec{r} \\circ \\gamma)(s) ds.\n\\]\nThe last equality holds as the derivative is the unit tangent vector, \\(T\\), with norm \\(1\\).\nThis shows that the line integral is not dependent on the parameterization. The notation \\(\\int_C f ds\\) is used to represent the line integral of a scalar function, the \\(ds\\) emphasizing an implicit parameterization of \\(C\\) by arc-length. When \\(C\\) is a closed curve, the notation \\(\\oint_C f ds\\) is used to indicate that.\n\n60.1.1 Example\nWhen \\(f\\) is identically \\(1\\), the line integral returns the arc length. When \\(f\\) varies, then the line integral can be interpreted a few ways. 
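As a quick numeric aside, the claim that the line integral does not depend on the parameterization can be checked directly; the half circle, the integrand, and the reparameterization below are our own choices for illustration:

```julia
using QuadGK, LinearAlgebra, ForwardDiff

f(x, y) = x^2 + y                # an arbitrary scalar function, chosen for illustration
f(v) = f(v...)
r(t) = [cos(t), sin(t)]          # the half circle, 0 ≤ t ≤ π
γ(s) = π * s^2                   # an increasing reparameterization taking [0,1] onto [0,π]
q(s) = r(γ(s))                   # the same curve, traversed at a different rate

dr(t) = ForwardDiff.derivative(r, t)
dq(s) = ForwardDiff.derivative(q, s)

I1 = quadgk(t -> f(r(t)) * norm(dr(t)), 0, π)[1]
I2 = quadgk(s -> f(q(s)) * norm(dq(s)), 0, 1)[1]
I1 ≈ I2   # true; both give π/2 + 2
```

Swapping in any other increasing reparameterization leaves the value unchanged. 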
First, if \\(f \\geq 0\\) and we consider a sheet hung from the curve \\(f\\circ \\vec{r}\\) and cut to just touch the ground, the line integral gives the area of this sheet, in the same way an integral gives the area under a positive curve.\nIf the composition \\(f \\circ \\vec{r}\\) is viewed as a density of the arc (as though it were constructed out of some non-uniform material), then the line integral can be seen to return the mass of the arc.\nSuppose \\(\\rho(x,y,z) = 5 - z\\) gives the density of an arc where the arc is parameterized by \\(\\vec{r}(t) = \\langle \\cos(t), 0, \\sin(t) \\rangle\\), \\(0 \\leq t \\leq \\pi\\). (A half-circular arc.) Find the mass of the arc.\n\nrho(x,y,z) = 5 - z\nrho(v) = rho(v...)\nr(t) = [cos(t), 0, sin(t)]\n\n@syms t\nrp = diff.(r(t),t) # r'\narea = integrate((rho ∘ r)(t) * norm(rp), (t, 0, PI))\n\n \n\\[\n-2 + 5 \\pi\n\\]\n\n\n\nContinuing, we could find \\(M_z\\) by integrating \\(\\int_C z (\\rho\\circ \\vec{r}) \\|r'\\| dt\\):\n\nMz = integrate(r(t)[3] * (rho ∘ r)(t) * norm(rp), (t, 0, PI))\nMz\n\n \n\\[\n10 - \\frac{\\pi}{2}\n\\]\n\n\n\nFinally, we get the \\(z\\) coordinate of the center of mass by\n\nMz / area\n\n \n\\[\n\\frac{10 - \\frac{\\pi}{2}}{-2 + 5 \\pi}\n\\]\n\n\n\n\nExample\nLet \\(f(x,y,z) = x\\sin(y)\\cos(z)\\) and \\(C\\) the path described by \\(\\vec{r}(t) = \\langle t, t^2, t^3\\rangle\\) for \\(0 \\leq t \\leq \\pi\\). Find the line integral \\(\\int_C fds\\).\nWe find the numeric value with:\n\nf(x,y,z) = x*sin(y)*cos(z)\nf(v) = f(v...)\nr(t) = [t, t^2, t^3]\nintegrand(t) = (f ∘ r)(t) * norm(r'(t))\nquadgk(integrand, 0, pi)\n\n(-1.2230621144956229, 1.783298175794812e-8)\n\n\n\n\nExample\nImagine the \\(z\\) axis is a wire and in the \\(x\\)-\\(y\\) plane the unit circle is a path. If there is a magnetic field, \\(B\\), then the field will induce a current to flow along the wire. 
Amperes circuital law (https://tinyurl.com/y4gl9pgu) states \\(\\oint_C B\\cdot\\hat{T} ds = \\mu_0 I\\), where \\(\\mu_0\\) is a constant and \\(I\\) is the current. If the magnetic field is given by \\(B=(x^2+y^2)^{-1/2}\\langle -y,x,0\\rangle\\), compute \\(I\\) in terms of \\(\\mu_0\\).\nThe path is parameterized by \\(\\vec{r}(t) = \\langle \\cos(t), \\sin(t), 0\\rangle\\), and so \\(\\hat{T} = \\langle -\\sin(t), \\cos(t), 0\\rangle\\) and the integrand, \\(B\\cdot\\hat{T}\\), is\n\\[\n(x^2 + y^2)^{-1/2}\\langle -\\sin(t), \\cos(t), 0\\rangle\\cdot\n\\langle -\\sin(t), \\cos(t), 0\\rangle = (x^2 + y^2)^{-1/2},\n\\]\nwhich is \\(1\\) on the path \\(C\\). So \\(\\int_C B\\cdot\\hat{T} ds = \\int_C ds = 2\\pi\\). So the current satisfies \\(2\\pi = \\mu_0 I\\), so \\(I = (2\\pi)/\\mu_0\\).\n(Amperes law is more typically used to find \\(B\\) from a current, then \\(I\\) from \\(B\\), for special circumstances. The Biot-Savart law does this more generally.)\n\n\n\n60.1.2 Line integrals and vector fields; work and flow\nAs defined above, the line integral is defined for a scalar function, but this can be generalized. If \\(F:R^n \\rightarrow R^n\\) is a vector field, then each component is a scalar function, so the integral \\(\\int (F\\circ\\vec{r}) \\|\\vec{r}'\\| dt\\) can be defined component by component to yield a vector.\nHowever, it proves more interesting to define an integral incorporating how properties of the path interact with the vector field. The key is that \\(\\vec{r}'(t) dt = \\hat{T} \\| \\vec{r}'(t)\\|dt\\) describes both the magnitude of how the parameterization stretches an interval and a direction the path is taking. This direction allows interaction with the vector field.\nThe canonical example is work, which is a measure of a force times a distance. For an object following a path, the work done is still a force times a distance, but only that force in the direction of the motion is considered. 
(The constraint force keeping the object on the path does no work.) Mathematically, \\(\\hat{T}\\) describes the direction of motion along a path, so the work done in moving an object over a small segment of the path is \\((F\\cdot\\hat{T}) \\Delta{s}\\). Adding up incremental amounts of work leads to a Riemann sum for a line integral involving a vector field.\n\nThe work done in moving an object along a path \\(C\\) by a force field, \\(F\\), is given by the integral\n\\[\n\\int_C (F \\cdot \\hat{T}) ds = \\int_C F\\cdot d\\vec{r} = \\int_a^b ((F\\circ\\vec{r}) \\cdot \\frac{d\\vec{r}}{dt})(t) dt.\n\\]\n\n\nIn the \\(n=2\\) case, there is another useful interpretation of the line integral. In this dimension the normal vector, \\(\\hat{N}\\), is well defined in terms of the tangent vector, \\(\\hat{T}\\), through a rotation: \\(\\langle a,b\\rangle^t = \\langle b,-a\\rangle\\). (The negative, \\(\\langle -b,a\\rangle\\), is also a candidate; the difference in this choice would lead to a sign difference in the answer.) This allows the definition of a different line integral, called a flow integral, as detailed later:\n\nThe flow across a curve \\(C\\) is given by\n\\[\n\\int_C (F\\cdot\\hat{N}) ds = \\int_a^b (F \\circ \\vec{r})(t) \\cdot (\\vec{r}'(t))^t dt.\n\\]\n\n\n\n60.1.3 Examples\n\nExample\nLet \\(F(x,y,z) = \\langle x - y, x^2 - y^2, x^2 - z^2 \\rangle\\) and \\(\\vec{r}(t) = \\langle t, t^2, t^3 \\rangle\\). Find the work required to move an object along the curve described by \\(\\vec{r}\\) between \\(0\\) and \\(1\\).\n\nF(x,y,z) = [x-y, x^2 - y^2, x^2 - z^2]\nF(v) = F(v...)\nr(t) = [t, t^2, t^3]\n\n@syms t::real\nintegrate((F ∘ r)(t) ⋅ diff.(r(t), t), (t, 0, 1))\n\n \n\\[\n\\frac{3}{5}\n\\]\n\n\n\n\n\nExample\nLet \\(C\\) be a closed curve. For a closed curve, the work integral is also termed the circulation. 
For the vector field \\(F(x,y) = \\langle -y, x\\rangle\\) compute the circulation around the triangle with vertices \\((-1,0)\\), \\((1,0)\\), and \\((0,1)\\).\nWe have three integrals using \\(\\vec{r}_1(t) = \\langle -1+2t, 0\\rangle\\), \\(\\vec{r}_2(t) = \\langle 1-t, t\\rangle\\) and \\(\\vec{r}_3(t) = \\langle -t, 1-t \\rangle\\), all from \\(0\\) to \\(1\\). (Check that the parameterization is counterclockwise.)\nThe circulation then is:\n\nr1(t) = [-1 + 2t, 0]\nr2(t) = [1-t, t]\nr3(t) = [-t, 1-t]\nF(x,y) = [-y, x]\nF(v) = F(v...)\nintegrand(r) = t -> (F ∘ r)(t) ⋅ r'(t)\nC1 = quadgk(integrand(r1), 0, 1)[1]\nC2 = quadgk(integrand(r2), 0, 1)[1]\nC3 = quadgk(integrand(r3), 0, 1)[1]\nC1 + C2 + C3\n\n2.0\n\n\nThat this is non-zero reflects a feature of the vector field: in this case, the vector field spirals around the origin.\n\n\nExample\nLet \\(F\\) be the force of gravity exerted by a mass \\(M\\) on a mass \\(m\\) at position \\(\\vec{r}\\) relative to \\(M\\), that is \\(F(\\vec{r}) = -(GMm/\\|\\vec{r}\\|^2)\\hat{r}\\).\nLet \\(\\vec{r}(t) = \\langle 1-t, 0, t\\rangle\\), \\(0 \\leq t \\leq 1\\). For concreteness, we take \\(G M m\\) to be \\(10\\). Then the work to move the mass is given by:\n\nuvec(v) = v/norm(v) # unit vector\nGMm = 10\nFₘ(r) = - GMm /norm(r)^2 * uvec(r)\nrₘ(t) = [1-t, 0, t]\nquadgk(t -> (Fₘ ∘ rₘ)(t) ⋅ rₘ'(t), 0, 1)\n\n(0.0, 0.0)\n\n\nHmm, a value of \\(0\\). That's a bit surprising at first glance. Maybe it had something to do with the specific path chosen. To investigate, we follow a circular arc on the unit sphere instead of a straight line:\n\nrₒ(t) = [cos(t), 0, sin(t)]\nquadgk(t -> (Fₘ ∘ rₒ)(t) ⋅ rₒ'(t), 0, 1)\n\n(-1.2493163125924272e-17, 2.7429251495998208e-17)\n\n\nStill \\(0\\). 
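For good measure, a third path with the same endpoints, one of our own choosing that wanders out of the \\(x\\)-\\(z\\) plane, can be checked the same way:

```julia
using QuadGK, LinearAlgebra, ForwardDiff

uvec(v) = v / norm(v)                  # unit vector
GMm = 10
F(r) = -GMm / norm(r)^2 * uvec(r)      # the same gravitational field as above
r(t) = [1 - t, t*(1 - t), t]           # from (1,0,0) to (0,0,1), leaving the x-z plane
dr(t) = ForwardDiff.derivative(r, t)
quadgk(t -> F(r(t)) ⋅ dr(t), 0, 1)[1]  # ≈ 0, up to floating point error
```

The work is again \\(0\\). 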
We will see next that this is not surprising if something about \\(F\\) is known.\n\n\n\n\n\n\nNote\n\n\n\nThe Washington Post had an article by Richard Panek with the quote “Well, yes — depending on what we mean by attraction. Two bodies of mass don't actually exert some mysterious tugging on each other. Newton himself tried to avoid the word attraction for this very reason. All (!) he was trying to do was find the math to describe the motions both down here on Earth and up there among the planets (of which Earth, thanks to Copernicus and Kepler and Galileo, was one).” The point being that the formula above is a mathematical description of the force, but not an explanation of how the force actually is transferred.\n\n\n\n\nWork in a conservative vector field\nLet \\(f: R^n \\rightarrow R\\) be a scalar function. Its gradient, \\(\\nabla f\\), is a vector field. For a scalar function, we have by the chain rule:\n\\[\n\\frac{d(f \\circ \\vec{r})}{dt} = \\nabla{f}(\\vec{r}(t)) \\cdot \\frac{d\\vec{r}}{dt}.\n\\]\nIf we integrate, we see:\n\\[\nW = \\int_a^b \\nabla{f}(\\vec{r}(t)) \\cdot \\frac{d\\vec{r}}{dt} dt =\n\\int_a^b \\frac{d(f \\circ \\vec{r})}{dt} dt =\n(f\\circ\\vec{r})\\mid_{t = a}^b =\n(f\\circ\\vec{r})(b) - (f\\circ\\vec{r})(a),\n\\]\nusing the Fundamental Theorem of Calculus.\nThe main point above is that if the vector field is the gradient of a scalar field, then the work done depends only on the endpoints of the path and not the path itself.\n\nConservative vector field: If \\(F\\) is a vector field defined in an open region \\(R\\), \\(A\\) and \\(B\\) are points in \\(R\\), and if for any curve \\(C\\) in \\(R\\) connecting \\(A\\) to \\(B\\) the line integral of \\(F \\cdot \\vec{T}\\) over \\(C\\) depends only on the endpoints \\(A\\) and \\(B\\) and not the path, then the line integral is called path independent and the field is called a conservative field.\n\nThe force of gravity is the gradient of a scalar field. 
As such, the two integrals above which yield \\(0\\) could have been computed more directly. The particular scalar field is \\(f = GMm/\\|\\vec{r}\\|\\); its negative goes by the name of the gravitational potential function. As seen, \\(f\\) depends only on the magnitude \\(\\|\\vec{r}\\|\\), and as the endpoints of the path in the example have the same distance to the origin, the work integral, \\((f\\circ\\vec{r})(b) - (f\\circ\\vec{r})(a)\\), will be \\(0\\).\n\nExample\nCoulomb's law states that the electrostatic force between two charged particles is proportional to the product of their charges and inversely proportional to the square of the distance between the two particles. That is,\n\\[\nF = k\\frac{ q q_0}{\\|\\vec{r}\\|^2}\\frac{\\vec{r}}{\\|\\vec{r}\\|}.\n\\]\nThis is similar to the gravitational force and is a conservative force. We saw that a line integral for work in a conservative force field depends only on the endpoints. Verify that, for a closed loop, the work integral will yield \\(0\\).\nTake as a closed loop the unit circle, parameterized by arc-length by \\(\\vec{r}(t) = \\langle \\cos(t), \\sin(t)\\rangle\\). The unit tangent will be \\(\\hat{T} = \\vec{r}'(t) = \\langle -\\sin(t), \\cos(t) \\rangle\\). The work to move a particle of charge \\(q_0\\) about a particle of charge \\(q\\) at the origin around the unit circle would be computed through:\n\n@syms k q q0 t\nF(r) = k*q*q0 * r / norm(r)^3\nr(t) = [cos(t), sin(t)]\nT(r) = [-r[2], r[1]]\nW = integrate(F(r(t)) ⋅ T(r(t)), (t, 0, 2PI))\n\n \n\\[\n0\n\\]\n\n\n\n\n\n\n\n60.1.4 Closed curves and regions\nThere are technical assumptions about curves and regions that are necessary for some statements to be made:\n\nLet \\(C\\) be a Jordan curve - a non-self-intersecting continuous loop in the plane. Such a curve divides the plane into two regions, one bounded and one unbounded. The normal to a Jordan curve is assumed to be in the direction of the unbounded part.\nFurther, we will assume that our curves are piecewise smooth. 
That is, they are comprised of finitely many smooth pieces, continuously connected.\nThe region enclosed by a closed curve has an interior, \\(D\\), which we assume is an open set (one for which every point in \\(D\\) has some “ball” about it entirely within \\(D\\) as well.)\nThe region \\(D\\) is connected, meaning between any two points there is a continuous path in \\(D\\) between the two points.\nThe region \\(D\\) is simply connected. This means it has no “holes.” Technically, any path in \\(D\\) can be contracted to a point. Connected means one piece; simply connected means no holes.\n\n\n\n60.1.5 The fundamental theorem of line integrals\nThe fact that work in a potential field is path independent is a consequence of the Fundamental Theorem of Line Integrals:\n\nLet \\(U\\) be an open subset of \\(R^n\\), \\(f: U \\rightarrow R\\) a differentiable function and \\(\\vec{r}: R \\rightarrow R^n\\) a differentiable function such that the path \\(C = \\vec{r}(t)\\), \\(a\\leq t\\leq b\\) is contained in \\(U\\). Then\n\\[\n\\int_C \\nabla{f} \\cdot d\\vec{r} =\n\\int_a^b \\nabla{f}(\\vec{r}(t)) \\cdot \\vec{r}'(t) dt =\nf(\\vec{r}(b)) - f(\\vec{r}(a)).\n\\]\n\nThat is, a line integral through a gradient field can be evaluated by evaluating the original scalar field at the endpoints of the curve. In other words, line integrals through gradient fields are conservative.\nAre conservative fields gradient fields? The answer is yes.\nAssume \\(U\\) is an open region in \\(R^n\\) and \\(F\\) is a continuous and conservative vector field in \\(U\\).\nLet \\(a\\) in \\(U\\) be some fixed point. For \\(\\vec{x}\\) in \\(U\\), define:\n\\[\n\\phi(\\vec{x}) = \\int_{\\vec\\gamma[a,\\vec{x}]} F \\cdot \\frac{d\\vec\\gamma}{dt}dt,\n\\]\nwhere \\(\\vec\\gamma\\) is any differentiable path in \\(U\\) connecting \\(a\\) to \\(\\vec{x}\\) (as a point in \\(U\\)). 
The function \\(\\phi\\) is uniquely defined, as the integral only depends on the endpoints, not the choice of path.\nIt is shown that the directional derivative \\(\\nabla{\\phi} \\cdot \\vec{v}\\) is equal to \\(F \\cdot \\vec{v}\\) by showing\n\\[\n\\lim_{t \\rightarrow 0}\\frac{\\phi(\\vec{x} + t\\vec{v}) - \\phi(\\vec{x})}{t}\n= \\lim_{t \\rightarrow 0} \\frac{1}{t} \\int_{\\vec\\gamma[\\vec{x},\\vec{x}+t\\vec{v}]} F \\cdot \\frac{d\\vec\\gamma}{dt}dt\n= F(\\vec{x}) \\cdot \\vec{v}.\n\\]\nThis is so for all \\(\\vec{v}\\), so in particular for the coordinate vectors. So \\(\\nabla\\phi = F\\).\n\nExample\nLet \\(Radial(x,y) = \\langle x, y\\rangle\\). This is a conservative field. Show the work integral over the half circle in the upper half plane is the same as the work integral over the \\(x\\) axis connecting \\(-1\\) to \\(1\\).\nWe have:\n\nRadial(x,y) = [x,y]\nRadial(v) = Radial(v...)\n\nr₁(t) = [-1 + t, 0]\nquadgk(t -> Radial(r₁(t)) ⋅ r₁'(t), 0, 2)\n\n(0.0, 0.0)\n\n\nCompared to\n\nr₂(t) = [-cos(t), sin(t)]\nquadgk(t -> Radial(r₂(t)) ⋅ r₂'(t), 0, pi)\n\n(0.0, 0.0)\n\n\n\n\nExample\n\nNot all vector fields are conservative. How can a vector field in \\(U\\) be identified as conservative? For now, this would require either finding a scalar potential or showing all line integrals are path independent.\nIn dimension \\(2\\) there is an easy-to-check method assuming \\(U\\) is simply connected: If \\(F=\\langle F_x, F_y\\rangle\\) is continuously differentiable in a simply connected region and \\(\\partial{F_y}/\\partial{x} - \\partial{F_x}/\\partial{y} = 0\\), then \\(F\\) is conservative. A similar statement is available in dimension \\(3\\). 
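This two-dimensional test is easy to carry out symbolically; the following small sketch of our own applies it to the conservative Radial field above and to the rotational field \\(\\langle -y, x\\rangle\\) seen earlier:

```julia
using SymPy
@syms x::real y::real

# the test quantity; zero on a simply connected region implies conservative
test2d(Fx, Fy) = diff(Fy, x) - diff(Fx, y)

test2d(x, y)      # the Radial field ⟨x, y⟩: returns 0
test2d(-y, x)     # the rotational field ⟨-y, x⟩: returns 2, so not conservative
```

The nonzero value for the rotational field is consistent with its nonzero circulation computed earlier. 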
The reasoning behind this will come from the upcoming Green's theorem.\n\n\n\n60.1.6 Flow across a curve\nThe flow integral in the \\(n=2\\) case was\n\\[\n\\int_C (F\\cdot\\hat{N}) ds = \\int_a^b (F \\circ \\vec{r})(t) \\cdot (\\vec{r}'(t))^{t} dt,\n\\]\nwhere \\(\\langle a,b\\rangle^t = \\langle b, -a\\rangle\\).\nFor a given section of \\(C\\), the vector field breaks down into a tangential and normal component. The tangential component moves along the curve and so doesn't contribute to any flow across the curve; only the normal component will contribute. Hence the \\(F\\cdot\\hat{N}\\) integrand. The following figure indicates the flow of a vector field by horizontal lines, the closeness of the lines representing strength, though these are all evenly spaced. The two line segments have equal length, but one captures more flow than the other, as its normal vector is more parallel to the flow lines:\n\n\n\n\n\nThe flow integral is typically computed for a closed (Jordan) curve, measuring the total flow out of a region. In this case, the integral is written \\(\\oint_C (F\\cdot\\hat{N})ds\\).\n\n\n\n\n\n\nNote\n\n\n\nFor a Jordan curve, the positive orientation of the curve is such that the normal direction (proportional to \\(\\hat{T}'\\)) points away from the bounded interior. For a non-closed path, the choice of parameterization will determine the normal and the integral for flow across a curve is dependent, up to its sign, on this choice.\n\n\n\nExample\nThe New York Times showed aerial photos to estimate the number of protest marchers in Hong Kong. This is a more precise way to estimate crowd size, but requires a drone or some such to take photos. If one is on the ground, the number of marchers could be estimated by finding the flow of marchers across a given width. In the Times article, we see “Protestors packed the width of Hennessy Road for more than 5 hours.” 
If this road is 40 meters wide and the rate of the marchers is 3 kilometers per hour, estimate the number of marchers.\nThe basic idea is to compute the rate of flow across a part of the street and then multiply by time. For computational purposes, say the marchers are on a grid of 1 meter (that is, in a 40m wide street, there is room for 40 marchers at a time). In one minute, the marchers move 50 meters:\n\n3000/60\n\n50.0\n\n\nThis means the rate of marchers per minute is 40 * 50. If this is steady over 5 hours, this simple count gives:\n\n40 * 50 * 5 * 60\n\n600000\n\n\nThis is short of the estimated 2 million marchers, but useful for a rough estimate. The point is that from rates of flow, which can be calculated locally, amounts over bigger scales can be computed. The word “across” is used, as only the direction across the part of the street counts in the computation. Were the marchers in total unison and then told to take a step to the left and a step to the right, they would have motion, but since it wasn't across the line in the road (rather along the line) there would be no contribution to the count. The dot product with the normal vector formalizes this.\n\n\nExample\nLet a path \\(C\\) be parameterized by \\(\\vec{r}(t) = \\langle \\cos(t), 2\\sin(t)\\rangle\\), \\(0 \\leq t \\leq \\pi/2\\) and \\(F(x,y) = \\langle \\cos(x), \\sin(xy)\\rangle\\). Compute the flow across \\(C\\).\nWe have\n\nr(t) = [cos(t), 2sin(t)]\nF(x,y) = [cos(x), sin(x*y)]\nF(v) = F(v...)\nnormal(a,b) = [b, -a]\nG(t) = (F ∘ r)(t) ⋅ normal(r(t)...)\na, b = 0, pi/2\nquadgk(G, a, b)[1]\n\n1.0894497472261733\n\n\n\n\nExample\nLet \\(F(x,y) = \\langle -y, x\\rangle\\) be a vector field. (It represents a rotational flow.) What is the flow across the unit circle?\n\n@syms t::real\nF(x,y) = [-y,x]\nF(v) = F(v...)\nr(t) = [cos(t),sin(t)]\nT(t) = diff.(r(t), t)\nnormal(a,b) = [b,-a]\nintegrate((F ∘ r)(t) ⋅ normal(T(t)...) 
, (t, 0, 2PI))\n\n \n\\[\n0\n\\]\n\n\n\n\n\nExample\nLet \\(F(x,y) = \\langle x,y\\rangle\\) be a vector field. (It represents a source.) What is the flow across the unit circle?\n\n@syms t::real\nF(x,y) = [x, y]\nF(v) = F(v...)\nr(t) = [cos(t),sin(t)]\nT(t) = diff.(r(t), t)\nnormal(a,b) = [b,-a]\nintegrate((F ∘ r)(t) ⋅ normal(T(t)...) , (t, 0, 2PI))\n\n \n\\[\n2 \\pi\n\\]\n\n\n\n\n\nExample\nLet \\(F(x,y) = \\langle x, y\\rangle / \\| \\langle x, y\\rangle\\|^2\\):\n\nF₁(x,y) = [x,y] / norm([x,y])^2\nF₁(v) = F₁(v...)\n\nF₁ (generic function with 2 methods)\n\n\nConsider \\(C\\) to be the square with vertices at \\((-1,-1)\\), \\((1,-1)\\), \\((1,1)\\), and \\((-1, 1)\\). What is the flow across \\(C\\) for this vector field? The region has simple outward pointing unit normals, these being \\(\\pm\\hat{i}\\) and \\(\\pm\\hat{j}\\), the unit vectors in the \\(x\\) and \\(y\\) direction. The integral can be computed in 4 parts. The first (along the bottom):\n\n@syms s::real\n\nr(s) = [-1 + s, -1]\nn = [0,-1]\nA1 = integrate(F₁(r(s)) ⋅ n, (s, 0, 2))\n\n# The other three sides are related as each parameterization and normal is similar:\n\nr(s) = [1, -1 + s]\nn = [1, 0]\nA2 = integrate(F₁(r(s)) ⋅ n, (s, 0, 2))\n\n\nr(s) = [1 - s, 1]\nn = [0, 1]\nA3 = integrate(F₁(r(s)) ⋅ n, (s, 0, 2))\n\n\nr(s) = [-1, 1-s]\nn = [-1, 0]\nA4 = integrate(F₁(r(s)) ⋅ n, (s, 0, 2))\n\nA1 + A2 + A3 + A4\n\n \n\\[\n2 \\pi\n\\]\n\n\n\nAs could have been anticipated by symmetry, the answer is simply 4A1 or \\(2\\pi\\). What likely is not anticipated is that this integral will be the same as that found by integrating over the unit circle (an easier integral):\n\n@syms t::real\nr(t) = [cos(t), sin(t)]\nN(t) = r(t)\nintegrate(F₁(r(t)) ⋅ N(t), (t, 0, 2PI))\n\n \n\\[\n2 \\pi\n\\]\n\n\n\nThis equivalence is a consequence of the upcoming Green's theorem, as the vector field satisfies a particular equation."
},
{
"objectID": "integral_vector_calculus/line_integrals.html#surface-integrals",
"href": "integral_vector_calculus/line_integrals.html#surface-integrals",
"title": "60  Line and Surface Integrals",
"section": "60.2 Surface integrals",
"text": "60.2 Surface integrals\n\n\n\nThe Anish Kapoor sculpture Cloud Gate maps the Cartesian grid formed by its concrete resting pad onto a curved surface showing the local distortions. Knowing the areas of the reflected grid after distortion would allow the computation of the surface area of the sculpture through addition. (Wikipedia)\n\n\nWe next turn attention to a generalization of line integrals to surface integrals. Surfaces were described in one of three ways: directly through a function as \\(z=f(x,y)\\), as a level curve through \\(f(x,y,z) = c\\), and parameterized through a function \\(\\Phi: R^2 \\rightarrow R^3\\). The level curve description is locally a function description, and the function description leads to a parameterization (\\(\\Phi(u,v) = \\langle u,v,f(u,v)\\rangle\\)), so we restrict to the parameterized case.\nConsider the figure of the surface described by \\(\\Phi(u,v) = \\langle u,v,f(u,v)\\rangle\\):\n\n\n\n\n\nThe partitioning of the \\(u-v\\) plane into a grid lends itself to a partitioning of the surface. To compute the total surface area of the surface, it would be natural to begin by approximating the area of each cell of this partition and adding. As with other sums, we would expect that as the cells got smaller in diameter, the sum would approach an integral, in this case an integral yielding the surface area.\nConsider a single cell:\n\n\n\n\n\nThe figure shows that a cell on the grid in the \\(u-v\\) plane of area \\(\\Delta{u}\\Delta{v}\\) maps to a cell of the partition with surface area \\(\\Delta{S}\\), which can be approximated by a part of the tangent plane described by two vectors \\(\\vec{v}_1 = \\partial{\\Phi}/\\partial{u}\\) and \\(\\vec{v}_2 = \\partial{\\Phi}/\\partial{v}\\). 
These two vectors have a cross product which a) points in the direction of the normal vector, and b) has magnitude yielding the approximation \\(\\Delta{S} \\approx \\|\\vec{v}_1 \\times \\vec{v}_2\\|\\Delta{u}\\Delta{v}\\).\nIf we were to integrate the function \\(G(x,y, z)\\) over the surface \\(S\\), then an approximating Riemann sum could be produced by \\(G(c) \\| \\vec{v}_1 \\times \\vec{v}_2\\| \\Delta u \\Delta v\\), for some point \\(c\\) on the surface.\nIn the limit, a definition of an integral over a surface \\(S\\) in \\(R^3\\) is found by a two-dimensional integral over \\(R\\) in \\(R^2\\):\n\\[\n\\int_S G(x,y,z) dS = \\int_R G(\\Phi(u,v))\n\\| \\frac{\\partial{\\Phi}}{\\partial{u}} \\times \\frac{\\partial{\\Phi}}{\\partial{v}} \\| du dv.\n\\]\nIn the case that the surface is described by \\(z = f(x,y)\\), then the formulas become \\(\\vec{v}_1 = \\langle 1,0,\\partial{f}/\\partial{x}\\rangle\\) and \\(\\vec{v}_2 = \\langle 0, 1, \\partial{f}/\\partial{y}\\rangle\\) with cross product \\(\\vec{v}_1\\times\\vec{v}_2 =\\langle -\\partial{f}/\\partial{x}, -\\partial{f}/\\partial{y},1\\rangle\\).\nThe value \\(\\| \\frac{\\partial{\\Phi}}{\\partial{u}} \\times \\frac{\\partial{\\Phi}}{\\partial{v}} \\|\\) is called the surface element. As seen, it is the scaling between a unit area in the \\(u-v\\) plane and the approximating area on the surface after the parameterization.\n\n60.2.1 Examples\nLet us see that the formula holds for some cases where the answer is known by other means.\n\nA cone\nThe surface area of a cone is a known quantity. 
In cylindrical coordinates, the cone may be described by \\(z = a - br\\), so the parameterization \\((r, \\theta) \\rightarrow \\langle r\\cos(\\theta), r\\sin(\\theta), a - br \\rangle\\) maps \\(T = [0, a/b] \\times [0, 2\\pi]\\) onto the surface (less the bottom).\nThe surface element is the norm of the cross product of \\(\\langle \\cos(\\theta), \\sin(\\theta), -b\\rangle\\) and \\(\\langle -r\\sin(\\theta), r\\cos(\\theta), 0\\rangle\\), which is:\n\n@syms 𝑹::positive θ::positive 𝒂::positive 𝒃::positive\n𝒏 = [cos(θ), sin(θ), -𝒃] × [-𝑹*sin(θ), 𝑹*cos(θ), 0]\n𝒔𝒆 = simplify(norm(𝒏))\n\n \n\\[\n\\sqrt{𝒃^{2} \\left|{𝑹 \\sin{\\left(θ \\right)}}\\right|^{2} + 𝒃^{2} \\left|{𝑹 \\cos{\\left(θ \\right)}}\\right|^{2} + \\left|{𝑹}\\right|^{2}}\n\\]\n\n\n\n(To do this computationally, one might compute:\n\nPhi(r, theta) = [r*cos(theta), r*sin(theta), 𝒂 - 𝒃*r]\nPhi(𝑹, θ).jacobian([𝑹, θ])\n\n3×2 Matrix{Sym}:\n cos(θ) -𝑹⋅sin(θ)\n sin(θ) 𝑹⋅cos(θ)\n -𝒃 0\n\n\nand from here pull out the two vectors to take a cross product.)\nThe surface area is then found by integrating \\(G(\\vec{x}) = 1\\):\n\nintegrate(1 * 𝒔𝒆, (𝑹, 0, 𝒂/𝒃), (θ, 0, 2PI))\n\n \n\\[\n\\frac{\\pi 𝒂^{2} \\sqrt{𝒃^{2} + 1}}{𝒃^{2}}\n\\]\n\n\n\nA formula from a quick Google search is \\(A = \\pi r(r + \\sqrt{h^2 + r^2})\\). Does this match up?\n\n𝑹 = 𝒂/𝒃; 𝒉 = 𝒂\npi * 𝑹 * (𝑹 + sqrt(𝑹^2 + 𝒉^2)) |> simplify\n\n \n\\[\n\\frac{\\pi 𝒂^{2} \\left(\\sqrt{𝒃^{2} + 1} + 1\\right)}{𝒃^{2}}\n\\]\n\n\n\nNope, off by a summand of \\(\\pi(a/b)^2 = \\pi r^2\\), which may be recognized as the area of the base, which we did not compute, but which the Google search did. So yes, the formulas do agree.\n\n\nExample\nThe sphere has known surface area \\(4\\pi r^2\\). Let's see if we can compute this. 
With the parameterization from spherical coordinates \\((\\theta, \\phi) \\rightarrow \\langle r\\sin\\phi\\cos\\theta, r\\sin\\phi\\sin\\theta,r\\cos\\phi\\rangle\\), we have, approaching this numerically:\n\nRad = 1\nPhi(theta, phi) = Rad * [sin(phi)*cos(theta), sin(phi)*sin(theta), cos(phi)]\nPhi(v) = Phi(v...)\n\nfunction surface_element(pt)\n Jac = ForwardDiff.jacobian(Phi, pt)\n v1, v2 = Jac[:,1], Jac[:,2]\n norm(v1 × v2)\nend\nout = hcubature(surface_element, (0, 0), (2pi, 1pi))\nout[1] - 4pi*Rad^2 # *basically* zero\n\n8.15347789284715e-13\n\n\n\n\nExample\nIn Surface area the following formula is given for the surface area of a surface of revolution about the \\(x\\) axis described by \\(r=f(x)\\):\n\\[\n\\int_a^b 2\\pi f(x) \\cdot \\sqrt{1 + f'(x)^2} dx.\n\\]\nConsider the transformation \\((x, \\theta) \\rightarrow \\langle x, f(x)\\cos(\\theta), f(x)\\sin(\\theta)\\rangle\\). This maps the region \\([a,b] \\times [0, 2\\pi]\\) onto the surface of revolution. As such, the surface element would be:\n\n@syms 𝒇()::positive x::real theta::real\n\nPhi(x, theta) = [x, 𝒇(x)*cos(theta), 𝒇(x)*sin(theta)]\nJac = Phi(x, theta).jacobian([x, theta])\nv1, v2 = Jac[:,1], Jac[:,2]\nse = norm(v1 × v2)\nse .|> simplify\n\n \n\\[\n\\sqrt{\\left|{\\frac{d}{d x} 𝒇{\\left(x \\right)}}\\right|^{2} + 1} 𝒇{\\left(x \\right)}\n\\]\n\n\n\nThis is in agreement with the previous formula.\n\n\nExample\nConsider the upper half sphere, \\(S\\). 
Compute \\(\\int_S z dS\\).\nWere the half sphere made of a thin uniform material, this would be computed to find the \\(z\\) direction of the centroid.\nWe use spherical coordinates to parameterize:\n\\[\n\\Phi(\\theta, \\phi) = \\langle \\cos(\\phi)\\cos(\\theta), \\cos(\\phi)\\sin(\\theta), \\sin(\\phi) \\rangle\n\\]\nThe Jacobian and surface element are computed and then the integral is performed:\n\n@syms theta::real phi::real\nPhi(theta, phi) = [cos(phi)*cos(theta), cos(phi)*sin(theta), sin(phi)]\nJac = Phi(theta,phi).jacobian([theta, phi])\n\nv1, v2 = Jac[:,1], Jac[:,2]\nSurfElement = norm(v1 × v2) |> simplify\n\nz = sin(phi)\nintegrate(z * SurfElement, (theta, 0, 2PI), (phi, 0, PI/2))\n\n \n\\[\n\\pi\n\\]\n\n\n\n\n\n\n60.2.2 Orientation\nA smooth surface \\(S\\) is orientable if it is possible to define a unit normal vector, \\(\\vec{N}\\), that varies continuously with position. For example, a sphere has a normal vector that does this. On the other hand, a Mobius strip does not, as a normal vector, when moved around the surface, may be reversed upon returning to its starting point. For a closed, orientable smooth surface there are two possible choices for a normal, and convention chooses the one that points away from the contained region, such as the outward pointing normal for the sphere or torus.\n\n\n60.2.3 Surface integrals in vector fields\nBeyond finding surface area, surface integrals can also compute interesting physical phenomena. These are often associated with a vector field (in this case a function \\(\\vec{F}: R^3 \\rightarrow R^3\\)), and the typical case is the flux through a surface defined locally by \\(\\vec{F} \\cdot \\hat{N}\\), that is, the projection of the field onto the unit normal vector.\nConsider the flow of water through an opening in a time period \\(\\Delta t\\). 
The amount of water mass to flow through would be the area of the opening times the velocity of the flow perpendicular to the surface times the density times the time period; symbolically: \\(dS \\cdot ((\\rho \\vec{v}) \\cdot \\vec{N}) \\cdot \\Delta t\\). Dividing by \\(\\Delta t\\) gives a rate of flow as \\(((\\rho \\vec{v}) \\cdot \\vec{N}) dS\\). With \\(F = \\rho \\vec{v}\\), the flux integral can be seen as the rate of flow through a surface.\nTo find the normal for a surface element arising from a parameterization \\(\\Phi\\), we have the two partial derivatives \\(\\vec{v}_1=\\partial{\\Phi}/\\partial{u}\\) and \\(\\vec{v}_2 = \\partial{\\Phi}/\\partial{v}\\), the two column vectors of the Jacobian matrix of \\(\\Phi(u,v)\\). These describe the tangent plane and, moreover, their cross product will be a) normal to the tangent plane and b) have magnitude yielding the surface element of the transformation.\nFrom this, for a given parameterization, \\(\\Phi(u,v):T \\rightarrow S\\), the following formula is suggested for orientable surfaces:\n\\[\n\\int_S \\vec{F} \\cdot \\hat{N} dS =\n\\int_T \\vec{F}(\\Phi(u,v)) \\cdot\n(\\frac{\\partial{\\Phi}}{\\partial{u}} \\times \\frac{\\partial{\\Phi}}{\\partial{v}})\ndu dv.\n\\]\nWhen the surface is described by a function, \\(z=f(x,y)\\), the parameterization is \\((u,v) \\rightarrow \\langle u, v, f(u,v)\\rangle\\), the two vectors are \\(\\vec{v}_1 = \\langle 1, 0, \\partial{f}/\\partial{u}\\rangle\\) and \\(\\vec{v}_2 = \\langle 0, 1, \\partial{f}/\\partial{v}\\rangle\\), and their cross product is \\(\\vec{v}_1\\times\\vec{v}_2=\\langle -\\partial{f}/\\partial{u}, -\\partial{f}/\\partial{v}, 1\\rangle\\).\n\nExample\nSuppose a vector field \\(F(x,y,z) = \\langle 0, y, -z \\rangle\\) is given. Let \\(S\\) be the surface of the paraboloid \\(y = x^2 + z^2\\) between \\(y=0\\) and \\(y=4\\). 
Compute the surface integral \\(\\int_S F\\cdot \\hat{N} dS\\).\nThis is a surface of revolution about the \\(y\\) axis, so a parameterization is \\(\\Phi(y,\\theta) = \\langle \\sqrt{y} \\cos(\\theta), y, \\sqrt{y}\\sin(\\theta) \\rangle\\). The surface normal is given by:\n\n@syms y::positive theta::positive\nPhi(y,theta) = [sqrt(y)*cos(theta), y, sqrt(y)*sin(theta)]\nJac = Phi(y, theta).jacobian([y, theta])\nv1, v2 = Jac[:,1], Jac[:,2]\nNormal = v1 × v2\n\n# With this, the surface integral becomes:\n\nF(x,y,z) = [0, y, -z]\nF(v) = F(v...)\nintegrate(F(Phi(y,theta)) ⋅ Normal, (theta, 0, 2PI), (y, 0, 4))\n\n \n\\[\n- 16 \\pi\n\\]\n\n\n\n\n\nExample\nLet \\(S\\) be the closed surface bounded by the cylinder \\(x^2 + y^2 = 1\\), the plane \\(z=0\\), and the plane \\(z = 1+x\\). Let \\(F(x,y,z) = \\langle 1, y, z \\rangle\\). Compute \\(\\oint_S F\\cdot\\vec{N} dS\\).\n\n𝐅(x,y,z) = [1, y, z]\n𝐅(v) = 𝐅(v...)\n\n𝐅 (generic function with 2 methods)\n\n\nThe surface has three faces, with different outward pointing normals for each. 
Let \\(S_1\\) be the unit disk in the \\(x-y\\) plane with normal \\(-\\hat{k}\\); \\(S_2\\) be the top part, with normal \\(\\langle -1, 0, 1\\rangle\\) (as the plane is \\(-1x + 0y + 1z = 1\\)); and \\(S_3\\) be the cylindrical part with outward pointing normal \\(\\vec{r}\\).\nIntegrating over \\(S_1\\), we have the parameterization \\(\\Phi(r,\\theta) = \\langle r\\cos(\\theta), r\\sin(\\theta), 0\\rangle\\):\n\n@syms 𝐑::positive 𝐭heta::positive\n𝐏hi₁(r,theta) = [r*cos(theta), r*sin(theta), 0]\n𝐉ac₁ = 𝐏hi₁(𝐑, 𝐭heta).jacobian([𝐑, 𝐭heta])\n𝐯₁, 𝐰₁ = 𝐉ac₁[:,1], 𝐉ac₁[:,2]\n𝐍ormal₁ = 𝐯₁ × 𝐰₁ .|> simplify\n\n3-element Vector{Sym}:\n 0\n 0\n 𝐑\n\n\n\nA₁ = integrate(𝐅(𝐏hi₁(𝐑, 𝐭heta)) ⋅ (-𝐍ormal₁), (𝐭heta, 0, 2PI), (𝐑, 0, 1)) # use -Normal for outward pointing\n\n \n\\[\n0\n\\]\n\n\n\nIntegrating over \\(S_2\\) we use the parameterization \\(\\Phi(r, \\theta) = \\langle r\\cos(\\theta), r\\sin(\\theta), 1 + r\\cos(\\theta)\\rangle\\).\n\n𝐏hi₂(r, theta) = [r*cos(theta), r*sin(theta), 1 + r*cos(theta)]\n𝐉ac₂ = 𝐏hi₂(𝐑, 𝐭heta).jacobian([𝐑, 𝐭heta])\n𝐯₂, 𝐰₂ = 𝐉ac₂[:,1], 𝐉ac₂[:,2]\n𝐍ormal₂ = 𝐯₂ × 𝐰₂ .|> simplify # has correct orientation\n\n3-element Vector{Sym}:\n -𝐑\n 0\n 𝐑\n\n\nWith this, the contribution for \\(S_2\\) is:\n\nA₂ = integrate(𝐅(𝐏hi₂(𝐑, 𝐭heta)) ⋅ (𝐍ormal₂), (𝐭heta, 0, 2PI), (𝐑, 0, 1))\n\n \n\\[\n0\n\\]\n\n\n\nFinally for \\(S_3\\), the parameterization used is \\(\\Phi(z, \\theta) = \\langle \\cos(\\theta), \\sin(\\theta), z\\rangle\\), but this is over a non-rectangular region, as \\(z\\) is between \\(0\\) and \\(1 + x\\).\nThis parameterization gives a normal computed through:\n\n@syms 𝐳::positive\n𝐏hi₃(z, theta) = [cos(theta), sin(theta), 𝐳]\n𝐉ac₃ = 𝐏hi₃(𝐳, 𝐭heta).jacobian([𝐳, 𝐭heta])\n𝐯₃, 𝐰₃ = 𝐉ac₃[:,1], 𝐉ac₃[:,2]\n𝐍ormal₃ = 𝐯₃ × 𝐰₃ .|> simplify # wrong orientation, so we change sign below\n\n3-element Vector{Sym}:\n -cos(𝐭heta)\n -sin(𝐭heta)\n 0\n\n\nThe contribution is\n\nA₃ = integrate(𝐅(𝐏hi₃(𝐑, 𝐭heta)) ⋅ (-𝐍ormal₃), (𝐳, 0, 1 + cos(𝐭heta)), 
(𝐭heta, 0, 2PI))\n\n \n\\[\n2 \\pi\n\\]\n\n\n\nIn total, the surface integral is\n\nA₁ + A₂ + A₃\n\n \n\\[\n2 \\pi\n\\]\n\n\n\n\n\nExample\nTwo point charges with charges \\(q\\) and \\(q_0\\) will exert an electrostatic force of attraction or repulsion according to Coulomb's law. The Coulomb force is \\(kqq_0\\vec{r}/\\|\\vec{r}\\|^3\\). This force is proportional to the product of the charges, \\(qq_0\\), and inversely proportional to the square of the distance between them.\nThe electric field is the vector field generated by the force on a unit test charge, and is given by \\(E = kq\\vec{r}/\\|\\vec{r}\\|^3\\).\nLet \\(S\\) be the unit sphere \\(\\|\\vec{r}\\|^2 = 1\\). Compute the surface integral of the electric field over the closed surface, \\(S\\).\nWe have (using \\(\\oint\\) for a surface integral over a closed surface):\n\\[\n\\oint_S E \\cdot \\vec{N} dS =\n\\oint_S \\frac{kq}{\\|\\vec{r}\\|^2} \\hat{r} \\cdot \\hat{r} dS =\n\\oint_S \\frac{kq}{\\|\\vec{r}\\|^2} dS =\nkq \\cdot SA(S) =\n4\\pi k q\n\\]\nNow consider the electric field generated by a point charge within the unit sphere, but not at the origin. The integral now will not fall into place by symmetry considerations, so we will approach the problem numerically.\n\nE(r) = (1/norm(r)^2) * uvec(r) # kq = 1\n\nPhiₑ(theta, phi) = 1*[sin(phi)*cos(theta), sin(phi) * sin(theta), cos(phi)]\nPhiₑ(r) = Phiₑ(r...)\n\nnormal(r) = Phiₑ(r)/norm(Phiₑ(r))\n\nfunction SE(r)\n Jac = ForwardDiff.jacobian(Phiₑ, r)\n v1, v2 = Jac[:,1], Jac[:,2]\n v1 × v2\nend\n\na = rand() * Phiₑ(2pi*rand(), pi*rand())\nA1 = hcubature(r -> E(Phiₑ(r)-a) ⋅ normal(r) * norm(SE(r)), (0.0,0.0), (2pi, 1pi))\nA1[1]\n\n12.566370613450168\n\n\nThe answer is \\(4\\pi\\), regardless of the choice of a, as long as it is inside the surface. (We see above some fussiness in the limits of integration. 
HCubature does some conversion of the limits, but does not currently do well with mixed types, so in the above only floating point values are used.)\nWhen a is outside the surface, the answer is always a constant:\n\na = 2 * Phiₑ(2pi*rand(), pi*rand()) # random point with radius 2\nA1 = hcubature(r -> E(Phiₑ(r)-a) ⋅ normal(r) * norm(SE(r)), (0.0,0.0), (2pi, pi/2))\nA2 = hcubature(r -> E(Phiₑ(r)-a) ⋅ normal(r) * norm(SE(r)), (0.0,pi/2), (2pi, 1pi))\nA1[1] + A2[1]\n\n-1.3888223904245933e-11\n\n\nThat constant is \\(0\\).\nThis is a consequence of Gauss's law, which states that for an electric field \\(E\\), the electric flux through a closed surface is proportional to the total charge contained. (Gauss's law is related to the upcoming divergence theorem.) When a is inside the surface, the total charge is the same regardless of exactly where, so the integral's value is always the same. When a is outside the surface, the total charge inside the sphere is \\(0\\), so the flux integral is as well.\nGauss's law is typically used to identify the electric field by choosing a judicious surface where the surface integral can be computed. For example, suppose a ball of radius \\(R_0\\) has a uniform charge. What is the electric field generated? Assuming it is dependent only on the distance from the center of the charged ball, we can, first, take a sphere of radius \\(R > R_0\\) and note that \\(E(\\vec{r})\\cdot\\hat{N}(r) = \\|E(R)\\|\\), the magnitude a distance \\(R\\) away. So the surface integral is simply \\(\\|E(R)\\|4\\pi R^2\\) and by Gauss's law a constant depending on the total charge. So \\(\\|E(R)\\| \\sim 1/R^2\\). When \\(R < R_0\\), the same applies, but the total charge within the surface will be like \\((R/R_0)^3\\), so the result will be linear in \\(R\\), as:\n\\[\n4 \\pi \\|E(R)\\| R^2 = k 4\\pi \\left(\\frac{R}{R_0}\\right)^3.\n\\]"
},
{
"objectID": "integral_vector_calculus/line_integrals.html#questions",
"href": "integral_vector_calculus/line_integrals.html#questions",
"title": "60  Line and Surface Integrals",
"section": "60.3 Questions",
"text": "60.3 Questions\n\nQuestion\nLet \\(\\vec{r}(t) = \\langle e^t\\cos(t), e^{-t}\\sin(t) \\rangle\\).\nWhat is \\(\\|\\vec{r}'(1/2)\\|\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the \\(x\\) (first) component of \\(\\hat{N}(t) = \\hat{T}'(t)/\\|\\hat{T}'(t)\\|\\) at \\(t=1/2\\)?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(\\Phi(u,v) = \\langle u,v,u^2+v^2\\rangle\\) parameterize a surface. Find the magnitude of \\(\\| \\partial{\\Phi}/\\partial{u} \\times \\partial{\\Phi}/\\partial{v} \\|\\) at \\(u=1\\) and \\(v=2\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor a plane \\(ax+by+cz=d\\) find the unit normal.\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\langle a, b, c\\rangle / \\| \\langle a, b, c\\rangle\\|\\)\n \n \n\n\n \n \n \n \n \\(\\langle d-a, d-b, d-c\\rangle / \\| \\langle d-a, d-b, d-c\\rangle\\|\\)\n \n \n\n\n \n \n \n \n \\(\\langle a, b, c\\rangle\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nDoes it depend on \\(d\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes. Of course. Different values for \\(d\\) mean different values for \\(x\\), \\(y\\), and \\(z\\) are needed.\n \n \n\n\n \n \n \n \n Yes. The gradient of \\(F(x,y,z) = ax + by + cz\\) will be normal to the level curve \\(F(x,y,z)=d\\), and so this will depend on \\(d\\).\n \n \n\n\n \n \n \n \n No. 
Moving \\(d\\) just shifts the plane up or down the \\(z\\) axis, but won't change the normal vector\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(\\vec{r}(t) = \\langle \\cos(t), \\sin(t), t\\rangle\\) and let \\(F(x,y,z) = \\langle -y, x, z\\rangle\\)\nNumerically compute \\(\\int_0^{2\\pi} F(\\vec{r}(t)) \\cdot \\vec{r}'(t) dt\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\nCompute the value symbolically:\n\n\n\n \n \n \n \n \n \n \n \n \n \\(2\\pi + 2\\pi^2\\)\n \n \n\n\n \n \n \n \n \\(2\\pi^2\\)\n \n \n\n\n \n \n \n \n \\(4\\pi\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(F(x,y) = \\langle 2x^3y^2, xy^4 + 1\\rangle\\). What is the work done in integrating \\(F\\) along the parabola \\(y=x^2\\) between \\((-1,1)\\) and \\((1,1)\\)? Give a numeric answer:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(F = \\nabla{f}\\) where \\(f:R^2 \\rightarrow R\\). The level curves of \\(f\\) are curves in the \\(x-y\\) plane where \\(f(x,y)=c\\), for some constant \\(c\\). Suppose \\(\\vec{r}(t)\\) describes a path on the level curve of \\(f\\). What is the value of \\(\\int_C F \\cdot d\\vec{r}\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n It will be \\(0\\), as \\(\\nabla{f}\\) is orthogonal to the level curve and \\(\\vec{r}'\\) is tangent to the level curve\n \n \n\n\n \n \n \n \n It will \\(f(b)-f(a)\\) for any \\(b\\) or \\(a\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(F(x,y) = (x^2+y^2)^{-k/2} \\langle x, y \\rangle\\) be a radial field. The work integral around the unit circle simplifies:\n\\[\n\\int_C F\\cdot \\frac{dr}{dt} dt = \\int_0^{2pi} \\langle (1)^{-k/2} \\cos(t), \\sin(t) \\rangle \\cdot \\langle-\\sin(t), \\cos(t)\\rangle dt.\n\\]\nFor any \\(k\\), this integral will be:\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(f(x,y) = \\tan^{-1}(y/x)\\). We will integrate \\(\\nabla{f}\\) over the unit circle. 
The integrand wil be:\n\n@syms t::real x::real y::real\nf(x,y) = atan(y/x)\nr(t) = [cos(t), sin(t)]\n∇f = subs.(∇(f(x,y)), x .=> r(t)[1], y .=> r(t)[2]) .|> simplify\ndrdt = diff.(r(t), t)\n∇f ⋅ drdt |> simplify\n\n \n\\[\n1\n\\]\n\n\n\nSo \\(\\int_C \\nabla{f}\\cdot d\\vec{r} = \\int_0^{2\\pi} \\nabla{f}\\cdot d\\vec{r}/dt dt = 2\\pi\\).\nWhy is this surprising?\n\n\n\n \n \n \n \n \n \n \n \n \n The value of \\(d/dt(f\\circ\\vec{r})=0\\), so the integral should be \\(0\\).\n \n \n\n\n \n \n \n \n The field is a potential field, but the path integral around \\(0\\) is not path dependent.\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nThe function \\(F = \\nabla{f}\\) is\n\n\n\n \n \n \n \n \n \n \n \n \n Continuous everywhere\n \n \n\n\n \n \n \n \n Not continuous everywhere\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(F(x,y) = \\langle F_x, F_y\\rangle = \\langle 2x^3y^2, xy^4 + 1\\rangle\\). Compute\n\\[\n\\frac{\\partial{F_y}}{\\partial{x}}- \\frac{\\partial{F_x}}{\\partial{y}}.\n\\]\nIs this \\(0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(F(x,y) = \\langle F_x, F_y\\rangle = \\langle 2x^3, y^4 + 1\\rangle\\). Compute\n\\[\n\\frac{\\partial{F_y}}{\\partial{x}} - \\frac{\\partial{F_x}}{\\partial{y}}.\n\\]\nIs this \\(0\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n Yes\n \n \n\n\n \n \n \n \n No\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIt is not unusual to see a line integral, \\(\\int F\\cdot d\\vec{r}\\), where \\(F=\\langle M, N \\rangle\\) expressed as \\(\\int Mdx + Ndy\\). This uses the notation for a differential form, so is familiar in some theoretical usages, but does not readily lend itself to computation. It does yield pleasing formulas, such as \\(\\oint_C x dy\\) to give the area of a two-dimensional region, \\(D\\), in terms of a line integral around its perimeter. 
To see that this is so, let \\(\\vec{r}(t) = \\langle a\\cos(t), b\\sin(t)\\rangle\\), \\(0 \\leq t \\leq 2\\pi\\). This parameterizes an ellipse. Let \\(F(x,y) = \\langle 0,x\\rangle\\). What does \\(\\oint_C xdy\\) become when translated into \\(\\int_a^b (F\\circ\\vec{r})\\cdot\\vec{r}' dt\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\int_0^{2\\pi} (a\\cos(t)) \\cdot (b\\cos(t)) dt\\)\n \n \n\n\n \n \n \n \n \\(\\int_0^{2\\pi} (-b\\sin(t)) \\cdot (b\\cos(t)) dt\\)\n \n \n\n\n \n \n \n \n \\(\\int_0^{2\\pi} (a\\cos(t)) \\cdot (a\\cos(t)) dt\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet a surface be parameterized by \\(\\Phi(u,v) = \\langle u\\cos(v), u\\sin(v), u\\rangle\\).\nCompute \\(\\vec{v}_1 = \\partial{\\Phi}/\\partial{u}\\)\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\langle \\cos(v), \\sin(v), 1\\rangle\\)\n \n \n\n\n \n \n \n \n \\(\\langle -u\\sin(v), u\\cos(v), 0\\rangle\\)\n \n \n\n\n \n \n \n \n \\(u\\langle -\\cos(v), -\\sin(v), 1\\rangle\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nCompute \\(\\vec{v}_2 = \\partial{\\Phi}/\\partial{u}\\)\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\langle \\cos(v), \\sin(v), 1\\rangle\\)\n \n \n\n\n \n \n \n \n \\(\\langle -u\\sin(v), u\\cos(v), 0\\rangle\\)\n \n \n\n\n \n \n \n \n \\(u\\langle -\\cos(v), -\\sin(v), 1\\rangle\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nCompute \\(\\vec{v}_1 \\times \\vec{v}_2\\)\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\langle \\cos(v), \\sin(v), 1\\rangle\\)\n \n \n\n\n \n \n \n \n \\(\\langle -u\\sin(v), u\\cos(v), 0\\rangle\\)\n \n \n\n\n \n \n \n \n \\(u\\langle -\\cos(v), -\\sin(v), 1\\rangle\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor the surface parameterized by \\(\\Phi(u,v) = \\langle uv, u^2v, uv^2\\rangle\\) for \\((u,v)\\) in \\([0,1]\\times[0,1]\\), numerically find the surface area.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor the surface parameterized by \\(\\Phi(u,v) = \\langle uv, u^2v, uv^2\\rangle\\) for \\((u,v)\\) in 
\\([0,1]\\times[0,1]\\) and vector field \\(F(x,y,z) =\\langle y^2, x, z\\langle\\), numerically find \\(\\iint_S (F\\cdot\\hat{N}) dS\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(F=\\langle 0,0,1\\rangle\\) and \\(S\\) be the upper-half unit sphere, parameterized by \\(\\Phi(\\theta, \\phi) = \\langle \\sin(\\phi)\\cos(\\theta), \\sin(\\phi)\\sin(\\theta), \\cos(\\phi)\\rangle\\). Compute \\(\\iint_S (F\\cdot\\hat{N}) dS\\) numerically. Choose the normal direction so that the answer is postive.\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(\\phi(x,y,z) = xy\\) and \\(S\\) be the triangle \\(x+y+z=1\\), \\(x,y,z \\geq 0\\). The surface may be described by \\(z=f(x,y) = 1 - (x + y)\\), \\(0\\leq y \\leq 1-x, 0 \\leq x \\leq 1\\) is useful in describing the surface. With this, the following integral will compute \\(\\int_S \\phi dS\\):\n\\[\n\\int_0^1 \\int_0^{1-x} xy \\sqrt{1 + \\left(\\frac{\\partial{f}}{\\partial{x}}\\right)^2 + \\left(\\frac{\\partial{f}}{\\partial{y}}\\right)^2} dy dx.\n\\]\nCompute this.\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\sqrt{2}/24\\)\n \n \n\n\n \n \n \n \n \\(2/\\sqrt{24}\\)\n \n \n\n\n \n \n \n \n \\(1/12\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(\\Phi(u,v) = \\langle u^2, uv, v^2\\rangle\\), \\((u,v)\\) in \\([0,1]\\times[0,1]\\) and \\(F(x,y,z) = \\langle x,y^2,z^3\\rangle\\). Find \\(\\int_S (F\\cdot\\hat{N})dS\\)\n\n\n\n \n \n \n \n \n \n \n \n \n \\(7/36\\)\n \n \n\n\n \n \n \n \n \\(1/60\\)\n \n \n\n\n \n \n \n \n \\(17/252\\)\n \n \n\n\n \n \n \n \n \\(0\\)"
},
{
"objectID": "integral_vector_calculus/div_grad_curl.html",
"href": "integral_vector_calculus/div_grad_curl.html",
"title": "61  The Gradient, Divergence, and Curl",
"section": "",
"text": "This section uses these add-on packages:\nThe gradient of a scalar function \\(f:R^n \\rightarrow R\\) is a vector field of partial derivatives. In \\(R^2\\), we have:\n\\[\n\\nabla{f} = \\langle \\frac{\\partial{f}}{\\partial{x}},\n\\frac{\\partial{f}}{\\partial{y}} \\rangle.\n\\]\nIt has the interpretation of pointing out the direction of greatest ascent for the surface \\(z=f(x,y)\\).\nWe move now to two other operations, the divergence and the curl, which combine to give a language to describe vector fields in \\(R^3\\)."
},
{
"objectID": "integral_vector_calculus/div_grad_curl.html#the-divergence",
"href": "integral_vector_calculus/div_grad_curl.html#the-divergence",
"title": "61  The Gradient, Divergence, and Curl",
"section": "61.1 The divergence",
"text": "61.1 The divergence\nLet \\(F:R^3 \\rightarrow R^3 = \\langle F_x, F_y, F_z\\rangle\\) be a vector field. Consider now a small box-like region, \\(R\\), with surface, \\(S\\), on the Cartesian grid, with sides of length \\(\\Delta x\\), \\(\\Delta y\\), and \\(\\Delta z\\) with \\((x,y,z)\\) being one corner. The outward pointing unit normals are \\(\\pm \\hat{i}, \\pm\\hat{j},\\) and \\(\\pm\\hat{k}\\).\n\n\n\n\n\nConsider the sides with outward normal \\(\\hat{i}\\). The contribution to the surface integral, \\(\\oint_S (F\\cdot\\hat{N})dS\\), could be approximated by\n\\[\n\\left(F(x + \\Delta x, y, z) \\cdot \\hat{i}\\right) \\Delta y \\Delta z,\n\\]\nwhereas the contribution for the face with outward normal \\(-\\hat{i}\\) could be approximated by:\n\\[\n\\left(F(x, y, z) \\cdot (-\\hat{i}) \\right) \\Delta y \\Delta z.\n\\]\nThe functions are being evaluated at a point on the face of the surface. For Riemann integrable functions, any point in a partition may be chosen, so our choice will not restrict the generality.\nThe total contribution of the two would be:\n\\[\n\\left(F(x + \\Delta x, y, z) \\cdot \\hat{i}\\right) \\Delta y \\Delta z +\n\\left(F(x, y, z) \\cdot (-\\hat{i})\\right) \\Delta y \\Delta z =\n\\left(F_x(x + \\Delta x, y, z) - F_x(x, y, z)\\right) \\Delta y \\Delta z,\n\\]\nas \\(F \\cdot \\hat{i} = F_x\\).\nWere we to divide by \\(\\Delta V = \\Delta x \\Delta y \\Delta z\\) and take a limit as the volume shrinks, the limit would be \\(\\partial{F_x}/\\partial{x}\\).\nIf this is repeated for the other two pairs of matching faces, we get a definition for the divergence:\n\nThe divergence of a vector field \\(F:R^3 \\rightarrow R^3\\) is given by\n\\[\n\\text{divergence}(F) =\n\\lim \\frac{1}{\\Delta V} \\oint_S F\\cdot\\hat{N} dS =\n\\frac{\\partial{F_x}}{\\partial{x}} +\\frac{\\partial{F_y}}{\\partial{y}} +\\frac{\\partial{F_z}}{\\partial{z}}.\n\\]\n\nThe limit expression for the divergence will hold for any smooth closed surface, 
\\(S\\), converging on \\((x,y,z)\\), not just box-like ones.\n\n61.1.1 General \\(n\\)\nThe derivation of the divergence is done for \\(n=3\\), but could also have easily been done for two dimensions (\\(n=2\\)) or higher dimensions \\(n>3\\). The formula in general would be: for \\(F(x_1, x_2, \\dots, x_n): R^n \\rightarrow R^n\\):\n\\[\n\\text{divergence}(F) = \\sum_{i=1}^n \\frac{\\partial{F_i}}{\\partial{x_i}}.\n\\]\n\nIn Julia, the divergence can be implemented different ways depending on how the problem is presented. Here are two functions from the CalculusWithJulia package for when the problem is symbolic or numeric:\ndivergence(F::Vector{Sym}, vars) = sum(diff.(F, vars))\ndivergence(F::Function, pt) = sum(diag(ForwardDiff.jacobian(F, pt)))\nThe latter being a bit inefficient, as all \\(n^2\\) partial derivatives are found, but only the \\(n\\) diagonal ones are used."
},
{
"objectID": "integral_vector_calculus/div_grad_curl.html#the-curl",
"href": "integral_vector_calculus/div_grad_curl.html#the-curl",
"title": "61  The Gradient, Divergence, and Curl",
"section": "61.2 The curl",
"text": "61.2 The curl\nBefore considering the curl for \\(n=3\\), we derive a related quantity in \\(n=2\\). The “curl” will be a measure of the microscopic circulation of a vector field. To that end we consider a microscopic box-region in \\(R^2\\):\n\n\n\n\n\nLet \\(F=\\langle F_x, F_y\\rangle\\). For small enough values of \\(\\Delta{x}\\) and \\(\\Delta{y}\\) the line integral, \\(\\oint_C F\\cdot d\\vec{r}\\), can be approximated by \\(4\\) terms:\n\\[\n\\begin{align}\n\\left(F(x,y) \\cdot \\hat{i}\\right)\\Delta{x} &+\n\\left(F(x+\\Delta{x},y) \\cdot \\hat{j}\\right)\\Delta{y} +\n\\left(F(x,y+\\Delta{y}) \\cdot (-\\hat{i})\\right)\\Delta{x} +\n\\left(F(x,y) \\cdot (-\\hat{j})\\right)\\Delta{y}\\\\\n&=\nF_x(x,y) \\Delta{x} + F_y(x+\\Delta{x},y)\\Delta{y} +\nF_x(x, y+\\Delta{y}) (-\\Delta{x}) + F_y(x,y) (-\\Delta{y})\\\\\n&=\n(F_y(x + \\Delta{x}, y) - F_y(x, y))\\Delta{y} -\n(F_x(x, y+\\Delta{y})-F_x(x,y))\\Delta{x}.\n\\end{align}\n\\]\nThe Riemann approximation allows a choice of evaluation point for Riemann integrable functions, and the choice here lends itself to further analysis. Were the above divided by \\(\\Delta{x}\\Delta{y}\\), the area of the box, and a limit taken, partial derivatives appear to suggest this formula:\n\\[\n\\lim \\frac{1}{\\Delta{x}\\Delta{y}} \\oint_C F\\cdot d\\vec{r} =\n\\frac{\\partial{F_y}}{\\partial{x}} - \\frac{\\partial{F_x}}{\\partial{y}}.\n\\]\nThe scalar function on the right hand side is called the (two-dimensional) curl of \\(F\\) and the left-hand side lends itself as a measure of the microscopic circulation of the vector field, \\(F:R^2 \\rightarrow R^2\\).\n\nConsider now a similar scenario for the \\(n=3\\) case. Let \\(F=\\langle F_x, F_y,F_z\\rangle\\) be a vector field and \\(S\\) a box-like region with side lengths \\(\\Delta x\\), \\(\\Delta y\\), and \\(\\Delta z\\), anchored at \\((x,y,z)\\).\n\n\n\n\n\nThe box-like volume in space has its top face, with normal \\(\\hat{k}\\), designated as \\(S_1\\). 
The curve \\(C_1\\) traces around \\(S_1\\) in a counter clockwise manner, consistent with the right-hand rule pointing in the outward normal direction. The face \\(S_1\\) with unit normal \\(\\hat{k}\\) looks like:\n\n\n\n\n\nNow we compute the line integral. Consider the top face, \\(S_1\\), connecting \\((x,y,z+\\Delta z), (x + \\Delta x, y, z + \\Delta z), (x + \\Delta x, y + \\Delta y, z + \\Delta z), (x, y + \\Delta y, z + \\Delta z)\\). Using the right hand rule, parameterize the boundary curve, \\(C_1\\), in a counter clockwise direction so the right hand rule yields the outward pointing normal (\\(\\hat{k}\\)). Then the integral \\(\\oint_{C_1} F\\cdot \\hat{T} ds\\) is approximated by the following Riemann sum of \\(4\\) terms:\n\\[\n\\begin{align*}\nF(x,y, z+\\Delta{z}) \\cdot \\hat{i}\\Delta{x} &+ F(x+\\Delta x, y, z+\\Delta{z}) \\cdot \\hat{j} \\Delta y \\\\\n&+ F(x, y+\\Delta y, z+\\Delta{z}) \\cdot (-\\hat{i}) \\Delta{x} \\\\\n&+ F(x, y, z+\\Delta{z}) \\cdot (-\\hat{j}) \\Delta{y}.\n\\end{align*}\n\\]\n(The evaluation points are chosen from the endpoints of the line segments.)\n\\[\n\\begin{align*}\n\\oint_{C_1} F\\cdot \\hat{T} ds\n&\\approx (F_y(x+\\Delta x, y, z+\\Delta{z}) \\\\\n&- F_y(x, y, z+\\Delta{z})) \\Delta{y} \\\\\n&- (F_x(x,y + \\Delta{y}, z+\\Delta{z}) \\\\\n&- F_x(x, y, z+\\Delta{z})) \\Delta{x}\n\\end{align*}\n\\]\nAs before, were this divided by the area of the surface, we have after rearranging and cancellation:\n\\[\n\\begin{align*}\n\\frac{1}{\\Delta{S_1}} \\oint_{C_1} F \\cdot \\hat{T} ds &\\approx\n\\frac{F_y(x+\\Delta x, y, z+\\Delta{z}) - F_y(x, y, z+\\Delta{z})}{\\Delta{x}}\\\\\n&- \\frac{F_x(x, y+\\Delta y, z+\\Delta{z}) - F_x(x, y, z+\\Delta{z})}{\\Delta{y}}.\n\\end{align*}\n\\]\nIn the limit, as \\(\\Delta{S} \\rightarrow 0\\), this will converge to \\(\\partial{F_y}/\\partial{x}-\\partial{F_x}/\\partial{y}\\).\nHad the bottom of the box been used, a similar result would be found, up to a minus sign.\nUnlike the two 
dimensional case, there are other directions to consider and here the other sides will yield different answers. Consider now the face connecting \\((x,y,z), (x+\\Delta{x}, y, z), (x+\\Delta{x}, y, z + \\Delta{z})\\), and \\((x, y, z+\\Delta{z})\\) with outward pointing normal \\(-\\hat{j}\\). Let \\(S_2\\) denote this face and \\(C_2\\) describe its boundary. Orient this curve so that the right hand rule points in the \\(-\\hat{j}\\) direction (the outward pointing normal). Then, as before, we can approximate:\n\\[\n\\begin{align*}\n\\oint_{C_2} F \\cdot \\hat{T} ds\n&\\approx\nF(x,y,z) \\cdot \\hat{i} \\Delta{x} \\\\\n&+ F(x+\\Delta{x},y,z) \\cdot \\hat{k} \\Delta{z} \\\\\n&+ F(x,y,z+\\Delta{z}) \\cdot (-\\hat{i}) \\Delta{x} \\\\\n&+ F(x, y, z) \\cdot (-\\hat{k}) \\Delta{z}\\\\\n&= (F_z(x+\\Delta{x},y,z) - F_z(x, y, z))\\Delta{z} -\n(F_x(x,y,z+\\Delta{z}) - F_x(x,y,z)) \\Delta{x}.\n\\end{align*}\n\\]\nDividing by \\(\\Delta{S}=\\Delta{x}\\Delta{z}\\) and taking a limit will give:\n\\[\n\\lim \\frac{1}{\\Delta{S}} \\oint_{C_2} F \\cdot \\hat{T} ds =\n\\frac{\\partial{F_z}}{\\partial{x}} - \\frac{\\partial{F_x}}{\\partial{z}}.\n\\]\nHad the opposite face with outward normal \\(\\hat{j}\\) been chosen, the answer would differ by a factor of \\(-1\\).\nSimilarly, let \\(S_3\\) be the face with outward normal \\(\\hat{i}\\) and curve \\(C_3\\) bounding it with parameterization chosen so that the right hand rule points in the direction of \\(\\hat{i}\\). 
This will give\n\\[\n\\lim \\frac{1}{\\Delta{S}} \\oint_{C_3} F \\cdot \\hat{T} ds =\n\\frac{\\partial{F_z}}{\\partial{y}} - \\frac{\\partial{F_y}}{\\partial{z}}.\n\\]\nIn short, depending on the face chosen, a different answer is given, but all have the same type.\n\nDefine the curl of a \\(3\\)-dimensional vector field \\(F=\\langle F_x,F_y,F_z\\rangle\\) by:\n\\[\n\\text{curl}(F) =\n\\langle \\frac{\\partial{F_z}}{\\partial{y}} - \\frac{\\partial{F_y}}{\\partial{z}},\n\\frac{\\partial{F_x}}{\\partial{z}} - \\frac{\\partial{F_z}}{\\partial{x}},\n\\frac{\\partial{F_y}}{\\partial{x}} - \\frac{\\partial{F_x}}{\\partial{y}} \\rangle.\n\\]\n\nIf \\(S\\) is some surface with closed boundary \\(C\\) oriented so that the unit normal, \\(\\hat{N}\\), of \\(S\\) is given by the right hand rule about \\(C\\), then\n\\[\n\\hat{N} \\cdot \\text{curl}(F) = \\lim \\frac{1}{\\Delta{S}} \\oint_C F \\cdot \\hat{T} ds.\n\\]\nThe curl has a formal representation in terms of a \\(3\\times 3\\) determinant, similar to that used to compute the cross product, that is useful for computation:\n\\[\n\\text{curl}(F) = \\det\\left[\n\\begin{array}{}\n\\hat{i} & \\hat{j} & \\hat{k}\\\\\n\\frac{\\partial}{\\partial{x}} & \\frac{\\partial}{\\partial{y}} & \\frac{\\partial}{\\partial{z}}\\\\\nF_x & F_y & F_z\n\\end{array}\n\\right]\n\\]\n\nIn Julia, the curl can be implemented different ways depending on how the problem is presented. We will use the Jacobian matrix to compute the required partials. If the Jacobian is known, this function from the CalculusWithJulia package will combine the off-diagonal terms appropriately:\nfunction curl(J::Matrix)\n Mx, Nx, Px, My, Ny, Py, Mz, Nz, Pz = J\n [Py-Nz, Mz-Px, Nx-My] # ∇×VF\nend\nThe computation of the Jacobian differs whether the problem is treated numerically or symbolically. 
Here are two functions:\ncurl(F::Vector{Sym}, vars=free_symbols(F)) = curl(F.jacobian(vars))\ncurl(F::Function, pt) = curl(ForwardDiff.jacobian(F, pt))\n\n61.2.1 The \\(\\nabla\\) (del) operator\nThe divergence, gradient, and curl all involve partial derivatives. There is a notation employed that can express the operations more succinctly. Let the Del operator be defined in Cartesian coordinates by the formal expression:\n\n\\[\n\\nabla = \\langle\n\\frac{\\partial}{\\partial{x}},\n\\frac{\\partial}{\\partial{y}},\n\\frac{\\partial}{\\partial{z}}\n\\rangle.\n\\]\n\nThis is a vector differential operator that acts on functions and vector fields through the typical notation to yield the three operations:\n\\[\n\\begin{align*}\n\\nabla{f} &= \\langle\n\\frac{\\partial{f}}{\\partial{x}},\n\\frac{\\partial{f}}{\\partial{y}},\n\\frac{\\partial{f}}{\\partial{z}}\n\\rangle, \\quad\\text{the gradient;}\\\\\n\\nabla\\cdot{F} &= \\langle\n\\frac{\\partial}{\\partial{x}},\n\\frac{\\partial}{\\partial{y}},\n\\frac{\\partial}{\\partial{z}}\n\\rangle \\cdot F \\\\\n&=\n\\langle\n\\frac{\\partial}{\\partial{x}},\n\\frac{\\partial}{\\partial{y}},\n\\frac{\\partial}{\\partial{z}}\n\\rangle \\cdot\n\\langle F_x, F_y, F_z \\rangle \\\\\n&=\n\\frac{\\partial{F_x}}{\\partial{x}} +\n\\frac{\\partial{F_y}}{\\partial{y}} +\n\\frac{\\partial{F_z}}{\\partial{z}},\\quad\\text{the divergence;}\\\\\n\\nabla\\times F &= \\langle\n\\frac{\\partial}{\\partial{x}},\n\\frac{\\partial}{\\partial{y}},\n\\frac{\\partial}{\\partial{z}}\n\\rangle \\times F =\n\\det\\left[\n\\begin{array}{}\n\\hat{i} & \\hat{j} & \\hat{k} \\\\\n\\frac{\\partial}{\\partial{x}}&\n\\frac{\\partial}{\\partial{y}}&\n\\frac{\\partial}{\\partial{z}}\\\\\nF_x & F_y & F_z\n\\end{array}\n\\right],\\quad\\text{the curl}.\n\\end{align*}\n\\]\n\n\n\n\n\n\nNote\n\n\n\nMathematically operators have not been seen previously, but the concept of an operation on a function that returns another function is a common one when using Julia. 
We have seen many examples (plot, D, quadgk, etc.). In computer science such functions are called higher order functions, as they accept arguments which are also functions.\n\n\n\nIn the CalculusWithJulia package, the constant \\nabla[\\tab], producing \\(\\nabla\\) implements this operator for functions and symbolic expressions.\n\n@syms x::real y::real z::real\n\n(x, y, z)\n\n\n\nf(x,y,z) = x*y*z\nf(v) = f(v...)\nF(x,y,z) = [x, y, z]\nF(v) = F(v...)\n\n∇(f(x,y,z)) # symbolic operation on the symbolic expression f(x,y,z)\n\n3-element Vector{Sym}:\n y⋅z\n x⋅z\n x⋅y\n\n\nThis usage of ∇ takes partial derivatives according to the order given by:\n\nfree_symbols(f(x,y,z))\n\n3-element Vector{Sym}:\n x\n y\n z\n\n\nwhich may not be as desired. In this case, the variables can be specified using a tuple to pair up the expression with the variables to differentiate against:\n\n∇( (f(x,y,z), [x,y,z]) )\n\n3-element Vector{Sym}:\n y⋅z\n x⋅z\n x⋅y\n\n\nFor numeric expressions, we have:\n\n∇(f)(1,2,3) # a numeric computation. Also can call with a point [1,2,3]\n\n3-element Vector{Int64}:\n 6\n 3\n 2\n\n\n(The extra parentheses are unfortunate. Here ∇ is called like a function.)\nThe divergence can be found symbolically:\n\n∇ ⋅ F(x,y,z)\n\n \n\\[\n3\n\\]\n\n\n\nOr numerically:\n\n(∇ ⋅ F)(1,2,3) # a numeric computation. Also can call (∇ ⋅ F)([1,2,3])\n\n3.0\n\n\nSimilarly, the curl. Symbolically:\n\n∇ × F(x,y,z)\n\n3-element Vector{Sym}:\n 0\n 0\n 0\n\n\nand numerically:\n\n(∇ × F)(1,2,3) # numeric. Also can call (∇ × F)([1,2,3])\n\n3-element Vector{Float64}:\n 0.0\n 0.0\n 0.0\n\n\nThere is a subtle difference in usage. Symbolically the evaluation of F(x,y,z) first is desired, numerically the evaluation of ∇ ⋅ F or ∇ × F first is desired. As ⋅ and × have lower precedence than function evaluation, parentheses must be used in the numeric case.\n\n\n\n\n\n\nNote\n\n\n\nAs mentioned, for the symbolic evaluations, a specification of three variables (here x, y, and z) is necessary. 
This use takes free_symbols to identify three free symbols, which may not always be the case. (It wouldn't be for, say, F(x,y,z) = [a*x,b*y,0], a and b constants.) In those cases, the notation accepts a tuple to specify the function or vector field and the variables, e.g. ∇( (f(x,y,z), [x,y,z]) ), as illustrated; ∇ × (F(x,y,z), [x,y,z]); or ∇ ⋅ (F(x,y,z), [x,y,z]). Here function calls produce the symbolic expression in the first position of the tuple, though a direct expression could also be used. In these cases, the named versions gradient, curl, and divergence may be preferred."
},
{
"objectID": "integral_vector_calculus/div_grad_curl.html#interpretation",
"href": "integral_vector_calculus/div_grad_curl.html#interpretation",
"title": "61  The Gradient, Divergence, and Curl",
"section": "61.3 Interpretation",
"text": "61.3 Interpretation\nThe divergence and curl measure complementary aspects of a vector field. The divergence is defined in terms of flow out of an infinitesimal box, the curl is about rotational flow around an infinitesimal area patch.\nLet \\(F(x,y,z) = [x, 0, 0]\\), a vector field pointing in just the \\(\\hat{i}\\) direction. The divergence is simply \\(1\\). If \\(V\\) is a box, as in the derivation, then the divergence measures the flow into the side with outward normal \\(-\\hat{i}\\) and through the side with outward normal \\(\\hat{i}\\) which will clearly be positive as the flow passes through the region \\(V\\), increasing as \\(x\\) increases, when \\(x > 0\\).\nThe radial vector field \\(F(x,y,z) = \\langle x, y, z \\rangle\\) is also an example of a divergent field. The divergence is:\n\nF(x,y,z) = [x,y,z]\n∇ ⋅ F(x,y,z)\n\n \n\\[\n3\n\\]\n\n\n\nThere is a constant outward flow, emanating from the origin. Here we picture the field when \\(z=0\\):\n\n\n\n\n\nConsider the limit definition of the divergence:\n\\[\n\\nabla\\cdot{F} = \\lim \\frac{1}{\\Delta{V}} \\oint_S F\\cdot\\hat{N} dA.\n\\]\nIn the vector field above, the shape along the curved edges has constant magnitude field. On the left curved edge, the length is smaller and the field is smaller than on the right. The flux across the left edge will be less than the flux across the right edge, and a net flux will exist. That is, there is divergence.\nNow, were the field on the right edge less, it might be that the two balance out and there is no divergence. This occurs with the inverse square laws, such as for gravity and electric field:\n\nR = [x,y,z]\nRhat = R/norm(R)\nVF = (1/norm(R)^2) * Rhat\n∇ ⋅ VF |> simplify\n\n \n\\[\n0\n\\]\n\n\n\n\nThe vector field \\(F(x,y,z) = \\langle -y, x, 0 \\rangle\\) is an example of a rotational field. 
Its curl can be computed symbolically through:\n\ncurl([-y,x,0], [x,y,z])\n\n3-element Vector{Sym}:\n 0\n 0\n 2\n\n\nThis vector field rotates as seen in this figure showing slices for different values of \\(z\\):\n\n\n\n\n\nThe field has a clear rotation about the \\(z\\) axis (illustrated with a line); the curl is a vector that points in the direction given by the right-hand rule as the right-hand fingers follow the flow, with magnitude given by the amount of rotation.\nThis is a bit misleading though: the curl is defined by a limit, and not in terms of a large box. The key point for this field is that the strength of the field is stronger as the points get farther away, so for a properly oriented small box, the integral along the closer edge will be less than that along the outer edge.\nConsider a related field where the strength gets smaller as the point gets farther away but otherwise has the same circular rotation pattern:\n\nR = [-y, x, 0]\nVF = R / norm(R)^2\ncurl(VF, [x,y,z]) .|> simplify\n\n3-element Vector{Sym}:\n 0\n 0\n 0\n\n\nFurther, the curl of R/norm(R)^3 now points in the opposite direction of the curl of R. This example isn't typical, as dividing by norm(R) with a power greater than \\(1\\) makes the vector field discontinuous at the origin.\nThe curl of the vector field \\(F(x,y,z) = \\langle 0, 1+y^2, 0\\rangle\\) is \\(0\\), as there is clearly no rotation as seen in this slice where \\(z=0\\):\n\n\n\n\n\nAlgebraically, this is so:\n\ncurl(Sym[0,1+y^2,0], [x,y,z])\n\n3-element Vector{Sym}:\n 0\n 0\n 0\n\n\nNow consider a similar field \\(F(x,y,z) = \\langle 0, 1+x^2, 0\\rangle\\). A slice is somewhat similar, in that the flow lines are all in the \\(\\hat{j}\\) direction:\n\n\n\n\n\nHowever, this vector field has a curl:\n\ncurl([0, 1+x^2,0], [x,y,z])\n\n3-element Vector{Sym}:\n 0\n 0\n 2⋅x\n\n\nThe curl points in the \\(\\hat{k}\\) direction (out of the figure). 
A useful visualization is to mentally place a small paddlewheel at a point and imagine if it will turn. In the constant field case, there is equal flow on both sides of the axis, so any forces on the wheel blades will balance out. In the latter example, if \\(x > 0\\), the force on the right side will be greater than the force on the left, so the paddlewheel would rotate counterclockwise. The right-hand rule for this rotation will point in the upward, or \\(\\hat{k}\\), direction, as seen algebraically in the curl.\nFollowing Strang, in general the curl can point in any direction, so the amount the paddlewheel will spin will be related to how the paddlewheel is oriented. The angular velocity of the wheel will be \\((1/2)(\\nabla\\times{F})\\cdot\\hat{N}\\), \\(\\hat{N}\\) being the normal for the paddlewheel.\nIf \\(\\vec{a}\\) is some vector and \\(\\vec{r} = \\langle x, y, z\\rangle\\) is the radial vector, then \\(\\vec{a} \\times \\vec{r}\\) has a curl, which is given by:\n\n@syms a1 a2 a3\na = [a1, a2, a3]\nr = [x, y, z]\ncurl(a × r, [x,y, z])\n\n3-element Vector{Sym}:\n 2⋅a₁\n 2⋅a₂\n 2⋅a₃\n\n\nThe angular velocity then is \\(\\vec{a} \\cdot \\hat{N}\\). The curl is constant. As the dot product involves the cosine of the angle between the two vectors, we see the turning speed is largest when \\(\\hat{N}\\) is parallel to \\(\\vec{a}\\). This gives a statement for the curl similar to the one the gradient gives for the direction of steepest growth: the maximum rotation rate of \\(F\\) is \\((1/2)\\|\\nabla\\times{F}\\|\\), attained in the direction of \\(\\nabla\\times{F}\\).\nThe curl of the radial vector field, \\(F(x,y,z) = \\langle x, y, z\\rangle\\) will be \\(\\vec{0}\\):\n\ncurl([x,y,z], [x,y,z])\n\n3-element Vector{Sym}:\n 0\n 0\n 0\n\n\nWe will see that this can be anticipated, as \\(F = (1/2) \\nabla(x^2+y^2+z^2)\\) is a gradient field.\nIn fact, the curl of any radial field will be \\(\\vec{0}\\). 
Here we represent a radial field as a scalar function of \\(r = \\|\\vec{r}\\|\\) times \\(\\hat{r}\\):\n\n@syms H()\nR = sqrt(x^2 + y^2 + z^2)\nRhat = [x, y, z]/R\ncurl(H(R) * Rhat, [x, y, z])\n\n3-element Vector{Sym}:\n 0\n 0\n 0\n\n\nWere one to represent the curl in spherical coordinates (below), this would follow algebraically from the formula easily enough. To anticipate this: due to symmetry, the curl would need to be the same along any ray emanating from the origin and, again by symmetry, could only possibly point along the ray. Mentally place a paddlewheel along the \\(x\\) axis oriented along \\(\\hat{i}\\). There will be no rotational forces that could make the wheel spin around the \\(x\\)-axis, hence the curl must be \\(0\\)."
},
{
"objectID": "integral_vector_calculus/div_grad_curl.html#the-maxwell-equations",
"href": "integral_vector_calculus/div_grad_curl.html#the-maxwell-equations",
"title": "61  The Gradient, Divergence, and Curl",
"section": "61.4 The Maxwell equations",
"text": "61.4 The Maxwell equations\nThe divergence and curl appear in Maxwells equations describing the relationships of electromagnetism. In the formulas below the notation is \\(E\\) is the electric field; \\(B\\) is the magnetic field; \\(\\rho\\) is the charge density (charge per unit volume); \\(J\\) the electric current density (current per unit area); and \\(\\epsilon_0\\), \\(\\mu_0\\), and \\(c\\) are universal constants.\nThe equations in differential form are:\n\nGausss law: \\(\\nabla\\cdot{E} = \\rho/\\epsilon_0\\).\n\nThat is, the divergence of the electric field is proportional to the density. We have already mentioned this in integral form.\n\nGausss law of magnetism: \\(\\nabla\\cdot{B} = 0\\)\n\nThe magnetic field has no divergence. This says that there no magnetic charges (a magnetic monopole) unlike electric charge, according to Maxwells laws.\n\nFaradays law of induction: \\(\\nabla\\times{E} = - \\partial{B}/\\partial{t}\\).\n\nThe curl of the time-varying electric field is in the direction of the partial derivative of the magnetic field. For example, if a magnet is in motion in the in the \\(z\\) axis, then the electric field has rotation in the \\(x-y\\) plane induced by the motion of the magnet.\n\nAmperes circuital law: \\(\\nabla\\times{B} = \\mu_0J + \\mu_0\\epsilon_0 \\partial{E}/\\partial{t}\\)\n\nThe curl of the magnetic field is related to the sum of the electric current density and the change in time of the electric field.\n\nIn a region with no charges (\\(\\rho=0\\)) and no currents (\\(J=\\vec{0}\\)), such as a vacuum, these equations reduce to two divergences being \\(0\\): \\(\\nabla\\cdot{E} = 0\\) and \\(\\nabla\\cdot{B}=0\\); and two curl relationships with time derivatives: \\(\\nabla\\times{E}= -\\partial{B}/\\partial{t}\\) and \\(\\nabla\\times{B} = \\mu_0\\epsilon_0 \\partial{E}/\\partial{t}\\).\nWe will see later how these are differential forms are consequences of related integral forms."
},
{
"objectID": "integral_vector_calculus/div_grad_curl.html#algebra-of-vector-calculus",
"href": "integral_vector_calculus/div_grad_curl.html#algebra-of-vector-calculus",
"title": "61  The Gradient, Divergence, and Curl",
"section": "61.5 Algebra of vector calculus",
"text": "61.5 Algebra of vector calculus\nThe divergence, gradient, and curl satisfy several algebraic properties.\nLet \\(f\\) and \\(g\\) denote scalar functions, \\(R^3 \\rightarrow R\\) and \\(F\\) and \\(G\\) be vector fields, \\(R^3 \\rightarrow R^3\\).\n\n61.5.1 Linearity\nAs with the sum rule of univariate derivatives, these operations satisfy:\n\\[\n\\begin{align}\n\\nabla(f + g) &= \\nabla{f} + \\nabla{g}\\\\\n\\nabla\\cdot(F+G) &= \\nabla\\cdot{F} + \\nabla\\cdot{G}\\\\\n\\nabla\\times(F+G) &= \\nabla\\times{F} + \\nabla\\times{G}.\n\\end{align}\n\\]\n\n\n61.5.2 Product rule\nThe product rule \\((uv)' = u'v + uv'\\) has related formulas:\n\\[\n\\begin{align}\n\\nabla{(fg)} &= (\\nabla{f}) g + f\\nabla{g} = g\\nabla{f} + f\\nabla{g}\\\\\n\\nabla\\cdot{fF} &= (\\nabla{f})\\cdot{F} + f(\\nabla\\cdot{F})\\\\\n\\nabla\\times{fF} &= (\\nabla{f})\\times{F} + f(\\nabla\\times{F}).\n\\end{align}\n\\]\n\n\n61.5.3 Rules over cross products\nThe cross product of two vector fields is a vector field for which the divergence and curl may be taken. There are formulas to relate to the individual terms:\n\\[\n\\begin{align}\n\\nabla\\cdot(F \\times G) &= (\\nabla\\times{F})\\cdot G - F \\cdot (\\nabla\\times{G})\\\\\n\\nabla\\times(F \\times G) &= F(\\nabla\\cdot{G}) - G(\\nabla\\cdot{F} + (G\\cdot\\nabla)F-(F\\cdot\\nabla)G\\\\\n&= \\nabla\\cdot(BA^t - AB^t).\n\\end{align}\n\\]\nThe curl formula is more involved.\n\n\n61.5.4 Vanishing properties\nSurprisingly, the curl and divergence satisfy two vanishing properties. First\n\nThe curl of a gradient field is \\(\\vec{0}\\)\n\\[\n\\nabla \\times \\nabla{f} = \\vec{0},\n\\]\n\nif the scalar function \\(f\\) is has continuous second derivatives (so the mixed partials do not depend on order).\nVector fields where \\(F = \\nabla{f}\\) are conservative. Conservative fields have path independence, so any line integral, \\(\\oint F\\cdot \\hat{T} ds\\), around a closed loop will be \\(0\\). 
But the curl is defined as a limit of such integrals, so it too will be \\(\\vec{0}\\). In short, conservative fields have no rotation.\nWhat about the converse? If a vector field has zero curl, then integrals around infinitesimally small loops are \\(0\\). Does this also mean that integrals around larger closed loops will also be \\(0\\), and hence the field is conservative? The answer will be yes, under assumptions. But the discussion will wait for later.\nThe combination \\(\\nabla\\cdot\\nabla{f}\\) is defined and is called the Laplacian. This is denoted \\(\\Delta{f}\\). The equation \\(\\Delta{f} = 0\\) is called Laplace's equation. It does not hold for every scalar function \\(f\\), but the functions \\(f\\) for which it holds are important.\nSecond,\n\nThe divergence of a curl field is \\(0\\):\n\\[\n\\nabla \\cdot(\\nabla\\times{F}) = 0.\n\\]\n\nThis is not as clear, but can be seen algebraically as terms cancel. First:\n\\[\n\\begin{align*}\n\\nabla\\cdot(\\nabla\\times{F}) &=\n\\langle\n\\frac{\\partial}{\\partial{x}},\n\\frac{\\partial}{\\partial{y}},\n\\frac{\\partial}{\\partial{z}}\\rangle \\cdot\n\\langle\n\\frac{\\partial{F_z}}{\\partial{y}} - \\frac{\\partial{F_y}}{\\partial{z}},\n\\frac{\\partial{F_x}}{\\partial{z}} - \\frac{\\partial{F_z}}{\\partial{x}},\n\\frac{\\partial{F_y}}{\\partial{x}} - \\frac{\\partial{F_x}}{\\partial{y}}\n\\rangle \\\\\n&=\n\\left(\\frac{\\partial^2{F_z}}{\\partial{y}\\partial{x}} - \\frac{\\partial^2{F_y}}{\\partial{z}\\partial{x}}\\right) +\n\\left(\\frac{\\partial^2{F_x}}{\\partial{z}\\partial{y}} - \\frac{\\partial^2{F_z}}{\\partial{x}\\partial{y}}\\right) +\n\\left(\\frac{\\partial^2{F_y}}{\\partial{x}\\partial{z}} - \\frac{\\partial^2{F_x}}{\\partial{y}\\partial{z}}\\right)\n\\end{align*}\n\\]\nFocusing on one component function, \\(F_z\\) say, we see this contribution:\n\\[\n\\frac{\\partial^2{F_z}}{\\partial{y}\\partial{x}} -\n\\frac{\\partial^2{F_z}}{\\partial{x}\\partial{y}}.\n\\]\nThis is zero under the assumption that 
the second partial derivatives are continuous.\nFrom the microscopic picture of a box this can also be seen. Again we focus on just the appearance of the \\(F_z\\) component function. Let the faces with normals \\(\\hat{i}, \\hat{j},-\\hat{i}, -\\hat{j}\\) be labeled \\(A, B, C\\), and \\(D\\). This figure shows \\(A\\) (enclosed in blue) and \\(B\\) (enclosed in green):\n\n\n\n\n\nWe will get from the approximate surface integral of the approximate curl the following terms:\n\n@syms x y z Δx Δy Δz\np1, p2, p3, p4=(x, y, z), (x + Δx, y, z), (x + Δx, y + Δy, z), (x, y + Δy, z)\n@syms F_z()\nglobal exₐ = (-F_z(p2...) + F_z(p3...))*Δz + # face A\n(-F_z(p3...) + F_z(p4...))*Δz + # face B\n(F_z(p1...) - F_z(p4...))*Δz + # face C\n(F_z(p2...) - F_z(p1...))*Δz # face D\n\n \n\\[\nΔz \\left(- \\operatorname{F_{z}}{\\left(x,y,z \\right)} + \\operatorname{F_{z}}{\\left(x + Δx,y,z \\right)}\\right) + Δz \\left(\\operatorname{F_{z}}{\\left(x,y,z \\right)} - \\operatorname{F_{z}}{\\left(x,y + Δy,z \\right)}\\right) + Δz \\left(\\operatorname{F_{z}}{\\left(x,y + Δy,z \\right)} - \\operatorname{F_{z}}{\\left(x + Δx,y + Δy,z \\right)}\\right) + Δz \\left(- \\operatorname{F_{z}}{\\left(x + Δx,y,z \\right)} + \\operatorname{F_{z}}{\\left(x + Δx,y + Δy,z \\right)}\\right)\n\\]\n\n\n\nThe term for face \\(A\\), say, should be divided by \\(\\Delta{y}\\Delta{z}\\) for the curl approximation, but this will be multiplied by the same amount for the divergence calculation, so it isnt written.\nThe expression above simplifies to:\n\nsimplify(exₐ)\n\n \n\\[\n0\n\\]\n\n\n\nThis is because of how the line integrals are oriented so that the right-hand rule gives outward pointing normals. For each up stroke for one face, there is a downstroke for a different face, and so the corresponding terms cancel each other out. 
So, provided the limit of these two approximations holds, the vanishing identity can be anticipated from the microscopic picture.\n\nExample\nThe conservation of charge can be derived as a corollary of Maxwell's equations. The divergence of the curl of the magnetic field is \\(0\\), leading to:\n\\begin{align*}\n0 &= \\nabla\\cdot(\\nabla\\times{B}) \\\\\n&=\n\\mu_0(\\nabla\\cdot{J} + \\epsilon_0 \\nabla\\cdot{\\frac{\\partial{E}}{\\partial{t}}}) \\\\\n&=\n\\mu_0(\\nabla\\cdot{J} + \\epsilon_0 \\frac{\\partial}{\\partial{t}}(\\nabla\\cdot{E})) \\\\\n&=\n\\mu_0(\\nabla\\cdot{J} + \\frac{\\partial{\\rho}}{\\partial{t}}).\n\\end{align*}\nThat is, \\(\\nabla\\cdot{J} = -\\partial{\\rho}/\\partial{t}\\). This says any change in the charge density in time (\\(\\partial{\\rho}/\\partial{t}\\)) is balanced by a divergence in the electric current density (\\(\\nabla\\cdot{J}\\)). That is, charge can't be created or destroyed in an isolated system."
},
{
"objectID": "integral_vector_calculus/div_grad_curl.html#fundamental-theorem-of-vector-calculus",
"href": "integral_vector_calculus/div_grad_curl.html#fundamental-theorem-of-vector-calculus",
"title": "61  The Gradient, Divergence, and Curl",
"section": "61.6 Fundamental theorem of vector calculus",
"text": "61.6 Fundamental theorem of vector calculus\nThe divergence and curl are complementary ideas. Are there other distinct ideas to sort a vector field by? The Helmholtz decomposition says not really. It states that vector fields that decay rapidly enough can be expressed in terms of two pieces: one with no curl and one with no divergence.\nFrom Wikipedia we have this formulation:\nLet \\(F\\) be a vector field on a bounded domain \\(V\\) which is twice continuously differentiable. Let \\(S\\) be the surface enclosing \\(V\\). Then \\(F\\) can be decomposed into a curl-free component and a divergence-free component:\n\\[\nF = -\\nabla(\\phi) + \\nabla\\times A.\n\\]\nWithout explaining why, these values can be computed using volume and surface integrals:\n\\[\n\\begin{align}\n\\phi(\\vec{r}') &=\n\\frac{1}{4\\pi} \\int_V \\frac{\\nabla \\cdot F(\\vec{r})}{\\|\\vec{r}'-\\vec{r} \\|} dV -\n\\frac{1}{4\\pi} \\oint_S \\frac{F(\\vec{r})}{\\|\\vec{r}'-\\vec{r} \\|} \\cdot \\hat{N} dS\\\\\nA(\\vec{r}') &= \\frac{1}{4\\pi} \\int_V \\frac{\\nabla \\times F(\\vec{r})}{\\|\\vec{r}'-\\vec{r} \\|} dV +\n\\frac{1}{4\\pi} \\oint_S \\frac{F(\\vec{r})}{\\|\\vec{r}'-\\vec{r} \\|} \\times \\hat{N} dS.\n\\end{align}\n\\]\nIf \\(V = R^3\\), an unbounded domain, but \\(F\\) vanishes faster than \\(1/r\\), then the theorem still holds with just the volume integrals:\n\\[\n\\begin{align}\n\\phi(\\vec{r}') &=\\frac{1}{4\\pi} \\int_V \\frac{\\nabla \\cdot F(\\vec{r})}{\\|\\vec{r}'-\\vec{r} \\|} dV\\\\\nA(\\vec{r}') &= \\frac{1}{4\\pi} \\int_V \\frac{\\nabla \\times F(\\vec{r})}{\\|\\vec{r}'-\\vec{r}\\|} dV.\n\\end{align}\n\\]"
},
{
"objectID": "integral_vector_calculus/div_grad_curl.html#change-of-variable",
"href": "integral_vector_calculus/div_grad_curl.html#change-of-variable",
"title": "61  The Gradient, Divergence, and Curl",
"section": "61.7 Change of variable",
"text": "61.7 Change of variable\nThe divergence and curl are defined in a manner independent of the coordinate system, though the method to compute them depends on the Cartesian coordinate system. If that is inconvenient, then it is possible to develop the ideas in different coordinate systems.\nSome details are here, the following is based on some lecture notes.\nWe restrict to \\(n=3\\) and use \\((x,y,z)\\) for Cartesian coordinates and \\((u,v,w)\\) for an orthogonal curvilinear coordinate system, such as spherical or cylindrical. If \\(\\vec{r} = \\langle x,y,z\\rangle\\), then\n\\[\n\\begin{align}\nd\\vec{r} &= \\langle dx,dy,dz \\rangle = J \\langle du,dv,dw\\rangle\\\\\n&=\n\\left[ \\frac{\\partial{\\vec{r}}}{\\partial{u}} \\vdots\n\\frac{\\partial{\\vec{r}}}{\\partial{v}} \\vdots\n\\frac{\\partial{\\vec{r}}}{\\partial{w}} \\right] \\langle du,dv,dw\\rangle\\\\\n&= \\frac{\\partial{\\vec{r}}}{\\partial{u}} du +\n\\frac{\\partial{\\vec{r}}}{\\partial{v}} dv\n\\frac{\\partial{\\vec{r}}}{\\partial{w}} dw.\n\\end{align}\n\\]\nThe term \\({\\partial{\\vec{r}}}/{\\partial{u}}\\) is tangent to the curve formed by assuming \\(v\\) and \\(w\\) are constant and letting \\(u\\) vary. Similarly for the other partial derivatives. Orthogonality assumes that at every point, these tangent vectors are orthogonal.\nAs \\({\\partial{\\vec{r}}}/{\\partial{u}}\\) is a vector it has a magnitude and direction. 
Define the scale factors as the magnitudes:\n\\[\nh_u = \\| \\frac{\\partial{\\vec{r}}}{\\partial{u}} \\|,\\quad\nh_v = \\| \\frac{\\partial{\\vec{r}}}{\\partial{v}} \\|,\\quad\nh_w = \\| \\frac{\\partial{\\vec{r}}}{\\partial{w}} \\|,\n\\]\nand let \\(\\hat{e}_u\\), \\(\\hat{e}_v\\), and \\(\\hat{e}_w\\) be the unit direction vectors.\nThis gives the following notation:\n\\[\nd\\vec{r} = h_u du \\hat{e}_u + h_v dv \\hat{e}_v + h_w dw \\hat{e}_w.\n\\]\nFrom here, we can express different formulas.\nFor line integrals, we have the line element:\n\\[\ndl = \\sqrt{d\\vec{r}\\cdot d\\vec{r}} = \\sqrt{(h_udu)^2 + (h_vdv)^2 + (h_wdw)^2}.\n\\]\nConsider the surface for constant \\(u\\). The vectors \\(\\hat{e}_v\\) and \\(\\hat{e}_w\\) lie in the surface's tangent plane, and the surface element will be:\n\\[\ndS_u = \\| h_v dv \\hat{e}_v \\times h_w dw \\hat{e}_w \\| = h_v h_w dv dw \\| \\hat{e}_v \\times \\hat{e}_w \\| = h_v h_w dv dw.\n\\]\nThis uses orthogonality, so \\(\\hat{e}_v \\times \\hat{e}_w\\) is parallel to \\(\\hat{e}_u\\) and has unit length. Similarly, \\(dS_v = h_u h_w du dw\\) and \\(dS_w = h_u h_v du dv\\).\nThe volume element is found by projecting \\(d\\vec{r}\\) onto the \\(\\hat{e}_u\\), \\(\\hat{e}_v\\), \\(\\hat{e}_w\\) coordinate system through \\((d\\vec{r} \\cdot\\hat{e}_u) \\hat{e}_u\\), \\((d\\vec{r} \\cdot\\hat{e}_v) \\hat{e}_v\\), and \\((d\\vec{r} \\cdot\\hat{e}_w) \\hat{e}_w\\). 
Then forming the triple scalar product to compute the volume of the parallelepiped:\n\\[\n\\begin{align*}\n\\left[(d\\vec{r} \\cdot\\hat{e}_u) \\hat{e}_u\\right] \\cdot\n\\left(\n\\left[(d\\vec{r} \\cdot\\hat{e}_v) \\hat{e}_v\\right] \\times\n\\left[(d\\vec{r} \\cdot\\hat{e}_w) \\hat{e}_w\\right]\n\\right) &=\n(h_u h_v h_w) ( du dv dw ) (\\hat{e}_u \\cdot (\\hat{e}_v \\times \\hat{e}_w) \\\\\n&=\nh_u h_v h_w du dv dw,\n\\end{align*}\n\\]\nas the unit vectors are orthonormal, their triple scalar product is \\(1\\) and \\(d\\vec{r}\\cdot\\hat{e}_u = h_u du\\), etc.\n\n61.7.1 Example\nWe consider spherical coordinates with\n\\[\nF(r, \\theta, \\phi) = \\langle\nr \\sin(\\phi) \\cos(\\theta),\nr \\sin(\\phi) \\sin(\\theta),\nr \\cos(\\phi)\n\\rangle.\n\\]\nThe following figure draws curves starting at \\((r_0, \\theta_0, \\phi_0)\\) formed by holding \\(2\\) of the \\(3\\) variables constant. The tangent vectors are added in blue. The surface \\(S_r\\) formed by a constant value of \\(r\\) is illustrated.\n\n\n\n\n\nThe tangent vectors found from the partial derivatives of \\(\\vec{r}\\):\n\\[\n\\begin{align}\n\\frac{\\partial{\\vec{r}}}{\\partial{r}} &=\n\\langle \\cos(\\theta) \\cdot \\sin(\\phi), \\sin(\\theta) \\cdot \\sin(\\phi), \\cos(\\phi)\\rangle,\\\\\n\\frac{\\partial{\\vec{r}}}{\\partial{\\theta}} &=\n\\langle -r\\cdot\\sin(\\theta)\\cdot\\sin(\\phi), r\\cdot\\cos(\\theta)\\cdot\\sin(\\phi), 0\\rangle,\\\\\n\\frac{\\partial{\\vec{r}}}{\\partial{\\phi}} &=\n\\langle r\\cdot\\cos(\\theta)\\cdot\\cos(\\phi), r\\cdot\\sin(\\theta)\\cdot\\cos(\\phi), -r\\cdot\\sin(\\phi) \\rangle.\n\\end{align}\n\\]\nWith this, we have \\(h_r=1\\), \\(h_\\theta=r\\sin(\\phi)\\), and \\(h_\\phi = r\\). 
So that\n\\[\n\\begin{align*}\ndl &= \\sqrt{dr^2 + (r\\sin(\\phi)d\\theta)^2 + (rd\\phi)^2},\\\\\ndS_r &= r^2\\sin(\\phi)d\\theta d\\phi,\\\\\ndS_\\theta &= rdr d\\phi,\\\\\ndS_\\phi &= r\\sin(\\phi)dr d\\theta, \\quad\\text{and}\\\\\ndV &= r^2\\sin(\\phi) drd\\theta d\\phi.\n\\end{align*}\n\\]\nThe following visualizes the volume and the surface elements.\n\n\n\n\n\n\n\n61.7.2 The gradient in a new coordinate system\nIf \\(f\\) is a scalar function, then \\(df = \\nabla{f} \\cdot d\\vec{r}\\) by the chain rule. Using the curvilinear coordinates:\n\\[\n\\begin{align*}\ndf &=\n\\frac{\\partial{f}}{\\partial{u}} du +\n\\frac{\\partial{f}}{\\partial{v}} dv +\n\\frac{\\partial{f}}{\\partial{w}} dw \\\\\n&=\n\\frac{1}{h_u}\\frac{\\partial{f}}{\\partial{u}} h_udu +\n\\frac{1}{h_v}\\frac{\\partial{f}}{\\partial{v}} h_vdv +\n\\frac{1}{h_w}\\frac{\\partial{f}}{\\partial{w}} h_wdw.\n\\end{align*}\n\\]\nBut, as was used above, \\(d\\vec{r} \\cdot \\hat{e}_u = h_u du\\), etc., so \\(df\\) can be re-expressed as:\n\\[\ndf = (\\frac{1}{h_u}\\frac{\\partial{f}}{\\partial{u}}\\hat{e}_u +\n\\frac{1}{h_v}\\frac{\\partial{f}}{\\partial{v}}\\hat{e}_v +\n\\frac{1}{h_w}\\frac{\\partial{f}}{\\partial{w}}\\hat{e}_w) \\cdot d\\vec{r} =\n\\nabla{f} \\cdot d\\vec{r}.\n\\]\nThe gradient is the part within the parentheses.\n\nAs an example, in cylindrical coordinates, we have \\(h_r =1\\), \\(h_\\theta=r\\), and \\(h_z=1\\), giving:\n\\[\n\\nabla{f} = \\frac{\\partial{f}}{\\partial{r}}\\hat{e}_r +\n\\frac{1}{r}\\frac{\\partial{f}}{\\partial{\\theta}}\\hat{e}_\\theta +\n\\frac{\\partial{f}}{\\partial{z}}\\hat{e}_z.\n\\]\n\n\n61.7.3 The divergence in a new coordinate system\nThe divergence is a result of the limit of a surface integral,\n\\[\n\\nabla \\cdot F = \\lim \\frac{1}{\\Delta{V}}\\oint_S F \\cdot \\hat{N} dS.\n\\]\nTaking \\(V\\) as a box in the curvilinear coordinates, with side lengths \\(h_udu\\), \\(h_vdv\\), and \\(h_wdw\\), the surface integral is computed by projecting \\(F\\) onto 
each normal area element and multiplying by the area. The task is similar to how the divergence was derived above, only now the terms are like \\(\\partial{(F_uh_vh_w)}/\\partial{u}\\) due to the scale factors (\\(F_u\\) is the \\(u\\) component of \\(F\\)). The result is:\n\\[\n\\nabla\\cdot F = \\frac{1}{h_u h_v h_w}\\left[\n\\frac{\\partial{(F_uh_vh_w)}}{\\partial{u}} +\n\\frac{\\partial{(h_uF_vh_w)}}{\\partial{v}} +\n\\frac{\\partial{(h_uh_vF_w)}}{\\partial{w}} \\right].\n\\]\n\nFor example, in cylindrical coordinates, we have\n\\[\n\\nabla \\cdot F = \\frac{1}{r}\n\\left[\n\\frac{\\partial{(F_r r)}}{\\partial{r}} +\n\\frac{\\partial{F_\\theta}}{\\partial{\\theta}} +\n\\frac{\\partial{(r F_z)}}{\\partial{z}}\n\\right].\n\\]\n\n\n61.7.4 The curl in a new coordinate system\nThe curl, like the divergence, can be expressed as the limit of an integral:\n\\[\n(\\nabla \\times F) \\cdot \\hat{N} = \\lim \\frac{1}{\\Delta{S}} \\oint_C F \\cdot d\\vec{r},\n\\]\nwhere \\(S\\) is a surface perpendicular to \\(\\hat{N}\\) with boundary \\(C\\). For a small rectangular surface, the derivation is similar to above, only the scale factors are included. This gives, say, for the \\(\\hat{e}_u\\) normal, \\(\\frac{1}{h_v h_w}\\left[\\frac{\\partial{(h_wF_w)}}{\\partial{v}} - \\frac{\\partial{(h_vF_v)}}{\\partial{w}}\\right]\\). 
The following determinant form combines the terms compactly:\n\\[\n\\nabla\\times{F} = \\frac{1}{h_u h_v h_w}\\det \\left[\n\\begin{array}{}\nh_u\\hat{e}_u & h_v\\hat{e}_v & h_w\\hat{e}_w \\\\\n\\frac{\\partial}{\\partial{u}} & \\frac{\\partial}{\\partial{v}} & \\frac{\\partial}{\\partial{w}} \\\\\nh_uF_u & h_v F_v & h_w F_w\n\\end{array}\n\\right].\n\\]\n\nFor example, in cylindrical coordinates, the curl is:\n\\[\n\\frac{1}{r}\\det\\left[\n\\begin{array}{}\n\\hat{r} & r\\hat{\\theta} & \\hat{k} \\\\\n\\frac{\\partial}{\\partial{r}} & \\frac{\\partial}{\\partial{\\theta}} & \\frac{\\partial}{\\partial{z}} \\\\\nF_r & rF_\\theta & F_z\n\\end{array}\n\\right]\n\\]\nApplying this to the function \\(F(r,\\theta, z) = \\hat{\\theta}\\) we get:\n\\[\n\\text{curl}(F) = \\frac{1}{r}\\det\\left[\n\\begin{array}{}\n\\hat{r} & r\\hat{\\theta} & \\hat{k} \\\\\n\\frac{\\partial}{\\partial{r}} & \\frac{\\partial}{\\partial{\\theta}} & \\frac{\\partial}{\\partial{z}} \\\\\n0 & r & 0\n\\end{array}\n\\right] =\n\\frac{1}{r}\\hat{k} \\det\\left[\n\\begin{array}{}\n\\frac{\\partial}{\\partial{r}} & \\frac{\\partial}{\\partial{\\theta}}\\\\\n0 & r\n\\end{array}\n\\right] =\n\\frac{1}{r}\\hat{k}.\n\\]\nAs \\(F\\) represents a vector field that circulates about the \\(z\\) axis, the curl should point in the \\(\\hat{k}\\) direction, as we found; since the field has constant magnitude, the angular rate of rotation decreases with \\(r\\), and the curl decays as \\(1/r\\)."
},
{
"objectID": "integral_vector_calculus/div_grad_curl.html#questions",
"href": "integral_vector_calculus/div_grad_curl.html#questions",
"title": "61  The Gradient, Divergence, and Curl",
"section": "61.8 Questions",
"text": "61.8 Questions\n\nQuestion\nNumerically find the divergence of \\(F(x,y,z) = \\langle xy, yz, zx\\rangle\\) at the point \\(\\langle 1,2,3\\rangle\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nNumerically find the curl of \\(F(x,y,z) = \\langle xy, yz, zx\\rangle\\) at the point \\(\\langle 1,2,3\\rangle\\). What is the \\(x\\) component?\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(F(x,y,z) = \\langle \\sin(x), e^{xy}, xyz\\rangle\\). Find the divergence of \\(F\\) symbolically.\n\n\n\n \n \n \n \n \n \n \n \n \n \\(x e^{x y} + \\cos{\\left (x \\right )}\\)\n \n \n\n\n \n \n \n \n \\(x y + x e^{x y} + \\cos{\\left (x \\right )}\\)\n \n \n\n\n \n \n \n \n \\(x y + x e^{x y}\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(F(x,y,z) = \\langle \\sin(x), e^{xy}, xyz\\rangle\\). Find the curl of \\(F\\) symbolically. What is the \\(x\\) component?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(ye^{xy}\\)\n \n \n\n\n \n \n \n \n \\(xz\\)\n \n \n\n\n \n \n \n \n \\(-yz\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(\\phi(x,y,z) = x + 2y + 3z\\). We know that \\(\\nabla\\times\\nabla{\\phi}\\) is zero by the vanishing property. 
Compute \\(\\nabla\\cdot\\nabla{\\phi}\\).\n\n\n\n \n \n \n \n \n \n \n \n \n \\(6\\)\n \n \n\n\n \n \n \n \n \\(\\vec{0}\\)\n \n \n\n\n \n \n \n \n \\(0\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nIn two dimensions the curl of a gradient field simplifies to:\n\\[\n\\nabla\\times\\nabla{f} = \\nabla\\times\n\\langle\\frac{\\partial{f}}{\\partial{x}},\n\\frac{\\partial{f}}{\\partial{y}}\\rangle =\n\\frac{\\partial{\\frac{\\partial{f}}{\\partial{y}}}}{\\partial{x}} -\n\\frac{\\partial{\\frac{\\partial{f}}{\\partial{x}}}}{\\partial{y}}.\n\\]\n\n\n\n \n \n \n \n \n \n \n \n \n This is \\(0\\) for any \\(f\\), as \\(\\nabla\\times\\nabla\\) is \\(0\\) since the cross product of vector with itself is the \\(0\\) vector.\n \n \n\n\n \n \n \n \n This is \\(0\\) if the partial derivatives are continuous by Schwarz's (Clairault's) theorem\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nBased on this vector-field plot\n\n\n\n\n\nwhich seems likely\n\n\n\n \n \n \n \n \n \n \n \n \n The field is incompressible (divergence free)\n \n \n\n\n \n \n \n \n The field is irrotational (curl free)\n \n \n\n\n \n \n \n \n The field has a non-trivial curl and divergence\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nBased on this vectorfield plot\n\n\n\n\n\nwhich seems likely\n\n\n\n \n \n \n \n \n \n \n \n \n The field is incompressible (divergence free)\n \n \n\n\n \n \n \n \n The field is irrotational (curl free)\n \n \n\n\n \n \n \n \n The field has a non-trivial curl and divergence\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe electric field \\(E\\) (by Maxwells equations) satisfies:\n\n\n\n \n \n \n \n \n \n \n \n \n The field is incompressible (divergence free)\n \n \n\n\n \n \n \n \n The field is irrotational (curl free)\n \n \n\n\n \n \n \n \n The field has a non-trivial curl and divergence\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nThe magnetic field \\(B\\) (by Maxwells equations) satisfies:\n\n\n\n \n \n \n \n \n \n \n \n \n The 
field is incompressible (divergence free)\n \n \n\n\n \n \n \n \n The field is irrotational (curl free)\n \n \n\n\n \n \n \n \n The field has a non-trivial curl and divergence\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFor spherical coordinates, \\(\\Phi(r, \\theta, \\phi)=r \\langle \\sin\\phi\\cos\\theta,\\sin\\phi\\sin\\theta,\\cos\\phi\\rangle\\), the scale factors are \\(h_r = 1\\), \\(h_\\theta=r\\sin\\phi\\), and \\(h_\\phi=r\\).\nThe curl will then be\n\\[\n\\nabla\\times{F} = \\frac{1}{h_r h_\\theta h_\\phi}\\det \\left[\n\\begin{array}{}\n\\hat{e}_r & r\\sin\\phi\\hat{e}_\\theta & r\\hat{e}_\\phi \\\\\n\\frac{\\partial}{\\partial{r}} & \\frac{\\partial}{\\partial{\\theta}} & \\frac{\\partial}{\\partial{\\phi}} \\\\\nF_r & r\\sin\\phi F_\\theta & r F_\\phi\n\\end{array}\n\\right].\n\\]\nFor a radial function \\(F = h(r)e_r\\) (that is, \\(F_r = h(r)\\), \\(F_\\theta=0\\), and \\(F_\\phi=0\\)), what is the curl of \\(F\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(rh'(r)e_\\phi\\)\n \n \n\n\n \n \n \n \n \\(\\vec{0}\\)\n \n \n\n\n \n \n \n \n \\(re_\\phi\\)"
},
{
"objectID": "integral_vector_calculus/stokes_theorem.html",
"href": "integral_vector_calculus/stokes_theorem.html",
"title": "62  Greens Theorem, Stokes Theorem, and the Divergence Theorem",
"section": "",
"text": "This section uses these add-on packages:\nThe fundamental theorem of calculus is a fan favorite, as it reduces a definite integral, \\(\\int_a^b f(x) dx\\), into the evaluation of a related function at two points: \\(F(b)-F(a)\\), the relation being that \\(F\\) is an antiderivative of \\(f\\). It is a favorite, as it makes life much easier than the alternative of computing a limit of a Riemann sum.\nThis relationship can be generalized. The key is to realize that the interval \\([a,b]\\) has boundary \\(\\{a, b\\}\\) (a set) and then expressing the theorem as: the integral around some region of \\(f\\) is the integral, suitably defined, around the boundary of the region for a function related to \\(f\\).\nIn an abstract setting, Stokes theorem says exactly this with the relationship being the exterior derivative. Here we are not as abstract; we discuss below:\nThe related functions will involve the divergence and the curl, previously discussed.\nMany of the examples in this section come from either Strang or Schey.\nTo make the abstract concrete, consider the one dimensional case of finding the definite integral \\(\\int_a^b F'(x) dx\\). The Riemann sum picture at the microscopic level considers a figure like:\nThe total area under the blue curve from \\(a\\) to \\(b\\) is found by adding the area of each segment of the figure.\nLets consider now what an integral over the boundary would mean. The region, or interval, \\([x_{i-1}, x_i]\\) has a boundary that clearly consists of the two points \\(x_{i-1}\\) and \\(x_i\\). If we orient the boundary, as we need to for higher dimensional boundaries, using the outward facing direction, then the oriented boundary at the right-hand end point, \\(x_i\\), would point towards \\(+\\infty\\) and the left-hand end point, \\(x_{i-1}\\), would be oriented to point to \\(-\\infty\\). 
An “integral” on the boundary of \\(F\\) would naturally be \\(F(b) \\times 1\\) plus \\(F(a) \\times -1\\), or \\(F(b)-F(a)\\).\nWith this choice of integral over the boundary, we can see much cancellation arises were we to compute this integral for each piece, as we would have with \\(a=x_0 < x_1 < \\cdots < x_{n-1} < x_n=b\\):\n\\[\n(F(x_1) - F(x_0)) + (F(x_2)-F(x_1)) + \\cdots + (F(x_n) - F(x_{n-1})) = F(x_n) - F(x_0) = F(b) - F(a).\n\\]\nThat is, with this definition for a boundary integral, the interior pieces of the microscopic approximation cancel and the total is just the integral over the oriented macroscopic boundary \\(\\{a, b\\}\\).\nBut each microscopic piece can be reimagined, as\n\\[\nF(x_{i}) - F(x_{i-1}) = \\left(\\frac{F(x_{i}) - F(x_{i-1})}{\\Delta{x}}\\right)\\Delta{x}\n\\approx F'(x_i)\\Delta{x}.\n\\]\nThe approximation could be exact were the mean value theorem used to identify a point in the interval, but we dont pursue that, as the key point is that the right hand side is a Riemann sum approximation for a different integral, in this case the integral \\(\\int_a^b F'(x) dx\\). Passing from the microscopic view to an infinitesimal view, the picture gives two interpretations, leading to the Fundamental Theorem of Calculus:\n\\[\n\\int_a^b F'(x) dx = F(b) - F(a).\n\\]\nThe three theorems of this section, Greens theorem, Stokes theorem, and the divergence theorem, can all be seen in this manner: the sum of microscopic boundary integrals leads to a macroscopic boundary integral of the entire region; whereas, by reinterpretation, the microscopic boundary integrals are viewed as Riemann sums, which in the limit become integrals of a related function over the region."
},
{
"objectID": "integral_vector_calculus/stokes_theorem.html#greens-theorem",
"href": "integral_vector_calculus/stokes_theorem.html#greens-theorem",
"title": "62  Greens Theorem, Stokes Theorem, and the Divergence Theorem",
"section": "62.1 Greens theorem",
"text": "62.1 Greens theorem\nTo continue the above analysis for a higher dimension, we consider the following figure hinting at a decomposition of a macroscopic square into subsequent microscopic sub-squares. The boundary of each square is oriented so that the right hand rule comes out of the picture.\n\n\n\n\n\nConsider the boundary integral \\(\\oint_C F\\cdot\\hat{T} ds\\) around the smallest (green) squares. We have seen that the curl at a point in a direction is given in terms of the limit. Let the plane be the \\(x-y\\) plane, and the \\(\\hat{k}\\) direction be the one coming out of the figure. In the derivation of the curl, we saw that the line integral for circulation around the square satisfies:\n\\[\n\\lim \\frac{1}{\\Delta{x}\\Delta{y}} \\oint_C F \\cdot\\hat{T}ds =\n\\frac{\\partial{F_y}}{\\partial{x}} - \\frac{\\partial{F_x}}{\\partial{y}}.\n\\]\nIf the green squares are small enough, then the line integrals satisfy:\n\\[\n\\oint_C F \\cdot\\hat{T}ds\n\\approx\n\\left(\n\\frac{\\partial{F_y}}{\\partial{x}}\n-\n\\frac{\\partial{F_x}}{\\partial{y}}\n\\right) \\Delta{x}\\Delta{y} .\n\\]\nWe interpret the right hand side as a Riemann sum approximation for the \\(2\\) dimensional integral of the function \\(f(x,y) = \\frac{\\partial{F_y}}{\\partial{x}} - \\frac{\\partial{F_x}}{\\partial{y}}=\\text{curl}(F)\\), the two-dimensional curl. Were the green squares continued to fill out the large blue square, then the sum of these terms would approximate the integral\n\\[\n\\iint_S f(x,y) dA = \\iint_S\n\\left(\\frac{\\partial{F_y}}{\\partial{x}} - \\frac{\\partial{F_x}}{\\partial{y}}\\right) dA\n= \\iint_S \\text{curl}(F) dA.\n\\]\nHowever, the microscopic boundary integrals have cancellations that lead to a macroscopic boundary integral. The sum of \\(\\oint_C F \\cdot\\hat{T}ds\\) over the \\(4\\) green squares will be equal to \\(\\oint_{C_r} F\\cdot\\hat{T}ds\\), where \\(C_r\\) is the red square, as the interior line integral pieces will all cancel off. 
The sum of \\(\\oint_{C_r} F \\cdot\\hat{T}ds\\) over the \\(4\\) red squares will equal \\(\\oint_{C_b} F \\cdot\\hat{T}ds\\), where \\(C_b\\) is the oriented path around the blue square, as again the interior line pieces will cancel off. Etc.\nThis all suggests that the flow integral around the surface of the larger region (the blue square) is equivalent to the integral of the curl component over the region. This is Greens theorem, as stated by Wikipedia:\n\nGreens theorem: Let \\(C\\) be a positively oriented, piecewise smooth, simple closed curve in the plane, and let \\(D\\) be the region bounded by \\(C\\). If \\(F=\\langle F_x, F_y\\rangle\\), is a vector field on an open region containing \\(D\\) having continuous partial derivatives then:\n\\[\n\\oint_C F\\cdot\\hat{T}ds =\n\\iint_D \\left(\n\\frac{\\partial{F_y}}{\\partial{x}} - \\frac{\\partial{F_x}}{\\partial{y}}\n\\right) dA=\n\\iint_D \\text{curl}(F)dA.\n\\]\n\nThe statement of the theorem applies only to regions whose boundaries are simple closed curves. Not all simple regions have such boundaries. An annulus for example. This is a restriction that will be generalized.\n\n62.1.1 Examples\nSome examples, following Strang, are:\n\nComputing area\nLet \\(F(x,y) = \\langle -y, x\\rangle\\). Then \\(\\frac{\\partial{F_y}}{\\partial{x}} - \\frac{\\partial{F_x}}{\\partial{y}}=2\\), so\n\\[\n\\frac{1}{2}\\oint_C F\\cdot\\hat{T}ds = \\frac{1}{2}\\oint_C (xdy - ydx) =\n\\iint_D dA = A(D).\n\\]\nThis gives a means to compute the area of a region by integrating around its boundary.\n\nTo compute the area of an ellipse, we have:\n\nF(x,y) = [-y,x]\nF(v) = F(v...)\n\nr(t) = [a*cos(t),b*sin(t)]\n\n@syms a::positive b::positive t\n(1//2) * integrate( F(r(t)) ⋅ diff.(r(t),t), (t, 0, 2PI))\n\n \n\\[\n\\pi a b\n\\]\n\n\n\nTo compute the area of the triangle with vertices \\((0,0)\\), \\((a,0)\\) and \\((0,b)\\) we can orient the boundary counter clockwise. 
Let \\(A\\) be the line segment from \\((0,b)\\) to \\((0,0)\\), \\(B\\) be the line segment from \\((0,0)\\) to \\((a,0)\\), and \\(C\\) be the other. Then\n\\[\n\\begin{align}\n\\frac{1}{2} \\int_A F\\cdot\\hat{T} ds &=\\frac{1}{2} \\int_A -ydx = 0\\\\\n\\frac{1}{2} \\int_B F\\cdot\\hat{T} ds &=\\frac{1}{2} \\int_B xdy = 0,\n\\end{align}\n\\]\nas on \\(A\\), \\(x=0\\) and \\(dx=0\\) and on \\(B\\), \\(y=0\\) and \\(dy=0\\).\nOn \\(C\\) we have \\(\\vec{r}(t) = (0, b) + t\\cdot(1,-b/a) =\\langle t, b-(bt)/a\\rangle\\) from \\(t=a\\) to \\(0\\):\n\\[\n\\int_C F\\cdot \\frac{d\\vec{r}}{dt} dt =\n\\int_a^0 \\langle -b + (bt)/a, t\\rangle\\cdot\\langle 1, -b/a\\rangle dt\n= \\int_a^0 -b dt = -bt\\mid_{a}^0 = ba.\n\\]\nMultiplying by \\(1/2\\) gives the familiar answer \\(A=(1/2) a b\\).\n\n\nConservative fields\nA vector field is conservative if path integrals for work are independent of the path. We have seen that a vector field that is the gradient of a scalar field will be conservative and vice versa. This led to the vanishing identity \\(\\nabla\\times\\nabla(f) = 0\\) for a scalar field \\(f\\).\nIs the converse true? Namely, if for some vector field \\(F\\), \\(\\nabla\\times{F}\\) is identically \\(0\\), is the field conservative?\nThe answer is yes if the vector field has continuous partial derivatives and the curl is \\(0\\) in a simply connected domain.\nFor the two dimensional case the curl is a scalar. If \\(F = \\langle F_x, F_y\\rangle = \\nabla{f}\\) is conservative, then \\(\\partial{F_y}/\\partial{x} - \\partial{F_x}/\\partial{y} = 0\\).\nNow assume \\(\\partial{F_y}/\\partial{x} - \\partial{F_x}/\\partial{y} = 0\\). Let \\(P\\) and \\(Q\\) be two points in the plane. Take any path, \\(C_1\\) from \\(P\\) to \\(Q\\) and any return path, \\(C_2\\), from \\(Q\\) to \\(P\\) that do not cross and such that \\(C\\), the concatenation of the two paths, satisfies Greens theorem. 
Then, as \\(F\\) is continuous on an open interval containing \\(D\\), we have:\n\\[\n\\begin{align*}\n0 &= \\iint_D 0 dA \\\\\n&=\n\\iint_D \\left(\\partial{F_y}/\\partial{x} - \\partial{F_x}/\\partial{y}\\right)dA \\\\\n&=\n\\oint_C F \\cdot \\hat{T} ds \\\\\n&=\n\\int_{C_1} F \\cdot \\hat{T} ds + \\int_{C_2}F \\cdot \\hat{T} ds.\n\\end{align*}\n\\]\nReversing \\(C_2\\) to go from \\(P\\) to \\(Q\\), we see the two work integrals are identical, that is the field is conservative.\nSummarizing:\n\nIf \\(F=\\nabla{f}\\) then \\(F\\) is conservative.\nIf \\(F=\\langle F_x, F_y\\rangle\\) has continuous partial derivatives in a simply connected open region with \\(\\partial{F_y}/\\partial{x} - \\partial{F_x}/\\partial{y}=0\\), then in that region \\(F\\) is conservative and can be represented as the gradient of a scalar function.\n\nFor example, let \\(F(x,y) = \\langle \\sin(xy), \\cos(xy) \\rangle\\). Is this a conservative vector field?\nWe can check by taking partial derivatives. Those of interest are:\n\\[\n\\begin{align}\n\\frac{\\partial{F_y}}{\\partial{x}} &= \\frac{\\partial{(\\cos(xy))}}{\\partial{x}} =\n-\\sin(xy) y,\\\\\n\\frac{\\partial{F_x}}{\\partial{y}} &= \\frac{\\partial{(\\sin(xy))}}{\\partial{y}} =\n\\cos(xy)x.\n\\end{align}\n\\]\nIt is not the case that \\(\\partial{F_y}/\\partial{x} - \\partial{F_x}/\\partial{y}=0\\), so this vector field is not conservative.\n\nThe conditions of Greens theorem are important, as this next example shows.\nLet \\(D\\) be the unit disc, \\(C\\) the unit circle parameterized counter clockwise.\nLet \\(R(x,y) = \\langle -y, x\\rangle\\) be a rotation field and \\(F(x,y) = R(x,y)/(R(x,y)\\cdot R(x,y))\\). 
Then:\n\n@syms x::real y::real z::real t::real\n\n(x, y, z, t)\n\n\n\nR(x,y) = [-y,x]\nF(x,y) = R(x,y)/(R(x,y)⋅R(x,y))\n\nFx, Fy = F(x,y)\ndiff(Fy, x) - diff(Fx, y) |> simplify\n\n \n\\[\n0\n\\]\n\n\n\nAs the integrand is \\(0\\), \\(\\iint_D \\left( \\partial{F_y}/{\\partial{x}}-\\partial{F_x}/{\\partial{y}}\\right)dA = 0\\), as well. But,\n\\[\nF\\cdot\\hat{T} = \\frac{R}{R\\cdot{R}} \\cdot \\frac{R}{R\\cdot{R}} = \\frac{R\\cdot{R}}{(R\\cdot{R})^2} = \\frac{1}{R\\cdot{R}},\n\\]\nso \\(\\oint_C F\\cdot\\hat{T}ds = 2\\pi\\), \\(C\\) being the unit circle so \\(R\\cdot{R}=1\\).\nThat is, for this example, Greens theorem does not apply, as the two integrals are not the same. What isnt satisfied in the theorem? \\(F\\) is not continuous at the origin and our curve \\(C\\) defining \\(D\\) encircles the origin. So, \\(F\\) does not have continuous partial derivatives, as is required for the theorem.\n\n\nMore complicated boundary curves\nA simple closed curve is one that does not cross itself. Greens theorem applies to regions bounded by curves which have finitely many crosses provided the orientation used is consistent throughout.\nConsider the curve \\(y = f(x)\\), \\(a \\leq x \\leq b\\), assuming \\(f\\) is continuous, \\(f(a) > 0\\), and \\(f(b) < 0\\). We can use Greens theorem to compute the signed “area” under \\(f\\) if we consider the curve in \\(R^2\\) from \\((b,0)\\) to \\((a,0)\\) to \\((a, f(a))\\), to \\((b, f(b))\\) and back to \\((b,0)\\) in that orientation. This will cross at each zero of \\(f\\).\n\n\n\n\n\nLet \\(A\\) label the red line, \\(B\\) the green curve, \\(C\\) the blue line, and \\(D\\) the black line. Then the area is given from Greens theorem by considering half of the line integral of \\(F(x,y) = \\langle -y, x\\rangle\\) or \\(\\oint_C (xdy - ydx)\\). 
To that end we have:\n\\[\n\\begin{align}\n\\int_A (xdy - ydx) &= a f(a)\\\\\n\\int_C (xdy - ydx) &= b(-f(b))\\\\\n\\int_D (xdy - ydx) &= 0\\\\\n\\end{align}\n\\]\nFinally the integral over \\(B\\), using integration by parts:\n\\[\n\\begin{align}\n\\int_B F(\\vec{r}(t))\\cdot \\frac{d\\vec{r}(t)}{dt} dt &=\n\\int_b^a \\langle -f(t),t\\rangle\\cdot\\langle 1, f'(t)\\rangle dt\\\\\n&= \\int_a^b f(t)dt - \\int_a^b tf'(t)dt\\\\\n&= \\int_a^b f(t)dt - \\left(tf(t)\\mid_a^b - \\int_a^b f(t) dt\\right).\n\\end{align}\n\\]\nCombining, we have after cancellation \\(\\oint (xdy - ydx) = 2\\int_a^b f(t) dt\\), or after dividing by \\(2\\) the signed area under the curve.\n\nThe region may not be simply connected. A simple case might be the annulus: \\(1 \\leq x^2 + y^2 \\leq 4\\). In this figure we introduce a cut to make a simply connected region.\n\n\n\n\n\nThe cut leads to a counter-clockwise orientation on the outer ring and a clockwise orientation on the inner ring. If this cut becomes so thin as to vanish, then the line integrals along the lines introducing the cut will cancel off and we have a boundary consisting of two curves with opposite orientations. 
(If we follow either orientation the closed figure is on the left.)\nTo see that the area integral of \\(F(x,y) = (1/2)\\langle -y, x\\rangle\\) produces the area for this orientation we have, using \\(C_1\\) as the outer ring, and \\(C_2\\) as the inner ring:\n\\[\n\\begin{align}\n\\oint_{C_1} F \\cdot \\hat{T} ds &=\n\\int_0^{2\\pi} (1/2)(2)\\langle -\\sin(t), \\cos(t)\\rangle \\cdot (2)\\langle-\\sin(t), \\cos(t)\\rangle dt \\\\\n&= (1/2) (2\\pi) 4 = 4\\pi\\\\\n\\oint_{C_2} F \\cdot \\hat{T} ds &=\n\\int_{0}^{2\\pi} (1/2) \\langle \\sin(t), \\cos(t)\\rangle \\cdot \\langle-\\sin(t), -\\cos(t)\\rangle dt\\\\\n&= -(1/2)(2\\pi) = -\\pi.\n\\end{align}\n\\]\n(Using \\(\\vec{r}(t) = 2\\langle \\cos(t), \\sin(t)\\rangle\\) for the outer ring and \\(\\vec{r}(t) = 1\\langle \\cos(t), -\\sin(t)\\rangle\\) for the inner ring.)\nAdding the two gives \\(4\\pi - \\pi = \\pi \\cdot(b^2 - a^2)\\), with \\(b=2\\) and \\(a=1\\).\n\n\nFlow not flux\nGreens theorem has a complement in terms of flow across \\(C\\). As \\(C\\) is positively oriented (so the bounded interior piece is on the left of \\(\\hat{T}\\) as the curve is traced), a normal comes by rotating \\(90^\\circ\\) counterclockwise. That is if \\(\\hat{T} = \\langle a, b\\rangle\\), then \\(\\hat{N} = \\langle b, -a\\rangle\\).\nLet \\(F = \\langle F_x, F_y \\rangle\\) and \\(G = \\langle F_y, -F_x \\rangle\\), then \\(G\\cdot\\hat{T} = -F\\cdot\\hat{N}\\). 
The curl formula applied to \\(G\\) becomes\n\\[\n\\frac{\\partial{G_y}}{\\partial{x}} - \\frac{\\partial{G_x}}{\\partial{y}} =\n\\frac{\\partial{-F_x}}{\\partial{x}}-\\frac{\\partial{(F_y)}}{\\partial{y}}\n=\n-\\left(\\frac{\\partial{F_x}}{\\partial{x}} + \\frac{\\partial{F_y}}{\\partial{y}}\\right)=\n-\\nabla\\cdot{F}.\n\\]\nGreens theorem applied to \\(G\\) then gives this formula for \\(F\\):\n\\[\n\\oint_C F\\cdot\\hat{N} ds =\n-\\oint_C G\\cdot\\hat{T} ds =\n-\\iint_D (-\\nabla\\cdot{F})dA =\n\\iint_D \\nabla\\cdot{F}dA.\n\\]\nThe right hand side integral is the \\(2\\)-dimensional divergence, so this has the interpretation that the flux through \\(C\\) (\\(\\oint_C F\\cdot\\hat{N} ds\\)) is the integral of the divergence. (The divergence is defined in terms of a limit of this picture, so this theorem extends the microscopic view to a bigger view.)\nRather than leave this as an algebraic consequence, we sketch out how this could be intuitively argued from a microscopic picture, the reason being similar to that for the curl, where we considered the small green boxes. In the generalization to dimension \\(3\\) both arguments are needed for our discussion:\nConsider now a \\(2\\)-dimensional region split into microscopic boxes; we focus now on two adjacent boxes, \\(A\\) and \\(B\\):\n\n\n\n\n\nThe integrand \\(F\\cdot\\hat{N}\\) for \\(A\\) will differ from that for \\(B\\) by a minus sign, as the field is the same, but the normal carries an opposite sign. Hence the contribution to the line integral around \\(A\\) along this part of the box partition will cancel out with that around \\(B\\). 
The only part of the line integral that will not cancel out for such a partition will be the boundary pieces of the overall shape.\nThis figure shows in red the parts of the line integrals that will cancel for a more refined grid.\n\n\n\n\n\nAgain, the microscopic boundary integrals when added will give a macroscopic boundary integral due to cancellations.\nBut, as seen in the derivation of the divergence, only modified for \\(2\\) dimensions, we have \\(\\nabla\\cdot{F} = \\lim \\frac{1}{\\Delta S} \\oint_C F\\cdot\\hat{N}\\,ds\\), so for each cell\n\\[\n\\oint_{C_i} F\\cdot\\hat{N}\\,ds \\approx \\left(\\nabla\\cdot{F}\\right)\\Delta{x}\\Delta{y},\n\\]\nan approximating Riemann sum for \\(\\iint_D \\nabla\\cdot{F} dA\\). This yields:\n\\[\n\\oint_C (F \\cdot\\hat{N}) ds =\n\\sum_i \\oint_{C_i} (F \\cdot\\hat{N}) ds \\approx\n\\sum \\left(\\nabla\\cdot{F}\\right)\\Delta{x}\\Delta{y} \\approx\n\\iint_S \\nabla\\cdot{F}dA,\n\\]\nthe approximation signs becoming equals signs in the limit.\n\nExample\nLet \\(F(x,y) = \\langle ax , by\\rangle\\), and \\(D\\) be the square with side length \\(2\\) centered at the origin. Verify that the flow form of Greens theorem holds.\nThe divergence is simply \\(a + b\\), so \\(\\iint_D (a+b)dA = (a+b)A(D) = 4(a+b)\\).\nThe integral of the flow across \\(C\\) consists of \\(4\\) parts. By symmetry, they all should be similar. We consider the line segment connecting \\((1,-1)\\) to \\((1,1)\\) (which has the proper counterclockwise orientation and outward normal \\(\\langle 1, 0\\rangle\\)):\n\\[\n\\int_C F \\cdot \\hat{N} ds=\n\\int_{-1}^1 \\langle F_x, F_y\\rangle\\cdot\\langle 1, 0\\rangle ds =\n\\int_{-1}^1 a dy = 2a.\n\\]\nIntegrating across the top will give \\(2b\\), along the bottom \\(2b\\), and along the left side \\(2a\\), totaling \\(4(a+b)\\).\n\nNext, let \\(F(x,y) = \\langle -y, x\\rangle\\). 
This field rotates, and we see has no divergence, as \\(\\partial{F_x}/\\partial{x} = \\partial{(-y)}/\\partial{x} = 0\\) and \\(\\partial{F_y}/\\partial{y} = \\partial{x}/\\partial{y} = 0\\). As such, the area integral in Greens theorem is \\(0\\). As well, \\(F\\) is parallel to \\(\\hat{T}\\) so orthogonal to \\(\\hat{N}\\), hence \\(\\oint F\\cdot\\hat{N}ds = \\oint 0ds = 0\\). For any region \\(S\\) there is no net flow across the boundary and no source or sink of flow inside.\n\n\nExample: stream functions\nStrang compiles the following equivalencies (one implies the others) for when the total flux is \\(0\\) for a vector field with continuous partial derivatives:\n\n\\[\n\\oint F\\cdot\\hat{N} ds = 0\n\\]\nfor all curves connecting \\(P\\) to \\(Q\\), \\(\\int_C F\\cdot\\hat{N}\\) has the same value\nThere is a stream function \\(g(x,y)\\) for which \\(F_x = \\partial{g}/\\partial{y}\\) and \\(F_y = -\\partial{g}/\\partial{x}\\). (This says \\(\\nabla{g}\\) is orthogonal to \\(F\\).)\nthe components have zero divergence: \\(\\partial{F_x}/\\partial{x} + \\partial{F_y}/\\partial{y} = 0\\).\n\nStrang calls these fields source free as the divergence is \\(0\\).\nA stream function plays the role of a scalar potential, but note the minus sign and order of partial derivatives. These are accounted for by saying \\(\\langle F_x, F_y, 0\\rangle = \\nabla\\times\\langle 0, 0, g\\rangle\\), in Cartesian coordinates. Streamlines are tangent to the flow of the velocity vector of the flow and in two dimensions are perpendicular to field lines formed by the gradient of a scalar function.\nPotential flow uses a scalar potential function to describe the velocity field through \\(\\vec{v} = \\nabla{f}\\). As such, potential flow is irrotational due to the curl of a conservative field being the zero vector. Restricting to two dimensions, this says the partials satisfy \\(\\partial{v_y}/\\partial{x} - \\partial{v_x}/\\partial{y} = 0\\). 
For an incompressible flow (like water) the velocity will have \\(0\\) divergence too. That is \\(\\nabla\\cdot\\nabla{f} = 0\\) - \\(f\\) satisfies Laplaces equation.\nBy the equivalencies above, an incompressible potential flow means in addition to a potential function, \\(f\\), there is a stream function \\(g\\) satisfying \\(v_x = \\partial{g}/\\partial{y}\\) and \\(v_y=-\\partial{g}/\\partial{x}\\).\nThe gradient of \\(f=\\langle v_x, v_y\\rangle\\) is orthogonal to the contour lines of \\(f\\). The gradient of \\(g=\\langle -v_y, v_x\\rangle\\) is orthogonal to the gradient of \\(f\\), so are tangents to the contour lines of \\(f\\). Reversing, the gradient of \\(f\\) is tangent to the contour lines of \\(g\\). If the flow follows the velocity field, then the contour lines of \\(g\\) indicate the flow of the fluid.\nAs an example consider the following in polar coordinates:\n\\[\nf(r, \\theta) = A r^n \\cos(n\\theta),\\quad\ng(r, \\theta) = A r^n \\sin(n\\theta).\n\\]\nThe constant \\(A\\) just sets the scale, the parameter \\(n\\) has a qualitative effect on the contour lines. Consider \\(n=2\\) visualized below:\n\ngr() # pyplot doesn't like the color as specified below.\nn = 2\nf(r,theta) = r^n * cos(n*theta)\ng(r, theta) = r^n * sin(n*theta)\n\nf(v) = f(v...); g(v)= g(v...)\n\nΦ(x,y) = [sqrt(x^2 + y^2), atan(y,x)]\nΦ(v) = Φ(v...)\n\nxs = ys = range(-2,2, length=50)\np = contour(xs, ys, f∘Φ, color=:red, legend=false, aspect_ratio=:equal)\ncontour!(p, xs, ys, g∘Φ, color=:blue, linewidth=3)\n#pyplot()\np\n\n\n\n\nThe fluid would flow along the blue (stream) lines. The red lines have equal potential along the line."
},
{
"objectID": "integral_vector_calculus/stokes_theorem.html#stokes-theorem",
"href": "integral_vector_calculus/stokes_theorem.html#stokes-theorem",
"title": "62  Greens Theorem, Stokes Theorem, and the Divergence Theorem",
"section": "62.2 Stokes theorem",
"text": "62.2 Stokes theorem\n\n\n\nThe Jiffy Pop popcorn design has a top surface that is designed to expand to accommodate the popped popcorn. Viewed as a surface, the surface area grows, but the boundary - where the surface meets the pan - stays the same. This is an example showing that many different surfaces can have the same bounding curve. Stokes theorem will relate a surface integral over the surface to a line integral about the bounding curve.\n\n\nWere the figure of Jiffy Pop popcorn animated, the foil surface would slowly expand due to the pressure of the popping popcorn until the popcorn was ready. However, the boundary would remain the same. Many different surfaces can have the same boundary. Take for instance the upper half unit sphere in \\(R^3\\), which has the curve \\(x^2 + y^2 = 1\\) as its boundary curve. This is the same boundary curve as that of the surface \\(z = 1 - (x^2 + y^2)\\) that lies above the \\(x-y\\) plane. This would also be the same curve as the surface formed by a Mickey Mouse glove if the collar were scaled and positioned onto the unit circle.\nImagine if instead of the retro labeling, a rectangular grid were drawn on the surface of the Jiffy Pop popcorn before popping. By Greens theorem, the integral of the curl of a vector field \\(F\\) over this surface reduces to just an accompanying line integral over the boundary, \\(C\\), where the orientation of \\(C\\) is in the \\(\\hat{k}\\) direction. 
The intuitive derivation is that the curl integral over the grid will have cancellations due to adjacent cells having shared paths being traversed in both directions.\nNow imagine the popcorn expanding, but rather than worry about burning, focusing instead on what happens to the integral of the curl in the direction of the normal, we have\n\\[\n\\nabla\\times{F} \\cdot\\hat{N} = \\lim \\frac{1}{\\Delta{S}} \\oint_C F\\cdot\\hat{T} ds\n\\approx \\frac{1}{\\Delta{S}} F\\cdot\\hat{T} \\Delta{s}.\n\\]\nThis gives the series of approximations:\n\\[\n\\begin{align*}\n\\oint_C F\\cdot\\hat{T} ds &=\n\\sum \\oint_{C_i} F\\cdot\\hat{T} ds \\\\\n&\\approx\n\\sum F\\cdot\\hat{T} \\Delta s \\\\\n&\\approx\n\\sum \\nabla\\times{F}\\cdot\\hat{N} \\Delta{S} \\\\\n&\\approx\n\\iint_S \\nabla\\times{F}\\cdot\\hat{N} dS.\n\\end{align*}\n\\]\nIn terms of our expanding popcorn, the boundary integral - after accounting for cancellations, as in Greens theorem - can be seen as a microscopic sum of boundary integrals, each of which is approximated by a term \\(\\nabla\\times{F}\\cdot\\hat{N} \\Delta{S}\\), which is viewed as a Riemann sum approximation for the integral of the curl over the surface. The cancellation depends on a proper choice of orientation, but with that we have:\n\nStokes theorem: Let \\(S\\) be an orientable smooth surface in \\(R^3\\) with boundary \\(C\\), \\(C\\) oriented so that the chosen normal for \\(S\\) agrees with the right-hand rule for \\(C\\)s orientation. Then if \\(F\\) has continuous partial derivatives\n\\[\n\\oint_C F \\cdot\\hat{T} ds = \\iint_S (\\nabla\\times{F})\\cdot\\hat{N} dA.\n\\]\n\nGreens theorem is an immediate consequence upon viewing the region in \\(R^2\\) as a surface in \\(R^3\\) with normal \\(\\hat{k}\\).\n\n62.2.1 Examples\n\nExample\nOur first example involves just an observation. 
For any simply connected surface \\(S\\) without boundary (such as a sphere) the integral \\(\\oint_S \\nabla\\times{F}dS=0\\), as the line integral around the boundary must be \\(0\\), as there is no boundary.\n\n\nExample\nLet \\(F(x,y,z) = \\langle x^2, 0, y^2\\rangle\\) and \\(C\\) be the circle \\(x^2 + z^2 = 1\\) with \\(y=0\\). Find \\(\\oint_C F\\cdot\\hat{T}ds\\).\nWe can use Stokes theorem with the surface being just the disc, so that \\(\\hat{N} = \\hat{j}\\). This makes the computation easy:\n\nFₛ(x,y,z) = [x^2, 0, y^2]\nCurlFₛ = curl(Fₛ(x,y,z), [x,y,z])\n\n3-element Vector{Sym}:\n 2⋅y\n 0\n 0\n\n\nWe have \\(\\nabla\\times{F}\\cdot\\hat{N} = 0\\), so the answer is \\(0\\).\nWe could have directly computed this. Let \\(r(t) = \\langle \\cos(t), 0, \\sin(t)\\rangle\\). Then we have:\n\nrₛ(t) = [cos(t), 0, sin(t)]\nrpₛ = diff.(rₛ(t), t)\nintegrandₛ = Fₛ(rₛ(t)...) ⋅ rpₛ\n\n \n\\[\n- \\sin{\\left(t \\right)} \\cos^{2}{\\left(t \\right)}\n\\]\n\n\n\nThe integrand isnt obviously going to yield \\(0\\) for the integral, but through symmetry:\n\nintegrate(integrandₛ, (t, 0, 2PI))\n\n \n\\[\n0\n\\]\n\n\n\n\n\nExample: Amperes circuital law\n(Schey) Suppose a current \\(I\\) flows along a line and \\(C\\) is a path encircling the current with orientation such that the right hand rule points in the direction of the current flow.\nAmperes circuital law relates the line integral of the magnetic field to the induced current through:\n\\[\n\\oint_C B\\cdot\\hat{T} ds = \\mu_0 I.\n\\]\nThe goal here is to re-express this integral law to produce a law at each point of the field. Let \\(S\\) be a surface with boundary \\(C\\), Let \\(J\\) be the current density - \\(J=\\rho v\\), with \\(\\rho\\) the density of the current (not time-varying) and \\(v\\) the velocity. The current can be re-expressed as \\(I = \\iint_S J\\cdot\\hat{n}dA\\). 
(If the current flows through a wire and \\(S\\) is much bigger than the wire, this is still valid as \\(\\rho=0\\) outside of the wire.)\nWe then have:\n\\[\n\\mu_0 \\iint_S J\\cdot\\hat{N}dA =\n\\mu_0 I =\n\\oint_C B\\cdot\\hat{T} ds =\n\\iint_S (\\nabla\\times{B})\\cdot\\hat{N}dA.\n\\]\nAs \\(S\\) and \\(C\\) are arbitrary, this implies the integrands of the surface integrals are equal, or:\n\\[\n\\nabla\\times{B} = \\mu_0 J.\n\\]\n\n\nExample: Faradays law\n(Strang) Suppose \\(C\\) is a wire and there is a time-varying magnetic field \\(B(t)\\). Then Faradays law says the flux of the magnetic field through a surface \\(S\\) with boundary \\(C\\), \\(\\phi = \\iint B\\cdot\\hat{N}dS\\), induces an electric field \\(E\\) that does work:\n\\[\n\\oint_C E\\cdot\\hat{T}ds = -\\frac{\\partial{\\phi}}{\\partial{t}}.\n\\]\nFaradays law is an empirical statement. Stokes theorem can be used to produce one of Maxwells equations. For any surface \\(S\\), as above with its boundary being \\(C\\), we have both:\n\\[\n-\\iint_S \\left(\\frac{\\partial{B}}{\\partial{t}}\\cdot\\hat{N}\\right)dS =\n-\\frac{\\partial{\\phi}}{\\partial{t}} =\n\\oint_C E\\cdot\\hat{T}ds =\n\\iint_S (\\nabla\\times{E})\\cdot\\hat{N}\\, dS.\n\\]\nThis is true for any capping surface for \\(C\\). Shrinking \\(C\\) to a point means it will hold for each point in \\(R^3\\). 
That is:\n\\[\n\\nabla\\times{E} = -\\frac{\\partial{B}}{\\partial{t}}.\n\\]\n\n\nExample: Conservative fields\nGreens theorem gave a characterization of \\(2\\)-dimensional conservative fields; Stokes theorem provides a characterization for \\(3\\) dimensional conservative fields (with continuous derivatives):\n\nThe work \\(\\oint_C F\\cdot\\hat{T} ds = 0\\) for every closed path\nThe work \\(\\int_P^Q F\\cdot\\hat{T} ds\\) is independent of the path between \\(P\\) and \\(Q\\)\nfor a scalar potential function \\(\\phi\\), \\(F = \\nabla{\\phi}\\)\nThe curl satisfies: \\(\\nabla\\times{F} = \\vec{0}\\) (and the domain is simply connected).\n\nStokes theorem can be used to show the first and fourth are equivalent.\nFirst, if \\(0 = \\oint_C F\\cdot\\hat{T} ds\\), then by Stokes theorem \\(0 = \\iint_S (\\nabla\\times{F})\\cdot\\hat{N}\\, dS\\) for any orientable surface \\(S\\) with boundary \\(C\\). For a given point, letting \\(C\\) shrink to that point can be used to see that the curl must be \\(\\vec{0}\\) at that point.\nConversely, if the curl is zero in a simply connected region, then take any simple closed curve, \\(C\\) in the region. If the region is simply connected then there exists an orientable surface, \\(S\\) in the region with boundary \\(C\\) for which: \\(\\oint_C F\\cdot\\hat{T} ds = \\iint_S (\\nabla\\times{F})\\cdot\\hat{N}dS= \\iint_S \\vec{0}\\cdot\\hat{N}dS = 0\\).\nThe construction of a scalar potential function from the field can be done as illustrated in this next example.\nTake \\(F = \\langle yz^2, xz^2, 2xyz \\rangle\\). Verify \\(F\\) is conservative and find a scalar potential \\(\\phi\\).\nTo verify that \\(F\\) is conservative, we find its curl to see that it is \\(\\vec{0}\\):\n\nF(x,y,z) = [y*z^2, x*z^2, 2*x*y*z]\ncurl(F(x,y,z), [x,y,z])\n\n3-element Vector{Sym}:\n 0\n 0\n 0\n\n\nWe need \\(\\phi\\) with \\(\\partial{\\phi}/\\partial{x} = F_x = yz^2\\). 
To that end, we integrate in \\(x\\):\n\\[\n\\phi(x,y,z) = \\int yz^2 dx = xyz^2 + g(y,z),\n\\]\nthe function \\(g(y,z)\\) is a “constant” of integration (it doesnt depend on \\(x\\)). That \\(\\partial{\\phi}/\\partial{x} = F_x\\) is true is easy to verify. Now, consider the partial in \\(y\\):\n\\[\n\\frac{\\partial{\\phi}}{\\partial{y}} = xz^2 + \\frac{\\partial{g}}{\\partial{y}} = F_y = xz^2.\n\\]\nSo we have \\(\\frac{\\partial{g}}{\\partial{y}}=0\\) or \\(g(y,z) = h(z)\\), some constant in \\(y\\). Finally, we must have \\(\\partial{\\phi}/\\partial{z} = F_z\\), or\n\\[\n\\frac{\\partial{\\phi}}{\\partial{z}} = 2xyz + h'(z) = F_z = 2xyz,\n\\]\nSo \\(h'(z) = 0\\). This value can be any constant, even \\(0\\) which we take, so that \\(g(y,z) = 0\\) and \\(\\phi(x,y,z) = xyz^2\\) is a scalar potential for \\(F\\).\n\n\nExample\nLet \\(F(x,y,z) = \\nabla(xy^2z^3) = \\langle y^2z^3, 2xyz^3, 3xy^2z^2\\rangle\\). Show that the line integrals around the unit circle in the \\(x-y\\) plane and the \\(y-z\\) planes are \\(0\\), as \\(F\\) is conservative.\n\nFxyz = ∇(x*y^2*z^3)\n\n3-element Vector{Sym}:\n y^2*z^3\n 2*x*y*z^3\n 3*x*y^2*z^2\n\n\n\nr(t) = [cos(t), sin(t), 0]\nrp = diff.(r(t), t)\nFt = subs.(Fxyz, x .=> r(t)[1], y.=> r(t)[2], z .=> r(t)[3])\nintegrate(Ft ⋅ rp, (t, 0, 2PI))\n\n \n\\[\n0\n\\]\n\n\n\n(This is trivial, as Ft is \\(0\\), as each term has a \\(z\\) factor of \\(0\\).)\nIn the \\(y-z\\) plane we have:\n\nr(t) = [0, cos(t), sin(t)]\nrp = diff.(r(t), t)\nFt = subs.(Fxyz, x .=> r(t)[1], y.=> r(t)[2], z .=> r(t)[3])\nintegrate(Ft ⋅ rp, (t, 0, 2PI))\n\n \n\\[\n0\n\\]\n\n\n\nThis is also easy, as Ft has only an x component and rp has only y and z components, so the two are orthogonal.\n\n\nExample\nIn two dimensions the vector field \\(F(x,y) = \\langle -y, x\\rangle/(x^2+y^2) = S(x,y)/\\|R\\|^2\\) is irrotational (\\(0\\) curl) and has \\(0\\) divergence, but is not conservative in \\(R^2\\), as with \\(C\\) being the unit disk we have \\(\\oint_C 
F\\cdot\\hat{T}ds = \\int_0^{2\\pi} \\langle -\\sin(\\theta),\\cos(\\theta)\\rangle \\cdot \\langle-\\sin(\\theta), \\cos(\\theta)\\rangle/1 d\\theta = 2\\pi\\). This is because \\(F\\) is not continuously differentiable at the origin, so the path \\(C\\) is not in a simply connected domain where \\(F\\) is continuously differentiable. (Were \\(C\\) a closed curve not encircling the origin, the integral would be \\(0\\).)\nIn three dimensions, removing a single point from a domain does not change simple connectedness, but removing an entire line will. So the function \\(F(x,y,z) =\\langle -y,x,0\\rangle/(x^2+y^2)\\) will have \\(0\\) curl, \\(0\\) divergence, but wont be conservative in a domain that includes the \\(z\\) axis.\nHowever, the function \\(F(x,y,z) = \\langle x, y,z\\rangle/\\sqrt{x^2+y^2+z^2}\\) has curl \\(0\\), except at the origin. But \\(R^3\\) less the origin, as a domain, is simply connected, so \\(F\\) will be conservative."
},
{
"objectID": "integral_vector_calculus/stokes_theorem.html#divergence-theorem",
"href": "integral_vector_calculus/stokes_theorem.html#divergence-theorem",
"title": "62  Greens Theorem, Stokes Theorem, and the Divergence Theorem",
"section": "62.3 Divergence theorem",
"text": "62.3 Divergence theorem\nThe divergence theorem is a consequence of a simple observation. Consider two adjacent cubic regions that share a common face. The boundary integral, \\(\\oint_S F\\cdot\\hat{N} dA\\), can be computed for each cube. The surface integral requires a choice of normal, and the convention is to use the outward pointing normal. The common face of the two cubes has different outward pointing normals, the difference being a minus sign. As such, the contribution of the surface integral over this face for one cube is cancelled out by the contribution of the surface integral over this face for the adjacent cube. As with Greens theorem, this means for a cubic partition, that only the contribution over the boundary is needed to compute the boundary integral. In formulas, if \\(V\\) is a \\(3\\) dimensional cubic region with boundary \\(S\\) and it is partitioned into smaller cubic subregions, \\(V_i\\) with surfaces \\(S_i\\), we have:\n\\[\n\\oint_S F\\cdot{N} dA = \\sum \\oint_{S_i} F\\cdot{N} dA.\n\\]\nIf the partition provides a microscopic perspective, then the divergence approximation \\(\\nabla\\cdot{F} \\approx (1/\\Delta{V_i}) \\oint_{S_i} F\\cdot{N} dA\\) can be used to say:\n\\[\n\\oint_S F\\cdot{N} dA =\n\\sum \\oint_{S_i} F\\cdot{N} dA \\approx\n\\sum (\\nabla\\cdot{F})\\Delta{V_i} \\approx\n\\iiint_V \\nabla\\cdot{F} dV,\n\\]\nthe last approximation through a Riemann sum approximation. This heuristic leads to:\n\nThe divergence theorem: Suppose \\(V\\) is a \\(3\\)-dimensional volume which is bounded (compact) and has a boundary, \\(S\\), that is piecewise smooth. 
If \\(F\\) is a continuously differentiable vector field defined on an open set containing \\(V\\), then:\n\\[\n\\iiint_V (\\nabla\\cdot{F}) dV = \\oint_S (F\\cdot\\hat{N})dS.\n\\]\n\nThat is, the volume integral of the divergence can be computed from the flux integral over the boundary of \\(V\\).\n\n62.3.1 Examples of the divergence theorem\n\nExample\nVerify the divergence theorem for the vector field \\(F(x,y,z) = \\langle xy, yz, zx\\rangle\\) for the cubic box centered at the origin with side lengths \\(2\\).\nWe need to compute two terms and show they are equal. We begin with the volume integral:\n\nF₁(x,y,z) = [x*y, y*z, z*x]\nDivF₁ = divergence(F₁(x,y,z), [x,y,z])\nintegrate(DivF₁, (x, -1,1), (y,-1,1), (z, -1,1))\n\n \n\\[\n0\n\\]\n\n\n\nThe total integral is \\(0\\) by symmetry, not due to the divergence being \\(0\\), as it is \\(x+y+z\\).\nAs for the surface integral, we have \\(6\\) sides to consider. We take the sides with \\(\\hat{N}\\) being \\(\\pm\\hat{i}\\):\n\nNhat = [1,0,0]\nintegrate((F₁(x,y,z) ⋅ Nhat), (y, -1, 1), (z, -1,1)) # at x=1\n\n \n\\[\n0\n\\]\n\n\n\nIn fact, all \\(6\\) sides will be \\(0\\), as in this case \\(F \\cdot \\hat{i} = xy\\) and at \\(x=1\\) the surface integral is just \\(\\int_{-1}^1\\int_{-1}^1 y dy dz = 0\\), as \\(y\\) is an odd function.\nAs such, the two sides of the Divergence theorem are both \\(0\\), so the theorem is verified.\n\nExample\n(From Strang) If the temperature inside the sun is \\(T = \\log(1/\\rho)\\) find the heat flow \\(F=-\\nabla{T}\\); the source, \\(\\nabla\\cdot{F}\\); and the flux, \\(\\iint F\\cdot\\hat{N}dS\\). 
Model the sun as a ball of radius \\(\\rho_0\\).\nThe heat flow is simply:\n\nRₗ(x,y,z) = norm([x,y,z])\nTₗ(x,y,z) = log(1/Rₗ(x,y,z))\nHeatFlow = -diff.(Tₗ(x,y,z), [x,y,z])\n\n3-element Vector{Sym}:\n x/(x^2 + y^2 + z^2)\n y/(x^2 + y^2 + z^2)\n z/(x^2 + y^2 + z^2)\n\n\nWe may recognize this as \\(\\rho/\\|\\rho\\|^2 = \\hat{\\rho}/\\|\\rho\\|\\).\nThe source is\n\nDivₗ = divergence(HeatFlow, [x,y,z]) |> simplify\n\n \n\\[\n\\frac{1}{x^{2} + y^{2} + z^{2}}\n\\]\n\n\n\nWhich would simplify to \\(1/\\rho^2\\).\nFinally, the surface integral over the surface of the sun is an integral over a sphere of radius \\(\\rho_0\\). We could use spherical coordinates to compute this, but note instead that the normal is \\(\\hat{\\rho}\\), so \\(F \\cdot \\hat{N} = 1/\\rho = 1/\\rho_0\\) over this surface. So the surface integral is simply the surface area times \\(1/\\rho_0\\): \\(4\\pi\\rho_0^2/\\rho_0 = 4\\pi\\rho_0\\).\nLastly, though \\(F\\) is not continuous at the origin, the divergence theorems result holds. Using spherical coordinates we have:\n\n@syms rho::real rho_0::real phi::real theta::real\nJac = rho^2 * sin(phi)\nintegrate(1/rho^2 * Jac, (rho, 0, rho_0), (theta, 0, 2PI), (phi, 0, PI))\n\n \n\\[\n4 \\pi \\rho_{0}\n\\]\n\n\n\n\n\n\nExample: Continuity equation (Schey)\nImagine a venue with a strict cap on the number of persons at one time. Two ways to monitor this are: at given times, a count, or census, of all the people in the venue can be made. Or, when possible, a count of people coming in can be compared to a count of people coming out and the difference should yield the number within. Either works well when access is limited and the venue small, but the latter can also work well on a larger scale. For example, for the subway system of New York it would be impractical to attempt to count all the people at a given time using a census, but from turnstile data an accurate count can be had, as turnstiles can be used to track people coming in and going out. 
But turnstiles can be restricting and cause long(ish) lines. At some stores, new technology is allowing checkout-free shopping. Imagine if each customer had an app on their phone that can be used to track location. As they enter a store, they can be recorded, as they exit they can be recorded and if RFID tags are on each item in the store, their “purchases” can be tallied up and billed through the app. (As an added bonus to paying fewer cashiers, stores can also track on a step-by-step basis how a customer interacts with the store.) In any of these three scenarios, a simple thing applies: the total number of people in a confined region can be counted by counting how many crossed the boundary (and in which direction) and the change in time of the count can be related to the change in time of the people crossing.\nFor a more real world example, the New York Times ran an article about estimating the size of a large protest in Hong Kong:\n\nCrowd estimates for Hong Kongs large pro-democracy protests have been a point of contention for years. The organizers and the police often release vastly divergent estimates. This years annual pro-democracy protest on Monday, July 1, was no different. Organizers announced 550,000 people attended; the police said 190,000 people were there at the peak.\n\n\nBut for the first time in the marchs history, a group of researchers combined artificial intelligence and manual counting techniques to estimate the size of the crowd, concluding that 265,000 people marched.\n\n\nOn Monday, the A.I. team attached seven iPads to two major footbridges along the march route. Volunteers doing manual counts were also stationed next to the cameras, to help verify the computer count.\n\nThe article describes some issues in counting such a large group:\n\nThe high density of the crowd and the moving nature of these protests make estimating the turnout very challenging. 
For more than a decade, groups have stationed teams along the route and manually counted the rate of people passing through to derive the total number of participants.\n\nAs there are no turnstiles to do an accurate count and too many points to come and go, this technique can be too approximate. The article describes how artificial intelligence was used to count the participants. The Times tried their own hand:\n\nAnalyzing a short video clip recorded on Monday, The Timess model tried to detect people based on color and shape, and then tracked the figures as they moved across the screen. This method helps avoid double counting because the crowd generally flowed in one direction.\n\nThe divergence theorem provides two means to compute a value, the point here is to illustrate that there are (at least) two possible ways to compute crowd size. Which is better depends on the situation.\n\nFollowing Schey, we now consider a continuous analog to the crowd counting problem through a flow with a non-uniform density that may vary in time. Let \\(\\rho(x,y,z;t)\\) be the time-varying density and \\(v(x,y,z;t)\\) be a vector field indicating the direction of flow. Consider some three-dimensional volume, \\(V\\), with boundary \\(S\\) (though two-dimensional would also be applicable). Then these integrals have interpretations:\n\\[\n\\begin{align}\n\\iiint_V \\rho dV &&\\quad\\text{Amount contained within }V\\\\\n\\frac{\\partial}{\\partial{t}} \\iiint_V \\rho dV &=\n\\iiint_V \\frac{\\partial{\\rho}}{\\partial{t}} dV &\\quad\\text{Change in time of amount contained within }V\n\\end{align}\n\\]\nMoving the derivative inside the integral requires an assumption of continuity. Assume the material is conserved, meaning that if the amount in the volume \\(V\\) changes it must flow in and out through the boundary. 
The flow out through \\(S\\), the boundary of \\(V\\), is\n\\[\n\\oint_S (\\rho v)\\cdot\\hat{N} dS,\n\\]\nusing the customary outward pointing normal for the orientation of \\(S\\).\nSo we have:\n\\[\n\\iiint_V \\frac{\\partial{\\rho}}{\\partial{t}} dV =\n-\\oint_S (\\rho v)\\cdot\\hat{N} dS = - \\iiint_V \\nabla\\cdot\\left(\\rho v\\right)dV.\n\\]\nThe last equality is by the divergence theorem; the minus sign is because a positive change in amount within \\(V\\) means flow opposite the outward pointing normal for \\(S\\).\nThe volume \\(V\\) was arbitrary. While it isnt the case that two integrals being equal implies the integrands are equal, it is the case that if the two integrals are equal for all volumes and the two integrands are continuous, then they are equal.\nThat is, under the assumptions that material is conserved and density is continuous, a continuity equation can be derived from the divergence theorem:\n\\[\n\\nabla\\cdot(\\rho v) = - \\frac{\\partial{\\rho}}{\\partial{t}}.\n\\]\n\n\nExample: The divergence theorem can fail to apply\nThe assumption of the divergence theorem that the vector field be continuously differentiable is important, as otherwise it may not hold. With \\(R(x,y,z) = \\langle x,y,z\\rangle\\) take for example \\(F = (R/\\|R\\|) / \\|R\\|^2\\). 
This has divergence\n\nR(x,y,z) = [x,y,z]\nF(x,y,z) = R(x,y,z) / norm(R(x,y,z))^3\n\n\ndivergence(F(x,y,z), [x,y,z]) |> simplify\n\n \n\\[\n0\n\\]\n\n\n\nThe simplification done by SymPy masks the presence of terms like \\((x^2+y^2+z^2)^{-5/2}\\) when taking the partial derivatives, which means the field is not continuously differentiable at the origin.\nWere the divergence theorem applicable, then the integral of \\(F\\) over the unit sphere would mean:\n\\[\n0 = \\iiint_V \\nabla\\cdot{F} dV =\n\\oint_S F\\cdot\\hat{N}dS = \\oint_S \\frac{R}{\\|R\\|^3} \\cdot{R} dS =\n\\oint_S 1 dS = 4\\pi.\n\\]\nClearly, as \\(0\\) is not equal to \\(4\\pi\\), the divergence theorem can not apply.\nHowever, it does apply to any volume not enclosing the origin. So without any calculation, if \\(V\\) were shifted over by \\(2\\) units the volume integral over \\(V\\) would be \\(0\\) and the surface integral over \\(S\\) would be also.\nAs already seen, the inverse square law here arises in the electrostatic force formula, and this same observation was made in the context of Gausss law."
},
{
"objectID": "integral_vector_calculus/stokes_theorem.html#questions",
"href": "integral_vector_calculus/stokes_theorem.html#questions",
"title": "62  Greens Theorem, Stokes Theorem, and the Divergence Theorem",
"section": "62.4 Questions",
"text": "62.4 Questions\n\nQuestion\n(Schey) What conditions on \\(F: R^2 \\rightarrow R^2\\) imply \\(\\oint_C F\\cdot d\\vec{r} = A\\)? (\\(A\\) is the area bounded by the simple, closed curve \\(C\\))\n\n\n\n \n \n \n \n \n \n \n \n \n We must have \\(\\text{curl}(F) = x\\)\n \n \n\n\n \n \n \n \n We must have \\(\\text{curl}(F) = 1\\)\n \n \n\n\n \n \n \n \n We must have \\(\\text{curl}(F) = 0\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(C\\) be a simple, closed curve parameterized by \\(\\vec{r}(t) = \\langle x(t), y(t) \\rangle\\), \\(a \\leq t \\leq b\\). The area contained can be computed by \\(\\int_a^b x(t) y'(t) dt\\). Let \\(\\vec{r}(t) = \\sin(t) \\cdot \\langle \\cos(t), \\sin(t)\\rangle\\).\nFind the area inside \\(C\\).\n\n\n\n \n \n \n \n \n\n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(\\hat{N} = \\langle \\cos(t), \\sin(t) \\rangle\\) and \\(\\hat{T} = \\langle -\\sin(t), \\cos(t)\\rangle\\). Then polar coordinates can be viewed as the parametric curve \\(\\vec{r}(t) = r(t) \\hat{N}\\).\nApplying Greens theorem to the vector field \\(F = \\langle -y, x\\rangle\\), which along the curve is \\(r(t) \\hat{T}\\), we know the area formula \\((1/2) (\\int xdy - \\int y dx)\\). What is this in polar coordinates (using \\(\\theta=t\\))? (Using \\((r\\hat{N})' = r'\\hat{N} + r \\hat{N}' = r'\\hat{N} +r\\hat{T}\\) is useful.)\n\n\n\n \n \n \n \n \n \n \n \n \n \\((1/2) \\int r^2d\\theta\\)\n \n \n\n\n \n \n \n \n \\((1/2) \\int r d\\theta\\)\n \n \n\n\n \n \n \n \n \\(\\int rd\\theta\\)\n \n \n\n\n \n \n \n \n \\(\\int r^2 d\\theta\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(\\vec{r}(t) = \\langle \\cos^3(t), \\sin^3(t)\\rangle\\), \\(0\\leq t \\leq 2\\pi\\). (This describes a hypocycloid.) 
Compute the area enclosed by the curve \\(C\\) using Greens theorem.\n\n\n\n \n \n \n \n \n \n \n \n \n \\(3\\pi/8\\)\n \n \n\n\n \n \n \n \n \\(\\pi/4\\)\n \n \n\n\n \n \n \n \n \\(\\pi/2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(F(x,y) = \\langle y, x\\rangle\\). We verify Greens theorem holds when \\(S\\) is the unit square, \\([0,1]\\times[0,1]\\).\nThe curl of \\(F\\) is\n\n\n\n \n \n \n \n \n \n \n \n \n \\(0\\)\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \\(2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nAs the curl is a constant, say \\(c\\), we have \\(\\iint_S (\\nabla\\times{F}) dS = c \\cdot 1\\). This is?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(0\\)\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \\(2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nTo integrate around the boundary we have \\(4\\) terms: the path \\(A\\) connecting \\((0,0)\\) to \\((1,0)\\) (on the \\(x\\) axis), the path \\(B\\) connecting \\((1,0)\\) to \\((1,1)\\), the path \\(C\\) connecting \\((1,1)\\) to \\((0,1)\\), and the path \\(D\\) connecting \\((0,1)\\) to \\((0,0)\\) (along the \\(y\\) axis).\nWhich path has tangent \\(\\hat{j}\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(A\\)\n \n \n\n\n \n \n \n \n \\(B\\)\n \n \n\n\n \n \n \n \n \\(C\\)\n \n \n\n\n \n \n \n \n \\(D\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nAlong path \\(C\\), \\(F(x,y) = [1,x]\\) and \\(\\hat{T}=-\\hat{i}\\) so \\(F\\cdot\\hat{T} = -1\\). The path integral \\(\\int_C (F\\cdot\\hat{T})ds = -1\\). 
What is the value of the path integral over \\(A\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(-1\\)\n \n \n\n\n \n \n \n \n \\(0\\)\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nWhat is the integral over the oriented boundary of \\(S\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(0\\)\n \n \n\n\n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \\(2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nSuppose \\(F: R^2 \\rightarrow R^2\\) is a vector field such that \\(\\nabla\\cdot{F}=0\\) except at the origin. Let \\(C_1\\) and \\(C_2\\) be the unit circle and circle with radius \\(2\\) centered at the origin, both parameterized counterclockwise. What is the relationship between \\(\\oint_{C_2} F\\cdot\\hat{N}ds\\) and \\(\\oint_{C_1} F\\cdot\\hat{N}ds\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n They differ by a minus sign, as Green's theorem applies to the area, \\(S\\), between \\(C_1\\) and \\(C_2\\) so \\(\\iint_S \\nabla\\cdot{F}dA = 0\\).\n \n \n\n\n \n \n \n \n They are the same, as Green's theorem applies to the area, \\(S\\), between \\(C_1\\) and \\(C_2\\) so \\(\\iint_S \\nabla\\cdot{F}dA = 0\\).\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(F(x,y) = \\langle x, y\\rangle/(x^2+y^2)\\). Though this has divergence \\(0\\) away from the origin, the flow integral around the unit circle, \\(\\oint_C (F\\cdot\\hat{N})ds\\), is \\(2\\pi\\), as Greens theorem in divergence form does not apply. Consider the integral around the square centered at the origin, with side lengths \\(2\\). 
What is the flow integral around this closed curve?\n\n\n\n \n \n \n \n \n \n \n \n \n Also \\(2\\pi\\), as Green's theorem applies to the region formed by the square minus the circle and so the overall flow integral around the boundary is \\(0\\), so the two will be the same.\n \n \n\n\n \n \n \n \n It is \\(-2\\pi\\), as Green's theorem applies to the region formed by the square minus the circle and so the overall flow integral around the boundary is \\(0\\), so the two will have opposite signs, but the same magnitude.\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUsing the divergence theorem, compute \\(\\iint F\\cdot\\hat{N} dS\\) where \\(F(x,y,z) = \\langle x, x, y \\rangle\\) and \\(V\\) is the unit sphere.\n\n\n\n \n \n \n \n \n \n \n \n \n \\(4/3 \\pi\\)\n \n \n\n\n \n \n \n \n \\(4\\pi\\)\n \n \n\n\n \n \n \n \n \\(\\pi\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nUsing the divergence theorem, compute \\(\\iint F\\cdot\\hat{N} dS\\) where \\(F(x,y,z) = \\langle y, y,x \\rangle\\) and \\(V\\) is the unit cube \\([0,1]\\times[0,1]\\times[0,1]\\).\n\n\n\n \n \n \n \n \n \n \n \n \n \\(1\\)\n \n \n\n\n \n \n \n \n \\(2\\)\n \n \n\n\n \n \n \n \n \\(3\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nLet \\(R(x,y,z) = \\langle x, y, z\\rangle\\) and \\(\\rho = \\|R\\|^2\\). If \\(F = 2R/\\rho^2\\) then \\(F\\) is the gradient of a potential. 
Which one?\n\n\n\n \n \n \n \n \n \n \n \n \n \\(\\rho\\)\n \n \n\n\n \n \n \n \n \\(\\log(\\rho)\\)\n \n \n\n\n \n \n \n \n \\(1/\\rho\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\nBased on this information, for \\(S\\) a surface not including the origin with boundary \\(C\\), a simple closed curve, what is \\(\\oint_C F\\cdot\\hat{T}ds\\)?\n\n\n\n \n \n \n \n \n \n \n \n \n It is \\(0\\), as, by Stokes' theorem, it is equivalent to \\(\\iint_S (\\nabla\\times\\nabla{\\phi})dS = \\iint_S 0 dS = 0\\).\n \n \n\n\n \n \n \n \n It is \\(2\\pi\\), as this is the circumference of the unit circle\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nConsider the circle \\(C\\) in \\(R^3\\) parameterized by \\(\\langle \\cos(t), \\sin(t), 0\\rangle\\). The upper half sphere and the unit disc in the \\(x-y\\) plane are both surfaces with this boundary. Let \\(F(x,y,z) = \\langle -y, x, z\\rangle\\). Compute \\(\\oint_C F\\cdot\\hat{T}ds\\) using Stokes theorem. The value is:\n\n\n\n \n \n \n \n \n \n \n \n \n \\(2\\pi\\)\n \n \n\n\n \n \n \n \n \\(0\\)\n \n \n\n\n \n \n \n \n \\(2\\)\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nFrom Illinois comes this advice to check if a vector field \\(F:R^3 \\rightarrow R^3\\) is conservative:\n\nIf \\(\\nabla\\times{F}\\) is non-zero the field is not conservative\nIf \\(\\nabla\\times{F}\\) is zero and the domain of \\(F\\) is simply connected (e.g., all of \\(R^3\\)), then \\(F\\) is conservative\nIf \\(\\nabla\\times{F}\\) is zero but the domain of \\(F\\) is not simply connected then …\n\nWhat should finish the last sentence?\n\n\n\n \n \n \n \n \n \n \n \n \n the field is conservative\n \n \n\n\n \n \n \n \n the field is not conservative.\n \n \n\n\n \n \n \n \n the field could be conservative or not. 
One must work harder to answer the question.\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nKnill provides the following chart showing what happens under the three main operations on vector-valued functions:\n 1\n 1 -> grad -> 1\n 1 -> grad -> 2 -> curl -> 1\n1 -> grad -> 3 -> curl -> 3 -> div -> 1\nIn the first row, the gradient is just the regular derivative and takes a function \\(f:R^1 \\rightarrow R^1\\) into another such function, \\(f':R \\rightarrow R^1\\).\nIn the second row, the gradient is an operation that takes a function \\(f:R^2 \\rightarrow R\\) into one \\(\\nabla{f}:R^2 \\rightarrow R^2\\), whereas the curl takes \\(F:R^2\\rightarrow R^2\\) into \\(\\nabla\\times{F}:R^2 \\rightarrow R^1\\).\nIn the third row, the gradient is an operation that takes a function \\(f:R^3 \\rightarrow R\\) into one \\(\\nabla{f}:R^3 \\rightarrow R^3\\), whereas the curl takes \\(F:R^3\\rightarrow R^3\\) into \\(\\nabla\\times{F}:R^3 \\rightarrow R^3\\), and the divergence takes \\(F:R^3 \\rightarrow R^3\\) into \\(\\nabla\\cdot{F}:R^3 \\rightarrow R\\).\nThe diagram emphasizes a few different things:\n\nThe number of integral theorems is implied here. The ones for the gradient are the fundamental theorem of line integrals, namely \\(\\int_C \\nabla{f}\\cdot d\\vec{r}=\\int_{\\partial{C}} f\\), a short hand notation for \\(f\\) evaluated at the end points.\n\nThe one for the curl in \\(n=2\\) is Greens theorem: \\(\\iint_S \\nabla\\times{F}dA = \\oint_{\\partial{S}} F\\cdot d\\vec{r}\\).\nThe one for the curl in \\(n=3\\) is Stokes theorem: \\(\\iint_S \\nabla\\times{F}dA = \\oint_{\\partial{S}} F\\cdot d\\vec{r}\\). 
Finally, the divergence for \\(n=3\\) is the divergence theorem \\(\\iiint_V \\nabla\\cdot{F} dV = \\iint_{\\partial{V}} F\\cdot\\hat{N} dS\\).\n\nWorking left to right along a row of the diagram, applying two steps of these operations yields:\n\n\n\n\n \n \n \n \n \n \n \n \n \n Zero, by the vanishing properties of these operations\n \n \n\n\n \n \n \n \n The row number plus 1\n \n \n\n\n \n \n \n \n The maximum number in a row\n \n \n\n\n \n \n \n \n \n \n\n\n\n\n\n\n\nQuestion\nKatz provides details on the history of Green, Gauss (divergence), and Stokes. The first paragraph says that each theorem was not original to the attributed name. Part of the reason being the origins dating back to the 17th century, their usage by Lagrange and Laplace in the 18th century, and their formalization in the 19th century. Another reason is that the applications were different: “Gauss was interested in the theory of magnetic attraction, Ostrogradsky in the theory of heat, Green in electricity and magnetism, Poisson in elastic bodies, and Sarrus in floating bodies.” Finally, in nearly all the cases the theorems were thought of as tools toward some physical end.\nIn 1846, Cauchy proved\n\\[\n\\int\\left(p\\frac{dx}{ds} + q \\frac{dy}{ds}\\right)ds =\n\\pm\\iint\\left(\\frac{\\partial{p}}{\\partial{y}} - \\frac{\\partial{q}}{\\partial{x}}\\right)dx dy.\n\\]\nThis is a form of:\n\n\n\n \n \n \n \n \n \n \n \n \n Green's theorem\n \n \n\n\n \n \n \n \n The divergence (Gauss') theorem\n \n \n\n\n \n \n \n \n Stokes' theorem"
},
{
"objectID": "integral_vector_calculus/review.html",
"href": "integral_vector_calculus/review.html",
"title": "63  Quick Review of Vector Calculus",
"section": "",
"text": "This section considers functions from \\(R^n\\) into \\(R^m\\) where one or both of \\(n\\) or \\(m\\) is greater than \\(1\\):\nWhen \\(m>1\\) a function is called vector valued.\nWhen \\(n>1\\) the argument may be given in terms of components, e.g. \\(f(x,y,z)\\); with a point as an argument, \\(F(p)\\); or with a vector as an argument, \\(F(\\vec{a})\\). The identification of a point with a vector is done frequently."
},
{
"objectID": "integral_vector_calculus/review.html#limits",
"href": "integral_vector_calculus/review.html#limits",
"title": "63  Quick Review of Vector Calculus",
"section": "63.1 Limits",
"text": "63.1 Limits\nLimits when \\(m > 1\\) depend on the limits of each component existing.\nLimits when \\(n > 1\\) are more complicated. One characterization is that a limit at a point \\(c\\) exists if and only if for every continuous path going to \\(c\\) the limit along the path for every component exists in the univariate sense."
},
{
"objectID": "integral_vector_calculus/review.html#derivatives",
"href": "integral_vector_calculus/review.html#derivatives",
"title": "63  Quick Review of Vector Calculus",
"section": "63.2 Derivatives",
"text": "63.2 Derivatives\nThe derivative of a univariate function, \\(f\\), at a point \\(c\\) is defined by a limit:\n\\[\nf'(c) = \\lim_{h\\rightarrow 0} \\frac{f(c+h)-f(c)}{h},\n\\]\nand as a function by considering the mapping of \\(c\\) into \\(f'(c)\\). A characterization is that it is the value for which\n\\[\n|f(c+h) - f(c) - f'(c)h| = \\mathcal{o}(|h|),\n\\]\nThat is, after dividing the left-hand side by \\(|h|\\) the expression goes to \\(0\\) as \\(|h|\\rightarrow 0\\). This characterization will generalize with the norm replacing the absolute value, as needed.\n\n63.2.1 Parameterized curves\nThe derivative of a function \\(\\vec{r}: R \\rightarrow R^m\\), \\(\\vec{r}'(t)\\), is found by taking the derivative of each component. (The function consisting of just one component is univariate.)\nThe derivative satisfies\n\\[\n\\| \\vec{r}(t+h) - \\vec{r}(t) - \\vec{r}'(t) h \\| = \\mathcal{o}(|h|).\n\\]\nThe derivative is tangent to the curve and indicates the direction of travel.\nThe tangent vector is the unit vector in the direction of \\(\\vec{r}'(t)\\):\n\\[\n\\hat{T} = \\frac{\\vec{r}'(t)}{\\|\\vec{r}'(t)\\|}.\n\\]\nThe path is parameterized by arc length if \\(\\|\\vec{r}'(t)\\| = 1\\) for all \\(t\\). 
In this case an “\\(s\\)” is used for the parameter, as a notational hint: \\(\\hat{T} = d\\vec{r}/ds\\).\nThe normal vector is the unit vector in the direction of the derivative of the tangent vector:\n\\[\n\\hat{N} = \\frac{\\hat{T}'(t)}{\\|\\hat{T}'(t)\\|}.\n\\]\nIn dimension \\(m=2\\), if \\(\\hat{T} = \\langle a, b\\rangle\\) then \\(\\hat{N} = \\langle -b, a\\rangle\\) or \\(\\langle b, -a\\rangle\\) and \\(\\hat{N}'(t)\\) is parallel to \\(\\hat{T}\\).\nIn dimension \\(m=3\\), the binormal vector, \\(\\hat{B}\\), is the unit vector \\(\\hat{T}\\times\\hat{N}\\).\nThe Frenet-Serret formulas define the curvature, \\(\\kappa\\), and the torsion, \\(\\tau\\), by\n\\[\n\\begin{align}\n\\frac{d\\hat{T}}{ds} &= & \\kappa \\hat{N} &\\\\\n\\frac{d\\hat{N}}{ds} &= -\\kappa\\hat{T} & & + \\tau\\hat{B}\\\\\n\\frac{d\\hat{B}}{ds} &= & -\\tau\\hat{N}&\n\\end{align}\n\\]\nThese formulas apply in dimension \\(m=2\\) with \\(\\hat{B}=\\vec{0}\\).\nThe curvature, \\(\\kappa\\), can be visualized by imagining a circle of radius \\(r=1/\\kappa\\) best approximating the path at a point. (A straight line would have a circle of infinite radius and curvature \\(0\\).)\nThe chain rule says \\((\\vec{r}(g(t))' = \\vec{r}'(g(t)) g'(t)\\).\n\n\n63.2.2 Scalar functions\nA scalar function, \\(f:R^n\\rightarrow R\\), \\(n > 1\\) has a partial derivative defined. 
For \\(n=2\\), these are:\n\\[\n\\begin{align}\n\\frac{\\partial{f}}{\\partial{x}}(x,y) &=\n\\lim_{h\\rightarrow 0} \\frac{f(x+h,y)-f(x,y)}{h}\\\\\n\\frac{\\partial{f}}{\\partial{y}}(x,y) &=\n\\lim_{h\\rightarrow 0} \\frac{f(x,y+h)-f(x,y)}{h}.\n\\end{align}\n\\]\nThe generalization to \\(n>2\\) is clear - the partial derivative in \\(x_i\\) is the derivative of \\(f\\) when the other \\(x_j\\) are held constant.\nThis may be viewed as the derivative of the univariate function \\((f\\circ\\vec{r})(t)\\) where \\(\\vec{r}(t) = p + t \\hat{e}_i\\), \\(\\hat{e}_i\\) being the unit vector of all \\(0\\)s except a \\(1\\) in the \\(i\\)th component.\nThe gradient of \\(f\\), when the limits exist, is the vector-valued function from \\(R^n\\) to \\(R^n\\):\n\\[\n\\nabla{f} = \\langle\n\\frac{\\partial{f}}{\\partial{x_1}},\n\\frac{\\partial{f}}{\\partial{x_2}},\n\\dots\n\\frac{\\partial{f}}{\\partial{x_n}}\n\\rangle.\n\\]\nThe gradient satisfies:\n\\[\n\\|f(\\vec{x}+\\Delta{\\vec{x}}) - f(\\vec{x}) - \\nabla{f}\\cdot\\Delta{\\vec{x}}\\| = \\mathcal{o}(\\|\\Delta{\\vec{x}}\\|).\n\\]\nThe gradient is viewed as a column vector. If the dot product above is viewed as matrix multiplication, then it would be written \\(\\nabla{f}' \\Delta{\\vec{x}}\\).\nLinearization is the approximation\n\\[\nf(\\vec{x}+\\Delta{\\vec{x}}) \\approx f(\\vec{x}) + \\nabla{f}\\cdot\\Delta{\\vec{x}}.\n\\]\nThe directional derivative of \\(f\\) in the direction \\(\\vec{v}\\) is \\(\\vec{v}\\cdot\\nabla{f}\\), which can be seen as the derivative of the univariate function \\((f\\circ\\vec{r})(t)\\) where \\(\\vec{r}(t) = p + t \\vec{v}\\).\nFor the function \\(z=f(x,y)\\) the gradient points in the direction of steepest ascent. Ascent is seen in the \\(3\\)d surface; the gradient is \\(2\\)-dimensional.\nFor a function \\(f(\\vec{x})\\), a level curve is the set of values for which \\(f(\\vec{x})=c\\), \\(c\\) being some constant. Plotted, this may give a curve or surface (in \\(n=2\\) or \\(n=3\\)). 
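The gradient and directional derivative lend themselves to quick numeric checks. A minimal sketch with central differences; the toy function \(f(x,y) = x^2 + y^2\) and the helper grad_fd are illustrative assumptions, not from the notes:

```julia
f(x, y) = x^2 + y^2

# central-difference approximation to the gradient of f at (x, y)
grad_fd(f, x, y; h=1e-6) = [(f(x + h, y) - f(x - h, y)) / (2h),
                            (f(x, y + h) - f(x, y - h)) / (2h)]

g = grad_fd(f, 1.0, 2.0)   # ≈ [2, 4], matching ∇f = ⟨2x, 2y⟩ at (1, 2)
v = [3.0, 4.0] / 5         # a unit direction
dd = sum(v .* g)           # directional derivative v ⋅ ∇f ≈ 22/5
```

The computed gradient is proportional to \(\langle 1, 2\rangle\), the radial direction at the point, consistent with the level curves of \(x^2+y^2\) being circles centered at the origin.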
The gradient at a point \\(\\vec{x}\\) with \\(f(\\vec{x})=c\\) will be orthogonal to the level curve \\(f=c\\).\nPartial derivatives are scalar functions, so will themselves have partial derivatives when the limits are defined. The notation \\(f_{xy}\\) stands for the partial derivative in \\(y\\) of the partial derivative of \\(f\\) in \\(x\\). Schwarzs theorem says the order of partial derivatives will not matter (e.g., \\(f_{xy} = f_{yx}\\)) provided the higher-order derivatives are continuous.\nThe chain rule applied to \\((f\\circ\\vec{r})(t)\\) says:\n\\[\n\\frac{d(f\\circ\\vec{r})}{dt} = \\nabla{f}(\\vec{r}) \\cdot \\vec{r}'.\n\\]\n\n\n63.2.3 Vector-valued functions\nFor a function \\(F:R^n \\rightarrow R^m\\), the total derivative of \\(F\\) is the linear operator \\(d_F\\) satisfying:\n\\[\n\\|F(\\vec{x} + \\vec{h})-F(\\vec{x}) - d_F \\vec{h}\\| = \\mathcal{o}(\\|\\vec{h}\\|)\n\\]\nFor \\(F=\\langle f_1, f_2, \\dots, f_m\\rangle\\) the total derivative is the Jacobian, a \\(m \\times n\\) matrix of partial derivatives:\n\\[\nJ_f = \\left[\n\\begin{align}{}\n\\frac{\\partial f_1}{\\partial x_1} &\\quad \\frac{\\partial f_1}{\\partial x_2} &\\dots&\\quad\\frac{\\partial f_1}{\\partial x_n}\\\\\n\\frac{\\partial f_2}{\\partial x_1} &\\quad \\frac{\\partial f_2}{\\partial x_2} &\\dots&\\quad\\frac{\\partial f_2}{\\partial x_n}\\\\\n&&\\vdots&\\\\\n\\frac{\\partial f_m}{\\partial x_1} &\\quad \\frac{\\partial f_m}{\\partial x_2} &\\dots&\\quad\\frac{\\partial f_m}{\\partial x_n}\n\\end{align}\n\\right].\n\\]\nThis can be viewed as being comprised of row vectors, each being the individual gradients; or as column vectors each being the vector of partial derivatives for a given variable.\nThe chain rule for \\(F:R^n \\rightarrow R^m\\) composed with \\(G:R^k \\rightarrow R^n\\) is:\n\\[\nd_{F\\circ G}(a) = d_F(G(a)) d_G(a),\n\\]\nThat is the total derivative of \\(F\\) at the point \\(G(a)\\) times (matrix multiplication) the total derivative of \\(G\\) at 
\\(a\\). The dimensions work out as \\(d_F\\) is \\(m\\times n\\) and \\(d_G\\) is \\(n\\times k\\), so \\(d_{F\\circ G}\\) will be \\(m\\times k\\) and \\(F\\circ{G}: R^k\\rightarrow R^m\\).\nA scalar function \\(f:R^n \\rightarrow R\\) and a parameterized curve \\(\\vec{r}:R\\rightarrow R^n\\) compose to yield a univariate function. The total derivative of \\(f\\circ\\vec{r}\\) satisfies:\n\\[\nd_f(\\vec{r}) d_\\vec{r} = \\nabla{f}(\\vec{r}(t))' \\vec{r}'(t) =\n\\nabla{f}(\\vec{r}(t)) \\cdot \\vec{r}'(t),\n\\]\nas above. (There is an identification of a \\(1\\times 1\\) matrix with a scalar in re-expressing as a dot product.)\n\n\n63.2.4 The divergence, curl, and their vanishing properties\nDefine the divergence of a vector-valued function \\(F:R^n \\rightarrow R^n\\) by:\n\\[\n\\text{divergence}(F) =\n\\frac{\\partial{F_{x_1}}}{\\partial{x_1}} +\n\\frac{\\partial{F_{x_2}}}{\\partial{x_2}} + \\cdots +\n\\frac{\\partial{F_{x_n}}}{\\partial{x_n}}.\n\\]\nThe divergence is a scalar function. For a vector field \\(F\\), it measures the microscopic flow out of a region.\nA vector field whose divergence is identically \\(0\\) is called incompressible.\nDefine the curl of a two-dimensional vector field, \\(F:R^2 \\rightarrow R^2\\), by:\n\\[\n\\text{curl}(F) = \\frac{\\partial{F_y}}{\\partial{x}} -\n\\frac{\\partial{F_x}}{\\partial{y}}.\n\\]\nThe curl for \\(n=2\\) is a scalar function.\nFor \\(n=3\\) define the curl of \\(F:R^3 \\rightarrow R^3\\) to be the vector field:\n\\[\n\\text{curl}(F) =\n\\langle\n\\frac{\\partial{F_z}}{\\partial{y}} - \\frac{\\partial{F_y}}{\\partial{z}},\n\\frac{\\partial{F_x}}{\\partial{z}} - \\frac{\\partial{F_z}}{\\partial{x}},\n\\frac{\\partial{F_y}}{\\partial{x}} - \\frac{\\partial{F_x}}{\\partial{y}}\n\\rangle.\n\\]\nThe curl measures the circulation in a vector field. 
In dimension \\(n=3\\) it points in the direction of the normal of the plane of maximum circulation with direction given by the right-hand rule.\nA vector field whose curl is identically the zero vector is called irrotational.\nThe \\(\\nabla\\) operator is the formal vector\n\\[\n\\nabla = \\langle\n\\frac{\\partial}{\\partial{x}},\n\\frac{\\partial}{\\partial{y}},\n\\frac{\\partial}{\\partial{z}}\n\\rangle.\n\\]\nThe gradient is then scalar “multiplication” on the left: \\(\\nabla{f}\\).\nThe divergence is the dot product on the left: \\(\\nabla\\cdot{F}\\).\nThe curl is the cross product on the left: \\(\\nabla\\times{F}\\).\nThese operations satisfy two vanishing properties:\n\nThe curl of a gradient is the zero vector: \\(\\nabla\\times\\nabla{f}=\\vec{0}\\)\nThe divergence of a curl is \\(0\\): \\(\\nabla\\cdot(\\nabla\\times F)=0\\)\n\nThe Helmholtz decomposition theorem says a vector field (\\(n=3\\)) which vanishes rapidly enough can be expressed as \\(F = -\\nabla\\phi + \\nabla\\times{A}\\). The left term will be irrotational (no curl) and the right term will be incompressible (no divergence)."
},
{
"objectID": "integral_vector_calculus/review.html#integrals",
"href": "integral_vector_calculus/review.html#integrals",
"title": "63  Quick Review of Vector Calculus",
"section": "63.3 Integrals",
"text": "63.3 Integrals\nThe definite integral, \\(\\int_a^b f(x) dx\\), for a bounded univariate function is defined in terms of Riemann sums, \\(\\lim \\sum f(c_i)\\Delta{x_i}\\), as the maximum partition size goes to \\(0\\). Similarly, the integral of a bounded scalar function \\(f:R^n \\rightarrow R\\) over a box-like region \\([a_1,b_1]\\times[a_2,b_2]\\times\\cdots\\times[a_n,b_n]\\) can be defined in terms of a limit of Riemann sums. A Riemann integrable function is one for which the upper and lower Riemann sums agree in the limit. A characterization of a Riemann integrable function is that the set of discontinuities has measure \\(0\\).\nIf \\(f\\) and the partial functions (\\(x \\rightarrow f(x,y)\\) and \\(y \\rightarrow f(x,y)\\)) are Riemann integrable, then Fubinis theorem allows the definite integral to be performed iteratively:\n\\[\n\\iint_{R\\times S}fdV = \\int_R \\left(\\int_S f(x,y) dy\\right) dx\n= \\int_S \\left(\\int_R f(x,y) dx\\right) dy.\n\\]\nThe integral satisfies linearity and monotonicity properties that follow from the definitions:\n\nFor integrable \\(f\\) and \\(g\\) and constants \\(a\\) and \\(b\\):\n\n\\[\n\\iint_R (af(x) + bg(x))dV = a\\iint_R f(x)dV + b\\iint_R g(x) dV.\n\\]\n\nIf \\(R\\) and \\(R'\\) are disjoint rectangular regions (possibly sharing a boundary), then the integral over the union is defined by linearity:\n\n\\[\n\\iint_{R \\cup R'} f(x) dV = \\iint_R f(x)dV + \\iint_{R'} f(x) dV.\n\\]\n\nAs \\(f\\) is bounded, let \\(m \\leq f(x) \\leq M\\) for all \\(x\\) in \\(R\\). 
Then\n\n\\[\nm V(R) \\leq \\iint_R f(x) dV \\leq MV(R).\n\\]\n\nIf \\(f\\) and \\(g\\) are integrable and \\(f(x) \\leq g(x)\\), then the integrals have the same property, namely \\(\\iint_R f dV \\leq \\iint_R gdV\\).\nIf \\(S \\subset R\\), both closed rectangles, then if \\(f\\) is integrable over \\(R\\) it will be also over \\(S\\) and, when \\(f\\geq 0\\), \\(\\iint_S f dV \\leq \\iint_R fdV\\).\nIf \\(f\\) is bounded and integrable, then \\(|\\iint_R fdV| \\leq \\iint_R |f| dV\\).\n\nIn two dimensions, we have the following interpretations:\n\\[\n\\begin{align}\n\\iint_R dA &= \\text{area of } R\\\\\n\\iint_R \\rho dA &= \\text{mass with constant density }\\rho\\\\\n\\iint_R \\rho(x,y) dA &= \\text{mass of region with density }\\rho\\\\\n\\frac{1}{\\text{area}}\\iint_R x \\rho(x,y)dA &= \\text{centroid of region in } x \\text{ direction}\\\\\n\\frac{1}{\\text{area}}\\iint_R y \\rho(x,y)dA &= \\text{centroid of region in } y \\text{ direction}\n\\end{align}\n\\]\nIn three dimensions, we have the following interpretations:\n\\[\n\\begin{align}\n\\iint_V dV &= \\text{volume of } V\\\\\n\\iint_V \\rho dV &= \\text{mass with constant density }\\rho\\\\\n\\iint_V \\rho(x,y,z) dV &= \\text{mass of volume with density }\\rho\\\\\n\\frac{1}{\\text{volume}}\\iint_V x \\rho(x,y,z)dV &= \\text{centroid of volume in } x \\text{ direction}\\\\\n\\frac{1}{\\text{volume}}\\iint_V y \\rho(x,y,z)dV &= \\text{centroid of volume in } y \\text{ direction}\\\\\n\\frac{1}{\\text{volume}}\\iint_V z \\rho(x,y,z)dV &= \\text{centroid of volume in } z \\text{ direction}\n\\end{align}\n\\]\nTo compute integrals over non-box-like regions, Fubinis theorem may be utilized. Alternatively, a transformation of variables may be used.\n\n63.3.1 Line integrals\nFor a parameterized curve, \\(\\vec{r}(t)\\), the line integral of a scalar function between \\(a \\leq t \\leq b\\) is defined by: \\(\\int_a^b f(\\vec{r}(t)) \\| \\vec{r}'(t)\\| dt\\). 
For a path parameterized by arc-length, the integral is expressed by \\(\\int_C f(\\vec{r}(s)) ds\\) or simply \\(\\int_C f ds\\), as the norm is \\(1\\) and \\(C\\) expresses the path.\nA Jordan curve in two dimensions is a non-intersecting continuous loop in the plane. The Jordan curve theorem states that such a curve divides the plane into a bounded and unbounded region. The curve is positively parameterized if the bounded region is kept on the left. A line integral over a Jordan curve is denoted \\(\\oint_C f ds\\).\nSome interpretations: \\(\\int_a^b \\| \\vec{r}'(t)\\| dt\\) computes the arc-length. If the path represents a wire with density \\(\\rho(\\vec{x})\\) then \\(\\int_a^b \\rho(\\vec{r}(t)) \\|\\vec{r}'(t)\\| dt\\) computes the mass of the wire.\nThe line integral is also defined for a vector field \\(F:R^n \\rightarrow R^n\\) through \\(\\int_a^b F(\\vec{r}(t)) \\cdot \\vec{r}'(t) dt\\). When parameterized by arc length, this becomes \\(\\int_C F(\\vec{r}(s)) \\cdot \\hat{T} ds\\) or more simply \\(\\int_C F\\cdot\\hat{T}ds\\). In dimension \\(n=2\\), if \\(\\hat{N}\\) is the normal, then the line integral \\(\\int_a^b F(\\vec{r}(t)) \\cdot \\hat{N} \\|\\vec{r}'(t)\\| dt\\) (the flow) is also of interest (this is also expressed by \\(\\int_C F\\cdot\\hat{N} ds\\)).\nWhen \\(F\\) is a force field, then the interpretation of \\(\\int_a^b F(\\vec{r}(t)) \\cdot \\vec{r}'(t) dt\\) is the amount of work to move an object from \\(\\vec{r}(a)\\) to \\(\\vec{r}(b)\\). (Work measures force applied times distance moved.)\nA conservative force is a force field within an open region \\(R\\) with the property that the total work done in moving a particle between two points is independent of the path taken. (Similarly, integrals over Jordan curves are zero.)\nThe gradient theorem or fundamental theorem of line integrals states if \\(\\phi\\) is a scalar function then the vector field \\(\\nabla{\\phi}\\) (if continuous in \\(R\\)) is a conservative field. 
That is, if \\(q\\) and \\(p\\) are points, \\(C\\) any curve in \\(R\\) from \\(q\\) to \\(p\\), and \\(\\vec{r}\\) a parameterization of \\(C\\) over \\([a,b]\\), then \\(\\phi(p) - \\phi(q) = \\int_a^b \\nabla{\\phi}(\\vec{r}(t)) \\cdot \\vec{r}'(t) dt\\).\nIf \\(\\phi\\) is a scalar function producing a field \\(\\nabla{\\phi}\\) then in dimensions \\(2\\) and \\(3\\) the curl of \\(\\nabla{\\phi}\\) is zero when the functions involved are continuous. Conversely, if the curl of a force field, \\(F\\), is zero and the derivatives are continuous in a simply connected domain, then there exists a scalar potential function, \\(\\phi,\\) with \\(F = -\\nabla{\\phi}\\).\nIn dimension \\(2\\), if \\(F\\) describes a flow field, the integral \\(\\int_C F \\cdot\\hat{N}ds\\) is interpreted as the flow across the curve \\(C\\); when \\(C\\) is a closed curve \\(\\oint_C F\\cdot\\hat{N}ds\\) is interpreted as the flow out of the region, when \\(C\\) is positively parameterized.\nGreens theorem states if \\(C\\) is a positively oriented Jordan curve in the plane bounding a region \\(D\\) and \\(F\\) is a vector field \\(F:R^2 \\rightarrow R^2\\) then \\(\\oint_C F\\cdot\\hat{T}ds = \\iint_D \\text{curl}(F) dA\\).\nGreens theorem can be re-expressed in flow form: \\(\\oint_C F\\cdot\\hat{N}ds=\\iint_D\\text{divergence}(F)dA\\).\nFor \\(F=\\langle -y,x\\rangle\\), Greens theorem says the area of \\(D\\) is given by \\((1/2)\\oint_C F\\cdot\\vec{r}' dt\\). Similarly, if \\(F=\\langle 0,x\\rangle\\) or \\(F=\\langle -y,0\\rangle\\) then the area is given by \\(\\oint_C F\\cdot\\vec{r}'dt\\). The above follows as \\(\\text{curl}(F)\\) is \\(2\\) or \\(1\\). Similar formulas can be given to compute the centroids, by identifying a vector field with \\(\\text{curl}(F) = x\\) or \\(y\\).\n\n63.3.2 Surface integrals\nA surface in \\(3\\) dimensions can be described by a scalar function \\(z=f(x,y)\\), a parameterization \\(F:R^2 \\rightarrow R^3\\) or as a level curve of a scalar function \\(f(x,y,z)\\). 
The second case covers the first through the parameterization \\((x,y) \\rightarrow (x,y,f(x,y))\\). For a parameterization of a surface, \\(\\Phi(u,v) = \\langle \\Phi_x, \\Phi_y, \\Phi_z\\rangle\\), let \\(\\partial{\\Phi}/\\partial{u}\\) be the \\(3\\)-d vector \\(\\langle \\partial{\\Phi_x}/\\partial{u}, \\partial{\\Phi_y}/\\partial{u}, \\partial{\\Phi_z}/\\partial{u}\\rangle\\), similarly define \\(\\partial{\\Phi}/\\partial{v}\\). As vectors, these lie in the tangent plane to the surface and this plane has normal vector \\(\\vec{N}=\\partial{\\Phi}/\\partial{u}\\times\\partial{\\Phi}/\\partial{v}\\). For a closed surface, the parametrization is positive if \\(\\vec{N}\\) is an outward pointing normal. Let the surface element be defined by \\(\\|\\vec{N}\\|\\).\nThe surface integral of a scalar function \\(f:R^3 \\rightarrow R\\) for a parameterization \\(\\Phi:R \\rightarrow S\\) is defined by\n\\[\n\\iint_R f(\\Phi(u,v))\n\\|\\frac{\\partial{\\Phi}}{\\partial{u}} \\times \\frac{\\partial{\\Phi}}{\\partial{v}}\\|\ndu dv\n\\]\nIf \\(F\\) is a vector field, the surface integral may be defined as a flow across the boundary through\n\\[\n\\iint_R F(\\Phi(u,v)) \\cdot \\vec{N} du dv =\n\\iint_R (F \\cdot \\hat{N}) \\|\\frac{\\partial{\\Phi}}{\\partial{u}} \\times \\frac{\\partial{\\Phi}}{\\partial{v}}\\| du dv = \\iint_S (F\\cdot\\hat{N})dS\n\\]\n\n\n63.3.3 Stokes theorem, divergence theorem\nStokes theorem states that in dimension \\(3\\) if \\(S\\) is a smooth surface with boundary \\(C\\) oriented so the right-hand rule gives the choice of normal for \\(S\\) and \\(F\\) is a vector field with continuous partial derivatives then:\n\\[\n\\iint_S (\\nabla\\times{F}) \\cdot \\hat{N} dS = \\oint_C F \\cdot \\hat{T} ds.\n\\]\nStokes theorem has the same formulation as Greens theorem in dimension \\(2\\), where the surface integral is just the \\(2\\)-dimensional integral.\nStokes theorem is used to show a vector field \\(F\\) with zero curl is conservative if \\(F\\) is continuous 
in a simply connected region.\nStokes theorem is used in Physics, for example, to relate the differential and integral forms of \\(2\\) of Maxwells equations.\n\nThe divergence theorem states if \\(V\\) is a compact volume in \\(R^3\\) with piecewise smooth boundary \\(S=\\partial{V}\\) and \\(F\\) is a vector field with continuous partial derivatives then:\n\\[\n\\iint_V (\\nabla\\cdot{F})dV = \\oint_S (F\\cdot\\hat{N})dS.\n\\]\nThe divergence theorem is available for other dimensions. In the \\(n=2\\) case, it is the alternate (flow) form of Greens theorem.\nThe divergence theorem is used in Physics to express physical laws in either integral or differential form."
},
{
"objectID": "alternatives/plotly_plotting.html",
"href": "alternatives/plotly_plotting.html",
"title": "64  JavaScript based plotting libraries",
"section": "",
"text": "Not working with quarto\n\n\n\nCurrently, the plots generated here are not rendering within quarto.\nThis section uses this add-on package:\nTo avoid a dependence on the CalculusWithJulia package, we load two utility packages:\nJulia has different interfaces to a few JavaScript plotting libraries, notably the vega and vega-lite through the VegaLite.jl package, and plotly through several interfaces: Plots.jl, PlotlyJS.jl, and PlotlyLight.jl. These all make web-based graphics, for display through a web browser.\nThe Plots.jl interface is a backend for the familiar Plots package, making the calling syntax familiar, as is used throughout these notes. The plotly() command, from Plots, switches to this backend.\nThe PlotlyJS.jl interface offers direct translation from Julia structures to the underlying JSON structures needed by plotly, and has mechanisms to call back into Julia from JavaScript. This allows complicated interfaces to be produced.\nHere we discuss PlotlyLight which conveniently provides the translation from Julia structures to the JSON structures needed in a light-weight package, which plots quickly, without the delays due to compilation of the more complicated interfaces. Minor modifications would be needed to adjust the examples to work with PlotlyJS or PlotlyBase. The documentation for the JavaScript library provides numerous examples which can easily be translated. The one-page-reference gives specific details, and is quoted from below, at times.\nThis discussion covers the basic of graphing for calculus purposes. It does not cover, for example, the faceting common in statistical usages, or the chart types common in business and statistics uses. The plotly library is much more extensive than what is reviewed below."
},
{
"objectID": "alternatives/plotly_plotting.html#julia-dictionaries-to-json",
"href": "alternatives/plotly_plotting.html#julia-dictionaries-to-json",
"title": "64  JavaScript based plotting libraries",
"section": "64.1 Julia dictionaries to JSON",
"text": "64.1 Julia dictionaries to JSON\nPlotlyLight uses the JavaScript interface for the plotly libraries. Unlike more developed interfaces, like the one for Python, PlotlyLight only manages the translation from Julia structures to JavaScript structures and the display of the results.\nThe key to translation is the mapping of Julias dictionaries to the nested JSON structures needed by the JavaScript library.\nFor example, an introductory example for a scatter plot includes this JSON structure:\nvar trace1 = {\n x: [1, 2, 3, 4],\n y: [10, 15, 13, 17],\n mode: 'markers',\n type: 'scatter'\n};\nThe {} creates an object (like a Julia dictionary), the [] an Array (or vector, as with Julia), and the name: entries are keys. The above is simply translated via:\n\nConfig(x = [1,2,3,4],\n y = [10, 15, 13, 17],\n mode = \"markers\",\n type = \"scatter\"\n )\n\nConfig with 4 entries:\n :x => [1, 2, 3, 4]\n :y => [10, 15, 13, 17]\n :mode => \"markers\"\n :type => \"scatter\"\n\n\nThe Config constructor (from the EasyConfig package loaded with PlotlyLight) is an interface for a dictionary whose keys are symbols, which are produced by the named arguments passed to Config. By nesting Config statements, nested JavaScript structures can be built up. As well, these can be built on the fly using . notation, as in:\n\ncfg = Config()\ncfg.key1.key2.key3 = \"value\"\ncfg\n\nConfig with 1 entry:\n :key1 => Config(:key2=>Config(:key3=>\"value\"))\n\n\nProducing a figure with PlotlyLight is then fairly straightforward: data and, optionally, a layout are created using Config, then passed along to the Plot command producing a Plot object which has display methods defined for it. This will be illustrated through the examples."
},
{
"objectID": "alternatives/plotly_plotting.html#scatter-plot",
"href": "alternatives/plotly_plotting.html#scatter-plot",
"title": "64  JavaScript based plotting libraries",
"section": "64.2 Scatter plot",
"text": "64.2 Scatter plot\nA basic scatter plot of points \\((x,y)\\) is created as follows:\n\nxs = 1:5\nys = rand(5)\ndata = Config(x = xs,\n y = ys,\n type=\"scatter\",\n mode=\"markers\"\n )\nPlot(data)\n\n\n \n\n\n\n\n\nThe symbols x and y (and later z) specify the data to plotly. Here the mode is specified to show markers.\nThe type key specifies the chart or trace type. The mode specification sets the drawing mode for the trace. Above it is “markers”. It can be any combination of “lines”, “markers”, or “text” joined with a “+” if more than one is desired."
},
{
"objectID": "alternatives/plotly_plotting.html#line-plot",
"href": "alternatives/plotly_plotting.html#line-plot",
"title": "64  JavaScript based plotting libraries",
"section": "64.3 Line plot",
"text": "64.3 Line plot\nA line plot is very similar, save for a different mode specification:\n\nxs = 1:5\nys = rand(5)\ndata = Config(x = xs,\n y = ys,\n type=\"scatter\",\n mode=\"lines\"\n )\nPlot(data)\n\n\n \n\n\n\n\n\nThe difference is solely the specification of the mode value: for a line plot it is “lines;” for a scatter plot it is “markers.” The mode “lines+markers” will plot both. The default for the “scatter” types is to use “lines+markers” for small data sets, and “lines” for others, so for this example, mode could be left off.\n\n64.3.1 Nothing\nThe line graph plays connect-the-dots with the points specified by paired x and y values. Typically, when an x value is NaN that “dot” (or point) is skipped. However, NaN doesnt pass through the JSON conversion, so nothing can be used instead.\n\ndata = Config(\n x=[0,1,nothing,3,4,5],\n y = [0,1,2,3,4,5],\n type=\"scatter\", mode=\"markers+lines\")\nPlot(data)"
},
{
"objectID": "alternatives/plotly_plotting.html#multiple-plots",
"href": "alternatives/plotly_plotting.html#multiple-plots",
"title": "64  JavaScript based plotting libraries",
"section": "64.4 Multiple plots",
"text": "64.4 Multiple plots\nMore than one graph or layer can appear on a plot. The data argument can be a vector of Config values, each describing a plot. For example, here we make a scatter plot and a line plot:\n\ndata = [Config(x = 1:5,\n y = rand(5),\n type = \"scatter\",\n mode = \"markers\",\n name = \"scatter plot\"),\n Config(x = 1:5,\n y = rand(5),\n type = \"scatter\",\n mode = \"lines\",\n name = \"line plot\")\n ]\nPlot(data)\n\n\n \n\n\n\n\n\nThe name argument adjusts the name in the legend referencing the plot. This is produced by default.\n\n64.4.1 Adding a layer\nIn PlotlyLight, the Plot object has a field data for storing a vector of configurations, as above. After a plot is made, this field can have values pushed onto it and the corresponding layers will be rendered when the plot is redisplayed.\nFor example, here we plot the graphs of both the \\(\\sin(x)\\) and \\(\\cos(x)\\) over \\([0,2\\pi]\\). We used the utility PlotUtils.adapted_grid to select the points to use for the graph.\n\na, b = 0, 2pi\n\nxs, ys = PlotUtils.adapted_grid(sin, (a,b))\np = Plot(Config(x=xs, y=ys, name=\"sin\"))\n\nxs, ys = PlotUtils.adapted_grid(cos, (a,b))\npush!(p.data, Config(x=xs, y=ys, name=\"cos\"))\n\np # to display the plot\n\n\n \n\n\n\n\n\nThe values for a and b are used to generate the \\(x\\)- and \\(y\\)-values. These can also be gathered from the existing plot object. Here is one way, where for each trace with an x key, the extrema are consulted to update a list of left and right ranges.\n\nxs, ys = PlotUtils.adapted_grid(x -> x^5 - x - 1, (0, 2)) # answer is (0,2)\np = Plot([Config(x=xs, y=ys, name=\"Polynomial\"),\n Config(x=xs, y=0 .* ys, name=\"x-axis\", mode=\"lines\", line=Config(width=5))]\n )\nds = filter(d -> !isnothing(get(d, :x, nothing)), p.data)\na=reduce(min, [minimum(d.x) for d ∈ ds]; init=Inf)\nb=reduce(max, [maximum(d.x) for d ∈ ds]; init=-Inf)\n(a, b)\n\n(0.0, 2.0)"
},
{
"objectID": "alternatives/plotly_plotting.html#interactivity",
"href": "alternatives/plotly_plotting.html#interactivity",
"title": "64  JavaScript based plotting libraries",
"section": "64.5 Interactivity",
"text": "64.5 Interactivity\nJavaScript allows interaction with a plot as it is presented within a browser. (Not the Julia process which produced the data or the plot. For that interaction, PlotlyJS may be used.) The basic default features are:\n\nThe data producing a graphic are displayed on hover using flags.\nThe legend may be clicked to toggle whether the corresponding graph is displayed.\nThe viewing region can be narrowed using the mouse for selection.\nThe toolbar has several features for panning and zooming, as well as adjusting the information shown on hover.\n\nLater we will see that \\(3\\)-dimensional surfaces can be rotated interactively."
},
{
"objectID": "alternatives/plotly_plotting.html#plot-attributes",
"href": "alternatives/plotly_plotting.html#plot-attributes",
"title": "64  JavaScript based plotting libraries",
"section": "64.6 Plot attributes",
"text": "64.6 Plot attributes\nAttributes of the markers and lines may be adjusted when the data configuration is specified. A selection is shown below. Consult the reference for the extensive list.\n\n64.6.1 Marker attributes\nA markers attributes can be adjusted by values passed to the marker key. Labels for each marker can be assigned through a text key and adding text to the mode key. For example:\n\ndata = Config(x = 1:5,\n y = rand(5),\n mode=\"markers+text\",\n type=\"scatter\",\n name=\"scatter plot\",\n text = [\"marker $i\" for i in 1:5],\n textposition = \"top center\",\n marker = Config(size=12, color=:blue)\n )\nPlot(data)\n\n\n \n\n\n\n\n\nThe text mode specification is necessary to have text be displayed on the chart, and not just appear on hover. The size and color attributes are recycled; they can be specified using a vector for per-marker styling. Here the symbol :blue is used to specify a color, which could also be a name, such as \"blue\".\n\nRGB Colors\nThe ColorTypes package is the standard Julia package providing an RGB type (among others) for specifying red-green-blue colors. To make this work with Config and JSON3 requires some type-piracy (modifying Base.string for the RGB type) to get, say, RGB(0.5, 0.5, 0.5) to output as \"rgb(0.5, 0.5, 0.5)\". (RGB values in JavaScript are integers between \\(0\\) and \\(255\\) or floating point values between \\(0\\) and \\(1\\).) A string with this content can be specified. Otherwise, something like the following can be used to avoid the type piracy:\nstruct rgb\n r\n g\n b\nend\nPlotlyLight.JSON3.StructTypes.StructType(::Type{rgb}) = PlotlyLight.JSON3.StructTypes.StringType()\nBase.string(x::rgb) = \"rgb($(x.r), $(x.g), $(x.b))\"\nWith these defined, red-green-blue values can be used for colors. 
For example, to give a range of colors, we might have:\n\ncols = [rgb(i,i,i) for i in range(10, 245, length=5)]\nsizes = [12, 16, 20, 24, 28]\ndata = Config(x = 1:5,\n y = rand(5),\n mode=\"markers+text\",\n type=\"scatter\",\n name=\"scatter plot\",\n text = [\"marker $i\" for i in 1:5],\n textposition = \"top center\",\n marker = Config(size=sizes, color=cols)\n )\nPlot(data)\n\n\n \n\n\n\n\n\nThe opacity key can be used to control the transparency, with a value between \\(0\\) and \\(1\\).\n\n\nMarker symbols\nThe marker_symbol key can be used to set a marker shape, with the basic values being: circle, square, diamond, cross, x, triangle, pentagon, hexagram, star, diamond, hourglass, bowtie, asterisk, hash, y, and line. Adding -open or -open-dot modifies the basic shape.\n\nmarkers = [\"circle\", \"square\", \"diamond\", \"cross\", \"x\", \"triangle\", \"pentagon\",\n \"hexagram\", \"star\", \"diamond\", \"hourglass\", \"bowtie\", \"asterisk\",\n \"hash\", \"y\", \"line\"]\nn = length(markers)\ndata = [Config(x=1:n, y=1:n, mode=\"markers\",\n marker = Config(symbol=markers, size=10)),\n Config(x=1:n, y=2 .+ (1:n), mode=\"markers\",\n marker = Config(symbol=markers .* \"-open\", size=10)),\n Config(x=1:n, y=4 .+ (1:n), mode=\"markers\",\n marker = Config(symbol=markers .* \"-open-dot\", size=10))\n ]\nPlot(data)\n\n\n \n\n\n\n\n\n\n\n\n64.6.2 Line attributes\nThe line key can be used to specify line attributes, such as width (pixel width), color, or dash.\nThe width key specifies the line width in pixels.\nThe color key specifies the color of the line drawn.\nThe dash key specifies the style for the drawn line. Values can be set by string from “solid”, “dot”, “dash”, “longdash”, “dashdot”, or “longdashdot” or set by specifying a pattern in pixels, e.g. “5px,10px,2px,2px”.\nThe shape attribute determines how the points are connected. The default is linear, but other possibilities are hv, vh, hvh, vhv, spline for various patterns of connectivity. 
The following example, from the plotly documentation, shows the differences:\n\nshapes = [\"linear\", \"hv\", \"vh\", \"hvh\", \"vhv\", \"spline\"]\ndata = [Config(x = 1:5, y = 5*(i-1) .+ [1,3,2,3,1], mode=\"lines+markers\", type=\"scatter\",\n name=shape,\n line=Config(shape=shape)\n ) for (i, shape) ∈ enumerate(shapes)]\nPlot(data)\n\n\n \n\n\n\n\n\n\n\n64.6.3 Text\nThe text associated with each point can be drawn on the chart, when “text” is included in the mode, or shown on hover.\nThe onscreen text is passed to the text attribute. The texttemplate key can be used to format the text with details in the accompanying link.\nSimilarly, the hovertext key specifies the text shown on hover, with hovertemplate used to format the displayed text.\n\n\n64.6.4 Filled regions\nThe fill key for a chart of mode line specifies how the area around a chart should be colored, or filled. The specifications are declarative, with values in “none”, “tozeroy”, “tozerox”, “tonexty”, “tonextx”, “toself”, and “tonext”. The value of “none” is the default, unless stacked traces are used.\nIn the following, to highlight the difference between \\(f(x) = \\cos(x)\\) and \\(p(x) = 1 - x^2/2\\), the area from \\(f\\) to the next \\(y\\) is declared; for \\(p\\), the area to \\(0\\) is declared.\n\nxs = range(-1, 1, 100)\ndata = [\n Config(\n x=xs, y=cos.(xs),\n fill = \"tonexty\",\n fillcolor = \"rgba(0,0,255,0.25)\", # to get transparency\n line = Config(color=:blue)\n ),\n Config(\n x=xs, y=[1 - x^2/2 for x ∈ xs ],\n fill = \"tozeroy\",\n fillcolor = \"rgba(255,0,0,0.25)\", # to get transparency\n line = Config(color=:red)\n )\n]\nPlot(data)\n\n\n \n\n\n\n\n\nThe toself declaration is used below to fill in a polygon:\n\ndata = Config(\n x=[-1,1,1,-1,-1], y = [-1,1,-1,1,-1],\n fill=\"toself\",\n type=\"scatter\")\nPlot(data)"
},
{
"objectID": "alternatives/plotly_plotting.html#layout-attributes",
"href": "alternatives/plotly_plotting.html#layout-attributes",
"title": "64  JavaScript based plotting libraries",
"section": "64.7 Layout attributes",
"text": "64.7 Layout attributes\nThe title key sets the main title; the title key in the xaxis configuration sets the \\(x\\)-axis title (similarly for the \\(y\\) axis).\nThe legend is shown when \\(2\\) or more charts are specified, by default. This can be adjusted with the showlegend key, as below. The legend shows the corresponding name for each chart.\n\ndata = Config(x=1:5, y=rand(5), type=\"scatter\", mode=\"markers\", name=\"legend label\")\nlyt = Config(title = \"Main chart title\",\n xaxis = Config(title=\"x-axis label\"),\n yaxis = Config(title=\"y-axis label\"),\n showlegend=true\n )\nPlot(data, lyt)\n\n\n \n\n\n\n\n\nThe xaxis and yaxis keys have many customizations. For example: nticks specifies the maximum number of ticks; range to set the range of the axis; type to specify the axis type from “linear”, “log”, “date”, “category”, or “multicategory;” and visible to toggle whether the axis is displayed.\nThe aspect ratio of the chart can be set to be equal through the scaleanchor key, which specifies another axis to take a value from. For example, here is a parametric plot of a circle:\n\nts = range(0, 2pi, length=100)\ndata = Config(x = sin.(ts), y = cos.(ts), mode=\"lines\", type=\"scatter\")\nlyt = Config(title = \"A circle\",\n xaxis = Config(title = \"x\"),\n yaxis = Config(title = \"y\",\n scaleanchor = \"x\")\n )\nPlot(data, lyt)\n\n\n \n\n\n\n\n\n\nAnnotations\nText annotations may be specified as part of the layout object. Annotations may or may not show an arrow. Here is a simple example using a vector of annotations.\n\ndata = Config(x = [0, 1], y = [0, 1], mode=\"markers\", type=\"scatter\")\nlayout = Config(title = \"Annotations\",\n xaxis = Config(title=\"x\",\n range = (-0.5, 1.5)),\n yaxis = Config(title=\"y\",\n range = (-0.5, 1.5)),\n annotations = [\n Config(x=0, y=0, text = \"(0,0)\"),\n Config(x=1, y=1.2, text = \"(1,1)\", showarrow=false)\n ]\n )\nPlot(data, layout)\n\n\n \n\n\n\n\n\nThe following example is a more complicated use of the elements previously described. 
It mimics an image from Wikipedia for trigonometric identities. The use of \\(\\LaTeX\\) does not seem to be supported through the JavaScript interface; unicode symbols are used instead. The xanchor and yanchor keys are used to position annotations away from the default. The textangle key is used to rotate text, as desired.\n\nalpha = pi/6\nbeta = pi/5\nxₘ = cos(alpha)*cos(beta)\nyₘ = sin(alpha+beta)\nr₀ = 0.1\n\ndata = [\n Config(\n x = [0,xₘ, xₘ, 0, 0],\n y = [0, 0, yₘ, yₘ, 0],\n type=\"scatter\", mode=\"line\"\n ),\n Config(\n x = [0, xₘ],\n y = [0, sin(alpha)*cos(beta)],\n fill = \"tozeroy\",\n fillcolor = \"rgba(100, 100, 100, 0.5)\"\n ),\n Config(\n x = [0, cos(alpha+beta), xₘ],\n y = [0, yₘ, sin(alpha)*cos(beta)],\n fill = \"tonexty\",\n fillcolor = \"rgba(200, 0, 100, 0.5)\",\n ),\n Config(\n x = [0, cos(alpha+beta)],\n y = [0, yₘ],\n line = Config(width=5, color=:black)\n )\n]\n\nlyt = Config(\n height=450,\n showlegend=false,\n xaxis=Config(visible=false),\n yaxis = Config(visible=false, scaleanchor=\"x\"),\n annotations = [\n\n Config(x = r₀*cos(alpha/2), y = r₀*sin(alpha/2),\n text=\"α\", showarrow=false),\n Config(x = r₀*cos(alpha+beta/2), y = r₀*sin(alpha+beta/2),\n text=\"β\", showarrow=false),\n Config(x = cos(alpha+beta) + r₀*cos(pi+(alpha+beta)/2),\n y = yₘ + r₀*sin(pi+(alpha+beta)/2),\n xanchor=\"center\", yanchor=\"center\",\n text=\"α+β\", showarrow=false),\n Config(x = xₘ + r₀*cos(pi/2+alpha/2),\n y = sin(alpha)*cos(beta) + r₀ * sin(pi/2 + alpha/2),\n text=\"α\", showarrow=false),\n Config(x = 1/2 * cos(alpha+beta),\n y = 1/2 * sin(alpha+beta),\n text = \"1\"),\n Config(x = xₘ/2*cos(alpha), y = xₘ/2*sin(alpha),\n xanchor=\"center\", yanchor=\"bottom\",\n text = \"cos(β)\",\n textangle=-rad2deg(alpha),\n showarrow=false),\n Config(x = xₘ + sin(beta)/2*cos(pi/2 + alpha),\n y = sin(alpha)*cos(beta) + sin(beta)/2*sin(pi/2 + alpha),\n xanchor=\"center\", yanchor=\"top\",\n text = \"sin(β)\",\n textangle = rad2deg(pi/2-alpha),\n 
showarrow=false),\n\n Config(x = xₘ/2,\n y = 0,\n xanchor=\"center\", yanchor=\"top\",\n text = \"cos(α)⋅cos(β)\", showarrow=false),\n Config(x = 0,\n y = yₘ/2,\n xanchor=\"right\", yanchor=\"center\",\n text = \"sin(α+β)\",\n textangle=-90,\n showarrow=false),\n Config(x = cos(alpha+beta)/2,\n y = yₘ,\n xanchor=\"center\", yanchor=\"bottom\",\n text = \"cos(α+β)\", showarrow=false),\n Config(x = cos(alpha+beta) + (xₘ - cos(alpha+beta))/2,\n y = yₘ,\n xanchor=\"center\", yanchor=\"bottom\",\n text = \"sin(α)⋅sin(β)\", showarrow=false),\n Config(x = xₘ, y=sin(alpha)*cos(beta) + (yₘ - sin(alpha)*cos(beta))/2,\n xanchor=\"left\", yanchor=\"center\",\n text = \"cos(α)⋅sin(β)\",\n textangle=90,\n showarrow=false),\n Config(x = xₘ,\n y = sin(alpha)*cos(beta)/2,\n xanchor=\"left\", yanchor=\"center\",\n text = \"sin(α)⋅cos(β)\",\n textangle=90,\n showarrow=false)\n ]\n)\n\nPlot(data, lyt)"
},
{
"objectID": "alternatives/plotly_plotting.html#parameterized-curves",
"href": "alternatives/plotly_plotting.html#parameterized-curves",
"title": "64  JavaScript based plotting libraries",
"section": "64.8 Parameterized curves",
"text": "64.8 Parameterized curves\nIn \\(2\\)-dimensions, the plotting of a parameterized curve is similar to that of plotting a function. In \\(3\\)-dimensions, an extra \\(z\\)-coordinate is included.\nTo help, we define an unzip function as an interface to SplitApplyCombines invert function:\n\nunzip(v) = SplitApplyCombine.invert(v)\n\nunzip (generic function with 1 method)\n\n\nEarlier, we plotted a two dimensional circle, here we plot the related helix.\n\nhelix(t) = [cos(t), sin(t), t]\n\nts = range(0, 4pi, length=200)\n\nxs, ys, zs = unzip(helix.(ts))\n\ndata = Config(x=xs, y=ys, z=zs,\n type = \"scatter3d\", # <<- note the 3d\n mode = \"lines\",\n line=(width=2,\n color=:red)\n )\n\nPlot(data)\n\n\n \n\n\n\n\n\nThe main difference is the chart type, as this is a \\(3\\)-dimensional plot, “scatter3d” is used.\n\n64.8.1 Quiver plots\nThere is no quiver plot for plotly using JavaScript. In \\(2\\)-dimensions a text-less annotation could be employed. In \\(3\\)-dimensions, the following (from stackoverflow.com) is a possible workaround where a line segment is drawn and capped with a small cone. 
Somewhat opaquely, we use NamedTuple for an iterator to create the keys for the data below:\n\nhelix(t) = [cos(t), sin(t), t]\nhelix′(t) = [-sin(t), cos(t), 1]\nts = range(0, 4pi, length=200)\nxs, ys, zs = unzip(helix.(ts))\nhelix_trace = Config(; NamedTuple(zip((:x,:y,:z), unzip(helix.(ts))))...,\n type = \"scatter3d\", # <<- note the 3d\n mode = \"lines\",\n line=(width=2,\n color=:red)\n )\n\ntss = pi/2:pi/2:7pi/2\nrs, rs′ = helix.(tss), helix′.(tss)\n\narrows = [\n Config(x = [x[1], x[1]+x′[1]],\n y = [x[2], x[2]+x′[2]],\n z = [x[3], x[3]+x′[3]],\n mode=\"lines\", type=\"scatter3d\")\n for (x, x′) ∈ zip(rs, rs′)\n]\n\ntips = rs .+ rs′\nlengths = 0.1 * rs′\n\ncaps = Config(;\n NamedTuple(zip([:x,:y,:z], unzip(tips)))...,\n NamedTuple(zip([:u,:v,:w], unzip(lengths)))...,\n type=\"cone\", anchor=\"tail\")\n\ndata = vcat(helix_trace, arrows, caps)\n\nPlot(data)\n\n\n \n\n\n\n\n\nIf several arrows are to be drawn, it might be more efficient to pass multiple values in for the x, y, … values. They expect a vector. In the above, we create \\(1\\)-element vectors."
},
{
"objectID": "alternatives/plotly_plotting.html#contour-plots",
"href": "alternatives/plotly_plotting.html#contour-plots",
"title": "64  JavaScript based plotting libraries",
"section": "64.9 Contour plots",
"text": "64.9 Contour plots\nA contour plot is created by the “contour” trace type. The data is prepared as a vector of vectors, not a matrix. The following has the interior vector corresponding to slices ranging over \\(x\\) for a fixed \\(y\\). With this, the construction is straightforward using a comprehension:\n\nf(x,y) = x^2 - 2y^2\n\nxs = range(0,2,length=25)\nys = range(0,2, length=50)\nzs = [[f(x,y) for x in xs] for y in ys]\n\ndata = Config(\n x=xs, y=ys, z=zs,\n type=\"contour\"\n)\n\nPlot(data)\n\n\n \n\n\n\n\n\nThe same zs data can be achieved by broadcasting and then collecting as follows:\n\nf(x,y) = x^2 - 2y^2\n\nxs = range(0,2,length=25)\nys = range(0,2, length=50)\nzs = collect(eachrow(f.(xs', ys)))\n\ndata = Config(\n x=xs, y=ys, z=zs,\n type=\"contour\"\n)\n\nPlot(data)\n\n\n \n\n\n\n\n\nThe use of just f.(xs', ys) or f.(xs, ys'), as with other plotting packages, is not effective, as JSON3 writes matrices as vectors (with linear indexing)."
},
{
"objectID": "alternatives/plotly_plotting.html#surface-plots",
"href": "alternatives/plotly_plotting.html#surface-plots",
"title": "64  JavaScript based plotting libraries",
"section": "64.10 Surface plots",
"text": "64.10 Surface plots\nThe chart type “surface” allows surfaces in \\(3\\) dimensions to be plotted.\n\n64.10.1 Surfaces defined by \\(z = f(x,y)\\)\nSurfaces defined through a scalar-valued function are drawn quite naturally, save for needing to express the height data (\\(z\\) axis) using a vector of vectors, and not a matrix.\n\npeaks(x,y) = 3 * (1-x)^2 * exp(-(x^2) - (y+1)^2) -\n 10*(x/5 - x^3 - y^5) * exp(-x^2-y^2) - 1/3 * exp(-(x+1)^2 - y^2)\n\nxs = range(-3,3, length=50)\nys = range(-3,3, length=50)\nzs = [[peaks(x,y) for x in xs] for y in ys]\n\ndata = Config(x=xs, y=ys, z=zs,\n type=\"surface\")\n\nPlot(data)\n\n\n \n\n\n\n\n\n\n\n64.10.2 Parametrically defined surfaces\nFor parametrically defined surfaces, the \\(x\\) and \\(y\\) values also correspond to matrices. Her we see a pattern to plot a torus. The aspectmode instructs the scenes axes to be drawn in proportion with the axes ranges.\n\nr, R = 1, 5\nX(theta,phi) = [(r*cos(theta)+R)*cos(phi), (r*cos(theta)+R)*sin(phi), r*sin(theta)]\n\nus = range(0, 2pi, length=25)\nvs = range(0, pi, length=25)\n\nxs = [[X(u,v)[1] for u in us] for v in vs]\nys = [[X(u,v)[2] for u in us] for v in vs]\nzs = [[X(u,v)[3] for u in us] for v in vs]\n\ndata = Config(\n x = xs, y = ys, z = zs,\n type=\"surface\",\n mode=\"scatter3d\"\n)\n\nlyt = Config(scene=Config(aspectmode=\"data\"))\n\nPlot(data, lyt)"
},
{
"objectID": "alternatives/makie_plotting.html",
"href": "alternatives/makie_plotting.html",
"title": "65  Calculus plots with Makie",
"section": "",
"text": "The Makie.jl webpage says\nMakie itself is a metapackage for a rich ecosystem. We show how to use the interface provided by the GLMakie backend to produce the familiar graphics of calculus."
},
{
"objectID": "alternatives/makie_plotting.html#figures",
"href": "alternatives/makie_plotting.html#figures",
"title": "65  Calculus plots with Makie",
"section": "65.1 Figures",
"text": "65.1 Figures\nMakie draws graphics onto a canvas termed a “scene” in the Makie documentation. A scene is an implementation detail, the basic (non-mutating) plotting commands described below return a FigureAxisPlot object, a compound object that combines a figure, an axes, and a plot object. The show method for these objects display the figure.\nFor Makie there are the GLMakie, WGLMakie, and CairoMakie backends for different types of canvases. In the following, we have used GLMakie. WGLMakie is useful for incorporating Makie plots into web-based technologies.\nWe begin by loading the main package and the norm function from the standard LinearAlgebra package:\nusing GLMakie\nimport LinearAlgebra: norm\nThe Makie developers have workarounds for the delayed time to first plot, but without utilizing these the time to load the package is lengthy."
},
{
"objectID": "alternatives/makie_plotting.html#points-scatter",
"href": "alternatives/makie_plotting.html#points-scatter",
"title": "65  Calculus plots with Makie",
"section": "65.2 Points (scatter)",
"text": "65.2 Points (scatter)\nThe task of plotting the points, say \\((1,2)\\), \\((2,3)\\), \\((3,2)\\) can be done different ways. Most plotting packages, and Makie is no exception, allow the following: form vectors of the \\(x\\) and \\(y\\) values then plot those with scatter:\n\nxs = [1,2,3]\nys = [2,3,2]\nscatter(xs, ys)\n\n\n\n\nThe scatter function creates and returns an object, which when displayed shows the plot.\n\n65.2.1 Point2, Point3\nWhen learning about points on the Cartesian plane, a “t”-chart is often produced:\nx | y\n-----\n1 | 2\n2 | 3\n3 | 2\nThe scatter usage above used the columns. The rows are associated with the points, and these too can be used to produce the same graphic. Rather than make vectors of \\(x\\) and \\(y\\) (and optionally \\(z\\)) coordinates, it is more idiomatic to create a vector of “points.” Makie utilizes a Point type to store a 2 or 3 dimensional point. The Point2 and Point3 constructors will be utilized.\nMakie uses a GPU, when present, to accelerate the graphic rendering. GPUs employ 32-bit numbers. Julia uses an f0 to indicate 32-bit floating points. Hence the alternate types Point2f0 to store 2D points as 32-bit numbers and Points3f0 to store 3D points as 32-bit numbers are seen in the documentation for Makie.\nWe can plot a vector of points in as direct manner as vectors of their coordinates:\n\npts = [Point2(1,2), Point2(2,3), Point2(3,2)]\nscatter(pts)\n\n\n\n\nA typical usage is to generate points from some vector-valued function. 
Say we have a parameterized function r taking \\(R\\) into \\(R^2\\) defined by:\n\nr(t) = [sin(t), cos(t)]\n\nr (generic function with 1 method)\n\n\nThen broadcasting values gives a vector of vectors, each identified with a point:\n\nts = [1,2,3]\nr.(ts)\n\n3-element Vector{Vector{Float64}}:\n [0.8414709848078965, 0.5403023058681398]\n [0.9092974268256817, -0.4161468365471424]\n [0.1411200080598672, -0.9899924966004454]\n\n\nWe can broadcast Point2 over this to create a vector of Point objects:\n\npts = Point2.(r.(ts))\n\n3-element Vector{Point2{Float64}}:\n [0.8414709848078965, 0.5403023058681398]\n [0.9092974268256817, -0.4161468365471424]\n [0.1411200080598672, -0.9899924966004454]\n\n\nThese then can be plotted directly:\n\nscatter(pts)\n\n\n\n\nThe plotting of points in three dimensions is essentially the same, save the use of Point3 instead of Point2.\n\nr(t) = [sin(t), cos(t), t]\nts = range(0, 4pi, length=100)\npts = Point3.(r.(ts))\nscatter(pts; markersize=5)\n\n\n\n\n\nTo plot points generated in terms of vectors of coordinates, the component vectors must be created. The “t”-table shows how: simply loop over each column and add the corresponding \\(x\\) or \\(y\\) (or \\(z\\)) value. This utility function does exactly that, returning the vectors in a tuple.\n\nunzip(vs) = Tuple([vs[j][i] for j in eachindex(vs)] for i in eachindex(vs[1]))\n\nunzip (generic function with 1 method)\n\n\n\n\n\n\n\nNote\n\n\n\nIn the CalculusWithJulia package, unzip is implemented using SplitApplyCombine.invert.\n\n\nWe might have then:\n\nscatter(unzip(r.(ts))...; markersize=5)\n\n\n\n\nwhere splatting is used to specify the xs, ys, and zs to scatter.\n(Compare to scatter(Point3.(r.(ts))) or scatter((Point3∘r).(ts)).)\n\n\n65.2.2 Attributes\nA point is drawn with a “marker” with a certain size and color. 
These attributes can be adjusted, as in the following:\n\nscatter(xs, ys;\n marker=[:x,:cross, :circle], markersize=25,\n color=:blue)\n\n\n\n\nMarker attributes include\n\nmarker a symbol, shape.\nmarker_offset offset coordinates\nmarkersize size (radius pixels) of marker\n\nA single value will be repeated. A vector of values of a matching size will specify the attribute on a per point basis."
},
{
"objectID": "alternatives/makie_plotting.html#curves",
"href": "alternatives/makie_plotting.html#curves",
"title": "65  Calculus plots with Makie",
"section": "65.3 Curves",
"text": "65.3 Curves\nThe curves of calculus are lines. The lines command of Makie will render a curve by connecting a series of points with straight-line segments. By taking a sufficient number of points the connect-the-dot figure can appear curved.\n\n65.3.1 Plots of univariate functions\nThe basic plot of univariate calculus is the graph of a function \\(f\\) over an interval \\([a,b]\\). This is implemented using a familiar strategy: produce a series of representative values between \\(a\\) and \\(b\\); produce the corresponding \\(f(x)\\) values; plot these as points and connect the points with straight lines.\nTo create regular values between a and b typically the range function or the range operator (a:h:b) are employed. The the related LinRange function is also an option.\nFor example:\n\nf(x) = sin(x)\na, b = 0, 2pi\nxs = range(a, b, length=250)\nlines(xs, f.(xs))\n\n\n\n\nMakie also will read the interval notation of IntervalSets and select its own set of intermediate points:\n\nlines(a..b, f)\n\n\n\n\nAs with scatter, lines returns an object that produces a graphic when displayed.\nAs with scatter, lines can can also be drawn using a vector of points:\n\npts = [Point2(x, f(x)) for x ∈ xs]\nlines(pts)\n\n\n\n\n(Though the advantage isnt clear here, this will be useful when the points are generated in different manners.)\nWhen a y value is NaN or infinite, the connecting lines are not drawn:\n\nxs = 1:5\nys = [1,2,NaN, 4, 5]\nlines(xs, ys)\n\n\n\n\nAs with other plotting packages, this is useful to represent discontinuous functions, such as what occurs at a vertical asymptote or a step function.\n\nAdding to a figure (lines!, scatter!, …)\nTo add or modify a scene can be done using a mutating version of a plotting primitive, such as lines! or scatter!. The names follow Julias convention of using an ! 
to indicate that a function modifies an argument, in this case the underlying figure.\nHere is one way to show two plots at once:\n\nxs = range(0, 2pi, length=100)\nlines(xs, sin.(xs))\nlines!(xs, cos.(xs))\ncurrent_figure()\n\n\n\n\n\n\n\n\n\n\nCurrent figure\n\n\n\nThe current_figure call is needed to have the figure display, as the returned value of lines! is not a figure object. (Figure objects display when shown as the output of a cell.)\n\n\nWe will see soon how to modify the line attributes so that the curves can be distinguished.\nThe following shows the construction details in the graphic:\n\nxs = range(0, 2pi, length=10)\nlines(xs, sin.(xs))\nscatter!(xs, sin.(xs);\n markersize=10)\ncurrent_figure()\n\n\n\n\nAs an example, this shows how to add the tangent line to a graph. The slope of the tangent line is computed by ForwardDiff.derivative.\n\nimport ForwardDiff\nf(x) = x^x\na, b= 0, 2\nc = 0.5\nxs = range(a, b, length=200)\n\ntl(x) = f(c) + ForwardDiff.derivative(f, c) * (x-c)\n\nlines(xs, f.(xs))\nlines!(xs, tl.(xs), color=:blue)\ncurrent_figure()\n\n\n\n\nThis example, modified from a discourse post by user @rafael.guerra, shows how to plot a step function (floor) using NaNs to create line breaks. The marker colors set for scatter! use :white to match the background color.\n\nx = -5:5\nδ = 5eps() # for rounding purposes; our interval is [i,i+1) ≈ [i, i+1-δ]\nxx = Float64[]\nfor i ∈ x[1:end-1]\n append!(xx, (i, i+1 - δ, NaN))\nend\nyy = floor.(xx)\n\nlines(xx, yy)\nscatter!(xx, yy, color=repeat([:black, :white, :white], length(xx)÷3))\n\ncurrent_figure()\n\n\n\n\n\n\n\n65.3.2 Text (annotations)\nText can be placed at a point, as a marker is. 
To place text, the desired text and a position need to be specified along with any adjustments to the default attributes.\nFor example:\n\nxs = 1:5\npts = Point2.(xs, xs)\nscatter(pts)\nannotations!(\"Point \" .* string.(xs), pts;\n textsize = 50 .- 2*xs,\n rotation = 2pi ./ xs)\n\ncurrent_figure()\n\n\n\n\nThe graphic shows that textsize adjusts the displayed size and rotation adjusts the orientation. (The graphic also shows a need to manually override the limits of the y axis, as the Point 5 is chopped off; the ylims! function to do so will be shown later.)\nAttributes for text, among many others, include:\n\nalign Specify the text alignment through (:pos, :pos), where :pos can be :left, :center, or :right.\nrotation to indicate how the text is to be rotated\ntextsize the font point size for the text\nfont to indicate the desired font\n\n\nLine attributes\nIn a previous example, we added the argument color=:blue to the lines! call. This was to set an attribute for the line being drawn. Lines have other attributes that allow different ones to be distinguished, as above where colors indicate the different graphs.\nOther attributes can be seen from the help page for lines, and include:\n\ncolor set with a symbol, as above, or a string\nlabel a label for the line to display in a legend\nlinestyle available styles are set by a symbol, one of :dash, :dot, :dashdot, or :dashdotdot.\nlinewidth width of line\ntransparency the alpha value, a number between \\(0\\) and \\(1\\), smaller numbers for more transparent.\n\n\n\nSimple legends\nA simple legend displaying labels given to each curve can be produced by axislegend. For example:\n\nxs = 0..pi\nlines(xs, x -> sin(x^2), label=\"sin(x^2)\")\nlines!(xs, x -> sin(x)^2, label = \"sin(x)^2\")\naxislegend()\n\ncurrent_figure()\n\n\n\n\nLater, we will see how to control the placement of a legend within a figure.\n\n\nTitles, axis labels, axis ticks\nThe basic plots we have seen are of type FigureAxisPlot. 
The “axis” part controls attributes of the plot such as titles, labels, tick positions, etc. These values can be set in different manners. On construction we can pass values to a named argument axis using a named tuple.\nFor example:\n\nxs = 0..2pi\nlines(xs, sin;\n axis=(title=\"Plot of sin(x)\", xlabel=\"x\", ylabel=\"sin(x)\")\n )\n\n\n\n\nTo access the axis element of a plot after the plot is constructed, values can be assigned to the axis property of the FigureAxisPlot object. For example:\n\nxs = 0..2pi\np = lines(xs, sin;\n axis=(title=\"Plot of sin(x)\", xlabel=\"x\", ylabel=\"sin(x)\")\n )\np.axis.xticks = MultiplesTicks(5, pi, \"π\") # label 5 times using `pi`\n\ncurrent_figure()\n\n\n\n\nThe ticks are most easily set as a collection of values. Above, the MultiplesTicks function was used to label with multiples of \\(\\pi\\).\nLater we will discuss how Makie allows for subsequent modification of several parts of the plot (not just the ticks) including the data.\n\n\nFigure resolution, \\(x\\) and \\(y\\) limits\nAs just mentioned, the basic plots we have seen are of type FigureAxisPlot. The “figure” part can be used to adjust the background color or the resolution. As with attributes for the axis, these too can be passed to a simple constructor:\n\nlines(xs, sin;\n axis=(title=\"Plot of sin(x)\", xlabel=\"x\", ylabel=\"sin(x)\"),\n figure=(;resolution=(300, 300))\n )\n\n\n\n\nThe ; in the tuple passed to figure is one way to create a named tuple with a single element. Alternatively, (resolution=(300,300), ) with a trailing comma could have been used.\nTo set the limits of the graph there are shorthand functions xlims!, ylims!, and zlims!. 
This might prove useful if vertical asymptotes are encountered, as in this example:\n\nf(x) = 1/x\na,b = -1, 1\nxs = range(-1, 1, length=200)\nlines(xs, f.(xs))\nylims!(-10, 10)\n\ncurrent_figure()\n\n\n\n\nThis still leaves the artifact due to the vertical asymptote at \\(0\\) having different values from the left and the right.\n\n\n\n65.3.3 Plots of parametric functions\nA space curve is a plot of a function \\(f:R \\rightarrow R^2\\) or \\(f:R \\rightarrow R^3\\).\nTo construct a curve from a set of points, we have a similar pattern in both \\(2\\) and \\(3\\) dimensions:\n\nr(t) = [sin(2t), cos(3t)]\nts = range(0, 2pi, length=200)\npts = Point2.(r.(ts)) # or (Point2∘r).(ts)\nlines(pts)\n\n\n\n\nOr\n\nr(t) = [sin(2t), cos(3t), t]\nts = range(0, 2pi, length=200)\npts = Point3.(r.(ts))\nlines(pts)\n\n\n\n\nAlternatively, vectors of the \\(x\\), \\(y\\), and \\(z\\) components can be produced and then plotted using the pattern lines(xs, ys) or lines(xs, ys, zs). For example, using unzip, as above, we might have done the prior example with:\n\nxs, ys, zs = unzip(r.(ts))\nlines(xs, ys, zs)\n\n\n\n\n\nAspect ratio\nA simple plot of a parametrically defined circle will show an ellipse, as the aspect ratio of the \\(x\\) and \\(y\\) axis is not \\(1\\). To enforce this, we can pass a value of aspect=1 to the underlying “Axis” object. For example:\n\nts = range(0, 2pi, length=100)\nxs, ys = sin.(ts), cos.(ts)\nlines(xs, ys; axis=(; aspect = 1))\n\n\n\n\n\n\nTangent vectors (arrows)\nA tangent vector along a curve can be drawn quite easily using the arrows function. There are different interfaces for arrows, but we show the one which uses a vector of positions and a vector of “vectors”. 
For the latter, we utilize the derivative function from ForwardDiff:\n\nr(t) = [sin(t), cos(t)] # vector, not tuple\nts = range(0, 4pi, length=200)\nlines(Point2.(r.(ts)))\n\nnts = 0:pi/4:2pi\nus = r.(nts)\ndus = ForwardDiff.derivative.(r, nts)\n\narrows!(Point2.(us), Point2.(dus))\n\ncurrent_figure()\n\n\n\n\nIn 3 dimensions the differences are minor:\n\nr(t) = [sin(t), cos(t), t] # vector, not tuple\nts = range(0, 4pi, length=200)\nlines(Point3.(r.(ts)))\n\nnts = 0:pi/2:(4pi-pi/2)\nus = r.(nts)\ndus = ForwardDiff.derivative.(r, nts)\n\narrows!(Point3.(us), Point3.(dus))\n\ncurrent_figure()\n\n\n\n\n\n\nArrow attributes\nAttributes for arrows include\n\narrowsize to adjust the size\nlengthscale to scale the size\narrowcolor to set the color\narrowhead to adjust the head\narrowtail to adjust the tail"
},
{
"objectID": "alternatives/makie_plotting.html#surfaces",
"href": "alternatives/makie_plotting.html#surfaces",
"title": "65  Calculus plots with Makie",
"section": "65.4 Surfaces",
"text": "65.4 Surfaces\nPlots of surfaces in \\(3\\) dimensions are useful to help understand the behavior of multivariate functions.\n\nSurfaces defined through \\(z=f(x,y)\\)\nThe “peaks” function defined below has a few prominent peaks:\n\nfunction peaks(x, y)\n p = 3*(1-x)^2*exp(-x^2 - (y+1)^2)\n p -= 10(x/5-x^3-y^5)*exp(-x^2-y^2)\n p -= 1/3*exp(-(x+1)^2-y^2)\n p\nend\n\npeaks (generic function with 1 method)\n\n\nHere we see how peaks can be visualized over the region \\([-5,5]\\times[-5,5]\\):\n\nxs = ys = range(-5, 5, length=25)\nsurface(xs, ys, peaks)\n\n\n\n\nThe calling pattern surface(xs, ys, f) implies a rectangular grid over the \\(x\\)-\\(y\\) plane defined by xs and ys with \\(z\\) values given by \\(f(x,y)\\).\nAlternatively a “matrix” of \\(z\\) values can be specified. For a function f, this is conveniently generated by the pattern f.(xs, ys'), the ' being important to get a matrix of all \\(x\\)-\\(y\\) pairs through Julias broadcasting syntax.\n\nzs = peaks.(xs, ys')\nsurface(xs, ys, zs)\n\n\n\n\nTo see how this graph is constructed, the points \\((x,y,f(x,y))\\) are plotted over the grid and displayed.\nHere we downsample to illustrate:\n\nxs = ys = range(-5, 5, length=5)\npts = [Point3(x, y, peaks(x,y)) for x in xs for y in ys]\nscatter(pts, markersize=25)\n\n\n\n\nThese points are then connected. The wireframe function illustrates just the frame:\n\nwireframe(xs, ys, peaks.(xs, ys'); linewidth=5)\n\n\n\n\nThe surface call triangulates the frame and fills in the shading:\n\nsurface!(xs, ys, peaks.(xs, ys'))\ncurrent_figure()\n\n\n\n\n\n\nParametrically defined surfaces\nA surface may be parametrically defined through a function \\(r(u,v) = (x(u,v), y(u,v), z(u,v))\\). For example, the surface generated by \\(z=f(x,y)\\) is of the form with \\(r(u,v) = (u,v,f(u,v))\\).\nThe surface function and the wireframe function can be used to display such surfaces. 
In previous usages, the x and y values were vectors from which a 2-dimensional grid is formed. For parametric surfaces, a grid for the x and y values must be generated. This function will do so:\n\nfunction parametric_grid(us, vs, r)\n n,m = length(us), length(vs)\n xs, ys, zs = zeros(n,m), zeros(n,m), zeros(n,m)\n for (i, uᵢ) in pairs(us)\n for (j, vⱼ) in pairs(vs)\n x,y,z = r(uᵢ, vⱼ)\n xs[i,j] = x\n ys[i,j] = y\n zs[i,j] = z\n end\n end\n (xs, ys, zs)\nend\n\nparametric_grid (generic function with 1 method)\n\n\nWith the data suitably massaged, we can directly plot either a surface or wireframe plot.\n\nAs an aside, the above can be done more compactly with nested list comprehensions:\nxs, ys, zs = [[pt[i] for pt in r.(us, vs')] for i in 1:3]\nOr using the unzip function directly after broadcasting:\nxs, ys, zs = unzip(r.(us, vs'))\n\nFor example, a sphere can be parameterized by \\(r(u,v) = (\\sin(u)\\cos(v), \\sin(u)\\sin(v), \\cos(u))\\) and visualized through:\n\nr(u,v) = [sin(u)*cos(v), sin(u)*sin(v), cos(u)]\nus = range(0, pi, length=25)\nvs = range(0, pi/2, length=25)\nxs, ys, zs = parametric_grid(us, vs, r)\n\nsurface(xs, ys, zs)\nwireframe!(xs, ys, zs)\ncurrent_figure()\n\n\n\n\nA surface of revolution for \\(g(u)\\) revolved about the \\(z\\) axis can be visualized through:\n\ng(u) = u^2 * exp(-u)\nr(u,v) = (g(u)*sin(v), g(u)*cos(v), u)\nus = range(0, 3, length=10)\nvs = range(0, 2pi, length=10)\nxs, ys, zs = parametric_grid(us, vs, r)\n\nsurface(xs, ys, zs)\nwireframe!(xs, ys, zs)\ncurrent_figure()\n\n\n\n\nA torus with big radius \\(2\\) and inner radius \\(1/2\\) can be visualized as follows\n\nr1, r2 = 2, 1/2\nr(u,v) = ((r1 + r2*cos(v))*cos(u), (r1 + r2*cos(v))*sin(u), r2*sin(v))\nus = vs = range(0, 2pi, length=25)\nxs, ys, zs = parametric_grid(us, vs, r)\n\nsurface(xs, ys, zs)\nwireframe!(xs, ys, zs)\ncurrent_figure()\n\n\n\n\nA Möbius strip can be produced with:\n\nws = range(-1/4, 1/4, length=8)\nthetas = range(0, 2pi, length=30)\nr(w, θ) = 
((1+w*cos(θ/2))*cos(θ), (1+w*cos(θ/2))*sin(θ), w*sin(θ/2))\nxs, ys, zs = parametric_grid(ws, thetas, r)\n\nsurface(xs, ys, zs)\nwireframe!(xs, ys, zs)\ncurrent_figure()"
},
{
"objectID": "alternatives/makie_plotting.html#contour-plots-contour-contourf-heatmap",
"href": "alternatives/makie_plotting.html#contour-plots-contour-contourf-heatmap",
"title": "65  Calculus plots with Makie",
"section": "65.5 Contour plots (contour, contourf, heatmap)",
"text": "65.5 Contour plots (contour, contourf, heatmap)\nFor a function \\(z = f(x,y)\\) an alternative to a surface plot, is a contour plot. That is, for different values of \\(c\\) the level curves \\(f(x,y)=c\\) are drawn.\nFor a function \\(f(x,y)\\), the syntax for generating a contour plot follows that for surface.\nFor example, using the peaks function, previously defined, we have a contour plot over the region \\([-5,5]\\times[-5,5]\\) is generated through:\n\nxs = ys = range(-5, 5, length=100)\ncontour(xs, ys, peaks)\n\n\n\n\nThe default of \\(5\\) levels can be adjusted using the levels keyword:\n\ncontour(xs, ys, peaks; levels = 20)\n\n\n\n\nThe levels argument can also specify precisely what levels are to be drawn.\nThe contour graph makes identification of peaks and valleys easy as the limits of patterns of nested contour lines.\nA filled contour plot is produced by contourf:\n\ncontourf(xs, ys, peaks)\n\n\n\n\nA related, but alternative visualization, using color to represent magnitude is a heatmap, produced by the heatmap function. The calling syntax is similar to contour and surface:\n\nheatmap(xs, ys, peaks)\n\n\n\n\nThis graph shows peaks and valleys through “hotspots” on the graph.\nThe MakieGallery package includes an example of a surface plot with both a wireframe and 2D contour graph added. It is replicated here using the peaks function scaled by \\(5\\).\nThe function and domain to plot are described by:\nxs = ys = range(-5, 5, length=51)\nzs = peaks.(xs, ys') / 5;\nThe zs were generated, as wireframe does not provide the interface for passing a function.\nThe surface and wireframe are produced as follows. 
Here we manually create the figure and axis object so that we can set the viewing angle through the elevation argument to the axis object:\n\nfig = Figure()\nax3 = Axis3(fig[1,1];\n elevation=pi/9, azimuth=pi/16)\nsurface!(ax3, xs, ys, zs)\nwireframe!(ax3, xs, ys, zs;\n overdraw = true, transparency = true,\n color = (:black, 0.1))\ncurrent_figure()\n\n\n\n\nTo add the contour, a simple call via contour!(scene, xs, ys, zs) will place the contour at the \\(z=0\\) level which will make it hard to read. Rather, placing at the “bottom” of the figure is desirable. To do so, the minimum value is identified (and rounded) and the argument transformation = (:xy, zmin) is passed to contour!:\n\nezs = extrema(zs)\nzmin, zmax = floor(first(ezs)), ceil(last(ezs))\ncontour!(ax3, xs, ys, zs;\n levels = 15, linewidth = 2,\n transformation = (:xy, zmin))\nzlims!(zmin, zmax)\ncurrent_figure()\n\n\n\n\nThe transformation plot attribute sets the “plane” (one of :xy, :yz, or :xz) at a location, in this example zmin.\nThe manual construction of a figure and an axis object will be further discussed later.\n\n65.5.1 Three dimensional contour plots\nThe contour function can also plot \\(3\\)-dimensional contour plots. Concentric spheres, contours of \\(x^2 + y^2 + z^2 = c\\) for \\(c > 0\\), are presented by the following:\n\nf(x,y,z) = x^2 + y^2 + z^2\nxs = ys = zs = range(-3, 3, length=100)\n\ncontour(xs, ys, zs, f)\n\n\n\n\n\n\n65.5.2 Implicitly defined curves and surfaces\nSuppose \\(f\\) is a scalar-valued function. If f takes two variables for its input, then the equation \\(f(x,y) = 0\\) implicitly defines \\(y\\) as a function of \\(x\\); \\(y\\) can be visualized locally with a curve. 
If \\(f\\) takes three variables for its input, then the equation \\(f(x,y,z)=0\\) implicitly defines \\(z\\) as a function of \\(x\\) and \\(y\\); \\(z\\) can be visualized locally with a surface.\n\nImplicitly defined curves\nThe graph of an equation is the collection of all \\((x,y)\\) values satisfying the equation. This is more general than the graph of a function, which can be viewed as the graph of the equation \\(y=f(x)\\). An equation in \\(x\\)-\\(y\\) can be graphed if the set of solutions to a related equation \\(f(x,y)=0\\) can be identified, as one can move all terms to one side of an equation and define \\(f\\) as the rule of the side with the terms. The implicit function theorem ensures that under some conditions, locally near a point \\((x, y)\\), the value \\(y\\) can be represented as a function of \\(x\\). So, the graph of the equation \\(f(x,y)=0\\) can be produced by stitching together these local function representations.\nThe contour graph can produce these graphs by setting the levels argument to [0].\n\nf(x,y) = x^3 + x^2 + x + 1 - x*y # solve x^3 + x^2 + x + 1 = x*y\nxs = range(-5, 5, length=100)\nys = range(-10, 10, length=100)\n\ncontour(xs, ys, f.(xs, ys'); levels=[0])\n\n\n\n\nThe ImplicitPlots.jl package uses the Contour package along with a Plots recipe to plot such graphs. 
Here we see how to use Makie in a similar manner:\nimport Contour\n\nfunction implicit_plot(xs, ys, f; kwargs...)\n fig = Figure()\n ax = Axis(fig[1,1])\n implicit_plot!(ax, xs, ys, f; kwargs...)\n fig\nend\n\nfunction implicit_plot!(ax, xs, ys, f; kwargs...)\n z = [f(x, y) for x in xs, y in ys]\n cs = Contour.contour(collect(xs), collect(ys), z, 0.0)\n ls = Contour.lines(cs)\n\n isempty(ls) && error(\"empty\")\n\n for l ∈ ls\n us, vs = Contour.coordinates(l)\n lines!(ax, us, vs; kwargs...)\n end\n\nend\n\n\nImplicitly defined surfaces, \\(F(x,y,z)=0\\)\nTo plot the equation \\(F(x,y,z)=0\\), for \\(F\\) a scalar-valued function, again the implicit function theorem says that, under conditions, near any solution \\((x,y,z)\\), \\(z\\) can be represented as a function of \\(x\\) and \\(y\\), so the graph will look likes surfaces stitched together. The Implicit3DPlotting package takes an approach like ImplicitPlots to represent these surfaces. It replaces the Contour package computation with a \\(3\\)-dimensional alternative provided through the Meshing and GeometryBasics packages.\nThe Implicit3DPlotting package needs some maintenance, so we borrow the main functionality and wrap it into a function:\n\nimport Meshing\nimport GeometryBasics\n\nfunction make_mesh(xlims, ylims, zlims, f,\n M = Meshing.MarchingCubes(); # or Meshing.MarchingTetrahedra()\n samples=(35, 35, 35),\n )\n\n lims = extrema.((xlims, ylims, zlims))\n Δ = xs -> last(xs) - first(xs)\n xs = Vec(first.(lims))\n Δxs = Vec(Δ.(lims))\n\n GeometryBasics.Mesh(f, Rect(xs, Δxs), M; samples = samples)\nend\n\nmake_mesh (generic function with 2 methods)\n\n\nThe make_mesh function creates a mesh that can be visualized with the wireframe or mesh plotting functions.\nThis example, plotting an implicitly defined sphere, comes from the documentation of Implicit3DPlotting. 
The f in make_mesh is a scalar-valued function of a vector:\n\nf(x) = sum(x.^2) - 1\nxs = ys = zs = (-5, 5)\nm = make_mesh(xs, ys, zs, f)\nwireframe(m)\n\n\n\n\nHere we visualize an intersection of a sphere with another figure:\n\nr₂(x) = sum(x.^2) - 5/4 # a sphere\nr₄(x) = sum(x.^4) - 1\nxs = ys = zs = -2:2\nm2,m4 = make_mesh(xs, ys, zs, r₂), make_mesh(xs, ys, zs, r₄)\n\nwireframe(m4, color=:yellow)\nwireframe!(m2, color=:red)\ncurrent_figure()\n\n\n\n\nThis example comes from Wikipedia showing an implicit surface of genus \\(2\\):\n\nf(x,y,z) = 2y*(y^2 -3x^2)*(1-z^2) + (x^2 +y^2)^2 - (9z^2-1)*(1-z^2)\nzs = ys = xs = range(-5/2, 5/2, length=100)\nm = make_mesh(xs, ys, zs, x -> f(x...))\nwireframe(m)\n\n\n\n\n(This figure does not render well through contour(xs, ys, zs, f, levels=[0]), as the hole is not shown.)\nFor one last example from Wikipedia, we have the Cassini oval which “can be defined as the point set for which the product of the distances to \\(n\\) given points is constant.” That is:\n\nfunction cassini(λ, ps = ((1,0,0), (-1, 0, 0)))\n n = length(ps)\n x -> prod(norm(x .- p) for p ∈ ps) - λ^n\nend\nxs = ys = zs = range(-2, 2, length=100)\nm = make_mesh(xs, ys, zs, cassini(1.05))\nwireframe(m)"
},
{
"objectID": "alternatives/makie_plotting.html#vector-fields.-visualizations-of-fr2-rightarrow-r2",
"href": "alternatives/makie_plotting.html#vector-fields.-visualizations-of-fr2-rightarrow-r2",
"title": "65  Calculus plots with Makie",
"section": "65.6 Vector fields. Visualizations of \\(f:R^2 \\rightarrow R^2\\)",
"text": "65.6 Vector fields. Visualizations of \\(f:R^2 \\rightarrow R^2\\)\nThe vector field \\(f(x,y) = \\langle y, -x \\rangle\\) can be visualized as a set of vectors, \\(f(x,y)\\), positioned at a grid. These arrows can be visualized with the arrows function. The arrows function is passed a vector of points for the anchors and a vector of points representing the vectors.\nWe can generate these on a regular grid through:\n\nf(x, y) = [y, -x]\nxs = ys = -5:5\npts = vec(Point2.(xs, ys'))\ndus = vec(Point2.(f.(xs, ys')));\nfirst(pts), first(dus) # show an example\n\n([-5, -5], [-5, 5])\n\n\nBroadcasting over (xs, ys') ensures each pair of possible values is encountered. The vec call reshapes an array into a vector.\nCalling arrows on the prepared data produces the graphic:\n\narrows(pts, dus)\n\n\n\n\nThe grid seems rotated at first glance; but is also confusing. This is due to the length of the vectors as the \\((x,y)\\) values get farther from the origin. Plotting the normalized values (each will have length \\(1\\)) can be done easily using norm (which is found in the standard LinearAlgebra library):\n\ndvs = dus ./ norm.(dus)\narrows(pts, dvs)\n\n\n\n\nThe rotational pattern becomes much clearer now.\nThe streamplot function also illustrates this phenomenon. This implements an “algorithm [that] puts an arrow somewhere and extends the streamline in both directions from there. 
Then, it chooses a new position (from the remaining ones), repeating the exercise until the streamline gets blocked, from which on a new starting point, the process repeats.”\nThe streamplot function expects a Point, not a pair of values, so we adjust f slightly and call the function using the pattern streamplot(g, xs, ys):\n\nf(x, y) = [y, -x]\ng(xs) = Point2(f(xs...))\n\nstreamplot(g, -5..5, -5..5)\n\n\n\n\n(We used interval notation to set the viewing range; a range could also be used.)\n\n\n\n\n\n\nNote\n\n\n\nThe calling pattern of streamplot is different from that of other functions, such as surface, in that the function comes first."
},
{
"objectID": "alternatives/makie_plotting.html#layoutables-and-observables",
"href": "alternatives/makie_plotting.html#layoutables-and-observables",
"title": "65  Calculus plots with Makie",
"section": "65.7 Layoutables and Observables",
"text": "65.7 Layoutables and Observables\n\n65.7.1 Layoutables\nMakie makes it really easy to piece together figures from individual plots. To illustrate, we create a graphic consisting of a plot of a function, its derivative, and its second derivative. In our graphic, we also leave space for a label.\n\n\n\n\n\n\nNote\n\n\n\nThe Layout Tutorial has much more detail on this subject.\n\n\nThe basic plotting commands, like lines, return a FigureAxisPlot object. For laying out our own graphic, we manage the figure and axes manually. The commands below create a figure, then assign axes to portions of the figure:\n\nF = Figure()\naf = F[2,1:2] = Axis(F)\nafp = F[3,1:end] = Axis(F)\nafpp = F[4,:] = Axis(F)\n\nAxis with 1 plots:\n ┗━ Mesh{Tuple{GeometryBasics.Mesh{3, Float32, GeometryBasics.TriangleP{3, Float32, GeometryBasics.PointMeta{3, Float32, Point{3, Float32}, (:normals,), Tuple{Vec{3, Float32}}}}, GeometryBasics.FaceView{GeometryBasics.TriangleP{3, Float32, GeometryBasics.PointMeta{3, Float32, Point{3, Float32}, (:normals,), Tuple{Vec{3, Float32}}}}, GeometryBasics.PointMeta{3, Float32, Point{3, Float32}, (:normals,), Tuple{Vec{3, Float32}}}, GeometryBasics.NgonFace{3, GeometryBasics.OffsetInteger{-1, UInt32}}, StructArrays.StructVector{GeometryBasics.PointMeta{3, Float32, Point{3, Float32}, (:normals,), Tuple{Vec{3, Float32}}}, NamedTuple{(:position, :normals), Tuple{Vector{Point{3, Float32}}, Vector{Vec{3, Float32}}}}, Int64}, Vector{GeometryBasics.NgonFace{3, GeometryBasics.OffsetInteger{-1, UInt32}}}}}}}\n\n\nThe axes are named af, afp and afpp, as they will hold the respective graphs. The key here is the use of matrix notation to lay out the graphic in a grid. The first is row 2, columns 1 through 2; the second is row 3, again all columns; the third is row 4, all columns.\nIn this figure, we want the \\(x\\)-axis for each of the three graphics to be linked. 
This command ensures that:\nlinkxaxes!(af, afp, afpp);\nBy linking axes, if one is updated, say through xlims!, the others will be as well.\nWe now plot our functions. The key here is that the mutating form of lines! takes an axis object to mutate as its first argument:\nf(x) = 8x^4 - 8x^2 + 1\nfp(x) = 32x^3 - 16x\nfpp(x) = 96x^2 - 16\n\nxs = -1..1\nlines!(af, xs, f)\nlines!(afp, xs, fp)\nlines!(afp, xs, zero, color=:blue)\nlines!(afpp, xs, fpp)\nlines!(afpp, xs, zero, color=:blue);\nWe can give title information to each axis:\naf.title = \"f\"\nafp.title = \"fp\"\nafpp.title = \"fpp\";\nFinally, we add a label in the first row, but for illustration purposes, only use the first column.\nLabel(F[1,1], \"\"\"\nPlots of f and its first and second derivatives.\nWhen the first derivative is zero, the function\nf has relative extrema. When the second derivative\nis zero, the function f has an inflection point.\n\"\"\");\nFinally we display the figure:\n\nF\n\n\n\n\n\n\n65.7.2 Observables\nThe basic components of a plot in Makie can be updated interactively. Makie uses the Observables package, which allows complicated interactions to be modeled quite naturally. In the following we give a simple example.\nIn Makie, an Observable is a structure that allows its value to be updated, similar to an array. When changed, observables can trigger an event. Observables can rely on other observables, so events can be cascaded.\nThis simple example shows how an observable h can be used to create a collection of points representing a secant line. The figure shows the value for h=3/2.\n\nf(x) = sqrt(x)\nc = 1\nxs = 0..3\nh = Observable(3/2)\n\npoints = lift(h) do h\n xs = [0,c,c+h,3]\n tl = x -> f(c) + (f(c+h)-f(c))/h * (x-c)\n [Point2(x, tl(x)) for x ∈ xs]\nend\n\nlines(xs, f)\nlines!(points)\ncurrent_figure()\n\n\n\n\nWe can update the value of h using setindex! notation (square brackets). 
For example, to see that the secant line is a good approximation to the tangent line as \\(h \\rightarrow 0\\) we can set h to be 1/4 and replot:\n\nh[] = 1/4\ncurrent_figure()\n\n\n\n\nThe line h[] = 1/4 updated h, which then updated points (points is lifted from h), which updated the graphic. (In these notes, we replot to see the change, but in an interactive session, the current displayed figure would be updated; no replotting would be necessary.)\nFinally, this example shows how to add a slider to adjust the value of h with a mouse. The slider object is positioned along with a label using the grid reference, as before.\n\nf(x) = sqrt(x)\nc = 1\nxs = 0..3\n\nF = Figure()\nax = Axis(F[1,1:2])\nh = Slider(F[2,2], range = 0.01:0.01:1.5, startvalue = 1.5)\nLabel(F[2,1], \"Adjust slider to change `h`\";\n justification = :left)\n\npoints = lift(h.value) do h\n xs = [0,c,c+h,3]\n tl = x-> f(c) + (f(c+h)-f(c))/h * (x-c)\n [Point2(x, tl(x)) for x ∈ xs]\nend\n\nlines!(ax, xs, f)\nlines!(ax, points)\ncurrent_figure()\n\n\n\n\nThe slider value is “lifted” by its value component, as shown. Otherwise, the above is fairly similar to just using an observable for h."
},
{
"objectID": "misc/getting_started_with_julia.html",
"href": "misc/getting_started_with_julia.html",
"title": "66  Getting started with Julia",
"section": "",
"text": "Julia is a freely available, open-source programming language aimed at technical computing.\nAs it is open source, indeed with a liberal MIT license, it can be installed for free on many types of computers (though not phones or tablets)."
},
{
"objectID": "misc/getting_started_with_julia.html#running-julia-through-the-web",
"href": "misc/getting_started_with_julia.html#running-julia-through-the-web",
"title": "66  Getting started with Julia",
"section": "66.1 Running Julia through the web",
"text": "66.1 Running Julia through the web\nThere are a few services for running Julia through the web. Mentioned here is Binder, which provides a web-based interface to Julia built around Jupyter. Jupyter is a wildly successful platform for interacting with different open-source software programs.\nlaunch binder\nClicking the launch link above will open a web page which provides a blank notebook, save for a package used by these notes. However, Binder is nowhere near as reliable as a local installation."
},
{
"objectID": "misc/getting_started_with_julia.html#installing-julia-locally",
"href": "misc/getting_started_with_julia.html#installing-julia-locally",
"title": "66  Getting started with Julia",
"section": "66.2 Installing Julia locally",
"text": "66.2 Installing Julia locally\nInstalling Julia locally is no more difficult than installing other software.\nBinaries of Julia are provided at julialang.org. Julia has an official released version and a developmental version. Unless there is a compelling reason, the latest released version should be downloaded and installed for use.\nFor Windows users, there is a juliaup program for managing the installation of Julia.\nThe base Julia provides a command-line interface, or REPL (read-evaluate-print loop)."
},
{
"objectID": "misc/getting_started_with_julia.html#basic-interactive-usage",
"href": "misc/getting_started_with_julia.html#basic-interactive-usage",
"title": "66  Getting started with Julia",
"section": "66.3 Basic interactive usage",
"text": "66.3 Basic interactive usage\nOnce installed, Julia can be started by clicking on an icon or typing julia at the command line. Either will open a command line interface for a user to interact with a Julia process. The basic workflow is easy: commands are typed, then sent to a Julia process when the “return” key is pressed for a complete expression. Then the output is displayed.\nA command is typed following the prompt. An example might be 2 + 2. To send the command to the Julia interpreter the “return” key is pressed. A complete expression or expressions will then be parsed and evaluated (executed). If the expression is not complete, Julia's prompt will still accept input to complete the expression. Type 2 + to see. (The expression 2 + is not complete, as the infix operator + expects two arguments, one on its left and one on its right.)\n _\n _ _ _(_)_ | Documentation: https://docs.julialang.org\n (_) | (_) (_) |\n _ _ _| |_ __ _ | Type \"?\" for help, \"]?\" for Pkg help.\n | | | | | | |/ _` | |\n | | |_| | | | (_| | | Version 1.7.0 (2021-11-30)\n _/ |\\__'_|_|_|\\__'_| | Official https://julialang.org/ release\n|__/ |\n\njulia> 2 + 2\n4\nAbove, julia> is the prompt. These notes will not include the prompt, so that copying-and-pasting can be more easily used. Input and output cells display similarly, though with differences in coloring. For example:\n\n2 + 2\n\n4\n\n\nWhile many prefer a command line for interacting with Julia, when learning, a notebook interface is suggested. (An IDE like Julia for Visual Studio Code might be preferred for experienced programmers.) In Julia interfaces, we describe two different notebook interfaces that are available through add-on packages."
},
{
"objectID": "misc/getting_started_with_julia.html#add-on-packages",
"href": "misc/getting_started_with_julia.html#add-on-packages",
"title": "66  Getting started with Julia",
"section": "66.4 Add-on packages",
"text": "66.4 Add-on packages\nJulia is well on its way towards 10,000 external add-on packages that enhance the offerings of base Julia. We refer to one, CalculusWithJulia, that is designed to accompany these notes. Installation notes are available.\nIn Julia, graphics are provided only by add-on packages; there is no built-in graphing. This is the case under Pluto, Jupyter, or the command line.\nIn these notes, we use the Plots package and its default backend. The Plots package provides a common interface to several different backends; this choice is easily changed. The gr backend is used in these notes, though for interactive use the Plotly backend has advantages; for more complicated graphics, pyplot has some advantages; for publication, PGFPlotsX has advantages.\nThe package, if installed, is loaded as any other package:\nusing Plots\nWith that in hand, to make a graph of a function over a range, we follow this pattern:\n\nplot(sin, 0, 2pi)"
},
{
"objectID": "misc/julia_interfaces.html",
"href": "misc/julia_interfaces.html",
"title": "67  Julia interfaces",
"section": "",
"text": "Julia can be used in many different manners. This page describes a few."
},
{
"objectID": "misc/julia_interfaces.html#the-repl",
"href": "misc/julia_interfaces.html#the-repl",
"title": "67  Julia interfaces",
"section": "67.1 The REPL",
"text": "67.1 The REPL\nBase Julia comes with a REPL package, which provides a means to interact with Julia at the command line.\n _\n _ _ _(_)_ | Documentation: https://docs.julialang.org\n (_) | (_) (_) |\n _ _ _| |_ __ _ | Type \"?\" for help, \"]?\" for Pkg help.\n | | | | | | |/ _` | |\n | | |_| | | | (_| | | Version 1.7.0 (2021-11-30)\n _/ |\\__'_|_|_|\\__'_| | Official https://julialang.org/ release\n|__/ |\n\njulia> 2 + 2\n4\nThe julia> prompt is where commands are typed. The return key will send a command to the interpreter and the results are displayed in the REPL terminal.\nThe REPL has many features for editing, for interacting with the package manager, and for interacting with the shell. However, it is command-line based, with no support for mouse interaction. For that, other options are available."
},
{
"objectID": "misc/julia_interfaces.html#pluto",
"href": "misc/julia_interfaces.html#pluto",
"title": "67  Julia interfaces",
"section": "67.2 Pluto",
"text": "67.2 Pluto\nThe Pluto package provides a notebook interface for interacting with Julia, which has a few idiosyncrasies, as compared to other interfaces.\nPluto is started from the REPL terminal with these two commands:\nusing Pluto\nPluto.run()\nPrimarily, the variables in the notebook are reactive, meaning if a variable's value is modified, all references to that variable are also updated. This reactive nature makes it very easy to see the results of slight modifications and, when coupled with HTML controls, allows easy user interfaces to be developed.\nAs a result, a variable name may only be used once in the top-level scope. (Names can be reused inside functions, which create their own scope, and in “let” blocks, a trick used within these notes.) In the notes, subscripting and unicode variants are used for symbols which are typically repurposed (e.g., x or f).\nPluto cells may only contain one command, the result of which is displayed above the cell. This one command can be a begin or let block to join multiple statements.\nPluto has a built-in package management system that manages the installation of packages on demand.\nPluto notebooks can be easily run locally using Pluto.\nPluto notebooks are just .jl scripts, so can easily be shared."
},
{
"objectID": "misc/julia_interfaces.html#ijulia",
"href": "misc/julia_interfaces.html#ijulia",
"title": "67  Julia interfaces",
"section": "67.3 IJulia",
"text": "67.3 IJulia\n“Project Jupyter exists to develop open-source software, open-standards, and services for interactive computing across dozens of programming languages.” The IJulia package allows Julia to be one of these programming languages. This package must be installed prior to use.\nThe Jupyter Project provides two web-based interfaces to Julia: the Jupyter notebook and the newer JupyterLab. The binder project uses Jupyter notebooks as its primary interface to Julia. To use a binder notebook, follow this link:\nlaunch binder\nTo run locally, these interfaces are available once IJulia is installed. Since version 1.7, the following commands should do this:\nusing IJulia\nnotebook()\nShould that not work, then this should as well:\nusing Pkg\nPkg.add(\"PyCall\")\nPkg.add(\"IJulia\")\n\nThe notebook interface has “cells” where one or more commands can be entered.\nIn IJulia, a block of commands is sent to the kernel (the Julia interpreter) by typing “shift+return” or clicking on a “run” button. The output is printed below a cell, including graphics.\nWhen a cell is evaluating, the leading [] has an asterisk ([*]) showing the notebook is awaiting the results of the calculation.\nOnce a cell is evaluated, the leading [] has a number inserted (e.g., [1], as in the figure). This number indicates the order of cell evaluation. Once a notebook is interacted with, the state of the namespace need not reflect the top-to-bottom order of the notebook, but rather reflects the order of cell evaluations.\nTo be specific, a variable like x may be redefined in a cell above where the variable is initially defined, and this redefinition will hold the current value known to the interpreter. As well, a notebook, when reloaded, may have unevaluated cells with output showing. These will not influence the state of the kernel until they are evaluated.\nWhen a cell's commands are evaluated, the last command executed is displayed. 
If it is desirable that multiple values be displayed, they can be packed into a tuple. This is done by using commas to separate values. IJulia also supports other means to print output (e.g., @show, display, print, …).\nTo run all cells in a notebook from top to bottom, the “run all” command under the “Cell” menu is available.\nIf a calculation takes much longer than anticipated, the “kernel” can be interrupted through a menu item of “Kernel”.\nIf the kernel appears unresponsive, it can be restarted through a menu item of “Kernel”.\nNotebooks can be saved (as *.ipynb files) for sharing or for reuse. Notebooks can be printed as HTML pages, and, if the proper underlying software is available, as formatted pages.\nJupyterLab, a variant, has more features, ones commonly associated with an integrated development environment (IDE)."
},
{
"objectID": "misc/julia_interfaces.html#vscode",
"href": "misc/julia_interfaces.html#vscode",
"title": "67  Julia interfaces",
"section": "67.4 VSCode",
"text": "67.4 VSCode\nJulia for Visual Studio Code provides support for the julia programming language for VS Code. VS Code is an open-sourced code editor supported by Microsoft. VS Code provides a cross-platform interface to Julia geared towards programming within the language."
},
{
"objectID": "misc/calculus_with_julia.html",
"href": "misc/calculus_with_julia.html",
"title": "68  The CalculusWithJulia package",
"section": "",
"text": "To run the commands in these notes, some external packages must be installed and loaded.\nThe Pluto interface does this in the background, so there is nothing to do but execute the cells that call using or import. For Julia post version 1.7, this installation will be initiated for you when using is called in the REPL terminal.\nFor other interfaces, using the CalculusWithJulia package requires that it first be installed. From the command line, this can be done with this key sequence:\nOr, using the Pkg package, the commands would be\nInstallation only needs to be done once.\nHowever, for each new Julia session, the package must be loaded, as with the following command:\nThat is all. The rest of this page just provides some details for the interested reader."
},
{
"objectID": "misc/calculus_with_julia.html#the-package-concept",
"href": "misc/calculus_with_julia.html#the-package-concept",
"title": "68  The CalculusWithJulia package",
"section": "68.1 The package concept",
"text": "68.1 The package concept\nThe Julia language provides the building blocks for the wider Julia ecosystem that enhance and extend the language's applicability.\nJulia is extended through “packages.” Some of these, such as packages for certain math constants and some linear algebra operations, are part of all Julia installations and must simply be loaded to be used. Others, such as packages for finding integrals or (automatic) derivatives, are provided by users and must first be installed before being used.\n\n68.1.1 Package installation\nPackage installation is straightforward, as Julia has a package, Pkg, that facilitates this.\nSince Julia version 1.7, just attempting to load a package through using PackageName at the command line will either load an installed package or query for an uninstalled package to be installed before loading. So installation just requires confirming a prompt.\nFor more control, the command line and IJulia provide access to the functions in Pkg through the escape command ]. For example, to find the status of all currently installed packages, the following command can be executed:\n] status\nExternal packages are typically installed from GitHub and, if they are registered, installation is as easy as calling add:\n] add QuadGK\nThat command will consult Julia's general registry for the location of the QuadGK package, use this location to download the necessary files, build and install any necessary dependencies, and then make the package available for use.\nFor these notes, when the CalculusWithJulia package is installed, it will also install many of the other packages that are needed.\nSee Pkg for more details, such as how to update the set of available packages.\n\n\n68.1.2 Using a package\nThe features of an installed package are not available until the package is brought into the current session. 
A package need only be installed once, but must be loaded each session.\nTo load a package, the using keyword is provided:\nusing QuadGK\nThe above command will make available all exported function names from the QuadGK package so they can be directly used, as in:\n\nquadgk(sin, 0, pi)\n\n(2.0, 1.7905676941154525e-12)\n\n\n(A command to find an integral of \\(f(x) = \\sin(x)\\) over \\([0, \\pi]\\).)\n\n\n68.1.3 Package details\nWhen a package is first loaded after installation, or some other change, it will go through a pre-compilation process. Depending on the package size, this can take a moment to several seconds. This won't happen the second time a package is loaded.\nHowever, each subsequent time a package is loaded, some further compilation is done, so it can still take some time for a package to load. Mostly this is not noticeable, though with the plotting package used in these notes, it is.\nWhen a package is loaded, all of its dependent packages are also loaded, but their functions are not immediately available to the user.\nIn typical Julia usage, each needed package is loaded on demand. This is faster and also keeps the namespace (the collection of variable and function names) smaller to avoid collisions. However, for these notes, the package CalculusWithJulia will load a few of the packages needed for the entire set of notes, not just the current section. This is to make it a bit easier for the beginning user.\nOne issue with loading several packages is the possibility that more than one will export a function with the same name, causing a collision. Moreover, at times, there can be dependency conflicts between packages. A suggested workflow is to use projects and, in each project, use a minimal set of packages. In Pluto, this is done behind the scenes.\nThe Julia language is designed around having several “generic” functions, each with many different methods depending on their usage. 
This design allows many different implementations for operations such as addition or multiplication, yet the user only needs to call one function name. Packages can easily extend these generic functions by providing their own methods for their own new types of data. For example, SymPy, which adds symbolic math features to Julia (using a Python package), extends both + and * for use with symbolic objects.\nThis design works great when the “generic” usage matches the needs of the package authors, but there are two common issues that arise:\n\nThe extension of a generic is for a type defined outside the author's package. This is known as “type piracy” and is frowned on, as it can lead to subtle errors. The CalculusWithJulia package practices this for one case: using ' to indicate derivatives for Function objects.\nThe generic function concept is not part of base Julia. An example might be the solve function. This name has a well-defined mathematical usage (e.g., “solve for \\(x\\)”), but there is no generic solve in base Julia for packages to extend. As it is used by SymPy and DifferentialEquations, among others, the ecosystem has a stub package CommonSolve allowing the sharing of this “verb.”"
},
{
"objectID": "misc/unicode.html",
"href": "misc/unicode.html",
"title": "69  Usages of Unicode symbols",
"section": "",
"text": "In these notes, the following may appear as variable or function names\n\n\n\n\\Name\nSymbol\nUsage notes\n\n\n\n\n\\euler\n\nThe variable e\n\n\n\\pi\nπ\n\n\n\n\\alpha\nα\n\n\n\n\\beta\nβ\n\n\n\n\\delta\nδ\n\n\n\n\\Delta\nΔ\nChange, as in Δx\n\n\n\\gamma\nγ\n\n\n\n\\phi\nϕ\n\n\n\n\\Phi\nΦ\nUsed for parameterized surfaces\n\n\nx\\_1\nx₁\nSubscripts\n\n\nr\\vec\nr⃗\nVector annotation\n\n\nT\\hat\nT̂\nUnit vector annotation\n\n\n\nThe following are associated with derivatives\n\n\n\n\\Name\nSymbol\nUsage notes\n\n\n\n\n\\partial\n∂\n\n\n\n\\nabla\n∇\ndel operator in CwJ package\n\n\n\nThe following are infix operators\n\n\n\n\\Name\nSymbol\nUsage notes\n\n\n\n\n\\circ\n∘\ncomposition\n\n\n\\cdot\n⋅\ndot product\n\n\n\\times\n×\ncross product\n\n\n\nInfix operators may need parentheses due to precedence rules. For example, to call a composition, one needs (f ∘ g)(x) so that composition happens before function evaluation (g(x))."
},
{
"objectID": "misc/quick_notes.html",
"href": "misc/quick_notes.html",
"title": "70  Quick introduction to Calculus with Julia",
"section": "",
"text": "The Julia programming language has a design that makes it well suited as a supplement for the learning of calculus, as this collection of notes is intended to illustrate.\nAs Julia is open source, it can be downloaded and used like many other programming languages.\nJulia can be used through the internet for free using the mybinder.org service. This link: launch binder will take you to a website that allows this. Just click on the CalcululsWithJulia.ipynb file after launching Binder by clicking on the badge. Binder provides the Jupyter interface.\nHere are some Julia usages to create calculus objects.\nThe Julia packages loaded below are all loaded when the CalculusWithJulia package is loaded.\nA Julia package is loaded with the using command:\nThe LinearAlgebra package comes with a Julia installation. Other packages can be added. Something like:\nThese notes have an accompanying package, CalculusWithJulia, that, when installed, as above, also installs most of the necessary packages to perform the examples.\nPackages need only be installed once, but they must be loaded into each session for which they will be used.\nPackages can also be loaded through import PackageName. Importing does not add the exported objects of a package into the namespace, so is used when there are possible name collisions."
},
{
"objectID": "misc/quick_notes.html#types",
"href": "misc/quick_notes.html#types",
"title": "70  Quick introduction to Calculus with Julia",
"section": "70.1 Types",
"text": "70.1 Types\nObjects in Julia are “typed.” Common numeric types are Float64 and Int64, for floating point numbers and integers. Less used here are types like Rational{Int64}, specifying rational numbers with a numerator and denominator as Int64; or Complex{Float64}, specifying a complex number with floating point components. Julia also has BigFloat and BigInt for arbitrary precision types. Typically, operations use “promotion” to ensure the combination of types is appropriate. Other useful types are Function, an abstract type describing functions; Bool for true and false values; Sym for symbolic values (through SymPy); and Vector{Float64} for vectors with floating point components.\nFor the most part, the type will not be so important, but it is useful to know that for some function calls the type of the argument will decide what method ultimately gets called. (This allows symbolic types to interact with Julia functions in an idiomatic manner.)"
},
{
"objectID": "misc/quick_notes.html#functions",
"href": "misc/quick_notes.html#functions",
"title": "70  Quick introduction to Calculus with Julia",
"section": "70.2 Functions",
"text": "70.2 Functions\n\n70.2.1 Definition\nFunctions can be defined in four basic ways:\n\none-statement functions follow traditional mathematics notation:\n\n\nf(x) = exp(x) * 2x\n\nf (generic function with 1 method)\n\n\n\nmulti-statement functions are defined with the function keyword. The end statement ends the definition. The last evaluated command is returned. There is no need for an explicit return statement, though it can be useful for control flow.\n\n\nfunction g(x)\n a = sin(x)^2\n a + a^2 + a^3\nend\n\ng (generic function with 1 method)\n\n\n\nAnonymous functions, useful, for example, as arguments to other functions or as return values, are defined using an arrow, ->, as follows:\n\n\nfn = x -> sin(2x)\nfn(pi/2)\n\n1.2246467991473532e-16\n\n\nIn the following, the defined function, Derivative, returns an anonymously defined function that uses a Julia package, loaded with CalculusWithJulia, to take a derivative:\n\nDerivative(f::Function) = x -> ForwardDiff.derivative(f, x) # ForwardDiff is loaded in CalculusWithJulia\n\nDerivative (generic function with 1 method)\n\n\n(The D function of CalculusWithJulia implements something similar.)\n\nAnonymous functions may also be created using the function keyword.\n\nFor mathematical functions \\(f: R^n \\rightarrow R^m\\) when \\(n\\) or \\(m\\) is bigger than 1 we have:\n\nWhen \\(n =1\\) and \\(m > 1\\) we use a “vector” for the return value\n\n\nr(t) = [sin(t), cos(t), t]\n\nr (generic function with 1 method)\n\n\n(An alternative would be to create a vector of functions.)\n\nWhen \\(n > 1\\) and \\(m=1\\) we use multiple arguments or pass the arguments in a container. This pattern is common, as it allows both calling styles.\n\n\nf(x, y, z) = x*y + y*z + z*x\nf(v) = f(v...)\n\nf (generic function with 2 methods)\n\n\nSome functions need to be passed a container of values; for this, the last definition is useful to expand the values. 
Splatting takes a container and treats the values like individual arguments.\nAlternatively, indexing can be used directly, as in:\n\nf(x) = x[1]*x[2] + x[2]*x[3] + x[3]*x[1]\n\nf (generic function with 2 methods)\n\n\n\nFor vector fields (\\(n,m > 1\\)) a combination is used:\n\n\nF(x,y,z) = [-y, x, z]\nF(v) = F(v...)\n\nF (generic function with 2 methods)\n\n\n\n\n70.2.2 Calling a function\nFunctions are called using parentheses to group the arguments.\n\nf(t) = sin(t)*sqrt(t)\nsin(1), sqrt(1), f(1)\n\n(0.8414709848078965, 1.0, 0.8414709848078965)\n\n\nWhen a function has multiple arguments, yet the value passed in is a container holding the arguments, splatting is used to expand the arguments, as is done in the definition F(v) = F(v...), above.\n\n\n70.2.3 Multiple dispatch\nJulia can have many methods for a single generic function. (E.g., it can have many different implementations of addition when the + sign is encountered.) The types of the arguments and the number of arguments are used for dispatch.\nHere the number of arguments is used:\n\nArea(w, h) = w * h # area of rectangle\nArea(w) = Area(w, w) # area of square using area of rectangle definition\n\nArea (generic function with 2 methods)\n\n\nCalling Area(5) will call Area(5,5), which will return 5*5.\nSimilarly, the definition for a vector field:\n\nF(x,y,z) = [-y, x, z]\nF(v) = F(v...)\n\nF (generic function with 2 methods)\n\n\ntakes advantage of multiple dispatch to allow either a vector argument or individual arguments.\nType parameters can be used to restrict the type of arguments that are permitted. The Derivative(f::Function) definition illustrates how the Derivative function, defined above, is restricted to Function objects.\n\n\n70.2.4 Keyword arguments\nOptional arguments may be specified with keywords, when the function is defined to use them. 
Keywords are separated from positional arguments using a semicolon, ;:\n\ncircle(x; r=1) = sqrt(r^2 - x^2)\ncircle(0.5), circle(0.5, r=10)\n\n(0.8660254037844386, 9.987492177719089)\n\n\nThe main (but not sole) use of keyword arguments will be with plotting, where various plot attributes are passed as key=value pairs."
},
{
"objectID": "misc/quick_notes.html#symbolic-objects",
"href": "misc/quick_notes.html#symbolic-objects",
"title": "70  Quick introduction to Calculus with Julia",
"section": "70.3 Symbolic objects",
"text": "70.3 Symbolic objects\nThe add-on SymPy package allows for symbolic expressions to be used. Symbolic values are defined with @syms, as below.\nusing SymPy\n\n@syms x y z\nx^2 + y^3 + z\n\n \n\\[\nx^{2} + y^{3} + z\n\\]\n\n\n\nAssumptions on the variables can be useful, particularly with simplification, as in\n\n@syms x::real y::integer z::positive\n\n(x, y, z)\n\n\nSymbolic expressions flow through Julia functions symbolically:\n\nsin(x)^2 + cos(x)^2\n\n \n\\[\n\\sin^{2}{\\left(x \\right)} + \\cos^{2}{\\left(x \\right)}\n\\]\n\n\n\nNumbers are symbolic once SymPy interacts with them:\n\nx - x + 1 # 1 is now symbolic\n\n \n\\[\n1\n\\]\n\n\n\nThe number PI is a symbolic pi.\n\nsin(PI), sin(pi)\n\n(0, 1.2246467991473532e-16)\n\n\nUse Sym to create symbolic numbers, N to find a Julia number from a symbolic number:\n\n1 / Sym(2)\n\n \n\\[\n\\frac{1}{2}\n\\]\n\n\n\n\nN(PI)\n\nπ = 3.1415926535897...\n\n\nMany generic Julia functions will work with symbolic objects through multiple dispatch (e.g., sin, cos, …). SymPy functions that are not in Julia can be accessed through the sympy object using dot-call notation:\n\nsympy.harmonic(10)\n\n \n\\[\n\\frac{7381}{2520}\n\\]\n\n\n\nSome SymPy methods belong to the object and are called via the pattern object.method(...). This too is the case using SymPy with Julia. For example:\n\nA = [x 1; x 2]\nA.det() # determinant of symbolic matrix A\n\n \n\\[\nx\n\\]"
},
{
"objectID": "misc/quick_notes.html#containers",
"href": "misc/quick_notes.html#containers",
"title": "70  Quick introduction to Calculus with Julia",
"section": "70.4 Containers",
"text": "70.4 Containers\nWe use a few different containers:\n\nTuples. These are objects grouped together using parentheses. They need not be of the same type.\n\n\nx1 = (1, \"two\", 3.0)\n\n(1, \"two\", 3.0)\n\n\nTuples are useful for programming. For example, they are used to return multiple values from a function.\n\nVectors. These are objects of the same type (typically) grouped together using square brackets, values separated by commas:\n\n\nx2 = [1, 2, 3.0] # 3.0 makes these all floating point\n\n3-element Vector{Float64}:\n 1.0\n 2.0\n 3.0\n\n\nUnlike tuples, the expected arithmetic from Linear Algebra is implemented for vectors.\n\nMatrices. Like vectors, these combine values of the same type, only they are 2-dimensional. Use spaces to separate values along a row; semicolons to separate rows:\n\n\nx3 = [1 2 3; 4 5 6; 7 8 9]\n\n3×3 Matrix{Int64}:\n 1 2 3\n 4 5 6\n 7 8 9\n\n\n\nRow vectors. A vector is 1 dimensional, though it may be identified as a column of a two-dimensional matrix. A row vector is a two-dimensional matrix with a single row:\n\n\nx4 = [1 2 3.0]\n\n1×3 Matrix{Float64}:\n 1.0 2.0 3.0\n\n\nThese have indexing using square brackets:\n\nx1[1], x2[2], x3[3]\n\n(1, 2.0, 7)\n\n\nMatrices are usually indexed by row and column:\n\nx3[1,2] # row one column two\n\n2\n\n\nFor vectors and matrices - but not tuples, as they are immutable - indexing can be used to change a value in the container:\n\nx2[1], x3[1,1] = 2, 2\n\n(2, 2)\n\n\nVectors and matrices are arrays. As hinted above, arrays have mathematical operations, such as addition and subtraction, defined for them. 
Tuples do not.\nDestructuring is an alternative to indexing to get at the entries in certain containers:\n\na,b,c = x2\n\n3-element Vector{Float64}:\n 2.0\n 2.0\n 3.0\n\n\n\n70.4.1 Structured collections\nAn arithmetic progression, \\(a, a+h, a+2h, ..., b\\) can be produced efficiently using the range operator a:h:b:\n\n5:10:55 # an object that describes 5, 15, 25, 35, 45, 55\n\n5:10:55\n\n\nIf h=1 it can be omitted:\n\n1:10 # an object that describes 1,2,3,4,5,6,7,8,9,10\n\n1:10\n\n\nThe range function can efficiently describe \\(n\\) evenly spaced points between a and b:\n\nrange(0, pi, length=5) # range(a, stop=b, length=n) for version 1.0\n\n0.0:0.7853981633974483:3.141592653589793\n\n\nThis is useful for creating regularly spaced values needed for certain plots."
},
{
"objectID": "misc/quick_notes.html#iteration",
"href": "misc/quick_notes.html#iteration",
"title": "70  Quick introduction to Calculus with Julia",
"section": "70.5 Iteration",
"text": "70.5 Iteration\nThe for keyword is useful for iteration. Here is a traditional for loop, as i loops over each entry of the vector [1,2,3]:\n\nfor i in [1,2,3]\n println(i)\nend\n\n1\n2\n3\n\n\n\n\n\n\n\nNote\n\n\n\nTechnical aside: For assignment within a for loop at the global level, a global declaration may be needed to ensure proper scoping.\n\n\nList comprehensions are similar, but are useful as they perform the iteration and collect the values:\n\n[i^2 for i in [1,2,3]]\n\n3-element Vector{Int64}:\n 1\n 4\n 9\n\n\nComprehensions can also be used to make matrices:\n\n[1/(i+j) for i in 1:3, j in 1:4]\n\n3×4 Matrix{Float64}:\n 0.5 0.333333 0.25 0.2\n 0.333333 0.25 0.2 0.166667\n 0.25 0.2 0.166667 0.142857\n\n\n(The three rows are for i=1, then i=2, and finally for i=3.)\nComprehensions apply an expression to each entry in a container through iteration. Applying a function to each entry of a container can be facilitated by:\n\nBroadcasting. Using . before an operation instructs Julia to match up sizes (possibly extending to do so) and then apply the operation element by element:\n\n\nxs = [1,2,3]\nsin.(xs) # sin(1), sin(2), sin(3)\n\n3-element Vector{Float64}:\n 0.8414709848078965\n 0.9092974268256817\n 0.1411200080598672\n\n\nThis example pairs off the values in bases and xs:\n\nbases = [5,5,10]\nlog.(bases, xs) # log(5, 1), log(5,2), log(10, 3)\n\n3-element Vector{Float64}:\n 0.0\n 0.43067655807339306\n 0.47712125471966244\n\n\nThis example broadcasts the scalar value for the base with xs:\n\nlog.(5, xs)\n\n3-element Vector{Float64}:\n 0.0\n 0.43067655807339306\n 0.6826061944859854\n\n\nRow and column vectors can fill in:\n\nys = [4 5] # a row vector\nh(x,y) = (x,y)\nh.(xs, ys) # broadcasting a column and row vector makes a matrix, then applies h.\n\n3×2 Matrix{Tuple{Int64, Int64}}:\n (1, 4) (1, 5)\n (2, 4) (2, 5)\n (3, 4) (3, 5)\n\n\nThis should be contrasted to the case when both xs and ys are (column) vectors, as then they pair off (and here cause a 
dimension mismatch as they have different lengths):\n\nh.(xs, [4,5])\n\nLoadError: DimensionMismatch(\"arrays could not be broadcast to a common size; got a dimension with lengths 3 and 2\")\n\n\n\nThe map function is similar; it applies a function to each element:\n\n\nmap(sin, [1,2,3])\n\n3-element Vector{Float64}:\n 0.8414709848078965\n 0.9092974268256817\n 0.1411200080598672\n\n\n\n\n\n\n\nNote\n\n\n\nMany different computer languages implement map; broadcasting is less common. Julia's use of the dot syntax to indicate broadcasting is reminiscent of MATLAB, but is quite different."
},
{
"objectID": "misc/quick_notes.html#plots",
"href": "misc/quick_notes.html#plots",
"title": "70  Quick introduction to Calculus with Julia",
"section": "70.6 Plots",
"text": "70.6 Plots\nThe following commands use the Plots package. The Plots package expects a choice of backend. We will use gr, but others can be substituted by calling an appropriate command, such as pyplot() or plotly().\nusing Plots\n\n\n\n\n\n\nNote\n\n\n\nThe plotly and gr backends are available by default. The plotly backend has some interactivity, gr is for static plots. The pyplot package is used for certain surface plots, when gr cannot be used.\n\n\n\n70.6.1 Plotting a univariate function \\(f:R \\rightarrow R\\)\n\nusing plot(f, a, b)\n\n\nplot(sin, 0, 2pi)\n\n\n\n\nOr\n\nf(x) = exp(-x/2pi)*sin(x)\nplot(f, 0, 2pi)\n\n\n\n\nOr with an anonymous function\n\nplot(x -> sin(x) + sin(2x), 0, 2pi)\n\n\n\n\n\n\n\n\n\n\nNote\n\n\n\nThe time to first plot can be lengthy! This can be removed by creating a custom Julia image, but that is not introductory level stuff. As well, standalone plotting packages offer quicker first plots, but the simplicity of Plots is preferred. Subsequent plots are not so time consuming, as the initial time is spent compiling functions so their re-use is speedy.\n\n\nArguments of interest include\n\n\n\n\n\n\n\nAttribute\nValue\n\n\n\n\nlegend\nA boolean, specify false to inhibit drawing a legend\n\n\naspect_ratio\nUse :equal to have x and y axis have same scale\n\n\nlinewidth\nIntegers greater than 1 will thicken lines drawn\n\n\ncolor\nA color may be specified by a symbol (leading :).\n\n\n\nE.g., :black, :red, :blue\n\n\n\n\nusing plot(xs, ys)\n\nThe lower level interface to plot involves directly creating x and y values to plot:\n\nxs = range(0, 2pi, length=100)\nys = sin.(xs)\nplot(xs, ys, color=:red)\n\n\n\n\n\nplotting a symbolic expression\n\nA symbolic expression of a single variable can be plotted just as a function is:\n\n@syms x\nplot(exp(-x/2pi)*sin(x), 0, 2pi)\n\n\n\n\n\nMultiple functions\n\nThe ! Julia convention to modify an object is used by the plot command, so plot! 
will add to the existing plot:\n\nplot(sin, 0, 2pi, color=:red)\nplot!(cos, 0, 2pi, color=:blue)\nplot!(zero, color=:green) # no a, b; they are inherited from the graph.\n\n\n\n\nThe zero function is just 0 (more generally useful when the type of a number is important, but used here to emphasize the \\(x\\) axis).\n\n\n70.6.2 Plotting a parameterized (space) curve function \\(f:R \\rightarrow R^n\\), \\(n = 2\\) or \\(3\\)\n\nUsing plot(xs, ys)\n\nLet \\(f(t) = e^{t/2\\pi} \\langle \\cos(t), \\sin(t)\\rangle\\) be a parameterized function. Then the \\(t\\) values can be generated as follows:\n\nts = range(0, 2pi, length = 100)\nxs = [exp(t/2pi) * cos(t) for t in ts]\nys = [exp(t/2pi) * sin(t) for t in ts]\nplot(xs, ys)\n\n\n\n\n\nusing plot(f1, f2, a, b). If the two functions describing the components are available, then\n\n\nf1(t) = exp(t/2pi) * cos(t)\nf2(t) = exp(t/2pi) * sin(t)\nplot(f1, f2, 0, 2pi)\n\n\n\n\n\nUsing plot_parametric. If the curve is described as a function of t with a vector output, then the CalculusWithJulia package provides plot_parametric to produce a plot:\n\n\nr(t) = exp(t/2pi) * [cos(t), sin(t)]\nplot_parametric(0..2pi, r)\n\n\n\n\nThe low-level approach doesn't quite work as easily as desired:\n\nts = range(0, 2pi, length = 4)\nvs = r.(ts)\n\n4-element Vector{Vector{Float64}}:\n [1.0, 0.0]\n [-0.6978062125430444, 1.2086358139617603]\n [-0.9738670205273388, -1.6867871593690715]\n [2.718281828459045, -6.657870280805568e-16]\n\n\nAs seen, the values are a vector of vectors. To plot, a reshaping needs to be done:\n\nts = range(0, 2pi, length = 100)\nvs = r.(ts)\nxs = [vs[i][1] for i in eachindex(vs)]\nys = [vs[i][2] for i in eachindex(vs)]\nplot(xs, ys)\n\n\n\n\nThis approach is facilitated by the unzip function in CalculusWithJulia (and used internally by plot_parametric):\n\nts = range(0, 2pi, length = 100)\nplot(unzip(r.(ts))...)\n\n\n\n\n\nPlotting an arrow\n\nAn arrow in 2D can be plotted with the quiver command. 
We show the arrow! function (and arrow) from the CalculusWithJulia package, which has an easier syntax: arrow!(p, v), where p is a point indicating the placement of the tail, and v the vector to represent:\n\nplot_parametric(0..2pi, r)\nt0 = pi/8\narrow!(r(t0), r'(t0))\n\n\n\n\n\n\n70.6.3 Plotting a scalar function \\(f:R^2 \\rightarrow R\\)\nThe surface and contour functions are available to visualize a scalar function of \\(2\\) variables:\n\nA surface plot\n\n\nf(x, y) = 2 - x^2 + y^2\nxs = ys = range(-2,2, length=25)\nsurface(xs, ys, f)\n\n\n\n\nThe function generates the \\(z\\) values; this can also be done by the user and then passed to the surface(xs, ys, zs) format:\n\nf(x, y) = 2 - x^2 + y^2\nxs = ys = range(-2,2, length=25)\nsurface(xs, ys, f.(xs, ys'))\n\n\n\n\n\nA contour plot\n\nThe contour function is like the surface function.\n\nxs = ys = range(-2,2, length=25)\nf(x, y) = 2 - x^2 + y^2\ncontour(xs, ys, f)\n\n\n\n\nThe values can be computed easily enough, being careful where the transpose is needed:\n\nxs = ys = range(-2,2, length=25)\nf(x, y) = 2 - x^2 + y^2\ncontour(xs, ys, f.(xs, ys'))\n\n\n\n\n\nAn implicit equation. The constraint \\(f(x,y)=c\\) generates an implicit equation. 
While contour can be used for this type of plot - by adjusting the requested contours - the ImplicitPlots package does this to make a plot of the equation \\(f(x,y) = 0\\):\n\n\nusing ImplicitPlots\nf(x,y) = sin(x*y) - cos(x*y)\nimplicit_plot(f)\n\n\n\n\n\n\n70.6.4 Plotting a parameterized surface \\(f:R^2 \\rightarrow R^3\\)\nThe pyplot (and plotly) backends allow plotting of parameterized surfaces.\nThe low-level surface(xs,ys,zs) is used, and can be specified directly as follows:\n\nX(theta, phi) = sin(phi)*cos(theta)\nY(theta, phi) = sin(phi)*sin(theta)\nZ(theta, phi) = cos(phi)\nthetas = range(0, pi/4, length=20)\nphis = range(0, pi, length=20)\nsurface(X.(thetas, phis'), Y.(thetas, phis'), Z.(thetas, phis'))\n\n\n\n\n\n\n70.6.5 Plotting a vector field \\(F:R^2 \\rightarrow R^2\\).\nThe CalculusWithJulia package provides vectorfieldplot, used as:\n\nF(x,y) = [-y, x]\nvectorfieldplot(F, xlim=(-2, 2), ylim=(-2,2), nx=10, ny=10)\n\n\n\n\nThere is also vectorfieldplot3d."
},
{
"objectID": "misc/quick_notes.html#limits",
"href": "misc/quick_notes.html#limits",
"title": "70  Quick introduction to Calculus with Julia",
"section": "70.7 Limits",
"text": "70.7 Limits\nLimits can be investigated numerically by forming tables, e.g.:\n\nxs = [1, 1/10, 1/100, 1/1000]\nf(x) = sin(x)/x\n[xs f.(xs)]\n\n4×2 Matrix{Float64}:\n 1.0 0.841471\n 0.1 0.998334\n 0.01 0.999983\n 0.001 1.0\n\n\nSymbolically, SymPy provides a limit function:\n\n@syms x\nlimit(sin(x)/x, x => 0)\n\n \n\\[\n1\n\\]\n\n\n\nOr\n\n@syms h x\nlimit((sin(x+h) - sin(x))/h, h => 0)\n\nLoadError: invalid redefinition of constant h"
},
{
"objectID": "misc/quick_notes.html#derivatives",
"href": "misc/quick_notes.html#derivatives",
"title": "70  Quick introduction to Calculus with Julia",
"section": "70.8 Derivatives",
"text": "70.8 Derivatives\nThere are numeric and symbolic approaches to derivatives. For the numeric approach we use the ForwardDiff package, which performs automatic differentiation.\n\n70.8.1 Derivatives of univariate functions\nNumerically, the ForwardDiff.derivative(f, x) function call will find the derivative of the function f at the point x:\n\nForwardDiff.derivative(sin, pi/3) - cos(pi/3)\n\n0.0\n\n\nThe CalculusWithJulia package overrides the ' (adjoint) syntax for functions to provide a derivative which takes a function and returns a function, so its usage is familiar\n\nf(x) = sin(x)\nf'(pi/3) - cos(pi/3) # or just sin'(pi/3) - cos(pi/3)\n\n0.0\n\n\nHigher order derivatives are possible as well,\n\nf(x) = sin(x)\nf''''(pi/3) - f(pi/3)\n\n0.0\n\n\n\nSymbolically, the diff function of SymPy finds derivatives.\n\n@syms x\nf(x) = exp(-x)*sin(x)\nex = f(x) # symbolic expression\ndiff(ex, x) # or just diff(f(x), x)\n\n \n\\[\n- e^{- x} \\sin{\\left(x \\right)} + e^{- x} \\cos{\\left(x \\right)}\n\\]\n\n\n\nHigher order derivatives can be specified as well\n\n@syms x\nex = exp(-x)*sin(x)\n\ndiff(ex, x, x)\n\n \n\\[\n- 2 e^{- x} \\cos{\\left(x \\right)}\n\\]\n\n\n\nOr with a number:\n\n@syms x\nex = exp(-x)*sin(x)\n\ndiff(ex, x, 5)\n\n \n\\[\n4 \\left(\\sin{\\left(x \\right)} - \\cos{\\left(x \\right)}\\right) e^{- x}\n\\]\n\n\n\nThe variable is important, as this allows parameters to be symbolic\n\n@syms mu sigma x\ndiff(exp(-((x-mu)/sigma)^2/2), x)\n\n \n\\[\n- \\frac{\\left(- 2 \\mu + 2 x\\right) e^{- \\frac{\\left(- \\mu + x\\right)^{2}}{2 \\sigma^{2}}}}{2 \\sigma^{2}}\n\\]\n\n\n\n\n\n70.8.2 Partial derivatives\nThere is no direct partial derivative function provided by ForwardDiff; rather, we use the result of the ForwardDiff.gradient function, which finds the partial derivatives for each variable. To use this, the function must be defined in terms of a point or vector.\n\nf(x,y,z) = x*y + y*z + z*x\nf(v) = f(v...) 
# this is needed for ForwardDiff.gradient\nForwardDiff.gradient(f, [1,2,3])\n\n3-element Vector{Int64}:\n 5\n 4\n 3\n\n\nWe can see directly that \\(\\partial{f}/\\partial{x} = y + z\\). At the point \\((1,2,3)\\), this is \\(5\\), as returned above.\n\nSymbolically, diff is used for partial derivatives:\n\n@syms x y z\nex = x*y + y*z + z*x\ndiff(ex, x) # ∂f/∂x\n\n \n\\[\ny + z\n\\]\n\n\n\n\nGradient\n\nAs seen, the ForwardDiff.gradient function finds the gradient at a point. In CalculusWithJulia, the gradient is extended to return a function when called with no additional arguments:\n\nf(x,y,z) = x*y + y*z + z*x\nf(v) = f(v...)\ngradient(f)(1,2,3) - gradient(f, [1,2,3])\n\n3-element Vector{Int64}:\n 0\n 0\n 0\n\n\nThe ∇ symbol, formed by entering \\nabla[tab], is mathematical syntax for the gradient, and is defined in CalculusWithJulia.\n\nf(x,y,z) = x*y + y*z + z*x\nf(x) = f(x...)\n∇(f)(1,2,3) # same as gradient(f, [1,2,3])\n\n3-element Vector{Int64}:\n 5\n 4\n 3\n\n\n\nIn SymPy, there is no gradient function, though finding the gradient is easy through broadcasting:\n\n@syms x y z\nex = x*y + y*z + z*x\ndiff.(ex, [x,y,z]) # [diff(ex, x), diff(ex, y), diff(ex, z)]\n\n3-element Vector{Sym}:\n y + z\n x + z\n x + y\n\n\nThe CalculusWithJulia package provides a method for gradient:\n\n@syms x y z\nex = x*y + y*z + z*x\n\ngradient(ex, [x,y,z])\n\n3-element Vector{Sym}:\n y + z\n x + z\n x + y\n\n\nThe ∇ symbol is an alias. It can guess the order of the free symbols, but generally specifying them is needed. This is done with a tuple:\n\n@syms x y z\nex = x*y + y*z + z*x\n\n∇((ex, [x,y,z])) # for this, ∇(ex) also works\n\n3-element Vector{Sym}:\n y + z\n x + z\n x + y\n\n\n\n\n70.8.3 Jacobian\nThe Jacobian of a function \\(f:R^n \\rightarrow R^m\\) is an \\(m\\times n\\) matrix of partial derivatives. Numerically, ForwardDiff.jacobian can find the Jacobian of a function at a point:\n\nF(u,v) = [u*cos(v), u*sin(v), u]\nF(v) = F(v...) 
# needed for ForwardDiff.jacobian\npt = [1, pi/4]\nForwardDiff.jacobian(F , pt)\n\n3×2 Matrix{Float64}:\n 0.707107 -0.707107\n 0.707107 0.707107\n 1.0 0.0\n\n\n\nSymbolically, the jacobian function is a method of a matrix, so the calling pattern is different. (Of the form object.method(arguments...).)\n\n@syms u v\nF(u,v) = [u*cos(v), u*sin(v), u]\nF(v) = F(v...)\n\nex = F(u,v)\nex.jacobian([u,v])\n\n3×2 Matrix{Sym}:\n cos(v) -u⋅sin(v)\n sin(v) u⋅cos(v)\n 1 0\n\n\nAs the Jacobian can be identified as the matrix with rows given by the transpose of the gradient of the component, it can be computed directly, but it is more difficult:\n\n@syms u::real v::real\nF(u,v) = [u*cos(v), u*sin(v), u]\nF(v) = F(v...)\n\nvcat([diff.(ex, [u,v])' for ex in F(u,v)]...)\n\n3×2 Matrix{Sym}:\n cos(v) -u⋅sin(v)\n sin(v) u⋅cos(v)\n 1 0\n\n\n\n\n70.8.4 Divergence\nNumerically, the divergence can be computed from the Jacobian by adding the diagonal elements. This is numerically inefficient, as the other partial derivatives must be found and discarded, but this is generally not an issue for these notes. 
The following uses tr (the trace from the LinearAlgebra package) to find the sum of a diagonal.\n\nF(x,y,z) = [-y, x, z]\nF(v) = F(v...)\npt = [1,2,3]\ntr(ForwardDiff.jacobian(F , pt))\n\n1\n\n\nThe CalculusWithJulia package provides divergence to compute the divergence and provides the ∇ ⋅ notation (\\nabla[tab]\\cdot[tab]):\n\nF(x,y,z) = [-y, x, z]\nF(v) = F(v...)\n\ndivergence(F, [1,2,3])\n(∇⋅F)(1,2,3) # not ∇⋅F(1,2,3) as that evaluates F(1,2,3) before the divergence\n\n1.0\n\n\n\nSymbolically, the divergence can be found directly:\n\n@syms x y z\nex = [-y, x, z]\n\nsum(diff.(ex, [x,y,z])) # sum of [diff(ex[1], x), diff(ex[2],y), diff(ex[3], z)]\n\n \n\\[\n1\n\\]\n\n\n\nThe divergence function can be used for symbolic expressions:\n\n@syms x y z\nex = [-y, x, z]\n\ndivergence(ex, [x,y,z])\n∇⋅(ex, [x,y,z]) # For this, ∇ ⋅ F(x,y,z) also works\n\n \n\\[\n1\n\\]\n\n\n\n\n\n70.8.5 Curl\nThe curl can be computed from the off-diagonal elements of the Jacobian. The calculation follows the formula. The CalculusWithJulia package provides curl to compute this:\n\nF(x,y,z) = [-y, x, 1]\nF(v) = F(v...)\n\ncurl(F, [1,2,3])\n\n3-element Vector{Float64}:\n 0.0\n -0.0\n 2.0\n\n\nAs well, if no point is specified, a function is returned for which a point may be specified using 3 coordinates or a vector\n\nF(x,y,z) = [-y, x, 1]\nF(v) = F(v...)\n\ncurl(F)(1,2,3), curl(F)([1,2,3])\n\n([0.0, -0.0, 2.0], [0.0, -0.0, 2.0])\n\n\nFinally, the ∇ × notation (\\nabla[tab]\\times[tab]) is available:\n\nF(x,y,z) = [-y, x, 1]\nF(v) = F(v...)\n\n(∇×F)(1,2,3)\n\n3-element Vector{Float64}:\n 0.0\n -0.0\n 2.0\n\n\nFor symbolic expressions, the ∇ × notation is available if the symbolic vector contains all \\(3\\) variables\n\n@syms x y z\nF = [-y, x, z] # but not [-y, x, 1] which errs; use `curl` with variables specified\n\ncurl([-y, x, 1], (x,y,z)), ∇×F\n\nLoadError: invalid redefinition of constant F"
},
{
"objectID": "misc/quick_notes.html#integrals",
"href": "misc/quick_notes.html#integrals",
"title": "70  Quick introduction to Calculus with Julia",
"section": "70.9 Integrals",
"text": "70.9 Integrals\nNumeric integration is provided by the QuadGK package, for univariate integrals, and the HCubature package for higher dimensional integrals.\nusing QuadGK, HCubature\n\n70.9.1 Integrals of univariate functions\nA definite integral may be computed numerically using quadgk\n\nquadgk(sin, 0, pi)\n\n(2.0, 1.7905676941154525e-12)\n\n\nThe answer and an estimate for the worst case error is returned.\nIf singularities are avoided, improper integrals are computed as well:\n\nquadgk(x->1/x^(1/2), 0, 1)\n\n(1.9999999845983916, 2.3762511924588765e-8)\n\n\n\nSymPy provides the integrate function to compute both definite and indefinite integrals.\n\n@syms a::real x::real\nintegrate(exp(a*x)*sin(x), x)\n\n \n\\[\n\\frac{a e^{a x} \\sin{\\left(x \\right)}}{a^{2} + 1} - \\frac{e^{a x} \\cos{\\left(x \\right)}}{a^{2} + 1}\n\\]\n\n\n\nLike diff the variable to integrate is specified.\nDefinite integrals use a tuple, (variable, a, b), to specify the variable and range to integrate over:\n\n@syms a::real x::real\nintegrate(sin(a + x), (x, 0, PI)) # ∫_0^PI sin(a+x) dx\n\n \n\\[\n2 \\cos{\\left(a \\right)}\n\\]\n\n\n\n\n\n70.9.2 2D and 3D iterated integrals\nTwo and three dimensional integrals over box-like regions are computed numerically with the hcubature function from the HCubature package. If the box is \\([x_1, y_1]\\times[x_2,y_2]\\times\\cdots\\times[x_n,y_n]\\) then the limits are specified through tuples of the form \\((x_1,x_2,\\dots,x_n)\\) and \\((y_1,y_2,\\dots,y_n)\\).\n\nf(x,y) = x*y^2\nf(v) = f(v...)\n\nhcubature(f, (0,0), (1, 2)) # computes ∫₀¹∫₀² f(x,y) dy dx\n\n(1.333333333333333, 4.440892098500626e-16)\n\n\nThe calling pattern for more dimensions is identical.\n\nf(x,y,z) = x*y^2*z^3\nf(v) = f(v...)\n\nhcubature(f, (0,0,0), (1, 2,3)) # computes ∫₀¹∫₀²∫₀³ f(x,y,z) dz dy dx\n\n(27.0, 0.0)\n\n\nThe box-like region requirement means a change of variables may be necessary. 
For example, to integrate over the region \\(x^2 + y^2 \\leq 1; x \\geq 0\\), polar coordinates can be used with \\((r,\\theta)\\) in \\([0,1]\\times[-\\pi/2,\\pi/2]\\). When changing variables, the Jacobian enters into the formula, through\n\\[\n~\n\\iint_{G(S)} f(\\vec{x}) dV = \\iint_S (f \\circ G)(\\vec{u}) |\\det(J_G)(\\vec{u})| dU.\n~\n\\]\nHere we implement this:\n\nf(x,y) = x*y^2\nf(v) = f(v...)\nPhi(r, theta) = r * [cos(theta), sin(theta)]\nPhi(rtheta) = Phi(rtheta...)\nintegrand(rtheta) = f(Phi(rtheta)) * det(ForwardDiff.jacobian(Phi, rtheta))\nhcubature(integrand, (0.0,-pi/2), (1.0, pi/2))\n\n(0.13333333333904918, 1.9853799966359355e-9)\n\n\n\nSymbolically, the integrate function allows additional terms to be specified. For example, the above could be done through:\n\n@syms x::real y::real\nintegrate(x * y^2, (y, -sqrt(1-x^2), sqrt(1-x^2)), (x, 0, 1))\n\n \n\\[\n\\frac{2}{15}\n\\]\n\n\n\n\n\n70.9.3 Line integrals\nA line integral of \\(f\\) parameterized by \\(\\vec{r}(t)\\) is computed by:\n\\[\n~\n\\int_a^b (f\\circ\\vec{r})(t) \\| \\frac{dr}{dt}\\| dt.\n~\n\\]\nFor example, if \\(f(x,y) = 2 - x^2 - y^2\\) and \\(r(t) = 1/t \\langle \\cos(t), \\sin(t) \\rangle\\), then the line integral over \\([1,2]\\) is given by:\n\nf(x,y) = 2 - x^2 - y^2\nf(v) = f(v...)\nr(t) = [cos(t), sin(t)]/t\n\nintegrand(t) = (f∘r)(t) * norm(r'(t))\nquadgk(integrand, 1, 2)\n\n(1.2399213772953277, 4.525271268818187e-9)\n\n\nTo integrate a line integral through a vector field, say \\(\\int_C F \\cdot\\hat{T} ds=\\int_C F\\cdot \\vec{r}'(t) dt\\) we have, for example,\n\nF(x,y) = [-y, x]\nF(v) = F(v...)\nr(t) = [cos(t), sin(t)]/t\nintegrand(t) = (F∘r)(t) ⋅ r'(t)\nquadgk(integrand, 1, 2)\n\n(0.5, 2.1134927141730486e-10)\n\n\n\nSymbolically, there is no real difference from a 1-dimensional integral. 
Let \\(\\phi = 1/\\|r\\|\\) and integrate the gradient field over one turn of the helix \\(\\vec{r}(t) = \\langle \\cos(t), \\sin(t), t\\rangle\\).\n\n@syms x::real y::real z::real t::real\nphi(x,y,z) = 1/sqrt(x^2 + y^2 + z^2)\nr(t) = [cos(t), sin(t), t]\n∇phi = diff.(phi(x,y,z), [x,y,z])\n∇phi_r = subs.(∇phi, x.=> r(t)[1], y.=>r(t)[2], z.=>r(t)[3])\nrp = diff.(r(t), t)\nglobal helix = simplify(∇phi_r ⋅ rp )\n\n \n\\[\n- \\frac{t}{\\left(t^{2} + 1\\right)^{\\frac{3}{2}}}\n\\]\n\n\n\nThen\n\n@syms t::real\nintegrate(helix, (t, 0, 2PI))\n\n \n\\[\n-1 + \\frac{1}{\\sqrt{1 + 4 \\pi^{2}}}\n\\]\n\n\n\n\n\n70.9.4 Surface integrals\nThe surface integral for a parameterized surface involves a surface element \\(\\|\\partial\\Phi/\\partial{u} \\times \\partial\\Phi/\\partial{v}\\|\\). This can be computed numerically with:\n\nPhi(u,v) = [u*cos(v), u*sin(v), u]\nPhi(v) = Phi(v...)\n\nfunction SE(Phi, pt)\n J = ForwardDiff.jacobian(Phi, pt)\n J[:,1] × J[:,2]\nend\n\nnorm(SE(Phi, [1,2]))\n\n1.4142135623730951\n\n\nTo find the surface integral (\\(f=1\\)) for this surface over \\([0,1] \\times [0,2\\pi]\\), we have:\n\nhcubature(pt -> norm(SE(Phi, pt)), (0.0,0.0), (1.0, 2pi))\n\n(4.442882938158366, 2.6645352591003757e-15)\n\n\nSymbolically, the approach is similar:\n\n@syms u::real v::real\nexₚ = Phi(u,v)\nJₚ = exₚ.jacobian([u,v])\nSurfEl = norm(Jₚ[:,1] × Jₚ[:,2]) |> simplify\n\n \n\\[\n\\sqrt{2} \\left|{u}\\right|\n\\]\n\n\n\nThen\n\nintegrate(SurfEl, (u, 0, 1), (v, 0, 2PI))\n\n \n\\[\n\\sqrt{2} \\pi\n\\]\n\n\n\nIntegrating a vector field over the surface, would be similar:\n\nF(x,y,z) = [x, y, z]\nex = F(Phi(u,v)...) ⋅ (Jₚ[:,1] × Jₚ[:,2])\nintegrate(ex, (u,0,1), (v, 0, 2PI))\n\n \n\\[\n0\n\\]"
},
{
"objectID": "references.html",
"href": "references.html",
"title": "References",
"section": "",
"text": "Hass, Joel R., Christopher E. Heil, and Maurice D. Weir. 2018.\nThomas' Calculus. Pearson.\n\n\nKnill, Oliver. n.d. “Some Teaching Notes.” https://people.math.harvard.edu/~knill/teach/index.html.\n\n\nRogawski, Jon, Colin Adams, and Robert Franzosa. 2019.\nCalculus. Macmillan.\n\n\nSchey, H. M. 1997. Div, Grad, Curl, and All That. W.W. Norton.\n\n\nStrang, Gilbert. n.d. “Calculus Online Textbook.” https://ocw.mit.edu/courses/res-18-001-calculus-online-textbook-spring-2005/."
}
]