This commit is contained in:
Steven G. Johnson 2023-02-09 22:08:44 -05:00
parent 3c31b09cf0
commit c920f093e8
2 changed files with 238 additions and 0 deletions

View File

@ -43,6 +43,7 @@ useful study guide.
* Matrix multiplication by blocks and columns-times-rows. Complexity: standard algorithm for (m×p)⋅(p×n) is [Θ(mnp): roughly proportional](https://en.wikipedia.org/wiki/Big_O_notation) to mnp for large m,n,p, regardless of how we rearrange it into blocks. (There also [exist theoretically better](https://en.wikipedia.org/wiki/Computational_complexity_of_matrix_multiplication), but highly impractical, algorithms.)
* Briefly reviewed the "famous four" matrix factorizations: [LU](https://en.wikipedia.org/wiki/LU_decomposition), [diagonalization XΛX⁻¹ or QΛQᵀ](https://en.wikipedia.org/wiki/Eigendecomposition_of_a_matrix), [QR](https://en.wikipedia.org/wiki/QR_decomposition), and the [SVD UΣVᵀ](https://en.wikipedia.org/wiki/Singular_value_decomposition). QR and QΛQᵀ in the columns-times-rows picture, especially QΛQᵀ (diagonalization for real A=Aᵀ) as a sum of symmetric rank-1 projections.
* The [four fundamental subspaces](https://web.mit.edu/18.06/www/Essays/newpaper_ver3.pdf) for an m×n matrix A of rank r, mapping "inputs" x∈ⁿ to "outputs" Ax∈ᵐ: the "input" subspaces C(Aᵀ) (row space, dimension r) and its [orthogonal complement](https://en.wikipedia.org/wiki/Orthogonal_complement) N(A) (nullspace, dimension nr); and the "output" subspaces C(A) (column space, dimension r) and its orthogonal complement N(Aᵀ) (left nullspace, dimension mr).
* [pset 1](psets/pset1.ipynb), due Friday Feb 17
**Further reading**: Textbook 1.31.6. [OCW lecture 2](https://ocw.mit.edu/courses/18-065-matrix-methods-in-data-analysis-signal-processing-and-machine-learning-spring-2018/resources/lecture-2-multiplying-and-factoring-matrices/). If you haven't seen [matrix multiplication by blocks](https://www.math.ucdavis.edu/~linear/old/notes9.pdf) before, [here is a nice video](https://www.youtube.com/watch?v=KCUgWj5nhYc).

237
psets/pset1.ipynb Normal file
View File

@ -0,0 +1,237 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "08cd437b",
"metadata": {},
"source": [
"# 18.065 Pset 1\n",
"\n",
"Due Friday 2/17 at 1pm. Submit in PDF format: a decent-quality scan/image of any handwritten solutions (e.g. get a scanner app on your phone or use a tablet), and a PDF printout of your Jupyter notebook showing your code and (clearly labeled) results."
]
},
{
"cell_type": "markdown",
"id": "0d53b289",
"metadata": {},
"source": [
"## Problem 1 (4+4+4+4+4 points)\n",
"\n",
"Recall from class that multiplying an $m \\times p$ by a $p \\times n$ matrix costs $mnp$ scalar multiplications (and a similar number of additions) by the standard (practical) algorithms.\n",
"\n",
"Matrix multiplication is **not commutative** ($AB \\ne BA$ in general), but it **is associative**: $(AB)C=A(BC)$. It turns out that where you put the parentheses (i.e. in *what order* you do the multiplications) can make a *huge* difference in computational cost.\n",
"\n",
"**(a)** If $x \\in \\mathbb{R}^n$ and $A,B$ are $n \\times n$ matrices, compare the scalar multiplication counts of $(AB)x$ vs. $A(Bx)$, i.e. if we do the multiplications in the order indicated by the parentheses.\n",
"\n",
"**(b)** If $x, b\\in \\mathbb{R}^n$, **how many scalar multiplications** does the computation $$p = (I - (xx^T)/(x^T x)) b$$ take if we *do it in the order indicated by the parentheses*? (Note that dividing by a scalar $\\alpha$ is equivalent to multiplying by $\\alpha^{-1}$ at the negligible cost of one scalar division.)\n",
"\n",
"**(c)** Explain how to compute the *same* $p$ as in part (b) using as *few multiplications as possible*. Outline the sequence of computational steps, and give the count of multiplications.\n",
"\n",
"**(d)** $p^T x = $ what?\n",
"\n",
"**(e)** Implement your algorithm from (c) in Julia, filling in the code below, and time it for $n=1000$ using the `@btime` macro from the [BenchmarkTools package](https://github.com/JuliaCI/BenchmarkTools.jl), along with the algorithm from part (b), following the outline below. How does the ratio of the two times compare to your ratio of multiplication counts?"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b149871c",
"metadata": {},
"outputs": [],
"source": [
"using LinearAlgebra, BenchmarkTools\n",
"\n",
"# algorithm from part (b)\n",
"function part_b(x, b)\n",
" return (I - (x*x')*(x'*x)^-1) * b\n",
"end\n",
"\n",
"# algorithm from part (c)\n",
"function part_c(x, b)\n",
" # CHANGE THIS:\n",
" return x + b\n",
"end\n",
"\n",
"# test and benchmark on random vectors:\n",
"n = 1000\n",
"x, b = rand(n), rand(n)\n",
"\n",
"# test it first — should give same answer up to roundoff error\n",
"if part_c(x, b) ≈ part_b(x, b)\n",
" println(\"Hooray, part (c) and part (b) agree!\")\n",
"else\n",
" error(\"You made a mistake: part (c) and part (b) do not agree!\")\n",
"end\n",
"\n",
"# benchmark it:\n",
"\n",
"println(\"\\npart (b): \")\n",
"@btime part_b($x, $b);\n",
"\n",
"println(\"\\npart (c): \")\n",
"@btime part_c($x, $b);"
]
},
{
"cell_type": "markdown",
"id": "bfa41bf1",
"metadata": {},
"source": [
"## Problem 2 (8+4 points)\n",
"\n",
"**(a)** Describe the four fundamental subspaces of the **rank-1 matrix** $A = uv^T$ where $u \\in \\mathbb{R}^m$ and $n \\in \\mathbb{R}^n$.\n",
"\n",
"**(b)** For any column vectors $u,v \\in \\mathbb{R}^3$, the matrix $uv^T$ is rank 1, except when \\_\\_\\_\\_\\_\\_\\_\\_, in which case $uv^T$ has rank \\_\\_\\_\\_."
]
},
{
"cell_type": "markdown",
"id": "82b2c169",
"metadata": {},
"source": [
"## Problem 3 (5+4+4+4 points)\n",
"\n",
"**(a)** Pick the choices that makes this statement correct for arbitrary matrices A and B: $C(AB)$ (*contains / is contained in*) the column space of (*A / B*). Briefly justify your answer.\n",
"\n",
"**(b)** Suppose that $A$ is a $1000\\times 1000$ matrix of rank $< 10$. Suppose we multiply it by 10 random vectors $x_1, x_2, \\ldots, x_{10}$, e.g. generated by `randn(1000)`. How could we use the results to get a $10\\times 10$ matrix $C$ whose rank (almost certainly) matches $A$'s?\n",
"\n",
"**(c)** Suppose we instead make $1000\\times 10$ matrix $X$ whose columns are $x_1, x_2, \\ldots, x_{10}$. Give a formula for the *same* matrix $C$ in terms of matrix products involving $A$ and $X$.\n",
"\n",
"**(d)** Fill in the code for $C$ below, and compare the biggest 10 singular values of $A$ (chosen to be rank ≈ 4 in this case) to the corresponding 10 singular values of $C$. Does it match what you expect?"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e7b32901",
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"using LinearAlgebra\n",
"\n",
"# random 1000x1000 matrix of rank 4\n",
"A = randn(1000, 4) * randn(4, 1000)\n",
"@show svdvals(A)[1:10]\n",
"\n",
"X = randn(1000, 10)\n",
"C = ???\n",
"@show svdvals(C)"
]
},
{
"cell_type": "markdown",
"id": "ee67e16b",
"metadata": {},
"source": [
"## Problem 4 (4+5+5 points)\n",
"\n",
"The famous Hadamard matrices are filled with ±1 and have orthogonal columns (orthonormal if we divide $H_n$ by $1/\\sqrt{n}$). The first few are:\n",
"\\begin{align}\n",
" H_1 &= \\begin{pmatrix} 1 \\end{pmatrix}, \\\\\n",
" H_2 &= \\begin{pmatrix}\n",
" 1 & 1 \\\\\n",
" 1 & -1\n",
" \\end{pmatrix}, \\\\\n",
" H_4 &= \\begin{pmatrix} H_2 & H_2 \\\\ H_2 & -H_2 \\end{pmatrix} = \\begin{pmatrix}\n",
" 1 & 1 & 1 & 1\\\\\n",
" 1 & -1 & 1 & -1\\\\\n",
" 1 & 1 & -1 & -1\\\\\n",
" 1 & -1 & -1 & 1\n",
" \\end{pmatrix} \\, .\n",
"\\end{align}\n",
"Notice that (for power-of-2 sizes), they are built up \"recursively\" out of smaller Hadamard matrices. Multiplying a vector by a Hadamard matrix requires no multiplications at all, only additions/subtractions.\n",
"\n",
"**(a)** If you multiply $H_4 x$ for some $x \\in \\mathbb{R}^4$ by the normal \"rows-times-columns\" method (*without* exploiting any special patterns), exactly how many scalar additions/subtractions are required?\n",
"\n",
"**(b)** Let's break $x$ into two blocks: $x = \\begin{pmatrix} x_1 \\\\ x_2 \\end{pmatrix}$ for $x_1, x_2 \\in \\mathbb{R}^2$. Write out $H_4 x$ in terms of a sequence of $2\\times 2$ block multiplications with $\\pm H_2$. You'll notice that some of these $2\\times 2$ multiplications are repeated. If we re-use these repeated multiplications rather than doing them twice, we can save a bunch of arithmetic — what is the new count of scalar additions/subtractions if you do this?\n",
"\n",
"**(c)** Similarly, the $8\\times 8$ Hadamard matrix $H_8 = \\begin{pmatrix} H_4 & H_4 \\\\ H_4 & -H_4 \\end{pmatrix}$ is made out of $H_4$ matrices. To multiply it by a vector $y \\in \\mathbb{R}^8$, the naive rows-times columns method would require \\_\\_\\_\\_ scalar additions/subtractions, whereas if you broke them up first into blocks of 4, used your solution from (b), and then re-used any repeated $H_4$ products, it would only require \\_\\_\\_\\_ scalar additions/subtractions."
]
},
{
"cell_type": "markdown",
"id": "c2a2b8eb",
"metadata": {},
"source": [
"## Problem 5 (5+5 points)\n",
"\n",
"The famous \"discrete Fourier transform\" matrix $F$ has columns that are actually eigenvectors of the (unitary) permutation matrix:\n",
"$$\n",
"P = \\begin{pmatrix} & 1 & & \\\\ & &1 & \\\\ & & &1 \\\\ 1 & & & \\end{pmatrix}\n",
"$$\n",
"for the $4\\times 4$ case, and similarly for larger matrices.\n",
"\n",
"**(a)** One way of saying way Fourier transforms are practically important is that they *diagonalize* (are eigenvectors of) matrices that *commute* with $P$. If $A$ is a $4\\times 4$ matrix whose first row is $(a\\, b\\, c\\, d)$\n",
"$$\n",
"A = \\begin{pmatrix} a & b & c & d \\\\ ? & ? &? & ? \\\\ ? &? & ? &? \\\\ ? & ? & ? &? \\end{pmatrix}\n",
"$$\n",
"that commutes with $P$ (i.e. $AP=PA$), what must be true of the other (\"?\") entries of $A$?\n",
"\n",
"**(b)** Fill in the matrix `A` in Julia below and fill in and run the code to check that it commutes with $P$ and is diagonalized by $F$:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e7222137",
"metadata": {},
"outputs": [],
"source": [
"a, b, c, d = 1, 7, 3, 2 # 4 arbitrarily chosen values\n",
"\n",
"P = [0 1 0 0\n",
" 0 0 1 0\n",
" 0 0 0 1\n",
" 1 0 0 0]\n",
"\n",
"F = im .^ ((0:3) .* (0:3)') # the 4×4 Fourier matrix\n",
"\n",
"# fill in:\n",
"A = [a b c d\n",
" ? ? ? ?\n",
" ? ? ? ?\n",
" ? ? ? ?]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "25dc7673",
"metadata": {},
"outputs": [],
"source": [
"# check:\n",
"P * A == A * P"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0ae85697",
"metadata": {},
"outputs": [],
"source": [
"# check that F diagonalizes A. (How?)\n",
"\n",
"????"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Julia 1.8.3",
"language": "julia",
"name": "julia-1.8"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}