- Dense linear algebra: Gaussian elimination, QR, eigenvalues, SVD, etcetera, assuming that A is just m×m numbers with no special structure. Cost is Θ(m³), memory is Θ(m²). Usually you run out of memory before running out of time! (m ∼ 10⁴ is close to filling up memory, but runs in only a few minutes.)
- Sparse-direct methods: For [sparse matrices](https://en.wikipedia.org/wiki/Sparse_matrix) (mostly 0), only store and compute with nonzero entries. If a clever ordering is chosen for rows/cols, Gaussian elimination can often produce mostly sparse L and U factors! (This is what `A \ b` does in Julia if `A` is a [sparse-matrix type](https://github.com/JuliaSparse/SparseArrays.jl).) But for very big problems even these methods can eventually run out of memory.
- Iterative methods: start with a "guess" for x (usually x=0 or x=random), and iteratively make it closer to a solution **using only A-times-vector operations** (and linear combinations and dot products). Requires a fast A-times-vector, e.g. if A is sparse, low rank, a convolution, or some combination thereof. Modern methods include [GMRES](https://en.wikipedia.org/wiki/Generalized_minimal_residual_method), [BiCGSTAB(ℓ)](https://en.wikipedia.org/wiki/Biconjugate_gradient_stabilized_method), [conjugate gradient (CG)](https://en.wikipedia.org/wiki/Conjugate_gradient_method), and others. (A small comparison of the dense, sparse-direct, and iterative approaches is sketched after this list.)
- Randomized linear algebra: by multiplying A on the left/right by small random wide/thin matrices, carefully chosen, we can construct an approximate "sketch" of A that can be used to estimate the SVD, solutions to least-squares, etcetera, and can also accelerate iterative solvers. (A randomized-SVD sketch appears after this list.)
- Tricks for special cases: there are various specialized techniques for convolution/circulant matrices (via FFTs), [banded matrices](https://en.wikipedia.org/wiki/Band_matrix) (linear-time methods), and low-rank updates ([Sherman–Morrison formula](https://en.wikipedia.org/wiki/Sherman%E2%80%93Morrison_formula)); a Sherman–Morrison sketch appears after this list.
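
To make these tradeoffs concrete, here is a minimal sketch (not from the lecture) that solves the same sparse system all three ways; it assumes the third-party [IterativeSolvers.jl](https://github.com/JuliaLinearAlgebra/IterativeSolvers.jl) package for `gmres`:

```julia
using LinearAlgebra, SparseArrays
using IterativeSolvers  # assumed third-party package providing gmres

n = 1000
A = sprandn(n, n, 0.001) + 10I   # random sparse matrix, made diagonally dominant
b = randn(n)

x_dense  = Matrix(A) \ b   # dense LU: Θ(n³) time, Θ(n²) memory
x_sparse = A \ b           # sparse-direct LU: stores/computes only nonzeros (plus fill-in)
x_iter   = gmres(A, b)     # iterative: touches A only through A*v products

@show norm(x_dense - x_sparse)                # ≈ machine precision (both are direct solves)
@show norm(x_dense - x_iter) / norm(x_dense)  # small; set by the GMRES tolerance
```

For n this small all three finish almost instantly; the differences only bite as n grows or as A loses its sparsity.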
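
Likewise, a minimal randomized-SVD sketch in the spirit of Halko–Martinsson–Tropp; the function name `rand_svd` and the oversampling default are ad hoc choices for illustration, not a library API:

```julia
using LinearAlgebra

# Randomized SVD sketch: project A onto a random thin subspace, then do
# cheap exact linear algebra on the small sketch. k = target rank, p = oversampling.
function rand_svd(A, k; p = 10)
    m, n = size(A)
    Ω = randn(n, k + p)          # random thin "sketching" matrix
    Q = Matrix(qr(A * Ω).Q)      # orthonormal basis ≈ column space of A
    B = Q' * A                   # small (k+p)×n matrix: the sketch of A
    F = svd(B)                   # exact SVD of the small sketch is cheap
    return Q * F.U[:, 1:k], F.S[1:k], F.V[:, 1:k]
end

# sanity check on a matrix of exact rank 3:
A = randn(1000, 3) * randn(3, 800)
U, S, V = rand_svd(A, 3)
@show norm(A - U * Diagonal(S) * V')   # ≈ 0, since rank(A) = 3 ≤ k
```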
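
And a sketch of the low-rank-update trick: the Sherman–Morrison formula reuses one factorization of A to solve with the rank-1 update A + uvᵀ in Θ(m²) work, instead of refactorizing at Θ(m³) cost (all variable names here are illustrative):

```julia
using LinearAlgebra

m = 500
A = randn(m, m) + 10I
F = lu(A)                          # factorize A once: Θ(m³)
u, v, b = randn(m), randn(m), randn(m)

# Sherman–Morrison: (A + uvᵀ)⁻¹b = y − z (vᵀy)/(1 + vᵀz), where y = A⁻¹b, z = A⁻¹u.
# Two solves that reuse F: Θ(m²) each, no new factorization needed.
y = F \ b
z = F \ u
x = y - z * (dot(v, y) / (1 + dot(v, z)))

@show norm((A + u * v') * x - b)   # ≈ 0: x solves the rank-1-updated system
```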
* [pset 2 solutions](psets/pset2sol.ipynb)
* pset 3 (due 3/17): coming soon
**Further reading:** For Gram–Schmidt and QR, see further reading for lecture 9. Textbook section II.1, [OCW video lecture 10](https://ocw.mit.edu/courses/18-065-matrix-methods-in-data-analysis-signal-processing-and-machine-learning-spring-2018/resources/lecture-10-survey-of-difficulties-with-ax-b/). Sparse-direct solvers are described in detail by the book *Direct Methods for Sparse Linear Systems* by Davis. For iterative methods, more advanced treatments include the book *Numerical Linear Algebra* by Trefethen and Bau, and surveys of algorithms can be found in the *Templates* books for [Ax=b](http://www.netlib.org/linalg/html_templates/Templates.html) and [Ax=λx](http://web.cs.ucdavis.edu/~bai/ET/contents.html). [Some crude rules of thumb](https://github.com/mitmath/18335/blob/spring20/notes/solver-options.pdf) for solving linear systems (from 18.335 spring 2020).
## Lecture 13 (Mar 6)
* Continued the summary of large-scale linear algebra from lecture 12, mentioning randomized algorithms such as "sketched" least-squares and the randomized SVD (which we will cover in more detail later), and also specialized algorithms for particular cases.
* Krylov methods: defined the [Krylov subspaces](https://en.wikipedia.org/wiki/Krylov_subspace) reachable by iterative algorithms, and defined a Krylov algorithm (loosely) as an iterative algorithm that finds the "best" solution in the whole Krylov space (possibly approximately) on the n-th step. Gave [power iteration](https://en.wikipedia.org/wiki/Power_iteration) for the largest |λ| as an example of something that is *not* a Krylov method. Explained why the basis (b, Ab, A²b, …) is a poor (ill-conditioned) choice, and instead explained the [Arnoldi iteration](https://en.wikipedia.org/wiki/Arnoldi_iteration) to find an orthonormal basis Qₙ by (essentially) Gram–Schmidt, leading to the [GMRES algorithm](https://en.wikipedia.org/wiki/Generalized_minimal_residual_method) for Ax=b; a bare-bones Arnoldi sketch follows.
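
A bare-bones Arnoldi sketch (an illustrative implementation, not the course's reference code), using modified Gram–Schmidt to build the orthonormal basis:

```julia
using LinearAlgebra

# Arnoldi iteration: build an orthonormal basis Qₙ₊₁ of the Krylov space
# span{b, Ab, …, Aⁿb}, with A*Qₙ = Qₙ₊₁*Hₙ for an upper-Hessenberg H.
function arnoldi(A, b, n)
    m = length(b)
    Q = zeros(m, n + 1)
    H = zeros(n + 1, n)
    Q[:, 1] = b / norm(b)
    for k = 1:n
        v = A * Q[:, k]                 # the only use of A: one matvec per step
        for j = 1:k                     # orthogonalize v against q₁, …, q_k
            H[j, k] = dot(Q[:, j], v)
            v -= H[j, k] * Q[:, j]
        end
        H[k+1, k] = norm(v)             # ≈ 0 ⇒ "lucky breakdown": Krylov space is invariant
        Q[:, k+1] = v / H[k+1, k]
    end
    return Q, H
end

A, b = randn(100, 100), randn(100)
Q, H = arnoldi(A, b, 10)
@show norm(A * Q[:, 1:10] - Q * H)      # ≈ 0: the Arnoldi relation holds
@show norm(Q' * Q - I)                  # ≈ 0: the basis is orthonormal
```

GMRES then minimizes ‖Ax−b‖ over the Krylov space by solving the small (n+1)×n least-squares problem min‖Hₙy − ‖b‖e₁‖ and setting x = Qₙy.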
**Further reading:** A review of randomized linear algebra can be found in the Strang textbook sec. II.4, and also in [Halko, Martinsson, and Tropp (2011)](https://epubs.siam.org/doi/10.1137/090771806). A recent paper on a variety of new randomized algorithms, e.g. for "sketched" least-squares problems or to accelerate iterative algorithms like GMRES, is [Nakatsukasa and Tropp (2022)](https://arxiv.org/pdf/2111.00113.pdf). A nice review of the randomized SVD can be found in a blog post by [Gregory Gundersen (2019)](https://gregorygundersen.com/blog/2019/01/17/randomized-svd/).