smaller updates to figures and captions

NT 2021-04-01 16:53:41 +08:00
parent 1eba53dca5
commit 17389f35a3
5 changed files with 5 additions and 4 deletions


@@ -49,7 +49,7 @@ In this case we compute a (potentially very long) sequence of PDE solver steps i
```{figure} resources/diffphys-multistep.jpg
---
-height: 220px
+height: 180px
name: diffphys-mulitstep
---
Time stepping with interleaved DP and NN operations for $k$ solver iterations.
@@ -59,7 +59,7 @@ Note that this picture (and the ones before) have assumed an _additive_ influenc
DP setups with many time steps can be difficult to train: the gradients need to backpropagate through the full chain of PDE solver evaluations and NN evaluations. Typically, each of them represents a non-linear and complex function. Hence for larger numbers of steps, the vanishing and exploding gradient problem can make training difficult (see {doc}`diffphys-code-sol` for practical tips on how to alleviate this).
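To make this chain concrete, here is a minimal, self-contained PyTorch sketch (not taken from the book's code; `solver_step`, the tiny network, and all shapes are placeholder assumptions) that unrolls $k$ interleaved solver and NN steps and backpropagates through the whole sequence:

```python
import torch

# Hypothetical differentiable solver step: a simple explicit diffusion update
# stands in for a real PDE solver here.
def solver_step(u, dt=0.1):
    return u + dt * (torch.roll(u, 1, -1) - 2.0 * u + torch.roll(u, -1, -1))

# Small correction network with an additive influence, as in the figure above.
net = torch.nn.Sequential(torch.nn.Linear(32, 32), torch.nn.Tanh(), torch.nn.Linear(32, 32))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

u = torch.randn(1, 32)        # initial state (placeholder)
target = torch.zeros(1, 32)   # placeholder reference state
k = 20                        # number of interleaved solver/NN steps

for _ in range(k):
    u = solver_step(u)        # PDE solver update
    u = u + net(u)            # additive NN correction

loss = ((u - target) ** 2).mean()
loss.backward()               # gradients flow back through all k solver and NN evaluations
opt.step()
```

The repeated Jacobian products in this single `backward()` call are exactly where, for large $k$, vanishing or exploding gradients can appear.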
-## Alternatives - Noise
+## Alternatives: Noise
It is worth mentioning here that other works have proposed perturbing the inputs and
the iterations at training time with noise {cite}`sanchez2020learning` (somewhat similar to


@@ -147,7 +147,7 @@ to compute the updates (and derivatives) for these operators.
%in practice break down into larger, monolithic components
E.g., as this process is very similar to adjoint method optimizations, we can re-use many of the techniques
that were developed in this field, or build on established numerical methods. For instance,
-we could leverage the $O(n)$ complexity of multigrid solvers for matrix inversion.
+we could leverage the $O(n)$ runtime of multigrid solvers for matrix inversion.
The flip side of this approach is that it requires some understanding of the problem at hand,
and of the numerical methods. Also, a given solver might not provide gradient calculations out of the box.
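As an illustration of attaching a hand-written derivative to such a solver, the following hypothetical PyTorch sketch wraps one solver step in a custom autograd function and supplies the adjoint manually; the linear update `u @ A.t()` is only a stand-in for a real solver call:

```python
import torch

class WrappedSolverStep(torch.autograd.Function):
    """Wraps an external solver step that provides no gradients by itself."""

    @staticmethod
    def forward(ctx, u):
        # Stand-in for the external solver: a fixed linear update u -> u A^T.
        A = 0.9 * torch.eye(u.shape[-1])
        ctx.save_for_backward(A)
        return u @ A.t()

    @staticmethod
    def backward(ctx, grad_output):
        # Hand-written adjoint of the linear step; a real implementation would
        # call the solver's adjoint (e.g. an efficient multigrid solve) here.
        (A,) = ctx.saved_tensors
        return grad_output @ A

u = torch.randn(4, 16, requires_grad=True)
loss = WrappedSolverStep.apply(u).pow(2).sum()
loss.backward()               # u.grad is now filled via the manual adjoint
```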
@@ -161,13 +161,14 @@ never produces the parameter $\nu$ in the example above, and it doesn't appear i
loss formulation, we will never encounter a $\partial/\partial \nu$ derivative
in our backpropagation step.
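This can be seen directly in a few lines of (hypothetical) PyTorch: a parameter that never enters the forward computation or the loss simply receives no gradient:

```python
import torch

nu = torch.tensor(0.5, requires_grad=True)   # viscosity-like parameter, never used below
u = torch.randn(8, requires_grad=True)

loss = (u ** 2).mean()    # neither the forward pass nor the loss depends on nu
loss.backward()

print(u.grad is None)     # False: the loss does depend on u
print(nu.grad is None)    # True:  no d(loss)/d(nu) is ever computed
```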
The following figure summarizes the DP-based learning approach, and illustrates the sequence of operations that are typically processed within a single PDE solve. As many of the operations are non-linear in practice, this often leads to a challenging learning task for the NN:
```{figure} resources/diffphys-overview.jpg
---
height: 220px
name: diffphys-full-overview
---
-TODO , details...
+DP learning with a PDE solver that consists of $m$ individual operators $\mathcal P_i$. The gradient travels backward through all $m$ operators before influencing the network weights $\theta$.
```

Binary file not shown (before: 177 KiB, after: 235 KiB)

Binary file not shown (before: 101 KiB, after: 88 KiB)

Binary file not shown.