smaller fixes of notation
parent 4d84d9b4ed
commit 333c99ab6b
@@ -89,7 +89,7 @@ $$ \begin{aligned}
 \end{aligned} $$
 %
 where, as above, $d$ denotes the number of components in $\mathbf{u}$. As $\mathcal P$ maps one value of
-$\mathbf{u}$ to another, the Jacobian is square and symmetric here. Of course this isn't necessarily the case
+$\mathbf{u}$ to another, the Jacobian is square here. Of course this isn't necessarily the case
 for general model equations, but non-square Jacobian matrices would not cause any problems for differentiable
 simulations.
 
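To make the shape argument concrete, here is a small illustrative JAX sketch (the maps `P_sq` and `P_rect` are made-up placeholders, not solver code from the text); reverse-mode autodiff handles the non-square case just as easily:

```python
import jax
import jax.numpy as jnp

d, m = 4, 2
P_sq = lambda u: u * jnp.cos(u)      # placeholder map from R^d to R^d
P_rect = lambda u: u[:m] ** 2        # placeholder map from R^d to R^m

u = jnp.arange(1.0, d + 1.0)         # state with d components
print(jax.jacobian(P_sq)(u).shape)   # (4, 4): square Jacobian
print(jax.jacobian(P_rect)(u).shape) # (2, 4): non-square, equally unproblematic
```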
@@ -97,7 +97,7 @@ In practice, we rely on the _reverse mode_ differentiation that all modern DL
 frameworks provide, and focus on computing a matrix vector product of the Jacobian transpose
 with a vector $\mathbf{a}$, i.e. the expression:
 $
-( \frac{\partial \mathcal P_i }{ \partial \mathbf{u} } )^T \mathbf{a}
+\big( \frac{\partial \mathcal P_i }{ \partial \mathbf{u} } \big)^T \mathbf{a}
 $.
 If we'd need to construct and store all full Jacobian matrices that we encounter during training,
 this would cause huge memory overheads and unnecessarily slow down training.
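For illustration, a minimal JAX sketch of such a Jacobian-transpose product: `jax.vjp` evaluates $\big( \frac{\partial \mathcal P_i }{ \partial \mathbf{u} } \big)^T \mathbf{a}$ in one backward pass, without ever storing the $d \times d$ Jacobian (`P`, `u`, and `a` are made-up stand-ins, not the solvers from the text):

```python
import jax
import jax.numpy as jnp

def P(u):                  # placeholder for one solver step, R^d -> R^d
    return jnp.sin(u) * u

d = 1000
u = jnp.ones(d)            # current state
a = jnp.ones(d)            # the vector multiplying the transposed Jacobian

_, vjp_fn = jax.vjp(P, u)  # records the pullback; no d x d matrix is built
(jac_T_a,) = vjp_fn(a)     # evaluates (dP/du)^T a in a single backward pass
print(jac_T_a.shape)       # (1000,)
```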
@@ -117,7 +117,7 @@
 $$
 
 which is just the vector valued version of the "classic" chain rule
-$f(g(x))' = f'(g(x)) g'(x)$, and directly extends for larger numbers of composited functions, i.e. $i>2$.
+$f\big(g(x)\big)' = f'\big(g(x)\big) g'(x)$, and directly extends for larger numbers of composited functions, i.e. $i>2$.
 
 Here, the derivatives for $\mathcal P_1$ and $\mathcal P_2$ are still Jacobian matrices, but knowing that
 at the "end" of the chain we have our scalar loss (cf. {doc}`overview`), the right-most Jacobian will invariably
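The following sketch mirrors this chaining for $i=2$, with made-up stand-ins for $\mathcal P_1$, $\mathcal P_2$ and the loss: starting from the gradient of the scalar loss, two successive VJPs propagate it back to $\mathbf{u}$:

```python
import jax
import jax.numpy as jnp

def P1(u): return u * u            # placeholder first solver step
def P2(u): return jnp.tanh(u)      # placeholder second solver step
def L(u): return jnp.sum(u ** 2)   # scalar loss at the end of the chain

u = jnp.linspace(-1.0, 1.0, 8)

y1, vjp1 = jax.vjp(P1, u)          # forward through P1, keep the pullback
y2, vjp2 = jax.vjp(P2, y1)         # forward through P2, keep the pullback

a = jax.grad(L)(y2)                # right-most "Jacobian": just a vector
(a,) = vjp2(a)                     # (dP2/dy1)^T a
(grad_u,) = vjp1(a)                # (dP1/du)^T ... = dL/du

grad_ref = jax.grad(lambda v: L(P2(P1(v))))(u)
assert jnp.allclose(grad_u, grad_ref)
```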
@@ -68,5 +68,7 @@ goals of the next sections.
 - Largely incompatible with _classical_ numerical methods.
 - Accuracy of derivatives relies on learned representation.
 
-Next, let's look at how we can leverage numerical methods to improve the DL accuracy and efficiency
+To address these issues,
+we'll next look at how we can leverage existing numerical methods to improve the DL process
 by making use of differentiable solvers.
+
@@ -44,7 +44,7 @@ therefore help to _pin down_ the solution in certain places.
 Now our training objective becomes
 
 $$
-\text{arg min}_{\theta} \ \alpha_0 \sum_i \big( f(x_i ; \theta)-y^*_i \big)^2 + \alpha_1 R(x_i) ,
+\text{arg min}_{\theta} \ \sum_i \alpha_0 \big( f(x_i ; \theta)-y^*_i \big)^2 + \alpha_1 R(x_i) ,
 $$ (physloss-training)
 
 where $\alpha_{0,1}$ denote hyperparameters that scale the contribution of the supervised term and
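A minimal sketch of this combined objective, assuming a toy linear stand-in for $f$ and a made-up squared residual $R$ (the actual network and PDE are problem specific):

```python
import jax
import jax.numpy as jnp

alpha_0, alpha_1 = 1.0, 0.1      # weights of the supervised and residual term

def f(x, theta):                 # toy stand-in for the NN f(x; theta)
    return theta[0] * x + theta[1]

def R(x, theta):                 # made-up squared residual, here of u' = u
    dfdx = jax.grad(f)(x, theta)
    return (dfdx - f(x, theta)) ** 2

def loss(theta, xs, ys):
    sup = alpha_0 * (jax.vmap(lambda x: f(x, theta))(xs) - ys) ** 2
    res = alpha_1 * jax.vmap(lambda x: R(x, theta))(xs)
    return jnp.sum(sup + res)    # sum over samples i, as in the objective

theta = jnp.array([0.5, 0.0])
xs = jnp.linspace(0.0, 1.0, 16)
ys = jnp.exp(xs)                 # synthetic targets y*_i for the toy setup
print(jax.grad(loss)(theta, xs, ys))  # gradient for the arg min over theta
```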
@@ -100,7 +100,7 @@ Nicely enough, in this case we don't even need additional supervised samples, an
 An example implementation can be found in this [code repository](https://github.com/tum-pbs/CG-Solver-in-the-Loop).
 
 Overall, this variant 1 has a lot in common with _differentiable physics_ training (it's basically a subset). As we'll discuss differentiable physics in a lot more detail
-in {doc}`diffphys` and after, we'll focus on the direct NN representation (variant 2) from now on.
+in {doc}`diffphys` and after, we'll focus on direct NN representations (variant 2) from now on.
 
 ---
 
@@ -147,5 +147,6 @@ For higher order derivatives, such as $\frac{\partial^2 u}{\partial x^2}$, we ca
 The approach above gives us a method to include physical equations into DL learning as a soft constraint: the residual loss.
 Typically, this setup is suitable for _inverse problems_, where we have certain measurements or observations
 for which we want to find a PDE solution. Because of the high cost of the reconstruction (to be
-demonstrated in the following), the solution manifold shouldn't be overly complex. E.g., it is not possible
-to capture a wide range of solutions, such as with the previous supervised airfoil example, with such a physical residual loss.
+demonstrated in the following), the solution manifold shouldn't be overly complex. E.g., it is typically not possible
+to capture a wide range of solutions, such as with the previous supervised airfoil example, by only using a physical residual loss.
+
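Regarding the higher-order derivatives mentioned in the hunk header above, a small sketch (with a made-up $u$ in place of a trained network) shows how nesting reverse-mode calls yields $\frac{\partial^2 u}{\partial x^2}$:

```python
import jax
import jax.numpy as jnp

def u(x):                  # made-up stand-in for a learned solution u(x)
    return jnp.sin(2.0 * x)

du_dx = jax.grad(u)        # du/dx via reverse-mode autodiff
d2u_dx2 = jax.grad(du_dx)  # d^2u/dx^2 by simply nesting grad

x = 0.3
print(d2u_dx2(x), -4.0 * jnp.sin(2.0 * x))  # both give -4 sin(2x)
```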