smaller fixes of notation
parent 4d84d9b4ed
commit 333c99ab6b
@@ -89,7 +89,7 @@ $$ \begin{aligned}
 \end{aligned} $$
 %
 where, as above, $d$ denotes the number of components in $\mathbf{u}$. As $\mathcal P$ maps one value of
-$\mathbf{u}$ to another, the Jacobian is square and symmetric here. Of course this isn't necessarily the case
+$\mathbf{u}$ to another, the Jacobian is square here. Of course this isn't necessarily the case
 for general model equations, but non-square Jacobian matrices would not cause any problems for differentiable
 simulations.
 
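
For intuition, the shape of this Jacobian can be inspected directly in e.g. PyTorch. Below is a minimal sketch with a made-up operator `P` standing in for a solver step; as the next hunk discusses, the full matrix is never materialized like this in practice:

```python
import torch

def P(u):
    # stand-in for a differentiable operator mapping u to a new state
    return torch.sin(u) * torch.roll(u, 1)

u = torch.randn(4)
J = torch.autograd.functional.jacobian(P, u)
print(J.shape)  # torch.Size([4, 4]): square, as P maps R^4 to R^4
```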
@@ -97,7 +97,7 @@ In practice, we rely on the _reverse mode_ differentiation that all modern DL
 frameworks provide, and focus on computing a matrix vector product of the Jacobian transpose
 with a vector $\mathbf{a}$, i.e. the expression:
 $
-( \frac{\partial \mathcal P_i }{ \partial \mathbf{u} } )^T \mathbf{a}
+\big( \frac{\partial \mathcal P_i }{ \partial \mathbf{u} } \big)^T \mathbf{a}
 $.
 If we'd need to construct and store all full Jacobian matrices that we encounter during training,
 this would cause huge memory overheads and unnecessarily slow down training.
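
For illustration, this transposed-Jacobian product is exactly what reverse mode computes, e.g. via `grad_outputs` in PyTorch. A minimal sketch, reusing the made-up operator `P` from above:

```python
import torch

def P(u):
    # stand-in for one differentiable solver step
    return torch.sin(u) * torch.roll(u, 1)

u = torch.randn(1000, requires_grad=True)
a = torch.randn(1000)  # the vector multiplying the transposed Jacobian

y = P(u)
# one backward pass yields (dP/du)^T a; the 1000x1000 Jacobian
# itself is never constructed or stored
(vjp,) = torch.autograd.grad(y, u, grad_outputs=a)
```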
@@ -117,7 +117,7 @@ $$
 $$
 
 which is just the vector valued version of the "classic" chain rule
-$f(g(x))' = f'(g(x)) g'(x)$, and directly extends for larger numbers of composited functions, i.e. $i>2$.
+$f\big(g(x)\big)' = f'\big(g(x)\big) g'(x)$, and directly extends for larger numbers of composited functions, i.e. $i>2$.
 
 Here, the derivatives for $\mathcal P_1$ and $\mathcal P_2$ are still Jacobian matrices, but knowing that
 at the "end" of the chain we have our scalar loss (cf. {doc}`overview`), the right-most Jacobian will invariably
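
To spell out the evaluation order this implies, the two-operator chain can be transposed (a sketch in the notation above, writing $L$ for the scalar loss):

$$
\Big( \frac{\partial L}{\partial \mathbf{u}} \Big)^T = \Big( \frac{\partial \mathcal P_1}{\partial \mathbf{u}} \Big)^T \Big( \frac{\partial \mathcal P_2}{\partial \mathcal P_1} \Big)^T \Big( \frac{\partial L}{\partial \mathcal P_2} \Big)^T
$$

In this form the loss derivative is the right-most factor and, $L$ being scalar, is a vector; hence every intermediate result stays a vector, and each step is one of the $\big( \frac{\partial \mathcal P_i }{ \partial \mathbf{u} } \big)^T \mathbf{a}$ products from above.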
@@ -68,5 +68,7 @@ goals of the next sections.
 - Largely incompatible with _classical_ numerical methods.
 - Accuracy of derivatives relies on learned representation.
 
-Next, let's look at how we can leverage numerical methods to improve the DL accuracy and efficiency
+To address these issues,
+we'll next look at how we can leverage existing numerical methods to improve the DL process
 by making use of differentiable solvers.
+
@@ -44,7 +44,7 @@ therefore help to _pin down_ the solution in certain places.
 Now our training objective becomes
 
 $$
-\text{arg min}_{\theta} \ \alpha_0 \sum_i \big( f(x_i ; \theta)-y^*_i \big)^2 + \alpha_1 R(x_i) ,
+\text{arg min}_{\theta} \ \sum_i \alpha_0 \big( f(x_i ; \theta)-y^*_i \big)^2 + \alpha_1 R(x_i) ,
 $$ (physloss-training)
 
 where $\alpha_{0,1}$ denote hyperparameters that scale the contribution of the supervised term and
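
As a minimal sketch of the objective `(physloss-training)` in PyTorch (here `f` is the NN and `R` a hypothetical function evaluating the physics residual at `x`; both names are placeholders):

```python
import torch

def combined_loss(f, R, x, y_star, alpha_0=1.0, alpha_1=1.0):
    # sum_i  alpha_0 * (f(x_i; theta) - y*_i)^2  +  alpha_1 * R(x_i)
    supervised = alpha_0 * (f(x) - y_star) ** 2
    physics = alpha_1 * R(x)
    return (supervised + physics).sum()
```

Minimizing this loss with respect to the parameters of `f` then realizes the $\text{arg min}_{\theta}$ above.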
@@ -100,7 +100,7 @@ Nicely enough, in this case we don't even need additional supervised samples, an
 An example implementation can be found in this [code repository](https://github.com/tum-pbs/CG-Solver-in-the-Loop).
 
 Overall, this variant 1 has a lot in common with _differentiable physics_ training (it's basically a subset). As we'll discuss differentiable physics in a lot more detail
-in {doc}`diffphys` and after, we'll focus on the direct NN representation (variant 2) from now on.
+in {doc}`diffphys` and after, we'll focus on direct NN representations (variant 2) from now on.
 
 ---
 
@@ -147,5 +147,6 @@ For higher order derivatives, such as $\frac{\partial^2 u}{\partial x^2}$, we ca
 The approach above gives us a method to include physical equations into DL learning as a soft constraint: the residual loss.
 Typically, this setup is suitable for _inverse problems_, where we have certain measurements or observations
 for which we want to find a PDE solution. Because of the high cost of the reconstruction (to be
-demonstrated in the following), the solution manifold shouldn't be overly complex. E.g., it is not possible
-to capture a wide range of solutions, such as with the previous supervised airfoil example, with such a physical residual loss.
+demonstrated in the following), the solution manifold shouldn't be overly complex. E.g., it is typically not possible
+to capture a wide range of solutions, such as with the previous supervised airfoil example, by only using a physical residual loss.
+
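
For the higher-order derivatives mentioned in the hunk context, reverse mode can simply be applied twice. A sketch, where `u_net` is a hypothetical network mapping positions `x` to a solution `u` point-wise:

```python
import torch

def d2u_dx2(u_net, x):
    # second spatial derivative of the network output via nested autodiff;
    # the sum() trick assumes each u_i depends only on its own x_i
    x = x.detach().requires_grad_(True)
    u = u_net(x)
    (du,) = torch.autograd.grad(u.sum(), x, create_graph=True)
    # create_graph=True keeps the result differentiable, so a residual
    # loss built from d2u can still be backpropagated to the weights
    (d2u,) = torch.autograd.grad(du.sum(), x, create_graph=True)
    return d2u
```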
Binary file not shown.