more text

This commit is contained in:
NT 2021-01-12 11:50:42 +08:00
parent 03a4c7ef29
commit 0063c71c05
12 changed files with 447 additions and 169 deletions

View File

@ -1,9 +1,9 @@
# PBDL Table of content (cf https://jupyterbook.org/customize/toc.html)
#
- file: intro
- file: overview.md
  sections:
  - file: overview-equations.md
  - file: overview-burgers-forw.ipynb
  - file: overview-ns-forw.ipynb
- file: supervised
@ -12,6 +12,7 @@
- file: physicalloss
  sections:
  - file: physicalloss-code.ipynb
  - file: physicalloss-discuss.md
- file: diffphys
  sections:
  - file: diffphys-code-gradient.ipynb
@ -23,3 +24,4 @@
- file: markdown
- file: notebooks
- file: references
- file: notation

View File

@ -29,7 +29,7 @@ For the PINN representation with fully-connected networks on the other hand, we
The following table summarizes these findings:
| Method | Pro | Con |
|----------|-------------|------------|
| **PINN** | - Analytic derivatives via back-propagation | - Expensive evaluation of NN, as well as derivative calculations |
|          | - Simple to implement | - Incompatible with existing numerical methods |

View File

@ -73,9 +73,15 @@ The contents of the following files would not have been possible without the hel
- Ms. y
- ...
% tests...
% some markdown tests follow ...
---
a b c
```{admonition} My title2
@ -86,6 +92,7 @@ See also... Test link: {doc}`supervised`
✅ Do this , ❌ Don't do this
% ----------------
---
@ -152,6 +159,6 @@ time series, sequence prediction?] {cite}`wiewel2019lss,bkim2019deep,wiewel2020l
_Misc jupyter book TODOs_
- Fix latex PDF output
- How to include links to papers in the bibtex references?

View File

@ -1,5 +1,8 @@
Old Jupyter Book Reference Stuff
=======================
There are many ways to write content in Jupyter Book. This short section
covers a few tips for how to do so.
TODO remove sometime...

notation.md Normal file
View File

@ -0,0 +1,38 @@
# Notation and Abbreviations
## Math notation:
| Symbol | Meaning |
| --- | --- |
| $A$ | matrix |
| $\eta$ | learning rate or step size |
| $\Gamma$ | boundary of computational domain $\Omega$ |
| $f()$ | approximated version of $f^{*}$ |
| $f^{*}()$ | generic function to be approximated, typically unknown |
| $\Omega$ | computational domain |
| $\mathcal P$ | physical model, PDE |
| $\theta$ | neural network params |
| $t$ | time dimension |
| $\mathbf{u}$ | vector-valued velocity |
| $x$ | neural network input or spatial coordinate |
| $y$ | neural network output |
## Summary of the most important abbreviations:
| Abbreviation | Meaning |
| --- | --- |
| CNN | Convolutional neural network |
| DL | Deep learning |
| NN | Neural network |
| PBDL | Physics-based deep learning |
% test table formatting in markdown
% | | Sentence # | Word | POS | Tag |
% |---:|:-------------|:-----------|:------|:------|
% | 1 | Sentence: 1 | They | PRP | O |
% | 2 | Sentence: 1 | marched | VBD | O |

overview-equations.md Normal file
View File

@ -0,0 +1,138 @@
Model Equations
============================
Overview of the PDE models to be used later on ...
domain $\Omega$, boundary $\Gamma$
continuous functions, but few assumptions about continuity for now...
```{admonition} Notation and abbreviations
:class: seealso
If unsure, please check the summary of our mathematical notation
and the abbreviations used in: {doc}`notation`, at the bottom of the left panel.
```
% \newcommand{\pde}{\mathcal{P}} % PDE ops
% \newcommand{\pdec}{\pde_{s}}
% \newcommand{\manifsrc}{\mathscr{S}} % coarse / "source"
% \newcommand{\pder}{\pde_{R}}
% \newcommand{\manifref}{\mathscr{R}}
% vc - coarse solutions
% \renewcommand{\vc}[1]{\vs_{#1}} % plain coarse state at time t
% \newcommand{\vcN}{\vs} % plain coarse state without time
% vc - coarse solutions, modified by correction
% \newcommand{\vct}[1]{\tilde{\vs}_{#1}} % modified / over time at time t
% \newcommand{\vctN}{\tilde{\vs}} % modified / over time without time
% vr - fine/reference solutions
% \renewcommand{\vr}[1]{\mathbf{r}_{#1}} % fine / reference state at time t , never modified
% \newcommand{\vrN}{\mathbf{r}} % plain coarse state without time
% \newcommand{\project}{\mathcal{T}} % transfer operator fine <> coarse
% \newcommand{\loss}{\mathcal{L}} % generic loss function
% \newcommand{\nn}{f_{\theta}}
% \newcommand{\dt}{\Delta t} % timestep
% \newcommand{\corrPre}{\mathcal{C}_{\text{pre}}} % analytic correction , "pre computed"
% \newcommand{\corr}{\mathcal{C}} % just C for now...
% \newcommand{\nnfunc}{F} % {\text{NN}}
Some notation from SoL, move with parts from overview into "appendix"?
We typically solve a discretized PDE $\mathcal{P}$ by performing discrete time steps of size $\Delta t$.
Each subsequent step can depend on any number of previous steps,
$\mathbf{u}(\mathbf{x},t+\Delta t) = \mathcal{P}(\mathbf{u}(\mathbf{x},t), \mathbf{u}(\mathbf{x},t-\Delta t),...)$,
where
$\mathbf{x} \in \Omega \subseteq \mathbb{R}^d$ for the domain $\Omega$ in $d$
dimensions, and $t \in \mathbb{R}^{+}$.
Numerical methods yield approximations of a smooth function such as $\mathbf{u}$ in a discrete
setting and invariably introduce errors. These errors can be measured in terms
of the deviation from the exact analytical solution.
For discrete simulations of
PDEs, these errors are typically expressed as a function of the truncation, $O(\Delta t^k)$
for a given step size $\Delta t$ and an exponent $k$ that is discretization dependent.
The following PDEs typically work with a continuous
velocity field $\mathbf{u}$ with $d$ dimensions and components, i.e.,
$\mathbf{u}(\mathbf{x},t): \mathbb{R}^d \rightarrow \mathbb{R}^d $.
For discretized versions below, $d_{i,j}$ will denote the dimensionality
of a field such as the velocity,
with domain size $d_{x},d_{y},d_{z}$ for source and reference in 3D.
% with $i \in \{s,r\}$ denoting source/inference manifold and reference manifold, respectively.
%This yields $\vc{} \in \mathbb{R}^{d \times d_{s,x} \times d_{s,y} \times d_{s,z} }$ and $\vr{} \in \mathbb{R}^{d \times d_{r,x} \times d_{r,y} \times d_{r,z} }$
%Typically, $d_{r,i} > d_{s,i}$ and $d_{z}=1$ for $d=2$.
For all PDEs, we use non-dimensional parametrizations as outlined below,
and the components of the velocity vector are typically denoted by $x,y,z$ subscripts, i.e.,
$\mathbf{u} = (u_x,u_y,u_z)^T$ for $d=3$.
Burgers' equation in 2D represents a well-studied advection-diffusion PDE:
$\frac{\partial u_x}{\partial{t}} + \mathbf{u} \cdot \nabla u_x =
\nu \nabla\cdot \nabla u_x + g_x(t),
\\
\frac{\partial u_y}{\partial{t}} + \mathbf{u} \cdot \nabla u_y =
\nu \nabla\cdot \nabla u_y + g_y(t)
$,
where $\nu$ and $\mathbf{g}$ denote diffusion constant and external forces, respectively.
Burgers' equation in 1D without forces with $u_x = u$:
%\begin{eqnarray}
$\frac{\partial u}{\partial{t}} + u \nabla u = \nu \nabla \cdot \nabla u $ .
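To make the discrete time stepping above concrete, here is a minimal sketch of an explicit finite-difference step for this 1D Burgers equation. The grid size, viscosity and step sizes are illustrative assumptions, not values used later on:

```python
import numpy as np

# illustrative discretization parameters (assumptions for this sketch)
N, nu = 128, 0.01
dx, dt = 2.0 / N, 0.0005

x = np.linspace(-1.0, 1.0, N, endpoint=False)
u = -np.sin(np.pi * x)  # a smooth initial condition

def burgers_step(u, dt, dx, nu):
    """One explicit Euler step of u_t + u u_x = nu u_xx with periodic boundaries."""
    u_x  = (np.roll(u, -1) - np.roll(u, 1)) / (2.0 * dx)        # central difference
    u_xx = (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / dx**2   # discrete Laplacian
    return u + dt * (-u * u_x + nu * u_xx)

for _ in range(100):    # advancing the state corresponds to applying P repeatedly
    u = burgers_step(u, dt, dx, nu)
```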
---
Later on, additional equations...
Navier-Stokes, in 2D:
$
\frac{\partial u_x}{\partial{t}} + \mathbf{u} \cdot \nabla u_x =
- \frac{1}{\rho}\nabla{p} + \nu \nabla\cdot \nabla u_x
\\
\frac{\partial u_y}{\partial{t}} + \mathbf{u} \cdot \nabla u_y =
- \frac{1}{\rho}\nabla{p} + \nu \nabla\cdot \nabla u_y
\\
\text{subject to} \quad \nabla \cdot \mathbf{u} = 0
$
Navier-Stokes, in 2D with Boussinesq:
%$\frac{\partial u_x}{\partial{t}} + \mathbf{u} \cdot \nabla u_x$
%$ -\frac{1}{\rho} \nabla p $
$
\frac{\partial u_x}{\partial{t}} + \mathbf{u} \cdot \nabla u_x = - \frac{1}{\rho} \nabla p
\\
\frac{\partial u_y}{\partial{t}} + \mathbf{u} \cdot \nabla u_y = - \frac{1}{\rho} \nabla p + \eta d
\\
\text{subject to} \quad \nabla \cdot \mathbf{u} = 0,
\\
\frac{\partial d}{\partial{t}} + \mathbf{u} \cdot \nabla d = 0
$
Navier-Stokes, in 3D:
$
\frac{\partial u_x}{\partial{t}} + \mathbf{u} \cdot \nabla u_x = - \frac{1}{\rho} \nabla p + \nu \nabla\cdot \nabla u_x
\\
\frac{\partial u_y}{\partial{t}} + \mathbf{u} \cdot \nabla u_y = - \frac{1}{\rho} \nabla p + \nu \nabla\cdot \nabla u_y
\\
\frac{\partial u_z}{\partial{t}} + \mathbf{u} \cdot \nabla u_z = - \frac{1}{\rho} \nabla p + \nu \nabla\cdot \nabla u_z
\\
\text{subject to} \quad \nabla \cdot \mathbf{u} = 0.
$

View File

@ -1,12 +1,14 @@
Overview
============================
The following collection of digital documents, i.e. "book",
targets _Physics-Based Deep Learning_ techniques.
By that we mean combining physical modeling and numerical simulations with
methods based on artificial neural networks, i.e., deep learning (DL).
The general direction of Physics-Based Deep Learning represents a very
active, quickly growing and exciting field of research -- we want to provide
a starting point for new researchers as well as a hands-on introduction into
state-of-the-art research topics.
## Motivation
@ -50,8 +52,8 @@ whether key phenomena are visible in the solutions or not.
:class: tip
Thus, a key aspect that we want to address in the following is:
- explain how to use DL,
- how to combine it with existing knowledge of physics and simulations,
- **without throwing away** all existing numerical knowledge and techniques!
```
Rather, we want to build on all the neat techniques that we have
@ -112,7 +114,7 @@ starting points with code examples, and illustrate pros and cons of the
different approaches. In particular, it's important to know in which scenarios
each of the different techniques is particularly useful.
```{admonition} You can skip ahead if...
:class: tip
- you're very familiar with numerical methods and PDE solvers, and want to get started with DL topics right away. The _Supervised Learning_ chapter is a good starting point then.
@ -138,37 +140,13 @@ PINNs ... and more ...
## Deep Learning and Neural Networks
Very brief intro, basic equations... approximate $f^*(x)=y$ with NN $f(x;\theta)$ ...
learn via GD, $\partial f / \partial \theta$
Read chapters 6 to 9 of the [Deep Learning book](https://www.deeplearningbook.org),
especially about [MLPs](https://www.deeplearningbook.org/contents/mlp.html) and
"Conv-Nets", i.e. [CNNs](https://www.deeplearningbook.org/contents/convnets.html).
**Note:** The classic distinction between _classification_ and _regression_ problems is not so important here:
we only deal with _regression_ problems in the following.
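To illustrate the $\partial f / \partial \theta$ step, here is a minimal sketch of a single gradient-descent update for a small fully-connected network. TensorFlow is assumed purely for illustration, and the network size, data and learning rate are arbitrary placeholders:

```python
import tensorflow as tf

# a tiny fully-connected NN f(x; theta); sizes chosen arbitrarily
f = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="tanh"),
    tf.keras.layers.Dense(1),
])

x = tf.random.uniform((64, 1))   # stand-in inputs
y = tf.sin(3.0 * x)              # stand-in targets for an unknown f*(x)

with tf.GradientTape() as tape:
    loss = tf.reduce_mean((f(x) - y) ** 2)           # L2 loss
grads = tape.gradient(loss, f.trainable_variables)   # d loss / d theta

eta = 0.01  # learning rate / step size
for w, g in zip(f.trainable_variables, grads):
    w.assign_sub(eta * g)        # one gradient-descent step on theta
```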

physicalloss-discuss.md Normal file
View File

@ -0,0 +1,37 @@
Discussion of Physical Soft-Constraints
=======================
The good news so far is that we have a DL method that can include
physical laws in the form of soft constraints, by minimizing residuals.
However, as the very simple previous example illustrates, this is just a conceptual
starting point.
On the positive side, we can leverage DL frameworks with backpropagation to compute
the derivatives of the model. At the same time, this puts us at the mercy of the learned
representation regarding the reliability of these derivatives. Also, each derivative
requires backpropagation through the full network, which can be very slow. Especially so
for higher-order derivatives.
And while the setup is relatively simple, it is generally difficult to control. The NN
has flexibility to refine the solution by itself, but at the same time, tricks are necessary
when it doesn't pick the right regions of the solution.
In general, a fundamental drawback of this approach is that it does not combine well with traditional
numerical techniques. E.g., the learned representation is not suitable to be refined with
a classical iterative solver such as the conjugate gradient method. This means many
powerful techniques that were developed in the past decades cannot be used in this context.
Bringing these numerical methods back into the picture will be one of the central
goals of the next sections.
✅ Pro:
- uses physical model
- derivatives via backpropagation
❌ Con:
- slow ...
- only soft constraints
- largely incompatible with _classical_ numerical methods
- derivatives rely on learned representation
Next, let's look at how we can leverage numerical methods to improve the DL accuracy and efficiency
by making use of differentiable solvers.

View File

@ -1,134 +1,98 @@
Physical Loss Terms
=======================
The supervised setting of the previous sections can quickly
yield approximate solutions with a fairly simple training process, but what's
quite sad to see here is that we only use physical models and numerics
as an "external" tool to produce a big pile of data 😢.
## Using Physical Models
We can improve this setting by trying to bring the model equations (or parts thereof)
into the training process. E.g., given a PDE for $\mathbf{u}(x,t)$ with a time evolution,
we can typically express it in terms of a function $\mathcal F$ of the derivatives
of $\mathbf{u}$ via
$
\mathbf{u}_t = \mathcal F ( \mathbf{u}_{x}, \mathbf{u}_{xx}, ... \mathbf{u}_{x..x})
$,
where the $_{x}$ subscripts denote spatial derivatives of higher order.
In this context we can employ DL by approximating the unknown $\mathbf{u}$ itself
with a NN, denoted by $\tilde{\mathbf{u}}$. If the approximation is accurate, the PDE
naturally should be satisfied, i.e., the residual $R$ should be equal to zero:
$
R = \mathbf{u}_t - \mathcal F ( \mathbf{u}_{x}, \mathbf{u}_{xx}, ... \mathbf{u}_{x..x}) = 0
$
This nicely integrates with the objective for training a neural network: similar to before
we can collect sample solutions
$[x_0,y_0], ...[x_n,y_n]$ for $\mathbf{u}$ with $\mathbf{u}(x)=y$.
This is typically important, as most practical PDEs we encounter do not have unique solutions
unless initial and boundary conditions are specified. Hence, if we only consider $R$ we might
get solutions with random offset or other undesirable components. The supervised sample points therefore
help to _pin down_ the solution in certain places.
Now our training objective becomes
$\text{arg min}_{\theta} \ \alpha_0 \sum_i (f(x_i ; \theta)-y_i)^2 + \alpha_1 \sum_i R(x_i) $,
where $\alpha_{0,1}$ denote hyper parameters that scale the contribution of the supervised term and
the residual term, respectively. We could of course add additional residual terms with suitable scaling factors here.
Note that, similar to the data samples used for supervised training, we have no guarantees that the
residual terms $R$ will actually reach zero during training. The non-linear optimization of the training process
will minimize the supervised and residual terms as much as possible, but worst case, large non-zero residual
contributions can remain. We'll look at this in more detail in the upcoming code example, for now it's important
to remember that physical constraints in this way only represent _soft-constraints_, without guarantees
of minimizing these constraints.
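As a minimal sketch, such a combined objective could look as follows. TensorFlow is assumed for illustration, `alpha_0`, `alpha_1` and the `residual` function are placeholders, and the residual is squared here so that minimizing it drives it towards zero:

```python
import tensorflow as tf

alpha_0, alpha_1 = 1.0, 1.0   # illustrative weights for the two loss terms

def combined_loss(f, x, y, residual):
    """Supervised L2 term plus a (squared) PDE residual term, both at the sample points x."""
    supervised = tf.reduce_mean((f(x) - y) ** 2)
    physics    = tf.reduce_mean(residual(f, x) ** 2)
    return alpha_0 * supervised + alpha_1 * physics
```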
## Neural network derivatives
In order to compute the residuals at training time, it would be possible to store
the unknowns of $\mathbf{u}$ on a computational mesh, e.g., a grid, and discretize the equations of
$R$ there. This has a fairly long "tradition" in DL, and was proposed by Tompson et al. {cite}`tompson2017` early on.
Instead, a more widely used variant of employing physical soft-constraints {cite}`raissi2018hiddenphys`
uses fully connected NNs to represent $\mathbf{u}$. This has some interesting pros and cons that we'll outline in the following.
Due to the popularity of this version, we'll also focus on it in the following code examples and comparisons.
The central idea here is that the aforementioned general function $f$ that we're after in our learning problems
can be seen as a representation of a physical field. Thus, $\mathbf{u}(x)$ will
be turned into $\mathbf{u}(x, \theta)$, where we choose $\theta$ such that the solution to $\mathbf{u}$ is
represented as precisely as possible.
One nice side effect of this viewpoint is that NN representations inherently support the calculation of derivatives.
The derivative $\partial f / \partial \theta$ was a key building block for learning via gradient descent, as explained
in {doc}`overview`. Here, we can use the same tools to compute spatial derivatives such as $\partial \mathbf{u} / \partial x$.
Note that above for $R$ we've written this derivative in the shortened notation as $\mathbf{u}_{x}$.
For functions over time this of course also works for $\partial \mathbf{u} / \partial t$, i.e. $\mathbf{u}_{t}$ in the notation above.
Thus, for some generic $R$, made up of $\mathbf{u}_t$ and $\mathbf{u}_{x}$ terms, we can rely on the back-propagation algorithm
of DL frameworks to compute these derivatives once we have a NN that represents $\mathbf{u}$. Essentially, this gives us a
function (the NN) that receives space and time coordinates to produce a solution for $\mathbf{u}$. Hence, the input is typically
quite low-dimensional, e.g., 3+1 values for a 3D case over time, and often produces a scalar value or a spatial vector.
Due to the lack of explicit spatial sampling points, an MLP, i.e., a fully-connected NN, is the architecture of choice here.
To pick a simple example, Burgers equation in 1D,
$\frac{\partial u}{\partial{t}} + u \nabla u = \nu \nabla \cdot \nabla u $ , we can directly
formulate a loss term $R = \frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} - \nu \frac{\partial^2 u}{\partial x^2}$ that should be minimized as much as possible at training time. For each of the terms, e.g. $\frac{\partial u}{\partial x}$,
we can simply query the DL framework that realizes $u$ to obtain the corresponding derivative.
For higher order derivatives, such as $\frac{\partial^2 u}{\partial x^2}$, we can typically simply query the derivative function of the framework twice. In the following section, we'll give a specific example of how that works in tensorflow.
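As a rough preview of that idea, here is a sketch only, assuming TensorFlow 2.x; `u_net` is a hypothetical MLP taking $(x,t)$ as input:

```python
import tensorflow as tf

# hypothetical MLP representing u(x, t); architecture chosen arbitrarily
u_net = tf.keras.Sequential([
    tf.keras.layers.Dense(20, activation="tanh"),
    tf.keras.layers.Dense(20, activation="tanh"),
    tf.keras.layers.Dense(1),
])

def burgers_residual(x, t, nu=0.01):
    """R = u_t + u u_x - nu u_xx, with all derivatives obtained via back-propagation."""
    with tf.GradientTape(persistent=True) as tape2:
        tape2.watch([x, t])
        with tf.GradientTape(persistent=True) as tape1:
            tape1.watch([x, t])
            u = u_net(tf.concat([x, t], axis=1))
        u_x = tape1.gradient(u, x)   # first derivatives, queried from the framework
        u_t = tape1.gradient(u, t)
    u_xx = tape2.gradient(u_x, x)    # second derivative via a second (outer) tape
    return u_t + u * u_x - nu * u_xx

x = tf.random.uniform((100, 1))
t = tf.random.uniform((100, 1))
R = burgers_residual(x, t)   # residual values to be driven towards zero during training
```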
## Summary so far
This gives us a method to include physical equations into DL learning as a soft-constraint.
Typically, this setup is suitable for _inverse_ problems, where we have certain measurements or observations
that we wish to find a solution of a model PDE for. Because of the high expense of the reconstruction (to be
demonstrated in the following), the solution manifold typically shouldn't be overly complex. E.g., it is difficult
to capture a wide range of solutions, such as the previous supervised airfoil example, in this way.
```{figure} resources/placeholder.png
---
height: 220px
name: pinn-training
---
TODO, visual overview of PINN training
```

View File

@ -762,3 +762,34 @@
PUBLISHER = {Dept. of Computer Science 10, University of Erlangen-Nuremberg}
}
% ----------------- external --------------------
@inproceedings{tompson2017,
title = {Accelerating Eulerian Fluid Simulation With Convolutional Networks},
booktitle = {Proceedings of Machine Learning Research},
author = {Tompson, Jonathan and Schlachter, Kristofer and Sprechmann, Pablo and Perlin, Ken},
year = 2017,
pages = {3424--3433}
}
@article{raissi2018hiddenphys,
title={Hidden physics models: Machine learning of nonlinear partial differential equations},
author={Raissi, Maziar and Karniadakis, George Em},
journal={Journal of Computational Physics},
volume={357},
pages={125--141},
year={2018},
publisher={Elsevier}
}

resources/placeholder.png Normal file

View File

Binary file not shown (binary image, 20 KiB).

View File

@ -1,5 +1,85 @@
Supervised Learning
=======================
_Supervised_ here essentially means: "doing things the old fashioned way". Old fashioned in the context of
deep learning (DL), of course, so it's still fairly new, and old fashioned also doesn't always mean bad.
In a way this viewpoint is a starting point for all projects one would encounter in the context of DL, and
hence is worth studying. And although it typically yields inferior results to approaches that more tightly
couple with physics, it nonetheless can be the only choice in certain application scenarios where no good
model equations exist.
## Problem Setting
For supervised learning, we're faced with an
unknown function $f^*(x)=y$. We collect lots of pairs of data $[x_0,y_0], ...[x_n,y_n]$ (the training data set)
and directly train a NN to represent an approximation of $f^*$, denoted as $f$, such
that $f(x)=y$.
The $f$ we can obtain is typically not exact,
but instead we obtain it via a minimization problem:
by adjusting the weights $\theta$ of our representation $f$ such that
$\text{arg min}_{\theta} \sum_i (f(x_i ; \theta)-y_i)^2$.
This will give us $\theta$ such that $f(x;\theta) \approx y$ as accurately as possible given
our choice of $f$ and the hyper parameters for training. Note that above we've assumed
the simplest case of an $L^2$ loss. A more general version would use an error metric $e(x,y)$
to be minimized via $\text{arg min}_{\theta} \sum_i e( f(x_i ; \theta) , y_i)$. The choice
of a suitable metric is a topic we will get back to later on.
Irrespective of our choice of metric, this formulation
gives the actual "learning" process for a supervised approach.
The training data typically needs to be of substantial size, and hence it is attractive
to use numerical simulations to produce a large number of training input-output pairs.
This means that the training process uses a set of model equations, and approximates
them numerically, in order to train the NN representation $f$. This
has a bunch of advantages, e.g., we don't have measurement noise of real-world devices
and we don't need manual labour to annotate a large number of samples to get training data.
On the other hand, this approach inherits the common challenges of replacing experiments
with simulations: first, we need to ensure the chosen model has enough power to predict the
behavior of real-world phenomena that we're interested in.
In addition, the numerical approximations have numerical errors
which need to be kept small enough for a chosen application. As these topics are studied in depth
for classical simulations, the existing knowledge can likewise be leveraged to
set up DL training tasks.
```{figure} resources/placeholder.png
---
height: 220px
name: supervised-training
---
TODO, visual overview of supervised training
```
## Applications
Let's directly look at an example with a fairly complicated context:
we have a turbulent airflow around wing profiles, and we'd like to know the average motion
and pressure distribution around this airfoil for different Reynolds numbers and angles of attack.
Thus, given an airfoil shape, Reynolds numbers, and angle of attack, we'd like to obtain
a velocity field $\mathbf{u}$ and a pressure field $p$ in a computational domain $\Omega$
around the airfoil in the center of $\Omega$.
This is classically approximated with _Reynolds-Averaged Navier Stokes_ (RANS) models, and this
setting is still one of the most widely used applications of Navier-Stokes solvers in industry.
However, instead of relying on traditional numerical methods to solve the RANS equations,
we now aim at training a neural network that completely bypasses the numerical solver,
and produces the solution in terms of $\mathbf{u}$ and $p$.
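Purely as an illustration of how such a network could be set up, here is a sketch that encodes inputs and outputs as image-like channels on a Cartesian grid. The channel layout, resolution and architecture are assumptions for this sketch, not the setup of the later code example:

```python
import tensorflow as tf

# inputs: e.g. freestream velocity components (encoding Re and angle of attack) plus an airfoil
# mask as 3 channels; outputs: pressure p and velocity u_x, u_y as 3 channels on the same grid
inp = tf.keras.Input(shape=(128, 128, 3))
h   = tf.keras.layers.Conv2D(32, 5, padding="same", activation="relu")(inp)
h   = tf.keras.layers.Conv2D(32, 5, padding="same", activation="relu")(h)
out = tf.keras.layers.Conv2D(3, 5, padding="same")(h)   # [p, u_x, u_y]
model = tf.keras.Model(inp, out)

# supervised L2 loss against reference RANS solutions
model.compile(optimizer="adam", loss="mse")
```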
## Discussion
TODO , add as separate section after code?
TODO , discuss pros / cons of supervised learning
TODO , CNNs powerful, graphs & co likewise possible
Pro:
- very fast output and training
Con:
- lots of data needed
- undesirable averaging / inaccuracies due to direct loss
Outlook: interactions with external "processes" (such as embedding into a solver) very problematic, see DP later on...