Supervised Training
=======================
_Supervised_ here essentially means: "doing things the old-fashioned way". Old-fashioned in the context of
deep learning (DL), of course, so it's still fairly new. Also, "old-fashioned" doesn't
always mean bad - it's just that later on we'll be able to do better than with simple supervised training.

In a way, the viewpoint of "supervised training" is a starting point for all projects one encounters in the context of DL,
and hence it is worth studying. While it typically yields results inferior to approaches that couple
more tightly with physics, it can nonetheless be the only choice in certain application scenarios where no good
model equations exist.
## Problem setting
For supervised training, we are faced with an
unknown function $f^*(x)=y^*$. We collect a large number of data pairs $[x_0,y^*_0], ..., [x_n,y^*_n]$ (the training data set)
and directly train an NN to represent an approximation $f$ of $f^*$, such
that $f(x)=y \approx y^*$.

The $f$ we can obtain is typically not exact;
instead, we obtain it via a minimization problem:
we adjust the weights $\theta$ of our representation $f$ to solve
$\text{arg min}_{\theta} \sum_i (f(x_i ; \theta)-y^*_i)^2$.
This gives us $\theta$ such that $f(x;\theta) \approx y^*$ as accurately as possible given
our choice of $f$ and the hyperparameters used for training. Note that above we've assumed
the simplest case of an $L^2$ loss. A more general version would use an error metric $e(x,y)$
to be minimized via $\text{arg min}_{\theta} \sum_i e( f(x_i ; \theta) , y^*_i)$. The choice
of a suitable metric is a topic we will get back to later on.
Irrespective of our choice of metric, this formulation
gives the actual "learning" process for a supervised approach.
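
To make this concrete, here is a minimal sketch of such a minimization in PyTorch. The data, network size, and all names below are purely illustrative placeholders, not the setup used later in this book:

```python
import torch

# hypothetical training pairs [x_0, y*_0], ..., [x_n, y*_n]; random placeholders here
x = torch.rand(1000, 2)       # inputs x_i
y_star = torch.rand(1000, 3)  # reference outputs y*_i

# a small fully connected network as the representation f(x; theta)
f = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 3),
)

optimizer = torch.optim.Adam(f.parameters(), lr=1e-3)
loss_fn = torch.nn.MSELoss()  # the L^2 loss from above; another metric e(.,.) could be used instead

for epoch in range(100):
    optimizer.zero_grad()
    loss = loss_fn(f(x), y_star)  # sum_i (f(x_i; theta) - y*_i)^2, up to a constant factor
    loss.backward()               # gradients of the loss w.r.t. the weights theta
    optimizer.step()              # adjust theta to reduce the loss
```

In practice one would of course work with mini-batches and a separate validation set, but the core of the supervised optimization is just this loop.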
The training data typically needs to be of substantial size, and hence it is attractive
to use numerical simulations to produce a large number of training input-output pairs.
This means that the training process uses a set of model equations, and approximates
them numerically, in order to train the NN representation $f$. This
has a bunch of advantages, e.g., we don't have the measurement noise of real-world devices,
and we don't need manual labour to annotate a large number of samples to get training data.
On the other hand, this approach inherits the common challenges of replacing experiments
with simulations: first, we need to ensure the chosen model has enough power to predict the
behavior of the real-world phenomena that we're interested in.
In addition, the numerical approximations introduce numerical errors,
which need to be kept small enough for the chosen application. As these topics are studied in depth
for classical simulations, the existing knowledge can likewise be leveraged to
set up DL training tasks.
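
As an illustration of this data-generation step, the sketch below uses a deliberately tiny stand-in "solver" (an explicit Euler integration of a damped oscillator, chosen purely for brevity) to produce input-output pairs. For a real application this function would be replaced by an actual numerical solver for the model equations of interest:

```python
import numpy as np

def simulate(k, d, steps=100, dt=0.01):
    """Toy stand-in for a numerical solver: explicit Euler for x'' = -k x - d x'.
    Returns the state after `steps` time steps."""
    pos, vel = 1.0, 0.0
    for _ in range(steps):
        acc = -k * pos - d * vel
        pos, vel = pos + dt * vel, vel + dt * acc
    return pos, vel

# sample input parameters x_i and compute the reference outputs y*_i numerically
rng = np.random.default_rng(0)
inputs  = rng.uniform([1.0, 0.0], [10.0, 1.0], size=(1000, 2))  # (k, d) pairs
outputs = np.array([simulate(k, d) for k, d in inputs])         # simulated (pos, vel)

# `inputs` and `outputs` now form the training data set for the supervised setup above
```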
```{figure} resources/supervised-training.jpg
---
height: 220px
name: supervised-training
---
A visual overview of supervised training. Quite simple overall, but it's good to keep this
in mind in comparison to the more complex variants we'll encounter later on.
```
## Show me some code!
Let's directly look at an implementation within a more complicated context:
_turbulent flows around airfoils_ from {cite}`thuerey2020deepFlowPred`.