more text
This commit is contained in:
@@ -1,5 +1,85 @@
|
||||
Supervised Learning
|
||||
=======================
|
||||
|
||||
Doing things the old fashioned way...
|
||||
_Supervised_ here essentially means: "doing things the old fashioned way". Old fashioned in the context of
|
||||
deep learning (DL), of course, so it's still fairly new, and old fashioned of course also doesn't always mean bad.
|
||||
In a way this viewpoint is a starting point for all projects one would encounter in the context of DL, and
|
||||
hence is worth studying. And although it typically yields inferior results to approaches that more tightly
|
||||
couple with physics, it nonetheless can be the only choice in certain application scenarios where no good
|
||||
model equations exist.
|
||||
|
||||
## Problem Setting
|
||||
|
||||
For supervised learning, we're faced with an
|
||||
unknown function $f^*(x)=y$, collect lots of pairs of data $[x_0,y_0], ...[x_n,y_n]$ (the training data set)
|
||||
and directly train a NN to represent an approximation of $f^*$ denoted as $f$, such
|
||||
that $f(x)=y$.
|
||||
|
||||
The $f$ we can obtain is typically not exact,
|
||||
but instead we obtain it via a minimization problem:
|
||||
by adjusting weights $\theta$ of our representation with $f$ such that
|
||||
|
||||
$\text{arg min}_{\theta} \sum_i (f(x_i ; \theta)-y_i)^2$.
|
||||
|
||||
This will give us $\theta$ such that $f(x;\theta) \approx y$ as accurately as possible given
|
||||
our choice of $f$ and the hyper parameters for training. Note that above we've assumed
|
||||
the simplest case of an $L^2$ loss. A more general version would use an error metric $e(x,y)$
|
||||
to be minimized via $\text{arg min}_{\theta} \sum_i e( f(x_i ; \theta) , y_i) )$. The choice
|
||||
of a suitable metric is topic we will get back to later on.
|
||||
|
||||
Irrespective of our choice of metric, this formulation
|
||||
gives the actual "learning" process for a supervised approach.
|
||||
|
||||
The training data typically needs to be of substantial size, and hence it is attractive
|
||||
to use numerical simulations to produce a large number of training input-output pairs.
|
||||
This means that the training process uses a set of model equations, and approximates
|
||||
them numerically, in order to train the NN representation $\tilde{f}$. This
|
||||
has a bunch of advantages, e.g., we don't have measurement noise of real-world devices
|
||||
and we don't need manual labour to annotate a large number of samples to get training data.
|
||||
|
||||
On the other hand, this approach inherits the common challenges of replacing experiments
|
||||
with simulations: first, we need to ensure the chosen model has enough power to predict the
|
||||
bheavior of real-world phenomena that we're interested in.
|
||||
In addition, the numerical approximations have numerical errors
|
||||
which need to be kept small enough for a chosen application. As these topics are studied in depth
|
||||
for classical simulations, the existing knowledge can likewise be leveraged to
|
||||
set up DL training tasks.
|
||||
|
||||
```{figure} resources/placeholder.png
|
||||
---
|
||||
height: 220px
|
||||
name: supervised-training
|
||||
---
|
||||
TODO, visual overview of supervised training
|
||||
```
|
||||
|
||||
## Applications
|
||||
|
||||
Let's directly look at an example with a fairly complicated context:
|
||||
we have a turbulent airflow around wing profiles, and we'd like to know the average motion
|
||||
and pressure distribution around this airfoil for different Reynolds numbers and angles of attack.
|
||||
Thus, given an airfoil shape, Reynolds numbers, and angle of attack, we'd like to obtain
|
||||
a velocity field $\mathbf{u}$ and a pressure field $p$ in a computational domain $\Omega$
|
||||
around the airfoil in the center of $\Omega$.
|
||||
|
||||
This is classically approximated with _Reynolds-Averaged Navier Stokes_ (RANS) models, and this
|
||||
setting is still one of the most widely used applications of Navier-Stokes solver in industry.
|
||||
However, instead of relying on traditional numerical methods to solve the RANS equations,
|
||||
we know aim for training a neural network that completely bypasses the numerical solver,
|
||||
and produces the solution in terms of $\mathbf{u}$ and $p$.
|
||||
|
||||
## Discussion
|
||||
|
||||
TODO , add as separate section after code?
|
||||
TODO , discuss pros / cons of supervised learning
|
||||
TODO , CNNs powerful, graphs & co likewise possible
|
||||
|
||||
Pro:
|
||||
- very fast output and training
|
||||
|
||||
Con:
|
||||
- lots of data needed
|
||||
- undesirable averaging / inaccuracies due to direct loss
|
||||
|
||||
Outlook: interactions with external "processes" (such as embedding into a solver) very problematic, see DP later on...
|
||||
|
||||
|
||||
Reference in New Issue
Block a user