added know your data section, minor cleanup

NT 2021-08-03 21:55:42 +02:00
parent 7910aa23e9
commit 215b5024f6
7 changed files with 46 additions and 7 deletions


@ -74,11 +74,15 @@ This project would not have been possible without the help of many people who co
- [Nils Thuerey](https://ge.in.tum.de/about/n-thuerey/) - [Nils Thuerey](https://ge.in.tum.de/about/n-thuerey/)
- [Kiwon Um](https://ge.in.tum.de/about/kiwon/) - [Kiwon Um](https://ge.in.tum.de/about/kiwon/)
Additional thanks go to
Georg Kohl for the nice divider images (cf. {cite}`kohl2020lsim`),
Li-Wei Chen for the airfoil data image,
and to
Chloe Paillard for proofreading parts of the document.
% future: % future:
% - [Georg Kohl](https://ge.in.tum.de/about/georg-kohl/) % - [Georg Kohl](https://ge.in.tum.de/about/georg-kohl/)
% proofreading acks:
% - Chloe Pailard
## Citation ## Citation


@ -49,6 +49,8 @@ for fnOut in fileList:
re1 = re.compile(r"WARNING:tensorflow:") re1 = re.compile(r"WARNING:tensorflow:")
re2 = re.compile(r"UserWarning:") re2 = re.compile(r"UserWarning:")
re4 = re.compile(r"DeprecationWarning:") re4 = re.compile(r"DeprecationWarning:")
re5 = re.compile(r"InsecureRequestWarning:") # for https download
# remove all "warnings.warn" from phiflow?
# shorten data line: "0.008612174447657694, 0.02584669669548606, 0.043136357266407785" # shorten data line: "0.008612174447657694, 0.02584669669548606, 0.043136357266407785"
re3 = re.compile(r"\[0.008612174447657694, 0.02584669669548606, 0.043136357266407785.+\]" ) re3 = re.compile(r"\[0.008612174447657694, 0.02584669669548606, 0.043136357266407785.+\]" )
@ -93,6 +95,7 @@ for fnOut in fileList:
nums.append( re1.search( d[t][i]["outputs"][j]["text"][k] ) ) nums.append( re1.search( d[t][i]["outputs"][j]["text"][k] ) )
nums.append( re2.search( d[t][i]["outputs"][j]["text"][k] ) ) nums.append( re2.search( d[t][i]["outputs"][j]["text"][k] ) )
nums.append( re4.search( d[t][i]["outputs"][j]["text"][k] ) ) nums.append( re4.search( d[t][i]["outputs"][j]["text"][k] ) )
nums.append( re5.search( d[t][i]["outputs"][j]["text"][k] ) )
if (nums[0] is None) and (nums[1] is None): if (nums[0] is None) and (nums[1] is None):
okay = okay+1 okay = okay+1
else: # delete line "dell" else: # delete line "dell"


@ -1,4 +1,4 @@
Meshless Methods Unstructured Meshes and Meshless Methods
======================= =======================
For all computer-based methods we need to find a suitable _discrete_ representation. For all computer-based methods we need to find a suitable _discrete_ representation.


@ -138,6 +138,8 @@ learned time evolution with a numerically solved advection step.
The learned prediction is shown at the top, the reference simulation at the bottom. The learned prediction is shown at the top, the reference simulation at the bottom.
``` ```
To summarize, DL allows us to move from linear subspaces to non-linear manifolds, and provides a basis for performing
complex steps (such as time evolutions) in the resulting latent space.
## Source code ## Source code


@ -70,7 +70,7 @@ we'll be using later on in the DL examples.
We typically target continuous PDEs denoted by $\mathcal P^*$ We typically target continuous PDEs denoted by $\mathcal P^*$
whose solution is of interest in a spatial domain $\Omega \subset \mathbb{R}^d$ in $d \in \{1,2,3\}$ dimensions. whose solution is of interest in a spatial domain $\Omega \subset \mathbb{R}^d$ in $d \in \{1,2,3\}$ dimensions.
In addition, we often consider a time evolution for a finite time interval $t \in \mathbb{R}^{+}$. In addition, we often consider a time evolution for a finite time interval $t \in \mathbb{R}^{+}$.
The corresponding fields are either d-dimensional vector fields, e.g. $\mathbf{u}: \mathbb{R}^d \times \mathbb{R}^{+} \rightarrow \mathbb{R}^d$, The corresponding fields are either d-dimensional vector fields, for instance $\mathbf{u}: \mathbb{R}^d \times \mathbb{R}^{+} \rightarrow \mathbb{R}^d$,
or scalar $\mathbf{p}: \mathbb{R}^d \times \mathbb{R}^{+} \rightarrow \mathbb{R}$. or scalar $\mathbf{p}: \mathbb{R}^d \times \mathbb{R}^{+} \rightarrow \mathbb{R}$.
The components of a vector are typically denoted by $x,y,z$ subscripts, i.e., The components of a vector are typically denoted by $x,y,z$ subscripts, i.e.,
$\mathbf{v} = (v_x, v_y, v_z)^T$ for $d=3$, while $\mathbf{v} = (v_x, v_y, v_z)^T$ for $d=3$, while
@ -203,8 +203,8 @@ in implementations, effectively computing an instantaneous pressure.
An interesting variant is obtained by including the An interesting variant is obtained by including the
[Boussinesq approximation](https://en.wikipedia.org/wiki/Boussinesq_approximation_(buoyancy)) [Boussinesq approximation](https://en.wikipedia.org/wiki/Boussinesq_approximation_(buoyancy))
for varying densities, e.g., for simple temperature changes of the fluid. for varying densities, e.g., for simple temperature changes of the fluid.
With a marker field $v$, e.g., indicating regions of high temperature, With a marker field $v$ that indicates regions of high temperature,
this yields the following set of equations: it yields the following set of equations:
$$\begin{aligned} $$\begin{aligned}
\frac{\partial u_x}{\partial{t}} + \mathbf{u} \cdot \nabla u_x &= - \frac{\Delta t}{\rho} \nabla p \frac{\partial u_x}{\partial{t}} + \mathbf{u} \cdot \nabla u_x &= - \frac{\Delta t}{\rho} \nabla p


@ -897,7 +897,7 @@
@article{schulman2015high, @article{schulman2015high,
title={High-dimensional continuous control using generalized advantage estimation}, title={High-dimensional continuous control using generalized advantage estimation},
author={Schulman, John and Moritz, Philipp and Levine, Sergey and Jordan, Michael and Abbeel, Pieter}, author={Schulman, John and Moritz, Philipp and Levine, Sergey and Jordan, Michael and Abbeel, Pieter},
journal={arXiv preprint arXiv:1506.02438}, journal={arXiv:1506.02438},
year={2015} year={2015}
} }


@ -50,6 +50,36 @@ as the most central hyperparameter.
You'll probably need to reduce it later on, but you should at least get a You'll probably need to reduce it later on, but you should at least get a
rough estimate of suitable values for $\eta$. rough estimate of suitable values for $\eta$.
### Know your data
All data-driven methods obey the _garbage-in-garbage-out_ principle, so it's important
to get to know the data you are dealing with. While there's no one-size-fits-all
approach for how to best achieve this, we strongly recommend tracking
a broad range of statistics of your data set. A good starting point is the
per-quantity mean, standard deviation, minimum and maximum.
If some of these contain unusual values, this is a first indicator of bad
samples in the data set.
These values can
also easily be visualized as histograms to track down
unwanted outliers. A small number of such outliers
can easily skew a data set in undesirable ways.
Finally, checking the relationships between different quantities
is often a good idea to get some intuition for what's contained in the
data set. The next figure gives an example for this step.
```{figure} resources/supervised-example-plot.jpg
---
height: 300px
name: supervised-example-plot
---
An example from the airfoil case of the previous section: a visualization of a training data
set in terms of mean u and v velocity of 2D flow fields. It nicely shows that there are no extreme outliers,
but there are a few entries with relatively low mean u velocity on the left side.
A second, smaller data set is shown on top in red, showing that its samples cover the range of mean motions quite well.
```
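The statistics mentioned above can be collected with a few lines of NumPy. The following is only a minimal sketch under stated assumptions: the data is a synthetic stand-in (not the airfoil data set), the names `summarize`, `flag_outliers`, `u_mean` and `v_mean` are hypothetical, and the outlier test uses a robust z-score based on the median and the median absolute deviation, which is one common heuristic rather than anything prescribed here.

```python
import numpy as np


def summarize(data, names):
    """Per-quantity mean, std, min and max for an array of shape (samples, quantities)."""
    stats = {}
    for i, name in enumerate(names):
        q = data[:, i]
        stats[name] = {
            "mean": float(q.mean()),
            "std": float(q.std()),
            "min": float(q.min()),
            "max": float(q.max()),
        }
    return stats


def flag_outliers(data, threshold=3.5):
    """Indices of samples where any quantity has a robust z-score above `threshold`.

    The score uses the median and the median absolute deviation (MAD),
    so a few bad samples do not distort the reference scale.
    """
    med = np.median(data, axis=0)
    mad = np.median(np.abs(data - med), axis=0)
    mad = np.where(mad > 0, mad, np.finfo(float).eps)  # avoid division by zero
    score = 0.6745 * np.abs(data - med) / mad
    return np.nonzero((score > threshold).any(axis=1))[0]


# Synthetic stand-in for "mean u and v velocity per sample" (assumption for illustration).
rng = np.random.default_rng(0)
data = rng.normal(loc=[1.0, 0.0], scale=[0.1, 0.05], size=(200, 2))
data[17] = [-5.0, 3.0]  # deliberately injected bad sample

stats = summarize(data, ["u_mean", "v_mean"])
outliers = flag_outliers(data)

# Histogram counts per bin for the first quantity; isolated bins far from the bulk hint at outliers.
counts, edges = np.histogram(data[:, 0], bins=20)

for name, s in stats.items():
    print(name, s)
print("suspicious sample indices:", outliers)
```

The injected sample shows up both as an extreme `min` value for `u_mean` and in the list of flagged indices. The threshold is a tunable assumption, so flagged samples should be inspected by hand rather than dropped automatically.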
### Where's the magic? 🦄 ### Where's the magic? 🦄
A comment that you'll often hear when talking about DL approaches, and especially A comment that you'll often hear when talking about DL approaches, and especially