updates Maxi DP codes

NT
2021-07-20 21:44:33 +02:00
parent 98df3e43fc
commit 68ec5820df
2 changed files with 15 additions and 16 deletions

View File

@@ -77,7 +77,7 @@
"solution: $\\newcommand{\\loss}{e} \n",
"\\newcommand{\\corr}{\\mathcal{C}} \n",
"\\newcommand{\\vr}[1]{\\mathbf{r}_{#1}} \n",
"\\loss ( \\mathcal{P}_{s}( \\corr (\\mathcal{T} \\vr{t_0}) ) , \\mathcal{T} \\vr{t_1}) < \\loss ( \\mathcal{P}_{s}( \\mathcal{T} \\vr{t_0} ), \\mathcal{T} \\vr{t_1})$. \n",
"\\loss ( \\mathcal{P}_{s}( \\corr (\\mathcal{T} \\vr{t}) ) , \\mathcal{T} \\vr{t+1}) < \\loss ( \\mathcal{P}_{s}( \\mathcal{T} \\vr{t} ), \\mathcal{T} \\vr{t+1})$. \n",
"\n",
"The correction function \n",
"$\\newcommand{\\vcN}{\\mathbf{s}} \\newcommand{\\corr}{\\mathcal{C}} \\corr (\\vcN | \\theta)$ \n",
@@ -89,7 +89,7 @@
"$$\n",
"\\newcommand{\\corr}{\\mathcal{C}} \n",
"\\newcommand{\\vr}[1]{\\mathbf{r}_{#1}} \n",
"\\text{argmin}_\\theta | ( \\mathcal{P}_{s} \\corr )^n ( \\mathcal{T} \\vr{t} ) - \\mathcal{T} \\vr{t}|^2\n",
"\\text{argmin}_\\theta | ( \\mathcal{P}_{s} \\corr )^n ( \\mathcal{T} \\vr{t} ) - \\mathcal{T} \\vr{t+n}|^2\n",
"$$\n",
"\n",
"To simplify the notation, we've dropped the sum over different samples here (the $i$ from previous versions).\n",
@@ -254,11 +254,11 @@
"We'll also define two alternative versions of a neural networks to represent \n",
"$\\newcommand{\\vcN}{\\mathbf{s}} \\newcommand{\\corr}{\\mathcal{C}} \\corr$. In both cases we'll use fully convolutional networks, i.e. networks without any fully-connected layers. We'll use Keras within tensorflow to define the layers of the network (mostly via `Conv2D`), typically activated via ReLU and LeakyReLU functions, respectively.\n",
"The inputs to the network are: \n",
"- 2 fields with x,y velocity\n",
"- plus the Reynolds number as constant channel.\n",
"- 2 fields with x,y velocity,\n",
"- the Reynolds number as constant channel.\n",
"\n",
"The output is: \n",
"- a 2 component field containing the x,y velocity\n",
"- a 2 component field containing the x,y velocity.\n",
"\n",
"First, let's define a minimal network consisting only of three convolutional layers with ReLU activations (we're also using keras here for simplicity). The input channel dimension is defined via the `tensor_in`, then we'll go to 32 and 64 features, before reducing to 2 channels in the output. "
]
@@ -348,7 +348,7 @@
"\n",
"After network evaluation, we transform the output tensor back into a phiflow grid via the `to_staggered` function. \n",
"It converts the 2-component tensor that is returned by the network into a phiflow staggered grid object, so that it is compatible with the velocity field of the fluid simulation.\n",
"(Note: these are two _centered_ grids with different sizes, so we leave the work to the`unstack_staggered_tensor` function in `StaggeredGrid()` constructor)."
"(Note: these are two _centered_ grids with different sizes, so we leave the work to the `unstack_staggered_tensor` function in `StaggeredGrid()` constructor)."
]
},
{
@@ -383,7 +383,7 @@
"\n",
"## Data handling\n",
"\n",
"So far so good - we also need to take care of a few more mundane tasks, e.g. the some data handling and randomization. Below we define a `Dataset` class that stores all \"ground truth\" reference data (already downsampled).\n",
"So far so good - we also need to take care of a few more mundane tasks, e.g., some data handling and randomization. Below we define a `Dataset` class that stores all \"ground truth\" reference data (already downsampled).\n",
"\n",
"We actually have a lot of data dimensions: multiple simulations, with many time steps, each with different fields. This makes the code below a bit more difficult to read.\n",
"\n",
@@ -790,7 +790,7 @@
"source": [
"Finally, we can start training the NN! This is very straight forward now, we simply loop over the desired number of iterations, get a batch each time via `getData`, feed it into the source simulation input `source_in`, and compare it in the loss with the `reference` data for the batch.\n",
"\n",
"The setup above will automatically take care that the differentiable physics solver used here provides the right gradient information, and provides it to the tensorflow network. Be warned: due to the complexity of the setup, this training run can take a while... (If you have a saved `final.h5` model from a previous run, you can potentially skip this block and load the previously trained model instead.)"
"The setup above will automatically take care that the differentiable physics solver used here provides the right gradient information, and provides it to the tensorflow network. Be warned: due to the complexity of the setup, this training run can take a while... (If you have a saved `final.h5` network from a previous run, you can potentially skip this block and load the previously trained network instead.)"
]
},
{
@@ -866,7 +866,7 @@
"\n",
"We can reuse the solver code from above, but in the following, we will consider two simulated versions: for comparison, we'll run one reference simulation in the _source_ space (i.e., without any modifications). This version receives the regular outputs of each evaluation of the simulator, and ignores the learned correction (denoted as `sourcesim` below). The second version, `prediction`, repeatedly computes the source solver plus the learned correction, and advances this state in the solver.\n",
"\n",
"A subtle but important point: we still have to use the normalization from the original training data set here, i.e., the `dataset.dataStats['std']` values. Below we'll create a new test data set with it's own mean and standard deviation, but the model never saw this data before. It was trained with the data in `dataset` above, and hence we have to use the constants from there to make sure the model receives values that it can relate to the data it was trained with."
"A subtle but important point: we still have to use the normalization from the original training data set here, i.e., the `dataset.dataStats['std']` values. Below we'll create a new test data set with it's own mean and standard deviation, and so the trained NN never saw this data before. It was trained with the data in `dataset` above, and hence we have to use the constants from there to make sure the network receives values that it can relate to the data it was trained with."
]
},
{

View File

@@ -121,7 +121,7 @@
"source": [
"## Data generation\n",
"\n",
"Before starting the training, we have to generate a data set to train with, i.e., a set of ground truth time sequences $u^*$. Due to the complexity of the training below, we'll use a staged approach that pre-trains a supervised network as a rough initialization, and then refines it to learn control looking further and further ahead into the future. (This will be realized by training specialized models that deal with longer and longer sequences.) \n",
"Before starting the training, we have to generate a data set to train with, i.e., a set of ground truth time sequences $u^*$. Due to the complexity of the training below, we'll use a staged approach that pre-trains a supervised network as a rough initialization, and then refines it to learn control looking further and further ahead into the future. (This will be realized by training specialized NNs that deal with longer and longer sequences.) \n",
"\n",
"First, let's set up a domain and basic parameters of the data generation step."
]
@@ -195,7 +195,7 @@
"id": "eQdzVAJH30dg"
},
"source": [
"The following cell uses these shapes to create the dataset we want to train our model on.\n",
"The following cell uses these shapes to create the dataset we want to train our network with.\n",
"Each example consists of a start and target (end) frame which are generated by placing a random shape from the `shape_library` somewhere within the domain."
]
},
@@ -408,8 +408,7 @@
},
"source": [
"\n",
"This concludes the pretraining of the OP networks. This makes it possible to at least perform a rough planning of the motions, which will be refined via end-to-end training below. Before this, we'll initialize the $\\mathrm{CFE}$ networks such that we can perform _actions_, i.e., apply forces to the simulation. This is completely decoupled from the $\\mathrm{OP}$ networks.\n",
"\n"
"This concludes the pretraining of the OP networks. These networks make it possible to at least perform a rough planning of the motions, which will be refined via end-to-end training below. However, beforehand we'll initialize the $\\mathrm{CFE}$ networks such that we can perform _actions_, i.e., apply forces to the simulation. This is completely decoupled from the $\\mathrm{OP}$ networks.\n"
]
},
{
@@ -422,7 +421,7 @@
"\n",
"To pretrain the $\\mathrm{CFE}$ networks, we set up a simulation with a single step of the differentiable solver.\n",
"\n",
"The following cell trains the $\\mathrm{CFE}$ network from scratch. If you have a pretrained model at hand, you can skip the training an load the checkpoint by running the cell after."
"The following cell trains the $\\mathrm{CFE}$ network from scratch. If you have a pretrained network at hand, you can skip the training and load the checkpoint by running the cell after."
]
},
{
@@ -578,7 +577,7 @@
"source": [
"The next cell initializes the networks using the supervised checkpoints and then trains all networks jointly. You can increase the number of optimization steps or execute the next cell multiple times to further increase performance.\n",
"\n",
"*Note: The next cell will run for some time. Optionally, you can skip this cell and load a pretrained networks instead with code in the cell below.*"
"*Note: The next cell will run for some time. Optionally, you can skip this cell and load the pretrained networks instead with code in the cell below.*"
]
},
{
@@ -636,7 +635,7 @@
"id": "rDEPL8E9fiFt"
},
"source": [
"Via the index list `batches` below, you can choose to display some of the solutions. Each row show a temporal sequence starting with the initial condition, and evolving the simulation with the NN control forces for 16 time steps. The last step, at $t=16$ should match the target shown in the image on the far right."
"Via the index list `batches` below, you can choose to display some of the solutions. Each row shows a temporal sequence starting with the initial condition, and evolving the simulation with the NN control forces for 16 time steps. The last step, at $t=16$ should match the target shown in the image on the far right."
]
},
{