fixing typos

NT 2021-06-24 20:18:18 +02:00
parent be1f77c836
commit b1e09b8225
4 changed files with 5 additions and 5 deletions

View File

@@ -121,7 +121,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Because we're only constraining timestep 16, we could actually omit steps 17 to 31 in this setup. They don't have any degrees of freedom and are not constrained in any way. However, for fairness regarding a comparison with the previous PINN case, we include them.\n",
"Because we're only constraining time step 16, we could actually omit steps 17 to 31 in this setup. They don't have any degrees of freedom and are not constrained in any way. However, for fairness regarding a comparison with the previous PINN case, we include them.\n",
"\n",
"Note that we've done a lot of calculations here: first the 32 steps of our simulation, and then another 16 steps backwards from the loss. They were recorded by the gradient tape, and used to backpropagate the loss to the initial state of the simulation.\n",
"\n",

View File

@@ -608,7 +608,7 @@
"\n",
"## Z Space\n",
"\n",
"To understand the behavior and differences of the methods here, it's important to keep in mind that we're not dealing with a black box that maps between $\\mathbf{x}$ and $L$, but rather there are spaces inbetween that matter. In our case, we only have a single $\\mathbf{z}$ space, but for DL settings, we might have a large number of latent spaces, over which we have a certain amount of control. We will return to NNs soon, but for now let's focus on $\\mathbf{z}$. \n",
"To understand the behavior and differences of the methods here, it's important to keep in mind that we're not dealing with a black box that maps between $\\mathbf{x}$ and $L$, but rather there are spaces in between that matter. In our case, we only have a single $\\mathbf{z}$ space, but for DL settings, we might have a large number of latent spaces, over which we have a certain amount of control. We will return to NNs soon, but for now let's focus on $\\mathbf{z}$. \n",
"\n",
"A first thing to note is that for PG, we explicitly map from $L$ to $\\mathbf{z}$, and then continue with a mapping to $\\mathbf{x}$. Thus we already obtained the trajectory in $\\mathbf{z}$ space, and not conincidentally, we already stored it in the `historyPGz` list above.\n",
"\n",

View File

@@ -201,7 +201,7 @@
"\n",
"The environment for Burgers' equation contains a `Burgers` physics object provided by `phiflow`. The states are internally stored as `BurgersVelocity` objects. To create the initial states, the environment generates batches of random fields in the same fashion as in the data set generation process shown above. The observation space consists of the velocity fields of the current and target states stacked in the channel dimension with another channel specifying the current time step. Actions are taken in the form of a one dimensional array covering every velocity value. The `step` method calls the physics object to advance the internal state by one time step, also applying the actions as a `FieldEffect`.\n",
"\n",
"The rewards encompass a penalty equal to the square norm of the generated forces at every timestep. Additionally, the $L^2$ distance to the target field, scaled by a predefined factor (`FINAL_REWARD_FACTOR`) is subtracted at the end of each trajectory. The rewards are then normalized with a running estimate for the reward mean and standard deviation.\n",
"The rewards encompass a penalty equal to the square norm of the generated forces at every time step. Additionally, the $L^2$ distance to the target field, scaled by a predefined factor (`FINAL_REWARD_FACTOR`) is subtracted at the end of each trajectory. The rewards are then normalized with a running estimate for the reward mean and standard deviation.\n",
"\n",
"### Neural Network\n",
"\n",
@@ -340,7 +340,7 @@
"\n",
"To classify the results of the reinforcement learning method, we now compare them to an approach using differentiable physics training. In contrast to the full approach from {doc}`diffphys-control` which includes a second _OP_ network, we aim for a direct control here. The OP network represents a separate \"physics-predictor\", which is omitted here for fairness when comparing with the RL version.\n",
"\n",
"The DP approach has access to the gradient data provided by the differentiable solver, making it possible to trace the loss over multiple timesteps and enabling the model to comprehend long term effects of generated forces better. The reinforcement learning algorithm, on the other hand, is not limited by training set size like the DP algorithm, as new training samples are generated on policy. However, this also introduces additional simulation overhead during training, which can increase the time needed for convergence. "
"The DP approach has access to the gradient data provided by the differentiable solver, making it possible to trace the loss over multiple time steps and enabling the model to comprehend long term effects of generated forces better. The reinforcement learning algorithm, on the other hand, is not limited by training set size like the DP algorithm, as new training samples are generated on policy. However, this also introduces additional simulation overhead during training, which can increase the time needed for convergence. "
]
},
{
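To make the multi-step idea concrete, here is a hedged TensorFlow sketch of direct control trained through an unrolled, differentiable toy solver; the network architecture, `solver_step`, and all constants are assumptions for illustration, not the chapter's actual setup.

```python
# Hedged sketch: a network predicts forces at every step, and the loss at the
# end of the trajectory is traced back through all time steps of the toy solver.
import tensorflow as tf

N, STEPS = 32, 16
net = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="tanh"),
    tf.keras.layers.Dense(N),                      # one force value per cell
])
opt = tf.keras.optimizers.Adam(1e-3)

def solver_step(u, forces, dt=0.01):               # toy differentiable "solver"
    return u + dt * forces - dt * 0.1 * u

u0 = tf.random.normal([1, N])
target = tf.zeros([1, N])

for it in range(100):
    with tf.GradientTape() as tape:
        u, force_penalty = u0, 0.0
        for _ in range(STEPS):                     # loss is traced over all steps
            forces = net(tf.concat([u, target], axis=-1))
            force_penalty += tf.reduce_sum(forces ** 2)
            u = solver_step(u, forces)
        loss = tf.reduce_sum((u - target) ** 2) + 1e-3 * force_penalty
    grads = tape.gradient(loss, net.trainable_variables)
    opt.apply_gradients(zip(grads, net.trainable_variables))
```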

View File

@@ -67,7 +67,7 @@ NN on the other hand incurs a constant cost per evaluation, and is typically tri
to evaluate on specialized hardware such as GPUs or NN units.
Despite this, it's important to be careful:
NNs can quickly generate huge numbers of inbetween results. Consider a CNN layer with
NNs can quickly generate huge numbers of in between results. Consider a CNN layer with
$128$ features. If we apply it to an input of $128^2$, i.e. ca. 16k cells, we get $128^3$ intermediate values.
That's more than 2 million.
All of these values need to be at least momentarily stored in memory and processed by the next layer.
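The count above is easy to verify; here is a small hedged sketch (the layer settings are illustrative) that materializes such an activation tensor and counts its entries:

```python
# A conv layer with 128 features on a 128x128 input yields a 128^3 activation tensor.
import tensorflow as tf

x = tf.random.normal([1, 128, 128, 1])                 # one 128^2 input field
conv = tf.keras.layers.Conv2D(128, 3, padding="same")  # 128 feature maps
y = conv(x)

print(y.shape)               # (1, 128, 128, 128)
print(tf.size(y).numpy())    # 2097152 values, i.e. more than 2 million
```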