SoL updates
@@ -3,7 +3,7 @@
|
|||||||
"nbformat_minor": 0,
|
"nbformat_minor": 0,
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"colab": {
|
"colab": {
|
||||||
"name": "SoL-karman2d.ipynb",
|
"name": "diffphys-code-sol.ipynb",
|
||||||
"provenance": [],
|
"provenance": [],
|
||||||
"collapsed_sections": []
|
"collapsed_sections": []
|
||||||
},
|
},
|
||||||
@@ -25,6 +25,8 @@
|
|||||||
" \n",
|
" \n",
|
||||||
"Pretty much all numerical methods contain some form of iterative process. That can be repeated updates over time for explicit solvers, or within a single update step for implicit solvers. Below we'll target iterations over time; an example for the second case can be found [here](https://github.com/tum-pbs/CG-Solver-in-the-Loop).\n",
|
"Pretty much all numerical methods contain some form of iterative process. That can be repeated updates over time for explicit solvers, or within a single update step for implicit solvers. Below we'll target iterations over time; an example for the second case can be found [here](https://github.com/tum-pbs/CG-Solver-in-the-Loop).\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"## Problem Formulation\n",
|
||||||
|
"\n",
|
||||||
"In the context of reducing errors, it's crucial to have a _differentiable physics solver_, so that the learning process can take the reaction of the solver into account. This interaction is not possible with supervised learning or PINN training. Even small inference errors of a supervised NN can accumulate over time, and lead to a data distribution that differs from the distribution of the pre-computed data. This distribution shift can lead to sub-optimal results, or even cause blow-ups of the solver.\n",
|
"In the context of reducing errors, it's crucial to have a _differentiable physics solver_, so that the learning process can take the reaction of the solver into account. This interaction is not possible with supervised learning or PINN training. Even small inference errors of a supervised NN can accumulate over time, and lead to a data distribution that differs from the distribution of the pre-computed data. This distribution shift can lead to sub-optimal results, or even cause blow-ups of the solver.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"In order to learn the error function, we'll consider two different discretizations of the same PDE $\\mathcal P^*$: \n",
|
"In order to learn the error function, we'll consider two different discretizations of the same PDE $\\mathcal P^*$: \n",
|
||||||
@@ -36,7 +38,7 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"```{figure} resources/diffphys-sol-manifolds.jpeg\n",
|
"```{figure} resources/diffphys-sol-manifolds.jpeg\n",
|
||||||
"---\n",
|
"---\n",
|
||||||
"height: 280px\n",
|
"height: 150px\n",
|
||||||
"name: diffphys-sol-manifolds\n",
|
"name: diffphys-sol-manifolds\n",
|
||||||
"---\n",
|
"---\n",
|
||||||
"Visual overview of coarse and reference manifolds\n",
|
"Visual overview of coarse and reference manifolds\n",
|
||||||
@@ -88,9 +90,7 @@
|
|||||||
"The overall learning goal now becomes\n",
|
"The overall learning goal now becomes\n",
|
||||||
"\n",
|
"\n",
|
||||||
"$\n",
|
"$\n",
|
||||||
"\\text{argmin}_\\theta | \n",
|
"\\text{argmin}_\\theta | ( \\pdec \\corr )^n ( \\project \\vr{t} ) - \\project \\vr{t}|^2\n",
|
||||||
"( \\pdec \\corr )^n ( \\project \\vr{t} )\n",
|
|
||||||
"- \\project \\vr{t}|^2\n",
|
|
||||||
"$\n",
|
"$\n",
|
||||||
"\n",
|
"\n",
|
||||||
"A crucial bit here that's easy to overlook is that the correction depends on the modified states, i.e.\n",
|
"A crucial bit here that's easy to overlook is that the correction depends on the modified states, i.e.\n",
|
||||||
@@ -102,10 +102,12 @@
|
|||||||
"**TL;DR**:\n",
|
"**TL;DR**:\n",
|
||||||
"We'll train a network $\\mathcal{C}$ to reduce the numerical errors of a simulator with a more accurate reference. Here it's crucial to have the _source_ solver realized as a differentiable physics operator, such that it can give gradients for an improved training of $\\mathcal{C}$.\n",
|
"We'll train a network $\\mathcal{C}$ to reduce the numerical errors of a simulator with a more accurate reference. Here it's crucial to have the _source_ solver realized as a differentiable physics operator, such that it can give gradients for an improved training of $\\mathcal{C}$.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"\\\\\n",
|
"<br>\n",
|
||||||
"\n",
|
"\n",
|
||||||
"---\n",
|
"---\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"## Getting started with the Implementation\n",
|
||||||
|
"\n",
|
||||||
"First, let's download the prepared data set (for details on generation & loading cf. https://github.com/tum-pbs/Solver-in-the-Loop), and let's get the data handling out of the way, so that we can focus on the _interesting_ parts..."
|
"First, let's download the prepared data set (for details on generation & loading cf. https://github.com/tum-pbs/Solver-in-the-Loop), and let's get the data handling out of the way, so that we can focus on the _interesting_ parts..."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -130,7 +132,7 @@
|
|||||||
"with open('data-karman2d-train.pickle', 'rb') as f: dataPreloaded = pickle.load(f)\n",
|
"with open('data-karman2d-train.pickle', 'rb') as f: dataPreloaded = pickle.load(f)\n",
|
||||||
"print(\"Loaded data, {} training sims\".format(len(dataPreloaded)) )\n"
|
"print(\"Loaded data, {} training sims\".format(len(dataPreloaded)) )\n"
|
||||||
],
|
],
|
||||||
"execution_count": 1,
|
"execution_count": null,
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"output_type": "stream",
|
"output_type": "stream",
|
||||||
@@ -174,7 +176,7 @@
|
|||||||
"np.random.seed(42)\n",
|
"np.random.seed(42)\n",
|
||||||
"tf.compat.v1.set_random_seed(42)\n"
|
"tf.compat.v1.set_random_seed(42)\n"
|
||||||
],
|
],
|
||||||
"execution_count": 2,
|
"execution_count": null,
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"output_type": "stream",
|
"output_type": "stream",
|
||||||
@@ -212,6 +214,8 @@
|
|||||||
"id": "OhnzPdoww11P"
|
"id": "OhnzPdoww11P"
|
||||||
},
|
},
|
||||||
"source": [
|
"source": [
|
||||||
|
"## Simulation Setup\n",
|
||||||
|
"\n",
|
||||||
"Now we can set up the _source_ simulation $\\newcommand{\\pdec}{\\pde_{s}} \\pdec$. \n",
|
"Now we can set up the _source_ simulation $\\newcommand{\\pdec}{\\pde_{s}} \\pdec$. \n",
|
||||||
"Note that we won't deal with \n",
|
"Note that we won't deal with \n",
|
||||||
"$\\newcommand{\\pder}{\\pde_{r}} \\pder$\n",
|
"$\\newcommand{\\pder}{\\pde_{r}} \\pder$\n",
|
||||||
@@ -259,7 +263,7 @@
|
|||||||
"\n",
|
"\n",
|
||||||
" return super().step(fluid=fluid, dt=dt, obstacles=[self.obst], gravity=gravity, density_effects=[self.infl], velocity_effects=())\n"
|
" return super().step(fluid=fluid, dt=dt, obstacles=[self.obst], gravity=gravity, density_effects=[self.infl], velocity_effects=())\n"
|
||||||
],
|
],
|
||||||
"execution_count": 3,
|
"execution_count": null,
|
||||||
"outputs": []
|
"outputs": []
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -268,6 +272,8 @@
|
|||||||
"id": "RYFUGICgxk0K"
|
"id": "RYFUGICgxk0K"
|
||||||
},
|
},
|
||||||
"source": [
|
"source": [
|
||||||
|
"## Network Architecture\n",
|
||||||
|
"\n",
|
||||||
"We'll also define two alternative neural networks to represent \n",
|
"We'll also define two alternative neural networks to represent \n",
|
||||||
"$\\newcommand{\\vcN}{\\mathbf{s}} \\newcommand{\\corr}{\\mathcal{C}} \\corr$: \n",
|
"$\\newcommand{\\vcN}{\\mathbf{s}} \\newcommand{\\corr}{\\mathcal{C}} \\corr$: \n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -296,7 +302,7 @@
|
|||||||
" keras.layers.Conv2D(filters=2, kernel_size=5, padding='same', activation=None), # u, v\n",
|
" keras.layers.Conv2D(filters=2, kernel_size=5, padding='same', activation=None), # u, v\n",
|
||||||
" ])\n"
|
" ])\n"
|
||||||
],
|
],
|
||||||
"execution_count": 4,
|
"execution_count": null,
|
||||||
"outputs": []
|
"outputs": []
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -352,7 +358,7 @@
|
|||||||
" l_output = keras.layers.Conv2D(filters=2, kernel_size=5, padding='same')(block_5)\n",
|
" l_output = keras.layers.Conv2D(filters=2, kernel_size=5, padding='same')(block_5)\n",
|
||||||
" return keras.models.Model(inputs=l_input, outputs=l_output)\n"
|
" return keras.models.Model(inputs=l_input, outputs=l_output)\n"
|
||||||
],
|
],
|
||||||
"execution_count": 5,
|
"execution_count": null,
|
||||||
"outputs": []
|
"outputs": []
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -387,7 +393,7 @@
|
|||||||
"def to_staggered(tensor_cen, box):\n",
|
"def to_staggered(tensor_cen, box):\n",
|
||||||
" return StaggeredGrid(math.pad(tensor_cen, ((0,0), (0,1), (0,1), (0,0))), box=box)\n"
|
" return StaggeredGrid(math.pad(tensor_cen, ((0,0), (0,1), (0,1), (0,0))), box=box)\n"
|
||||||
],
|
],
|
||||||
"execution_count": 12,
|
"execution_count": null,
|
||||||
"outputs": []
|
"outputs": []
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
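The padding pattern in `to_staggered` can be illustrated in isolation. The following is a minimal sketch that uses NumPy's `np.pad` as a stand-in for the `math.pad` call in the notebook; it assumes zero padding and a `(batch, y, x, components)` layout, and the helper name `pad_to_staggered_shape` is hypothetical. It only shows how the array shape changes and does not construct a `StaggeredGrid`.

```python
import numpy as np

def pad_to_staggered_shape(tensor_cen):
    """Append one zero cell at the end of each spatial axis.

    Assumes a (batch, y, x, components) layout; batch and component
    axes are left untouched, mirroring ((0,0), (0,1), (0,1), (0,0)).
    """
    return np.pad(tensor_cen, ((0, 0), (0, 1), (0, 1), (0, 0)))

# Hypothetical centered velocity field with 4 x 3 cells and 2 components.
centered = np.ones((1, 4, 3, 2))
staggered_values = pad_to_staggered_shape(centered)
print(centered.shape, "->", staggered_values.shape)
```

Each spatial axis grows by one entry, which matches the extra face sample a staggered layout needs per dimension.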
@@ -398,6 +404,8 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"---\n",
|
"---\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"## Data Handling\n",
|
||||||
|
"\n",
|
||||||
"So far so good - we also need to take care of a few more mundane tasks, e.g. some data handling and randomization. Below we define a `Dataset` class that stores all \"ground truth\" reference data (already downsampled).\n",
|
"So far so good - we also need to take care of a few more mundane tasks, e.g. some data handling and randomization. Below we define a `Dataset` class that stores all \"ground truth\" reference data (already downsampled).\n",
|
||||||
"\n",
|
"\n",
|
||||||
"We actually have a lot of data dimensions: multiple simulations, with many time steps, each with different fields. This makes the code below a bit more difficult to read.\n",
|
"We actually have a lot of data dimensions: multiple simulations, with many time steps, each with different fields. This makes the code below a bit more difficult to read.\n",
|
||||||
@@ -477,7 +485,7 @@
|
|||||||
" def nextStep(self):\n",
|
" def nextStep(self):\n",
|
||||||
" self.stepIdx += 1\n"
|
" self.stepIdx += 1\n"
|
||||||
],
|
],
|
||||||
"execution_count": 7,
|
"execution_count": null,
|
||||||
"outputs": []
|
"outputs": []
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -528,7 +536,7 @@
|
|||||||
" ]\n",
|
" ]\n",
|
||||||
" return [marker_dens, velocity, ext]\n"
|
" return [marker_dens, velocity, ext]\n"
|
||||||
],
|
],
|
||||||
"execution_count": 8,
|
"execution_count": null,
|
||||||
"outputs": []
|
"outputs": []
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -560,7 +568,7 @@
|
|||||||
"#print(format(getData(dataset,1)))\n",
|
"#print(format(getData(dataset,1)))\n",
|
||||||
"#print(format(dataset.getData(1)))\n"
|
"#print(format(dataset.getData(1)))\n"
|
||||||
],
|
],
|
||||||
"execution_count": 9,
|
"execution_count": null,
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"output_type": "stream",
|
"output_type": "stream",
|
||||||
@@ -624,7 +632,7 @@
|
|||||||
"network.summary() \n",
|
"network.summary() \n",
|
||||||
"\n"
|
"\n"
|
||||||
],
|
],
|
||||||
"execution_count": 10,
|
"execution_count": null,
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"output_type": "stream",
|
"output_type": "stream",
|
||||||
@@ -665,6 +673,8 @@
|
|||||||
"id": "AbpNPzplQZMF"
|
"id": "AbpNPzplQZMF"
|
||||||
},
|
},
|
||||||
"source": [
|
"source": [
|
||||||
|
"## Interleaving Simulation and Network\n",
|
||||||
|
"\n",
|
||||||
"Now comes the **most crucial** step in the whole setup: we define the chain of simulation steps and network evaluations to be used at training time. After all the work defining helper functions, it's actually pretty simple: we loop over `msteps`, call the simulator via `KarmanFlow.step` for an input state, and afterwards evaluate the correction via `network(to_keras())`. The correction is then added to the last simulation state in the `prediction` list (we're actually simply overwriting the last simulated step `prediction[-1]` with `velocity + correction[-1]`).\n",
|
"Now comes the **most crucial** step in the whole setup: we define the chain of simulation steps and network evaluations to be used at training time. After all the work defining helper functions, it's actually pretty simple: we loop over `msteps`, call the simulator via `KarmanFlow.step` for an input state, and afterwards evaluate the correction via `network(to_keras())`. The correction is then added to the last simulation state in the `prediction` list (we're actually simply overwriting the last simulated step `prediction[-1]` with `velocity + correction[-1]`).\n",
|
||||||
"\n",
|
"\n",
|
||||||
"One other important thing that's happening here is normalization: the inputs to the network are divided by the standard deviations in `dataset.dataStats`. This is slightly complicated as we have to append the scaling for the Reynolds numbers to the normalization for the velocity. After evaluating the `network`, we only have a velocity left, so we can simply multiply by the standard deviation again (`* dataset.dataStats['std'][1]`)."
|
"One other important thing that's happening here is normalization: the inputs to the network are divided by the standard deviations in `dataset.dataStats`. This is slightly complicated as we have to append the scaling for the Reynolds numbers to the normalization for the velocity. After evaluating the `network`, we only have a velocity left, so we can simply multiply by the standard deviation again (`* dataset.dataStats['std'][1]`)."
|
||||||
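The interleaving of solver steps and network corrections described above can be sketched without any simulator. The following is a toy illustration only: `toy_solver_step` and `toy_correction` are hypothetical stand-ins for `KarmanFlow.step` and the Keras `network`, and the scalar statistics `std_vel` and `std_re` are assumed simplifications of the entries in `dataset.dataStats`. What it preserves is the order of operations: simulate, normalize and append the Reynolds number channel, evaluate the correction, undo the normalization, and add the result to the simulated state.

```python
import numpy as np

def toy_solver_step(velocity, dt=0.1):
    """Hypothetical stand-in for the source solver: plain damping."""
    return velocity * (1.0 - dt)

def toy_correction(net_input, weight=0.05):
    """Hypothetical stand-in for the network.

    Takes normalized (u, v, Re) channels and returns a normalized
    two-channel velocity correction.
    """
    return weight * net_input[..., :2]

def unrolled_prediction(v0, reynolds, std_vel, std_re, msteps=4):
    """Chain msteps solver steps, correcting each one before the next."""
    prediction = [v0]
    correction = []
    for _ in range(msteps):
        # 1) Advance the previous (already corrected) state with the solver.
        simulated = toy_solver_step(prediction[-1])
        # 2) Normalize the velocity and append a constant Reynolds channel.
        re_channel = np.full(simulated.shape[:-1] + (1,), reynolds / std_re)
        net_in = np.concatenate([simulated / std_vel, re_channel], axis=-1)
        # 3) Evaluate the correction and scale it back to physical units.
        correction.append(toy_correction(net_in) * std_vel)
        # 4) The corrected state is what the next solver step will see.
        prediction.append(simulated + correction[-1])
    return prediction, correction

v0 = np.ones((1, 4, 4, 2))
prediction, correction = unrolled_prediction(
    v0, reynolds=500.0, std_vel=2.0, std_re=100.0, msteps=3)
print(len(prediction), prediction[-1].shape)
```

Because each solver step starts from the corrected state of the previous one, a gradient of a loss on `prediction` with respect to the network weights has to pass through the solver, which is why the real setup needs a differentiable simulator.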
@@ -702,7 +712,7 @@
|
|||||||
"\n",
|
"\n",
|
||||||
" prediction[-1] = prediction[-1].copied_with(velocity=prediction[-1].velocity + correction[-1])\n"
|
" prediction[-1] = prediction[-1].copied_with(velocity=prediction[-1].velocity + correction[-1])\n"
|
||||||
],
|
],
|
||||||
"execution_count": 13,
|
"execution_count": null,
|
||||||
"outputs": []
|
"outputs": []
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -729,7 +739,7 @@
|
|||||||
"]\n",
|
"]\n",
|
||||||
"loss = tf.reduce_sum(loss_steps)/msteps\n"
|
"loss = tf.reduce_sum(loss_steps)/msteps\n"
|
||||||
],
|
],
|
||||||
"execution_count": 14,
|
"execution_count": null,
|
||||||
"outputs": []
|
"outputs": []
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -738,17 +748,15 @@
|
|||||||
"id": "E6Vly1_0QhZ1"
|
"id": "E6Vly1_0QhZ1"
|
||||||
},
|
},
|
||||||
"source": [
|
"source": [
|
||||||
|
"## Training\n",
|
||||||
|
"\n",
|
||||||
"For the training, we use a standard Adam optimizer, and only run 4 epochs by default. This could (should) be increased for the larger network or to obtain more accurate results."
|
"For the training, we use a standard Adam optimizer, and only run 4 epochs by default. This could (should) be increased for the larger network or to obtain more accurate results."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"id": "PuljFamYQksW",
|
"id": "PuljFamYQksW"
|
||||||
"colab": {
|
|
||||||
"base_uri": "https://localhost:8080/"
|
|
||||||
},
|
|
||||||
"outputId": "e71bcaae-187c-4c10-cee8-f03bb8964af0"
|
|
||||||
},
|
},
|
||||||
"source": [
|
"source": [
|
||||||
"lr = 1e-4\n",
|
"lr = 1e-4\n",
|
||||||
@@ -771,19 +779,8 @@
|
|||||||
" ld_network = keras.models.load_model(output_dir+'/nn_epoch{:04d}.h5'.format(resume))\n",
|
" ld_network = keras.models.load_model(output_dir+'/nn_epoch{:04d}.h5'.format(resume))\n",
|
||||||
" network.set_weights(ld_network.get_weights())\n"
|
" network.set_weights(ld_network.get_weights())\n"
|
||||||
],
|
],
|
||||||
"execution_count": 15,
|
"execution_count": null,
|
||||||
"outputs": [
|
"outputs": []
|
||||||
{
|
|
||||||
"output_type": "stream",
|
|
||||||
"text": [
|
|
||||||
"WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/phi/tf/session.py:28: The name tf.global_variables_initializer is deprecated. Please use tf.compat.v1.global_variables_initializer instead.\n",
|
|
||||||
"\n",
|
|
||||||
"WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/phi/tf/session.py:29: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.\n",
|
|
||||||
"\n"
|
|
||||||
],
|
|
||||||
"name": "stdout"
|
|
||||||
}
|
|
||||||
]
|
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
@@ -809,7 +806,7 @@
|
|||||||
" elif epoch == 10: lr *= 1e-1\n",
|
" elif epoch == 10: lr *= 1e-1\n",
|
||||||
" return lr\n"
|
" return lr\n"
|
||||||
],
|
],
|
||||||
"execution_count": 16,
|
"execution_count": null,
|
||||||
"outputs": []
|
"outputs": []
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -830,7 +827,7 @@
|
|||||||
"colab": {
|
"colab": {
|
||||||
"base_uri": "https://localhost:8080/"
|
"base_uri": "https://localhost:8080/"
|
||||||
},
|
},
|
||||||
"outputId": "3bea702a-14d0-43a7-ebc5-25289e27c5a5"
|
"outputId": "148d951b-7070-4a95-c6d7-0fd91d29606e"
|
||||||
},
|
},
|
||||||
"source": [
|
"source": [
|
||||||
"current_lr = lr\n",
|
"current_lr = lr\n",
|
||||||
@@ -855,7 +852,7 @@
|
|||||||
" _, l2 = sess.run([train_step, loss], my_feed_dict)\n",
|
" _, l2 = sess.run([train_step, loss], my_feed_dict)\n",
|
||||||
" steps += 1\n",
|
" steps += 1\n",
|
||||||
"\n",
|
"\n",
|
||||||
" if (j==0 and i<3) or (ib==0 and i%10==0):\n",
|
" if (j==0 and i<3) or (j==0 and ib==0 and i%31==0) or (ib==0 and i%124==0):\n",
|
||||||
" print('epoch {:03d}/{:03d}, batch {:03d}/{:03d}, step {:04d}/{:04d}: loss={}'.format( j+1, epochs, ib+1, dataset.numBatches, i+1, dataset.numSteps, l2 ))\n",
|
" print('epoch {:03d}/{:03d}, batch {:03d}/{:03d}, step {:04d}/{:04d}: loss={}'.format( j+1, epochs, ib+1, dataset.numBatches, i+1, dataset.numSteps, l2 ))\n",
|
||||||
" dataset.nextStep()\n",
|
" dataset.nextStep()\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -863,7 +860,7 @@
|
|||||||
"\n",
|
"\n",
|
||||||
" if j%10==9: network.save(output_dir+'/nn_epoch{:04d}.h5'.format(j+1))\n",
|
" if j%10==9: network.save(output_dir+'/nn_epoch{:04d}.h5'.format(j+1))\n",
|
||||||
"\n",
|
"\n",
|
||||||
"#tf_writer_tr.close()\n",
|
"# all done! save final version\n",
|
||||||
"network.save(output_dir+'/final.h5')\n"
|
"network.save(output_dir+'/final.h5')\n"
|
||||||
],
|
],
|
||||||
"execution_count": null,
|
"execution_count": null,
|
||||||
@@ -871,11 +868,39 @@
|
|||||||
{
|
{
|
||||||
"output_type": "stream",
|
"output_type": "stream",
|
||||||
"text": [
|
"text": [
|
||||||
"epoch 001/004, batch 001/002, step 0001/0496: loss=6816.912109375\n",
|
"epoch 001/004, batch 001/002, step 0001/0496: loss=8114.626953125\n",
|
||||||
"epoch 001/004, batch 001/002, step 0002/0496: loss=4036.171875\n",
|
"epoch 001/004, batch 001/002, step 0002/0496: loss=3371.28125\n",
|
||||||
"epoch 001/004, batch 001/002, step 0003/0496: loss=1627.9716796875\n",
|
"epoch 001/004, batch 001/002, step 0003/0496: loss=1594.294189453125\n",
|
||||||
"epoch 001/004, batch 001/002, step 0011/0496: loss=1403.9822998046875\n",
|
"epoch 001/004, batch 001/002, step 0032/0496: loss=261.2645263671875\n",
|
||||||
"epoch 001/004, batch 001/002, step 0021/0496: loss=841.949951171875\n"
|
"epoch 001/004, batch 001/002, step 0063/0496: loss=124.70037078857422\n",
|
||||||
|
"epoch 001/004, batch 001/002, step 0094/0496: loss=86.60037231445312\n",
|
||||||
|
"epoch 001/004, batch 001/002, step 0125/0496: loss=93.21685028076172\n",
|
||||||
|
"epoch 001/004, batch 001/002, step 0156/0496: loss=64.77877807617188\n",
|
||||||
|
"epoch 001/004, batch 001/002, step 0187/0496: loss=58.933082580566406\n",
|
||||||
|
"epoch 001/004, batch 001/002, step 0218/0496: loss=51.40797805786133\n",
|
||||||
|
"epoch 001/004, batch 001/002, step 0249/0496: loss=42.819091796875\n",
|
||||||
|
"epoch 001/004, batch 001/002, step 0280/0496: loss=46.30024719238281\n",
|
||||||
|
"epoch 001/004, batch 001/002, step 0311/0496: loss=41.07358932495117\n",
|
||||||
|
"epoch 001/004, batch 001/002, step 0342/0496: loss=40.12362289428711\n",
|
||||||
|
"epoch 001/004, batch 001/002, step 0373/0496: loss=41.094932556152344\n",
|
||||||
|
"epoch 001/004, batch 001/002, step 0404/0496: loss=36.17275619506836\n",
|
||||||
|
"epoch 001/004, batch 001/002, step 0435/0496: loss=37.64105987548828\n",
|
||||||
|
"epoch 001/004, batch 001/002, step 0466/0496: loss=33.44026184082031\n",
|
||||||
|
"epoch 001/004, batch 002/002, step 0001/0496: loss=36.6204719543457\n",
|
||||||
|
"epoch 001/004, batch 002/002, step 0002/0496: loss=29.037982940673828\n",
|
||||||
|
"epoch 001/004, batch 002/002, step 0003/0496: loss=27.977163314819336\n",
|
||||||
|
"epoch 002/004, batch 001/002, step 0001/0496: loss=13.540712356567383\n",
|
||||||
|
"epoch 002/004, batch 001/002, step 0125/0496: loss=12.313040733337402\n",
|
||||||
|
"epoch 002/004, batch 001/002, step 0249/0496: loss=11.129035949707031\n",
|
||||||
|
"epoch 002/004, batch 001/002, step 0373/0496: loss=11.969249725341797\n",
|
||||||
|
"epoch 003/004, batch 001/002, step 0001/0496: loss=8.394614219665527\n",
|
||||||
|
"epoch 003/004, batch 001/002, step 0125/0496: loss=7.2177557945251465\n",
|
||||||
|
"epoch 003/004, batch 001/002, step 0249/0496: loss=8.274188041687012\n",
|
||||||
|
"epoch 003/004, batch 001/002, step 0373/0496: loss=9.177286148071289\n",
|
||||||
|
"epoch 004/004, batch 001/002, step 0001/0496: loss=6.306344985961914\n",
|
||||||
|
"epoch 004/004, batch 001/002, step 0125/0496: loss=4.158570289611816\n",
|
||||||
|
"epoch 004/004, batch 001/002, step 0249/0496: loss=4.282064437866211\n",
|
||||||
|
"epoch 004/004, batch 001/002, step 0373/0496: loss=5.2111334800720215\n"
|
||||||
],
|
],
|
||||||
"name": "stdout"
|
"name": "stdout"
|
||||||
}
|
}
|
||||||
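The updated logging condition in the training loop can be checked against the recorded output above. The following sketch replays only the loop indices, with `j` as epoch, `ib` as batch and `i` as step, as in the notebook's `print` call; the sizes (4 epochs, 2 batches, 496 steps) are read off the logged lines, and the helper names `is_logged` and `logged_entries` are hypothetical.

```python
def is_logged(j, ib, i):
    """Same condition as in the updated training loop."""
    return ((j == 0 and i < 3)
            or (j == 0 and ib == 0 and i % 31 == 0)
            or (ib == 0 and i % 124 == 0))

def logged_entries(epochs=4, num_batches=2, num_steps=496):
    """Return the 1-based (epoch, batch, step) triples that get printed."""
    return [(j + 1, ib + 1, i + 1)
            for j in range(epochs)
            for ib in range(num_batches)
            for i in range(num_steps)
            if is_logged(j, ib, i)]

entries = logged_entries()
for epoch, batch, step in entries[:5]:
    print('epoch {:03d}/004, batch {:03d}/002, step {:04d}/0496'.format(epoch, batch, step))
```

With these sizes the condition selects 33 entries: a dense trace every 31 steps during the first batch of the first epoch, the first three steps of each batch in the first epoch, and every 124th step of the first batch in later epochs.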
@@ -887,7 +912,7 @@
|
|||||||
"id": "swG7GeDpWT_Z"
|
"id": "swG7GeDpWT_Z"
|
||||||
},
|
},
|
||||||
"source": [
|
"source": [
|
||||||
"The loss should go down from ca. 1000 initially to around 1. This is a good sign, but of course it's even more important to see how the resulting solver fares on new inputs.\n",
|
"The loss should go down from above 1000 initially to below 10. This is a good sign, but of course it's even more important to see how the resulting solver fares on new inputs.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Note that after training we've realized a hybrid solver, consisting of a regular _source_ simulator, and a network that was trained to specifically interact with this simulator for a chosen domain of simulation cases.\n",
|
"Note that after training we've realized a hybrid solver, consisting of a regular _source_ simulator, and a network that was trained to specifically interact with this simulator for a chosen domain of simulation cases.\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -897,7 +922,7 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"## Next steps\n",
|
"## Next steps\n",
|
||||||
"\n",
|
"\n",
|
||||||
"* Modify the training to further reduce the training error\n",
|
"* Modify the training to further reduce the training error. With the medium network you should be able to get the loss down to around 1.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"* Export the network to the external github code, and run it on new wake flow cases. You'll see that a reduced training error does not always directly correlate with an improved test performance.\n",
|
"* Export the network to the external github code, and run it on new wake flow cases. You'll see that a reduced training error does not always directly correlate with an improved test performance.\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
|||||||
BIN
resources/diffphys-sol-domain.jpeg
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 17 KiB |
@@ -2,8 +2,8 @@ Supervised Training
|
|||||||
=======================
|
=======================
|
||||||
|
|
||||||
_Supervised_ here essentially means: "doing things the old fashioned way". Old fashioned in the context of
|
_Supervised_ here essentially means: "doing things the old fashioned way". Old fashioned in the context of
|
||||||
deep learning (DL), of course, so it's still fairly new. Also, "old fashioned" of course also doesn't always mean bad
|
deep learning (DL), of course, so it's still fairly new. Also, "old fashioned" of course also doesn't
|
||||||
- it's just that we'll be able to do better than simple supervised training later on.
|
always mean bad - it's just that we'll be able to do better than simple supervised training later on.
|
||||||
|
|
||||||
In a way, the viewpoint of "supervised training" is a starting point for all projects one would encounter in the context of DL, and
|
In a way, the viewpoint of "supervised training" is a starting point for all projects one would encounter in the context of DL, and
|
||||||
hence is worth studying. And although it typically yields inferior results to approaches that more tightly
|
hence is worth studying. And although it typically yields inferior results to approaches that more tightly
|
||||||
|
|||||||