diff --git a/_toc.yml b/_toc.yml index 11437e8..5077b05 100644 --- a/_toc.yml +++ b/_toc.yml @@ -37,6 +37,16 @@ - file: diffphys-control.ipynb - file: diffphys-outlook.md +- part: Reinforcement Learning + chapters: + - file: reinflearn-intro.md + - file: reinflearn-code.ipynb + +- part: PBDL and Uncertainty + chapters: + - file: bayesian-intro.md + - file: bayesian-code.ipynb + - part: Physical Gradients chapters: - file: physgrad.md @@ -44,11 +54,6 @@ - file: physgrad-nn.md - file: physgrad-discuss.md -- part: PBDL and Uncertainty - chapters: - - file: bayesian-intro.md - - file: bayesian-code.md - - part: Fast Forward Topics chapters: - file: others-intro.md diff --git a/bayesian-intro.md b/bayesian-intro.md index 5b72759..4a42649 100644 --- a/bayesian-intro.md +++ b/bayesian-intro.md @@ -1,7 +1,7 @@ Introduction to Posterior Inference ======================= -We have to keep in mind that for all measurements, models, and discretizations we have uncertainties. In the former, this typically appears in the form of measurements errors, model equations usually encompass only parts of a system we're interested in, and for numerical simulations we inherently introduce discretization errors. So a very important question to ask here is how sure we can be sure that an answer we obtain is the correct one. From a statistics viewpoint, we'd like to know the probability distribution for the posterior, i.e., the outcomes. +We should keep in mind that for all measurements, models, and discretizations we have uncertainties. For measurements, this typically appears in the form of measurement errors, while model equations usually encompass only parts of a system we're interested in, and for numerical simulations we inherently introduce discretization errors. So a very important question to ask here is how sure we can be that an answer we obtain is the correct one. 
From a statistics viewpoint, we'd like to know the probability distribution for the posterior, i.e., the different outcomes that are possible. This admittedly becomes even more difficult in the context of machine learning: we're typically facing the task of approximating complex and unknown functions. @@ -10,22 +10,36 @@ yields a _maximum likelihood estimation_ (MLE) for the parameters of the network However, this MLE viewpoint does not take any of the uncertainties mentioned above into account: for DL training, we likewise have a numerical optimization, and hence an inherent approximation error and uncertainty regarding the learned representation. -Ideally, we could change our learning problem such that we could do _posterior inference_, +Ideally, we should reformulate our learning problem such that it enables _posterior inference_, i.e. learn to produce the full output distribution. However, this turns out to be an extremely difficult task. This where so called _Bayesian neural network_ (BNN) approaches come into play. They -make posterior inference possible by making assumptions about the probability -distributions of individual parameters of the network. Nonetheless, the task +make a form of posterior inference possible by making assumptions about the probability +distributions of individual parameters of the network. With a distribution for the +parameters we can evaluate the network multiple times to obtain different versions +of the output, and in this way sample the distribution of the output. + +Nonetheless, the task remains very challenging. Training a BNN is typically significantly more difficult than training a regular NN. However, this should come as no surprise, as we're trying to -learn something fundamentally different in this case: a full probability distribution -instead of a point estimate. +learn something fundamentally different here: a full probability distribution +instead of a point estimate. 
(All previous chapters "just" dealt with +learning such point estimates.) ![Divider](resources/divider5.jpg) +## Introduction to Bayesian Neural Networks + + +**TODO, integrate Maximilians intro section here** +... + + ## A practical example -first example here with airfoils, extension from {doc}`supervised-airfoils` - +As a first real example for posterior inference with BNNs, let's revisit the +case of turbulent flows around airfoils, from {doc}`supervised-airfoils`. However, +in contrast to the point estimate learned in this section, we'll now aim for +learning the full posterior. diff --git a/diffphys.md b/diffphys.md index ffd8a0a..fa4d796 100644 --- a/diffphys.md +++ b/diffphys.md @@ -62,7 +62,7 @@ given model parameter, with which the NN should not interact. Naturally, it can vary within the solution manifold that we're interested in, but $\nu$ will not be the output of a NN representation. If this is the case, we can omit providing $\partial \mathcal P_i / \partial \nu$ in our solver. However, the following learning process -natuarlly transfers to including $\nu$ as a degree of freedom. +naturally transfers to including $\nu$ as a degree of freedom. ## Jacobians @@ -152,7 +152,7 @@ we could leverage the $O(n)$ runtime of multigrid solvers for matrix inversion. The flipside of this approach is, that it requires some understanding of the problem at hand, and of the numerical methods. Also, a given solver might not provide gradient calculations out of the box. Thus, we want to employ DL for model equations that we don't have a proper grasp of, it might not be a good -idea to direclty go for learning via a DP approach. However, if we don't really understand our model, we probably +idea to directly go for learning via a DP approach. However, if we don't really understand our model, we probably should go back to studying it a bit more anyway... 
Also, in practice we can be _greedy_ with the derivative operators, and only @@ -191,7 +191,7 @@ Note that to simplify things, we assume that $\mathbf{u}$ is only a function in i.e. constant over time. We'll bring back the time evolution of $\mathbf{u}$ later on. % Let's denote this re-formulation as $\mathcal P$. It maps a state of $d(t)$ into a -new state at an evoled time, i.e.: +new state at an evolved time, i.e.: $$ d(t+\Delta t) = \mathcal P ( ~ d(t), \mathbf{u}, t+\Delta t) @@ -289,7 +289,7 @@ be preferable to actually constructing $A$. As a slightly more complex example let's consider Poisson's equation $\nabla^2 a = b$, where $a$ is the quantity of interest, and $b$ is given. This is a very fundamental elliptic PDE that is important for -a variety of physical problems, from electrostatics to graviational fields. It also arises +a variety of physical problems, from electrostatics to gravitational fields. It also arises in the context of fluids, where $a$ takes the role of a scalar pressure field in the fluid, and the right hand side $b$ is given by the divergence of the fluid velocity $\mathbf{u}$. diff --git a/others-intro.md b/others-intro.md index 6ec1ce0..e37feb9 100644 --- a/others-intro.md +++ b/others-intro.md @@ -12,8 +12,16 @@ More specifically, we will look at: This typically replaces a numerical solver, and we can make use of special techniques from the DL area that target time series. * Generative models are likewise an own topic in DL, and here especially generative adversarial networks were shown to be powerful tools. They also represent a highly interesting training approach involving to separate NNs. +{cite}`xie2018tempoGan` * Meshless methods and unstructured meshes are an important topic for classical simulations. Here, we'll look at a specific Lagrangian method that employs learning in the context of dynamic, particle-based representations. 
+{cite}`prantl2019tranquil` +{cite}`ummenhofer2019contconv` -* Finally, metrics to reboustly assess the quality of similarity of measurements and results are a central topic for all numerical methods, no matter whether they employ learning or not. In the last section we will look at how DL can be used to learn specialized and improved metrics. +https://github.com/intel-isl/DeepLagrangianFluids +* Finally, metrics to robustly assess the quality of similarity of measurements and results are a central topic for all numerical methods, no matter whether they employ learning or not. In the last section we will look at how DL can be used to learn specialized and improved metrics. + +{cite}`kohl2020lsim` + +{cite}`um2020sol` diff --git a/others-timeseries.md b/others-timeseries.md index ae10366..6881a0e 100644 --- a/others-timeseries.md +++ b/others-timeseries.md @@ -1,7 +1,53 @@ Model Reduction and Time Series ======================= -model reduction? separate +An inherent challenge for many practical PDE solvers is the large dimensionality of the problem. +Our model $\mathcal{P}$ is typically discretized with $\mathcal{O}(n^3)$ samples for a three-dimensional +problem (with $n$ denoting the number of samples along one axis), +and for time-dependent phenomena we additionally have a discretization along +time. The latter typically scales in accordance with the spatial dimensions, giving an +overall number of samples on the order of $\mathcal{O}(n^4)$. Not surprisingly, +the workload in these situations quickly explodes for larger $n$ (and for practical high-fidelity applications we want $n$ to be as large as possible). + +One popular way to reduce the complexity is to map a spatial state of our system $\mathbf{s_t} \in \mathbb{R}^{n^3}$ +into a much lower dimensional state $\mathbf{c_t} \in \mathbb{R}^{m}$, with $m \ll n^3$. 
Within this latent space, +we estimate the evolution of our system by inferring a new state $\mathbf{c_{t+1}}$, which we then decode to obtain $\mathbf{s_{t+1}}$. In order for this to work, it's crucial that we can choose $m$ large enough that it captures all important structures in our solution manifold, and that the time prediction of $\mathbf{c_{t+1}}$ can be computed efficiently, such that we obtain a gain in performance despite the additional encoding and decoding steps. In practice, due to the explosion in terms of unknowns for regular simulations (the $\mathcal{O}(n^3)$ above) coupled a super-linear complexity for computing a new state, working with the latent space points $\mathbf{c}$ quickly pays off for small $m$. + +However, it's crucial that encoder and decoder do a good job at reducing the dimensionality of the problem. This is a very good task for DL approaches. Furthermore, we then need a time evolution of the latent space states $\mathbf{c}$, and for most practical model equations, we cannot find closed form solutions to evolve $\mathbf{c}$. Hence, this likewise poses a very good problem for learning methods. To summarize, we're facing to challenges: learning a good spatial encoding and decoding, together with learning an accurate time evolution. +Below, we will describe an approach to solve this problem following Wiewel et al. +{cite}`wiewel2019lss` & {cite}`wiewel2020lsssubdiv`, which in turn employs +the encoder/decoder of Kim et al. {cite}`bkim2019deep`. + + +```{figure} resources/timeseries-lsp-overview.jpg +--- +height: 200px +name: timeseries-lsp-overview +--- +For time series predictions with ROMs, we encode the state of our system with an encoder $f_e$, predict +the time evolution with $f_t$, and then decode the full spatial information with a decoder $f_d$. 
+``` + + +## Reduced Order Models + +Reducing the order of computational models, often called _reduced order modeling_ (ROM) or _model reduction_, +is a classic topic in the computational field. Traditional approaches often employ techniques such as principal component analysis to arrive at a basis for a chosen space of solutions. However, being linear by construction, these approaches have inherent limitations when representing complex, non-linear solution manifolds. And in practice, all "interesting" solutions are highly non-linear. + + +$\text{arg min}_{\theta} | f_d( f_e(x;\theta_e) ;\theta_d) - x |_2^2$ + +$f_e: \mathbb{R}^{n^3} \rightarrow \mathbb{R}^{m}$ + +$f_d: \mathbb{R}^{m} \rightarrow \mathbb{R}^{n^3}$ + + +separable model + + + +## Time Series + ... diff --git a/physgrad.md b/physgrad.md index 4b8c62f..a7983d6 100644 --- a/physgrad.md +++ b/physgrad.md @@ -1,6 +1,8 @@ Physical Gradients ======================= +**Note, this chapter is very preliminary - probably not for the first version of the book** + The next chapter will dive deeper into state-of-the-art-research, and aim for an even tighter integration of physics and learning. 
The approaches explained previously all integrate physical models into deep learning algorithms, diff --git a/reinflearn-code.md b/reinflearn-code.md deleted file mode 100644 index aa14885..0000000 --- a/reinflearn-code.md +++ /dev/null @@ -1,797 +0,0 @@ -{ - "nbformat": 4, - "nbformat_minor": 0, - "metadata": { - "colab": { - "name": "PDE-Control-RL.ipynb", - "provenance": [], - "collapsed_sections": [] - }, - "kernelspec": { - "name": "python3", - "display_name": "Python 3" - }, - "language_info": { - "name": "python" - }, - "accelerator": "GPU" - }, - "cells": [ - { - "cell_type": "markdown", - "metadata": { - "id": "Aml7ksJPtCmf" - }, - "source": [ - "# Inverse Problems with Reinforcement Learning" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "AHETNWlVtyWr" - }, - "source": [ - "This notebook trains reinforcement learning agents controlling Burgers' equation, a nonlinear PDE. The approach uses the reinforcement learning framework [stable_baselines3](https://github.com/DLR-RM/stable-baselines3) and the differentiable PDE solver [ΦFlow](https://github.com/tum-pbs/PhiFlow). [PPO](https://arxiv.org/abs/1707.06347v2) was chosen as reinforcement learning algorithm.\n", - "\n", - "Additionally, a supervised control force estimator is trained as a performance baseline. This method was introduced by Holl et al. [\\(2020\\)](https://ge.in.tum.de/publications/2020-iclr-holl/)." 
- ] - }, - { - "cell_type": "code", - "metadata": { - "id": "2EDqLS_xz9B8" - }, - "source": [ - "!pip install stable-baselines3 phiflow==1.5.1\n", - "!git clone https://github.com/Sh0cktr4p/PDE-Control-RL.git\n", - "!git clone https://github.com/holl-/PDE-Control.git" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "A9t4odMH6pl1" - }, - "source": [ - "# Training" - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "UI1_mMnNQXrN" - }, - "source": [ - "import sys; sys.path.append('PDE-Control/src'); sys.path.append('PDE-Control-RL/src')\n", - "import time\n", - "import csv\n", - "import os\n", - "import shutil\n", - "from tensorboard.backend.event_processing.event_accumulator import EventAccumulator\n", - "from phi.flow import *\n", - "import burgers_plots as bplt\n", - "import matplotlib.pyplot as plt\n", - "from envs.burgers_util import GaussianClash, GaussianForce" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "mCUbc-sovPME" - }, - "source": [ - "## Data Generation" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "wSELidjsvRyd" - }, - "source": [ - "At first we generate a dataset to train the CFE model on and evaluate the performance of both approaches during and after training." 
- ] - }, - { - "cell_type": "code", - "metadata": { - "id": "lUENHywEUVsu" - }, - "source": [ - "domain = Domain([32], box=box[0:1]) # Defines the size of the fields\n", - "viscosity = 0.003\n", - "step_count = 32 # Trajectory length\n", - "dt = 0.03\n", - "diffusion_substeps = 1\n", - "\n", - "data_path = 'forced-burgers-clash'\n", - "scene_count = 1000\n", - "batch_size = 100\n", - "\n", - "train_range = range(200, 1000)\n", - "val_range = range(100, 200)\n", - "test_range = range(0, 100)" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "vEMdDJjAUeUv" - }, - "source": [ - "for batch_index in range(scene_count // batch_size):\n", - " scene = Scene.create(data_path, count=batch_size)\n", - " print(scene)\n", - " world = World()\n", - " u0 = BurgersVelocity(\n", - " domain, \n", - " velocity=GaussianClash(batch_size), \n", - " viscosity=viscosity, \n", - " batch_size=batch_size, \n", - " name='burgers'\n", - " )\n", - " u = world.add(u0, physics=Burgers(diffusion_substeps=diffusion_substeps))\n", - " force = world.add(FieldEffect(GaussianForce(batch_size), ['velocity']))\n", - " scene.write(world.state, frame=0)\n", - " for frame in range(1, step_count + 1):\n", - " world.step(dt=dt)\n", - " scene.write(world.state, frame=frame)" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "plZUZD_av3YH" - }, - "source": [ - "## Reinforcement Learning Training" - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "ZrczbfDzUgim" - }, - "source": [ - "from experiment import BurgersTraining" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "Agc9EVUeUoY9" - }, - "source": [ - "n_envs = 10 # On how many environments to train in parallel, load balancing\n", - "final_reward_factor = step_count # How hard to punish the agent for not reaching the goal if that were the case\n", - "steps_per_rollout = step_count * 
10 # How many steps to collect per environment between agent updates\n", - "n_epochs = 10 # How many epochs to perform during agent update\n", - "rl_learning_rate = 1e-4 # Learning rate for agent updates\n", - "rl_batch_size = 128 # Batch size for agent updates" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "U4FKqSjwv9jR" - }, - "source": [ - "To start training, we create a trainer object which manages the environment and the agent internally. Additionally, a directory for storing models, logs, and hyperparameters is created. This way, training can be continued at any later point using the same configuration. If the model folder specified in exp_name already exists, the agent within is loaded. Otherwise, a new agent is created.\n", - "\n", - "As default, a already trained agent stored at PDE-Control-RL/networks/rl-models/bench is loaded. To generate a new model, replace the specified path with another." - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "FjB_0vNKVCxe", - "colab": { - "base_uri": "https://localhost:8080/" - }, - "outputId": "df262ee8-c3a3-4479-c750-4d9fdba40784" - }, - "source": [ - "rl_trainer = BurgersTraining(\n", - " path='PDE-Control-RL/networks/rl-models/bench', # Replace this to train a new model\n", - " domain=domain,\n", - " viscosity=viscosity,\n", - " step_count=step_count,\n", - " dt=dt,\n", - " diffusion_substeps=diffusion_substeps,\n", - " n_envs=n_envs,\n", - " final_reward_factor=final_reward_factor,\n", - " steps_per_rollout=steps_per_rollout,\n", - " n_epochs=n_epochs,\n", - " learning_rate=rl_learning_rate,\n", - " batch_size=rl_batch_size,\n", - " data_path=data_path,\n", - " val_range=val_range,\n", - " test_range=test_range,\n", - ")" - ], - "execution_count": null, - "outputs": [ - { - "output_type": "stream", - "text": [ - "Loading existing agent from PDE-Control-RL/networks/rl-models/bench/agent.zip\n" - ], - "name": "stdout" - } - ] - }, - { - 
"cell_type": "markdown", - "metadata": { - "id": "skE_zAdGwkM2" - }, - "source": [ - "The following cell opens tensorboard inside the notebook to display the progress of the training. If a new model was created at a different location, please change the path to the location at which you stored your model." - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "DM8bVThNVF9Y" - }, - "source": [ - "%load_ext tensorboard\n", - "%tensorboard --logdir PDE-Control-RL/networks/rl-models/bench/tensorboard-log" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "nY5uq750wzsK" - }, - "source": [ - "Now we are set up to start training the agent. The next cell might take multiple hours to execute, depending on the number of rollouts (around 6h for 1000 iterations)" - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "laqpvrc7VxcW" - }, - "source": [ - "rl_trainer.train(n_rollouts=1000, save_freq=50)" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "7WlqEvsOL7Rt" - }, - "source": [ - "Now let us take a quick glance about what the results look like.\n", - "Here also an example for a trajectory after around 3600 training iterations:\n", - "![burgers_notebook.png]()" - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "Y05aa5BjMVFZ" - }, - "source": [ - "rl_frames, _, _ = rl_trainer.infer_test_set_frames()\n", - "\n", - "index_in_set = 0 # Change this to display a reconstruction of another scene\n", - "\n", - "bplt.burgers_figure('Reinforcement Learning')\n", - "for frame in range(0, step_count + 1):\n", - " plt.plot(rl_frames[frame][index_in_set,:], color=bplt.gradient_color(frame, step_count+1), linewidth=0.8)" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "u47Yboxx2CGV" - }, - "source": [ - "## Control Force Estimator Training" - ] - }, - { - "cell_type": "markdown", - "metadata": 
{ - "id": "2E_sFqgo2SiU" - }, - "source": [ - "To classify the results of the reinforcement learning method, they are compared to a supervised control force estimator approach using differentiable physics loss. This comparison seems reasonable, as both approaches work by optimizing through trial and error.\n", - "\n", - "The CFE approach has access to the gradient data provided by the differentiable solver, making it possible to trace the loss over multiple timesteps and enabling the model to comprehend long term effects of generated forces better.\n", - "\n", - "The reinforcement learning approach on the other hand uses a dedicated value estimator network (critic) to predict the sum of rewards generated from a certain state. These are then used to update a policy network (actor) which, analogously to the control force estimator network, predicts the forces to control the simulation. The reinforcement learning algorithm is not limited by training set size like the CFE approach, as new training samples are generated on policy. However, this also introduces additional simulation overhead during training, which can increase the time needed for convergence. " - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "U2hqdeJb2GLJ" - }, - "source": [ - "from control.pde.burgers import BurgersPDE\n", - "from control.control_training import ControlTraining\n", - "from control.sequences import StaggeredSequence" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "PLcFuRBz0-Yf" - }, - "source": [ - "The cell below sets up a model for training or to load an existing model checkpoint." 
- ] - }, - { - "cell_type": "code", - "metadata": { - "id": "5s6-hSZp5CGb" - }, - "source": [ - "cfe_app = ControlTraining(\n", - " step_count,\n", - " BurgersPDE(domain, viscosity, dt),\n", - " datapath=data_path,\n", - " val_range=val_range,\n", - " train_range=train_range,\n", - " trace_to_channel=lambda trace: 'burgers_velocity',\n", - " obs_loss_frames=[],\n", - " trainable_networks=['CFE'],\n", - " sequence_class=StaggeredSequence,\n", - " batch_size=100,\n", - " view_size=20,\n", - " learning_rate=1e-3,\n", - " learning_rate_half_life=1000,\n", - " dt=dt\n", - ").prepare()" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "3ReXUkzI1L3t" - }, - "source": [ - "The cell below executes the model training. Please specify a number of iterations for the algorithm to run." - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "blHHLaVS5jHA" - }, - "source": [ - "cfe_training_eval_data = []\n", - "\n", - "start_time = time.time()\n", - "\n", - "for epoch in range(200):\n", - " cfe_app.progress()\n", - " # Evaluate validation set at regular intervals to track learning progress\n", - " # Size of intervals determined by RL epoch count per iteration for accurate comparison\n", - " if epoch % n_epochs == 0:\n", - " f = cfe_app.infer_scalars(val_range)['Total Force'] / dt\n", - " cfe_training_eval_data.append((time.time() - start_time, epoch, f))" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "31B72FBR1pXr" - }, - "source": [ - "The following cells stores the trained model and the validation performance with respect to iterations and wall time." 
- ] - }, - { - "cell_type": "code", - "metadata": { - "id": "kzfckDMc8O__" - }, - "source": [ - "cfe_store_path = 'networks/cfe-models/bench'\n", - "if not os.path.exists(cfe_store_path):\n", - " os.makedirs(cfe_store_path)\n", - "\n", - "# store training progress information\n", - "with open(os.path.join(cfe_store_path, 'val_forces.csv'), 'at') as log_file:\n", - " logger = csv.DictWriter(log_file, ('time', 'epoch', 'forces'))\n", - " logger.writeheader()\n", - " for (t, e, f) in cfe_training_eval_data:\n", - " logger.writerow({'time': t, 'epoch': e, 'forces': f})\n", - "\n", - "cfe_checkpoint = cfe_app.save_model()\n", - "shutil.move(cfe_checkpoint, cfe_store_path)" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "r4r6sOh87B-1" - }, - "source": [ - "Alternatively, run the cell below can be run to load an existing network model." - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "gHEAxxjv-vDL" - }, - "source": [ - "cfe_path = 'PDE-Control-RL/networks/cfe-models/bench/checkpoint_00020000/'\n", - "networks_to_load = ['OP2', 'OP4', 'OP8', 'OP16', 'OP32']\n", - "\n", - "cfe_app.load_checkpoints({net: cfe_path for net in networks_to_load})" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "V8inOSSE0OMf" - }, - "source": [ - "Run the cell below to see how the algorithm performs. 
Here is an example of a model trained for 2000 iterations:\n", - "![burgers_notebook2.png]()" - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "SjYY0PXp1VDT" - }, - "source": [ - "cfe_frames = cfe_app.infer_all_frames(test_range)\n", - "\n", - "index_in_set = 1 # Change this to display a reconstruction of another scene\n", - "\n", - "bplt.burgers_figure('Supervised Control Force Estimator')\n", - "for frame in range(0, step_count + 1):\n", - " plt.plot(cfe_frames[frame].burgers.velocity.data[index_in_set,:,0], color=bplt.gradient_color(frame, step_count+1), linewidth=0.8)" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "fr-_uYpQ_nHn" - }, - "source": [ - "# Comparison" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "gA63vV4d2yko" - }, - "source": [ - "Next, the results of both methods are compared in terms of visual quality of the resulting trajectories as well as the generated forces. The latter provides insight about the performance of either approaches as both aspire to minimize this metric during training" - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "QuoxjQf8UVuF" - }, - "source": [ - "import utils\n", - "import pandas as pd" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "JsUf1411_-xd" - }, - "source": [ - "## Trajectory Comparison" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Yh-xD2cG7d9A" - }, - "source": [ - "To compare the resulting trajectories, we generate trajectories from the test set with each method. Also, we collect the ground truth simulations and the natural evolution of the test set fields." 
- ] - }, - { - "cell_type": "code", - "metadata": { - "id": "EP8S7UC5_vQD" - }, - "source": [ - "rl_frames, gt_frames, unc_frames = rl_trainer.infer_test_set_frames()\n", - "\n", - "cfe_frames = cfe_app.infer_all_frames(test_range)\n", - "cfe_frames = [s.burgers.velocity.data for s in cfe_frames]\n", - "\n", - "frames = {\n", - " (0, 0): ('Ground Truth', gt_frames),\n", - " (0, 1): ('Uncontrolled', unc_frames),\n", - " (1, 0): ('Reinforcement Learning', rl_frames),\n", - " (1, 1): ('Supervised Control Force Estimator', cfe_frames),\n", - "}" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "p6DJ3J6jAkQ5" - }, - "source": [ - "index_in_set = 0 # Specifies which sample of the test set should be displayed\n", - "\n", - "def plot(axs, xy, title, field):\n", - " axs[xy].set_ylim(-2, 2)\n", - " axs[xy].set_xlabel('x')\n", - " axs[xy].set_ylabel('u(x)')\n", - " axs[xy].set_title(title)\n", - "\n", - " label = 'Initial state in dark red, final state in dark blue'\n", - "\n", - " for step_idx in range(0, step_count + 1):\n", - " color = bplt.gradient_color(step_idx, step_count+1)\n", - " axs[xy].plot(\n", - " field[step_idx][index_in_set].squeeze(), \n", - " color=color, \n", - " linewidth=0.8, \n", - " label=label\n", - " )\n", - " label = None\n", - "\n", - " axs[xy].legend()\n", - "\n", - "fig, axs = plt.subplots(2, 2, figsize=(12.8, 9.6))\n", - "\n", - "for xy in frames:\n", - " plot(axs, xy, *frames[xy])" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "rPjprKl-DWAX" - }, - "source": [ - "## Forces Comparison" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "ZsksKs4e4QJA" - }, - "source": [ - "Next, we compute the forces the approaches have generated for the test set trajectories." 
- ] - }, - { - "cell_type": "code", - "metadata": { - "id": "me4rb-_qCXgC" - }, - "source": [ - "gt_forces = utils.infer_forces_sum_from_frames(\n", - " gt_frames, domain, diffusion_substeps, viscosity, dt\n", - ")\n", - "cfe_forces = utils.infer_forces_sum_from_frames(\n", - " cfe_frames, domain, diffusion_substeps, viscosity, dt\n", - ")\n", - "rl_forces = rl_trainer.infer_test_set_forces()" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "B75xKFuw4414" - }, - "source": [ - "In the following, the forces generated by both methods are compared to the ground truth of the respective sample. Samples placed above the blue line denote stronger forces in the used deep learning approach than in the ground truth and vice versa." - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "zMK29jgWDUB_" - }, - "source": [ - "plt.figure(figsize=(12.8, 9.6))\n", - "plt.scatter(gt_forces, cfe_forces, label='CFE')\n", - "plt.scatter(gt_forces, rl_forces, label='RL')\n", - "plt.plot([x * 100 for x in range(15)], [x * 100 for x in range(15)], label='Same forces as original')\n", - "plt.xlabel('ground truth')\n", - "plt.xlim(0, 1500)\n", - "plt.ylim(0, 1500)\n", - "plt.ylabel('reconstruction')\n", - "plt.grid()\n", - "plt.legend()" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "ema5wsX25gS6" - }, - "source": [ - "Next, the two deep learning methods are compared directly. Samples above the line denote higher forces by the control force estimator, samples below higher forces for the reinforcement learning agent." 
- ] - }, - { - "cell_type": "code", - "metadata": { - "id": "gEgGuyhcDkPK" - }, - "source": [ - "plt.figure(figsize=(12.8, 9.6))\n", - "plt.scatter(rl_forces, cfe_forces)\n", - "plt.xlabel('Reinforcement Learning')\n", - "plt.ylabel('Control Force Estimator')\n", - "plt.plot([x * 100 for x in range(15)], [x * 100 for x in range(15)], label='Same forces cfe rl')\n", - "plt.xlim(0, 1500)\n", - "plt.ylim(0, 1500)\n", - "plt.grid()\n", - "plt.legend()" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "7JqW7Cca6HUJ" - }, - "source": [ - "The following plot displays the performance of all reinforcement learning, control force estimator and ground truth with respect to individual samples." - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "0vqJmQ3FDnKA" - }, - "source": [ - "w=0.25\n", - "plot_count=20\n", - "plt.figure(figsize=(12.8, 9.6))\n", - "plt.bar(\n", - " [i - w for i in range(plot_count)], \n", - " rl_forces[:plot_count], \n", - " width=w, \n", - " align='center', \n", - " label='RL'\n", - ")\n", - "plt.bar(\n", - " [i + w for i in range(plot_count)], \n", - " cfe_forces[:plot_count], \n", - " width=w, \n", - " align='center', \n", - " label='CFE'\n", - ")\n", - "plt.bar(\n", - " [i for i in range(plot_count)], \n", - " gt_forces[:plot_count], \n", - " width=w, \n", - " align='center', \n", - " label='GT'\n", - ")\n", - "plt.xlabel('Scenes')\n", - "plt.xticks(range(plot_count))\n", - "plt.ylabel('Forces')\n", - "plt.legend()\n", - "plt.show()" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "9Ee3Us_hD9nR" - }, - "source": [ - "## Training Progress Comparison" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "DGBlUpQ271Ww" - }, - "source": [ - "This cell explores the training progress of both methods with respect to iterations and wall time." 
- ] - }, - { - "cell_type": "code", - "metadata": { - "id": "i1ecSuakEAYp" - }, - "source": [ - "def get_cfe_val_set_forces(experiment_path):\n", - " path = os.path.join(experiment_path, 'val_forces.csv')\n", - " table = pd.read_csv(path)\n", - " return list(table['time']), list(table['epoch']), list(table['forces'])\n", - "\n", - "rl_w_times, rl_step_nums, rl_val_forces = rl_trainer.get_val_set_forces_data()\n", - "cfe_w_times, cfe_epochs, cfe_val_forces = get_cfe_val_set_forces('PDE-Control-RL/networks/cfe-models/bench')\n", - "\n", - "fig, axs = plt.subplots(2, 1, figsize=(12.8, 9.6))\n", - "\n", - "axs[0].plot(np.array(rl_step_nums), rl_val_forces, label='RL')\n", - "axs[0].plot(np.array(cfe_epochs), cfe_val_forces, label='CFE')\n", - "axs[0].set_xlabel('Epochs')\n", - "axs[0].set_ylabel('Forces')\n", - "axs[0].set_ylim(0, 1500)\n", - "axs[0].grid()\n", - "axs[0].legend()\n", - "\n", - "axs[1].plot(np.array(rl_w_times) / 3600, rl_val_forces, label='RL')\n", - "axs[1].plot(np.array(cfe_w_times) / 3600, cfe_val_forces, label='CFE')\n", - "axs[1].set_xlabel('Wall time (hours)')\n", - "axs[1].set_ylabel('Forces')\n", - "axs[1].set_ylim(0, 1500)\n", - "axs[1].grid()\n", - "axs[1].legend()" - ], - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "ih2a2rackAPs" - }, - "source": [ - "" - ], - "execution_count": null, - "outputs": [] - } - ] -}