cleanup of code examples

This commit is contained in:
parent c59992f349
commit 16f0f351ac

_toc.yml (5 changes)
@@ -47,9 +47,6 @@ parts:
  chapters:
  - file: reinflearn-intro.md
  - file: reinflearn-code.ipynb
- caption: Graph Networks
  chapters:
  - file: graphs.md
- caption: Improved Gradients
  chapters:
  - file: physgrad.md
@@ -63,8 +60,8 @@ parts:
  chapters:
  - file: others-intro.md
  - file: others-timeseries.md
  - file: others-GANs.md
  - file: others-lagrangian.md
  - file: others-GANs.md
- caption: End Matter
  chapters:
  - file: outlook.md
@@ -673,11 +673,11 @@
" input = torch.tensor(input_cpu, dtype=torch.float32).to(device);\n",
"/tmp/ipykernel_774262/4154370188.py:8: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).\n",
" targets = torch.tensor(targets_cpu, dtype=torch.float32).to(device)\n",
"/home/thuerey/anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/math/_optimize.py:631: UserWarning: Possible rank deficiency detected. Matrix might be singular which can lead to convergence problems. Please specify using Solve(rank_deficiency=...).\n",
"anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/math/_optimize.py:631: UserWarning: Possible rank deficiency detected. Matrix might be singular which can lead to convergence problems. Please specify using Solve(rank_deficiency=...).\n",
" warnings.warn(\"Possible rank deficiency detected. Matrix might be singular which can lead to convergence problems. Please specify using Solve(rank_deficiency=...).\")\n",
"/home/thuerey/anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/backend/torch/_torch_backend.py:803: UserWarning: Sparse CSR tensor support is in beta state. If you miss a functionality in the sparse tensor support, please submit a feature request to https://github.com/pytorch/pytorch/issues. (Triggered internally at ../aten/src/ATen/SparseCsrTensorImpl.cpp:53.)\n",
"anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/backend/torch/_torch_backend.py:803: UserWarning: Sparse CSR tensor support is in beta state. If you miss a functionality in the sparse tensor support, please submit a feature request to https://github.com/pytorch/pytorch/issues. (Triggered internally at ../aten/src/ATen/SparseCsrTensorImpl.cpp:53.)\n",
" return torch.sparse_csr_tensor(row_pointers, column_indices, values, shape, device=values.device)\n",
"/home/thuerey/anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/math/_optimize.py:631: UserWarning: Possible rank deficiency detected. Matrix might be singular which can lead to convergence problems. Please specify using Solve(rank_deficiency=...).\n",
"anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/math/_optimize.py:631: UserWarning: Possible rank deficiency detected. Matrix might be singular which can lead to convergence problems. Please specify using Solve(rank_deficiency=...).\n",
" warnings.warn(\"Possible rank deficiency detected. Matrix might be singular which can lead to convergence problems. Please specify using Solve(rank_deficiency=...).\")\n",
"loss 15.62617: 0%| | 1/4960 [00:12<16:36:58, 12.06s/it]/tmp/ipykernel_774262/4154370188.py:7: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).\n",
" input = torch.tensor(input_cpu, dtype=torch.float32).to(device);\n",
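As an aside, the `UserWarning` about `torch.tensor(sourceTensor)` in the log above points at a simple fix in the notebook code. A minimal sketch of the pattern PyTorch recommends, assuming `input_cpu` already holds a tensor (names mirror the log, the data is an illustrative stand-in):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
input_cpu = torch.rand(8, 3)  # illustrative stand-in for the notebook's data

# torch.tensor(existing_tensor) triggers the warning above;
# clone().detach() makes the copy explicit and silences it.
input = input_cpu.clone().detach().to(device=device, dtype=torch.float32)
```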
@@ -922,9 +922,9 @@
"name": "stderr",
"output_type": "stream",
"text": [
"Sim. only: 0%| | 0/100 [00:00<?, ?it/s]/home/thuerey/anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/math/_optimize.py:631: UserWarning: Possible rank deficiency detected. Matrix might be singular which can lead to convergence problems. Please specify using Solve(rank_deficiency=...).\n",
"Sim. only: 0%| | 0/100 [00:00<?, ?it/s]anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/math/_optimize.py:631: UserWarning: Possible rank deficiency detected. Matrix might be singular which can lead to convergence problems. Please specify using Solve(rank_deficiency=...).\n",
" warnings.warn(\"Possible rank deficiency detected. Matrix might be singular which can lead to convergence problems. Please specify using Solve(rank_deficiency=...).\")\n",
"/home/thuerey/anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/math/_optimize.py:631: UserWarning: Possible rank deficiency detected. Matrix might be singular which can lead to convergence problems. Please specify using Solve(rank_deficiency=...).\n",
"anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/math/_optimize.py:631: UserWarning: Possible rank deficiency detected. Matrix might be singular which can lead to convergence problems. Please specify using Solve(rank_deficiency=...).\n",
" warnings.warn(\"Possible rank deficiency detected. Matrix might be singular which can lead to convergence problems. Please specify using Solve(rank_deficiency=...).\")\n",
"Sim. only: 100%|██████████████| 100/100 [00:31<00:00, 3.14it/s]\n"
]
@@ -1101,9 +1101,9 @@
"\n",
"## Next steps\n",
"\n",
"* Turn off the differentiable physics training (by setting `msteps=1`), and compare it with the unrolled version. This yields a _supervised_ training, as no gradients need to flow through the solver anymore. Compare how much larger the relative errors are in this case.\n",
"* Turn off the differentiable physics training (by setting `msteps=1`), and compare it with the unrolled version. This yields a _supervised_ training, as no gradients need to flow through the solver anymore. The relative errors will be substantially larger.\n",
"\n",
"* Likewise, train a network with a larger `msteps` setting, e.g., 8 or 16. Note that due to the recurrent nature of the training, you'll probably have to load a pre-trained state to stabilize the first iterations. How much does accuracy improve?\n",
"* Likewise, try training a network with a larger `msteps` setting, e.g., 8 or 16. Note that due to the recurrent nature of the training, you'll probably have to load a pre-trained state to stabilize the first iterations (this effectively adds \"curriculum learning\").\n",
"\n",
"* Use the external github code to generate tougher test data, and run your trained NN on these cases. You'll see that a reduced training error does not always directly correlate with improved test performance.\n",
"\n"
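To make the `msteps` distinction in these bullet points concrete, here is a minimal sketch of the unrolled training loss; `solver_step` and `network` are hypothetical stand-ins for the notebook's differentiable solver and NN correction, and `targets` is assumed to hold one ground-truth state per step:

```python
import torch

def unrolled_loss(network, solver_step, state, targets, msteps):
    """Unroll msteps solver+network steps; msteps=1 reduces to supervised training."""
    loss = 0.0
    for i in range(msteps):
        state = solver_step(state)       # differentiable physics step
        state = state + network(state)   # learned correction
        loss = loss + torch.nn.functional.mse_loss(state, targets[i])
    return loss  # gradients flow back through all msteps solver calls
```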
@@ -1,11 +1,18 @@
Generative Adversarial Networks
=======================

A fundamental problem in machine learning is to fully represent
We've dealt with generative AI techniques and diffusion modeling
in detail in {doc}`probmodels-intro`.
As outlined there, the fundamental problem to fully represent
all possible states of a variable $\mathbf{x}$ under consideration,
i.e. to capture its full distribution.
For this task, _generative adversarial networks_ (GANs) were
shown to be powerful tools in DL. They are important when the data has ambiguous solutions,
i.e. to capture its full distribution, is a very old topic. Hence,
even before DDPMs&Co. there were techniques to make this possible,
and _generative adversarial networks_ (GANs) were
shown to be powerful tools in this context. While they've been largely replaced
by diffusion approaches in research, GANs use a highly interesting approach,
and the following sections will give an introduction and show what's possible with GANs.

Traditionally, GANs were employed when the data has ambiguous solutions,
and no differentiable physics model is available to disambiguate the data. In such a case
a supervised learning would yield an undesirable averaging that can be prevented with
a GAN approach.
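The averaging problem mentioned above can be demonstrated in a few lines: if one input maps to two equally valid solutions, an L2-trained network converges to their mean, which itself is not a valid solution. A small illustrative sketch (all names are made up for this example):

```python
import torch

# Ambiguous data: the same input maps to y=+1 or y=-1 with equal probability.
x = torch.zeros(1000, 1)
y = torch.where(torch.rand(1000, 1) < 0.5, 1.0, -1.0)

model = torch.nn.Linear(1, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
for _ in range(200):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()

print(model(torch.zeros(1, 1)))  # ~0.0: the mean, not a valid sample (+1 or -1)
```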
@@ -21,12 +28,12 @@ results can be highly ambiguous.

## Maximum likelihood estimation

To train a GAN we have to briefly turn to classification problems.
For these, the learning objective takes a slightly different form than the
To train a GAN we have to briefly turn to _classification problems_, which we've managed to ignore up to now.
For classification, the learning objective takes a slightly different form than the
regression objective in equation {eq}`learn-l2` of {doc}`overview-equations`:
We now want to maximize the likelihood of a learned representation
$f$ that assigns a probability to an input $\mathbf{x}_i$ given a set of weights $\theta$.
This yields a maximization problem of the form
$f$ that assigns a probability to an input $\mathbf{x}_i$ given a set of weights $\theta$ for
a chosen set of $i$ distinct classes. This yields a maximization problem of the form

$$
\text{arg max}_{\theta} \Pi_i f(\mathbf{x}_i;\theta) ,
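In practice this product is maximized as a sum of log-probabilities, i.e. by minimizing the negative log-likelihood, which for classification is the familiar cross-entropy loss. A small sketch of the equivalence (the data here is random and purely illustrative):

```python
import torch

logits = torch.randn(4, 3)            # network outputs for 4 samples, 3 classes
labels = torch.tensor([0, 2, 1, 1])   # true class per sample

# f(x_i; theta): predicted probability of the correct class per sample
probs = torch.softmax(logits, dim=1)
likelihood = probs[torch.arange(4), labels]

# maximizing prod_i f(x_i) == minimizing -mean_i log f(x_i)
nll = -likelihood.log().mean()
assert torch.allclose(nll, torch.nn.functional.cross_entropy(logits, labels))
```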
@@ -466,23 +466,8 @@
"name": "stderr",
"output_type": "stream",
"text": [
" 0%| | 0/15 [00:00<?, ?it/s]/home/thuerey/anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/backend/torch/_torch_backend.py:1091: RuntimeWarning: PyTorch does not support nested tracing. The inner JIT of native(eval_nn) will be ignored.\n",
" warnings.warn(f\"PyTorch does not support nested tracing. The inner JIT of {self.f.__name__} will be ignored.\", RuntimeWarning)\n",
"/home/thuerey/anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/backend/torch/_torch_backend.py:1091: RuntimeWarning: PyTorch does not support nested tracing. The inner JIT of native(loss_div) will be ignored.\n",
" warnings.warn(f\"PyTorch does not support nested tracing. The inner JIT of {self.f.__name__} will be ignored.\", RuntimeWarning)\n",
"/home/thuerey/anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/backend/torch/_torch_backend.py:1091: RuntimeWarning: PyTorch does not support nested tracing. The inner JIT of native(eval_nn) will be ignored.\n",
" warnings.warn(f\"PyTorch does not support nested tracing. The inner JIT of {self.f.__name__} will be ignored.\", RuntimeWarning)\n",
"/home/thuerey/anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/backend/torch/_torch_backend.py:1091: RuntimeWarning: PyTorch does not support nested tracing. The inner JIT of native(loss_div) will be ignored.\n",
" warnings.warn(f\"PyTorch does not support nested tracing. The inner JIT of {self.f.__name__} will be ignored.\", RuntimeWarning)\n",
"100%|██████████| 15/15 [00:16<00:00, 1.12s/it]\n",
"/home/thuerey/anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/backend/torch/_torch_backend.py:1091: RuntimeWarning: PyTorch does not support nested tracing. The inner JIT of native(eval_nn) will be ignored.\n",
" warnings.warn(f\"PyTorch does not support nested tracing. The inner JIT of {self.f.__name__} will be ignored.\", RuntimeWarning)\n",
"/home/thuerey/anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/backend/torch/_torch_backend.py:1091: RuntimeWarning: PyTorch does not support nested tracing. The inner JIT of native(loss_div) will be ignored.\n",
" warnings.warn(f\"PyTorch does not support nested tracing. The inner JIT of {self.f.__name__} will be ignored.\", RuntimeWarning)\n",
"/home/thuerey/anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/backend/torch/_torch_backend.py:1091: RuntimeWarning: PyTorch does not support nested tracing. The inner JIT of native(eval_nn) will be ignored.\n",
" warnings.warn(f\"PyTorch does not support nested tracing. The inner JIT of {self.f.__name__} will be ignored.\", RuntimeWarning)\n",
"/home/thuerey/anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/backend/torch/_torch_backend.py:1091: RuntimeWarning: PyTorch does not support nested tracing. The inner JIT of native(loss_div) will be ignored.\n",
" warnings.warn(f\"PyTorch does not support nested tracing. The inner JIT of {self.f.__name__} will be ignored.\", RuntimeWarning)\n"
" 0%| | 0/15 [00:00<?, ?it/s]\n",
"100%|██████████| 15/15 [00:16<00:00, 1.12s/it]\n"
]
},
{
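Since the commit's goal is decluttering the notebook outputs, an alternative to editing saved outputs is to silence such (harmless) notices at the source. A hedged sketch using only the standard library:

```python
import warnings

# Hide the nested-tracing notices emitted via warnings.warn(...) above.
warnings.filterwarnings(
    "ignore",
    message="PyTorch does not support nested tracing.*",
    category=RuntimeWarning,
)
```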
@@ -567,7 +552,7 @@
"name": "stderr",
"output_type": "stream",
"text": [
"/home/thuerey/anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/math/_optimize.py:631: UserWarning: Possible rank deficiency detected. Matrix might be singular which can lead to convergence problems. Please specify using Solve(rank_deficiency=...).\n",
"anaconda3/envs/torch24/lib/python3.12/site-packages/phiml/math/_optimize.py:631: UserWarning: Possible rank deficiency detected. Matrix might be singular which can lead to convergence problems. Please specify using Solve(rank_deficiency=...).\n",
" warnings.warn(\"Possible rank deficiency detected. Matrix might be singular which can lead to convergence problems. Please specify using Solve(rank_deficiency=...).\")\n"
]
},
@@ -8,16 +8,16 @@
"source": [
"# From DDPM to Flow Matching\n",
"\n",
"We'll be using a learning task where we can reliably generate arbitrary amounts of ground truth data, to make sure we can quantify how well the target distribution was learned. Specifically, we'll focus on Reynolds-averaged Navier-Stokes simulations around airfoils, which have the interesting characteristic that typical solvers (such as OpenFoam) transition from steady solutions to oscillating ones for larger Reynolds numbers. This transition is exactly what we'll give as a task to diffusion models below. (Details can be found in our [diffuion-based flow prediction repository](https://github.com/tum-pbs/Diffusion-based-Flow-Prediction/).) Also, to make the notebook self-contained, we'll revisit the most important concepts from the previous section.\n",
"To show the capabilities of **denoising diffusion** and **flow matching**, we'll be using a learning task where we can reliably generate arbitrary amounts of ground truth data. This ensures we can quantify how well the target distribution was learned. Specifically, we'll focus on Reynolds-averaged Navier-Stokes simulations around airfoils, which have the interesting characteristic that typical solvers (such as OpenFoam) transition from steady solutions to oscillating ones for larger Reynolds numbers. This transition is exactly what we'll give as a task to diffusion models below. (Details can be found in our [diffusion-based flow prediction repository](https://github.com/tum-pbs/Diffusion-based-Flow-Prediction/).) Also, to make the notebook self-contained, we'll revisit the most important concepts from the previous section.\n",
"[[run in colab]](https://colab.research.google.com/github/tum-pbs/pbdl-book/blob/main/probmodels-ddpm-fm.ipynb)\n",
"\n",
"```{note} \n",
"If you're directly continuing reading from the previous chapter, note that there's an important difference: for simplicity, we'll apply denoising and flow-matching to a **forward** problem here. We won't be aiming to recover $x$ for an observation $y$, but rather assume we have initial conditions $x$ from which we want to compute a solution $y$. So don't be surprised by the switched $x$ and $y$ below.\n",
"If you're directly continuing reading from the previous chapter, note that there's an important difference: we'll deviate from the _simulation-based inference_ viewpoint, and for simplicity we'll apply denoising and flow-matching to a **forward** problem. We won't be aiming to recover $x$ for an observation $y$, but rather assume we have initial conditions $x$ from which we want to compute a solution $y$. So don't be surprised by the switched $x$ and $y$ below.\n",
"```\n",
"\n",
"## Intro\n",
"\n",
"Diffusion models have been rising stars in the deep learning field in the past years, and have made it possible to train powerful generative models with surprisingly simple and robust training setups. Within this sub-field of deep learning, a very promising new development are flow-based approaches, typically going under names such as _flow matching_ {cite}`lipman2022flow` and _rectified flows_ {cite}`liu2022rect` . We'll stick to the former here for simplicity, and denote this class of models with _FM_.\n",
"Diffusion models have been rising stars ⭐️ in the deep learning field in the past years, and have made it possible to train powerful generative models with surprisingly simple and robust training setups. Within this sub-field of deep learning, a very promising new development are flow-based approaches, typically going under names such as _flow matching_ {cite}`lipman2022flow` and _rectified flows_ {cite}`liu2022rect` . We'll stick to the former here for simplicity, and denote this class of models with _FM_.\n",
"\n",
"For the original diffusion models, especially the _denoising_ tasks were extremely successful: a neural network learns to restore a signal from pure noise. Score functions provided an alternate viewpoint, but ultimately also resulted in denoising tasks. Instead, flow-based approaches aim for transforming distributions. The goal is to transform a known one, such as gaussian noise, into one that represents the distribution of the signal or target function we're interested in. Despite these seemingly different viewpoints, all viewpoints above effectively do the same: starting with noise, they step by step turn it into samples for our target signal. Interestingly, the FM-perspective is not only more stable at training time, it also speeds up inference by orders of magnitude thanks to yielding straighter paths. And even better: if you have a working DM setup, it's surprisingly simple to turn it into an FM one.\n",
"\n",
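To illustrate how small the DM-to-FM switch mentioned above is, here is a hedged sketch of the two training losses side by side, for a hypothetical network `net(x_t, t)` (a stand-in, not the notebook's model): DDPM regresses the noise that was mixed in, FM regresses the constant velocity of a straight path between data and noise (sign and interpolation conventions vary between papers):

```python
import torch

def ddpm_loss(net, x0, alpha_bar_t, t):
    """Denoising objective: predict the noise mixed into x0."""
    eps = torch.randn_like(x0)
    x_t = alpha_bar_t.sqrt() * x0 + (1 - alpha_bar_t).sqrt() * eps
    return ((net(x_t, t) - eps) ** 2).mean()

def fm_loss(net, x0):
    """Flow matching: predict the velocity of the straight data<->noise path."""
    x1 = torch.randn_like(x0)                           # noise sample
    t = torch.rand(x0.shape[0], *([1] * (x0.dim() - 1)))
    x_t = (1 - t) * x0 + t * x1                         # linear interpolation
    return ((net(x_t, t) - (x1 - x0)) ** 2).mean()
```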
@@ -66,14 +66,14 @@
"name": "stdout",
"output_type": "stream",
"text": [
"/home/thuerey/jupyter/Diffusion-based-Flow-Prediction\n"
" \n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/thuerey/anaconda3/envs/torch24/lib/python3.12/site-packages/IPython/core/magics/osm.py:417: UserWarning: This is now an optional IPython functionality, setting dhist requires you to install the `pickleshare` library.\n",
"site-packages/IPython/core/magics/osm.py:417: UserWarning: This is now an optional IPython functionality, setting dhist requires you to install the `pickleshare` library.\n",
" self.shell.db['dhist'] = compress_dhist(dhist)[-100:]\n"
]
}
@@ -5,7 +5,11 @@ As the previous sections have demonstrated, probabilistic learning offers a wide

At the same time, they enable a fundamentally different way to work with simulations: they provide a simple way to work with complex distributions of solutions. This is of huge importance for inverse problems, e.g. in the context of obtaining likelihood-based estimates for _simulation-based inference_.

That being said, diffusion based approaches will not show relatively few advantages for determinstic settings: they are not more accurate, and typically induce slightly larger computational costs. An interesting exception is the long-term stability, as discussed in {doc}`probmodels-uncond`.
That being said, diffusion based approaches show relatively few advantages for deterministic settings: they are not more accurate, and typically induce slightly larger computational costs. An interesting exception is the long-term stability, as discussed in {doc}`probmodels-uncond`.



To summarize the key aspects of probabilistic deep learning approaches:

✅ Pro:
- Enable training and inference for distributions
@@ -16,6 +20,8 @@ That being said, diffusion based approaches will not show relatively few advanta
- (Slightly) increased inference cost
- No real advantage for deterministic settings



To summarize: if your problem contains ambiguities, diffusion modeling in the form of _flow matching_ is the method of choice. If your data contains reliable input-output pairs, go with simpler _deterministic training_ instead.

Next, we can turn to a new viewpoint on learning problems, the field of _reinforcement learning_. As the next sections will point out, it is actually not so different from the topics of the previous chapters despite the new viewpoint.
@@ -7,7 +7,7 @@
"id": "4139ba7c-6234-47b8-8a37-35cd01efd51d"
},
"source": [
"Inverse Problem Example with Differentiable Simulations\n",
"Probabilistic Inverse Problem with Differentiable Simulations\n",
"=======================\n",
"\n",
"This notebook will illustrate some of the concepts introduced in {doc}`probmodels-intro`, such as the training of score functions via log likelihoods, and what they look like in a clear and reduced problem. At the same time, the setup provides integration of a simple _differentiable simulator_ to illustrate the concept of physics-based diffusion modeling with the SMDP method from {doc}`probmodels-phys` [(full paper)](https://arxiv.org/abs/2301.10250). This approach combines physics and score matching along a merged time dimension to solve inverse problems. \n",
@@ -1,7 +1,8 @@
Introduction to Reinforcement Learning
=======================

Deep reinforcement learning, which we'll just call _reinforcement learning_ (RL) from now on, is a class of methods in the larger field of deep learning that lets an artificial intelligence agent explore the interactions with a surrounding environment. While doing this, the agent receives reward signals for its actions and tries to discern which actions contribute to higher rewards, to adapt its behavior accordingly. RL has been very successful at playing games such as Go {cite}`silver2017mastering`, and it bears promise for engineering applications such as robotics.
Deep reinforcement learning, which we'll just call _reinforcement learning_ (RL) from now on, is a class of methods in the larger field of deep learning that takes a different viewpoint from the classic "train with data" one:
RL effectively lets an AI agent learn from interactions with an environment. While performing actions, the agent receives reward signals and tries to discern which actions contribute to higher rewards, to adapt its behavior accordingly. RL has been very successful at playing games such as Go {cite}`silver2017mastering`, and it bears promise for engineering applications such as robotics.

The setup for RL generally consists of two parts: the environment and the agent. The environment receives actions $a$ from the agent while supplying it with observations in the form of states $s$, and rewards $r$. The observations represent the fraction of the information from the respective environment state that the agent is able to perceive. The rewards are given by a predefined function, usually tailored to the environment and might contain, e.g., a game score, a penalty for wrong actions or a bounty for successfully finished tasks.
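The $s, a, r$ cycle described above maps directly onto the standard interaction loop. A minimal sketch with a hypothetical toy `Environment` (not a specific RL library, and the random policy is only a placeholder for a real agent):

```python
import random

class Environment:
    """Toy stand-in: reach state 10 by stepping +1/-1."""
    def reset(self):
        self.s = 0
        return self.s
    def step(self, a):
        self.s += a
        r = 1.0 if self.s == 10 else -0.01   # reward signal
        return self.s, r, self.s == 10       # state, reward, done

env = Environment()
s = env.reset()
for _ in range(100_000):
    a = random.choice([-1, 1])   # a real agent would pick a to maximize reward
    s, r, done = env.step(a)
    if done:
        break
```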
@@ -75,7 +75,7 @@ A 3x3 convolution (orange) shown for differently deformed regular multi-block gr

For unstructured data, graph-based neural networks (GNNs) are a good choice. While they're often discussed in terms of _message-passing_ operations,
they share a lot of similarities with structured grids: the basic operation of a message-passing step on a GNN is equivalent to a convolution on a grid {cite}`sanchez2020learning`.
Hierarchies can likewise be constructed by graph coarsening {cite}`lino2024dgn`. Hence, while we'll primarily discuss grids below, keep in mind that the approaches carry over to GNNs. As dealing with graph structures makes the implementation more complicated, we won't go into details until later on in {doc}`graphs`.
Hierarchies can likewise be constructed by graph coarsening {cite}`lino2024dgn`. Hence, while we'll primarily discuss grids below, keep in mind that the approaches carry over to GNNs. As dealing with graph structures makes the implementation more complicated, we won't go into details until later.
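The grid/graph equivalence mentioned above can be made concrete in a few lines: one message-passing step gathers features from neighbors along edges, aggregates them, and updates each node. A hedged sketch without any GNN library (all names are illustrative):

```python
import torch

def message_passing_step(x, edges, w_self, w_nbr):
    """x: [nodes, features]; edges: [2, num_edges] with (src, dst) indices."""
    src, dst = edges
    msgs = x[src] @ w_nbr                        # per-edge messages from senders
    agg = torch.zeros(x.shape[0], w_nbr.shape[1])
    agg.index_add_(0, dst, msgs)                 # sum incoming messages per node
    return x @ w_self + agg                      # update: self + aggregated neighbors
```

On a regular 1D grid where each node's incoming edges come from its left and right neighbors, this reduces to a 3-point convolution stencil, which is the equivalence the paragraph refers to.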
```{figure} resources/arch02.jpg
---