Discussion of Physical Losses
=======================

The good news so far is that we have a DL method that can include
physical laws in the form of soft constraints, by minimizing residuals.
However, as the very simple previous example illustrates, this is just a conceptual
starting point.
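
To recall what such a residual loss looks like concretely: for the Burgers case of the
previous notebook, with collocation points $(x_i,t_i)$ and viscosity $\nu$, the physics
part of the loss takes the form

$$
L_{\text{physics}} = \frac{1}{N} \sum_{i=1}^{N} \left( \frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} - \nu \frac{\partial^2 u}{\partial x^2} \right)^2 \Bigg|_{(x_i,t_i)} ,
$$

complemented by additional terms that constrain the initial and boundary conditions.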
On the positive side, we can leverage DL frameworks with backpropagation to compute
the derivatives of the model. At the same time, this puts us at the mercy of the learned
representation regarding the reliability of these derivatives. Also, each derivative
requires backpropagation through the full network. This can be very expensive, especially
for higher-order derivatives.
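
To make this cost concrete, here is a minimal sketch of the nested differentiation,
assuming a PyTorch-style setup (the tiny model and the random sampling below are purely
illustrative, not the setup of the previous notebook):

```python
import torch

# illustrative coordinate network: (x, t) -> u
model = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))

xt = torch.rand(100, 2, requires_grad=True)  # collocation points (x, t)
u = model(xt)

# first derivatives: one full backpropagation pass through the network
grads = torch.autograd.grad(u.sum(), xt, create_graph=True)[0]
u_x, u_t = grads[:, 0], grads[:, 1]

# second derivative u_xx: another full pass, now through the even larger
# graph that contains the first backpropagation
u_xx = torch.autograd.grad(u_x.sum(), xt, create_graph=True)[0][:, 0]
```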
And while the setup is relatively simple, it is generally difficult to control. The NN
has the flexibility to refine the solution by itself, but at the same time, tricks are necessary
when it doesn't focus on the right regions of the solution.

## Is it "Machine Learning"?

One question that might also come to mind at this point is: _can we really call it machine learning_?
Of course, such naming questions are superficial - if an algorithm is useful, it doesn't matter
what name it has. However, here the question helps to highlight some important properties
that are typically associated with algorithms from fields like machine learning or optimization.

One main reason _not_ to call the optimization of the previous notebook machine learning (ML)
is that the positions where we test and constrain the solution are the final positions we are
interested in. As such, there is no real distinction between training, validation, and test sets.
Computing the solution for a known and given set of samples is much more akin to classical
optimization, the field from which inverse problems like the previous Burgers example stem.

For machine learning, we typically work under the assumption that the final performance of our
model will be evaluated on a different, potentially unknown set of inputs. The _test data_
should usually capture such _out of distribution_ (OOD) behavior, so that we can make estimates
about how well our model will generalize to "real-world" cases that we will encounter when
we deploy it in an application.

In contrast, for the PINN training as described here, we reconstruct a single solution in a known
and given space-time region. As such, any samples from this domain follow the same distribution
and hence don't really represent test or OOD samples. As the NN directly encodes the solution,
there is also little hope that it will yield different solutions, or perform well outside
of the training range. If we're interested in a different solution, we
have to start training the NN from scratch.
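
A small, purely hypothetical sketch makes this explicit: the trained network itself _is_
the solution and consumes only coordinates, while the initial condition enters the
training loss but never the network input (names and architecture below are illustrative):

```python
import torch

# illustrative coordinate network: the trained weights encode one fixed solution
model = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))

# evaluating the solution only requires a space-time coordinate ...
u = model(torch.tensor([[0.5, 0.1]]))  # u(x=0.5, t=0.1)

# ... the initial condition is only baked into the loss during training, it is
# not an input; a new initial condition hence means re-training from scratch.
```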

![Divider](resources/divider5.jpg)

## Summary

Thus, the physical soft constraints allow us to encode solutions to
PDEs with the tools of NNs.
An inherent drawback of variant 2 is that it yields single solutions,
and that it does not combine well with traditional numerical techniques.
E.g., the learned representation is not suitable to be refined with
a classical iterative solver such as the conjugate gradient method.
This means that many
powerful techniques that were developed in the past decades cannot be used in this context.
Bringing these numerical methods back into the picture will be one of the central
goals of the next sections.

✅ Pro:
- Uses physical model.
- Derivatives can be conveniently computed via backpropagation.

❌ Con:
- Quite slow ...
- Physical constraints are enforced only as soft constraints.
- Largely incompatible with _classical_ numerical methods.
- Accuracy of derivatives relies on learned representation.

Next, let's look at how we can leverage numerical methods to improve the accuracy and
efficiency of DL approaches by making use of differentiable solvers.