PINN vs FEM I - A short intro to Physics-Informed Neural Networks

(Co-authored by Sean De Marco and Lloyd Fung)

To kickstart our blog, we’ll start with a series of posts discussing Physics-Informed Neural Networks (PINNs), a controversial technique that has galvanised many academic communities since its inception. Some see it as a brilliant mix of known physics and machine learning, with recent successes such as DeepMind’s effort to find singular solutions in Navier-Stokes. Others are quick to point out that its performance has been overstated, and that it is not much more performant than traditional Finite Element Methods (FEM). Regardless, with over 800,000 citations (and counting), one cannot simply ignore it. So is it over-hyped or under-appreciated?

While many have tried to compare its performance with conventional numerical methods like FEM, in this series of posts we take a slightly different approach. We want to draw out the similarities in the way the two methods represent the solution of a PDE and, from that, better highlight their pros and cons. As you shall see, at their core, these two methods actually share more similarities than you might think!

What are PINNs

PINNs were first introduced by Raissi, Perdikaris and Karniadakis in 2017 and formalised into a published paper a few years later. At the peak of the ML hype, PINNs quickly took the academic world by storm. The idea is simple: use a Neural Network (NN) to represent the solution to a PDE, which is typically a function of space and time. Then, by exploiting automatic differentiation (AD), we can evaluate the differential terms in the PDE and build a loss function from its residual. Minimising this loss should therefore be equivalent to “solving” the PDE (whether a minimal residual loss is actually the same as solving a PDE is debatable). Since the PDE is the “model” of the system in this case, we call this the “model loss”.
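To make the model loss concrete, here is a minimal JAX sketch. It is our own illustration rather than code from the original paper: the 1D Poisson problem u''(x) = f(x), the source term, the network size and the helper names (`init_params`, `u`, `residual`, `model_loss`) are all assumptions, and boundary terms are left out for brevity.

```python
import jax
import jax.numpy as jnp


def init_params(key, sizes=(1, 16, 16, 1)):
    """Random weights and zero biases for a small fully connected network."""
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        w = jax.random.normal(sub, (n_in, n_out)) / jnp.sqrt(n_in)
        params.append((w, jnp.zeros(n_out)))
    return params


def u(params, x):
    """Network approximation u(x) of the PDE solution, for a scalar input x."""
    h = jnp.atleast_1d(x)
    for w, b in params[:-1]:
        h = jnp.tanh(h @ w + b)
    w, b = params[-1]
    return (h @ w + b)[0]


def source(x):
    """Assumed right-hand side f(x), chosen so that sin(pi x) solves u'' = f."""
    return -(jnp.pi ** 2) * jnp.sin(jnp.pi * x)


def residual(params, x):
    """PDE residual u''(x) - f(x); u'' comes from differentiating the network twice with AD."""
    u_xx = jax.grad(jax.grad(u, argnums=1), argnums=1)(params, x)
    return u_xx - source(x)


def model_loss(params, xs):
    """Mean squared residual over the collocation points xs."""
    r = jax.vmap(residual, in_axes=(None, 0))(params, xs)
    return jnp.mean(r ** 2)


params = init_params(jax.random.PRNGKey(0))
xs = jnp.linspace(0.0, 1.0, 32)  # collocation points where the residual is evaluated

# Loss value and its gradient with respect to every weight and bias
loss, grads = jax.value_and_grad(model_loss)(params, xs)
```

Passing `grads` to any gradient-based optimiser would then drive the residual down. In a full PINN, boundary and initial conditions are typically added as further loss terms.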

Now, PINNs were originally conceived not only to “solve” PDEs, but also to tackle data assimilation problems: given measurements of a system that supposedly fulfils the PDE, data assimilation tries to match the PDE solution to the data. Because, unlike traditional numerical methods, PINNs already cast the PDE as an optimisation problem, incorporating data becomes as easy as adding the data loss and the model loss together to form a new overall loss function for the optimiser to minimise. However, simply adding these two losses together also violates a fundamental principle of data assimilation, namely that the fulfilment of the PDE (i.e. the model) should be a hard constraint rather than a soft constraint, as it is in PINNs. Since the PINN optimisation is now a balance between fulfilling the equation and matching the data, the equation is no longer satisfied exactly, as it would be in traditional data assimilation methods.
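As a rough sketch of how the two losses are combined, the JAX snippet below adds a data loss to a model loss. Everything specific here is an assumption made for illustration: the 1D model equation, the one-hidden-layer network, the synthetic "measurements" and the `weight` parameter (the plain sum described above corresponds to `weight=1.0`).

```python
import jax
import jax.numpy as jnp


def u(params, x):
    """One-hidden-layer tanh network evaluated at a scalar x."""
    hidden = jnp.tanh(params["w1"] * x + params["b1"])
    return jnp.dot(params["w2"], hidden) + params["b2"]


def model_loss(params, xs):
    """Mean squared residual of the assumed model u''(x) + pi^2 sin(pi x) = 0."""
    second_derivative = jax.grad(jax.grad(u, argnums=1), argnums=1)
    u_xx = jax.vmap(second_derivative, in_axes=(None, 0))(params, xs)
    return jnp.mean((u_xx + jnp.pi ** 2 * jnp.sin(jnp.pi * xs)) ** 2)


def data_loss(params, x_obs, u_obs):
    """Mean squared mismatch between the network and the measurements."""
    pred = jax.vmap(u, in_axes=(None, 0))(params, x_obs)
    return jnp.mean((pred - u_obs) ** 2)


def total_loss(params, xs, x_obs, u_obs, weight=1.0):
    """Soft-constraint objective: the PDE residual is penalised, not enforced exactly."""
    return model_loss(params, xs) + weight * data_loss(params, x_obs, u_obs)


k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)
params = {
    "w1": jax.random.normal(k1, (16,)),
    "b1": jnp.zeros(16),
    "w2": jax.random.normal(k2, (16,)) / 4.0,
    "b2": jnp.array(0.0),
}
xs = jnp.linspace(0.0, 1.0, 32)                  # collocation points for the model loss
x_obs = jnp.array([0.0, 0.25, 0.5, 0.75, 1.0])   # synthetic measurement locations
u_obs = jnp.sin(jnp.pi * x_obs) + 0.01 * jax.random.normal(k3, x_obs.shape)  # synthetic noisy data

loss, grads = jax.value_and_grad(total_loss)(params, xs, x_obs, u_obs)
```

Because both terms sit in one objective, the optimiser can trade a slightly larger PDE residual for a better fit to the data, which is exactly the soft-constraint behaviour described above.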

The controversy around the PINN loss can be quite involved, so we will defer discussion of the “PINN loss” and data assimilation to the next series of blog posts. For this series, we will focus on the “Neural Network” part of Physics-Informed Neural Networks. Specifically, how does a Neural Network represent PDE solutions, and how does it compare with the traditional Finite Element representation of the solution? To do that, we will discard the data loss and focus specifically on the fulfilment of PDEs by optimising just the model loss.

Literature’s Comparison of FEM and PINNs

As is expected when exploring new methods, the first question always revolves around how the new method compares to more traditional ones. For PINNs, the FEM is often selected as the benchmark, because it represents the most efficient class of numerical methods for discretising PDEs. However, as we shall show in the upcoming posts, there is a deeper connection between FEM and NNs.

There have been many discussions and much work around PINNs, but here we’d like to highlight two major studies that have directly compared PINNs and the FEM. One study (Grossmann et al. 2024, IMA J. Appl. Maths) is from the Department of Applied Mathematics and Theoretical Physics in Cambridge, UK. They used the Poisson equation in 1D, 2D and 3D, the Schrödinger equation in 1D and 2D, and the Allen-Cahn equation in 1D to compare the methods. The purpose of using such varied equations is to expose both methods to varying conditions and solution spaces. Grossmann et al. ultimately showed that although PINNs are capable of overcoming the curse of dimensionality, their solution cost and time were not always more favourable than those of FEM. Overall, the results were inconclusive, with neither method the clear favourite for solving PDEs.

The other study that directly compared the FEM and PINNs is by Sikora et al. (2024, J. Comp. Sci.), who compared PINNs and Variational-PINNs with higher-order and continuity FEMs in solving the Eriksson-Johnson problem. The paper showed that, for the FEM to solve the Eriksson-Johnson problem, two stabilisation methods were required, namely the streamline upwind Petrov-Galerkin method and the residual minimisation method. The drawback of both is that adaptive meshing is required, further exacerbating the curse of dimensionality for the FEM problem. The PINN implementation, on the other hand, was simpler: the only adjustments needed were to adapt the loss function to include the governing PDE and some terms for the boundary and initial conditions. However, due to the nature of the solution, a non-uniform discretisation was required. This need came from a pre-existing understanding that the boundary layer requires greater refinement to capture its characteristics, mimicking what is done in FEM in the presence of large gradients. This highlights a significant challenge with PINNs: convergence to a correct solution may require a priori knowledge.
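To illustrate what a non-uniform discretisation can look like, the sketch below clusters collocation points near one end of a 1D interval. It is not the discretisation used by Sikora et al.: the 1D setting, the layer sitting at x = 0, the power-law map and the helper name `clustered_points` are all assumptions chosen for illustration.

```python
import jax.numpy as jnp


def clustered_points(n, power=3.0):
    """Map n uniform points on [0, 1] through s**power so that they cluster near x = 0."""
    s = jnp.linspace(0.0, 1.0, n)
    return s ** power


x_uniform = jnp.linspace(0.0, 1.0, 11)
x_layer = clustered_points(11)

# Count how many points fall inside a thin strip next to the boundary at x = 0
strip = 0.15
n_uniform = int(jnp.sum(x_uniform <= strip))
n_layer = int(jnp.sum(x_layer <= strip))
```

With the same total number of points, the clustered set places more of them inside the strip next to the boundary. Choosing where to cluster, and how strongly, is precisely the a priori knowledge about the solution that such a discretisation relies on.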

The aim of our upcoming posts is to explore this comparison further through a mathematical analysis of FEM and PINNs, with the goal of highlighting the mathematical similarities and differences between the two methods and how these differences manifest themselves computationally.

Next post

While all these comparisons are nicely done, they remain empirical and anecdotal. This leads us to ask: is there a more theoretical way to compare the two methods? Instead of comparing their performance empirically, in our series of blog posts we will compare them at a theoretical level. In the next post, we will draw out an explicit connection between a simplified version of a PINN, a single-layer perceptron, and traditional finite element methods.

See you there!