Numerical validation of probabilistic laws to evaluate finite element error estimates

We propose a numerical validation of a probabilistic approach applied to estimate the relative accuracy between two Lagrange finite elements $P_k$ and $P_m, (k<m)$. In particular, we show practical cases where finite element $P_{k}$ gives more accurate results than finite element $P_{m}$. This illustrates the theoretical probabilistic framework we recently derived in order to evaluate the actual accuracy. This also highlights the importance of the extra caution required when comparing two numerical methods, since the classical results of error estimates concerns only the asymptotic convergence rate.


Introduction
Finite element methods and among them, error estimates play a significant role in the development of numerical methods.Very often, the success of a numerical method depends on its performance in terms of efficiency and accuracy.For this reason, it is still an active subject of research, as observed, for instance, with the considerable interest received by the discontinuous Galerkin methods in the past decades; see e.g. an introduction for elliptic problems in [1], the book [2] or the pioneering work [3].
Since the seminal papers of Strang and Fix [7], Ciarlet and Raviart [8], Babuska [6] and Bramble and Hilbert [9], along with co-workers, a large amount of work has been published, the purpose of which was to derive and expand error estimates in different configurations.Here, we are concerned with a priori error estimates, that aim to find upper bounds for the error between the exact solution u and its finite element approximation u h .More precisely, these estimates describe how the finite element error u − u h , for a given norm, goes to 0 with mesh size h (i.e. the largest diameter of the elements in a given mesh).In addition, these estimates involve a constant, generally unknown, which leads to only get an upper bound for the approximation error.
In addition, quantitative uncertainties do exist in finite element methods; these are based on the way the mesh grid generator creates the mesh which is used to compute the finite element approximation u h , or since the equations are not exactly solved due to round-off errors.In previous papers [12], [13], we investigated the error resulting from a partial non-control of the mesh size.For this purpose, we have considered the approximation error as a random variable, and we have evaluated the relative accuracy between two Lagrange finite elements with the help of a probabilistic approach.In the same way, one can find in [4], [5] a probabilistic approach to evaluate error bounds in numerical analysis.
In this work, we numerically study the a priori error estimate due to the discretization of a linear variational problem by a finite element method, using standard polynomials.Our aim is to compare the probabilistic laws we derived with statistical results, when two different degrees of the polynomials are used, for a fixed value of the mesh size.Since the effective dependence of the accuracy on the mesh size is a central question, it could help one to understand the saturation assumption that is often used in a posteriori error analysis [24].Indeed, we use here a probabilistic approach which differs from the methods involved in a posteriori error analysis.Nonetheless, we will show examples where P k finite element is more likely accurate than P m , k < m, which can be related to the invalidity of the saturation assumption [25].
The paper is structured as follows: Section 2 summarizes the results of [12], [13], which are necessary for one to understand the numerical experiments and their analysis.The main results are the geometrical interpretation of error estimates and the two probabilistic laws we deduced for finite element accuracy.Section 3 is devoted to the numerical results, which illustrate the new probabilistic way we propose to evaluate the accuracy between two finite elements.We basically consider two numerical problems, a stiff one and a smooth one, and we compare, for each of them, the behavior of the theoretical probabilistic models with the statistical results.Concluding remarks follow.
2 Probabilistic models and finite elements accuracy

Error estimates revisited
Consider Ω an open bounded and non-empty subset of R n , and let Γ denote its boundary, assumed to be C 1 −piecewise.We also introduce an Hilbert space V endowed with a norm, .V , and a bilinear, continuous and V −elliptic form a(•, •) defined on V × V .Finally, l(•) denotes a linear continuous form defined on V .
Let u ∈ V be the unique solution to the second order elliptic variational formulation In this paper, we will focus on the simple case where V is the usual Sobolev space of distributions H 1 (Ω).More general cases can be found in [14].
Let us now introduce the finite-dimensional subspace V h of V , and consider u h ∈ V h an approximation of u, solution to the approximate variational formulation In what follows, we are interested in evaluating error bounds for finite element methods.Hence, we first assume that domain Ω is exactly covered by a mesh T h composed by N s n-simplexes K j , (1 ≤ j ≤ N s ), which respects classical rules of regular discretization, (see for example [19] for the bidimensional case, or [21] in R n ).We also denote by P k (K j ) the space of polynomial functions defined on a given n-simplex K j of degree less than or equal to k, (k ≥ 1).
Our study relies on the results of [21].Let . 1 be the classical norm in H 1 (Ω) and |.| k+1 the semi-norm in H k+1 (Ω), and let h be the mesh size, namely the largest diameter of the elements of the mesh T h .We thus have: Lemma 2.1 Suppose that there exists an integer k ≥ 1 such that the approximation u h of V h is a continuous piecewise function composed by polynomials which belong to P k (K j ), (1 ≤ j ≤ N s ).
Then, if the exact solution u belongs to H k+1 (Ω), we have the following error estimate: where C k is a positive constant independent of h.Now, let us consider two families of Lagrange finite elements P k and P m for two values (k, m) ∈ N * 2 , (k < m).Assuming that the solution u to (1) belongs to H m+1 (Ω), inequality (2) can be written as h and u (m) h respectively denote the P k and P m Lagrange finite element approximations of u.
In this article, following a series of previous papers [12]- [14] where a theoretical analysis was performed, we are interested in numerical applications.To this end, for a given mesh size h, two independent meshes for P k and P m are built by a mesh generator.Usually, one considers inequalities ( 3) and ( 4) so as to conclude that, when h goes to zero, P m is more accurate that P k , since h m goes faster to zero than h k .However, in practical numerical applications, the size of the mesh is chosen according to the desired accuracy, so that h has a fixed value.Consequently, this way of comparison is no more relevant.For this reason, we mean to identify the relative accuracy between P k and P m , (k < m), for a given value of h.

Two probabilistic laws
In [12]- [13], we introduced a probabilistic approach that provides a coherent framework for modeling uncertainties in finite element approximations: such uncertainties may come from the way the meshes are created by computer algorithms, leading to a partial non-control of the mesh, even for a given maximum mesh size.
In this framework, values u h − u 1 are viewed as two random variables, respectively denoted as 3) and (4).Our goal is thus to derive a probabilistic law for the event which corresponds to the relative accuracy between finite elements P k and P m .For this purpose, we first introduce the random events A and B defined by: Moreover, we proved in [12] the following result: of event A is given by: where h * k,m is defined by: The shape of the probabilistic distribution, called the two-steps model, is depicted in Fig. 1.Basically, it expresses the fact that, for h < h * k,m , finite element P m is almost surely more accurate than P k , whereas for h > h * k,m , P k becomes almost surely more accurate than P m .
To relax the independence assumption of events A and B, we also derived a second probabilistic law based on the uniform distribution of the random variable X (k) (h) over 0, C k |u| k+1 h k .In this context, we proved in [12] the following theorem: Theorem 2.3 Let us assume that X (i) (h), (i = k, m), are independent and uniformly distributed on Then, the probability P (A) of event A is given by: Figure 1 -Case m − k = 1: shape of the sigmoid distribution (7) (full line) and the two steps corresponding one (5) (dashed line), (P k,m (h The shape of this law, called the sigmoid model, is also plotted in Fig. 1.As one can see, for h > h * k,m , P (A) ≤ 0.5: in that case, finite element P m is probably overqualified.
The purpose of the next section is to propose numerical examples that illustrate and validate this probabilistic approach by comparing statistical frequencies and the corresponding probabilities determined by ( 5) or (7).

Numerical results
In this section, we will illustrate our probabilistic approach on numerical examples, by evaluating the relative accuracy of two Lagrange finite elements.We have intentionally chosen a simple, standard example, in order to help us numerically check the relevance of the proposed probabilistic distributions.
Hence, we consider the following classical elliptic problem, with obvious notations where, for simplicity, domain Ω is the open unit square in R 2 : Ω =]0, 1[×]0, 1[.The associated variational formulation, which is analogous to (1), can be readily derived.According to the choice of q and h, we will consider as examples a stiff problem, where the solution exhibits rapid variations or, alternatively, a very smooth problem.
One of the main ingredients of the method is the computation of h * k,m , as defined by (6).As one will see, it will be evaluated using a maximum likelihood estimator; see for instance [22].
In our case, this principle is applied as follows: for a given finite element P k , we consider a number N of different meshes with the same (maximum) mesh size h.Then, we compute: which constitutes, using estimate (3), the maximum likelihood estimator for C k |u| k+1 .Indeed, due to Then, one can show [20] that for a given uniform random variable Y whose support is [0, θ], θ being an unknown real parameter, the maximum likelihood estimator θ is given by: where (Y 1 , . . ., Y N ) is a sample built with independent and identically distributed random variables (Y i ) i=1,N , with the same distribution as Y .
In our case, this implies that ( 9) is the maximum likelihood estimator for C k|u| k+1 , since N and h each take a finite number of values.
Doing the same for another finite element P m , we obtain that the estimator for h * k,m , denoted h * k,m , is defined by: Then, one can easily compute the two probability laws introduced in subsection 2.2.Indeed, as soon as h * k,m is computed, functions ( 5) and ( 7) are operational by replacing h * k,m by h * k,m .All the numerical results below are computed in this way.
In order to numerically check the validity of each model, we now compare the two probabilistic laws defined by ( 5) and ( 7) with the corresponding statistical frequencies computed on the N meshes, for each fixed value h of the mesh size.
To that end, we consider for two finite elements P k and P m (k < m), the same number N of different meshes with the same (maximum) mesh size h.From there, we compute the approximate solution Then, we repeat the same process for different values of h, either lower or greater than h * k,m .This gives, as a function of h, the percentage of cases where the approximation error of P m is lower than the approximation error of P k .In all cases, we use package FreeFem++ [23] to compute the P k and P m finite element approximations.
In the next subsection, we consider a stiff case, whereas in the following one, we deal with a very smooth example.

A first stiff case
To introduce such a stiff case, we consider the well-known Runge function ϕ(t) = 1 1 + αt 2 which takes α as a parameter, the classical Runge function corresponding to α = 25 (see [10], [11]).
Since we first aim at building an exact solution u(x, y) for (8), we consider solutions of the form u(x, y) = f (x)g(y), where both f (x) and g(y) are Runge functions of parameter α.
To compute the derivatives of u(x, y), we basically need the derivatives of the Runge function f .After some elementary algebra, we obtain the derivatives of (1 + αt 2 ) 3 ), from which the Laplacian of u(x, y) can easily be derived.Indeed, by computing the second order partial derivatives u xx , u yy , we find that We now set the right-hand side q(x, y) of ( 8) equal to expression (11) above, so that is the exact solution of ( 8), provided that the Dirichlet boundary condition h(x, y) is taken as the trace of u(x, y) on the boundary ∂Ω, that is In what follows, we analyze the relative accuracy between two Lagrange finite elements, u(x, y), as defined in (12), being the reference solution for comparison.

P 2 -P 3 comparison and α-independence
The first numerical test we present is devoted to a comparison between finite elements P 2 and P 3 .We first choose α = 500.In that case, as explained above, we computed value h * 2,3 as defined in (10) and obtained h * 2,3 0.12.
For this example, we have used values of h varying from 0.05 to 0.18, and for each h we have constructed N = 500 different meshes with the same value of h.In Fig. 2 we plot, on the same picture, the results obtained for the statistical frequencies (full line) and for the two-steps probability law (5) (dotted line), as a function of h.We then checked that the results do not depend on the value of α.For this purpose, we repeated the same numerical experiments for α = 25 and α = 2000.The results, depicted in Fig. 3, show the same behavior as previously.Of course, the value of h * k,m does depend on α and we have h *

Comparison with P 4 finite element
Next, we numerically assessed the validity of the present approach when finite element P 4 is involved.
We first compared P 3 with P 4 , then P 2 with P 4 .To this end, we used the Runge function with  like previously, that the statistical frequencies behave very similarly to the two-steps probabilistic law (5).
The last illustration of this subsection is devoted to the comparison between the statistical frequencies and the sigmoid probabilistic law defined in (7).We considered comparisons between finite elements P 2 and P 4 , then between P 1 and P 4 .
We followed the same procedure as above, again with the same parameters (N = 500 and α = 2000).
The results are depicted in Fig. 5.As one can see, there is a weaker fitting between the two curves than with the two-steps law (5), even if the trend is still correct.Remark also that the fit is better in the P 1 -P 4 case than in the P 2 -P 4 case.More generally, the greater the m − k difference, the better the match.Hence, the sigmoid model also gave a correct trend, but was less precise and satisfying than the two-steps law, in particular when m − k = 1, for instance when one compares P 2 with P 3 , see Fig. 6.Indeed, in that case, the first part of ( 7) is a linear decreasing function of h (for any given fixed h * k,m ), and the second one decreases like 1/h.However, if difference m − k = 2, for instance when comparing P 2 to P 4 , the fit is better.Indeed, the first part of ( 7) is a decreasing function −h 2 (for any given fixed h * k,m ), and the second one decreases like 1/h 2 .So, depending on the difference m − k, the sigmoid law remains to some extent relevant, where high order finite element (with m around 20-25) are sometimes used [26].
This shows that the two-steps model works well, but is a bit "rough" (essentially binary), whereas the sigmoid law is probably too "rigid" and has to be make more "flexible" to obtain a better fit with the statistical results.For this reason, we are currently working on a more general approach which corresponds to a new generation of probabilistic laws that better fit the statistical frequencies.

A smooth example
In this subsection, we illustrate the probabilistic laws for a very smooth solution to the variational problem (1).To build such a case, we chose q = 2π 2 sin(πx) cos(πy) in (8), so that u(x, y) = sin(πx) cos(πy) is the exact solution of the problem, provided that the Dirichlet boundary condition h is taken as the trace of u(x, y) on the boundary ∂Ω.This can be written as: As previously, we first compute h * k,m defined by (10), then we compute the probabilistic models introduced above.After that, we compare these results to the statistical frequencies.
For example, we consider the finite elements P 2 and P 3 : we depicted in Fig. 7 (left) the statistical frequencies and the probabilistic law (5).As in all the other numerical experiments, 500 meshes have been used for each value of h, where we found a value of h * 2,3 approximately equal to 0.18.As one can see, even in this case, there is a good agreement between the statistical frequencies and the probabilistic law (5).However, the comparison with the sigmoid law (7) (right part of Fig. 7) gave only a global trend and was not really accurate.Here again, as for the Runge example, it will be improved by the above-mentioned new generation of probabilistic laws.

Conclusion
In this paper, we proposed to apply the probabilistic approach we developed in [12] to numerical examples.It enabled us to evaluate the relative accuracy between two Lagrange finite elements P k and P m , (k < m), for a fixed value of the mesh size h.Our approach, which is based on a geometrical interpretation of the error estimate, considers the approximation errors as random variables.Two probabilistic laws were derived, a so-called "two steps" law and a "sigmoid" one, depending on the probabilistic assumptions which were made on the corresponding random variables.
For the finite elements we considered, we illustrated, using several examples, the property that, depending on the position of h with respect to the critical value h * k,m , we can actually estimate which of finite elements P k and P m is more likely accurate.This overturns the common misconception that finite elements P m are always more precise than P k if m > k, regardless of the mesh size h.In particular, this shows cases where a P m finite element surely is overqualified.As a consequence, a significant reduction of implementation time and execution cost can be obtained without loss of accuracy.Such a phenomenon was already observed by using data-mining techniques (see [15], [16], [17] and [18]).However if, in the proposed examples, the first investigated law (the two-steps law) fit the numerical results satisfactorily, the second proposed law (the sigmoid one) produces only a trend and is not accurate enough.Indeed, the results show that the statistical frequencies behave similarly to the two-steps probabilistic law, in both the smooth and stiff examples.However, there is a weaker fitting between the statistical error and the sigmoid law, particularly when difference m − k is small.To address this issue, we are currently working on a new probabilistic framework which corrects the gap between the statistics and a "generalized" probability law.other approximation methods: given several different numerical methods and their error estimates, it would be possible to order them by evaluating which is the most probably accurate.
Homages: The author wants to warmly dedicate this research to pay homage to the memory of Professors André Avez and Gérard Tronel, who broadly promoted the passion of research and teaching in Mathematics.

h
, and we test if u for α = 2000.

αFigure 2 -
Figure 2 -P 2 versus P 3 for the Runge function with α = 500.Comparison between the statistical frequencies (full line) and the probabilistic law (5) (dotted line).500 meshes are used for each value of h.

Figure 5 -
Figure 5 -P 2 versus P 4 (left) and P 1 versus P 4 (right) for the Runge function with α = 2000.Comparison between the statistical frequencies (full) and the probabilistic law (7) (dotted).500 meshes are used for each value of h.

Figure 6 -
Figure 6 -P 2 versus P 3 for the Runge function with α = 2000.Comparison between the statistical frequencies (full) and the probabilistic law (7) (dotted).500 meshes are used for each value of h.

Figure 7 -
Figure 7 -P 2 versus P 3 for the smooth case.Comparison between the statistical frequencies (full) and the two probabilistic laws (dotted).500 meshes are used for each value of h.Left: comparison with the two-steps probabilistic law (5) -Right: comparison with the sigmoid probabilistic law (7).