# How to Choose a Solver for FEM Problems: Direct or Iterative?

In this article, we explore the black box of numerics and the task of choosing an FEM solver. In general, there are a typical set of options that work for a problem and we never change it. These are like those magic parameters that we never change and we do not know why they are assigned those values either. This time, we open the black box to see what they mean and how they affect the solution.

**Fig 01. Screenshot of solver options available in SimScale**

As shown in Fig. 01, we can see that there are several options available in the numerics section, for structure mechanics simulations, in SimScale. Of these, Multfront, MUMPS, and LDLT are direct solvers while PETSc and GCPC are iterative. In addition, there are other options called “Nonlinear resolution type” and “Line search” features. What do all these mean?

## Direct or iterative solver: What is the difference?

When we use the finite element method, we are solving a set of matrix equations of the form [K]{u} = {f}. Here [K] is referred to as the stiffness matrix; {f} as the force vector and {u} as the set of unknowns. There are several methods available that can directly solve this system of equations. A simple and direct way, though not recommended, would be to invert the matrix [K] and multiply with force vector {f}. This results in the solutions to the unknowns. However, an inverse is almost never calculated. More commonly adopted approaches to directly calculate the solution include Gaussian elimination, LU, Cholesky and QR decompositions etc.

However, such direct methods fail when the matrices are:

- Very large, i.e. of the order of few millions: If the matrix is of order
*n*, then*n*x*n*x*n*operations are needed to solve the system using Gaussian elimination. Imagine, the system has 10 Million unknowns, then it would need operations of the order of ExaFlops. - Have a lot of zeros (often referred to as sparse matrices): Sparse matrices have a lot of zeros in the matrix. Operating upon them (like multiply zero by zero) is both waste of computing resources and meaningless.
- Do not have direct access to the entire matrix at once

In contrast to the direct solver, the iterative solver starts by assuming an approximate solution for the unknowns {u}. The solution is iterated upon to reach an “exact” solution.

## What is a sparse matrix?

Sparse matrices are commonly encountered in Finite Element Analysis. As seen above, FEM involves a lot of matrix operations. In order to save both time and storage space, sparse matrices are preferred.

**Fig 02: A matrix being converted to a sparse matrix**

As shown in Fig. 02, the matrix on the LHS, where all non-zero numbers are on either side of the diagonal and zeros are far away, are commonly known as banded matrices. Such a matrix has a lot of zeros. This matrix is compressed and stored in a format where the zeros are dropped. Such a matrix is called a sparse matrix.

**Fig 03: Symmetric matrix being converted to a sparse matrix**

As shown in Fig. 03, the advantage is even higher when a symmetric matrix is being stored in “sparse” format. In this case, only half of the matrix is stored. As it can be seen, the number of non-zero quantities are much higher in the *A*-matrix in Fig. 03. Yet the total space taken is the same as *B*-matrix in Fig. 02. Instead of 36 places only 18 places are being used. Imagine the savings if this matrix was million times million in size!

## How does an iterative solver actually work?

Just as earlier, let us say we are solving the equation [K]{u} = {f}. Then an approximate initial guess {u0} is made. Using this {u1} is calculated and then again {u2} and so on. The procedure is continued until we reach a solution {un} such that [K]{un}-{f} is almost zero. Remember that computer cannot reach an exact zero when using real numbers!

**To give an elaborate answer:** A simpler matrix [A] is chosen to start with. A general choice is [A] = diag[K]. Only the diagonal elements of [K] are taken in [A] to start with. Using this matrix [A], {u1} is calculated from {u0} by directly solving the equation

Iterating for n-steps, the equation becomes

This continues until {u} which satisfies the original equation is found.

In addition, an iterative solver uses a procedure called “Pre-conditioning”. In the simplest terms, the condition number of a matrix can be expressed as the ratio of the absolute value of biggest to smallest number in the matrix. As the condition number increases, the equation system becomes less stable.

When an iterative solver is used, a pre-conditioning procedure is done before starting the solution process. This improves the condition number of the matrix. Consider the above system [K]{u} = {f}. Now multiply by a matrix [M], then we have [M][K]{u} = [M]{f}. This does not change the solution {u} but depending on the properties of matrix [M], can change the overall behavior of the system.

## Direct solver: What are the options? How to choose among them?

SimScale supports three direct solvers for structural mechanics problems: **LDLT, **Multfront**, and MUMPS**. LDLT and Multfront offer nearly the same options and shall be discussed first. Following this, MUMPS and its options shall be discussed. A general rule of thumb to choosing direct solvers is based on the number of degrees-of-freedom in the system.

### Calculating the number of degrees-of-freedom of a system

Let us say that we are solving a problem in 3-D and the mesh has n-nodes. For a purely mechanical problem, we are solving for three unknowns at each node. These are the displacements along x-, y- and z-directions. So the total degrees of freedom is given by

For a more general case, like thermo-mechanical or electro-hydro-thermo-mechanical problems, where:

- number of dimensions = d
- number of unknowns at each node = p
- number of nodes in the mesh = n

the total degrees of freedom of the system is given as

### Choosing options for LDLT and Multfront

In general, the default options should work well. Yet, it could be useful to know what the numbers mean.

**Fig 04: Options of LDLT and** **Multfront solvers**

The options for these solvers are as shown in Fig. 04. The first option is “renumbering method”. Going back to the start of the article, we talked about sparse matrix and bandwidth. This option optimizes the node numbers such that the bandwidth of the matrix is as small as possible. SimScale uses different algorithms for optimization based on the solver used. It is strongly advised to use this option. For the LDLT solver, RCMK is the only option. Alternatively, for the Multfront solver, use MD for smaller problems with up to 50000 degrees of freedom and MDA for larger problems.

The second option is regarding “force symmetric”. Again, if we remember Fig. 03, it was seen that only half the matrix was stored if the matrix was symmetric. “True” could be chosen if one is certain that the matrices are symmetric in nature. This can significantly save both memory and time. For example, problems using linear elastic or hyperelastic material and no inelastic effects result in symmetric matrices. If one is unsure, using “False” could be a safe option.

The last option concerns the determinant of the stiffness matrix. This is generally used often in normalizing during the solution process and also related to the stability and uniqueness of the solution. If the determinant is used to normalize, it will be used in the denominator of a fraction. Thus, if it is very small, the fraction can increase rapidly leading to instability. The option sets to stop the simulation if the determinant is very small of the order of 10^-(Precision singularity detection). Always remember that the computer never reaches an exact zero! It is recommended to keep this option as “True”.

### Additional options for MUMPS

MUltifrontal Massively Parallel sparse direct Solver (or MUMPS) is a sparse solver that is optimized for solving the system of equations in parallel. It offers addition options for customization of parallel computing. For a general usage, the default options are recommended.

Alongside the above-discussed options, some option of interest to advanced users would be

**Renumbering scheme:**SCOTCH and PARD methods are quite superior. The automatic option could be recommended if one is unaware of these schemes.**Single precision:**It is strongly advised to use “False”. Using single precision could lead to faster computations but could also lead to convergence problems for nonlinear problems.**Distributed matrix storage:**“True” is recommended if one uses multiple processors for the computation. This will take advantage of parallel computing to save the results**Matrix type:**This sets the type of matrix: if symmetric etc. The “automatic” option is the best to use unless one is completely sure.

### Which of the three direct solvers to use?

If the size of the problem is small, say about 50000 degrees-of-freedom, LDLT could be a simple and easy to use solver. For larger problems, Multfront is recommended. For advanced users and those who intend to use parallel computing to the best, MUMPS is the best option available.

## Iterative solver: What are the options? How to choose among them?

SimScale supports two iterative solvers for structural problems: PETSc, and GCPC.

**Fig 05: Options in PETSc and GCPC iterative solvers**

Both **PETSc** and** GCPC** solvers use pre-conditioning and allow different options. As evident from Fig. 05, the options in GCPC are much more limited and also offered by PETSc. A general rule of thumb to choose between the two: GCPC is good for beginner user running smaller problems. PETSc is more suited for an advanced user running larger problems and interested in customizing.

There are two main options that need to be considered here: pre-conditioning technique and solution algorithm. As discussed earlier, Pre-conditioning technique improves the condition number of the matrix. The solution algorithm is what does the actual work of solving the system of equations.

There are several preconditioners available and it is recommended to explore them. There is no “fits all” solution here. In general, MUMPS LDLT is a recommended option as it takes advantage of the MUMPS package. Personally, SOR is my favorite since it offers stability for the problems involving nonlinear elasticity.

With regard to the solvers, GCPC is limited to CG solver. PETSc offers several other options. CG (Conjugate Gradient) and CR (Conjugate Residual) work only for symmetric matrices, GCR (Generalized Conjugate Residual) treats all matrices. GMRES is my favorite! GMRES provides a great balance between computational speed – robustness – reliability. If you are unsure, GMRES is always a safe bet!

## What is the “Nonlinear resolution” option?

The “nonlinear resolution” option provides more customization options for the user. At each global iteration (of direct solver) or local + global iteration (of iterative solver), a simple algorithm is used to solve the equations. Here simple algorithms like Newton-Raphson, quasi-newton etc are used. For the direct solvers, only “Newton” option is available. “Newton-Krylov” is optimized for iterative solvers and strongly recommended.

Firstly, it is a good practice to use the same option for both the “prediction matrix” and “jacobian matrix” options. Using the “tangent matrix” calls the newton method, while “elastic matrix” calls a quasi-newton method. The quasi-newton method is recommended for coupled problems where the matrices might not have all the nice properties. For example: in a thermo-mechanical problem, quasi-newton method is well suited. Otherwise, the Jacobian option (or full newton method) is strongly recommended for accurate solutions to structural problems.

If the problem is well-posed, the newton method demonstrates “quadratic convergence.” This means that convergence is reached in 3-7 iterations. In several scenarios, this might not always happen. One of the most common reason is that the chosen time step is too large for that part of the problem. In such a case, the problem will not reach convergence. If an automatic time stepping is chosen, then the time step size is reduced till convergence is reached. The maximum number of iterations when the system declares a non-convergence and reduces the time step can be selected here. The default value of 15 is generally sufficient.

## Is line search necessary? When should it be used?

Line search reduces the calculated step size into much smaller increments. Since the step sizes are smaller, this automatically helps to improve convergence. Line search is not always necessary even in the presence of nonlinearities. It is recommended in elastoplastic problems that involve yielding of materials. Here the derivative of the stiffness matrix is rapidly changing at the yield point. Line search can significantly improve the convergence here. However, the line search is discouraged in the presence of contact constraints.

For a purely elastic problem, a line search is generally not necessary!

## Summarizing the article

**Direct solver or iterative solver?**- Direct solver if number of degree’s of freedom is less than one million

**Which direct solver to use?**- If number of degree’s of freedom is less than 50000, LDLT works well
- For larger problems, Multfront is advised
- To use the full potential of parallel computing, use MUMPS

**Which iterative solver to use?**- PETSc and GMRES are my favorite
- For smaller problems, GCPC is also a worth a try

**Setting Nonlinear resolution options**- For direct solver use the “Newton” option and “Newton-Krylov” for iterative solvers
- For prediction and jacobian matrix: Use “Tangent matrix” for structural problems and “Elastic matrix” for coupled problems

**When to use line search?**- Useful for elastoplastic problems involving yielding of materials

**Download the company’s booklet to learn more about simulation with SimScale.**