P-Fea Course. Lesson 2- Basis Function Continuity

This course will continuously refer to [1]. That book is highly recommended as a reference. Detailed handwritten notes of important derivations for stressRefine are also available here.

For conventional finite elements the basis functions used to represent displacement are continuous at interfaces between elements as long as the adjacent elements have the same nodes at the interface. These functions are often called “shape functions” because they’re the same functions used to represent the element mapping from natural coordinates to physical coordinates, which determines the element’s physical shape. Here will make a distinction. “Basis functions” is used to describe functions representing displacement, while “mapping functions” (interchangeable with “shape functions”) are used to describe functions representing the element mapping.

Before I get to how continuity of basis functions, let me describe a slight tweak I made in stressRefine to the classic heirarchical basis functions developed by Szabo and Babusca [1].

This image has an empty alt attribute; its file name is image-11.png

Variation used in stressRefine:

In the classic functions the only nodes are at the element corners and the edge functions start at polynomial order 2. In the stressRefine functions, the nodal functions are the same as the conventional finite element quadratic functions, which requires the introduction of mid-edge nodes. The edge functions start at polynomial order 3. The advantage of this change is that if the elements is a polynomial order 2, its is identical to a conventional quadratic isoparametric element. A possible disadvantage is if this would degrade the element stiffness conditioning. (The element stiffness matrices are assembled into a linear system Ku = f. If K is not well-conditioned, slight errors in f can cause large errors in u). But I was able to prove numerically that the element stiffness matrices are as well conditioned with the modified functions as with the classic ones, at least as high as polynomial order 8.

So for the stressRefine functions, continuity across element interfaces is assured if the adjacent elements have the same nodes, up until polynomial order 2. Additional work is needed to assure continuity at shared edges and faces for polynomial orders higher than 2.

To show why this is so for the case of edges, consider the odd-numbered polynomial functions for polynomial orders 3,5,7, … In the figure shown (for a 2D mesh), the upper element has the shared edge defined between corner 1 and 2 based on the local element numbering, so it, while the lower elements has the shared edge defined between corners 2 and 3, so the edge runs backwards from the other element. This means that the local definition of the p3 (short for polynomial order 3) function is different, and for the lower element that function has to be multiplied by -1 to assure continuity.

The situation is more complicated for faces. Consider an exploded view of two adjacent elements that share a face with global nodes 10, 11, 12:

But for the left element these correspond to local nodes 1,2,3 while on the right element they correspond to local nodes 3,1,2. So for the two elements the natural coordinates on the face rf, sf are defined differently, and the basis functions are misoriented on the right face with respect to the left.

This brings up an interesting historical aside about Mechanica. My colleague and cofounder Christos Katsis handled everything related to coding up the basis functions. He introduced the concept that there is a global face, and local faces can be rotated (or even reflected) compared to it. There is then a coordinate transformation to make sure the local face is compatible with the global face. It is further complicated by the fact that we need to relate elements natural coordinates r,s,t to face natural coordinates rf, sf. So there was a 3×2 transformation relating r,s,t to the local face, and a second 2×2 transformation relating the local face to the global face. None of this is computationally intensive, but it was tricky to code. Christos took care of all of this while we were in full-blown early start-up mode, with tons of distractions. A major feat of concentration!

When I started to develop stressRefine I was not looking forward to the equivalent of all of that, but fortunately found there’s an easier way to handle it. There was a hint in Szabo and Babusca’s textbook [1], when describing implementation of the hierarchical basis functions. When discussing the 2D basis functions for a triangle, instead of referring to local coordinates r and s they used l2 – l1 and  2l3-1 where l1,l2, and l3 are the area coordinates of the face. From the definition of the area coordinates, r = l2 – l1 and s = 2l3-1. I realized that they had done this because it guarantees continuity of basis functions for elements sharing the same face as long as the nodes of the face are numbered consistently. So we just need to introduce the concept of how the local node numbering of the face relates to the node numbering of the global face.

The global face is defined by the first element that owns that faces. So in the example above, the left hand element is encountered first, and the global node numbering of the face is local node 1 = global node 10, local node 2 = global node 11, and local node 3 = global node 12. Now when we encounter the face again in the right hand element, the local nodes of the face are 3,1,2. When we work with the basis functions of the face, compatibility is assure if we use rf =l1– l3 and sf = 2l3-1 when referring to the face for the second element. This is for 2D, It turns out in generalizes to 3D if we use the same idea with volume coordinates instead of area coordinate. This also takes care of relating element r,s,t to face rf, sf.

I’ll show how this is coded up below. In short, we introduce a variable gno for “global node order”, so for the example face on the right hand element gno(1) = 3, gno(2) = 1, and gno(3) = 2.

All of this works for quadrilaterals in 2D, and bricks and wedges in 3D as well, But for bricks, for example, we define rf, sf using the linear mapping functions of the brick N1 through N8 instead of volume coordinates.

Details of Basis functions for a tet

Nodal functions: These have the value 1 at one node (corner or midedge) and 0 at all others, so they are the same as the quadratic shape functions for a 10 noded tet [2]. for example N1 = (2L1 -1)L1 is a typical function for a corner node, N5 = 4L1L2 is a typical function for a midedge node.

The edge, face, and volume functions are the same as described in [1] except the edge functions start at p3.

Edge Functions: These are nonzero on one element edge and zero on all others. They start at p3 because the quadratic edge functions are the same as nodal midedge functions. They are the integrals of Legendre polynomials shown here blended with volume coordinates so they are zeroes on all other edges:

bn = 4LILJφp(re)/(1-re2) where φp is the integral of Legendre polynomial over the edge for polynomial number p, re = LJ – LI is the natural coordinate on the edge, and n is the basis function number of the element. The term LILJ with the volume functions assures this reduces to the nonzero 1D function φp on the edge with corners I, J, and is 0 on all other edges, and zero on all faces except those with corner I,J.

Face Functions: These are nonzero interior to one face, zero on all edges, and blend to 0 on all other faces.

bn = 4LILJLkPpr(rf)Pps(sf) where Ppr is the Legendre polynomial in the rf direction and pr is the polynomial number in that direction, Pps is the Legendre polynomial in the sf direction and ps is the polynomial number in that direction, rf = LJ – LI and Sf = 2Lk-1 are the natural coordinates on the face, and n is the basis function number of the element The term LILJLk with the volume functions assures this is nonzero interior to the face with corners I, J,K, and is 0 on all edges,and all other faces,

Volume functions: These are 0 on all edges and faces, and nonzero in the interior to the element.

bn = LILJLkLlPpr(r)Pps(s)Ppt(t) where Ppr is the Legendre polynomial in the r direction and pr is the polynomial number in that direction, and similarly for s,t. The volume functions assure that these are 0 on all edges and faces.

Wedges and Bricks

Similar techniques are used to blend the 1D functions wedges, and bricks, For calculating element stiffnesses, we need the derivatives of the basis functions with respect to the element natural coordinates r,s,t. This is a little messy but not difficult to code once you have the formulae derived. The basis functions and their derivatives for tets, bricks, and wedges are described in my notes here. Note that we do not have to actually evaluate the integrals of Legendre polynomials, there are regression formulae for computing them described in [1].

Use of “global node order”: I described assuring continuity of the basis functions above. For edges we just need to correct if the local edge runs backwards from the global edge. A direction variable, which is +/- 1 is assigned to each edge. The basis function is multiplied by “direction” to assure continuity (the same thing could be achieved by switching I and J in the edge basis function if the edge runs backwards, but the direction flag is easier. For the face functions of a triangular face we need to use the “global node order” described. So in the face functions shown above, I,J,K would normally be the element local corner number associated with local corner numbers 1,2,3 of the face. Instead we use I =gno(1), J = gno(2), and K = gno(3). In the example above, for the face on the right hand element shared by the two elements in the exploded view, gno(1) = 3, gno(2) = 1, and gno(3) = 2. Note that the discussion here is “1-based” but the code is in c++ so in the code is is “0-based”, e.g. gno(1) is really gno[0] in the code.

This may all seem messy but it is only necessary if you dive deep into the bowels of the basis function routines, such as TetBasisFuncs is file basis.cpp in SRLibSimple. To actually use the library, two setup actions need to be performed:

Create Global Edges For all elements in your code, create corresponding SRelements. This is done by first allocating space for them:


where nel is the number of elements. Then loop over your elements, and, for each, call

model.CreateElem(id, userid, nnodes, nodes[], mat)

where nnodes is the number of corner and midnodes in the element, e,g, 10 for a quadratic tet, nodes is the vector of node numbers for the element (corners first followed by midedges), and “mat” contains the element material properties. This are all be described in a programmers guide I uploaded here. The global edges automatically get created on the fly, with the “direction” flag assigned.

Create Global Faces:

After nodes and elements have been defined, just call


This creates global faces for all element faces in the model, and assigns the global node order variable. It also determines which elements own each global face, and which faces are boundary faces (only have one owner).

Subsequently the basis function routines can be called as needs and continuity will be assured.

Homework problem: The volume coordinates for a tetrahedron have the property that they are 1 at a single node, and 0 at the opposite face. For example L1 is 1 at node 1, and 0 on face 2,3,4. Given that property, explain why the blending works properly for the edge, face, and volume functions: For edge functions, the functions are nonzero on the edge, and zero on all other edges and all faces that don’t touch the edge. For face functions, the functions are nonzero on the face, 0 on all other faces. For volume functions, the functions are only nonzero in the interior of the volume.


  1. Szabo, B, and Babuska, I, Finite Element Analysis, Wiley, NY, 1991

stressRefine Source on github

The stressRefine Source is now on github at stressrefine/sros. This is still the windows version that requires visual studio to build (there is a free download for VS here, click the download button under the “community version”). I put a readme.txt on github that explains how to build stressrefine using VS. There is also the document “Using The StressRefine source.pdf” That explains which of the four projects you need depending on your purpose. To just link the stressrefine library to your own executable you only need “SRlibSimple”.

My next step is a Linux version, for which I’ll provide a new readme to explain how to build, as well as makefiles.

OpenSource StressRefine Uploaded

The source code for stressRefine has been uploaded here. It is all free per the terms of the GNU General Public License. There is a zip file stressRefineSource.zip containing four folders bdftranslate (the Nastran translator), SRui (the user interface), SRlibSimple (the stressRefine library), and srwithmkl (an executable analysis engine using the library). All are free per the Gnu Gpl. However, the analysis engine srwithmkl needs the Intel pardiso direct solver from the Intel Mkl library, and use of that library is subject to the Intel license. This is all explained in the document “Using the stressRefine source” which is in the same link above. Use of Gnu gpl for code that needs a separate proprietary library linked to it is allowed, though a bit awkward, as explained here. I’ll be working on replacing Intel/pardiso to remove this limitation. Only srwithmkl needs pardiso, so the other three, bdftranslate, srui, and srlibsimple, are not affected.

Please see the document “Using the stressRefine source” for instructions on how to make use of it. Options are to use the library only, with your own Fea code executable, to use the library linked to srwithmkl to create an analysis engine, or to build all four of the folders mentioned above to get a fully functioning UI, translator, and analysis engine.

The first lesson in the P-Fea course was posted today. This is introduction and theoretical background. Starting with next monday’s lesson I’ll dive into topics that will explain the code in the stressRefine library in depth.

P-Fea Course: Introduction

Take this course to learn what’s “under the hood”

This course is intended to teach students and developers who want to learn how to write a p-adaptive code using the stressRefine library. It will start with the theory behind p-adaptivity, then take you through the practical aspects of creating p-adaptive elements using the library, and writing analysis routines that do p-adaptive analysis. It will continue on to show how to do analysis of local regions. This will all be presented first with simplified elements and solution assembly routines, then I’ll cover some code optimization and parallelization.

Having taken at least one finite element course will be assumed as a prerequisite. Familiarity with basic numerical methods concepts will be assumed, but concepts like efficient equation solution will be covered.

Suggested Reading:

Szabo, B, and Babuska, I, Finite Element Analysis, Wiley, NY, 1991. Superb text for understanding the implementation of heirarchical finite elements

Zienkiewicz, O, and Taylor, R, Finite Element Method. Volume 1- The Basis, Butterworth, 2000. Excellent introduction to Fea in general, plus error estimation and adaptivity concepts.

Lesson 1- FEA Displacement Approximations And Convergence

Approximating Functions

Finite Element Analysis uses an integral expression over the entire model being solved, rather than solving the governing equations (such as the equilibrium equations in elasticity) point by point throughout a model.  A series of functions is used to approximate the unknowns, such as the displacements u,v,w in the x,y,z directions in structural analysis:

u = ∑i = 1 to N aifi(x,y,z)   v = ∑i = 1 to N bigi(x,y,z) w = ∑i = 1 to N cihi(x,y,z)

These functions have to satisfy some conditions such as continuity, and displacement boundary conditions (for example if you have an immovable support, the displacements must be zero there).

The most famous use of this approach is the Ritz method, also known as Rayleigh-Ritz. A series of unknown functions like those in the equation above is inserted in an integral expression for the potential energy (strain energy plus work done on the boundary). If the energy is E, then the minimum potential energy is found by setting the derivatives of E with respect to the unknown coefficients to zero, for example

∂E/∂ai = 0 for I = 1 to N,

for the displacements in the x direction (u), and similar expressions for v and w. This results in a linear system of equations that is solved for the unknown coefficients.

Consider a one-D example, solving for the deflection of a beam of length 100. Now u represents the lateral deflection of the beam, and we approximate it with some functions:

u = ∑i = 1 to N aifi(x)

A popular choice is fi(x) = sin(i*π*x/L) where x is the length of the beam. Since L = 100, here’s what the first few functions look like:

These functions are not able to produce a nonzero value for displacement on the boundary. So if there’s an enforced displacement, we could introduce a linear function that’s nonzero on the boundary or a local function that’s nonzero on the boundary and decays rapidly into the interior, like the blue or orange functions shown:

These boundary functions can be used in addition to the sin functions.

This technique works well with functions that are smooth and linearly independent, and often gives accurate results with a few functions. It also helps if the functions are orthogonal to each other or close to it. We want ∫fi(x)fj(x)dx over the body, in this case the length of the beam, to be close to 0 if i≠j and close to 1 if i = j, and that works out well for the sin functions.

To illustrate the concept of completeness, suppose we omitted f1 in the figure above. The series would be incomplete, and even if we used an infinite number of functions, it might not converge to an accurate answer. The typical functions used in conventional finite elements codes are complete, and will converge to an accurate answer as long as enough elements are used (fine enough mesh). In a p-type finite element code, the mesh can be kept fixed, and an accurate answer will results as long as the polynomial order is high enough.

Now suppose we wanted to solve a 1D problem where the exact solution for displacement looks like this:

We could use a lot of the sin functions shown above, plus add in the boundary functions. But that’s a lot easier to do in 1D, and in 2D for simpler shapes like rectangles. It gets a lot harder for complex 3D shapes to come up with functions that can satisfy the boundary conditions and be continuous. One way to get around this is to use functions that are piecewise continuous. If we subdivide the line in the figure above into multiple segments, we can represent the displacement in each segment with more simple functions like polynomials. For example we can use linear functions or quadratic lagrange polynomials:

These functions have the property that they nonzero at a node (such as at the ends of the segment) and zero at all other nodes. Continuity is assured if the displacement is the same at nodes shared by adjacent segments. The segments are the “elements” in this 1D example. Now to follow the very wiggly displacement above well, we’d need a lot of the linear elements, Fewer of the quadratic elements would be needed. We can take this further by introducing higher order elements:

These are the 1D versions of the basis functions proposed by professors Szabo and Babusca [1]. In this example we could get by with only one of these segments if we went to high enough polynomial order. But there is a limit to how high the polynomial order can go in practice, because using very high polynomial order is computationally intensive. This means in more complicated problems we’d still need multiple elements. So there are two basis approaches to adaptivity: If we refer to the length of a segment as “h”, the basic idea with simpler elements, like the linear and quadratic ones above, is to achieve accuracy by keeping the type of element fixed (for example it remains quadratic), and to make “h” smaller by using more elements, This is h-adaptivity, which is achieved by adaptive mesh refinement. An alternative is to use higher order elements where the polynomial order can be raised. Keeping the number of elements fixed, we successively increase the p-order to achieve accuracy. This is p-adaptivity. It has the advantage that it can be achieved entirely inside the finite element analysis code, while h-adaptivity requires collaboration between the analysis code and the automesher. The disadvantage of the p-adaptive approach is that the elements are more complicated. This is especially true if you have a library of existing elements that would have to be converted to p-elements. The purpose of the stressRefine library is to provide tools to make such conversion easier, and in this course we’ll go into detail about showing how to use the library for that purpose.

The piecewise continuity approach is extended to 3D by using 3D elements. These are typically simple shapes like bricks (hexahedra), wedges (pentahedra), or “tets” (tetrahedra). The conventional h version of quadratic elements looks like this:

These a called “serendipity elements” which are based on a variation of the 1d Lagrange polynomials shown above. This is simplest to explain for the brick (c). Lagrange polynomials can be used in x,y, and z, and multiplied together. But this would require there to be additional nodes on the middle of the faces of the element, as well as the interior of the volume. It turns out the functions can be modified so that such nodes are not needed, which simplify the work of meshing, and good accuracy is still achieve. I’ll omit the details of that because my focus is on p-adaptivity, but it’s explained well in [2], and there’s a discussion online here. Until now I haven’t discussed mapping of elements. The basic shape shown above must be capable of being distorted into physical space, shown here in 2D:

The “natural coordinates” of the original elements are r,s,t. These need to be mapped to the physical coordinates x,y,z. To do this, we assume the position anywhere in the element is described by approximating functions called “shape functions”:

x = ∑i = 1 to N aifi(r,s,t)   y = ∑i = 1 to N bigi(r,s,t) z = ∑i = 1 to N cihi(r,s,t)

You can use more or less functions to represent the shape than the displacements. This means that more or less coefficients, or “parameter”, are used. If less are used for mapping than displacement, the element is “subparametric”. If the same number are used, the element is “isoparametric”, and if more are used for mapping the element is “superparametric”. Superparametric elements are not the best idea because they can have trouble representing rigid body motion. Isoparametric elements are the most commonly used conventional finite elements. The elements in stressRefine start out isoparametric at polynomial order 2, then become subparametric. I’ll go into more detail about that in a future lesson.

An obvious choice for the shape functions is to use the same functions used for the displacements, so not only as the same number of functions used, but the same functions. This is typically what is meant by a conventional isoparametric element.

Higher order polynomial elements are created by blending the 1D polynomials shown above in 2 and 3 dimensions. Here is what that looks like for the version created by Szabo and Babusca [1]:

Figure 1

I’ve discussed these functions in detail in a previous post. I discussed there how the number of each type of function must be chosen for completeness. Continuity requirements for these elements are trickier than for conventional elements. For conventional elements, continuity is assured as long as the element have the same nodes on shared faces and elements. An obvious violation of that would occur if a linear element were placed next to a quadratic one. If the displacement cases the quadratic element’s edge to become curved, the edge of the linear element cannot follow, so a crack opens up that is not present in the physical model:

The same issue occurs for an incomptibale mesh, which can occur, for example, at a part boundary in an assembly. There are methods to resolve this such as “glued contact” constraints that will be covered in a future lesson.

For p-elements, it is not enough to assure the nodes, which are only at the corners of the element, are compatible. If the adjacent elements are at different polynomial orders, discontinuity occurs at a shared edge. But they can be discontinuous at a shared edge or face even at the same p-order if you’re not careful in the definition of edges and faces, as discussed previously. Assuring continuity when using higher order polynomial functions will be covered in detail in the next lesson, where for the first time we’ll take a peak at some code.

Comments on the Ritz Method

Minimizing potential energy, as is done in the Ritz method, is only valid for linear problems and a subset of nonlinear problems for which a potential energy function exists. But there are more general alternatives, such as the Galerkin method that work in general nonlinear problems. The idea is the same, approximating functions are introduced into a global equation integrated over the body, and you come up with a set of equations to determine the unknown coefficients. However, if the governing equations are nonlinear and/or time dependent, then so will the equations for the coefficients. This requires “time marching” numerical techniques and solution methods like modfied Newton-Raphson. The Galerkin method is also readily generalized to other physical systems.

Up into the 1950s, before the advent of finite elements, a great deal of effort was put into coming up with functions that would work for more complex shapes. In the US, the National Advisory Committee for Aeronautics (NACA), predecessor to NASA, published a lot of solutions obtained this way for shapes useful in aeronautics. One of my professors at Stanford, Jean Mayers, was one of the experts who had contributed in that era. This seems to be all obsolete now with the advent of finite elements. But there is the sticky point that automeshers do not always succeed in creating the desired mesh, so it’s intriguing to revisit a meshless approach more similar to the Ritz method. That is exactly what was done with the external method in SimSolid. Implementing this requires sophisticated techniques to assure continuity at part boundaries, which is similar to the issue for incompatible meshes, which we’ll touch upon in a future lesson.


We’ve covered approaches to come up with approximation functions for displacements to be able to represent arbitrary motion in complex domains. We’re not done with that because the major issue of assuring continuity for higher order elements remains. That will be the subject of lesson 2. Once we have approximating functions, how to we choose how many are needed to achieve desired accuracy? We need a way to estimate errors in the current solution. Then, from the error-estimate, a way to figure out how many functions are needed. For h-adaptivity, we use that to estimate required element size, while in the p-method we use is to estimate require polynomial order. That is the subject of lesson 3.

Homework problem: The volume functions for hierarchical elements are 0 at all element nodes, edges, and faces. Do they play any role in continuity? If not, why are they needed?


  1. Szabo, B, and Babuska, I, Finite Element Analysis, Wiley, NY, 1991.
  2. Zienkiewicz, O, and Taylor, R, Finite Element Method. Volume 1- The Basis, Butterworth, 2000.

StressRefine Source; Sparselizard; P-Fea course 1st Lesson

The source for the stressRefine library will been uploaded on my stressRefine google drive Monday. l’ll make it available on github also. This allows you to work p-adaptivity into existing finite element codes and will be the basis for the P-Fea course! This is open-source per the Gnu public library license (gpl). I’ll also upload source for an executable that uses the library. This is also open-source and free, and following the spirit of the Gnu-public license, may be freely used with the only restriction being it cannot to redistributed for a fee without my permission. But Jeremy Thieler of CAEplex, who has a lot more experience with open-source software than I, pointed out that it’s not appropriate to use the Gnu public library license for a code that makes use of proprietary software. And the stressRefine executable uses the Intel Mkl library. This is available free from Intel if you use their community license but is still proprietary to Intel. I’m looking into using an open-source sparse solver instead of Intel pardiso, which will alleviate this issue.

Since the “split” of stressRefine into library and executable went smoothly, I’ve decided to use that architecture going forward. Last post I said there were going to be two versions of the source, one complete as an executable, and one split into library and executable. But the library/executable version is cleaner and easier to support and has the same functionality, so I’ll just stick to that. I’ll be offering the “intermediate” and “full” versions of the executable shortly when I have them cleaned up. The simple version has some limitations: it is slower because it doesn’t use any multithreading, uses a more readable but less optimized element formulation, and it cannot do assemblies or local region analysis. It is best for learning the architecture. The other versions are more optimized but less readable, although I’ll provide good documentation for them.

The Sparselizard library

Jeremy also made me aware of the Sparselizard library. With an intriguing name like that I had to check it out. This is a p-adaptive, nonlinear, multi-physics finite element library developed by some smart people at the University of Liege in Belgium. It has other nice features like the mortar method that allow use of incompatible meshes. That method is similar to, but more general than, glued contact which is used for parts with incompatible meshes in assemblies. The mortar method is especially useful in multi-physics at domain interfaces (such as between fluids and solids). Sparselizard appears to be extremely fast, they mention solving 3D models with millions of degrees of freedom in minutes. All of this is very promising, so I’m going to investigate it in more detail over the next few weeks. I’d like to look into wrapping some of stressRefine’s specialized technologies for stress analysis, such as singularity detection and local region extraction, around the sparseLizard library. I’ll keep you posted how that goes. I’d also like to work examples from the sparseLizard library into my adaptive finite element course.

I’ll also post the first lesson for that course this week on Monday.

Sorry this is a little later than I’d originally promised, looking into the licensing issue slowed me down a but.

Coming Soon: Open-Source for stressRefine and Free P-Fea Course

This image has an empty alt attribute; its file name is image-19.png
Take the new free P-Fea course to learn what’s “under the hood”

Open-Source stressRefine

I am busy making the source code for stressRefine available opensource. It is free for academic, research, and engineering use per the GNU open-source General Public License (https://www.gnu.org/licenses/licenses.en.html#GPL). The source will be available in several versions:

  • Simple Version: does analysis of full FEA meshes only. This version does not do “breakout” (local region analysis). The elements and solver bookkeeping are simplified to be more easily readable, and are not implemented in parallel. This version is less efficient, but is recommended for starting out to understand the concepts
  • Intermediate Versions: Same as simple, but the elements and solver bookkeeping are optimized and implemented in parallel (multi-threaded using openMP).
  • Full Version: Same as intermediate version but also does “breakout” (local region analysis).

The above 3 versions are in the form of a single project that builds an executable. They are presented in Microsoft Visual Studio C++ projects, using the intel MKL library and the sparse solver pardiso included with it. I am also working on making versions available with Linux.

These versions are for users who want to use and customize a pre-existing code, or get stressRefine to run on a different platform. But I’d also like to support users who want to work p-adaptivity into your own codes. To this end, I’ll also be making the p-adaptive support routines available in a library. Be calling this library for support, you can easily convert existing conventional isoparametric elements to p elements. It is my hope that the library will alleviate the need to “reinvent the wheel” to create p-elements. A lot of the grunt work, for example of calculating 3D polynomial basis functions and assuring continuity across element and face boundaries is automatically done by the library. Students, developers and researchers can concentrate on more fun aspects like new error estimation techniques or novel applications.

Free Course

I’m also introducing a new free course. If you look at the menu at the top you’ll see it now has “course: P-Fea” . The course is intended to teach students and developers who want to learn how to write a p-adaptive code using the stressRefine library. It will start with the theory behind p-adaptivity, then take you through the practical aspects of creating p-adaptive elements using the library, and writing analysis routines that do p-adaptive analysis. It will continue on to show how to do analysis of local regions. This will all be presented first with simplified elements and solution assembly routines, then I’ll cover some code optimization and parallelization.

All three full executable versions will be available for upload by Friday, along with the first course lesson. The simple version in library form will be available by the end of next week.

Enhanced StressRefine Is Now Available

An new version of stressRefine is now available free for download. This has several enhancements over the previous version.

Nastran results can now be accessed in three different forms. Only Ascii (.f06) results are available in the previous version. Nastran binary (.op2) results can be accessed in the new version. In addition, data tables output by a postprocessor for the local region of interest can also be read, as described in my previous post. Using binary results is convenient because it is the default Nastran output, while outputting a data table is useful for large models because it allows data to be specified only for the region of interest, making translation faster.

StressRefine also now outputs stresses in Nastran ascii format, which can be displayed in your own postprocessor, in addition to the fringe plots available in the opensource postprocessor cgx which is bundled with stressRefine.

My favorite enhancement is automatic “breakout” of parts from an assembly for use in local region analysis which was described here. This is straightforward if you are working with a CAD program and have access to the model tree, but was tricky to implement when only the model data in a Nastran input file is available.

StressRefine Innovations

StressRefine has introduced multiple innovations which include:

  • Robust Error Estimation: StressRefine uses a blend of the Zienkiewicz-Zhu error norm [1] and traction discontinuities across element faces for error estimation. This is accurate for a wide range of polynomial order in elements. For models with localized stress concentrations, the polynomial order stays low in most of the model and is only raised locally where needed.
  • polynomial basis functions that are compatible with conventional isoparametric finite elements (such as 10 noded tets). StressRefine p elements can be mixed with conventional elements in the same mesh, and results and boundary can readily be exchanged between conventional meshes and stressRefine. These basis functions are packaged into a convenient library which allows any developer to use p-adaptive elements without “reinventing the wheel”.
  • Automatic detection and isolation of a variety of common singularities:
  • Automatic smoothing of nodal loads on surfaces: for example, if you apply a pressure load in a preprocessor, it may not show up as a pressure load in the Nastran input file, but instead as an “equivalent nodal load”. But it is only equivalent for quadratic elements. StressRefine has to notice that the all the nodes of a surface are loaded so this is really a distributed load, not multiple concentrated loads, to get a smooth result. An “energy-smoothing” procedure is then applied to recover the user’s original intent.
  • Automatic breakout modeling, including automatic extraction of parts from large assemblies.

StressRefine’s architecture is designed to easily allow it’s polynomial basis functions and elements to be incorporated into other FEA codes. After I do a bit of cleanup and documentation, the source code for stressRefine will shortly be made available.


  1. Zienkiewicz, O, and Zhu, J, “Superconvergent Patch Recovery and a posteriori error estimation in the finite element method”, Int. J. Num. Meth. Engr, 33, 1331, 1992

Using Data Tables From an FEA Postprocessor for Breakout Modeling

A data table is a easy way for a user to specify a region of a model from which to output desired quantities from a postprocessor to a file. I’m going to demonstrate how this is done for FEMAP, doing it with other postprocessors should be a similar process. For the purposes of breakout modeling the quantities needed to be output are node ids and displacements.

The motivation for doing this is that for large models, from which you want to analyze a relatively small local region, even the translation step can become time consuming. You may have a million element model and want more accurate results in a region of 10000 elements. It is inefficient to have an analysis tool like stressRefine wade through the results of a million elements to extract those needed in the local region.

A data table allows you to simply “rubber-band” select a region of interest. I’m going to show doing this with box-selecting.

So here’s how it goes in Femap:

Now we need to tell the Femap selector to box select nodes:

Note there are lots of other choices besides box. Now select a box a bit larger than the local region of interest. It doesn’t have to be precise, but it helps to include the whole part for “breakout by part”:

The data for all the nodes in the box will go into the data table. Now we specify what quantities to output (node ids and displacements).

Table is now ready to be saved:

This will all be in the stressRefine user’s guide that will be updated shortly. This procedure is a lot simpler to do than it is to explain, it takes just a few seconds once you have tried it a couple of times.


The example of breakout by part in an assembly from my previous post used a data table. Extraction of the breakout model took 2.4 secs using the data table, while it took over 30 additional seconds to translate the results and extract the model directly from the Nastran binary output results. This is for a model with less than 300,000 elements. The time savings will be much more for larger models.

Automatically Determining The Point of Maximum Stress In The Region Selected

In stressRefine, the user can specify the center of the local region using a node id. But this is not necessary by default:

The point of maximum stress in the region enclosed by the data table is used by default. Determining this point quickly without having to translate the results for the full model requires stressRefine to recreate the stresses in the elements that own the nodes in the data table. This is straightforward because displacements are known, and it is known that Nastran used quadratic elements (e.g. 10 node tets) in the solution. Calculating the stresses in the local region in the example above took much less than a second (for 55000+ nodes in the data table, owned by 33000+ nodes).

Breakout Analysis of Parts in An Assembly

I’ve recently discussed how local regions can be broken out of large models to more quickly calculate accurate stresses. These regions can have ragged boundaries as shown in the example below:

This does not affect the accuracy of the results as long as elements on the ragged boundary are treated as sacrificial. But sometimes the region of interest for more accurate stresses is a part in an assembly. For this case it is nice to have the entire part be the breakout region and figure out how to automatically extract it. Working with a CAD model, with access to the model tree makes this straightforward to implement. StressRefine’s interface with Solidworks Simulation does that, for example, by determining which surfaces form the boundary of a part using information from the Application Programming Interface (API). Through the API it is also easy to determine which surfaces are free boundaries and which are interfaces with other parts, from which the breakout boundary conditions can be determined. Here is an example using the Solidworks interface:

Working with Nastran models it appears the equivalent information is not available. However, recently I figured out how to use interface constraints (“bsurfs”) to cleanly breakout out parts from assemblies for use as local regions. Working from the center of the region of interest (which is either specified by the user by selecting a node, or is the point of max stress in the model), the mesh is traversed topologically, but does not go past faces with bsurfs.

The traversal to identify the part is shown progressing in the figure below. I also made a slide show that is a time-lapse sequence of the traversal which is shown here (you need to download it and run it with powerpoint or the powerpoint reader for it to show automatically, or in google slides hit view/present (ctrl-F5) to show the show but you’ll have to use page down or right arrow to advance it, it powerpoint it is automatic).

The overall model has 297,000 elements but extraction of the local region only took 2.4 secs, and stressRefine achieved a accurate p-adaptive solution of the local region in 18 secs. It helps that the local region is only 9963 elements, that’s the beauty of breakout analysis.

I think this “breakout by part” procedure is a useful enhancement to this technology. Just specify a point in the region of interest (or specify using the point of max stress in the model), and the appropriate part in an assembly will be automatically extracted as a local region. If that part has a very large mesh, stressRefine can also use a subset of the mesh in the part as the local region, again extracted automatically.

StressRefine and StressCheck

There are currently two well-known p-adaptive commercial FEA codes on the market, Mechanica and StressCheck.

Mechanica was an ambitious undertaking that those of us at Rasna and those who joined us later working for PTC worked hard on for 25 years. Towards the end we had added some pretty advanced features like large strain plasticity and large displacement sliding contact. We took pride in trying to make everything we did adaptive, by default. So all the underlying solutions in nonlinear analysis or optimization runs were p-adaptive. In addition, step size control in time-dependent solutions and nonlinear analysis were always adaptive. This did not always work, but it’s what we aimed for. Unfortunately PTC seems to be going in other directions and is not emphasizing Mechanica much these days. Recently they made a corporate deal with Ansys and our now selling Discovery Live under the name “Simulation Live”.

Another major commercial code on the market is StressCheck from ESRD. ESRD was founded by Professor Barna Szabo, who together with Professor Ivo Babuska invented the P-method, proved it converges, and worked out the important details of its implementation. ESRD recently had its 30th anniversary, and StressCheck is an excellent product which its customers use to achieve high quality results from FEA analyses. It has it’s own user interface and mesher. The user imports the CAD geometry and creates a high quality model in StressCheck, which then uses p-refinement to get accurate results. Users can also do breakout modeling. A local region, part of a large global model solved in another FEA code, is recreated in StressCheck. StressCheck provides tools to apply the boundary conditions to the local region from the solution to the global FEA analysis. StressCheck can do linear and various types of nonlinear analyses. I really like their CAE handbook, and they have a superb solution for composite joints.

In contrast, StressRefine is a niche product created for the sole purpose of being an accessory to other FEA codes, to extract more accurate stress results. It does not require the recreation of the model, instead working directly on an existing FEA model and using p-adaptivity to extract more accurate stresses from it. For large models it also does breakout analysis, but automates the process of creating the breakout region, applying interface conditions to it from the global model, and solving it adaptively for more accurate stress. So in it’s niche it requires less user intervention.

I would heartily recommend to anyone to use StressCheck. And perhaps consider my little niche product as an additional useful add-on.