
Contents: Weighted residual methods and Galerkin approximations; a model problem for one-dimensional linear elastostatics; weak formulations in one dimension; minimum principles in one dimension; error estimation in one dimension; construction of finite element basis functions in one dimension; Gaussian quadrature; iterative solvers and element-by-element data structures; a model problem for three-dimensional linear elastostatics; weak formulations in three dimensions; basic rules for element construction in three dimensions; assembly of the system and solution schemes; an introduction to time-dependent problems; and an introduction to rapid computation based on domain decomposition and basic parallel processing.

The approach is to introduce the basic concepts first in one dimension, then move on to three dimensions. A relatively informal style is adopted. These notes are intended as a “starting point”, which can later be augmented by the large array of rigorous, detailed books in the area of finite element analysis. Through teaching finite element classes for a number of years at UC Berkeley, my experience has been that the fundamental mathematics, such as vector calculus, linear algebra, and basic mechanics, exemplified by linearized elasticity, cause conceptual problems that impede the understanding of the Finite Element Method. Thus, appendices on these topics have been included. Finally, I am certain that, despite painstaking efforts, there remain errors of one sort or another. Therefore, I would be grateful if readers who find such flaws would contact me at [email protected] IMPORTANT: This document is under copyright. No part can be copied, electronically stored, transmitted, reproduced or translated into another language without written permission from T. I. Zohdi.

where the φi’s are approximation functions, and the ai’s are unknown constants that we will determine. Substituting the approximation leads to a “left over” amount called the residual:

$$r_N(x) = \mathcal{A}(u_N) - f. \qquad (1.3)$$

If we assume that the φ’s are given, we would like to choose the ai’s to minimize the residual in an appropriate norm, denoted ||r||. A primary question is which norm should be chosen to measure the solution and to determine its quality. Obviously, if the true solution is smooth enough to have a pointwise solution, and if we could take enough φ-functions, we could probably match the solution pointwise. However, as we shall see, this would be prohibitively expensive computationally. Thus, we usually settle for a less stringent measure, for example a spatially averaged measure of solution quality. This is not a trivial point, and we will formalize the exact choice of the appropriate metric (a norm) momentarily. Let us pick an obvious choice:

$$\Pi(r_N) \overset{\text{def}}{=} \|r_N\|^2 = \int_0^L (r_N(x))^2 \, dx. \qquad (1.4)$$

1 Weighted residuals and Galerkin’s method for a generic 1-D problem

Taking the derivative with respect to each ai and setting it to zero, we obtain, for i = 1, 2, ..., N,

$$\frac{\partial \Pi}{\partial a_i} = \int_0^L 2\, r_N \frac{\partial r_N}{\partial a_i}\, dx = 0. \qquad (1.5)$$

This leads to N equations and N unknowns. This method is called the “Method of Least Squares”. Another approach is to force the residual to be zero at a discrete number of locations, i = 1, 2, ..., N:

$$r_N(x_i) = 0, \qquad (1.6)$$

which can also be written as

$$\int_0^L r_N(x)\,\delta(x - x_i)\, dx = 0, \qquad (1.7)$$

where δ(x) is the Dirac functional.¹ This approach is sometimes referred to as the “Collocation Method”. Notice that each method has the form

$$\int_0^L r_N(x)\, w(x)\, dx = 0, \qquad (1.9)$$

where w(x) is some “weight”. A general name for these methods is the “Method of Weighted Residuals”.
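As a concrete sketch of one of these methods, consider the collocation variant applied to a model operator. The example below is an illustration invented for these notes, not from the text: A(u) = −u″ with f = 1 on (0, 1), u(0) = u(1) = 0, and an assumed sine basis φn(x) = sin(nπx), chosen because A(φn) = (nπ)² sin(nπx) is available in closed form.

```python
# Collocation sketch (assumed example): A(u) = -u'' = f, f = 1 on (0, 1),
# u(0) = u(1) = 0, basis phi_n(x) = sin(n*pi*x).
import numpy as np

N = 5                                   # number of approximation functions
x_c = np.linspace(0, 1, N + 2)[1:-1]    # interior collocation points x_i
n = np.arange(1, N + 1)

# r_N(x_i) = 0  ->  sum_j a_j * A(phi_j)(x_i) = f(x_i) = 1
A_mat = (n * np.pi) ** 2 * np.sin(np.outer(x_c, n) * np.pi)
a = np.linalg.solve(A_mat, np.ones(N))

def u_N(x):
    """Evaluate the approximate solution u_N(x) = sum_j a_j phi_j(x)."""
    return np.sin(np.outer(np.atleast_1d(x), n) * np.pi) @ a

# The residual vanishes at the collocation points by construction, and
# u_N is close to the exact solution u(x) = x(1 - x)/2 (so u(1/2) = 0.125).
print(float(u_N(0.5)[0]))
```

The same system could instead be built in the least-squares sense by minimizing the sampled residual with `np.linalg.lstsq` over many more points than basis functions.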

1.2 Galerkin’s method

Of all of the weighted residual methods used in the scientific community, one particular method, the Galerkin method, is by far the most widely used, and has been shown to deliver the most accurate solutions on a wide variety of problems. We now explain the basic construction. Consider the true solution, the approximate solution and the error, related through

$$u - u_N = e_N \;\Rightarrow\; u = u_N + e_N. \qquad (1.10)$$

As a helpful mnemonic, now consider them as vectors (Figure 1.1). Clearly, the error (eN) is smallest when eN is orthogonal to uN. The problem is that the error eN = u − uN is unknown. However, the “next best thing”, the residual, is known.² This motivates Galerkin’s idea, namely to force uN and rN to be orthogonal. Mathematically, this can be expressed as

¹ Recall, the Dirac functional is defined via
$$\int_0^L \delta(x - x_i)\, f(x)\, dx = f(x_i). \qquad (1.8)$$
² Although the error and residual are not the same, we note that when the residual is zero, the error is zero.

1.3 An overall framework

$$\int_0^L r_N(x)\, u_N(x)\, dx = \int_0^L r_N(x) \sum_{i=1}^N a_i \phi_i(x)\, dx = 0. \qquad (1.11)$$

However, this only gives one equation. Thus, we enforce this for each of the individual approximation functions, which collectively form the space of approximations comprising uN:

$$\int_0^L r_N(x)\, a_i \phi_i(x)\, dx = a_i \int_0^L r_N(x)\, \phi_i(x)\, dx = 0 \;\Rightarrow\; \int_0^L r_N(x)\, \phi_i(x)\, dx = 0. \qquad (1.12)$$

This leads to N equations and N unknowns, in order to solve for the ai’s. It is the usual practice in Galerkin’s method to use approximation functions that are so-called “kinematically admissible”, which we define as functions that satisfy the (primal solution variable, u) so-called “Dirichlet” boundary conditions a priori.³ Kinematically admissible functions do not have to satisfy boundary conditions that involve derivatives of the solution beforehand.

³ The use of the phrase “kinematically admissible” comes from the fact that in early applications, the solution variable of interest was the displacement (u).



• Step 1: Construct an approximation uN = Σᵢ aᵢφᵢ and form the residual rN = A(uN) − f.
• Step 2: Force the residual to be orthogonal to each of the approximation functions: ∫₀ᴸ rN(x)φᵢ(x) dx = 0, i = 1, 2, 3, ..., N.
• Step 3: Solve the set of coupled equations. The equations will be linear if the differential equation is linear, and nonlinear if the differential equation is nonlinear.

The primary problem with such a general framework is that it provides no systematic way of choosing the approximation functions, which is strongly dependent on issues of possible nonsmoothness of the true solution. The basic Finite Element Method has been designed to embellish and extend the fundamental Galerkin method by constructing the φi’s in order to deal with such issues. In particular:
• It is based upon Galerkin’s method,
• It is computationally systematic and efficient, and
• It is based on reformulations of the differential equations that remove the problems of restrictive differentiability requirements.

The approach that we will follow in this monograph is to introduce the basic concepts first in one dimension. We then present three-dimensional formulations, which extend naturally from the one-dimensional formulations. Remark: Two appendices containing some essential background information on vector calculus, linear algebra, and basic mechanics, exemplified by linearized elasticity, are provided. Linearized elasticity will serve as our model problem in the chapters that follow.
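The steps above can be sketched in a few lines. The following is an invented illustration, not code from the text: A(u) = −u″ with f = 1 on (0, 1), u(0) = u(1) = 0, and an assumed sine basis (each sin(nπx) vanishes at both endpoints, hence kinematically admissible); the Step 2 integrals are computed with a simple trapezoidal rule.

```python
# Galerkin sketch (assumed example): A(u) = -u'' = f, f = 1 on (0, 1).
import numpy as np

N = 5
n = np.arange(1, N + 1)
x = np.linspace(0.0, 1.0, 2001)           # fine grid for the integrals
dx = x[1] - x[0]

def integrate(g):
    # composite trapezoidal rule along the last axis
    return dx * (g.sum(axis=-1) - 0.5 * (g[..., 0] + g[..., -1]))

phi = np.sin(np.pi * np.outer(n, x))      # Step 1: basis phi_n(x)
A_phi = (np.pi * n[:, None]) ** 2 * phi   # A(phi_n) = -phi_n'' = (n pi)^2 phi_n
f = np.ones_like(x)

# Step 2: int_0^L r_N(x) phi_i(x) dx = 0  ->  M a = F
M = integrate(phi[:, None, :] * A_phi[None, :, :])   # M_ij = int phi_i A(phi_j)
F = integrate(phi * f)

# Step 3: solve the (here linear) coupled equations
a = np.linalg.solve(M, F)

u_mid = float(np.sin(np.pi * n * 0.5) @ a)  # exact solution gives u(1/2) = 0.125
print(u_mid)
```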

2 A model problem: 1-D elastostatics

2.1 Introduction: a model problem

In most problems of mathematical physics the true solutions are nonsmooth, i.e. they are not continuously differentiable. Thus, we cannot immediately apply a Galerkin approach. For example, in the equation of static mechanical equilibrium

$$\nabla \cdot \sigma + f = 0, \qquad (2.1)$$

there is an implicit requirement that the stress, σ, is differentiable in the classical sense. Virtually the same mathematical structure holds for other partial differential equations of mathematical physics describing diffusion, heat conduction, etc. In many applications, differentiability is too strong a requirement, and in many locations it does not hold (the solution “jumps”). Therefore, when solving such problems we have two options: (1) enforcement of solution jump conditions at all of these locations (if they are even known a priori) or (2) weak formulations (weakening the regularity requirements). Weak forms, which are designed to accommodate irregular data and solutions, are usually preferred. Numerical techniques employing weak forms, such as the Finite Element Method, have been developed with the essential property that whenever a smooth classical solution exists, it is also a solution to the weak form problem. Therefore, we lose nothing by reformulating a problem in a more general way, by weakening the a priori smoothness requirements of the solution. In the following few chapters, we shall initially consider a one-dimensional structure which occupies an open bounded domain Ω ⊂ ℝ, with boundary ∂Ω. The boundary consists of Γu, on which the displacements (u), or any other primal variable (temperature in heat conduction applications, concentration in diffusion applications, etc. (see Appendix 2)), are prescribed, and a part Γt on which the tractions t* are prescribed.

where r is called the residual. We call ν a “test” function. If we were to add a condition that we do this for all (∀) possible “test” functions, then

$$\int_\Omega \Bigl(\frac{d\sigma}{dx} + f\Bigr)\nu\, dx = \int_\Omega r\,\nu\, dx = 0, \qquad (2.4)$$


∀ν, implies r(x) = 0. Therefore, if every possible test function were considered, then r = dσ/dx + f = 0 on any finite region in Ω. Consequently, the weak and strong statements would be equivalent, provided the true solution is smooth enough to have a strong solution. Clearly, r can never be nonzero over any finite region in the body, because the test functions will “find” it (Figure 2.2). Using the product rule of differentiation on σν yields

$$\frac{d}{dx}(\sigma\nu) = \frac{d\sigma}{dx}\,\nu + \sigma\frac{d\nu}{dx}, \qquad (2.5)$$

which leads to, ∀ν,

$$\int_\Omega \Bigl(\frac{d}{dx}(\sigma\nu) - \sigma\frac{d\nu}{dx}\Bigr)\, dx + \int_\Omega f\,\nu\, dx = 0, \qquad (2.6)$$

where we choose the ν from an admissible set, to be discussed momentarily. This leads to, ∀ν,

$$\int_\Omega \frac{d\nu}{dx}\,\sigma\, dx = \int_\Omega f\,\nu\, dx + \sigma\nu|_{\partial\Omega}. \qquad (2.7)$$

On Γt, the stress σ on this boundary is known, σ = t* (Figure 2.1), and it is unknown on Γu; thus, we decide to restrict our choices of ν to those that attain ν|Γu = 0. We note that the use of the symbol t* stems from the identification of stresses on the boundary as “tractions”. Also choosing, a priori, the solution from those functions such that u|Γu = u*, where u* is the applied boundary displacement on the displacement part of the boundary Γu, we have:

Find u, u|Γu = u*, such that ∀ν, ν|Γu = 0,

$$\underbrace{\int_\Omega \frac{d\nu}{dx}\, E\, \frac{du}{dx}\, dx}_{\overset{\text{def}}{=}\, B(u,\nu)} = \underbrace{\int_\Omega f\,\nu\, dx + t^*\nu|_{\Gamma_t}}_{\overset{\text{def}}{=}\, F(\nu)}. \qquad (2.8)$$

This is called a weak form because it does not require the differentiability of σ. In other words, the differentiability requirements have been weakened. It is clear that we are able to consider problems with quite irregular solutions. We observe that if we test the solution with all possible test functions of sufficient smoothness, then the weak solution is equivalent to the strong solution. We emphasize that provided the true solution is smooth enough, the weak and strong forms are equivalent, which can be seen by the above constructive derivation. To explain the point more clearly, we consider a simple example.

Suppose, to the contrary, that the residual r were nonzero at some point ζ ∈ Ω. By continuity, there exists a subdomain (subinterval) ω ⊂ Ω, defined through δ, ω = (ζ − δ, ζ + δ), on which r has the same sign as at the point ζ. Since ν is arbitrary, we may choose ν to be zero outside of this interval, and with the same sign as r inside (Figure 2.3). This would imply that

$$\int_\Omega r\,\nu\, dx = \int_\omega r\,\nu\, dx > 0, \qquad (2.10)$$

which is a contradiction. Now select

$$r = \frac{d\sigma}{dx} + f \in C^0(\Omega) \;\Rightarrow\; \frac{d}{dx}\Bigl(E\frac{du}{dx}\Bigr) + f \in C^0(\Omega) \;\Rightarrow\; u \in C^2(\Omega). \qquad (2.11)$$

Therefore, for this model problem, the equivalence of weak and strong forms occurs if u ∈ C²(Ω).

Fig. 2.3. A residual function and a test function.

2.4 Some restrictions

A key question is the selection of the sets of functions in the weak form. Somewhat naively, the answer is simple: the integrals must remain finite. Therefore the following restrictions hold (∀ν), ∫Ω (dν/dx)σ dx < ∞, ∫Ω fν dx < ∞ and ∫∂Ω tν dx < ∞, and they govern the selection of the approximation spaces. In order to make precise statements one must have a method of “bookkeeping”. Such a system is to employ so-called Hilbertian Sobolev spaces. We recall that the three main characteristics of a norm, for any vectors u and ν such that ||u|| < ∞ and ||ν|| < ∞, are (1) ||u|| > 0, with ||u|| = 0 if and only if u = 0 (“positivity”), (2) ||u + ν|| ≤ ||u|| + ||ν|| (the Triangle Inequality) and (3) ||αu|| = |α| ||u||, where α is a scalar constant (“scalability”). Certain types of norms, so-called Hilbert space norms, are frequently used in mathematical physics. Following standard notation, we denote by H¹(Ω) the usual space of scalar functions with generalized partial derivatives of order ≤ 1 in L²(Ω), i.e. square integrable. In other words, u ∈ H¹(Ω) if

$$\|u\|^2_{H^1(\Omega)} \overset{\text{def}}{=} \int_\Omega \frac{\partial u}{\partial x}\frac{\partial u}{\partial x}\, dx + \int_\Omega u\,u\, dx < \infty. \qquad (2.12)$$

Using these definitions, a complete boundary value problem can be written as follows. The input data (loading) are assumed to be such that f ∈ L²(Ω) for the body forces and σ = t* ∈ L²(Γt) for the boundary tractions, but less smooth data can be considered without complications. In summary, we assume that our solutions obey these restrictions, leading to the following weak form:

Find u ∈ H¹(Ω), u|Γu = u*, such that ∀ν ∈ H¹(Ω), ν|Γu = 0,
$$\int_\Omega \frac{d\nu}{dx}\, E\, \frac{du}{dx}\, dx = \int_\Omega f\,\nu\, dx + t^*\nu|_{\Gamma_t}. \qquad (2.13)$$
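Definition (2.12) is easy to check numerically. The example below is an illustration with assumed data: u(x) = sin(πx) on Ω = (0, 1), for which the exact value is ‖u‖²_{H¹} = π²/2 + 1/2 ≈ 5.4348.

```python
# Numerical check of the H^1 norm definition (2.12) for the assumed
# example u(x) = sin(pi x) on Omega = (0, 1), using the midpoint rule.
import math

P = 100000
h = 1.0 / P
norm_sq = 0.0
for k in range(P):
    xm = (k + 0.5) * h                     # midpoint of subinterval k
    u = math.sin(math.pi * xm)
    du = math.pi * math.cos(math.pi * xm)
    norm_sq += (du * du + u * u) * h       # int (u')^2 dx + int u^2 dx

print(norm_sq, math.pi ** 2 / 2 + 0.5)     # both approximately 5.43480
```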

We note that if the data in (2.13) are smooth and if (2.13) possesses a solution u that is sufficiently regular, then u is the solution of the classical problem in strong form:

$$\frac{d}{dx}\Bigl(E\frac{du}{dx}\Bigr) + f = 0, \quad \forall x \in \Omega,$$
$$u = u^*, \quad \forall x \in \Gamma_u, \qquad (2.14)$$
$$\sigma = E\frac{du}{dx} = t^*, \quad \forall x \in \Gamma_t.$$

2.5 Remarks on nonlinear problems

The treatment of nonlinear problems is outside the scope of this introductory monograph. However, a few comments are in order. The literature on solving nonlinear problems with the FEM is vast. This is a complex topic that is best illustrated for students with an extremely simple one-dimensional example with material nonlinearities. Starting with

$$\frac{d}{dx}\Bigl(\underbrace{E\,\epsilon^p}_{\overset{\text{def}}{=}\,\sigma}\Bigr) + f = 0, \qquad \epsilon \overset{\text{def}}{=} \frac{du}{dx}, \qquad (2.15)$$

the weak form reads

$$\int_0^L \frac{d\nu}{dx}\,\sigma\, dx = \int_0^L f\,\nu\, dx + t^*\nu|_{\Gamma_t}. \qquad (2.16)$$

Using a Taylor series expansion of σ(ε(u)) about a trial solution u⁽ᵏ⁾ (k will be used as an iteration counter) yields

$$\sigma(u^{(k+1)}) = E\,\bigl(\epsilon(u^{(k+1)})\bigr)^p \approx E\Bigl(\bigl(\epsilon(u^{(k)})\bigr)^p + p\,\bigl(\epsilon(u^{(k)})\bigr)^{p-1}\bigl(\epsilon(u^{(k+1)}) - \epsilon(u^{(k)})\bigr)\Bigr) + O\bigl(\|u^{(k+1)} - u^{(k)}\|^2\bigr), \qquad (2.17\text{–}2.18)$$

and substituting this into the weak form yields

$$\int_0^L \frac{d\nu}{dx}\,\underbrace{E\,p\,\bigl(\epsilon(u^{(k)})\bigr)^{p-1}}_{E^{\tan}}\,\epsilon(u^{(k+1)})\, dx = \int_0^L f\,\nu\, dx + t^*\nu|_{\Gamma_t} - \int_0^L \frac{d\nu}{dx}\, E\Bigl(\bigl(\epsilon(u^{(k)})\bigr)^p - p\,\bigl(\epsilon(u^{(k)})\bigr)^p\Bigr)\, dx. \qquad (2.19)$$

One then iterates k = 1, 2, ..., until ||u⁽ᵏ⁺¹⁾ − u⁽ᵏ⁾|| ≤ TOL. Convergence of such a Newton-type formulation is of concern. We refer the reader to the seminal book of Oden [19], which developed and pioneered nonlinear formulations and convergence analysis. For example, consider a general abstract nonlinear equation of the form Π(u) = 0, solved iteratively via

$$0 \approx \Pi(u^{(k)}) + \Pi^{TAN}(u^{(k)})\,\bigl(u^{(k+1)} - u^{(k)}\bigr),$$

where Π^{TAN}(u) = ∇ᵤΠ(u) is the so-called “tangent operator”. One immediately sees a potential difficulty, due to the possibility of a zero, or near-zero, tangent when employing a Newton’s method on a system that may have a nonmonotonic response, for example those involving material laws with softening. Specialized techniques can be developed for such problems, and we refer the reader to the state of the art found in Wriggers [25].
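A minimal scalar illustration of this tangent-based iteration (an invented example, not from the text): a uniform bar of length L with constant strain ε = u/L, material law σ = Eεᵖ, and an assumed end traction t*, so equilibrium reduces to the single equation Π(u) = E(u/L)ᵖ − t* = 0.

```python
# Scalar Newton iteration sketch with assumed data E, p, L, t_star.
import math

E, p, L, t_star = 200.0, 3.0, 1.0, 25.0
u = 1.0                                        # trial solution u^(0)
for k in range(50):
    Pi = E * (u / L) ** p - t_star             # residual Pi(u^(k))
    tangent = E * p * (u / L) ** (p - 1) / L   # Pi_tan = dPi/du
    du = -Pi / tangent                         # Newton update
    u = u + du
    if abs(du) <= 1e-12:                       # ||u^(k+1) - u^(k)|| <= TOL
        break

print(u, L * (t_star / E) ** (1.0 / p))        # both ~ (25/200)^(1/3) = 0.5
```

Note the division by the tangent: if the material law softened so that dσ/dε passed through zero, this step would break down, which is exactly the difficulty flagged above.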

3 A finite element implementation in one dimension

3.1 Introduction

Classical techniques construct approximations from globally kinematically admissible functions, which we define as functions that satisfy the displacement boundary condition beforehand. Two main obstacles arise: (1) it may be very difficult to find a kinematically admissible function over the entire domain, and (2) if such functions are found, they lead to large, strongly coupled, and complicated systems of equations. These problems have been overcome by the observation that local approximations (posed over very small partitions of the entire domain) can deliver high-quality solutions, and simultaneously lead to systems of equations which have an advantageous mathematical structure amenable to large-scale computation by high-speed computers. Such piecewise or “element-wise” approximations were recognized at least 60 years ago by Courant [10] as being quite advantageous. There have been a variety of such approximation methods to solve equations of mathematical physics. The most popular method of this class is the Finite Element Method (FEM). The central feature of the method is to partition the domain in a systematic manner into an assembly of discrete subdomains or “elements”, and then to approximate the solution on each of these pieces in a manner that couples them to form a global solution valid over the whole domain. The process is designed to keep the resulting algebraic systems as computationally manageable, and memory efficient, as possible.

3.2 Weak formulation

Consider the following general form:

Find u ∈ H¹(Ω), u|Γu = d, such that ∀ν ∈ H¹(Ω), ν|Γu = 0,
$$\int_\Omega \frac{d\nu}{dx}\, E\, \frac{du}{dx}\, dx = \int_\Omega f\,\nu\, dx + t^*\nu|_{\Gamma_t}. \qquad (3.1)$$

3.3 FEM approximation

We approximate u by

$$u^h(x) = \sum_{j=1}^N a_j \phi_j(x). \qquad (3.2)$$

If we choose ν with the same approximation functions, but a different linear combination,

$$\nu^h(x) = \sum_{i=1}^N b_i \phi_i(x), \qquad (3.3)$$

then we may write

$$\underbrace{\int_\Omega \frac{d}{dx}\Bigl(\sum_{i=1}^N b_i\phi_i(x)\Bigr)\, E\, \frac{d}{dx}\Bigl(\sum_{j=1}^N a_j\phi_j(x)\Bigr)\, dx}_{\text{stiffness contribution}} = \underbrace{\int_\Omega \Bigl(\sum_{i=1}^N b_i\phi_i(x)\Bigr) f\, dx}_{\text{body load contribution}} + \underbrace{\Bigl(\sum_{i=1}^N b_i\phi_i(x)\Bigr)\, t^*\Big|_{\Gamma_t}}_{\text{traction load contribution}}. \qquad (3.4)$$

Since the ν’s are arbitrary, the bi are arbitrary, i.e. ∀ν ⇒ ∀bi; therefore

$$\sum_{i=1}^N b_i\Bigl(\sum_{j=1}^N K_{ij}\, a_j - R_i\Bigr) = 0 \;\Rightarrow\; [K]\{a\} = \{R\}, \qquad (3.5)$$

where

$$K_{ij} \overset{\text{def}}{=} \int_\Omega \frac{d\phi_i}{dx}\, E\, \frac{d\phi_j}{dx}\, dx \quad\text{and}\quad R_i \overset{\text{def}}{=} \int_\Omega \phi_i\, f\, dx + \phi_i\, t^*|_{\Gamma_t},$$

where [K] is an N × N (“stiffness”) matrix with components Kij and {R} is an N × 1 (“load”) vector with components Ri. This is the system of equations that is to be solved. Thus, large N implies large systems of equations, and more computational effort. However, with increasing N, we obtain more accurate approximate solutions. We remark that large N does not seem like much of a concern for one-dimensional problems, but is of immense concern for three-dimensional problems.
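As a concrete sketch of assembling and solving [K]{a} = {R} (an illustration with assumed data, not from the text): E = 1, f = 1, u(0) = u(1) = 0 on a uniform mesh of piecewise linear “hat” functions, for which Kii = 2/h, Ki,i±1 = −1/h and Ri = h can be written down directly.

```python
# Assemble and solve [K]{a} = {R} for E u'' + f = 0, E = 1, f = 1,
# u(0) = u(1) = 0, with hat functions on a uniform mesh (assumed data).
import numpy as np

n_el = 8                      # elements; n_el - 1 interior unknowns
h = 1.0 / n_el
N = n_el - 1

# Tridiagonal stiffness matrix and load vector for hat functions:
K = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h
R = h * np.ones(N)            # R_i = int f phi_i dx = h (area under one hat)

a = np.linalg.solve(K, R)     # nodal values of u^h at the interior nodes

x_nodes = np.linspace(0, 1, n_el + 1)[1:-1]
u_exact = 0.5 * x_nodes * (1 - x_nodes)
# For this particular 1-D problem the nodal values coincide with the
# exact solution u(x) = x(1 - x)/2.
print(np.max(np.abs(a - u_exact)))
```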


3.4 Construction of FEM basis functions

Fig. 3.1. A one-dimensional finite element basis. At the top is a uniform mesh example; at the bottom, a nonuniform mesh.

As mentioned, a primary problem with Galerkin’s method is that it provides no systematic way of constructing approximation functions. The difficulties that arise include (1) ill-conditioned systems due to poor choices of approximation functions and (2) domains with irregular geometries. To circumvent these problems, the FEM defines basis (approximation) functions in a piecewise manner over subdomains, “the finite elements”, of the entire domain. The basis functions are usually simple polynomials of low degree. The following three criteria are important: • The basis functions are smooth enough to be members of H¹(Ω).

The ith basis function is the piecewise linear “hat” function, with hᵢ = xᵢ − xᵢ₋₁,

$$\phi_i(x) = \frac{x - x_{i-1}}{h_i} \ \text{ for } x_{i-1} \le x \le x_i, \qquad \phi_i(x) = \frac{x_{i+1} - x}{h_{i+1}} \ \text{ for } x_i \le x \le x_{i+1},$$

and φᵢ(x) = 0 otherwise. The derivative of the function is

$$\frac{d\phi_i}{dx} = \frac{1}{h_i} \quad\text{for } x_{i-1} \le x \le x_i, \qquad (3.8)$$

and

$$\frac{d\phi_i}{dx} = -\frac{1}{h_{i+1}} \quad\text{for } x_i \le x \le x_{i+1}. \qquad (3.9)$$

The functions are arranged so that the “apex” of the ith function coincides with the ith node (Figure 3.1). This framework provides many advantages, for example, simple numerical integration.

3.5 Integration and Gaussian quadrature

Gauss made the crucial observation that one can integrate a polynomial of order 2G − 1 exactly with G “sampling” points and appropriate weights. Thus, in order to automate the integration process, one redefines the function F(x) over a normalized unit domain −1 ≤ ζ ≤ +1:

$$\int_0^L F(x)\, dx = \int_{-1}^{1} F(x(\zeta))\, J(\zeta)\, d\zeta = \sum_{i=1}^G w_i\, F(x(\zeta_i))\, J(\zeta_i) = \sum_{i=1}^G w_i\, \hat{F}(\zeta_i), \qquad (3.10)$$

where J is the Jacobian of the transformation. Unlike most integration schemes, Gaussian quadrature relaxes the usual restriction that the function sampling locations be evenly spaced. According to the above, we should be able to integrate a cubic (and lower-order) term exactly with G = 2 points, since 2G − 1 = 3. Therefore:

Fig. 3.2. Integration using Gaussian quadrature: a function sampled at evenly spaced points versus unevenly spaced Gauss points, mapped to the parametric domain −1 ≤ ζ ≤ +1.

• For a cubic (ζ³):
$$\int_{-1}^1 \zeta^3\, d\zeta = 0 = w_1\zeta_1^3 + w_2\zeta_2^3 \qquad (3.11)$$
• For a quadratic (ζ²):
$$\int_{-1}^1 \zeta^2\, d\zeta = 2/3 = w_1\zeta_1^2 + w_2\zeta_2^2 \qquad (3.12)$$
• For a linear term (ζ):
$$\int_{-1}^1 \zeta\, d\zeta = 0 = w_1\zeta_1 + w_2\zeta_2 \qquad (3.13)$$


• For a constant (1):
$$\int_{-1}^1 1\, d\zeta = 2 = w_1 + w_2 \qquad (3.14)$$

There are four variables, ζ₁, ζ₂, w₁, w₂, to solve for. The solution that satisfies all of the requirements is ζ₁ = √(1/3) = −ζ₂ and w₁ = w₂ = 1. For the general case of G points, the sampling points are the roots of the Gth-order Legendre polynomial, generated by the recursion

$$(n+1)\,L_{n+1}(\zeta) = (2n+1)\,\zeta\, L_n(\zeta) - n\, L_{n-1}(\zeta),$$

with L₀(ζ) = 1, L₁(ζ) = ζ. The roots of the Legendre polynomials are well known and tabulated. Once the roots are determined, the remaining equations for the wi’s are linear, and easy to solve. Fortunately, the roots are precomputed over a normalized unit domain, and one does not need to compute them. The only task is to convert the domain of each element to a standard unit domain (in the next section). A table of Gauss points and weights can be found in Table 3.1.

Table 3.1. Gauss points ζi and weights wi.
G = 2: ζ = ±0.577350269189626, w = 1.000000000000000
G = 3: ζ = 0.000000000000000, w = 0.888888888888889; ζ = ±0.774596669224148, w = 0.555555555555556
G = 4: ζ = ±0.339981043584856, w = 0.652145154862546; ζ = ±0.861136311594053, w = 0.347854845137454
G = 5: ζ = 0.000000000000000, w = 0.568888888888889; ζ = ±0.538469310105683, w = 0.478628670499366; ζ = ±0.906179845938664, w = 0.236926885056189
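The 2G − 1 exactness claim is easy to verify directly. The cubic below is an arbitrary example polynomial (an assumption for illustration); its exact integral over [−1, 1] is 6, and the two-point rule reproduces it.

```python
# Check that the two-point Gauss rule integrates a cubic exactly:
# F(z) = 7 z^3 + 3 z^2 - z + 2 (arbitrary example), integral over [-1, 1] = 6.
import math

z1 = math.sqrt(1.0 / 3.0)                  # Gauss points +-sqrt(1/3)
points, weights = [-z1, z1], [1.0, 1.0]    # w1 = w2 = 1

def F(z):
    return 7 * z**3 + 3 * z**2 - z + 2

approx = sum(w * F(z) for z, w in zip(points, weights))
exact = 3 * (2.0 / 3.0) + 2 * 2.0          # odd powers integrate to zero
print(approx, exact)                       # both equal 6.0 (to rounding)
```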

3.6 Global/local transformations

One strength of the finite element method is that most of the computations can be done in an element-by-element manner. Accordingly, we define the entries of the stiffness matrix [K] as

$$K_{ij} = \int_\Omega \frac{d\phi_i}{dx}\, E\, \frac{d\phi_j}{dx}\, dx. \qquad (3.22)$$

In order to make the calculations systematic we wish to use the generic or master element defined in a local coordinate system (ζ). Accordingly, we need the following mapping functions, from the master coordinates to the real spatial coordinates, Mx (ζ) 7→ x (Figure 3.3)

$$x = \sum_{i=1}^2 X_i\, \hat{\phi}_i \overset{\text{def}}{=} M_x(\zeta), \qquad (3.26)$$

where the Xi are the true spatial coordinates of the ith node, and where φ̂(ζ) ≝ φ(x(ζ)). These types of mappings are usually termed “parametric” maps. If the polynomial order of the shape functions is as high as that of the Galerkin approximation over the element, it is called an “isoparametric” map; if lower, a “subparametric” map; if higher, a “superparametric” map.

3.7 Differential properties of shape functions

They have the following properties:

• For linear elements we have a nodal basis consisting of two nodes, and thus two degrees of freedom.
• The nodal shape functions can be derived quite easily, by realizing that they form a nodal basis: they are unity at the corresponding node, and zero at all other nodes.

We note that the φi’s are never really computed; we actually start with the φ̂i’s, and then map them into the actual problem domain. Therefore, in the stiffness matrix and right-hand-side element calculations, all terms must be defined in terms of the local coordinates. With this in mind, we introduce some fundamental quantities, such as the finite element mapping “deformation gradient”

$$F \overset{\text{def}}{=} \frac{dx}{d\zeta}. \qquad (3.28)$$

The corresponding one-dimensional determinant is |F| = dx/dζ = J, which is known as the Jacobian. We will use |F| and J interchangeably throughout this monograph. The differential relations ζ → x are

$$\frac{d(\cdot)}{d\zeta} = \frac{dx}{d\zeta}\frac{d(\cdot)}{dx} = J\,\frac{d(\cdot)}{dx}. \qquad (3.29)$$

The inverse differential relations x → ζ are

$$\frac{d(\cdot)}{dx} = \frac{d\zeta}{dx}\frac{d(\cdot)}{d\zeta} = \frac{1}{J}\,\frac{d(\cdot)}{d\zeta}. \qquad (3.30)$$

We can now express dφ/dx in terms of ζ, via

$$\frac{d\phi}{dx} = \frac{d}{dx}\,\phi(M(\zeta)) = \frac{d\zeta}{dx}\frac{d}{d\zeta}\,\phi(M(\zeta)) = \frac{d\zeta}{dx}\frac{d}{d\zeta}\,\hat{\phi}(\zeta). \qquad (3.31)$$

(3.31)

Finally, with quadrature, for each element

$$K^e_{ij} = \sum_{q=1}^g w_q\, \underbrace{\frac{d}{d\zeta}\bigl(\phi_i(M(\zeta))\bigr)\frac{d\zeta}{dx}\, E\, \frac{d}{d\zeta}\bigl(\phi_j(M(\zeta))\bigr)\frac{d\zeta}{dx}\,|F|}_{\text{evaluated at } \zeta=\zeta_q} \qquad (3.32)$$

and

$$R^e_i = \sum_{q=1}^g w_q\, \underbrace{\phi_i(M(\zeta))\, f\, |F|}_{\text{evaluated at } \zeta=\zeta_q} + \underbrace{\phi_i(M(\zeta))\, t^*}_{\text{evaluated at traction endpoints}}, \qquad (3.33)$$

where the wq are Gauss weights. Remarks: It is permitted to have material discontinuities within the finite elements. On the implementation level, the system of equations to be solved is [K]{a} = {R}, where the stiffness matrix is represented by K(I, J), where (I, J) are the global entries. However, one can easily take advantage of the element-by-element structure, and store the entries via kᵉ(e, i, j), where (e, i, j) are the local (element) entries. For the local storage approach, a global/local index relation must be made to connect the local entry to the global entry when the linear algebraic solution process begins. This is a relatively simple and efficient storage system to encode. The element-by-element strategy has other advantages with regard to element-by-element system solvers. This is trivial in one dimension; however, it can be complicated in three dimensions. This is discussed later.
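The element stiffness formula (3.32) can be checked against the closed-form answer for a single linear element: for constant E and element length h_e, Kᵉ = (E/h_e)[[1, −1], [−1, 1]]. The sketch below assumes the standard linear master shape functions φ̂₁ = (1 − ζ)/2, φ̂₂ = (1 + ζ)/2 (not spelled out in this excerpt) and assumed values of E and h_e.

```python
# Element stiffness via the master element, per (3.32), for one linear
# element of length h_e with constant modulus E (assumed data).
import math

E, h_e = 3.0, 0.25
J = h_e / 2.0                               # |F| = dx/dzeta for a linear map
dphihat = [-0.5, 0.5]                       # d(phihat_i)/dzeta, i = 1, 2
g1 = math.sqrt(1.0 / 3.0)
gauss = [(-g1, 1.0), (g1, 1.0)]             # (zeta_q, w_q), two-point rule

Ke = [[0.0, 0.0], [0.0, 0.0]]
for zeta_q, w_q in gauss:
    for i in range(2):
        for j in range(2):
            # w_q * (dphi_i/dzeta)(dzeta/dx) E (dphi_j/dzeta)(dzeta/dx) |F|
            Ke[i][j] += w_q * dphihat[i] * (1 / J) * E * dphihat[j] * (1 / J) * J

print(Ke)   # equals (E/h_e) * [[1, -1], [-1, 1]]
```

One Gauss point would already be exact here, since the integrand is constant in ζ for linear shape functions and constant E.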

3.8 Post processing

Post processing for the stress, strain and energy from the existing displacement solution, i.e. the values of the nodal displacements and the shape functions, is straightforward. Essentially the process is the same as the formation of the weak form in the system. Therefore, for each element

$$\frac{du^h}{dx} = \frac{d}{dx}\Bigl(\sum_{i=1}^2 a_i\phi_i\Bigr) = \frac{d}{d\zeta}\Bigl(\sum_{i=1}^2 a_i\hat{\phi}_i\Bigr)\frac{d\zeta}{dx}. \qquad (3.34)$$

has a traction contribution from the right endpoint. In summary, the basic process is to (1) compute element by element and (2) sweep over all basis function contributions over each element. Remark: We note that all integrals are computed using Gaussian quadrature.

3.9.3 Applying boundary conditions

Applying the primal (displacement) boundary conditions requires us to recall that the bi’s in the representation of the test functions are not arbitrary at the endpoints; thus, the equations associated with those test functions have to be eliminated, and the value of the approximate solution enforced at the displacement boundary condition via¹ uʰ(x = 0) = a₁ = d.

¹ The traction boundary conditions are automatically accounted for in the weak formulation.


3.9.4 Massive data storage reduction

The direct storage of K(I, J) requires N × N entries. The element-by-element storage kᵉ(e, i, j) requires 4 entries per element, i.e. 4(N − 1) in total in one dimension. The memory requirements for element-by-element storage are much smaller than for a direct scheme, which stores needless zeros. For example, for N = 10⁴ nodes, the direct storage is (10⁴)² = 10⁸ entries, while the element-by-element storage is 9999 × 4 ≈ 4 × 10⁴ entries, which is essentially 2500 times less than direct storage. Additionally, there is a massive reduction in the number of mathematical operations during the algebraic solution phase, because of the element-by-element structure of the FEM system.

3.10 Quadratic elements

In many cases, if the character of the exact solution is known to be smooth, it is advantageous to use higher-order approximation elements. Generally, if the exact solution to a problem is smooth, then for sufficiently fine meshes, comparing, for the same number of nodes, the solution produced with linear basis functions to the solution produced with quadratic basis functions, the quadratically-produced solution is more accurate. Similarly, if the exact solution is rough (nonsmooth), then for sufficiently fine meshes and the same number of nodes, the linearly-produced solution is more accurate. To illustrate how to construct a quadratic finite element approximation, we follow a template similar to that for linear elements, however with three nodes per element instead of two. Consistent with the basic nodal-basis construction, each basis function must equal unity at the node it belongs to, and be zero at the others. Thus, for a generic quadratic element:

• For node #1: φ̂₁(ζ) = −½(1 − ζ)ζ, which yields φ̂₁(−1) = 1, φ̂₁(0) = 0, φ̂₁(1) = 0,
• For node #2: φ̂₂(ζ) = (1 + ζ)(1 − ζ), which yields φ̂₂(−1) = 0, φ̂₂(0) = 1, φ̂₂(1) = 0, and
• For node #3: φ̂₃(ζ) = ½(ζ + 1)ζ, which yields φ̂₃(−1) = 0, φ̂₃(0) = 0, φ̂₃(1) = 1.

Following the approach for linear elements, the connection between x and ζ is

$$x(\zeta) = X_i\,\hat{\phi}_1(\zeta) + X_{i+1}\,\hat{\phi}_2(\zeta) + X_{i+2}\,\hat{\phi}_3(\zeta). \qquad (3.48)$$
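The nodal-basis properties of these quadratic shape functions can be verified directly, including the partition-of-unity property (the three functions sum to one for any ζ), which follows from the construction but is worth checking:

```python
# Check the quadratic shape functions: unity at the owning node
# (zeta = -1, 0, +1), zero at the others, and summing to one.
def phi1(z): return -0.5 * (1 - z) * z
def phi2(z): return (1 + z) * (1 - z)
def phi3(z): return 0.5 * (z + 1) * z

for z in (-1.0, 0.0, 1.0):
    print(z, phi1(z), phi2(z), phi3(z), phi1(z) + phi2(z) + phi3(z))
```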

Clearly, the weak form does not change for linear or quadratic approximations. Furthermore, the quadratically-generated system has a similar form to the linearly-generated system

Fig. 3.5. Three quadratic elements with seven nodes.

$$\sum_{j=1}^N K_{ij}\, a_j = R_i, \qquad i = 1, 2, ..., N, \qquad (3.49)$$

where N is the number of nodes in 1-D. Let us consider an example with three elements, resulting in seven nodes (Figure 3.5). The integrals are again broken up into element-wise contributions.

One then applies boundary conditions in the same manner as for linear elements. Remark: A logical question to ask is what is the accuracy of the finite element method? This is addressed in the next chapter.

4 Accuracy of the finite element method in one dimension

4.1 Introduction

As we have seen, the essential idea in the finite element method is to select a finite-dimensional subspace approximation of the true solution and form the following weak boundary value problem:

Find uʰ ∈ Hᵘʰ(Ω) ⊂ H¹(Ω), with uʰ|Γu = d, such that

$$\underbrace{\int_\Omega \frac{d\nu^h}{dx}\, E\, \frac{du^h}{dx}\, dx}_{B(u^h,\nu^h)} = \underbrace{\int_\Omega f\,\nu^h\, dx + \nu^h t^*|_{\Gamma_t}}_{F(\nu^h)}, \qquad (4.1)$$

∀νʰ ∈ Hᵛʰ(Ω) ⊂ H¹(Ω), with νʰ|Γu = 0,

where we refer to Hᵘʰ(Ω) and Hᵛʰ(Ω) as the spaces of approximations (for example, piecewise linear functions). The critical point is that Hᵘʰ(Ω), Hᵛʰ(Ω) ⊂ H¹(Ω). This “inner” approximation allows the development of straightforward subspace error estimates. We will choose Hᵘʰ(Ω) and Hᵛʰ(Ω) to coincide. We have, for any H¹(Ω)-admissible function w, a definition of the so-called energy semi-norm

$$\|u - w\|^2_{E(\Omega)} \overset{\text{def}}{=} \int_\Omega \Bigl(\frac{du}{dx} - \frac{dw}{dx}\Bigr)\, E\, \Bigl(\frac{du}{dx} - \frac{dw}{dx}\Bigr)\, dx = B(u - w,\, u - w). \qquad (4.2)$$

Note that in the event that nonuniform displacements are specified on the boundary (no rigid motion produced), then u − w = constant is unobtainable unless u − w = 0, and the semi-norm in Equation (4.2) is a norm in the strict mathematical sense. Under relatively mild assumptions, a fundamental a priori error estimate for the finite element method is

$$\|u - u^h\|_{E(\Omega)} \le C(u, p)\, h^{\min(r-1,\,p)} \overset{\text{def}}{=} \gamma, \qquad (4.3)$$

where p is the (complete) polynomial order of the finite element method used, r is the regularity of the exact solution, and C is a constant dependent on the exact solution and the polynomial approximation; C is independent of h, the maximum element size in the mesh. For details see, for example, Ainsworth and Oden [1], Becker, Carey and Oden [7], Carey and Oden [8], Oden and Carey [20], Hughes [15], Szabo and Babuska [24] and Bathe [6]. Remark 1: We note that the set of functions specified by Hᵘʰ(Ω) ⊂ H¹(Ω) with uʰ|Γu = d is technically not a space of functions and should be characterized as “a linear variety”. This does not pose a problem for the ensuing analysis. For precise mathematical details, see Oden and Demkowicz [21]. Remark 2: We note that √(B(u, u)) is a norm since: • Positivity:

where Equation 4.15 is known as the Principle of Minimum Potential Energy (PMPE). In other words, the true solution possesses the minimum potential. The minimum property of the exact solution can be proven by an alternative technique. Let us construct a potential function for a deviation away from the exact solution u, denoted u + λν, where λ is a scalar and ν is any admissible variation (test function):

J(u + λν) = ½ ∫_Ω (d(u + λν)/dx) E (d(u + λν)/dx) dx − ∫_Ω f (u + λν) dx − t*(u + λν)|Γt.   (4.16)

If we differentiate with respect to λ,

∂J(u + λν)/∂λ = ∫_Ω (dν/dx) E (d(u + λν)/dx) dx − ∫_Ω f ν dx − t* ν|Γt,   (4.17)

and set λ = 0 (because we know that the exact solution is at λ = 0), we have

∂J(u + λν)/∂λ |λ=0 = ∫_Ω (dν/dx) E (du/dx) dx − ∫_Ω f ν dx − t* ν|Γt = 0.   (4.18)

Clearly, the minimizer of the potential is the solution to the field equations, since it produces the weak formulation as a result. This is a minimum since

∂²J(u + λν)/∂λ² |λ=0 = ∫_Ω (dν/dx) E (dν/dx) dx ≥ 0.   (4.19)

It is important to note that the weak form, derived earlier, requires no such potential, and thus is a more general approach than a minimum principle. Thus, in cases where an energy exists, the weak formulation can be considered as a minimization of a potential energy function. Numerical approaches based on this idea are usually referred to as Rayleigh-Ritz methods. This concept allows one to construct simple error estimates for global mesh refinement.

4.4 Simple estimates for adequate FEM meshes

The previous results can be used to generate estimates of the mesh fineness required for a desired accuracy. As stated earlier, under standard assumptions the classical a-priori error estimate for the finite element method is given by Equation (4.3).

Remarks: While this scheme provides a simple estimate for the global mesh fineness needed, the meshes need to be locally refined to ensure tolerable accuracy throughout the domain.
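The estimate in Equation (4.3) also yields a crude mesh-sizing rule: the constant C is unknown but mesh-independent, so it cancels when errors on two meshes are compared, and a measured (or estimated) error on one mesh predicts the element size needed for a target tolerance. A minimal sketch, with a hypothetical helper name and assuming the asymptotic rate α = min(r − 1, p) is active:

```python
def required_mesh_size(h1, err1, tol, p, r):
    """Estimate the element size h2 needed to reach error `tol`,
    given a measured error `err1` on a mesh of size `h1`.

    Assumes err ~ C * h**alpha with alpha = min(r - 1, p), so the
    unknown C cancels: h2 = h1 * (tol / err1)**(1 / alpha).
    """
    alpha = min(r - 1, p)
    return h1 * (tol / err1) ** (1.0 / alpha)

# Linear elements (p = 1), smooth solution (r >= 2): alpha = 1, so
# halving the tolerance halves the required element size.
h2 = required_mesh_size(h1=0.1, err1=0.08, tol=0.04, p=1, r=2)
```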

4.5 Local mesh refinement

Probably the simplest approach to local mesh refinement is to use the residual as a guide. Residual methods require no a-posteriori system of equations to be solved. Such methods bound the error using the residual, and were developed in Babuška and Rheinboldt [4] for one-dimensional problems, and in Babuška and Miller [5] and Kelly et al. [16] for two-dimensional problems. For reviews see Ainsworth and Oden [1]. This will be discussed further at the end of this monograph.

5 Iterative solution schemes

5.1 Introduction: minimum principles and Krylov methods

There are two main approaches to solving systems of equations resulting from numerical discretization of solid mechanics problems, direct and iterative, with a large number of variants of each. Standard direct solvers are usually employed when the number of unknowns is not very large and there are multiple load vectors.¹ Basically, one can operate on the multiple right-hand sides simultaneously via Gaussian elimination. For a back substitution the cost is

2(0 + 1 + 2 + 3 + ... + (N − 1)) = 2 Σ_{k=1}^{N} (k − 1) = N(N − 1).   (5.1)
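The count in Equation (5.1) can be checked directly by instrumenting a back substitution; the sketch below counts one multiply and one add per inner-loop term (the N divisions are not included in the N(N − 1) count):

```python
def back_substitute(U, b):
    """Solve U x = b for upper-triangular U, counting the
    multiply/add operations as in Equation (5.1)."""
    N = len(b)
    x = [0.0] * N
    ops = 0
    for i in range(N - 1, -1, -1):
        s = b[i]
        for j in range(i + 1, N):
            s -= U[i][j] * x[j]   # one multiply + one add
            ops += 2
        x[i] = s / U[i][i]        # divisions not counted here
    return x, ops

# For N = 4 the count is 2(0 + 1 + 2 + 3) = N(N - 1) = 12.
U = [[2.0, 1.0, 0.0, 0.0],
     [0.0, 3.0, 1.0, 0.0],
     [0.0, 0.0, 4.0, 1.0],
     [0.0, 0.0, 0.0, 5.0]]
x, ops = back_substitute(U, [3.0, 4.0, 5.0, 5.0])
```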

Therefore the total cost of solving such a system is the cost to reduce the system to upper triangular form plus the cost of back substitution, i.e. (2/3)N³ + N(N − 1). Since the operation counts to factor and solve an N × N system are O(N³), iterative solvers are preferred when the systems are very large.² In general, most modern solvers for large symmetric systems, like the ones of interest here, employ Conjugate Gradient (CG) type

¹ However, specialized direct sparse solvers can be used if the matrices have a special structure.
² For example, Gaussian elimination is approximately

2(N² + (N − 1)² + (N − 2)² + (N − 3)² + ... + 1²) = 2 Σ_{k=1}^{N} k² ≈ 2 ∫_0^N k² dk = (2/3)N³.   (5.2)

An operation, such as the addition of two numbers or the multiplication of two numbers, is one operation count.


iterative techniques, which can deliver solutions in O(N²) operations.³ It is inescapable, for almost all variants of Gaussian elimination, unless they involve complicated sparsity tracking to eliminate unneeded operations on zero entries, that the operation costs are O(N³). However, band solvers, which exploit the bandedness of the stiffness matrix, can reduce the number of operation counts to N b², where b is the bandwidth. Many schemes exist for ordering the nodes so as to make the bandwidth as small as possible. Skyline solvers locate the uppermost nonzero elements starting from the diagonal and operate only on elements below the skyline. Frontal methods, which are analogous to a moving front in the finite element mesh, perform Gaussian elimination element by element, before the element is incorporated into the global stiffness matrix. In this procedure memory is reduced, and the elimination process can be done for all elements simultaneously, at least in theory. Upon assembly of the stiffness matrix and right-hand side, back substitution can be started immediately. If the operations are performed in an optimal order, it can be shown that the number of operations behaves proportionally to N². Such a process is, of course, nontrivial.

We note that with direct methods, zeros within the band, below the skyline, and in the front are generally filled and must be carried in the operations. In very large problems the storage requirements and the number of operation counts can become so large that solution by direct methods is not feasible. The data structures and I/O are also nontrivial concerns. However, a matrix/vector multiplication involves only 2N² operation counts, so a method based on repeated matrix/vector multiplications, if convergent in fewer than N iterations, could be very attractive. This is usually the premise in using iterative methods, such as the Conjugate Gradient (CG) method.
A very important feature of iterative methods is that the memory requirements remain constant during the solution process. It is important to note that modern computer architectures are based on (1) registers, which have virtually no memory capabilities but can perform very fast operations on computer “words”, (2) caches, which have slightly larger memory capabilities, with a slight reduction in speed, but are thermally very “hot”, and thus limited for physical as well as manufacturing reasons, (3) main memory, which is slower since I/O (input/output) is required, but is still within a workstation, and (4) disk and tape or magnetic drums, which are out of the core system, require a huge I/O component and are very slow. Therefore, one point that we emphasize is that one can take advantage of the element-by-element structure inherent in the finite element method for data storage and matrix/vector multiplication in the CG method. The element-by-element data structure is also critical for the ability to fit matrix/vector multiplications into the computer cache, which is essentially a low-memory, high-floating-point-operations-per-second portion of the computer hardware.

³ Similar iterative solvers can be developed for unsymmetric systems, and we refer the reader to Axelsson [3] for details.


Remark: One singularly distinguishing feature of iterative solvers is that, since they are based on successive updates of a starting guess solution vector, they can be given a tremendous head start by a good solution guess, for example one provided by an analytical or semi-analytical solution. Minimum principles play a key role in the construction of a certain class of iterative solvers, which we exploit in this chapter.

5.1.1 Krylov searches and minimum principles

By itself, the PMPE (introduced in the previous section) is a powerful theoretical result. Moreover, it can be used to develop methods to solve systems of equations arising from a finite element discretization of an infinitesimal strain linearly elastic structure. This result is the essence of the so-called Krylov family of searches. Suppose we wish to solve the discrete system [K]{a} = {R}.

Therefore the minimizer of the potential Π is also the solution to the discrete system. The Krylov class is a family of iterative solution techniques for symmetric systems based upon minimizing Π by successively updating a starting vector. The minimization takes place over vector spaces called the Krylov spaces. These methods are based on the assumption that a solution of tolerable accuracy can be achieved in far fewer than the O(N³) operations required by most Gaussian-type techniques. The simplest member of this family is the method of steepest descent, which is a precursor to the widely used Conjugate Gradient method.

The Method of Steepest Descent

The Method of Steepest Descent is based upon the following simple idea: if the gradient of the potential is not zero at a candidate solution vector, then the greatest increase of the scalar function is in the direction of the gradient; therefore we move in the opposite direction, −∇Π. The basic ingredient in the method is the residual,
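The idea above can be sketched in a few lines of Python; the step length λ = (r · r)/(r · [K]r), which results from minimizing Π along the residual direction, is supplied here since the truncated passage omits it:

```python
def steepest_descent(K, R, a0, iters=200):
    """Minimize Pi(a) = 1/2 a.K.a - a.R for symmetric positive-definite
    [K] by stepping along the negative gradient, i.e. the residual
    r = R - K a.  The step length lam = (r.r)/(r.K r) minimizes Pi
    along that direction.  Plain lists; K is a list of rows."""
    n = len(R)
    a = list(a0)
    for _ in range(iters):
        Ka = [sum(K[i][j] * a[j] for j in range(n)) for i in range(n)]
        r = [R[i] - Ka[i] for i in range(n)]
        rr = sum(ri * ri for ri in r)
        if rr == 0.0:
            break
        Kr = [sum(K[i][j] * r[j] for j in range(n)) for i in range(n)]
        lam = rr / sum(r[i] * Kr[i] for i in range(n))
        a = [a[i] + lam * r[i] for i in range(n)]
    return a

K = [[4.0, 1.0], [1.0, 3.0]]
R = [1.0, 2.0]
a = steepest_descent(K, R, [0.0, 0.0])   # exact answer: (1/11, 7/11)
```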

The rate of convergence of the method is related to the condition number of the stiffness matrix:

||{a} − {a}^i||_K = (1 − 1/C([K]))^{i/2} ||{a} − {a}^1||_K,   (5.11)

where, in this case, the Condition Number is defined by

C([K]) def= (max eigenvalue of [K]) / (min eigenvalue of [K])   (5.12)

and {a} is the exact solution to the algebraic system [K]{a} = {R}. The rate of convergence of the method is typically quite slow; however, a variant, the Conjugate Gradient method, is guaranteed to converge in at most N iterations, provided the algebra is performed exactly.


The Conjugate Gradient Method

In the Conjugate Gradient method, at each iteration the computational cost is O(N), due to the FEM matrix structure. We refer the reader to Axelsson [3] for details. We define the residual,

{r}^i def= −∇Π = {R} − [K]{a}^i,   (5.13)

and the successive iterates, for i = 1, 2, 3, ...,

{a}^{i+1} def= {a}^i + λ_i {z}^i,   (5.14)

with

{z}^i def= {r}^i + θ_i {z}^{i−1}.   (5.15)

The coefficient θ_i is chosen so that {z}^i is [K]-conjugate to {z}^{i−1}, i.e. {z}^{i,T}[K]{z}^{i−1} = 0. The value of λ_i is the one which minimizes the potential along the search direction {z}^i.
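A minimal sketch of the iteration (5.13)-(5.15); since the closed-form expressions for θ_i and λ_i are truncated above, the standard textbook CG expressions (equivalent in exact arithmetic) are used:

```python
def conjugate_gradient(K, R, a0, tol=1e-12):
    """Conjugate Gradient for symmetric positive-definite [K]{a} = {R}.
    r = R - K a is the residual (Eq. 5.13); the search direction is
    updated as z <- r + theta z (Eq. 5.15) so that successive
    directions are K-conjugate.  Converges in at most N steps in
    exact arithmetic."""
    n = len(R)
    a = list(a0)

    def matvec(v):
        return [sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]

    r = [R[i] - y for i, y in enumerate(matvec(a))]
    z = list(r)
    rr = sum(x * x for x in r)
    for _ in range(n):
        if rr < tol:
            break
        Kz = matvec(z)
        lam = rr / sum(z[i] * Kz[i] for i in range(n))   # line minimization
        a = [a[i] + lam * z[i] for i in range(n)]
        r = [r[i] - lam * Kz[i] for i in range(n)]
        rr_new = sum(x * x for x in r)
        theta = rr_new / rr       # standard (Fletcher-Reeves) form
        z = [r[i] + theta * z[i] for i in range(n)]
        rr = rr_new
    return a

K = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
a = conjugate_gradient(K, [1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
```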

Accelerating computations

The rate of convergence of the CG method is related to the condition number:

||{a} − {a}^i||_K ≤ ((√C([K]) − 1)/(√C([K]) + 1))^i ||{a} − {a}^1||_K.   (5.20)

Proofs of the various characteristics of the method can be found in Axelsson [3]. As is standard, in an attempt to reduce the condition number and hence increase the rate of convergence, [K] is typically preconditioned by forming the transformation of variables {a} = [T]{â}, which produces a preconditioned system with stiffness matrix [T]^T[K][T]. Ideally we would like [T] = [L]^{−T}, where [L][L]^T = [K] and [L] is a lower triangular matrix, thus forcing

[T]^T[K][T] = [L]^{−1}[L][L]^T[L]^{−T} = [I].   (5.21)

However, the reduction of the stiffness matrix into a lower triangular matrix and its transpose is comparable in the number of operations to solving the system by Gaussian elimination. Therefore only an approximation to [L]^{−1} is used. The simplest choice is the diagonal (Jacobi) preconditioner, with T_ii = 1/√K_ii,

where T_ij = 0 for i ≠ j and where the K_ii (no implied sum on the repeated indices) are the diagonal entries of [K]. In this case the diagonal entries of the preconditioned stiffness matrix are unity, and the off-diagonal terms K_ij are divided by √K_ii √K_jj. There are a variety of other preconditioning techniques, of widely ranging expense to compute. For more details see Axelsson [3]. It is strongly suggested to precondition the system.
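The diagonal (Jacobi) preconditioner just described can be sketched directly; the transformed matrix [T]^T[K][T] has unit diagonal, with each off-diagonal entry divided by √K_ii √K_jj:

```python
import math

def diagonally_precondition(K):
    """Form K_hat = T^T K T with T = diag(1/sqrt(K_ii)).
    The result has unit diagonal, and
    K_hat_ij = K_ij / (sqrt(K_ii) * sqrt(K_jj)) off the diagonal."""
    n = len(K)
    t = [1.0 / math.sqrt(K[i][i]) for i in range(n)]
    return [[t[i] * K[i][j] * t[j] for j in range(n)] for i in range(n)]

Khat = diagonally_precondition([[4.0, 2.0], [2.0, 9.0]])
```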

Remark: In the one-dimensional problem considered earlier, the actual computation cost of the matrix-vector multiplication in an element-by-element CG method is a [2 × 2] matrix times a {2 × 1} vector times the number of elements. This is an O(N ) calculation. If we consider M iterations necessary for convergence below an error tolerance, then the entire operation costs O(M N ).
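The element-by-element matrix-vector product described in the remark can be sketched as follows; the global [K] is never formed, and the connectivity and element values below are illustrative assumptions:

```python
def ebe_matvec(k_elem, conn, v):
    """Element-by-element product [K]{v} without assembling [K].
    k_elem[e] is the 2x2 stiffness of element e; conn[e] gives its
    two global node numbers.  Cost: one 2x2 multiply per element,
    i.e. O(N) overall."""
    y = [0.0] * len(v)
    for ke, (n1, n2) in zip(k_elem, conn):
        y[n1] += ke[0][0] * v[n1] + ke[0][1] * v[n2]
        y[n2] += ke[1][0] * v[n1] + ke[1][1] * v[n2]
    return y

# Three equal linear elements on four nodes: the assembled K would be
# tridiagonal with stencil (-1, 2, -1) at interior nodes.
ke = [[1.0, -1.0], [-1.0, 1.0]]
conn = [(0, 1), (1, 2), (2, 3)]
y = ebe_matvec([ke] * 3, conn, [0.0, 1.0, 2.0, 3.0])
```

The per-element accumulation is exactly the kernel that a CG solver calls once per iteration.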

6 Weak formulations in three dimensions

Fig. 6.1. A three-dimensional body Ω, with tractions t = t* specified on Γt and displacements u = u* specified on Γu.

6.1 Introduction

Albeit a bit repetitive, we follow constructions similar to those of the one-dimensional analysis of the preceding chapters. This gives readers a chance to contrast and compare one-dimensional and three-dimensional formulations. To derive a direct weak form for a body, we take the balance of linear momentum ∇ · σ + f = 0 (denoting the strong form), form a scalar product with an arbitrary smooth vector-valued function ν, and integrate over the body (Figure 6.1),


∫_Ω (∇ · σ + f) · ν dΩ = ∫_Ω r · ν dΩ = 0,   (6.1)

where r is the residual and ν is a test function. If we were to add the condition that this hold for all possible test functions (∀ν), Equation (6.1) would imply r = 0. Therefore, if every possible test function were considered, then

r = ∇ · σ + f = 0   (6.2)

on any finite region in Ω. Consequently, the weak and strong statements would be equivalent provided the true solution is smooth enough to have a strong solution. Clearly, r can never be nonzero over any finite region in the body, because the test functions would locate it. Using the product rule of differentiation,

∇ · (σ · ν) = (∇ · σ) · ν + ∇ν : σ

If we restrict our choices of ν to those such that ν|Γu = 0, we have, where u* is the applied boundary displacement on Γu, for infinitesimal strain linear elasticity:

Find u, u|Γu = u*, such that ∀ν, ν|Γu = 0,

∫_Ω ∇ν : IE : ∇u dΩ = ∫_Ω f · ν dΩ + ∫_Γt t* · ν dA,   (6.7)

with the left-hand side defining B(u, ν) and the right-hand side F(ν),

where t = t∗ on Γt . As in one-dimensional formulations, this is called a “weak” form because it does not require the differentiability of the stress σ. In other words, the differentiability requirements have been weakened. It is clear that we are able to consider problems with quite irregular solutions. We observe that if we test the solution with all possible test functions of


sufficient smoothness, then the weak solution is equivalent to the strong solution. We emphasize that provided the true solution is smooth enough, the weak and strong forms are equivalent, which can be seen by the above constructive derivation.

6.2 Hilbertian Sobolev Spaces

As in one dimension, a key question is the selection of the sets of functions in the weak form. Somewhat naively, the answer is simple: the integrals must remain finite. Therefore the following restrictions hold (∀ν): ∫_Ω f · ν dΩ < ∞, ∫_∂Ω t* · ν dA < ∞ and ∫_Ω ∇ν : σ dΩ < ∞, and they govern the selection of the approximation spaces. These relations simply mean that the functions must be square integrable. In order to make precise statements one must have a method of “bookkeeping”. Such a system is to employ so-called Hilbertian Sobolev spaces. We recall that a norm has three main characteristics for any functions u and ν such that ||u|| < ∞ and ||ν|| < ∞:

• (1) ||u|| > 0, and ||u|| = 0 if and only if u = 0,
• (2) ||u + ν|| ≤ ||u|| + ||ν||, and
• (3) ||αu|| = |α| ||u||, where α is a scalar constant.

Certain types of norms, so-called Hilbert space norms, are frequently used in solid mechanics. Following standard notation, we denote by H¹(Ω) the usual space of scalar functions with generalized partial derivatives of order ≤ 1 in L²(Ω), i.e. square integrable; in other words, u ∈ H¹(Ω) if

||u||²_H¹(Ω) def= ∫_Ω (∂u/∂x_j)(∂u/∂x_j) dΩ + ∫_Ω u u dΩ < ∞,

and we denote L2 (Ω) = [L2 (Ω)]3 . Using these definitions, a complete boundary value problem can be written as follows. The data (loads) are assumed to be such that f ∈ L2 (Ω) and t∗ ∈ L2 (Γt ), but less smooth data can be considered without complications. Implicitly we require that u ∈ H 1 (Ω) and σ ∈ L2 (Ω) without continually making such references. Therefore in summary we assume that our solutions obey these restrictions, leading to the following infinitesimal strain linear elasticity weak form:


Find u ∈ H¹(Ω), u|Γu = u*, such that ∀ν ∈ H¹(Ω), ν|Γu = 0,

∫_Ω ∇ν : IE : ∇u dΩ = ∫_Ω f · ν dΩ + ∫_Γt t* · ν dA.   (6.10)

We note that if the data in (6.10) are smooth and if (6.10) possesses a solution u that is sufficiently regular, then u is the solution of the classical linear elastostatics problem in strong form:

∇ · (IE : ∇u) + f = 0 in Ω,
u = u* on Γu,
(IE : ∇u) · n = t* on Γt,

where Equation 6.14 is known as the Principle of Minimum Potential Energy (PMPE). In other words, the true solution possesses the minimum potential.


The minimum property of the exact solution can be proven by an alternative technique. Let us construct a potential function for a deviation away from the exact solution u, denoted u + λν, where λ is a scalar and ν is any admissible variation (test function):

J(u + λν) = ½ ∫_Ω ∇(u + λν) : IE : ∇(u + λν) dΩ − ∫_Ω f · (u + λν) dΩ − ∫_Γt t* · (u + λν) dA.   (6.15)

If we differentiate with respect to λ,

∂J(u + λν)/∂λ = ∫_Ω ∇ν : IE : ∇(u + λν) dΩ − ∫_Ω f · ν dΩ − ∫_Γt t* · ν dA,   (6.16)

and set λ = 0 (because we know that the exact solution is at λ = 0), we have

∂J(u + λν)/∂λ |λ=0 = ∫_Ω ∇ν : IE : ∇u dΩ − ∫_Ω f · ν dΩ − ∫_Γt t* · ν dA = 0.   (6.17)

Clearly, the minimizer of the potential is the solution to the field equations, since it produces the weak form as a result. This is a minimum since

∂²J(u + λν)/∂λ² |λ=0 = ∫_Ω ∇ν : IE : ∇ν dΩ ≥ 0.   (6.18)

It is important to note that the weak form, derived earlier, requires no such potential, and thus is a more general approach than a minimum principle. Thus, in the hyperelastic case, the weak formulation can be considered as a minimization of a potential energy function. This is sometimes referred to as the Rayleigh-Ritz method.

6.4 Complementary principles

This is called the complementary form of Equation (6.7). Similar restrictions are placed on the trial and test fields to force the integrals to make sense, i.e. to be finite. Similar boundedness restrictions control the choice of admissible complementary functions; in other words, we assume that the solutions produce finite energy. Despite the apparent simplicity of such principles, they are rarely used directly in this form in practical computations, because it is somewhat difficult to find approximate functions σ that satisfy ∇ · σ + f = 0. However, in closing, we provide some theoretical results. As in the primal case, a similar process is repeated using the complementary weak form. We define a complementary norm,

which is the Principle of Minimum Complementary Potential Energy (PMCPE). By directly adding together the potential energy and the complementary energy we obtain an equation of balance: J(u) + K(σ) = ½ ⋯

7 A finite element implementation in three dimensions

7.1 Introduction

Generally, the ability to change the boundary data quickly is very important in finite element computations. One approach to do this rapidly is via the variational penalty method. This is done by relaxing kinematic assumptions on the members of the space of admissible functions and adding a term to “account for the violation” on the boundary. This is a widely used practice, and therefore, to keep the formulation as general as possible, we include penalty terms, although this implementation is not mandatory. Obviously, one could simply extract the known (imposed) values of the boundary displacements (by appropriately eliminating rows and columns and modifying the right-hand-side load vector); however, this is tedious. Nevertheless, we consider the penalty method formulation for generality. Accordingly, consider the following statement:

Find u ∈ H¹(Ω) such that ∀ν ∈ H¹(Ω),

∫_Ω ∇ν : IE : ∇u dΩ = ∫_Ω f · ν dΩ + ∫_Γt t* · ν dA + P* ∫_Γu (u* − u) · ν dA,   (7.1)

where the last term is to be thought of as a penalty term that enforces the applied displacement (kinematic) boundary condition u|Γu = u*, and we relax the condition that the test function vanish on the displacement part of the boundary (ν|Γu = 0). The (penalty) parameter P* is a large positive number. A penalty formulation has a variety of interpretations. It is probably simplest to interpret it as a traction that attempts to restore the correct prescribed displacement:

∫_Γu t · ν dA ≈ P* ∫_Γu (u* − u) · ν dA,   (7.2)


where the term P*(u* − u) takes on the physical interpretation of a very stiff “traction spring” whose force is proportional to the amount of violation of the true boundary displacement.

Remark: In the case where a potential exists, as is the case here, we can motivate this approach by considering an augmented potential J(u, P*) def= J(u) + P* ∫_Γu (u* − u) · (u* − u) dA, u ∈ H¹(Ω), whose variation is:

Find u ∈ H¹(Ω) such that ∀ν ∈ H¹(Ω),

∫_Ω ∇ν : IE : ∇u dΩ = ∫_Ω f · ν dΩ + ∫_Γt t* · ν dA + P* ∫_Γu (u* − u) · ν dA.   (7.3)

Therefore, the penalty term can be thought of as a quadratic augmentation of the potential energy. When no potential exists, the penalty method can only be considered as an enforcement of a constraint.
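As a one-dimensional illustration of this constraint enforcement, the sketch below penalizes the left-end displacement of a two-element bar; as P grows, the computed end displacement is driven toward the prescribed value (the mesh, loads, and material values are illustrative assumptions):

```python
def solve_with_penalty(P):
    """1-D analog of the penalty formulation: a two-element bar with
    unit stiffness per element, no body force, end traction t* = 1 at
    the right end, and u* = 0 enforced at node 0 by a penalty P.
    Solves the 3x3 system by Gaussian elimination (no pivoting)."""
    # Assembled stiffness for two linear elements, plus penalty on node 0.
    K = [[1.0 + P, -1.0, 0.0],
         [-1.0, 2.0, -1.0],
         [0.0, -1.0, 1.0]]
    R = [0.0, 0.0, 1.0]
    n = 3
    for i in range(n):                 # forward elimination
        for k in range(i + 1, n):
            m = K[k][i] / K[i][i]
            for j in range(i, n):
                K[k][j] -= m * K[i][j]
            R[k] -= m * R[i]
    u = [0.0] * n
    for i in range(n - 1, -1, -1):     # back substitution
        u[i] = (R[i] - sum(K[i][j] * u[j] for j in range(i + 1, n))) / K[i][i]
    return u

u = solve_with_penalty(P=1e8)
# u[0] = 1/P, so the violation of u* = 0 shrinks as the penalty grows.
```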

7.2 FEM approximation

It is convenient to write the bilinear form in the following (matrix) manner:

It is clear that in an implementation of the finite element method, the sparsity of D should be taken into account. It is also convenient to write the approximations as

u^h_1(x_1, x_2, x_3) = Σ_j a_{1j} φ_j(x_1, x_2, x_3),
u^h_2(x_1, x_2, x_3) = Σ_j a_{2j} φ_j(x_1, x_2, x_3),
u^h_3(x_1, x_2, x_3) = Σ_j a_{3j} φ_j(x_1, x_2, x_3),

where the φ_j are the nodal shape functions and the a_{ij} are the nodal unknowns.

Explicitly, [K]{a} = {R} is the system of equations that is to be solved.

7.3 Global/local transformations

One strength of the finite element method is that most of the computations can be done in an element-by-element manner. We define the entries of [K],¹

¹ Representing the numerical approximation this way is simply to ease the understanding of the process. On the implementation level, one would not store the matrices in this form, due to the large number of zeroes.

where (X_1i, X_2i, X_3i) are the true spatial coordinates of the ith node, and where φ̂(ζ_1, ζ_2, ζ_3) def= φ(x_1(ζ_1, ζ_2, ζ_3), x_2(ζ_1, ζ_2, ζ_3), x_3(ζ_1, ζ_2, ζ_3)). These types of mappings are usually termed parametric maps. If the polynomial order of the shape functions is as high as that of the element geometry mapping, it is an isoparametric map; if lower, a subparametric map; if higher, a superparametric map.

7.4 Mesh generation and connectivity functions

During the calculations, one needs to be able to connect the local numbering of the nodes to the global numbering of the nodes. For simple geometries, this is a straightforward scheme to automate. For complicated geometries, a lookup array connecting the local node number within an element to the global number is needed. The global/local connection is important since the proper (global) locations of the entries within the element stiffness matrices are needed when solving the system of equations, either by Gaussian elimination or in the element-by-element multiplication of a CG-type solver.

local node #   e#1 node #   e#2 node #   e#3 node #   e#4 node #
     1             1            2            3            4
     2             2            3            4            5
     3             7            8            9           10
     4             6            7            8            9

Table 7.1. The local/global numbers for elements for an arch.
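The lookup array of Table 7.1 is exactly what an assembly routine consumes; a minimal scatter sketch (0-based indices, with dummy element matrices chosen only to exercise the indexing):

```python
def assemble(num_nodes, conn, k_elem):
    """Scatter each element stiffness into the global matrix using
    the local -> global lookup array `conn`, stored element-wise:
    conn[e][local_node] = global_node (0-based)."""
    K = [[0.0] * num_nodes for _ in range(num_nodes)]
    for e, nodes in enumerate(conn):
        for i_loc, i_glob in enumerate(nodes):
            for j_loc, j_glob in enumerate(nodes):
                K[i_glob][j_glob] += k_elem[e][i_loc][j_loc]
    return K

# Connectivity of the four-element arch of Table 7.1, shifted to 0-based:
conn = [[0, 1, 6, 5], [1, 2, 7, 6], [2, 3, 8, 7], [3, 4, 9, 8]]
# Dummy identical 4x4 symmetric element matrices.
ke = [[1.0 if i == j else -0.25 for j in range(4)] for i in range(4)]
K = assemble(10, conn, [ke] * 4)
# A node shared by two elements accumulates two diagonal contributions.
```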


Fig. 7.2. An example of a mapped mesh for a semi-circular strip: four elements with global nodes 1-10 and master coordinates (ζ_1, ζ_2).

7.5 Warning: restrictions on elements

Recall that in one dimension we have calculations of the type in Equation (7.15). Clearly, a zero Jacobian will lead to problems, and potentially singular integrals. In one dimension this was easy to avoid, since the nodes are numbered sequentially, and as long as the nodes do not coincide this cannot happen, since J = h/2. However, J < 0 is clearly not physical, because it implies that neighboring nodes get mapped inside out (through one another). Negative Jacobians can also lead to indefinite stiffness matrices. As in one-dimensional formulations, for two- and three-dimensional formulations one has to ensure that J = det F > 0 throughout the domain. There are two ways that nonpositive Jacobians can occur: (1) the elements are nonconvex, and (2) the node numbering is incorrect, forcing the elements to be pulled inside out. We must ensure that J > 0, since J has a physical meaning: it is the ratio of the differential volume of the finite element to that of the master element. If the nodes are numbered correctly


to ensure that nodes are not pulled “inside out” (for example, see Figure 7.2) and that the elements are convex, then J > 0.

7.5.1 Good and bad elements: examples

For the four cases (Figure 7.3), we have:

• Case 1: This element is acceptable, since 0 < J(ζ_1, ζ_2) < ∞ throughout the element. The Jacobian is constant in this case.
• Case 2: This element is unacceptable, since J(ζ_1, ζ_2) < 0 throughout the element. The essential problem is that the nodes are numbered incorrectly, turning the element “inside out”.
• Case 3: This element is acceptable, since 0 < J(ζ_1, ζ_2) < ∞ throughout the element. While the Jacobian is not constant throughout the element domain, it is positive and bounded.
• Case 4: This element is unacceptable, since J(ζ_1, ζ_2) < 0 in regions of the element. Even though the Jacobian is positive in some portions of the domain, a negative Jacobian in other parts can cause problems, such as potential singularities in the stiffness matrix.

For linear elements, the key indicator of a problematic element is nonconvexity (even if the element is numbered correctly).
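These sign checks can be automated. The sketch below evaluates the Jacobian of the bilinear map for a 4-node quad (the two-dimensional analog of the brick elements used later) and detects an inside-out numbering by its negative determinant; the element coordinates are illustrative assumptions:

```python
def jacobian(xs, ys, z1, z2):
    """Determinant of the bilinear map for a 4-node quad with corner
    coordinates (xs, ys) listed counterclockwise, evaluated at master
    coordinates (z1, z2) in [-1, 1]^2."""
    # Derivatives of the bilinear shape functions N_i = 1/4 (1±z1)(1±z2).
    dN_dz1 = [-(1 - z2) / 4, (1 - z2) / 4, (1 + z2) / 4, -(1 + z2) / 4]
    dN_dz2 = [-(1 - z1) / 4, -(1 + z1) / 4, (1 + z1) / 4, (1 - z1) / 4]
    dx1 = sum(d * x for d, x in zip(dN_dz1, xs))
    dx2 = sum(d * x for d, x in zip(dN_dz2, xs))
    dy1 = sum(d * y for d, y in zip(dN_dz1, ys))
    dy2 = sum(d * y for d, y in zip(dN_dz2, ys))
    return dx1 * dy2 - dx2 * dy1

# Case 1: unit square, correctly numbered -> constant positive J = 1/4
# (area ratio of the element to the 2x2 master square).
good = jacobian([0.0, 1.0, 1.0, 0.0], [0.0, 0.0, 1.0, 1.0], 0.0, 0.0)
# Case 2: clockwise numbering -> the element is turned inside out.
bad = jacobian([0.0, 0.0, 1.0, 1.0], [0.0, 1.0, 1.0, 0.0], 0.0, 0.0)
```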

7.6 Three-dimensional shape functions

For the remainder of the monograph, we will illustrate the Finite Element Method's construction with so-called trilinear “brick” elements. The master element shape functions form a nodal basis of trilinear approximation given by

φ̂_i(ζ_1, ζ_2, ζ_3) = (1/8)(1 ± ζ_1)(1 ± ζ_2)(1 ± ζ_3),   i = 1, ..., 8,

where the signs are chosen so that φ̂_i is unity at node i of the master element and zero at the other seven nodes.

For trilinear elements we have a nodal basis consisting of 8 nodes and, since the approximation is vector-valued, 24 total degrees of freedom (three degrees of freedom for each of the eight nodes).

Remark: For standard quadratic elements, we have a nodal basis consisting of 27 nodes (Figure 7.4) and, since the approximation is vector-valued, 81 total degrees of freedom (three degrees of freedom for each of the 27 nodes). The nodal shape functions can be derived quite easily by exploiting the fact that they form a nodal basis: each is unity at its corresponding node and zero at all other nodes. For more details on the construction of higher-order elements, see Becker, Carey and Oden [7].
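A sketch of the trilinear nodal basis follows; the corner ordering is an assumption (conventions differ between texts), and the code verifies the nodal-basis and partition-of-unity properties:

```python
# Master-element corner signs for the 8 nodes of a trilinear brick
# (an assumed ordering: bottom face counterclockwise, then top face).
CORNERS = [(-1, -1, -1), (1, -1, -1), (1, 1, -1), (-1, 1, -1),
           (-1, -1, 1), (1, -1, 1), (1, 1, 1), (-1, 1, 1)]

def phi_hat(i, z1, z2, z3):
    """Trilinear shape function of node i at (z1, z2, z3) in [-1,1]^3:
    phi_i = 1/8 (1 + s1 z1)(1 + s2 z2)(1 + s3 z3), with (s1, s2, s3)
    the corner signs of node i."""
    s1, s2, s3 = CORNERS[i]
    return 0.125 * (1 + s1 * z1) * (1 + s2 * z2) * (1 + s3 * z3)

# Nodal-basis property: unity at its own node, zero at the others.
vals = [phi_hat(0, *CORNERS[j]) for j in range(8)]
# Partition of unity at an arbitrary interior point.
total = sum(phi_hat(i, 0.2, -0.3, 0.7) for i in range(8))
```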

7.7 Differential properties of shape functions

We note that the φ_i in the actual domain are never really computed; we actually start with the φ̂_i in the master domain. Therefore, in the stiffness matrix and right-hand-side element calculations, all terms must be defined in terms of the local coordinates. With this in mind, we lay down some fundamental relations, which are directly related to the concepts of deformation presented in our discussion of continuum mechanics. It is not surprising that a deformation gradient reappears, in the form

F def= [∂x_i/∂ζ_j],   i, j = 1, 2, 3,

the 3 × 3 matrix of partial derivatives of the spatial coordinates with respect to the master coordinates.

Finally, with quadrature for each element, we can form each of the element contributions to [K]{a} = {R}:

• For the stiffness matrix:

[K^e] = Σ_{q=1}^{g} Σ_{r=1}^{g} Σ_{s=1}^{g} w_q w_r w_s ([D̂][φ̂])^T [IE] ([D̂][φ̂]) |F|   (standard)
      + Σ_{q=1}^{g} Σ_{r=1}^{g} w_q w_r P* [φ̂]^T {φ̂} |F^s|,   (penalty, for Γu ∩ ∂Ω_e ≠ ∅)   (7.38)

• For the load vector:

{R^e} = Σ_{q=1}^{g} Σ_{r=1}^{g} Σ_{s=1}^{g} w_q w_r w_s [φ̂]^T {f} |F|   (standard)
      + Σ_{q=1}^{g} Σ_{r=1}^{g} w_q w_r [φ̂]^T {t*} |F^s|   (for Γt ∩ ∂Ω_e ≠ ∅)
      + Σ_{q=1}^{g} Σ_{r=1}^{g} w_q w_r P* [φ̂]^T {u*} |F^s|,   (penalty, for Γu ∩ ∂Ω_e ≠ ∅)   (7.39)

where the w_q, etc., are Gauss weights and |F^s| represents the (surface) Jacobians of element faces on the exterior surface of the body, where, depending on the surface on which it is to be evaluated, one of the ζ components will be +1 or −1. These surface Jacobians can be evaluated in a variety of ways, for example using Nanson's formula, which is derived in Appendix 2 and discussed further shortly.
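The triple sums in (7.38)-(7.39) are tensor-product Gauss rules over the master brick; a minimal sketch of the g = 2 rule (points ±1/√3 with unit weights), applied to simple integrands whose exact values are known:

```python
import math

def gauss_brick(f, g=2):
    """Approximate the integral of f(z1, z2, z3) over the master brick
    [-1,1]^3 with a tensor-product Gauss rule; g = 2 uses the points
    +/- 1/sqrt(3) with unit weights and is exact for polynomials of
    degree <= 3 in each coordinate."""
    assert g == 2, "only the 2-point rule is sketched here"
    pts = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))
    w = 1.0
    total = 0.0
    for zq in pts:          # the triple sum of Equations (7.38)-(7.39)
        for zr in pts:
            for zs in pts:
                total += w * w * w * f(zq, zr, zs)
    return total

# Volume of the master element: the integral of 1 is 8.
vol = gauss_brick(lambda z1, z2, z3: 1.0)
# The rule is exact for z1^2 z2^2 z3^2: (2/3)^3 = 8/27.
val = gauss_brick(lambda z1, z2, z3: (z1 * z2 * z3) ** 2)
```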


7.8.1 Implementation issues

Following procedures similar to those for one-dimensional problems, the global stiffness matrix K(I, J) can be efficiently stored in an element-by-element manner via k(e, i, j), where i and j are the local entries in element number e. The amount of memory required with this relatively simple storage system is, for trilinear hexahedra, k(e, 24, 24) = 576 entries times the number of finite elements, where the k are the individual element stiffness matrices. If matrix symmetry is taken into account, the memory requirements are 300 entries times the number of finite elements. This simple approach is so-called element-by-element storage, and it is critical to reducing the memory requirements.³ For an element-by-element storage scheme, a global/local index relation must be made to connect the local entry to the global entry for the subsequent linear algebraic solution processes. This is a relatively simple and efficient storage system to encode. The element-by-element strategy has other advantages with regard to element-by-element CG solvers, as introduced earlier. The actual computational cost of the matrix-vector multiplication in an element-by-element CG method is a [24 × 24] matrix times a {24 × 1} vector times the number of elements. This is an O(N) calculation. If we consider I iterations necessary for convergence below an error tolerance, then the entire operation costs O(IN).

7.8.2 An example of the storage scaling

Element-by-element storage reduces the storage requirements dramatically. For example, consider a cube meshed uniformly with M elements in each direction (Figure 7.5), thus (M + 1)³ nodes and 3(M + 1)³ degrees of freedom for elasticity problems. A comparison of storage yields:

• Direct storage: 3(M + 1)³ × 3(M + 1)³ = 9 O(M⁶),
• Element-by-element storage: M³ × 24 × 24 = 576 M³, and
• Element-by-element storage with symmetry reduction: 300 M³.
Clearly, the ratio of direct storage to element-by-element storage scales cubically, as O(M³):

9 O(M⁶) / (300 M³) = (3/100) O(M³),

and, comparing the operation counts of a direct solution with those of an element-by-element CG solution,

O(N³) / (I O(N)) = (1/I) O(N²) = (1/I) O((3(M + 1)³)²).
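The scaling above can be tabulated for a concrete mesh; a small sketch (entry counts only, ignoring index arrays and any sparse-format overhead):

```python
def storage_counts(M):
    """Storage, in matrix entries, for a cube meshed with M^3 trilinear
    elasticity elements: dense direct storage of K versus
    element-by-element storage, with and without symmetry reduction."""
    dof = 3 * (M + 1) ** 3          # three unknowns per node
    return {
        "direct": dof * dof,        # DOF x DOF dense matrix
        "ebe": 576 * M ** 3,        # 24 x 24 per element
        "ebe_symmetric": 300 * M ** 3,
    }

c = storage_counts(10)  # a modest 10 x 10 x 10 mesh
```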

³ If direct storage of the finite element stiffness matrix were attempted, the memory requirements would be K(DOF, DOF) = DOF × DOF, where DOF indicates the total number of degrees of freedom, which for large problems would be extremely demanding.

Of course there are other compact storage schemes, and we refer the reader to the references for details.

7.9 Surface Jacobians and Nanson's formula

In order to compute surface integrals for a general element that intersects the exterior surface, one must (Figure 7.6):

1. Identify which element face of the master element corresponds to that surface. One of the ζ-coordinates (ζ_1, ζ_2 or ζ_3) must be set equal to ±1 for each face that corresponds to an exposed surface of the body where boundary conditions are imposed.

Generally, we seek to integrate a quantity Q over the surface of the actual, deformed element by computing over the master element, for which we can use standard Gaussian quadrature:

∫_∂Ω_e Q dA_e = ∫ Q̂ dÂ_e,   (7.40)

where the a_{1i}, a_{2i} and a_{3i} are the nodal values at node i for the x_1, x_2 and x_3 components, and where the global coordinates must be transformed to the


master system, in both the deformation tensor and the displacement representation.

7.10 Post processing

Typically, within each element, at each Gauss point, we add up all eight contributions (from the basis functions) for each of the six strain components, then multiply by the corresponding nodal displacements that have previously been calculated. Gauss point locations are the preferred locations at which to post-process the solution, since they typically exhibit so-called superconvergent properties (they are more accurate than the theoretical estimates indicate). In other words, they are usually the most accurate locations of the finite element approximation (see Ainsworth and Oden [1], Zienkiewicz and Taylor [26] and Zienkiewicz and Zhu [27]). The following expressions must be evaluated at the Gauss points, multiplied by the appropriate weights and added together: ∂u^h_1/∂x_1, ...

8.1 Introduction

As we have seen in the one-dimensional analysis, the essential idea in the finite element method is to select a finite dimensional subspatial approximation of the true solution and form the following weak boundary value problem: Find u^h ∈ H^h_u(Ω) ⊂ H^1(Ω), with u^h|_{Γu} = u*, such that

\[
\underbrace{\int_{\Omega} \nabla \nu^h : IE : \nabla u^h \, d\Omega}_{B(u^h,\, \nu^h)}
= \underbrace{\int_{\Omega} f \cdot \nu^h \, d\Omega + \int_{\Gamma_t} t^* \cdot \nu^h \, dA}_{F(\nu^h)},
\qquad \forall \nu^h \in H^h_v(\Omega) \subset H^1(\Omega), \ \text{with} \ \nu^h|_{\Gamma_u} = 0. \tag{8.1}
\]

The critical point is that H^h_u(Ω), H^h_v(Ω) ⊂ H^1(Ω). This “inner” approximation allows the development of straightforward subspatial error estimates. We will choose H^h_u(Ω) and H^h_v(Ω) to coincide. We have for any kinematically admissible function, w, a definition of the so-called energy norm

\[
\|u - w\|^2_{E(\Omega)} \overset{\text{def}}{=} \int_{\Omega} (\nabla u - \nabla w) : IE : (\nabla u - \nabla w) \, d\Omega = B(u - w,\, u - w). \tag{8.2}
\]

Note that in the event that nonconstant displacements are specified on the boundary, then u − w = constant is unobtainable unless u − w = 0, and the semi-norm in Equation (8.2) is a norm in the strict mathematical sense. Under standard assumptions, the fundamental a-priori error estimate for the finite element method is

\[
\|u - u^h\|_{E(\Omega)} \le C(u, p)\, h^{\min(r-1,\, p)} \overset{\text{def}}{=} \gamma, \tag{8.3}
\]


8 Accuracy of the finite element method in three dimensions

where p is the (complete) polynomial order of the finite element method used, r is the regularity of the exact solution, and C is a global constant dependent on the exact solution and the polynomial approximation, but independent of h, the maximum element diameter. For details, see Ainsworth and Oden [1], Becker, Carey and Oden [7], Carey and Oden [8], Oden and Carey [20], Hughes [15], Szabo and Babuska [24] and Bathe [6] for more mathematically precise treatments. Remark: As we have mentioned previously, we note that the set of functions specified by H^h_u(Ω) ⊂ H^1(Ω) with u^h|_{Γu} = u* is technically not a space of functions and should be characterized as “a linear variety”. This does not pose a problem for the ensuing analysis; however, for precise mathematical details see Oden and Demkowicz [21].
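The role of h in an estimate like (8.3) can be observed numerically. The sketch below (Python; the function and mesh sizes are illustrative choices, not from the text) measures the convergence rate of the piecewise-linear interpolant of a smooth function in the max norm, where the expected rate is p + 1 = 2 for p = 1 (the energy-norm rate in (8.3) is h^{min(r−1, p)}):

```python
import numpy as np

def interp_error(n):
    """Max error of the piecewise-linear interpolant of sin(pi x) on n elements."""
    xn = np.linspace(0.0, 1.0, n + 1)           # mesh nodes, h = 1/n
    xf = np.linspace(0.0, 1.0, 20 * n + 1)      # fine sampling points
    uh = np.interp(xf, xn, np.sin(np.pi * xn))  # piecewise-linear interpolant
    return np.max(np.abs(uh - np.sin(np.pi * xf)))

# observed rate: log2 of the error ratio when h is halved
rates = [np.log2(interp_error(n) / interp_error(2 * n)) for n in (8, 16, 32, 64)]
# the rates approach 2, i.e. error ~ C h^2 for a smooth solution
```

For solutions of low regularity (small r), the observed rate degrades exactly as the exponent min(r − 1, p) in (8.3) predicts.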

8.2 The “Best Approximation” theorem

As in the one-dimensional case, the finite element solution is the best approximation to u in the energy norm:

\[
\|u - u^h\|_{E(\Omega)} \le \|u - w^h\|_{E(\Omega)}, \qquad \forall\, w^h \in H^h_u(\Omega).
\]

Remarks: As for one-dimensional problems, while this scheme provides a simple estimate for the global mesh fineness needed, the meshes need to be locally refined to ensure tolerable accuracy throughout the domain.

8.4 Local error estimation and adaptive mesh refinement

To drive local mesh refinement schemes there are a variety of error estimation procedures. We mention the two main ones: Recovery Methods and Residual Methods.


Fig. 8.3. The Zienkiewicz-Zhu error estimator takes the solution at neighboring Gauss points to estimate the error at a node.

8.4.1 A-Posteriori Recovery Methods

So-called recovery methods are based on the assumption that there is a function G(u^h) that is closer to ∇u than ∇u^h, which can be used to estimate the error. The most popular of these is the Zienkiewicz-Zhu [27] estimator. Zienkiewicz and Zhu developed an error estimation technique that is effective for a wide class of problems. It is based on the notion that gradients of the solution obtained on a given mesh can be smoothed and compared with the original solution to assess the error. The sampling points at which the gradient’s error is to be evaluated are so-called superconvergent points, where the convergence is above optimal. However, these points must be searched for, and may not even exist, i.e. superconvergence occurs only in very special situations. By superconvergence, we mean that the exponent is higher than the standard theoretical estimate (θ):

\[
\|u - u^h\|_{H^s(\Omega)} \le C(u, p)\, h^{\min(p+1-s,\; r-s)} \overset{\text{def}}{=} \theta. \tag{8.15}
\]


The function G is obtained by calculating a least squares fit to the gradient at sample superconvergent points (potentially several hundred in three dimensions) of elements surrounding a finite element node (Figure 8.3). The new gradient then serves to estimate the error locally over a “patch” of elements, i.e. a group of elements sharing a common node:

\[
\|G(u^h) - \nabla u^h\|_{\text{patch}} \approx \text{error}. \tag{8.16}
\]
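A one-dimensional caricature of the recovery step (all numbers hypothetical): the FE gradient of a piecewise-linear solution is element-wise constant; sampling it at the element midpoints (the 1D superconvergent points) and least-squares fitting a linear polynomial gives a smoothed gradient G(u^h), whose deviation from ∇u^h acts as the patch indicator (8.16):

```python
import numpy as np

# patch of two linear elements sharing the node x = 0.5
gauss_x = np.array([0.25, 0.75])      # element midpoints (1D superconvergent points)
grad_uh = np.array([1.0, 3.0])        # element-wise constant FE gradients (hypothetical)

# least-squares fit of a linear polynomial a0 + a1*x to the sampled gradients
Amat = np.vstack([np.ones_like(gauss_x), gauss_x]).T
coef, *_ = np.linalg.lstsq(Amat, grad_uh, rcond=None)
recovered = coef[0] + coef[1] * 0.5   # smoothed gradient G(u^h) at the shared node

# discrete analogue of || G(u^h) - grad u^h ||_patch
indicator = np.sqrt(np.sum((recovered - grad_uh) ** 2))
```

In three dimensions the same least-squares fit is done componentwise over all sampling points of the elements surrounding the node, as in Figure 8.3.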

This is by far the most popular method in the engineering community to estimate the error, and it also has the benefit of postprocessing stresses as a by-product.

8.4.2 A-Posteriori Residual Methods

Residual methods require no a-posteriori system of equations to be solved. Such methods bound the error by making use of computable local residual quantities, for example element interior residuals (r1), inter-element jumps (r2) and boundary residuals (r3), which appear in the indicator

\[
\zeta_e^2 = C_1 h_e^2 \|r_1\|^2_{L^2(\Omega_e)} + C_2 h_{e_I} \|[\![r_2]\!]\|^2_{L^2(\partial\Omega_I)} + C_3 h_{e_J} \|r_3\|^2_{L^2(\partial\Omega_{J_B})}. \tag{8.18}
\]

The local quantities ζe are used to decide whether an element is to be refined (Figure 4.3). If ζe > TOL, then the element is refined. Such estimates, used to guide local adaptive finite element mesh refinement techniques, were first developed in Babuška and Rheinboldt [4] for one-dimensional problems, and in Babuška and Miller [5] and Kelly et al. [16] for two-dimensional problems. For reviews, see Ainsworth and Oden [1].

9 Time-dependent problems

9.1 Introduction

We now give a brief introduction to time-dependent problems through the equations of elastodynamics for infinitesimal deformations

\[
\nabla \cdot \sigma + f = \rho_0 \frac{d^2 u}{dt^2} = \rho_0 \frac{dv}{dt}, \tag{9.1}
\]

where ∇ = ∇_X and d/dt = ∂/∂t (see Appendix 2).

9.2 Generic time-stepping

In order to motivate the time-stepping process, we first start with the dynamics of a single point mass under the action of a force Ψ. The equation of motion is given by (Newton’s Law)

\[
m\dot{v} = \Psi, \tag{9.2}
\]

where Ψ is the total force applied to the particle. Expanding the velocity in a Taylor series about t + θ∆t, where 0 ≤ θ ≤ 1, for v(t + ∆t), we obtain

\[
v(t + \Delta t) = v(t + \theta\Delta t) + (1 - \theta)\Delta t\, \dot{v}(t + \theta\Delta t) + O(\Delta t^2).
\]

We note that

• When θ = 1, this is the (implicit) Backward Euler scheme, which is very stable (very dissipative) and \(\hat{O}(\Delta t) = O(\Delta t^2)\) locally in time,

9.3 Application to the continuum formulation


• When θ = 0, this is the (explicit) Forward Euler scheme, which is conditionally stable and \(\hat{O}(\Delta t) = O(\Delta t^2)\) locally in time,
• When θ = 0.5, this is the (implicit) “Midpoint” scheme, which is stable and \(\hat{O}(\Delta t) = O(\Delta t^3)\) locally in time.

In summary, we have for the velocity

which requires one to solve a system of algebraic equations, while for an explicit (Forward Euler) method, θ = 0, and [M] is usually approximated by an easy-to-invert matrix, such as a diagonal matrix, [M] ≈ M[1], to make the matrix inversion easy, yielding:

\[
\{\dot{a}\}^{L+1} = \{\dot{a}\}^L + \Delta t\, [M]^{-1}\!\left(-[K]\{a\}^L + \{R_f\}^L + \{R_t\}^L\right), \tag{9.27}
\]


augmented with

\[
\{a\}^{L+1} = \{a\}^L + \Delta t\, \{\dot{a}\}^L. \tag{9.28}
\]
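As a concrete sketch, the explicit update (9.27)-(9.28) with a lumped (diagonal) mass matrix takes only a few lines. The stiffness matrix, mass entries, initial data and time step below are hypothetical toy values, not from the text:

```python
import numpy as np

# Explicit update mirroring (9.27)-(9.28):
#   {adot}^{L+1} = {adot}^L + dt * [M]^{-1} ( -[K]{a}^L + {R}^L )
#   {a}^{L+1}    = {a}^L + dt * {adot}^L
n = 4
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # toy tridiagonal stiffness
M = np.full(n, 0.5)                                     # lumped mass: diagonal entries only
R = np.zeros(n)                                         # no external load

a = np.sin(np.pi * np.arange(1, n + 1) / (n + 1))       # initial displacement
adot = np.zeros(n)                                      # initial velocity
dt = 0.01
for _ in range(100):
    adot_new = adot + dt * (-(K @ a) + R) / M           # "inverting" [M] is a division
    a = a + dt * adot                                   # (9.28) uses the old velocity
    adot = adot_new
```

Because [M] is diagonal, no linear system is solved at each step; the price is conditional stability, so ∆t must resolve the highest frequency of [M]^{-1}[K].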

There is an enormous number of time-stepping schemes. For general time-stepping, we refer the reader to the seminal texts of Hairer et al. [13, 14]. In the finite element context, we refer the reader to Bathe [6], Becker, Carey and Oden [7], Hughes [15] and Zienkiewicz and Taylor [26].
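The stability and accuracy claims for the θ-family above can be checked on the scalar model problem v̇ = λv, a stand-in for (9.2); the step size and decay rate below are illustrative:

```python
import numpy as np

def theta_step(v, dt, lam, theta):
    """One theta-scheme step for v' = lam * v:
    theta = 0 forward Euler, theta = 1 backward Euler, theta = 0.5 midpoint."""
    # (1 - theta*dt*lam) v_{n+1} = (1 + (1 - theta)*dt*lam) v_n
    return (1.0 + (1.0 - theta) * dt * lam) / (1.0 - theta * dt * lam) * v

lam, dt, steps = -1.0, 0.1, 50
exact = np.exp(lam * dt * steps)
errors = {}
for theta in (0.0, 0.5, 1.0):
    v = 1.0
    for _ in range(steps):
        v = theta_step(v, dt, lam, theta)
    errors[theta] = abs(v - exact)
# the midpoint rule (theta = 0.5) produces the smallest error,
# consistent with its higher local accuracy
```

Running the same loop with a larger ∆t shows forward Euler diverging once |1 + ∆tλ| > 1, while both implicit members remain bounded.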

10 Summary and advanced topics

The finite element method is a huge field of study. This set of notes was designed to give students only a brief introduction to the fundamentals of the method. The implementation, theory and application of FEM is a subject of immense literature. For general references on the subject, see the well-known books of Ainsworth and Oden [1], Becker, Carey and Oden [7], Carey and Oden [8], Oden and Carey [20], Hughes [15], Szabo and Babuska [24], Bathe [6] and Zienkiewicz and Taylor [26]. For a review of the state of the art in Finite Element Methods, see the relatively recent book of Wriggers [25]. Much of the modern research activity in computational mechanics reflects the growing industrial demands for rapid simulation of large-scale, nonlinear, time-dependent problems. Accordingly, the next concepts the reader should focus on are: 1. 2. 3. 4.

The last item is particularly important. Thus, we close with a few comments on domain decomposition and parallel processing. In many cases, in particular in three dimensions, for a desired accuracy the meshes need to be so fine that the number of unknowns outstrips the available computing power on a single serial processing machine. One approach to deal with this problem is domain decomposition: decomposing a domain into parts (subdomains) that can be solved independently by estimating the boundary conditions, solving the decoupled subdomains, correcting the boundary conditions by updating them with information from the computed solutions, and repeating the procedure. This approach has become popular over the last 20 years as a means of harnessing the computational power afforded by parallel processing machines.

where 1 ≤ γ ≤ 3 is an exponent that reflects the extremes of solving efficiency. The ratio of the amount of work done by solving the total domain to that of solving the subdomain problems (taking into account the number of iterations (I) needed to update the interface boundary conditions) is approximately

\[
\frac{C_d}{I\, C_{sd}} \propto \frac{\left((3(NM + 1))^3\right)^\gamma}{I\left((3(N + 1))^3\right)^\gamma} \approx \frac{M^{3\gamma}}{I}, \tag{10.3}
\]

where we have ignored the costs of computing the updated interface conditions (considered small). If we assume that the amount of time to solve is also proportional to the operation counts, and assume that each subdomain is processed in the same amount of time, using P processors yields:

\[
\frac{C_d}{I\, C_{sd}/P} = \frac{P\, M^{3\gamma}}{I}.
\]

This idealized simple example illustrates the possible benefits in reduction of solution time, independent of the gains in data storage. For a historical overview, as well as a thorough analysis of the wide range of approaches, see Le Tallec [17]. In many cases, interprocessor communication and synchronization can be a bottleneck to obtaining a high-performance parallel algorithm. The parallel speedup (relative to a sequential implementation), S, can be approximated by Amdahl’s law (Amdahl [2]), S = 1/(1 − P), where P is the fraction of the algorithm that is parallelizable. For example, if 40% of the code is inherently sequential, then P = 0.6 and S = 2.5. This provides an upper bound on the utility of adding more processors. A related expression is “Gustafson’s law” (Gustafson [12]), S(P) = P − k(P − 1), where k represents the part of the algorithm that is not parallelizable. Amdahl’s law assumes that the


problem is of fixed size and that the sequential part is independent of the number of processors; Gustafson’s law does not make either of these assumptions. We refer the reader to the works of Papadrakakis et al. [22, 23] for parallel strategies that are directly applicable to the class of problems of interest. Remarks: Some comments on the convergence of such iterative schemes are provided in Appendix 3.
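The two speedup laws are one-liners; the following sketch reproduces the worked example from the text (40% sequential, P = 0.6, S = 2.5):

```python
def amdahl_speedup(parallel_fraction):
    # Amdahl's law with unbounded processors: S = 1 / (1 - P)
    return 1.0 / (1.0 - parallel_fraction)

def gustafson_speedup(processors, serial_fraction):
    # Gustafson's law: S(P) = P - k*(P - 1), k = non-parallelizable part
    return processors - serial_fraction * (processors - 1)

s = amdahl_speedup(0.6)  # the example from the text: 40% sequential -> S = 2.5
```

Note the different scaling behavior: Amdahl’s bound is independent of the processor count, while Gustafson’s grows with P because the problem size is assumed to grow with the machine.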

This represents the volume of a parallelepiped formed by the three vectors.


12 Appendix 1: elementary mathematical concepts

12.2 Vector calculus

We have the following elementary operations:

• The divergence of a vector (a contraction to a scalar) is defined by ∇ · u = u_{i,i} (12.4), whereas for a second order tensor (a contraction to a vector): ∇ · A has components A_{ij,j} (12.5).

• The gradient of a vector (a dilation to a second order tensor): ∇u has components u_{i,j} (12.6), whereas for a second order tensor (a dilation to a third order tensor): ∇A has components A_{ij,k} (12.7).

• The gradient of a scalar (a dilation to a vector): ∇φ has components φ_{,i} (12.8).

The scalar product of two second order tensors, for example the gradients of first order vectors, is defined as

\[
\nabla v : \nabla u \overset{\text{def}}{=} \underbrace{\frac{\partial v_i}{\partial x_j}\frac{\partial u_i}{\partial x_j} = v_{i,j}\, u_{i,j}}_{\text{in Cartesian bases}}, \qquad i, j = 1, 2, 3, \tag{12.9}
\]

where ∂u_i/∂x_j, ∂v_i/∂x_j are partial derivatives of u_i and v_i, and where u_i, v_i are the Cartesian components of u and v, and ∇u · n has components of

\[
\underbrace{u_{i,j}\, n_j}_{\text{in Cartesian bases}}, \qquad i, j = 1, 2, 3. \tag{12.10}
\]

• The divergence theorem for vectors is

\[
\int_\Omega \nabla \cdot u \, d\Omega = \int_{\partial\Omega} u \cdot n \, dA, \qquad \int_\Omega u_{i,i} \, d\Omega = \int_{\partial\Omega} u_i n_i \, dA, \tag{12.11}
\]

and analogously for tensors

\[
\int_\Omega \nabla \cdot B \, d\Omega = \int_{\partial\Omega} B \cdot n \, dA, \qquad \int_\Omega B_{ij,j} \, d\Omega = \int_{\partial\Omega} B_{ij} n_j \, dA, \tag{12.12}
\]

where n is the outward normal to the bounding surface.


These standard operations arise throughout the analysis.

12.3 Interpretation of the gradient of functionals

The elementary concepts to follow are important for understanding iterative solvers. Consider a surface in space defined by

\[
\Pi(x_1, x_2, \ldots, x_N) = C. \tag{12.13}
\]

Consider a unit vector b and the inner product forming the directional derivative (the rate of change of Π in the direction of b):

\[
\nabla\Pi \cdot b = \|b\|\, \|\nabla\Pi\| \cos\gamma. \tag{12.14}
\]

The directional derivative is maximized when γ = 0, in other words when b and ∇Π are colinear. Since we can represent curves on the surface defined by Π = C by a position vector (t is a parameter)

\[
r = x_1(t)\, e_1 + x_2(t)\, e_2 + \ldots + x_N(t)\, e_N,
\]

we immediately see (differentiating Π(r(t)) = C gives ∇Π · dr/dt = 0 for every such curve) that ∇Π is normal to the surface and represents the direction of maximum change.
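Since −∇Π is the direction of steepest descent, the minimizer of a quadratic functional Π(x) = ½ xᵀ[A]x − bᵀx (whose gradient is [A]x − b) can be found by repeatedly stepping along −∇Π; this is the basic picture behind gradient-based iterative solvers. The matrix and vector below are hypothetical:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])            # symmetric positive definite
b = np.array([1.0, 2.0])

x = np.zeros(2)
for _ in range(200):
    g = A @ x - b                     # gradient of Pi at the current iterate
    if np.linalg.norm(g) < 1e-12:
        break
    alpha = (g @ g) / (g @ (A @ g))   # exact line search along -g for a quadratic
    x = x - alpha * g
# x converges to the minimizer, i.e. the solution of A x = b
```

Conjugate gradient methods accelerate this same idea by making successive search directions [A]-orthogonal instead of reusing −∇Π directly.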

12.4 Matrix manipulations

Throughout the next few definitions, we consider the matrix [A]. The matrix [A] is said to be symmetric if [A] = [A]^T and skew-symmetric if [A] = −[A]^T. A first order contraction (inner product) of two matrices is defined by

\[
A \cdot B = [A][B] \ \text{with components} \ A_{ij}B_{jk} = C_{ik}, \tag{12.18}
\]

where it is clear that the range of the inner index j must be the same for [A] and [B]. The second order inner product of two matrices is

\[
A : B = A_{ij}B_{ij} = \mathrm{tr}([A]^T[B]). \tag{12.19}
\]


The rule of transposes for the product of two matrices is

\[
([A][B])^T = [B]^T[A]^T. \tag{12.20}
\]

The rule of inverses for two invertible n × n matrices is

\[
([A][B])^{-1} = [B]^{-1}[A]^{-1}.
\]

12.4.2 Eigenvalues

The mathematical definition of an eigenvalue, a scalar denoted Λ, and eigenvector, a vector denoted E, of a matrix [A] is

\[
[A]\{E\} = \Lambda\{E\}. \tag{12.25}
\]

Some main properties to remember about eigenvalues and eigenvectors are:

1. If [A] (n × n) has n linearly independent eigenvectors, then it is diagonalizable by a matrix formed by columns of the eigenvectors. In the case of a 3 × 3 matrix,

\[
\begin{bmatrix} \Lambda_1 & 0 & 0 \\ 0 & \Lambda_2 & 0 \\ 0 & 0 & \Lambda_3 \end{bmatrix}
=
\begin{bmatrix} E_1^{(1)} & E_1^{(2)} & E_1^{(3)} \\ E_2^{(1)} & E_2^{(2)} & E_2^{(3)} \\ E_3^{(1)} & E_3^{(2)} & E_3^{(3)} \end{bmatrix}^{-1}
\begin{bmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{bmatrix}
\begin{bmatrix} E_1^{(1)} & E_1^{(2)} & E_1^{(3)} \\ E_2^{(1)} & E_2^{(2)} & E_2^{(3)} \\ E_3^{(1)} & E_3^{(2)} & E_3^{(3)} \end{bmatrix}. \tag{12.26}
\]

2. If [A] (n × n) has n distinct eigenvalues, then the eigenvectors are linearly independent.
3. If [A] (n × n) is symmetric, then its eigenvalues are real. If the eigenvalues are distinct, then the eigenvectors are orthogonal.

A quadratic form is defined as {x}^T[A]{x}, and is positive when [A] has positive eigenvalues. Explicitly, for a 3 × 3 matrix, we have

\[
\{x\}^T[A]\{x\} \overset{\text{def}}{=}
\begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}
\begin{bmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix}. \tag{12.27}
\]

A matrix [A] is said to be positive definite if the quadratic form is positive for all nonzero vectors x. Clearly, from equation (12.26), a positive definite matrix must have positive eigenvalues. Remark: If we set the determinant det[A − Λ1] = 0, it can be shown that the so-called “characteristic” polynomial is, for example for a 3 × 3 matrix:

\[
\det(A - \Lambda 1) = -\Lambda^3 + I_A \Lambda^2 - II_A \Lambda + III_A = 0.
\]

Since I_A, II_A and III_A can be written in terms of traces of powers of [A] (e.g. trA), which are invariant under frame rotation, they too are invariant under frame rotation.
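These properties are easy to verify numerically; a small sketch with an arbitrary symmetric matrix (illustrative data):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])       # symmetric, so the eigenvalues are real

lams, E = np.linalg.eigh(A)           # eigenvalues and orthonormal eigenvectors
diag = E.T @ A @ E                    # for symmetric A, [E]^{-1} = [E]^T, cf. (12.26)

x = np.array([1.0, -2.0, 0.5])
quad = x @ A @ x                      # quadratic form (12.27); positive for SPD A

th = 0.3
Q = np.array([[np.cos(th), -np.sin(th), 0.0],
              [np.sin(th),  np.cos(th), 0.0],
              [0.0,         0.0,        1.0]])  # rotation about x3, cf. Figure 12.1
A_hat = Q @ A @ Q.T                   # coordinate transformation (12.30), Q^{-1} = Q^T
# trace (an invariant) is unchanged by the rotation
```

The same check with a reflection matrix in place of Q illustrates that the invariants are preserved under any orthogonal change of frame.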


12.4.3 Coordinate transformations

To perform a coordinate transformation of a 3 × 3 matrix [A] from one Cartesian coordinate system to another (denoted with a \(\hat{(\cdot)}\)), we apply a transformation matrix [Q]:

\[
[\hat{A}] = [Q][A][Q]^{-1}. \tag{12.30}
\]


Fig. 12.1. Top: reflection with respect to the x2 − x3 plane. Bottom: rotation with respect to the x3 axis.

In this chapter, we provide the reader with basic background information for field equations of interest.

13.1 Deformations

The term deformation refers to a change in the shape of a continuum between a reference configuration and a current configuration. In the reference configuration, a representative particle of a continuum occupies a point P in space and has the position vector (Figure 13.1)

\[
X = X_1 e_1 + X_2 e_2 + X_3 e_3, \tag{13.1}
\]

where e1, e2, e3 is a Cartesian reference triad, and X1, X2, X3 (with center O) can be thought of as labels for a material point. Sometimes the coordinates or labels (X1, X2, X3) are called the referential or material coordinates. In the current configuration, the particle originally located at point P (at time t = 0) is located at point P′ and can also be expressed in terms of another position vector x, with coordinates (x1, x2, x3). These are called the current coordinates. In this framework, the displacement is u = x − X for a point originally at X and with final coordinates x.

When a continuum undergoes deformation (or flow), its points move along various paths in space. This motion may be expressed as a function of X and t as¹

\[
x(X, t) = u(X, t) + X(t), \tag{13.2}
\]

¹ Frequently, analysts consider the referential configuration to be fixed in time, thus X ≠ X(t). We shall adopt this in the present work.


13 Appendix 2: basic continuum mechanics

Fig. 13.1. Different descriptions of a deforming body.

which gives the present location of a point at time t, written in terms of the referential coordinates X1, X2, X3. The previous position vector may be interpreted as a mapping of the initial configuration onto the current configuration. In classical approaches, it is assumed that such a mapping is one-to-one and continuous, with continuous partial derivatives to whatever order is required. The description of motion or deformation expressed previously is known as the Lagrangian formulation. Alternatively, if the independent variables are the coordinates x and time t, then x(x1, x2, x3, t) = u(x1, x2, x3, t) + X(x1, x2, x3, t), and the formulation is denoted Eulerian (Figure 13.1). Partial differentiation of the displacement vector u = x − X with respect to X produces the displacement gradient ∇_X u = F − 1, where F \(\overset{\text{def}}{=} \nabla_X x\) is the deformation gradient.

Remark: It should be clear that dx can be reinterpreted as the result of a mapping F · dX → dx, or a change in configuration (reference to current). One may develop so-called Eulerian formulations, employing the current configuration coordinates to generate Eulerian strain tensor measures. An important quantity is the Jacobian of the deformation gradient, J \(\overset{\text{def}}{=} \det F\), which relates differential volumes in the reference configuration (dω₀) to differential volumes in the current configuration (dω) via dω = J dω₀. The Jacobian of the deformation gradient must remain positive; otherwise we obtain physically impossible “negative” volumes. For more details, we refer the reader to the texts of Malvern [18], Gurtin [11], and Chandrasekharaiah and Debnath [9].
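A quick numerical illustration, using a hypothetical homogeneous deformation x = F · X (not from the text): J = det F gives the local volume ratio and must stay positive.

```python
import numpy as np

# homogeneous deformation: stretch along X1, compress along X3, small shear
F = np.array([[1.2, 0.1, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.9]])
J = np.linalg.det(F)                  # dV (current) = J * dV (reference)
grad_u = F - np.eye(3)                # displacement gradient: grad_X u = F - 1
# a unit reference cube deforms into a region of volume J
```

If any line of F were scaled so that det F ≤ 0, the mapping would invert material elements, which is the “negative volume” pathology mentioned above.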

where ω ⊂ Ω is an arbitrary portion of the continuum, with boundary ∂ω, ρ is the material density, b is the body force per unit mass and u̇ is the time derivative of the displacement. The force densities, t, are commonly referred to as “surface forces” or tractions.

13.2.1 Postulates on volume and surface quantities

Now, consider a tetrahedron in equilibrium, as shown in Figure 13.2, where a balance of forces yields

\[
t^{(n)}\Delta A^{(n)} + t^{(-1)}\Delta A^{(1)} + t^{(-2)}\Delta A^{(2)} + t^{(-3)}\Delta A^{(3)} + \rho b\,\Delta V = \rho\,\Delta V \ddot{u}, \tag{13.8}
\]

where ∆A^{(n)} is the surface area of the face of the tetrahedron with normal n, and ∆V is the tetrahedron volume. As the distance (h) between the tetrahedron base (located at (0,0,0)) and the surface center goes to zero (h → 0), we have ∆A^{(n)} → 0 ⇒ ∆V/∆A^{(n)} → 0. Geometrically, we have ∆A^{(i)}/∆A^{(n)} = n_i.

surface forces there, pictorially represented by a cube surrounding a point. The fundamental issue that must be resolved is the characterization of these surface forces. We can represent the surface force density vector, the so-called “traction”, on a surface by the component representation:

\[
t^{(i)} \overset{\text{def}}{=} \begin{Bmatrix} \sigma_{i1} \\ \sigma_{i2} \\ \sigma_{i3} \end{Bmatrix}, \tag{13.9}
\]

where σ is the so-called Cauchy stress tensor. Remark: In the absence of couple stresses, a balance of angular momentum implies a symmetry of stress, σ = σ T , and thus the difference in notations becomes immaterial. Explicitly, starting with an angular momentum balance,

A relationship can be determined between the densities in the current and reference configurations, \(\int_\omega \rho\, d\omega = \int_{\omega_0} \rho J\, d\omega_0 = \int_{\omega_0} \rho_0\, d\omega_0\). Therefore, the Jacobian can also be interpreted as the ratio of material densities at a point. Since the volume is arbitrary, we can assume that ρJ = ρ₀ holds at every point in the body. Therefore, we may write d/dt(ρ₀) = d/dt(ρJ) = 0 when the system is mass conservative over time. This leads to writing the last term in Equation 13.12 as

\[
\frac{d}{dt}\int_\omega \rho \dot{u}\, d\omega = \int_{\omega_0} \frac{d(\rho J)}{dt}\, \dot{u}\, d\omega_0 + \int_{\omega_0} \rho J\, \ddot{u}\, d\omega_0 = \int_\omega \rho \ddot{u}\, d\omega.
\]

From Gauss’s divergence theorem, and an implicit assumption that σ is differentiable, we have \(\int_\omega (\nabla_x \cdot \sigma + \rho b - \rho\ddot{u})\, d\omega = 0\). If the volume is argued as being arbitrary, then the integrand must be equal to zero at every point, yielding

\[
\nabla_x \cdot \sigma + \rho b = \rho \ddot{u}. \tag{13.13}
\]

13.3 Referential descriptions of balance laws and Nanson’s formula

Although we will not consider finite deformation problems in this monograph, some important concepts will be useful later in the context of mapping from one configuration to the next. In many cases it is quite difficult to perform a stress analysis, for finite deformation solid mechanics problems, in the current configuration, primarily because it is unknown a priori. Therefore all quantities are usually transformed (“pulled”) back to the original coordinates, the referential frame. Therefore, it is preferable to think of a formulation in terms of the fixed referential coordinates X, a so-called Lagrangian formulation. With this in mind, there are two commonly used referential measures of stress. We start with a purely mathematical result, leading to the so-called “Nanson” formula for the transformation of surface elements. Consider the cross product of two differential line elements in a current configuration,

adjoint, defined by T* = (det T)T^{−T}. This leads to (det T)1 = T^T · T*. Applying this result, we have dx^{(1)} × dx^{(2)} = F* · (dX^{(1)} × dX^{(2)}) and F^T · (dx^{(1)} × dx^{(2)}) = (det F)1 · (dX^{(1)} × dX^{(2)}). This leads to F^T · n da = (det F) n₀ da₀, the so-called Nanson formula. Knowing this, we now formulate the equations of equilibrium in the current or a reference configuration. Consider two surface elements, one on the current configuration and one on a reference configuration. If we form a new kind of stress tensor, call it P, such that the amount of force is the same, we have P · n₀ da₀ = σ · n da = σ · F^{−T}(det F) · n₀ da₀, which implies P = σ · F^{−T}(det F). The tensor P is called the first Piola-Kirchhoff stress, and gives the actual force on the current area, but calculated per unit reference area. However, it is not symmetric, and this sometimes causes difficulties in an analysis. Therefore, we symmetrize it via F^{−1} · P = S = S^T = F^{−1} · σ · F^{−T}(det F). The tensor S is called the second Piola-Kirchhoff stress. By definition, we have \(\int_{\partial\omega_0} n_0 \cdot P\, da_0 = \int_{\partial\omega} n \cdot \sigma\, da\), and therefore

13.4 The First Law of Thermodynamics/An energy balance

The interconversions of mechanical, thermal and chemical energy in a system are governed by the first law of thermodynamics, which states that the time rate of change of the total energy, K + I, is equal to the mechanical power, P, and the net heat supplied, H + Q, i.e. d/dt(K + I) = P + H + Q. Here the kinetic energy of a subvolume of material contained in Ω, denoted ω, is \(K \overset{\text{def}}{=} \int_\omega \frac{1}{2}\rho \dot{u}\cdot\dot{u}\, d\omega\), the power (rate of work) of the external forces acting on ω is given by \(P \overset{\text{def}}{=} \int_\omega \rho b \cdot \dot{u}\, d\omega + \int_{\partial\omega} \sigma \cdot n \cdot \dot{u}\, da\), the heat flow into the volume by conduction is \(Q \overset{\text{def}}{=} -\int_{\partial\omega} q \cdot n\, da = -\int_\omega \nabla_x \cdot q\, d\omega\), q being the heat flux, the heat generated due to sources, such as chemical reactions, is \(H \overset{\text{def}}{=} \int_\omega \rho z\, d\omega\), where z is the reaction source rate per unit mass, and the internal energy is \(I \overset{\text{def}}{=} \int_\omega \rho w\, d\omega\), w being the internal energy per unit mass. Differentiating the kinetic energy yields

\[
\frac{dK}{dt} = \frac{d}{dt}\int_\omega \frac{1}{2}\rho \dot{u}\cdot\dot{u}\, d\omega
= \int_{\omega_0} \frac{d}{dt}\left(\frac{1}{2}\rho J \dot{u}\cdot\dot{u}\right) d\omega_0
= \int_{\omega_0} \frac{d}{dt}(\rho_0)\, \frac{1}{2}\dot{u}\cdot\dot{u}\, d\omega_0 + \int_\omega \rho\, \frac{d}{dt}\left(\frac{1}{2}\dot{u}\cdot\dot{u}\right) d\omega
= \int_\omega \rho \dot{u}\cdot\ddot{u}\, d\omega, \tag{13.17}
\]

where we have assumed that the mass in the system is constant. We also have

\[
\frac{dI}{dt} = \frac{d}{dt}\int_\omega \rho w\, d\omega = \frac{d}{dt}\int_{\omega_0} \rho J w\, d\omega_0
= \underbrace{\int_{\omega_0} \frac{d}{dt}(\rho_0)\, w\, d\omega_0}_{=0} + \int_\omega \rho \dot{w}\, d\omega = \int_\omega \rho \dot{w}\, d\omega. \tag{13.18}
\]

By using the divergence theorem, we obtain

\[
\int_{\partial\omega} \sigma \cdot n \cdot \dot{u}\, da = \int_\omega \nabla_x \cdot (\sigma \cdot \dot{u})\, d\omega
= \int_\omega (\nabla_x \cdot \sigma) \cdot \dot{u}\, d\omega + \int_\omega \sigma : \nabla_x \dot{u}\, d\omega. \tag{13.19}
\]


Combining the results, and enforcing a balance of linear momentum, leads to

The starting point to develop a constitutive theory is to assume that a stored elastic energy function exists, denoted W \(\overset{\text{def}}{=}\) ρw, which depends only on the mechanical deformation. The simplest function that fulfills σ = ρ ∂w/∂ε is W = ½ ε : IE : ε, where IE is the fourth rank elasticity tensor. Such a function satisfies the intuitive physical requirement that, for any small strain from an undeformed state, energy must be stored in the material. Alternatively, a small strain material law can be derived from σ = ∂W/∂ε and W ≈ c₀ + c₁ : ε + ½ ε : IE : ε + ..., which implies σ ≈ c₁ + IE : ε + .... We are free to set c₀ = 0 (it is arbitrary) in order to have zero strain energy at zero strain and, furthermore, we assume that no stresses exist in the reference state (c₁ = 0). With these assumptions, we obtain the familiar relation σ = IE : ε.

The existence of a scalar energy function forces IE to be symmetric, since the strains are symmetric; in other words, W = ½ ε : IE : ε = ½ (ε : IE : ε)^T = ½ ε^T : IE^T : ε^T = ½ ε : IE^T : ε, which implies IE^T = IE. Consequently, IE has only 21 independent components. The nonnegativity of W imposes the restriction that IE remain positive definite. At this point, based on many factors that depend on the material microstructure, it can be shown that the components of IE may be written in terms of anywhere between 21 and 2 independent parameters. Accordingly, for isotropic materials, we have two planes of symmetry and an infinite number of planes of directional independence (two free components), yielding

The symbol [·] is used to indicate the matrix notation equivalent to a tensor form, while {·} is used to indicate the vector representation.

This is one form of Hooke’s Law. The resistance to change in volume is measured by κ. We note that \((\frac{\mathrm{tr}\,\sigma}{3}\,1)' = 0\), which indicates that this part of the stress produces no distortion. Another fundamental form of Hooke’s law is

\[
\sigma = \frac{E}{1+\nu}\left(\epsilon + \frac{\nu}{1-2\nu}(\mathrm{tr}\,\epsilon)\,1\right), \tag{13.31}
\]

and the inverse form

\[
\epsilon = \frac{1+\nu}{E}\,\sigma - \frac{\nu}{E}(\mathrm{tr}\,\sigma)\,1. \tag{13.32}
\]

To interpret the material values, consider an idealized uniaxial tension test (pulled in the x1 direction, inducing a uniform stress state) where σ12 = σ13 = σ23 = 0, which implies ε12 = ε13 = ε23 = 0. Also, we have σ22 = σ33 = 0. Under these conditions, we have σ11 = E ε11 and ε22 = ε33 = −ν ε11. Therefore, E, Young’s modulus, is the ratio of the uniaxial stress to the corresponding strain component. The Poisson ratio, ν, is the negative of the ratio of the transverse strain to the uniaxial strain. Another commonly used set of stress-strain forms are the Lamé relations,
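The uniaxial interpretation can be checked directly against the form (13.31); the material constants below are illustrative, not from the text:

```python
import numpy as np

def hooke_stress(eps, E, nu):
    """Isotropic Hooke's law in the form (13.31)."""
    return E / (1.0 + nu) * (eps + nu / (1.0 - 2.0 * nu) * np.trace(eps) * np.eye(3))

E, nu = 200.0e9, 0.3                   # illustrative steel-like values
e11 = 1.0e-3
# uniaxial tension strain state from the text: eps_22 = eps_33 = -nu * eps_11
eps = np.diag([e11, -nu * e11, -nu * e11])
sig = hooke_stress(eps, E, nu)
# sig[0, 0] recovers E * eps_11, and the transverse stresses vanish
```

Feeding the resulting stress into the inverse form (13.32) recovers the strain state, confirming that the two forms are inverses of each other.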

13.6.2 Solid-state diffusion-reaction

Consider a structure which occupies an open bounded domain Ω ⊂ IR³, with boundary ∂Ω. The boundary consists of Γc and Γg, where the solute concentrations (c) and solute fluxes are respectively specified. The diffusive properties of the heterogeneous material are characterized by a spatially varying diffusivity ID₀ ∈ IR^{3×3}, which is assumed to be a symmetric, bounded, positive definite tensor-valued function. The mass balance for a small diffusing species, denoted by the normalized concentration of the solute c (molecules per unit volume), in an arbitrary subvolume of material contained within Ω, denoted ω, consists of a storage term (ċ), a reaction term (ṡ), and an inward normal flux term (−G · n), leading to \(\int_\omega (\dot{c} + \dot{s})\, d\omega = -\int_{\partial\omega} G \cdot n\, da\). It is a classical, stoichiometrically inexact approximation to assume that the diffusing species reacts (is created or destroyed) in a manner such that the rate of production of the reactant (ṡ) is directly proportional to the concentration of the diffusing species itself and to the rate of change of the diffusing species, ṡ = τc + ϖċ. Here, τ = τ₀ e^{−Q/Rθ} and ϖ = ϖ₀ e^{−Q̄/Rθ}, where τ₀ and ϖ₀ are rate constants, Q


and Q̄ (Q ≠ Q̄) are activation energies per mole of diffusive species, R is the universal gas constant and θ is the temperature. Upon substitution of these relations into the conservation law for the diffusing species, and after using the divergence theorem, since the volume ω is arbitrary, one has a Fickian diffusion-reaction model in strong form, assuming G = −ID · ∇_x c:

\[
\dot{c} = \nabla_x \cdot (ID \cdot \nabla_x c) - \tau c - \varpi \dot{c} \ \Rightarrow\ \dot{c}(1 + \varpi) = \nabla_x \cdot (ID \cdot \nabla_x c) - \tau c. \tag{13.40}
\]

When τ₀ > 0, the diffusing species is destroyed as it reacts, while τ₀ < 0 means that the diffusing species is created as it reacts, i.e. an autocatalytic or “chain” reaction occurs. We will only consider the nonautocatalytic case in this work. Also, depending on the sign of ϖ₀, the process will effectively have an accelerated or decelerated diffusivity, as well as accelerated or decelerated reactivity. In Equation 13.40, ID is the diffusivity tensor (area per unit time). If we ignore reactions and time-dependency, and assume that the domain is not deforming, we then arrive at the familiar ∇_X · (ID · ∇_X c) = 0.
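In one dimension with a constant scalar diffusivity, this reaction-free steady-state equation yields a linear concentration profile between the prescribed boundary values; a minimal finite-difference sketch (all data hypothetical):

```python
import numpy as np

n, D = 50, 1.0                         # nodes, constant scalar diffusivity
A = np.zeros((n, n))
rhs = np.zeros(n)
for i in range(1, n - 1):              # central differences for d/dx (D dc/dx) = 0
    A[i, i - 1], A[i, i], A[i, i + 1] = D, -2.0 * D, D
A[0, 0] = A[-1, -1] = 1.0              # Dirichlet data on the Gamma_c portions
rhs[0] = 1.0                           # c(0) = 1, c(L) = 0
c = np.linalg.solve(A, rhs)
# constant D yields the expected linear concentration profile
```

With a spatially varying ID the profile becomes piecewise nonlinear, which is the situation of interest for heterogeneous materials.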

∇_x · G + τc + ϖċ = −ċ. From this point forth, we consider infinitesimal deformations, and we drop the explicit reference to differentiation with respect to X or x, under the assumption that they are one and the same at infinitesimal deformations.


\[
\nabla \cdot (IE : \nabla u) = 0, \qquad \nabla \cdot (IK \cdot \nabla\theta) = 0, \qquad \nabla \cdot (ID \cdot \nabla c) = 0. \tag{13.44}
\]

Furthermore, we shall use the notation x to indicate the location of a point in space, under the assumption of infinitesimal deformations, where the difference between X and x is considered insignificant.

14 Appendix 3: convergence of recursive iterative schemes

Recursive iterative schemes arise frequently in computational mechanics, for example in implicit time stepping, domain decomposition, etc. To understand the convergence of such iterative schemes, consider a general system of coupled equations given by

\[
A(s) = F, \tag{14.1}
\]

where s is a solution, and where it is assumed that the operator A is invertible. One desires that the sequence of iterated solutions, s^I, I = 1, 2, ..., converge to A^{-1}(F) as I → ∞. It is assumed that the I-th iterate can be represented by some arbitrary function s^I = T^I(A, F, s^{I-1}). One makes the following split:

\[
s^I = G^I(s^{I-1}) + r^I. \tag{14.2}
\]

For this method to be useful, the exact solution should be reproduced. In other words, when s = A^{-1}(F), then

\[
s = A^{-1}(F) = G^I(A^{-1}(F)) + r^I. \tag{14.3}
\]

Therefore, one has the following consistency condition r^I = A^{-1}(F) − G^I(A^{-1}(F)),