Acknowledgments

It is my great pleasure to thank Prof. Dr. Klaus Ritter for his constant support and encouragement over the past ten years. Furthermore, I would like to thank Prof. Dr. Johann Edenhofer, who stimulated my interest in optimal control of PDEs. My scientific work benefited significantly from two very enjoyable and fruitful research stays at the Department of Computational and Applied Mathematics (CAAM) and the Center for Research on Parallel Computation (CRPC), Rice University, Houston, Texas. These visits were made possible by Prof. John Dennis and Prof. Matthias Heinkenschloss. I am very thankful to both of them for their hospitality and support. During my second stay at Rice University, I laid the foundation of a large part of this work. The visits were funded by the Forschungsstipendium Ul157/1-1 and the Habilitandenstipendium Ul157/3-1 of the Deutsche Forschungsgemeinschaft, and by CRPC grant CCR. This support is gratefully acknowledged.

The computational results in chapter 9 for the boundary control of the compressible Navier-Stokes equations build on joint work with Prof. Scott Collis, Prof. Matthias Heinkenschloss, Dr. Kaveh Ghayour, and Dr. Stefan Ulbrich as part of the Rice AeroAcoustic Control (RAAC) project, which is directed by Scott Collis and Matthias Heinkenschloss. I thank all RAAC group members for allowing me to use their contributions to the project for my computations. In particular, Scott Collis' Navier-Stokes solver was very helpful. The computations for chapter 9 were performed on an SGI Origin 2000 at Rice University, which was purchased with the aid of NSF SCREMS grant. I am very thankful to Matthias Heinkenschloss for giving me access to this machine. Furthermore, I would like to thank Prof. Dr. Folkmar Bornemann for the opportunity to use his SGI Origin 2000 for computations.

I also would like to acknowledge the Zentrum Mathematik, Technische Universität München, for providing a very pleasant and professional working environment.
In particular, I am thankful to the members of our Rechnerbetriebsgruppe, Dr. Michael Nast, Dr. Andreas Johann, and Rolf Schöne, for their good system administration and their helpfulness. In making the ideas for this work concrete, I profited from an inspiring conversation with Prof. Liqun Qi, Prof. Danny Ralph, and PD Dr. Christian Kanzow during the ICCP99 meeting in Madison, Wisconsin, which I would like to acknowledge. Finally, I wish to thank my parents, Margot and Peter, and my brother Stefan for always being there for me.

1. Introduction

A central theme of applied mathematics is the design of accurate mathematical models for a variety of technical, financial, medical, and many other applications, and the development of efficient numerical algorithms for their solution. Often, these models contain parameters that should be adjusted in an optimal way, either to maximize the accuracy of the model (parameter identification), or to control the simulated system in a desired way (optimal control). Since optimization with simulation constraints is more challenging than simulation alone (which already can be very involved on its own), the development and analysis of efficient optimization methods is crucial for the viability of this approach. Besides the optimization of systems, minimization problems and variational inequalities often arise already in the process of building mathematical models; this applies, e.g., to contact problems, free boundary problems, and elastoplastic problems [47, 62, 63, 97, 98, 117].

Most of the variational problems mentioned so far share the property that they are continuous in time and/or space, so that infinite-dimensional function spaces provide the appropriate setting for their analysis. Since essential information on the problem to be solved is carried by the properties of the underlying infinite-dimensional spaces, the successful design of robust and mesh-independent optimization methods requires a thorough convergence analysis in this infinite-dimensional function space setting.

The purpose of this work is to develop and analyze a class of Newton-type methods for the solution of optimization problems and variational inequalities that are posed in function spaces and contain pointwise inequality constraints. A representative prototype of the problems we consider here is the following:

Bound-Constrained Variational Inequality Problem (VIP): Find $u \in L^p(\Omega)$ such that
$$u \in \mathcal{B} \stackrel{\mathrm{def}}{=} \{v \in L^p(\Omega) : a \le v \le b \text{ on } \Omega\}, \qquad \langle F(u), v - u\rangle \ge 0 \quad \text{for all } v \in \mathcal{B}. \tag{1.1}$$

Here, $\langle u, v\rangle = \int_\Omega u(\omega)v(\omega)\,d\omega$, and $F : L^p(\Omega) \to L^{p'}(\Omega)$ with $p, p' \in (1, \infty]$, $1/p + 1/p' \le 1$, is an (in general nonlinear) operator, where $L^p(\Omega)$ is the usual Lebesgue space on the bounded Lebesgue measurable set $\Omega \subset \mathbb{R}^n$. We assume that $\Omega$ has positive Lebesgue measure, so that $0 < \mu(\Omega) < \infty$. These requirements on $\Omega$ are assumed throughout this work. In case this is needed (e.g., for embeddings), but not explicitly stated, we assume that $\Omega$ is nonempty, open, and bounded with

sufficiently smooth boundary $\partial\Omega$. The lower and upper bound functions $a$ and $b$ may be present only on measurable parts $\Omega_a$ and $\Omega_b$ of $\Omega$, which is achieved by setting $a|_{\Omega\setminus\Omega_a} = -\infty$ and $b|_{\Omega\setminus\Omega_b} = +\infty$, respectively. We assume that the natural extensions by zero of $a|_{\Omega_a}$ and $b|_{\Omega_b}$ to $\Omega$ are elements of $L^p(\Omega)$. We also require a minimum distance $\nu > 0$ of the bounds from each other, i.e., $b - a \ge \nu$ on $\Omega$. In the definition of $\mathcal{B}$, and throughout this work, relations between measurable functions are meant to hold pointwise almost everywhere on $\Omega$ in the Lebesgue sense. Various extensions of problem (1.1) will also be considered and are discussed below.

In many situations, the VIP (1.1) describes the first-order necessary optimality conditions of the bound-constrained minimization problem
$$\text{minimize } j(u) \quad \text{subject to } u \in \mathcal{B}. \tag{1.2}$$
In this case, $F$ is the Fréchet derivative $j' : L^p(\Omega) \to L^{p'}(\Omega)$ of the objective functional $j : L^p(\Omega) \to \mathbb{R}$.

The methods we are going to investigate are best explained by considering the unilateral case with lower bounds $a \equiv 0$. The resulting problem is called nonlinear complementarity problem (NCP):
$$u \in L^p(\Omega), \quad u \ge 0, \qquad \langle F(u), v - u\rangle \ge 0 \quad \text{for all } v \in L^p(\Omega),\ v \ge 0. \tag{1.3}$$
As we will see, and as might be obvious to the reader, (1.3) is equivalent to the pointwise complementarity system
$$u \ge 0, \quad F(u) \ge 0, \quad uF(u) = 0 \quad \text{on } \Omega. \tag{1.4}$$
The basic idea, which was developed in the nineties for the numerical solution of finite-dimensional NCPs, consists in the observation that (1.3) is equivalent to the operator equation
$$\Phi(u) = 0, \quad \text{where } \Phi(u)(\omega) = \phi\big(u(\omega), F(u)(\omega)\big), \quad \omega \in \Omega. \tag{1.5}$$
Here, $\phi : \mathbb{R}^2 \to \mathbb{R}$ is an NCP-function, i.e.,
$$\phi(x) = 0 \iff x_1 \ge 0,\ x_2 \ge 0,\ x_1 x_2 = 0.$$
We will develop a semismoothness concept that is applicable to the operators arising in (1.5) and that allows us to develop a class of Newton-type methods for the solution of (1.5).
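To make the defining property of an NCP-function concrete, the following minimal sketch checks it for one standard choice, $\phi(x) = \min\{x_1, x_2\}$, and evaluates the residual of (1.5) on sampled values of $u$ and $F(u)$. The function names (`phi_min`, `is_complementary`, `residual`) and the sample points are illustrative assumptions, not notation from this work.

```python
def phi_min(x1, x2):
    """One NCP-function: phi(x) = min{x1, x2}."""
    return min(x1, x2)


def is_complementary(x1, x2):
    """Pointwise complementarity: x1 >= 0, x2 >= 0, x1 * x2 = 0."""
    return x1 >= 0 and x2 >= 0 and x1 * x2 == 0


def residual(u_vals, Fu_vals):
    """Values phi(u(w_i), F(u)(w_i)) at sample points w_i (hypothetical grid)."""
    return [phi_min(ui, Fi) for ui, Fi in zip(u_vals, Fu_vals)]


# phi(x) = 0 holds exactly at the complementary pairs among these samples.
samples = [(0.0, 2.0), (3.0, 0.0), (0.0, 0.0), (1.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
for x1, x2 in samples:
    assert (phi_min(x1, x2) == 0) == is_complementary(x1, x2)
```

In this discretized picture, a vector of sampled values solves the complementarity system (1.4) at the sample points exactly when `residual` returns only zeros.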
Like their finite-dimensional counterparts, the semismooth Newton methods, the resulting algorithms have several remarkable properties:

(a) The methods are locally superlinearly convergent, and they converge with q-rate $> 1$ under slightly stronger assumptions.

(b) Although an inequality constrained problem is solved, only one linear operator equation has to be solved per iteration. Thus, the cost per iteration is comparable to that of Newton's method for smooth operator equations. We remark that sequential quadratic programming (SQP) algorithms, which are very efficient in

practice, require the solution of an inequality constrained quadratic program per iteration, which can be significantly more expensive. Thus, it is also attractive to combine SQP methods with the class of Newton methods we describe here, either by using the Newton method for solving subproblems, or by rewriting the complementarity conditions in the Kuhn-Tucker system as an operator equation.

(c) The convergence analysis does not require a strict complementarity condition to hold. Therefore, we can prove fast convergence also for the case where the set $\{\omega : \bar u(\omega) = 0,\ F(\bar u)(\omega) = 0\}$ has positive measure at the solution $\bar u$.

(d) The systems that have to be solved in each iteration are of the form
$$[d_1 I + d_2 F'(u)]\,s = -\Phi(u), \tag{1.6}$$
where $I : u \mapsto u$ is the identity and $F'$ denotes the Fréchet derivative of $F$. Further, $d_1, d_2$ are nonnegative $L^\infty$-functions that are chosen depending on $u$ and satisfy $0 < \gamma_1 < d_1 + d_2 < \gamma_2$ on $\Omega$ uniformly in $u$. More precisely, $(d_1, d_2)$ is a measurable selection of the measurable multifunction
$$\omega \in \Omega \mapsto \partial\phi\big(u(\omega), F(u)(\omega)\big),$$
where $\partial\phi$ is Clarke's generalized gradient of $\phi$. As we will see, in typical applications the system (1.6) can be symmetrized and is not much harder to solve than a system involving only the operator $F'(u)$, which would arise for the unconstrained problem $F(u) = 0$. In particular, fast solvers like multigrid methods, preconditioned iterative solvers, etc., can be applied to solve (1.6).

(e) The method is not restricted to the problem class (1.1). Among the possible extensions we also investigate variational inequality problems of the form (1.1), but with the feasible set $\mathcal{B}$ replaced by
$$\mathcal{C} = \{u \in L^p(\Omega)^m : u(\omega) \in C \text{ on } \Omega\}, \quad C \subset \mathbb{R}^m \text{ closed and convex}.$$
Furthermore, we will consider mixed problems, where $F(u)$ is replaced by $F(y, u)$ and where we have the additional operator equation $E(y, u) = 0$.
In particular, such problems arise as the first-order necessary optimality conditions (Karush-Kuhn-Tucker or KKT-conditions) of optimization problems with optimal control structure
$$\text{minimize } J(y, u) \quad \text{subject to } E(y, u) = 0,\ u \in \mathcal{C}.$$

(f) Other extensions are possible that we do not cover in this work. For instance, certain quasivariational inequalities [12, 13], i.e., variational inequalities for which the feasible set depends on $u$ (e.g., $a = A(u)$, $b = B(u)$), can be solved by our class of semismooth Newton methods.

For illustration, we begin with examples of two problem classes that fit in the above framework.

1.1 Examples of Applications

Optimal Control Problems

Let the state space $Y$ (a Banach space), the control space $U = L^p(\Omega)$, and the set $\mathcal{B} \subset U$ of admissible or feasible controls as defined in (1.1) be given. The state $y \in Y$ of the system under consideration is governed by the state equation
$$E(y, u) = 0, \tag{1.7}$$
where $E : Y \times U \to W^*$ and $W^*$ denotes the dual of a reflexive Banach space $W$. In our context, the state equation usually is given by the weak formulation of a partial differential equation (PDE), including all boundary conditions that are not already contained in the definition of $Y$. Suppose that, for every control $u \in U$, the state equation (1.7) possesses a unique solution $y = y(u) \in Y$. The control problem consists in finding a control $\bar u$ such that the pair $(y(\bar u), \bar u)$ minimizes a given objective function $J : Y \times U \to \mathbb{R}$ among all feasible controls $u \in \mathcal{B}$. Thus, the control problem is
$$\underset{y \in Y,\, u \in U}{\text{minimize}}\ J(y, u) \quad \text{subject to (1.7) and } u \in \mathcal{B}. \tag{1.8}$$
Alternatively, we can use the state equation to express the state in terms of the control, $y = y(u)$, and write the control problem in the equivalent reduced form
$$\text{minimize } j(u) \quad \text{subject to } u \in \mathcal{B}, \tag{1.9}$$
with the reduced objective function $j(u) \stackrel{\mathrm{def}}{=} J(y(u), u)$. By the implicit function theorem, the continuous differentiability of $y(u)$ in a neighborhood of $\bar u$ follows if $E$ is continuously differentiable and $E_y(y(\bar u), \bar u)$ is continuously invertible. Further, if in addition $J$ is continuously differentiable in a neighborhood of $(y(\bar u), \bar u)$, then $j$ is continuously differentiable in a neighborhood of $\bar u$. In the same way, differentiability of higher order can be ensured. For problem (1.9), the gradient $j'(u) \in U^*$ is given by
$$j'(u) = J_u(y, u) + y_u(u)^* J_y(y, u),$$
with $y = y(u)$. Alternatively, $j'$ can be represented via the adjoint state $w = w(u) \in W$, which is the solution of the adjoint equation
$$E_y(y, u)^* w = -J_y(y, u),$$
where $y = y(u)$.
As discussed in more detail in appendix A.1, the gradient of $j$ can be written in the form
$$j'(u) = J_u(y, u) + E_u(y, u)^* w.$$
Adjoint-based expressions for the second derivative $j''$ are also available, see appendix A.1.
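The adjoint representation of the gradient can be checked numerically on a small finite-dimensional analogue. The sketch below is only an illustration under stated assumptions: the state equation is taken as $E(y,u) = Ay - u = 0$ with a symmetric tridiagonal matrix `A` (a 1D finite-difference Laplacian), the objective is a tracking functional with Euclidean inner products instead of properly weighted $L^2$ inner products, and the grid size `n`, the weight `lam`, and the target `y_d` are hypothetical. With $E_y = A$, $E_u = -I$, $J_y = y - y_d$, and $J_u = \lambda u$, the adjoint equation reads $A^T w = -(y - y_d)$ and the gradient is $\lambda u - w$.

```python
import numpy as np

n = 20                                   # interior grid points (illustrative)
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
lam = 1e-3                               # regularization weight (assumed value)
y_d = x * (1.0 - x)                      # hypothetical target state


def solve_state(u):
    """State equation E(y, u) = A y - u = 0."""
    return np.linalg.solve(A, u)


def j(u):
    """Reduced objective j(u) = J(y(u), u)."""
    y = solve_state(u)
    return 0.5 * np.sum((y - y_d) ** 2) + 0.5 * lam * np.sum(u ** 2)


def gradient_adjoint(u):
    """Gradient via the adjoint state: j'(u) = J_u + E_u^T w = lam * u - w."""
    y = solve_state(u)
    w = np.linalg.solve(A.T, -(y - y_d))  # adjoint equation E_y^T w = -J_y
    return lam * u - w


def gradient_fd(u, eps=1e-4):
    """Central finite differences, used only as a consistency check."""
    g = np.zeros_like(u)
    for i in range(u.size):
        e = np.zeros_like(u)
        e[i] = eps
        g[i] = (j(u + e) - j(u - e)) / (2.0 * eps)
    return g


u_test = np.sin(np.pi * x)
print(np.max(np.abs(gradient_adjoint(u_test) - gradient_fd(u_test))))
```

The adjoint approach needs one state solve and one adjoint solve per gradient, whereas the finite-difference check needs two state solves per component of $u$.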

We now make the example more concrete and consider as state equation the Poisson problem with distributed control on the right hand side,
$$-\Delta y = u \ \text{ on } \Omega, \qquad y = 0 \ \text{ on } \partial\Omega, \tag{1.10}$$
and an objective function of tracking type
$$J(y, u) = \frac{1}{2}\int_\Omega (y - y_d)^2\,dx + \frac{\lambda}{2}\int_\Omega u^2\,dx.$$
Here, $\Omega \subset \mathbb{R}^n$ is a nonempty and bounded open set, $y_d \in L^2(\Omega)$ is a target state that we would like to achieve as well as possible by controlling $u$, and the second term is for the purpose of regularization (the parameter $\lambda > 0$ is typically very small, e.g., $\lambda = 10^{-3}$). We incorporate the boundary conditions into the state space by choosing $Y = H_0^1(\Omega)$, the Sobolev space of $H^1$-functions vanishing on $\partial\Omega$. For the control space we choose $U = L^2(\Omega)$. The control problem thus is
$$\underset{y \in H_0^1(\Omega),\, u \in L^2(\Omega)}{\text{minimize}}\ \frac{1}{2}\int_\Omega (y - y_d)^2\,dx + \frac{\lambda}{2}\int_\Omega u^2\,dx \quad \text{subject to } -\Delta y = u,\ u \in \mathcal{B}. \tag{1.11}$$
Defining the operator
$$E : Y \times U \to W^* \stackrel{\mathrm{def}}{=} Y^*, \qquad E(y, u) = -\Delta y - u,$$
we can write the state equation in the form (1.7). We identify $L^2(\Omega)$ with its dual and introduce the Gelfand triple
$$H_0^1(\Omega) = Y \hookrightarrow U = L^2(\Omega) \hookrightarrow Y^* = H^{-1}(\Omega).$$
Then
$$J_y(y, u) = y - y_d, \qquad J_u(y, u) = \lambda u, \qquad E_u(y, u)v = -v \ \ \forall\, v \in U, \qquad E_y(y, u)z = -\Delta z \ \ \forall\, z \in Y.$$
Therefore, the adjoint state $w \in W = W^{**} = H_0^1(\Omega)$ is given by
$$-\Delta w = y_d - y \ \text{ on } \Omega, \qquad w = 0 \ \text{ on } \partial\Omega, \tag{1.12}$$
where $y$ solves (1.10). Note that in (1.12) the boundary conditions could also be omitted because they are already enforced by $w \in H_0^1(\Omega)$. The gradient of the reduced objective function $j$ thus is
$$j'(u) = J_u(y, u) + E_u(y, u)^* w = \lambda u - w$$
with $y = y(u)$ and $w = w(u)$ solutions of (1.10) and (1.12), respectively.

This problem has the following properties that are common to many control problems and will be of use later on:

where $\lambda > 0$ is a (small) parameter and $u_d \in L^p(\Omega)$, $p \in [2, \infty)$, is chosen appropriately. We will show in section 7.3 that the solution $\bar u_\lambda$ of the regularized problem
$$\underset{u \in L^2(\Omega)}{\text{minimize}}\ j_\lambda(u) \quad \text{subject to } u \ge 0 \tag{1.16}$$
lies in $L^p(\Omega)$ and satisfies $\|\bar u_\lambda - \bar u\|_{H^{-1}} = o(\lambda^{1/2})$, which implies $\|\bar y_\lambda - \bar y\|_{H_0^1} = o(\lambda^{1/2})$, where $\bar y_\lambda = A^{-1}(f + \bar u_\lambda)$. Since $j_\lambda$ is strictly convex, problem (1.16) can be written in the form (1.1) with $F = j_\lambda'$. We have
$$F(u) = \lambda u + A^{-1}(f + u) - g - \lambda u_d \stackrel{\mathrm{def}}{=} \lambda u + G(u).$$
Using that $A \in \mathcal{L}(H_0^1, H^{-1})$ is a homeomorphism, and that $H_0^1(\Omega) \hookrightarrow L^p(\Omega)$ for all $p \in [1, \infty)$, we conclude that the operator $G$ maps $L^2(\Omega)$ continuously affine linearly into $L^p(\Omega)$. Therefore, we see:

- $F : L^2(\Omega) \to L^2(\Omega)$ is continuously differentiable (here even continuous affine linear).
- $F$ has the form $F(u) = \lambda u + G(u)$, where $G : L^2(\Omega) \to L^p(\Omega)$ is locally Lipschitz continuous (here even continuous affine linear).
- The solution is contained in $L^p(\Omega)$.

A detailed discussion of this problem including numerical results is given in section 7.3. In a similar way, obstacle problems on the boundary can be treated. Furthermore, time-dependent parabolic variational inequality problems can be reduced, by semidiscretization in time, to a sequence of elliptic variational inequality problems.

1.2 Motivation of the Method

The class of methods for solving (1.1) that we consider here is based on the following equivalent formulation of (1.1) as a system of pointwise inequalities:
$$\text{(i)}\ a \le u \le b, \qquad \text{(ii)}\ (u - a)F(u) \le 0, \qquad \text{(iii)}\ (u - b)F(u) \le 0 \qquad \text{on } \Omega. \tag{1.17}$$
On $\Omega \setminus \Omega_a$, condition (ii) has to be interpreted as $F(u) \le 0$, and on $\Omega \setminus \Omega_b$ condition (iii) means $F(u) \ge 0$. The equivalence of (1.1) and (1.17) is easily verified. In fact, if $u$ is a solution of (1.1) then (i) holds. Further, if (ii) is violated on a set $\Omega'$ of positive measure, we define $v \in \mathcal{B}$ by $v = a$ on $\Omega'$ and $v = u$ on $\Omega \setminus \Omega'$, and obtain the contradiction
$$\langle F(u), v - u\rangle = \int_{\Omega'} F(u)(a - u)\,d\omega < 0.$$
In the same way, (iii) can be shown to hold.
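Conversely, if $u$ solves (1.17) then (i)-(iii) imply that $\Omega$ is the union of the disjoint sets $\{a < u < b,\ F(u) = 0\}$, $\Omega_{\ge} = \{u = a,\ F(u) \ge 0\}$, and $\Omega_{\le} = \{u = b,\ F(u) \le 0\}$. Now, for arbitrary $v \in \mathcal{B}$, we have
$$\langle F(u), v - u\rangle = \int_{\Omega_{\ge}} F(u)(v - a)\,d\omega + \int_{\Omega_{\le}} F(u)(v - b)\,d\omega \ge 0,$$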

so that $u$ solves (1.1).

As already mentioned, an important special case, which will provide our main example throughout, is the nonlinear complementarity problem (NCP), which corresponds to $a \equiv 0$ and $b \equiv +\infty$. Obviously, unilateral problems can be converted to an NCP via the transformation $\tilde u = u - a$, $\tilde F(\tilde u) = F(\tilde u + a)$ in the case of lower bounds, and $\tilde u = b - u$, $\tilde F(\tilde u) = -F(b - \tilde u)$ in the case of upper bounds. For NCPs, (1.17) reduces to (1.4).

In finite dimensions, the NCP and, more generally, the box-constrained variational inequality problem (which is also called mixed complementarity problem, MCP) have been extensively investigated, and there exists a significant, rapidly growing body of literature on numerical algorithms for their solution, see section. Here, a major role is played by devices that allow us to reformulate the problem equivalently as a system of (nonsmooth) equations. We begin with a description of these concepts in the framework of finite-dimensional MCPs and NCPs.

1.2.1 Finite-Dimensional Variational Inequalities

Although we consider finite-dimensional problems throughout this section 1.2.1, we will work with the same notations as in the function space setting ($a$, $b$, $u$, $F$, etc.), since there is no danger of ambiguity. In analogy to (1.4), the finite-dimensional mixed complementarity problem consists in finding $u \in \mathbb{R}^m$ such that
$$a_i \le u_i \le b_i, \qquad (u_i - a_i)F_i(u) \le 0, \qquad (u_i - b_i)F_i(u) \le 0, \qquad i = 1, \ldots, m, \tag{1.18}$$
where $a, b \in \mathbb{R}^m$ and $F : \mathbb{R}^m \to \mathbb{R}^m$ are given. We begin with an early approach by Eaves [48], who observed (in the more general framework of VIPs on closed convex sets) that (1.18) can be equivalently written in the form
$$u - P_{[a,b]}(u - F(u)) = 0, \tag{1.19}$$
where $P_{[a,b]}(u) = \max\{a, \min\{u, b\}\}$ (componentwise) is the Euclidean projection onto $[a, b] = \prod_{i=1}^m [a_i, b_i]$. Note that if the function $F$ is $C^k$ then the left hand side of (1.19) is piecewise $C^k$ and thus, as we will see, semismooth.
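The equivalence between the componentwise conditions (1.18) and the projection equation (1.19) can be illustrated on a tiny example. The sketch below is a hedged illustration only: the affine map `F`, the box bounds, and the helper names (`project`, `projection_residual`, `satisfies_mcp`) are assumptions chosen for this example.

```python
import numpy as np


def project(u, a, b):
    """Componentwise Euclidean projection P_[a,b](u) = max{a, min{u, b}}."""
    return np.maximum(a, np.minimum(u, b))


def projection_residual(u, F, a, b):
    """Left hand side of (1.19): u - P_[a,b](u - F(u))."""
    return u - project(u - F(u), a, b)


def satisfies_mcp(u, Fu, a, b, tol=1e-12):
    """Componentwise conditions (1.18), up to a small tolerance."""
    feasible = np.all(u >= a - tol) and np.all(u <= b + tol)
    lower = np.all((u - a) * Fu <= tol)
    upper = np.all((u - b) * Fu <= tol)
    return bool(feasible and lower and upper)


# Hypothetical data: F(u) = u - c on the box [0, 1]^3.
a = np.zeros(3)
b = np.ones(3)
c = np.array([-0.5, 0.3, 2.0])
F = lambda u: u - c

u_sol = project(c, a, b)          # for this particular F, the solution is P(c)
u_bad = np.array([0.5, 0.5, 0.5])

print(projection_residual(u_sol, F, a, b), satisfies_mcp(u_sol, F(u_sol), a, b))
print(projection_residual(u_bad, F, a, b), satisfies_mcp(u_bad, F(u_bad), a, b))
```

At `u_sol` the first component sits at the lower bound with $F_1 \ge 0$, the second is strictly inside the box with $F_2 = 0$, and the third sits at the upper bound with $F_3 \le 0$, which matches the three cases encoded in (1.18).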
The reformulation (1.19) can be embedded in a more general framework. To this end, we interpret (1.18) as a system of $m$ conditions of the form
$$\alpha \le x_1 \le \beta, \qquad (x_1 - \alpha)x_2 \le 0, \qquad (x_1 - \beta)x_2 \le 0, \tag{1.20}$$
which have to be fulfilled by $x = (u_i, F_i(u))$ for $[\alpha, \beta] = [a_i, b_i]$, $i = 1, \ldots, m$. Given any function $\phi_{[\alpha,\beta]} : \mathbb{R}^2 \to \mathbb{R}$ with the property
$$\phi_{[\alpha,\beta]}(x) = 0 \iff \text{(1.20) holds}, \tag{1.21}$$
we can write (1.18) equivalently as
$$\phi_{[a_i,b_i]}(u_i, F_i(u)) = 0, \qquad i = 1, \ldots, m. \tag{1.22}$$

A function with the property (1.21) is called MCP-function for the interval $[\alpha, \beta]$ (the name BVIP-function is also used, where BVIP stands for box constrained variational inequality problem). The link between (1.19) and (1.22) consists in the fact that the function
$$\phi^E_{[\alpha,\beta]} : \mathbb{R}^2 \to \mathbb{R}, \qquad \phi^E_{[\alpha,\beta]}(x) = x_1 - P_{[\alpha,\beta]}(x_1 - x_2) \quad \text{with } P_{[\alpha,\beta]}(t) = \max\{\alpha, \min\{t, \beta\}\} \tag{1.23}$$
defines an MCP-function for the interval $[\alpha, \beta]$.

The reformulation of NCPs requires only an MCP-function for the interval $[0, \infty)$. As already said, such functions are called NCP-functions. According to (1.21), $\phi : \mathbb{R}^2 \to \mathbb{R}$ is an NCP-function if and only if
$$\phi(x) = 0 \iff x_1 \ge 0,\ x_2 \ge 0,\ x_1 x_2 = 0. \tag{1.24}$$
The corresponding reformulation of the NCP then is
$$\Phi(u) \stackrel{\mathrm{def}}{=} \begin{pmatrix} \phi(u_1, F_1(u)) \\ \vdots \\ \phi(u_m, F_m(u)) \end{pmatrix} = 0, \tag{1.25}$$
and the NCP-function $\phi^E_{[0,\infty)}$ can be written in the form
$$\phi^E(x) = \phi^E_{[0,\infty)}(x) = \min\{x_1, x_2\}.$$

A further important reformulation, which is due to Robinson [127], uses the normal map
$$F_{[a,b]}(z) = F(P_{[a,b]}(z)) + z - P_{[a,b]}(z).$$
It is not difficult to see that any solution $z$ of the normal map equation
$$F_{[a,b]}(z) = 0 \tag{1.26}$$
gives rise to a solution $u = P_{[a,b]}(z)$ of (1.18), and, conversely, that, for any solution $u$ of (1.18), the vector $z = u - F(u)$ solves (1.26). Therefore, the MCP (1.18) and the normal map equation (1.26) are equivalent. Again, the normal map is piecewise $C^k$ if $F$ is $C^k$. In contrast to the reformulation based on NCP- and MCP-functions, the normal map approach evaluates $F$ only at feasible points, which can be advantageous in certain situations.

Many modern algorithms for finite-dimensional NCPs and MCPs are based on reformulations by means of the Fischer-Burmeister NCP-function
$$\phi^{FB}(x) = x_1 + x_2 - \sqrt{x_1^2 + x_2^2}, \tag{1.27}$$
which was introduced by Fischer [55]. This function is Lipschitz continuous and 1-order semismooth on $\mathbb{R}^2$ (the definition of semismoothness is given below, and, in more detail, in chapter 2).
Further, $\phi^{FB}$ is $C^\infty$ on $\mathbb{R}^2 \setminus \{0\}$, and $(\phi^{FB})^2$ is continuously differentiable on $\mathbb{R}^2$. The latter property implies that, if $F$ is continuously

differentiable, the function $\frac{1}{2}\Phi^{FB}(u)^T \Phi^{FB}(u)$ can serve as a continuously differentiable merit function for (1.25). It is also possible to obtain 1-order semismooth MCP-functions from the Fischer-Burmeister function, see [18, 54] and section.

The described reformulations were successfully used as a basis for the development of locally superlinearly convergent Newton-type methods for the solution of (mixed) nonlinear complementarity problems [18, 38, 39, 45, 5, 52, 53, 54, 88, 89, 93, 116, 124, 14]. This is remarkable, since all these reformulations are nonsmooth systems of equations. However, the underlying functions are semismooth, a concept introduced by Mifflin [113] for real-valued functions on $\mathbb{R}^n$, and extended to mappings between finite-dimensional spaces by Qi [12] and Qi and Sun [122]. Here (details are given in chapter 2), a function $f : \mathbb{R}^l \to \mathbb{R}^m$ is called semismooth at $x \in \mathbb{R}^l$ if it is Lipschitz continuous near $x$, directionally differentiable at $x$, and if
$$\sup_{M \in \partial f(x+h)} \|f(x + h) - f(x) - Mh\| = o(\|h\|) \quad \text{as } h \to 0,$$
where the set-valued function $\partial f : \mathbb{R}^l \rightrightarrows \mathbb{R}^{m \times l}$,
$$\partial f(x) = \operatorname{co}\{M \in \mathbb{R}^{m \times l} : x_k \to x,\ f \text{ is differentiable at } x_k \text{ and } f'(x_k) \to M\}$$
denotes Clarke's generalized Jacobian ("co" is the convex hull). It can be shown that piecewise $C^1$ functions are semismooth, see section. Further, it is easy to prove that Newton's method (where in Newton's equation the Jacobian is replaced by an arbitrary element of $\partial f$) converges superlinearly in a neighborhood of a CD-regular ("CD" for Clarke-differential) solution $x^*$, i.e., a solution where all elements of $\partial f(x^*)$ are invertible. More details on semismoothness in finite dimensions can be found in chapter 2.

It should be mentioned that also continuously differentiable NCP-functions can be constructed.
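The local Newton iteration described above can be sketched for a small NCP reformulated with the Fischer-Burmeister function. This is a minimal illustration under stated assumptions, not the algorithm analyzed in this work: there is no globalization, the test problem $F(u) = Mu + q$ with the matrix `M` and vector `q` below is hypothetical, and the matrix `H` is assembled row by row from elements of the generalized gradient of $\phi^{FB}$ (at the kink $(0,0)$ one admissible element is picked by hand); whether `H` lies in Clarke's generalized Jacobian of $\Phi$ itself is not checked here.

```python
import numpy as np


def fischer_burmeister(x1, x2):
    """phi_FB(x) = x1 + x2 - sqrt(x1^2 + x2^2), applied componentwise."""
    return x1 + x2 - np.sqrt(x1**2 + x2**2)


def fb_gradient_element(x1, x2):
    """Componentwise element (d1, d2) of the generalized gradient of phi_FB."""
    r = np.sqrt(x1**2 + x2**2)
    # At (0, 0) pick (1, 1) - (1, 1)/sqrt(2), one element of the generalized gradient.
    d1 = np.full_like(r, 1.0 - 1.0 / np.sqrt(2.0))
    d2 = np.full_like(r, 1.0 - 1.0 / np.sqrt(2.0))
    nz = r > 0
    d1[nz] = 1.0 - x1[nz] / r[nz]
    d2[nz] = 1.0 - x2[nz] / r[nz]
    return d1, d2


def semismooth_newton(F, dF, u0, tol=1e-12, max_iter=50):
    """Local semismooth Newton iteration for Phi(u) = phi_FB(u, F(u)) = 0."""
    u = np.array(u0, dtype=float)
    for _ in range(max_iter):
        Fu = F(u)
        Phi = fischer_burmeister(u, Fu)
        if np.linalg.norm(Phi) <= tol:
            break
        d1, d2 = fb_gradient_element(u, Fu)
        H = np.diag(d1) + d2[:, None] * dF(u)   # rows: d1_i e_i^T + d2_i F_i'(u)
        u = u + np.linalg.solve(H, -Phi)
    return u, np.linalg.norm(fischer_burmeister(u, F(u)))


# Hypothetical linear complementarity test problem F(u) = M u + q.
M = np.array([[2.0, 1.0], [1.0, 2.0]])
q = np.array([-1.0, 1.0])
u_star, res = semismooth_newton(lambda u: M @ u + q, lambda u: M, [1.0, 1.0])
print(u_star, res)
```

For this data the complementarity conditions are satisfied by $u = (0.5, 0)$ with $F(u) = (0, 1.5)$, so the computed iterate can be compared against that point. Each step requires only one linear solve, as in Newton's method for smooth equations.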
In fact, already in the seventies, Mangasarian [11] proved the equivalence of the NCP to a system of equations, which, in our terminology, he obtained by choosing the NCP-function
$$\phi^M(x) = \theta(|x_2 - x_1|) - \theta(x_2) - \theta(x_1),$$
where $\theta : \mathbb{R} \to \mathbb{R}$ is any strictly increasing function with $\theta(0) = 0$. Maybe the most straightforward choice is $\theta(t) = t$, which gives $\phi^M = -2\phi^E$. If, in addition, $\theta$ is $C^1$ with $\theta'(0) = 0$, then $\phi^M$ is $C^1$. This is, e.g., satisfied by $\theta(t) = t|t|$.

Nevertheless, most modern approaches prefer nondifferentiable, semismooth reformulations. There is a good reason for this. In fact, consider (1.25) with a differentiable NCP-function. Then the Jacobian of $\Phi$ is given by
$$\Phi'(u) = \operatorname{diag}\big(\phi_{x_1}(u_i, F_i(u))\big) + \operatorname{diag}\big(\phi_{x_2}(u_i, F_i(u))\big)F'(u).$$
Now, since $\phi(t, 0) = 0 = \phi(0, t)$ for all $t \ge 0$, we see that $\phi'(0, 0) = 0$. Thus, if strict complementarity is violated for the $i$th component, i.e., if $u_i = 0 = F_i(u)$, then the $i$th row of $\Phi'(u)$ is zero, and thus Newton's method is not applicable if strict complementarity is violated at the solution. This can be avoided by using nonsmooth

NCP-functions, because they can be constructed in such a way that any element of the generalized gradient $\partial\phi(x)$ is bounded away from zero at any point $x \in \mathbb{R}^2$. For the Fischer-Burmeister function, e.g., there holds $\nabla\phi^{FB}(x) = (1, 1) - x^T/\|x\|_2$ for all $x \ne 0$ and thus $\|g\|_2 \ge \sqrt{2} - 1$ for all $g \in \partial\phi^{FB}(x)$ and all $x \in \mathbb{R}^2$.

The development of nonsmooth Newton methods [12, 13, 12, 122, 118], especially the unifying notion of semismoothness [12, 122], has led to considerable research on numerical methods for the solution of finite-dimensional VIPs that are based on semismooth reformulations [18, 38, 39, 5, 52, 53, 54, 88, 89, 93, 116, 14]. These investigations confirm that this approach admits an elegant and general theory (in particular, no strict complementarity assumption is required) and leads to very efficient numerical algorithms [54, 115, 116].

Related approaches

The research on semismoothness-based methods is still in progress. Promising new directions of research are provided by Jacobian smoothing methods and continuation methods [31, 29, 92]. Here, a family of functions $(\phi_\mu)_{\mu \ge 0}$ is introduced such that $\phi_0$ is a semismooth NCP- or MCP-function, $\phi_\mu$, $\mu > 0$, is smooth, and $\phi_\mu \to \phi_0$ in a suitable sense as $\mu \to 0$. These functions are used to derive a family of equations $\Phi_\mu(u) = 0$ in analogy to (1.25). In the continuation approach [29], a sequence $(u_k)$ of approximate solutions corresponding to parameter values $\mu = \mu_k$ with $\mu_k \to 0$ is generated such that $u_k$ converges to a solution of the equation $\Phi_0(u) = 0$. Steps are usually obtained by solving the smoothed Newton equation $\Phi'_{\mu_k}(u_k)s_k^c = -\Phi_{\mu_k}(u_k)$, yielding centering steps towards the central path $\{x : \Phi_\mu(x) = 0 \text{ for some } \mu > 0\}$, or by solving the Jacobian smoothing Newton equation $\Phi'_{\mu_k}(u_k)s_k = -\Phi_0(u_k)$, yielding fast steps towards the solution set of $\Phi_0(u) = 0$. The latter steps are also used as trial steps in the recently developed Jacobian smoothing methods [31, 92].
Since the limit operator $\Phi_0$ is semismooth, the analysis of these methods heavily relies on the properties of $\partial\Phi_0$ and the semismoothness of $\Phi_0$.

The smoothing approach is also used in the development of algorithms for mathematical programs with equilibrium constraints (MPECs) [51, 57, 9, 19]. In this difficult class of problems, an objective function $f(u, v)$ has to be minimized under the constraint $u \in S(v)$, where $S(v)$ is the solution set of a VIP that is parameterized by $v$. Under suitable conditions on this inner problem, $S(v)$ can be characterized equivalently by its KKT conditions. These, however, when taken as constraints for the outer problem, violate any standard constraint qualification. Alternatively, the KKT conditions can be rewritten as a system of semismooth equations by means of an NCP-function. This, however, introduces the (mainly numerical) difficulty of nonsmooth constraints, which can be circumvented by replacing the NCP-function with a smoothing NCP-function and considering a sequence of solutions of the smoothed MPEC corresponding to $\mu = \mu_k$, $\mu_k \to 0$.

In conclusion, semismooth Newton methods are at the heart of many modern algorithms in finite-dimensional optimization, and hence should also be investigated

in the framework of optimal control and infinite-dimensional VIPs. This is the goal of the present manuscript.

1.2.2 Infinite-Dimensional Variational Inequalities

A main concern of this work is to extend the concept of semismooth Newton methods to a class of nonsmooth operator equations sufficiently rich to cover appropriate reformulations of the infinite-dimensional VIP (1.1). In a first step we derive analogues of the reformulations in section 1.2.1, but now in the function space setting. We begin with the NCP (1.4). Replacing componentwise operations by pointwise (a.e.) operations, we can apply an NCP-function $\phi$ pointwise to the pair of functions $(u, F(u))$ to define the superposition operator
$$\Phi(u)(\omega) = \phi\big(u(\omega), F(u)(\omega)\big), \tag{1.28}$$
which, under appropriate assumptions, defines a mapping $\Phi : L^p(\Omega) \to L^r(\Omega)$, $r \ge 1$, see section. Obviously, (1.4) is equivalent to the nonsmooth operator equation
$$\Phi(u) = 0. \tag{1.29}$$
In the same way, the more general problem (1.1) can be converted into an equivalent nonsmooth equation. To this end, we use a semismooth NCP-function $\phi$ and a semismooth MCP-function $\phi_{[\alpha,\beta]}$, $-\infty < \alpha < \beta < +\infty$. Now, we define the operator $\Phi : L^p(\Omega) \to L^r(\Omega)$,
$$\Phi(u)(\omega) = \begin{cases} F(u)(\omega), & \omega \in \Omega \setminus (\Omega_a \cup \Omega_b), \\ \phi\big(u(\omega) - a(\omega), F(u)(\omega)\big), & \omega \in \Omega_a \setminus \Omega_b, \\ \phi\big(b(\omega) - u(\omega), -F(u)(\omega)\big), & \omega \in \Omega_b \setminus \Omega_a, \\ \phi_{[a(\omega),b(\omega)]}\big(u(\omega), F(u)(\omega)\big), & \omega \in \Omega_a \cap \Omega_b. \end{cases} \tag{1.30}$$
Again, $\Phi$ is a superposition operator on the four different subsets of $\Omega$ distinguished in (1.30). Along the same lines, the normal map approach can be generalized to the function space setting. We will concentrate on NCP-function based reformulations and their generalizations.
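On a grid, the case distinction in the definition of the operator $\Phi$ for the bound-constrained problem amounts to applying different scalar functions on four index sets. The sketch below shows one way to evaluate it for sampled values; it is an illustration under assumptions: $\phi = \min$ and $\phi_{[\alpha,\beta]}(x) = x_1 - P_{[\alpha,\beta]}(x_1 - x_2)$ are used as concrete (piecewise linear) choices, absent bounds are encoded as $\mp\infty$, and the name `Phi_values` and the sample data are hypothetical.

```python
import numpy as np


def phi_ncp(x1, x2):
    """NCP-function min{x1, x2}."""
    return np.minimum(x1, x2)


def phi_mcp(x1, x2, alpha, beta):
    """MCP-function x1 - P_[alpha,beta](x1 - x2) for finite alpha < beta."""
    return x1 - np.clip(x1 - x2, alpha, beta)


def Phi_values(u, Fu, a, b):
    """Pointwise values of Phi(u) at grid points; a = -inf / b = +inf mean 'no bound'."""
    has_a = np.isfinite(a)
    has_b = np.isfinite(b)
    only_a = has_a & ~has_b
    only_b = has_b & ~has_a
    both = has_a & has_b
    out = np.array(Fu, dtype=float)        # no bounds present: Phi(u) = F(u)
    out[only_a] = phi_ncp(u[only_a] - a[only_a], Fu[only_a])
    out[only_b] = phi_ncp(b[only_b] - u[only_b], -Fu[only_b])
    out[both] = phi_mcp(u[both], Fu[both], a[both], b[both])
    return out


# Four sample points, one for each of the four cases (hypothetical data).
a = np.array([-np.inf, 1.0, -np.inf, 0.0])
b = np.array([np.inf, np.inf, 2.0, 1.0])
u = np.array([0.7, 1.0, 2.0, 0.5])
Fu = np.array([0.0, 2.0, -3.0, 0.0])
print(Phi_values(u, Fu, a, b))
```

In this sample, every point satisfies the pointwise conditions of the bound-constrained problem (no bound with $F(u) = 0$, active lower bound with $F(u) \ge 0$, active upper bound with $F(u) \le 0$, interior point with $F(u) = 0$), so all four values vanish.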
Our approach is applicable whenever it is possible to write the problem under consideration as an operator equation in which the underlying operator is obtained by superposition $\Psi = \psi \circ G$ of a Lipschitz continuous and semismooth function $\psi$ and a continuously Fréchet differentiable operator $G$ with reasonable properties, which maps into a direct product of Lebesgue spaces. We will show that the results for finite-dimensional semismooth equations can be extended to superposition operators in function spaces. To this end, we first develop a general semismoothness concept for operators in Banach spaces and then use these results to analyze superlinearly convergent Newton methods for semismooth operator equations. Then we apply this theory to superposition operators in function spaces of the form $\Psi = \psi \circ G$. We work with a set-valued generalized differential $\partial\Psi$ that is motivated by Qi's

finite-dimensional C-subdifferential. The semismoothness result we establish is an estimate of the form
$$\sup_{M \in \partial\Psi(y+s)} \|\Psi(y + s) - \Psi(y) - Ms\|_{L^r} = o(\|s\|_Y) \quad \text{as } \|s\|_Y \to 0.$$
We also prove semismoothness of order $\alpha > 0$, which means that the above estimate holds with $o(\|s\|_Y)$ replaced by $O(\|s\|_Y^{1+\alpha})$. This semismoothness result enables us to apply the class of semismooth Newton methods that we analyzed in the abstract setting. If applied to nonsmooth reformulations of variational inequality problems, these methods can be regarded as infinite-dimensional analogues of finite-dimensional semismooth Newton methods for this class of problems. As a consequence, we can adapt to the function space setting many of the ideas that were developed for finite-dimensional VIPs in recent years.

1.3 Organization

We now give an overview of the organization of this work. In chapter 2 we recall important results of finite-dimensional nonsmooth analysis. Several generalized differentials known from the literature (Clarke's generalized Jacobian, B-differential, and Qi's C-subdifferential) and their properties are considered. Furthermore, finite-dimensional semismoothness is discussed and semismooth Newton methods are introduced. Finally, we give important examples of semismooth functions, e.g., piecewise smooth functions, and discuss finite-dimensional generalizations of the semismoothness concept.

In the first part of chapter 3 we establish semismoothness results for operator equations in Banach spaces. The definition is based on a set-valued generalized differential and requires an approximation condition to hold. Furthermore, semismoothness of higher order is introduced. It is shown that continuously differentiable operators are semismooth with respect to their Fréchet derivative, and that the sum, composition, and direct product of semismooth operators are again semismooth.
The semismoothness concept is used to develop a Newton method for semismooth operator equations that is superlinearly convergent (with q-order $1 + \alpha$ in the case of $\alpha$-order semismoothness). Several variants of this method are considered, including an inexact version that allows us to work with approximate generalized differentials in the Newton system, and a version that includes a projection in order to stay feasible with respect to a given closed convex set containing the solution.

In the second part of chapter 3 this abstract semismoothness concept is applied to the concrete situation of operators obtained by superposition of a Lipschitz continuous semismooth function and a smooth operator mapping into a product of Lebesgue spaces. This class of operators is of significant practical importance as it contains reformulations of variational inequalities by means of semismooth NCP-, MCP-, and related functions. We first develop a suitable generalized differential that has simple structure and is closely related to the finite-dimensional C-subdifferential. Then

we show that the considered superposition operators are semismooth with respect to this differential. We also develop results to establish semismoothness of higher order. The theory is illustrated by applications to the NCP. The established semismoothness of superposition operators enables us, via nonsmooth reformulations, to develop superlinearly convergent Newton methods for the solution of the NCP (1.4), and, as we show in chapter 5, for the solution of the VIP (1.1) and even more general problems. Finally, further properties of the generalized differential are considered.

In chapter 4 we investigate two ingredients that are needed in the analysis of chapter 3. In chapter 3 it becomes apparent that in general a smoothing step is required to close a gap between two different $L^p$-norms. This necessity was already observed in similar contexts [95, 143]. In section 4.1 we describe a way to construct smoothing steps, which is based on an idea by Kelley and Sachs [95]. Furthermore, in section 4.2 we investigate a particular choice of the MCP-function that leads to reformulations for which no smoothing step is required. The analysis of semismooth Newton methods in chapter 3 relies on a regularity condition that ensures the uniform invertibility (between appropriate spaces) of the generalized differentials in a neighborhood of the solution. In section 4.3 we develop sufficient conditions for this regularity assumption.

In chapter 5 we show how the developed concepts can be applied to solve more general problems than NCPs. In particular, we propose semismooth reformulations for bound-constrained VIPs and, more generally, for VIPs with pointwise convex constraints. These reformulations allow us to apply semismooth Newton methods for their solution. Furthermore, we discuss how semismooth Newton methods can be applied to solve mixed problems, i.e., systems of VIPs and smooth operator equations.
Here, we concentrate on mixed problems arising as the Karush-Kuhn-Tucker (KKT) conditions of constrained optimization problems with optimal control structure. A close relationship between reformulations based on the black-box approach, in which the reduced problem is considered, and reformulations based on the all-at-once approach, where the full KKT system is considered, is established. We observe that the generalized differentials of the black-box reformulation appear as Schur complements in the generalized differentials of the all-at-once reformulation. This can be used to relate regularity conditions of both approaches. We also describe how smoothing steps can be computed.

In chapter 6 we describe a way to make the developed class of semismooth Newton methods globally convergent by embedding them in a trust-region method. To this end, we propose three variants of minimization problems such that solutions of the semismooth operator equation are critical points of the minimization problem. Then we develop and analyze a class of nonmonotone trust-region methods for the resulting optimization problems in a general Hilbert space setting. The trial steps have to fulfill a model decrease condition, which, as we show, can be implemented by means of a generalized fraction of Cauchy decrease condition. For this algorithm global convergence results are established. Further, it is shown how semismooth Newton steps can be used to compute trial steps, and it is proved that, under
