Maximal Inequalities for Martingales and Their Differential Subordinates

Abstract

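We introduce a method of proving maximal inequalities for Hilbert-space-valued differentially subordinate local martingales. As an application, we prove that if \(X=(X_t)_{t\ge 0},\, Y=(Y_t)_{t\ge 0}\) are local martingales such that \(Y\) is differentially subordinate to \(X\), then

$$\begin{aligned} |||Y|||_1\le \beta \,\Big |\Big |\sup _{t\ge 0}|X_t|\Big |\Big |_1, \end{aligned}$$

where \(\beta =2.585\ldots \) is the best possible.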

Keywords

Martingale · Maximal inequality · Differential subordination

Mathematics Subject Classification (2000)

Primary 60G44; Secondary 60G42

1 Introduction

Since the works of Kolmogorov, Hardy and Littlewood, Wiener, Doob and many other mathematicians, maximal inequalities have played an important role in analysis and probability. One of the main goals of this paper is to present a method of proving such estimates for continuous-time Hilbert-space-valued local martingales satisfying differential subordination.

We start by introducing the necessary background and notation. Let \((\Omega ,\mathcal{F },\mathbb P )\) be a complete probability space, filtered by a nondecreasing right-continuous family \((\mathcal{F }_t)_{t\ge 0}\) of sub-\(\sigma \)-fields of \(\mathcal{F }\). In addition, we assume that \(\mathcal{F }_0\) contains all the events of probability \(0\). Let \(X,\, Y\) be two adapted local martingales, taking values in a certain separable Hilbert space \(\mathcal H \) with norm \(|\cdot |\) and scalar product \(\langle \cdot ,\cdot \rangle \). With no loss of generality, we may take \(\mathcal H =\ell ^2\). As usual, we assume that the trajectories of the processes are right-continuous and have limits from the left. The symbol \([X,X]\) will stand for the quadratic covariance process of \(X\): this object is given by \([X,X]=\sum _{n=1}^\infty [X^n,X^n]\), where \(X^n\) denotes the \(n\)th coordinate of \(X\) and \([X^n,X^n]\) is the usual square bracket of the real-valued martingale \(X^n\) (see e.g. Dellacherie and Meyer [15] for details). In what follows, \(X^*=\sup _{t\ge 0}|X_t|\) will denote the maximal function of \(X\); we also use the notation \(X^*_t=\sup _{0\le s\le t}|X_s|\). Furthermore, for \(1\le p\le \infty \), we shall write \(||X||_p=\sup _{t\ge 0}||X_t||_p\) and \(|||X|||_p=\sup _\tau ||X_\tau ||_p\), where the second supremum is taken over all adapted bounded stopping times \(\tau \).

Throughout the paper we assume that the process \(Y\) is differentially subordinate to \(X\). This concept was originally introduced by Burkholder [8] in the discrete-time case: a martingale \(g=(g_n)_{n\ge 0}\) is differentially subordinate to \(f=(f_n)_{n\ge 0}\), if for any \(n\ge 0\) we have \(|\text{ d}g_n|\le |\text{ d}f_n|\). Here \(\text{ d}f=(\text{ d}f_n)_{n\ge 0}\), \(\text{ d}g=(\text{ d}g_n)_{n\ge 0}\) are the difference sequences of \(f\) and \(g\), respectively, given by the equations

$$\begin{aligned} \text{ d}f_0=f_0,\quad \text{ d}f_n=f_n-f_{n-1},\qquad \text{ d}g_0=g_0,\quad \text{ d}g_n=g_n-g_{n-1},\qquad n=1,\,2,\,\ldots . \end{aligned}$$

The extension of the domination to the continuous-time setting is due to Bañuelos and Wang [3] and Wang [23]. We say that \(Y\) is differentially subordinate to \(X\), if the process \(([X,X]_t-[Y,Y]_t)_{t\ge 0}\) is nondecreasing and nonnegative as a function of \(t\). If we treat given discrete-time martingales \(f\), \(g\) as continuous-time processes (via \(X_t=f_{\lfloor t\rfloor }\) and \(Y_t=g_{\lfloor t\rfloor }\), \(t\ge 0\)), we see this domination is consistent with Burkholder’s original definition of differential subordination.
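The discrete-time definition can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: the helper names, the fair \(\pm 1\) steps and the multipliers \(v_n\) are hypothetical choices, not taken from the paper. It builds a real-valued martingale \(f\) and a transform \(g\) with \(\text{ d}g_n=v_n\text{ d}f_n\), \(|v_n|\le 1\), and verifies both \(|\text{ d}g_n|\le |\text{ d}f_n|\) and the monotonicity of the discrete analogue of \([X,X]-[Y,Y]\).

```python
import random

def difference_sequence(path):
    """Return (d_0, d_1, ...) with d_0 = path[0] and d_n = path[n] - path[n-1]."""
    return [path[0]] + [path[n] - path[n - 1] for n in range(1, len(path))]

def square_bracket(path):
    """Discrete square bracket: running sums of squared differences."""
    total, out = 0.0, []
    for d in difference_sequence(path):
        total += d * d
        out.append(total)
    return out

def is_differentially_subordinate(g, f, tol=1e-12):
    """Check |dg_n| <= |df_n| for every n (the discrete-time definition)."""
    return all(abs(dg) <= abs(df) + tol
               for dg, df in zip(difference_sequence(g), difference_sequence(f)))

rng = random.Random(0)
steps = [rng.choice((-1.0, 1.0)) for _ in range(20)]
# Hypothetical multipliers: deterministic (hence predictable) with |v_n| <= 1.
multipliers = [((-1) ** n) * 0.5 for n in range(len(steps))]

f, g = [], []
f_sum = g_sum = 0.0
for e, v in zip(steps, multipliers):
    f_sum += e
    g_sum += v * e
    f.append(f_sum)
    g.append(g_sum)

# gap[n] plays the role of [X,X]_t - [Y,Y]_t at integer times.
gap = [a - b for a, b in zip(square_bracket(f), square_bracket(g))]
print(is_differentially_subordinate(g, f))
print(gap[0] >= 0 and all(gap[n + 1] >= gap[n] for n in range(len(gap) - 1)))
```

Reversing the roles of \(f\) and \(g\) in `is_differentially_subordinate` makes the check fail for this example, since here \(|\text{ d}f_n|=1>|\text{ d}g_n|\).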

To illustrate this notion, consider the following example. Suppose that \(X\) is an \(\mathcal{H }\)-valued martingale, \(H\) is a predictable process taking values in the interval \([-1,1]\) and let \(Y\) be given as the stochastic integral \(Y_t=H_0X_0+\int _{0+}^t H_s\text{ d}X_s\), \(t\ge 0\). Then \(Y\) is differentially subordinate to \(X\): we have

$$\begin{aligned} [X,X]_t-[Y,Y]_t=(1-H_0^2)|X_0|^2+\int _{0+}^t (1-H_s^2)\,\text{ d}[X,X]_s,\qquad t\ge 0. \end{aligned}$$

Another example for stochastic integrals, which plays an important role in applications (see e.g., [2, 3, 16]), is the following. Suppose that \(B\) is a Brownian motion in \(\mathbb{R }^d\) and \(H,\, K\) are predictable processes taking values in the matrices of dimensions \(m\times d\) and \(n\times d\), respectively. For any \(t\ge 0\), define

$$\begin{aligned} X_t=\int _{0+}^t H_s\,\text{ d}B_s,\qquad Y_t=\int _{0+}^t K_s\,\text{ d}B_s. \end{aligned}$$

If the Hilbert–Schmidt norms of \(H\) and \(K\) satisfy \(||K_t||_\mathrm{HS}\le ||H_t||_\mathrm{HS}\) for all \(t>0\), then \(Y\) is differentially subordinate to \(X\): this follows from the identity

$$\begin{aligned} [X,X]_t-[Y,Y]_t=\int _{0+}^t \big (||H_s||_\mathrm{HS}^2-||K_s||_\mathrm{HS}^2\big )\,\text{ d}s. \end{aligned}$$

The differential subordination implies many interesting inequalities comparing the sizes of \(X\) and \(Y\). A celebrated result of Burkholder gives the following information on the \(L^p\)-norms (see [8, 10, 12, 13, 23]).

Theorem 1.1

Suppose that \(X\), \(Y\) are Hilbert-space-valued local martingales such that \(Y\) is differentially subordinate to \(X\). Then

$$\begin{aligned} ||Y||_p\le (p^*-1)||X||_p,\qquad 1<p<\infty , \end{aligned}$$

(1.1)

where \(p^*=\max \{p,p/(p-1)\}\). The constant is the best possible, even if \(\mathcal H =\mathbb{R }\).
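As a sanity check of the bound \(||Y||_p\le (p^*-1)||X||_p\) of Theorem 1.1, the following sketch computes exact \(L^p\) norms for one hypothetical discrete-time example: \(f\) is a simple symmetric random walk started at \(0\), and \(g\) is its transform by a predictable sign. The betting rule `predictable_sign` and the function names are illustrative assumptions; the enumeration over all \(2^n\) sign paths is exact but is, of course, no substitute for the proof.

```python
from itertools import product

def p_star(p):
    """p* = max{p, p/(p-1)} for 1 < p < infinity."""
    return max(p, p / (p - 1.0))

def predictable_sign(previous_f):
    """Hypothetical predictable multiplier: it depends only on f_{k-1}."""
    return -1.0 if previous_f > 0 else 1.0

def exact_lp_norms(n, p):
    """Exact L^p norms of f_n and g_n, enumerating all 2^n fair sign paths.

    For martingales the L^p norms are nondecreasing in time, so these
    terminal norms coincide with sup_k ||f_k||_p and sup_k ||g_k||_p.
    """
    weight = 0.5 ** n
    f_moment = g_moment = 0.0
    for eps in product((-1.0, 1.0), repeat=n):
        f = g = 0.0
        for e in eps:
            v = predictable_sign(f)  # uses only the past value of f
            g += v * e               # dg_k = v_k * df_k, so |dg_k| <= |df_k|
            f += e
        f_moment += weight * abs(f) ** p
        g_moment += weight * abs(g) ** p
    return f_moment ** (1.0 / p), g_moment ** (1.0 / p)

for p in (1.5, 2.0, 3.0):
    f_norm, g_norm = exact_lp_norms(10, p)
    print(p, g_norm, (p_star(p) - 1.0) * f_norm)
```

For \(p=2\) the constant equals \(1\) and the two norms agree in this example, by the orthogonality of martingale differences.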

For \(p=1\), the above moment inequality does not hold with any finite constant, but we have the corresponding weak-type \((1,1)\) estimate. In fact, we have the following result for a wider range of parameters \(p\), proved by Burkholder [8] for \(1\le p\le 2\) and Suh [22] for \(p>2\). See also Wang [23].

Theorem 1.2

Suppose that \(X\), \(Y\) are Hilbert-space-valued local martingales such that \(Y\) is differentially subordinate to \(X\). Then

There are many other related results; see, e.g., the papers [3] and [4] by Bañuelos and Wang and [11] and [13] by Burkholder, and consult the references therein. For more recent works, we refer the interested reader to the papers [18, 19, 20] by the author, and [6, 7] by Borichev et al. The estimates have found numerous applications in many areas of mathematics, in particular, in the study of the boundedness of various classes of Fourier multipliers (consult, for instance, [1, 2, 3, 12, 16, 17]).

There is a general method, invented by Burkholder, which not only enables one to establish various estimates for differentially subordinated martingales, but is also very efficient in determining the optimal constants in such inequalities. The idea is to construct an appropriate special function, an upper solution to a nonlinear problem corresponding to the inequality under investigation, and then to exploit its properties. See the survey [13] for a detailed description of the technique in the discrete-time setting and consult Wang [23] for the changes needed to make the method work in the continuous-time setting.

The above results can be extended in another, very interesting direction. Namely, in the present paper we will be interested in inequalities involving the maximal functions of \(X\) and/or \(Y\). Burkholder [14] modified his technique so that it could be used to study such inequalities for stochastic integrals, and applied it to obtain the following result, which can be regarded as another version of (1.1) for \(p=1\).

Theorem 1.3

Suppose that \(X\) is a real-valued martingale and \(Y\) is the stochastic integral, with respect to \(X\), of some predictable real-valued process \(H\) taking values in \([-1,1]\). Then we have the sharp estimate

$$\begin{aligned} |||Y|||_1\le \gamma ||X^*||_1, \end{aligned}$$

(1.2)

where \(\gamma =2.536\ldots \) is the unique positive number satisfying

$$\begin{aligned} \gamma =3-\exp \frac{1-\gamma }{2}. \end{aligned}$$

As we have already observed above, if \(X\) and \(Y\) satisfy the assumptions of this theorem, then \(Y\) is differentially subordinate to \(X\). An appropriate modification of the proof in [14] shows that the assertion is still valid if we impose this less restrictive condition on the processes. However, the assertion no longer holds if we pass from the real-valued to the vector-valued case. Here is one of the main results of this paper.

Theorem 1.4

Suppose that \(X\), \(Y\) are Hilbert-space-valued local martingales such that \(Y\) is differentially subordinate to \(X\). Then

$$\begin{aligned} |||Y|||_1\le \beta ||X^*||_1, \end{aligned}$$

(1.3)

where \(\beta =2.585\ldots \) is the unique positive number satisfying

$$\begin{aligned} \beta =2+\log \frac{1+\beta }{2}. \end{aligned}$$

(1.4)

The constant \(\beta \) is the best possible, even for discrete-time martingales taking values in a two-dimensional subspace of \(\mathcal H \).
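The two constants are defined only implicitly, by the equation for \(\gamma \) in Theorem 1.3 and by (1.4). A minimal sketch of how they can be approximated is given below; it solves both equations exactly as displayed above by bisection. The function name `solve_fixed_point` and the bracketing intervals are our own assumptions, chosen so that the sign change is easy to verify by hand.

```python
import math

def solve_fixed_point(phi, lo, hi, tol=1e-13):
    """Bisection for the root of x - phi(x) on [lo, hi].

    Assumes x - phi(x) changes sign exactly once on the interval.
    """
    f_lo = lo - phi(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = mid - phi(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# gamma = 3 - exp((1 - gamma)/2), the equation in Theorem 1.3.
gamma = solve_fixed_point(lambda x: 3.0 - math.exp((1.0 - x) / 2.0), 1.0, 3.0)
# beta = 2 + log((1 + beta)/2), equation (1.4).
beta = solve_fixed_point(lambda x: 2.0 + math.log((1.0 + x) / 2.0), 1.0, 4.0)

print(gamma, beta)
```

Both maps \(x\mapsto x-3+\exp \frac{1-x}{2}\) and \(x\mapsto x-2-\log \frac{1+x}{2}\) are increasing on \((0,\infty )\) and negative at \(0\), which is why each equation has a unique positive solution and why bisection applies.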

This is a very surprising result. In most cases, the inequalities for stochastic integrals of real-valued martingales carry over, with unchanged constants, to the corresponding bounds for vector-valued local martingales satisfying differential subordination. In other words, given a sharp inequality for \(\mathcal H \)-valued differentially subordinated martingales, the extremal processes, i.e. those for which the equality is (almost) attained, can usually be realized as stochastic integrals in which the integrator takes values in a one-dimensional subspace of \(\mathcal H \). See e.g., the statements of Theorems 1.1 and 1.2. Here the situation is different: the optimal constant does depend on the dimension of the range of \(X\) and \(Y\).
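To make the vector-valued setting concrete, the following sketch evaluates the ratio \(\mathbb{E }|g_n|/\mathbb{E }f_n^*\) exactly for one hypothetical pair of \(\mathbb{R }^2\)-valued discrete-time martingales with \(|\text{ d}g_k|=|\text{ d}f_k|\). The directions of the steps (a rotating unit vector for \(f\), a unit vector perpendicular to the current value of \(f\) for \(g\)) are illustrative assumptions only; by Theorem 1.4 the ratio cannot exceed \(\beta \), and this simple example is far from extremal.

```python
import math
from itertools import product

def unit_perpendicular(v, fallback):
    """Unit vector perpendicular to v in the plane, or `fallback` if v = 0."""
    length = math.hypot(v[0], v[1])
    if length == 0.0:
        return fallback
    return (-v[1] / length, v[0] / length)

def ratio_for_planar_example(n, angle):
    """Exact E|g_n| / E max_k |f_k| over all 2^n fair sign paths."""
    weight = 0.5 ** n
    mean_abs_g = 0.0
    mean_max_f = 0.0
    for eps in product((-1.0, 1.0), repeat=n):
        f = (0.0, 0.0)
        g = (0.0, 0.0)
        running_max = 0.0
        for k, e in enumerate(eps):
            u = (math.cos(k * angle), math.sin(k * angle))  # direction of df_k
            w = unit_perpendicular(f, u)                    # depends only on f_{k-1}
            f = (f[0] + e * u[0], f[1] + e * u[1])
            g = (g[0] + e * w[0], g[1] + e * w[1])
            running_max = max(running_max, math.hypot(f[0], f[1]))
        mean_abs_g += weight * math.hypot(g[0], g[1])
        mean_max_f += weight * running_max
    return mean_abs_g / mean_max_f

ratio = ratio_for_planar_example(10, 0.7)
print(ratio)
```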

Finally, let us mention here another related result. In general, the best constants in non-maximal inequalities for differentially subordinated local martingales do not change when we restrict ourselves to continuous-path processes; see e.g., Section 15 in [8] for the justification of this phenomenon. However, if we study the maximal estimates, the best constants may be different: for example, the passage to continuous-path local martingales reduces the constant \(\gamma \) in (1.2) to \(\sqrt{2}\). Specifically, we have the following theorem, which is one of the principal results of [21].

Theorem 1.5

Assume that \(X,\, Y\) are Hilbert-space-valued, continuous-path local martingales such that \(Y\) is differentially subordinate to \(X\). Then

We have organized the paper as follows. The next section is devoted to an extension of Burkholder’s method. In Sect. 3 we apply the technique to establish (1.3). In Sect. 4 we prove that the constant \(\beta \) cannot be replaced in (1.3) by a smaller one. The final part of the paper contains the proofs of technical facts needed in the earlier considerations.

2 On the Method of Proof

Burkholder’s method from [14] is a powerful tool for proving maximal inequalities for transforms of discrete-time real-valued martingales. The results for the wider setting of stochastic integrals are then obtained by the use of approximation theorems of Bichteler [5]. This approach has the advantage that it avoids practically all the technicalities which arise naturally in the study of continuous-time processes. On the other hand, it does not allow one to study estimates for (local) martingales under differential subordination; the purpose of this section is to present a refinement of the method which can be used to handle such problems.

The general statement is the following. Let \(V:\mathcal H \times \mathcal H \times [0,\infty )\times [0,\infty )\rightarrow \mathbb{R }\) be a given Borel function and suppose that we want to show the estimate

for any \(t\ge 0\) and any \(\mathcal H \)-valued local martingales \(X,\, Y\) such that \(Y\) is differentially subordinate to \(X\). For technical reasons, we shall deal with a slightly different, localized version of (2.1) (see Theorem 2.2 for the precise statement). Let \(D=\mathcal H \times \mathcal H \times (0,\infty )\times (0,\infty )\). Introduce the class \(\mathcal U (V)\), which consists of all \(C^2\) functions \(U:D\rightarrow \mathbb{R }\) satisfying (2.2)–(2.5) below: for any \((x,y,z,w)\in D\),

For example, let us establish the bound for \(U_w\). Pick \(x,\,y,\,z,\,w\) with \(|y|=w\); by the continuity of \(U_w\), we may and do assume that \(|x|<z\). Apply (2.5) to \(x,\,y,\,z,\,w\) and \(h=k=sy\) for some \(s>0\). Then, take all the terms on one side, divide throughout by \(s\) and let \(s\rightarrow 0\). Since \(|x+sy|\vee z=z\) for sufficiently small \(s\) and \(|y+sy|\vee w=(1+s)w\) for all \(s>0\), we obtain the second estimate in (2.6); the bound for \(U_z\) is established similarly.

Before we turn to the main result of this section, let us mention here a technical fact, which will be needed later. Recall that for any semimartingale \(X\) there exists a unique continuous local martingale part \(X^c\) of \(X\) satisfying \(X^c_0=0\) and

Here \(\Delta X_s=X_s-X_{s-}\) is the jump of \(X\) at time \(s\). Furthermore, \([X^c,X^c]=[X,X]^c\), the pathwise continuous part of \([X,X]\). Here is Lemma 1 of Wang [23].

Lemma 2.1

If \(X\) and \(Y\) are semimartingales, then \(Y\) is differentially subordinate to \(X\) if and only if \(Y^c\) is differentially subordinate to \(X^c\), \(|\Delta Y_t|\le |\Delta X_t|\) for all \(t>0\) and \(|Y_0|\le |X_0|\).

We are ready to study the interplay between the class \(\mathcal U (V)\) and the bound (2.1).

Theorem 2.2

Assume that \(\mathcal U (V)\) is nonempty and \(X,\, Y\) are Hilbert-space-valued local martingales such that \(Y\) is differentially subordinate to \(X\). Then there is a nondecreasing sequence \((\tau _N)_{N\ge 1}\) of stopping times such that \(\lim _{N\rightarrow \infty }\tau _N=\infty \) and

There is a sequence \((T_{N,j})_{j\ge 1}\) of stopping times with \(T_{N,j}\uparrow \tau _N\), localizing the stochastic integrals \(\int U_x(Z^{(d)}_{s-})\cdot \text{ d}X_s^{(d)},\, \int U_y(Z^{(d)}_{s-})\cdot \text{ d}Y_s^{(d)}\). Since \(X^{(d)},\, Y^{(d)}\) take values in a finite-dimensional subspace, we may apply Itô’s formula to get

Note that the integrals in \(I_2\) are with respect to the continuous parts of the processes \(X^{(d)*}\vee \varepsilon \) and \(Y^{(d)*}\vee \varepsilon \); this is due to the lack of the terms \(U_z(Z^{(d)}_{s-})\Delta (X^{(d)*}_s\vee \varepsilon )\) and \(U_w(Z^{(d)}_{s-})\Delta (Y^{(d)*}_s\vee \varepsilon )\) in \(I_4\).

Let us analyze the terms \(I_1\)–\(I_4\). We have \(\mathbb{E }I_1=0\), since both the stochastic integrals are martingales. Next, \(I_2\le 0\): by (2.6), we have \(U_z(Z^{(d)}_{s-})\le 0\) on the set \(\{s: |X^{(d)}_{s-}|=X^{(d)*}_{s-}\vee \varepsilon \}\), which contains the support of \(\text{ d}(X^{(d)*}\vee \varepsilon )_s^c\). Hence the first integral in \(I_2\) is nonpositive; the second one is handled analogously. To deal with \(I_3\), fix \(0\le s_0<s_1\le t\). For any \(\ell \ge 0\), let \((\eta _i^\ell )_{0\le i\le i_\ell }\) be a nondecreasing sequence of stopping times with \(\eta _0^\ell =s_0,\,\eta _{i_\ell }^\ell =s_1\) such that \(\lim _{\ell \rightarrow \infty }\max _{0\le i\le i_\ell -1}|\eta _{i+1}^\ell -\eta _i^\ell |=0\). Keeping \(\ell \) fixed, we apply, for each \(i=0,\,1,\,2,\,\ldots ,\,i_\ell -1\), the property (2.4) to \(x=X_{s_0-}^{(d)},\, y=Y_{s_0-}^{(d)},\, z=X^{(d)*}_{s_0-}\vee \varepsilon \), \(w=Y^{(d)*}_{s_0-}\vee \varepsilon \) and \(h=h_i^\ell =X^{(d)c}_{T_{N,j}\wedge \eta _{i+1}^\ell }-X^{(d)c}_{T_{N,j}\wedge \eta _{i}^\ell }\), \(k=k_i^\ell =Y^{(d)c}_{T_{N,j}\wedge \eta _{i+1}^\ell }-Y^{(d)c}_{T_{N,j}\wedge \eta _{i}^\ell }\). We sum the obtained \(i_\ell \) inequalities and let \(\ell \rightarrow \infty \). Using the notation \([S,T]_s^u=[S,T]_u-[S,T]_s\), we may write the result in the form

where in the third passage we have exploited the differential subordination of \(Y^c\) to \(X^c\). From the local boundedness of \(c\) and the definition of \(\tau _N\), we infer that on the set \(\{T_{N,j}>0\},\, c(Z_{T_{N,j}\wedge s_0-}^{(d)})\) is bounded by a constant \(C\) depending only on \(N\) and \(\varepsilon \). Thus

using a standard approximation of integrals by discrete sums. Finally, we see that each summand in \(I_4\) is nonpositive, directly from (2.5) and the fact that \(|\Delta Y_s|\le |\Delta X_s|\), see Lemma 2.1. Consequently,

For fixed \(N\), the random variables \(Z^{(d)}_{T_{N,j}\wedge t-}\), \(j\ge 1,\, d\ge \mathcal D \), are uniformly bounded on \(\{\tau _N>0\}\), in view of the definition of \(\tau _N\). Moreover, we have

Remark 2.3

A careful inspection of the proof of the above theorem shows that the function \(U\) need not be given on the whole \(D=\mathcal H \times \mathcal H \times (0,\infty )\times (0,\infty )\). Indeed, it suffices to define it on a certain neighborhood of the set \(\{(x,y,z,w)\in D: |x|\le z,\,|y|\le w\}\) in which the process \(Z\) takes its values. This can be further relaxed: if we are allowed to work with those \(X,\, Y\) which are bounded away from \(0\), then all we need is a \(C^2\) function \(U\) given on some neighborhood of \(\{(x,y,z,w)\in D: 0<|x|\le z,\,0<|y|\le w\}\), satisfying (2.2)–(2.5) on this set.

3 The Special Function Corresponding to (1.3)

Now we apply the approach described in the previous section to establish (1.3). Let \(V:D\rightarrow \mathbb{R }\) be given by \(V(x,y,z,w)=|y|-\beta (|x|\vee z)\). Furthermore, put

The majorization (2.3): in particular, (2.4) implies that for any \(h\) the function \(t\mapsto U(x+th,y,z,w)\) is concave on \([t_-,t_+]\), where \(t_-=\inf \{t: |x+th|\le z\}\) and \(t_+=\sup \{ t:|x+th|\le z \}\). Consequently, it suffices to verify (2.3) only for \((x,y,z,w)\) satisfying \(|x|= z\). But this reduces to the second part of Lemma 3.1.
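The endpoints \(t_-\) and \(t_+\) are simply the roots of the quadratic equation \(|x+th|^2=z^2\) in \(t\). The sketch below computes them for a hypothetical choice of \(x,\,h,\,z\) in the plane and checks numerically that \(t_-<0<t_+\) when \(|x|<z\), that \(|x+t_+h|=z\), and that \(\langle x+t_+h,h\rangle \ge 0\), i.e. that \(|x+th|^2\) is nondecreasing at \(t=t_+\). The function name `exit_parameters` and the sample data are assumptions made for illustration.

```python
import math

def dot(u, v):
    """Euclidean scalar product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

def exit_parameters(x, h, z):
    """Return (t_minus, t_plus), the endpoints of {t : |x + t h| <= z}.

    Assumes h != 0 and |x| <= z, so that the quadratic
    |h|^2 t^2 + 2<x,h> t + |x|^2 - z^2 = 0 has real roots.
    """
    a = dot(h, h)
    b = 2.0 * dot(x, h)
    c = dot(x, x) - z * z
    disc = math.sqrt(b * b - 4.0 * a * c)
    return (-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)

# Hypothetical data with |x| < z.
x, h, z = (0.3, -0.2), (1.0, 2.0), 1.0
t_minus, t_plus = exit_parameters(x, h, z)
x_prime = tuple(xi + t_plus * hi for xi, hi in zip(x, h))

print(t_minus, t_plus)
print(math.sqrt(dot(x_prime, x_prime)), dot(x_prime, h))
```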

The condition (2.5): by homogeneity and continuity of both sides, we may assume that \(z=1\) and \(|x|<1\). Define

for \( t\in \mathbb{R }\) and let \(t_-\), \(t_+\) be as above; note that \(t_-<0\) and \(t_+>0\). By (2.4), \(H\) is concave on \([t_-,t_+]\) and hence (2.5) holds if \(|x+h|\le 1\). Suppose then that \(|x+h|>1\) or, in other words, that \(t_+<1\). The vector \(x^{\prime }=x+t_+h\) satisfies \(\langle x^{\prime },h\rangle \ge 0\): this is equivalent to \(\frac{d}{dt}|x+th|^2|_{t=t_+}\ge 0\). Hence, by (3.4), if we put \(y^{\prime }=y+t_+k\), then

Proof of (1.3)

It suffices to establish the estimate for \(X^*\in L^1\), because otherwise there is nothing to prove. Furthermore, we may assume that \(Y\) is bounded away from \(0\). To see this, consider a new Hilbert space \( \mathbb{R }\,\times \, \mathcal H \) and the martingales \((\delta ,X)\) and \((\delta ,Y)\), with \(\delta >0\). These martingales are bounded away from \(0\) and \((\delta ,Y)\) is differentially subordinate to \((\delta ,X)\). Having proved (1.3) for these processes, we let \(\delta \rightarrow 0\) and get the bound for \(X\) and \(Y\), by Lebesgue’s dominated convergence theorem.
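The reduction to processes bounded away from \(0\) can be illustrated on a finite path. In the sketch below (helper names and the sample path are hypothetical), a constant coordinate \(\delta \) is prepended to each value of an \(\mathbb{R }^2\)-valued path: the increments keep their norms, so differential subordination is unaffected, the new path stays at distance at least \(\delta \) from the origin, and its norm differs from the original one by at most \(\delta \).

```python
import math

def norm(v):
    """Euclidean norm of a vector given as a tuple."""
    return math.sqrt(sum(c * c for c in v))

def augment(path, delta):
    """Prepend the constant coordinate delta to every value of the path."""
    return [(delta,) + tuple(v) for v in path]

def increments(path):
    """Increments path[n] - path[n-1] for n >= 1."""
    return [tuple(b - a for a, b in zip(path[n - 1], path[n]))
            for n in range(1, len(path))]

# Hypothetical sample path which visits the origin.
y_path = [(0.0, 0.0), (0.5, -0.5), (0.0, 0.0), (-1.0, 0.25)]

for delta in (1.0, 0.1, 0.001):
    augmented = augment(y_path, delta)
    smallest = min(norm(v) for v in augmented)
    largest_gap = max(norm(a) - norm(v) for a, v in zip(augmented, y_path))
    print(delta, smallest, largest_gap)
```

The last printed column tends to \(0\) with \(\delta \), which is the pathwise convergence behind the limiting argument \(\delta \rightarrow 0\).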

Now we make use of the methodology described in the previous section (in particular, we exploit Remark 2.3). Since \(U\in \mathcal U (V)\), the above estimate follows immediately from (2.7), applied to the local martingales \((X_{\tau \wedge t})_{t\ge 0},\, (Y_{\tau \wedge t})_{t\ge 0}\), and letting \(N\rightarrow \infty ,\, t\rightarrow \infty \) and \(\varepsilon \rightarrow 0\). \(\square \)

4 Sharpness

The constant \(\beta \) can be shown to be optimal in (1.3) by the use of appropriate examples, but then the calculations are quite involved. To simplify the proof, we use a different approach. Assume that the probability space is the interval \([0,1]\) equipped with its Borel subsets and Lebesgue measure. Suppose that there is \(\beta _0\in (0,\beta )\) with the following property: for any discrete filtration \((\mathcal{F }_n)_{n\ge 0}\) and any adapted martingales \(f,\, g\) taking values in \(\mathbb{R }^2\) such that \(g\) is differentially subordinate to \(f\), we have

$$\begin{aligned} ||g||_1\le \beta _0||f^*||_1. \end{aligned}$$

(4.1)

We shall show that the validity of this estimate implies the existence of a certain special function, with properties similar to those in the definition of the class \(\mathcal U (V)\). Then, by proper exploitation of these conditions, we shall deduce that \(\beta _0\ge \beta \).

Recall that a sequence \((f_n)_{n\ge 0}\) is called simple if for any \(n\) the term \(f_n\) takes only a finite number of values and there is a deterministic \(N\) such that \(f_N=f_{N+1}=f_{N+2}=\ldots =f_\infty \). For any \((x,y)\in \mathbb{R }^2\times \mathbb{R }^2\), introduce the class \(\mathcal M (x,y)\) which consists of those simple martingale pairs \((f,g)\) with values in \(\mathbb{R }^2\times \mathbb{R }^2\), which satisfy the following two conditions.

This is a consequence of the so-called “splicing argument” of Burkholder (see, e.g., [9, p. 77]). For the convenience of the reader, let us provide the easy proof. Pick \((f^+,g^+)\in \mathcal M (x+th,y+tk),\, (f^-,g^-)\in \mathcal M (x-sh,y-sk)\). These two pairs are spliced together into one pair \((f,g)\) as follows: set \((f_0,g_0)\equiv (x,y)\) and (recall that \(\Omega =[0,1]\))

for \(n=1,\,2,\,\ldots \). It is not difficult to see that \((f,g)\) is a martingale pair with respect to its natural filtration. Furthermore, it is clear that this pair belongs to \(\mathcal M (x,y)\). Finally, since \(|x|\le z\), we have \(f_n^*\vee z=\sup _{1\le k\le n}|f_k|\vee z\) for \(n=1,\,2,\,\ldots \) and therefore

where we have used parts (ii) and (iii) of Lemma 4.1. By part (v) of that lemma, the function \(s\mapsto W(x,sy,1)\), \(s\in \mathbb{R }\), is continuous. Thus, if we let \(t\rightarrow \infty \), we get

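Now we have come to the point where we use the fact that we are in the vector-valued setting. Namely, we pick a vector \(d\in \mathbb{R }^2\setminus \{0\}\), orthogonal to \(y+\delta y/r-(x-\delta x)\). Let \(s,\,t>0\) be uniquely determined by the equalities \( |x-\delta x-sd|=|x-\delta x+td|=1.\) Then

$$\begin{aligned} |y+\delta y/r-sd|^2&=|y+\delta y/r-(x-\delta x)|^2+2\langle y+\delta y/r-(x-\delta x),\,x-\delta x-sd\rangle +|x-\delta x-sd|^2\\ &=|y+\delta y/r|^2-|x-\delta x|^2+1=(r+\delta )^2-(1-\delta )^2+1=|y|^2+2\delta |y|+2\delta , \end{aligned}$$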

since, as we have assumed at the beginning, \(|y|=r\) and \(|x|=1\). In other words, we have \(|y+\delta y/r-sd|=\sqrt{|y|^2+2\delta |y|+2\delta }\) and, similarly, \(|y+\delta y/r+td|=\sqrt{|y|^2+2\delta |y|+2\delta }\). Therefore, if we apply (4.2) with \(x^{\prime }:=x-\delta x,\, y^{\prime }:=y+\delta y/r,\, z=1,\, h=k=d\) and \(s,\,t\) as above, and combine it with the definition of \(\Psi \), we get

Now we shall choose an appropriate \(t\). We have \(\Psi (1)<-1\); otherwise, we would let \(t\rightarrow \infty \) and obtain a contradiction with the assumption \(\beta _0<\beta \). Furthermore, \(\Psi (1)\ge 1-\beta _0>-2\). Thus, the number \(t\), determined by the equation \(\Psi (1)=-\frac{1+t}{t}\), satisfies \(t>1\). Application of (4.5) with this choice of \(t\) gives

For simplicity, we shall write \(k,\, y\) instead of \(|k|,\, |y|\), respectively. We consider two major cases.

Case I: Suppose that

$$\begin{aligned} \sqrt{1+k^2}\ge (2-\log 2)(1+y). \end{aligned}$$

(5.1)

Then \(\sqrt{1+k^2}\ge 2-\log 2\), or \(k\ge k_0:=\sqrt{(2-\log 2)^2-1}\). In addition, \(\frac{k-y}{\sqrt{1+k^2}}\le 1\), so using the convexity of the function \(\xi (s)= 2s-\log (1+s),\, s\ge 0\), we have

Suppose, finally, that \(y\ge 2\). As previously, the left-hand side of (5.2) is bounded from above by \(2-\log 2\). On the other hand, the right-hand side is larger than \(-\log 3+2(2-\log 2)>2-\log 2\).
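The elementary numerical facts used in Case I are easy to confirm. The sketch below (variable names are our own) evaluates the threshold \(k_0=\sqrt{(2-\log 2)^2-1}\), checks that \(\xi (s)=2s-\log (1+s)\) has a positive second derivative \(\xi ''(s)=(1+s)^{-2}\) on a grid of points, and verifies the final inequality \(-\log 3+2(2-\log 2)>2-\log 2\).

```python
import math

C = 2.0 - math.log(2.0)        # the constant 2 - log 2 appearing in (5.1)
k0 = math.sqrt(C * C - 1.0)    # k >= k0 is equivalent to sqrt(1 + k^2) >= C

def xi(s):
    """xi(s) = 2s - log(1 + s), s >= 0."""
    return 2.0 * s - math.log(1.0 + s)

def xi_second_derivative(s):
    """Second derivative of xi, equal to 1 / (1 + s)^2."""
    return 1.0 / (1.0 + s) ** 2

grid = [0.1 * j for j in range(0, 101)]
convex_on_grid = all(xi_second_derivative(s) > 0.0 for s in grid)
# Midpoint convexity on consecutive grid points, as a second check.
midpoint_ok = all(xi(0.5 * (a + b)) <= 0.5 * (xi(a) + xi(b)) + 1e-15
                  for a, b in zip(grid, grid[1:]))

print(C, k0)
print(convex_on_grid, midpoint_ok)
print(-math.log(3.0) + 2.0 * C > C)
```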

Case II: Now we assume that

$$\begin{aligned} \sqrt{1+k^2}< (2-\log 2)(1+y). \end{aligned}$$

(5.3)

The inequality (3.3) is equivalent to \(F(k)\le 2y-\log (1+y)\), where

We shall prove that this function is convex. To do this, fix \(t_1,\,t_2\ge 0,\, \alpha _1,\,\alpha _2\in (0,1)\) with \(\alpha _1+\alpha _2=1\), and let \(t=\alpha _1t_1+\alpha _2t_2\). Using Lemma 3.2, we get

As a function of \(|h|\), the left-hand side of the inequality is nonincreasing (see Lemma 3.1 (iii)), so it suffices to prove the bound for \(|h|=|k|\). Fix \(|y|,\, |k|\) and consider the left-hand side as a function \(F\) of \(\langle y,k\rangle \). This function is concave (Lemma 3.1 (iv)), and

Now, if \(|y|+1> \sqrt{1+|k|^2}+|k|-|y|\), then \(F^{\prime }\) vanishes at \(\langle y,k\rangle =(1+|y|)\)\((1-\sqrt{1+|k|^2})\) and hence it suffices to establish (5.4) for \(y\) and \(k\) satisfying this equation. A little calculation transforms the estimate into (3.2). On the other hand, if \(|y|+1\le \sqrt{1+|k|^2}+|k|-|y|\), then \(F^{\prime }\) is nonpositive on \([-|y||k|,|y||k|]\) and we need to verify (5.4) for \(y,\,k\) satisfying \(\langle y,k\rangle =-|y||k|\). Then the bound reduces to (3.3).

Step 3

Finally, we treat (3.4) for general vectors. The bound is equivalent to

For fixed \(|x|,\, y,\, h\), and \(k\), the left-hand side, as a function of \(\langle x,h\rangle \), is convex (see Lemma 3.1 (iii)) and hence it suffices to verify the estimate in the case when \(\langle x,h\rangle \in \{|x||h|,0\}\). These cases have been considered in Steps 1 and 2. \(\square \)

Notes

Acknowledgments

This work was partially supported by Polish Ministry of Science and Higher Education (MNiSW) Grant N N201 397437. The author would like to thank the anonymous Referee for the careful reading of the first version of the paper, the comments and remarks, which greatly improved the presentation.

Open Access

This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
