Semiparametric inference on the fractal index of Gaussian and
conditionally Gaussian time series data

Abstract

We study a well-known estimator of the fractal index of a stochastic process. Our framework is very general and encompasses many models of interest; we show how to extend the theory of the estimator to a large class of non-Gaussian processes. Particular focus is on clarity and ease of implementation of the estimator and the associated asymptotic results, making it easy for practitioners to apply the methods. We additionally show how measurement noise in the observations will bias the estimator, potentially resulting in the practitioner erroneously finding evidence of fractal characteristics in a time series. We propose a new estimator which is robust to such noise and construct a formal hypothesis test for the presence of noise in the observations. Finally, the methods are illustrated on two empirical data sets; one of turbulent velocity flows and one of financial prices.

Fractal-like models are used in a wide array of applications, such as the characterization of surface smoothness/roughness (Constantine and Hall, 1994), the study of turbulence (Corcuera et al., 2013), and many others (e.g., Burrough, 1981; Mandelbrot, 1982; Falconer, 1990). Most recently, these models have attracted attention in mathematical finance as models of stochastic volatility (e.g., Gatheral et al., 2018; Bayer et al., 2016; Bennedsen et al., 2017a, b; Jacquier et al., 2017). In such applications, it is imperative to be able to estimate and conduct inference on the key parameter in these models, the fractal index. Many estimators of this parameter exist (see Gneiting et al., 2012, for a survey); however, the underlying assumptions behind the various estimators, as well as their asymptotic properties, are often different and rarely stated in a clear and concise manner. This can make analysis difficult for the practitioner as well as for the researcher.

This paper aims to make empirical analysis easier in applications such as those mentioned above. We clearly lay out a large and coherent framework – including the relevant underlying assumptions – for analysing time series data which are potentially fractal-like. We focus on a specific estimator, which is arguably the most widely used in practice and which in our experience is the most accurate. Further, the estimator is easy to implement – it relies on a simple OLS regression – and its asymptotic properties are easy to apply. Our hope is that this will provide a transparent guide to analyzing fractal data using sound statistical methods.

The main contribution of the paper is to lay out the theory of the estimator and provide its theoretical underpinnings, stating the results in a manner that makes their application straightforward. For this, we rely heavily on earlier theoretical work on the increments of fractal processes, most notably Barndorff-Nielsen et al. (2009) and Barndorff-Nielsen et al. (2011). We further investigate the estimator numerically to gauge its properties when applied to data, leading to a number of practical recommendations for implementation. Most importantly, we advocate a different choice of bandwidth parameter for the estimator than what is generally accepted practice in the literature, cf. Section 3.1.

In their survey of the asymptotic theory of various estimators of the fractal index, Gneiting et al. (2012, Section 3.1) report that “a general non-Gaussian theory remains lacking”. The second contribution of this paper is to extend the estimation theory beyond the Gaussian paradigm. We accomplish this by volatility modulation, which turns out to be a convenient way of extending the theory to a large class of non-Gaussian processes. As will be seen, this results in conditionally Gaussian processes for which the fractal theory continues to hold. Again we clearly lay out the relevant assumptions and focus on the interpretation of the results and the implementation of the methods.

The final contribution of the paper is an in-depth study of the case where the data are contaminated by noise, such as measurement noise. We prove that noise will bias estimates of the fractal index downwards, thereby making noise-contaminated data look more rough than the underlying process actually is. We go on to propose a novel way to construct an estimator which is robust to noise in the observations. The new estimator also relies on an OLS regression and is just as easy to implement as the standard (non-robust) estimator studied in the first part of the paper. We present the asymptotic theory concerning the robust estimator and propose a hypothesis test, which can be used to formally test for the presence of noise in the observations.

The rest of the paper is structured as follows. Section 2 presents the mathematical setup and assumptions and gives some examples of the kinds of processes we have in mind. The section then goes on to consider some extensions to the basic setup, most notably the extension to non-Gaussian processes. Section 3 presents the semiparametric estimator of the fractal index and its asymptotic properties. Then, in Section 3.2, we consider the case where the observations have been contaminated by noise and present asymptotic theory for a new estimator in this case; Section 3.2.1 presents a formal test for the presence of noise. Section 4 contains small simulation studies, illustrating the finite sample properties of the asymptotic results presented in the paper. Finally, Section 5 contains two illustrations of the methods: the first using measurements of the longitudinal component of a turbulent velocity field, and the second using a time series of financial prices. Section 6 concludes and gives some directions for future study. Proofs of technical results and some mathematical derivations are given in an appendix.

Let (Ω,F,P) be a probability space satisfying the usual conditions and supporting X, a one-dimensional, zero-mean stochastic process with stationary increments. Define the p-th order variogram of X:

γp(h;X):=E[|Xt+h−Xt|p],h∈R.
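In practice, γp is replaced by its empirical counterpart, the sample mean of the p-th absolute power of the lag-k increments. A minimal sketch (the function name and interface are ours, not the paper's):

```python
import numpy as np

def empirical_variogram(x, k, p=2.0):
    """Empirical p-th order variogram at lag k/n from n equidistant
    observations x: the mean of |X_{(i+k)/n} - X_{i/n}|^p over all pairs."""
    x = np.asarray(x, dtype=float)
    increments = x[k:] - x[:-k]
    return np.mean(np.abs(increments) ** p)
```

For a standard Brownian motion, γ2(h;X)=h, so the estimate at lag k/n should be close to k/n.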

As we intend to make use of the theory developed in Barndorff-Nielsen et al. (2009, 2011) we adopt the assumptions of those papers. The assumptions are standard in the literature on fractal processes and are as follows.


For some α∈(−1/2,1/2),

γ2(x;X)=x2α+1L(x),x∈(0,∞),

(2.1)

where L:(0,∞)→[0,∞) is continuously differentiable and bounded away from zero in a neighborhood of x=0. The function L is assumed to be slowly varying at zero, in the sense that limx→0 L(tx)/L(x)=1 for all t>0.

d²γ2(x;X)/dx² = x2α−1L2(x) for some slowly varying (at zero) function L2, which is continuous on (0,∞).

There exists b∈(0,1) with

limsupx→0 supy∈[x,xb] |L2(y)/L(x)| < ∞.

There exists a constant C>0 such that the derivative L′ of L satisfies

Remark 2.1.

Remark 2.2.

The technical assumption (4) is only needed for the asymptotic normality of the estimator of α and not for consistency.

The parameter α∈(−1/2,1/2) is termed the fractal index because, under mild assumptions, it is related to the fractal dimension D=3/2−α of the sample paths of the process X (Falconer, 1990; Gneiting et al., 2012). It is also referred to as the roughness index of X, since the value of α reflects itself in the pathwise properties of X, as the following result formalizes.

Proposition 2.1.

Let X be a Gaussian process with stationary increments satisfying (1) with fractal index α∈(−1/2,1/2). Then there exists a modification of X which has locally Hölder continuous trajectories of order ϕ for all ϕ∈(0,α+1/2).

Proposition 2.1 shows that α controls the degree of (Hölder) continuity of X. In particular, negative values of α correspond to X having rough paths, while positive values of α correspond to smooth paths. It is well known that Brownian motion has α=0. In Table 1 we give some parametric examples of the kinds of processes we have in mind and comment on how they fit into the setup of the present paper; the examples are taken from Table 1 in Gneiting et al. (2012).

Class        | Autocorrelation function ρ(x)                          | Slowly varying function  | Parameters
fBm          | −                                                      | L(x)=β                   | α∈(−1/2,1/2)
Matérn       | ρ(x)=2^(1/2−α)/Γ(α+1/2)·|βx|^(α+1/2)K_(α+1/2)(|βx|)   | L(x)=2x^(−2α−1)(1−ρ(x))  | α∈(−1/2,1/2)^a
Powered exp. | ρ(x)=exp(−|βx|^(2α+1))                                 | L(x)=2x^(−2α−1)(1−ρ(x))  | α∈(−1/2,1/2)^b
Cauchy       | ρ(x)=(1+|βx|^(2α+1))^(−τ/(2α+1))                       | L(x)=2x^(−2α−1)(1−ρ(x))  | α∈(−1/2,1/2)^b, τ>0
Dagum        | ρ(x)=1−(|βx|^(2τ+1)/(1+|βx|^(2τ+1)))^((2α+1)/(2τ+1))   | L(x)=2x^(−2α−1)(1−ρ(x))  | τ∈(−1/2,1/2)^c, α∈(−1/2,τ)

Parametric examples of Gaussian fractal processes. “fBm” is the fractional Brownian motion; “Powered exp.” is the powered exponential process. β>0 is a scale parameter and α is the fractal index. The processes fulfill assumptions (1)–(3) for the parameter ranges given in the rightmost column; a letter superscript indicates that the parameter range is different under (4). a: (4) valid for α∈(−1/2,1/4). b: (4) valid for α∈(−1/4,1/2). c: (4) valid for τ∈[−1/4,1/2).

Table 1: Parametric examples of Gaussian fractal processes

To get an intuitive understanding of how the trajectories of the fractal processes look, and in particular how the value of α reflects itself in the roughness of the paths, Figure 1 plots three simulated trajectories of the Matérn process. It is evident how negative values of α correspond to very rough paths, while the paths become smoother as α increases.

Figure 1: Simulations of the unit-variance Matérn process, cf. Table 1, with β=1, α as indicated above the plots, and n=500 observations on the unit interval. The same random numbers were used for all three instances.
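Trajectories like those in Figure 1 can be generated by Cholesky factorization of the covariance matrix of the target Gaussian process; below is a sketch for the fBm of Table 1 (exact but O(n³); the function name is ours):

```python
import numpy as np

def simulate_fbm(n, alpha, seed=None):
    """Exact simulation of fractional Brownian motion on the grid
    {1/n, ..., 1} via Cholesky factorization; H = alpha + 1/2 is the
    Hurst index."""
    H = alpha + 0.5
    t = np.arange(1, n + 1) / n
    s, u = np.meshgrid(t, t)
    cov = 0.5 * (s**(2 * H) + u**(2 * H) - np.abs(s - u)**(2 * H))
    chol = np.linalg.cholesky(cov + 1e-12 * np.eye(n))  # small jitter
    rng = np.random.default_rng(seed)
    return chol @ rng.standard_normal(n)
```

As in Figure 1, paths generated with α<0 look visibly rougher than those with α>0.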

The processes in Table 1 are all Gaussian. However, in many applications it is preferable to have rough processes which are both fractal and non-Gaussian (Gneiting et al., 2012, Section 3.1). In the following section we suggest an extension of the above setup that explicitly results in non-Gaussian processes with fractal properties, by considering processes which are volatility modulated.

2.1 Extension to stochastic volatility processes

A flexible way to introduce non-Gaussianity of processes for which the theory of the fractal index continues to hold, is through volatility modulation. Following Barndorff-Nielsen et al. (2009), consider processes of the form

Xt = X0 + ∫_0^t σs dGs,  t≥0,

(2.2)

where X0∈R, σ=(σt)t≥0 is a stochastic volatility process, and G=(Gt)t≥0 is a zero-mean Gaussian process with stationary increments satisfying (1)–(4), e.g. one of the processes from Table 1. The modulation of the increments of G by the stochastic volatility process is a convenient way of introducing non-Gaussianity. To see this, note that the marginal distribution of Xt, conditional on the past of the stochastic volatility process and the starting value X0, is

Xt | (σs, s∈[0,t]; X0) ∼ N(X0, ∫_0^t σ2x dx),  t≥0.

In other words, the marginal distribution of Xt is a normal mean-variance mixture distribution, where the distribution of the stochastic process σ and initial value X0 determine the mixture.
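As a toy illustration of this mixture (the discretization, names, and parameter choices are ours; G is taken to be a Brownian motion and σ constant along each path), modulating Gaussian increments by a random volatility level already produces a marginal with fat tails:

```python
import numpy as np

def vol_modulated_path(n, sigma_path, seed=None):
    """Euler-type discretization of X_t = int_0^t sigma_s dG_s with G a
    Brownian motion: partial sums of sigma_{i/n} * (W_{(i+1)/n} - W_{i/n})."""
    rng = np.random.default_rng(seed)
    dW = rng.standard_normal(n) / np.sqrt(n)
    return np.cumsum(np.asarray(sigma_path) * dW)

rng = np.random.default_rng(0)
terminal = np.empty(4000)
for i in range(4000):
    sigma = 0.5 if rng.random() < 0.5 else 2.0   # two-point mixing distribution
    terminal[i] = vol_modulated_path(50, np.full(50, sigma), seed=i)[-1]
z = terminal - terminal.mean()
# ~0 for a Gaussian marginal; positive here because of the variance mixing
excess_kurtosis = (z**4).mean() / (z**2).mean()**2 - 3
```

The positive excess kurtosis reflects that the marginal law is a normal variance mixture rather than a Gaussian.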

For the integral in (2.2) to be well defined (in a pathwise Riemann–Stieltjes sense), we require that σ has finite q-variation for some q < 1/(1/2−α). Intuitively, this means that the “more rough” G is, the “less rough” σ can be. Under these conditions on G and σ, the process X in (2.2) will inherit the fractal properties of the driving process G, as shown in Barndorff-Nielsen et al. (2009).

For the central limit theorems developed below to hold, we further require another assumption on σ.


For any q>0, it holds that

E[|σt−σs|q]≤Cq|t−s|ξq,t,s∈R,

for some ξ>0 and Cq>0.

As pointed out in Bennedsen et al. (2017a), the requirement that σ has finite q-variation for a q < 1/(1/2−α) can be quite restrictive. For instance, if α<0 (i.e., G is rough) then σ cannot be driven by a standard Brownian motion. A very convenient and tractable process which does not suffer from these restrictions is the Brownian semistationary process, which we consider next.

The Brownian semistationary process

The Brownian semistationary (BSS) process is defined as

Xt = ∫_{−∞}^t g(t−s) σs dWs,  t∈R,

(2.3)

where W is a Brownian motion on R, σ=(σt)t∈R a stationary process, and g a Borel measurable function such that ∫_{−∞}^t g(t−s)2 σ2s ds < ∞ a.s. See, e.g., Bennedsen et al. (2017a) for further details of the BSS process. The BSS process is also a normal mean-variance mixture:

Xt | (σs, s≤t) ∼ N(0, ∫_0^∞ g(x)2 σ2t−x dx),  t≥0.

It is interesting to note that Barndorff-Nielsen et al. (2013) show that for a particular choice of kernel function g and stochastic volatility process σ, X will have a marginal distribution of the ubiquitous Normal Inverse Gaussian type.

We need to impose some technical assumptions on the kernel function g. They are as follows.


It holds that

g(x)=xαLg(x), where Lg is slowly varying at zero.

g′(x)=xα−1Lg′(x), where Lg′ is slowly varying at zero, and, for any ϵ>0, we have g′∈L2((ϵ,∞)). Also, for some a>0, |g′| is non-increasing on the interval (a,∞).

For any t>0,

Ft := ∫_1^∞ |g′(x)|2 σ2t−x dx < ∞.

The kernel function gives the BSS framework great flexibility. A particularly useful kernel function which has been applied in a number of studies, e.g. Barndorff-Nielsen et al. (2013) and Bennedsen (2017), is the so-called gamma kernel.

Example 2.1 (Γ-BSS process).

Let g be the gamma kernel, i.e. g(x)=xαe−λx for α∈(−1/2,1/2) and λ>0. The resulting process

Xt = ∫_{−∞}^t (t−s)α e−λ(t−s) σs dWs,  t≥0,

is called the (volatility modulated) Γ-BSS process. It is not hard to show that this process fulfills assumptions (1)–(4) and (BSS); see Example 2.3 in Bennedsen et al. (2017b).
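The Γ-BSS process can be visualized with a crude Riemann-sum discretization of the integral above, truncated at lag T; this scheme and all names are our own sketch, not a method from the paper:

```python
import numpy as np

def gamma_bss(n, alpha, lam, seed=None, T=5.0):
    """Riemann-sum approximation of X_t = int_{-inf}^t (t-s)^alpha
    e^{-lam(t-s)} sigma_s dW_s on the grid t = j/n, j = 1..n, with the
    integral truncated at t - T and (for simplicity) sigma = 1."""
    rng = np.random.default_rng(seed)
    m = int(T * n)                            # pre-sample points for the tail
    dW = rng.standard_normal(n + m) / np.sqrt(n)
    x = np.empty(n)
    for j in range(1, n + 1):
        lags = np.arange(1, j + m + 1) / n    # t - s > 0 on the grid
        kernel = lags**alpha * np.exp(-lam * lags)
        x[j - 1] = np.sum(kernel * dW[:j + m][::-1])
    return x
```

For α=−0.2 and λ=1, the stationary variance is ∫_0^∞ x2α e−2λx dx = Γ(2α+1)/(2λ)^(2α+1) ≈ 0.98, which the discretization should roughly reproduce.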

Remark 2.3.

In Bennedsen et al. (2017b) it was shown that BSS processes satisfying (1)–(3), (SV), and (BSS) will have the same fractal and continuity properties as their Gaussian counterparts: for such a BSS process, Proposition 2.1 continues to hold. In other words, X will have a modification with Hölder continuous trajectories of order ϕ for all ϕ∈(0,α+1/2).

2.2 Extension to processes with non-stationary increments

When the increments of X are non-stationary, an approach similar to the one in Bennedsen et al. (2017b) can be adopted as follows. Define the time-dependent variogram

γ2(h,t) := E[|Xt+h−Xt|2] = C2,t h2α+1 L(h),

(2.4)

where again C2,t>0, α∈(−1/2,1/2), and L is a slowly varying function at zero. The methods considered in this paper apply – mutatis mutandis – also to such processes. An example is the truncated Brownian semistationary process.

Xt = X0 + ∫_0^t g(t−s) σs dWs,  t≥0,

where X0∈R, W is a Brownian motion, and σ a stochastic volatility process. Bennedsen et al. (2017b) call such a process a truncated BSS (TBSS) process. When X satisfies (1)–(3) and (SV), Bennedsen et al. (2017b) show that α is indeed the fractal index of X, in the sense of γ2(h,t) satisfying (2.4). We note that processes similar to the TBSS process (with σt=1 for all t) have recently been proposed as models of stochastic log-volatility of financial assets, e.g., Gatheral et al. (2018); Bayer et al. (2016).

2.3 Summary of assumptions

Above we introduced a number of processes, differing in important ways, most notably through their distributional properties. In spite of these differences, the results presented in this paper will apply equally to all of them. To ease notation, we briefly summarize the assumptions here.

The first set of assumptions is required for consistency of the estimator of the fractal index α.

(b) X is defined by (2.2), satisfying (1) for an α∈(−1/2,1/4), as well as (4). The process σ additionally fulfills (SV).

(c) X is a BSS process, defined by (2.3), satisfying (1) for an α∈(−1/2,1/4), as well as (4). The process σ additionally fulfills (SV).

Remark 2.4.

As seen from the assumptions, the central limit theorems will not be applicable for α≥1/4. In fact, a central limit theorem does hold in this case, but with a convergence rate and limiting distribution different from what we derive below. When α=1/4, the convergence rate is √(n/logn) and the limiting distribution is zero-mean Gaussian with an asymptotic variance different from the case α<1/4. When α>1/4, the convergence rate is n1−2α and the limiting distribution is of the Rosenblatt type, see Taqqu (1979). If one is interested in the range α∈[1/4,1/2) and desires asymptotic normality results similar to those below, we recommend using gaps between the observations as in Corcuera et al. (2013), Remark 4.4; the downside of this approach is that one is forced to throw away observations. Given the results presented below, filling in the details of this approach is straightforward, albeit notationally cumbersome. Since the case of very smooth processes, i.e. α≥1/4, seems of limited practical value, we do not pursue this further here.

Consider n equidistant observations X1/n,X2/n,…,X1 of the stochastic process X, observed over a fixed time interval, which we without loss of generality take to be the unit interval, so that the time between observations is 1/n. As n→∞, this gives rise to so-called in-fill asymptotics. In what follows, suppose that the process X satisfies assumptions (1)–(3).

When X is Gaussian, it holds, by standard properties on the (absolute) moments of the Gaussian distribution and (2.1), that

γp(h;X) = Cp |h|^((2α+1)p/2) Lp(h),  h∈R,

(3.1)

where p>0, the function Lp(h):=L(h)p/2 is slowly varying at zero, and Cp>0 is a constant. This motivates the regression

log^γp(k/n;X) = b + a log|k/n| + Uk,n + ϵk,n,  k=1,2,…,m,

(3.2)

where ^γp denotes the empirical variogram and m≥2 is a bandwidth parameter. The OLS estimate ^aOLS of the slope yields the estimator of the fractal index,

^α := ^aOLS/p − 1/2.

(3.4)

This estimator is well known and much used in the literature, e.g. Gneiting and Schlather (2004); Gatheral et al. (2018); Bennedsen et al. (2017a). The following proposition shows the consistency of the OLS estimator of α.
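Concretely, the procedure amounts to a few lines of code (naming is ours): compute the empirical variogram at lags k=1,…,m, regress its logarithm on log(k/n), and transform the OLS slope. A sketch for general p:

```python
import numpy as np

def estimate_alpha(x, m=5, p=2.0):
    """OLS estimator of the fractal index: the slope of
    log gamma_hat_p(k/n) on log(k/n), k = 1..m, estimates (2*alpha+1)*p/2,
    so alpha_hat = slope/p - 1/2."""
    n = len(x)
    lags = np.arange(1, m + 1)
    gam = np.array([np.mean(np.abs(x[k:] - x[:-k]) ** p) for k in lags])
    slope = np.polyfit(np.log(lags / n), np.log(gam), 1)[0]
    return slope / p - 0.5
```

Applied to a Brownian motion (α=0), the output should be close to zero.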

Proposition 3.1.

A number of studies have considered the asymptotic properties of the OLS estimates coming from (3.4), e.g. Constantine and Hall (1994), Davies and Hall (1999), and Coeurjolly (2001, 2008). For a brief summary of this literature, see Gneiting et al. (2012), Section 3.1. The following theorem presents the details in the context of this paper.

Remark 3.1.

The limit in (3.5) exists for k,v=1,…,m by Breuer and Major (1983), Theorem 1. See also Remark 3.3 in Corcuera et al. (2013).

Perhaps surprisingly, Theorem 3.1 shows that the asymptotic distribution of the OLS estimator does not depend on the precise structure of the underlying process X, but only on the value of the fractal index α, through the correlation structure of the increments of a fractional Brownian motion (fBm) with Hurst index H=α+1/2, and possibly the “heteroskedasticity factor” Sp. The reason is that the increments of a process X fulfilling assumption (1) have the same small scale behavior as the increments of the fBm. To see this, write

rn(j) := Corr(X(j+1)/n − Xj/n, X1/n − X0)

= [γ2((j+1)/n;X) − 2γ2(j/n;X) + γ2((j−1)/n;X)] / [2γ2(1/n;X)]

→ (1/2)(|j+1|2α+1 − 2|j|2α+1 + |j−1|2α+1),  n→∞,

(3.7)

by assumption (1) and the properties of slowly varying functions. We recognize (3.7) as the correlation function of the increments of an fBm with Hurst index H=α+1/2. As shown in the proof of Theorem 3.1, this will imply that the asymptotic variance of the estimator, σ2m,p, is the same for all Gaussian processes fulfilling assumptions (1)–(4), including the fBm. However, as the theorem also shows, the asymptotic distribution proves to be slightly different when we consider conditionally Gaussian processes. In this case, the stochastic volatility component σ introduces heteroskedasticity, which results in the extra factor Sp in the central limit theorem. To make inference feasible in practice, we need to estimate this factor. For this, define

Proposition 3.2.

Proposition 3.2 shows that ˆSp of (3.8) is a suitable estimator for our purpose: when X is Gaussian, the factor is asymptotically irrelevant, while when X is non-Gaussian (volatility modulated) it provides the correct normalization. This justifies including the factor ˆSp whether or not one believes the data are Gaussian, at least when any potential non-Gaussianity is volatility induced. In fact, the following corollary is a straightforward consequence of Theorem 3.1, Proposition 3.2, and the properties of stable convergence; the corollary has obvious applications to conducting feasible inference and constructing confidence intervals for α.

Corollary 3.1.

where “d” denotes convergence in distribution and σ2m,p(^α) denotes the asymptotic variance calculated using the estimate ^α.

Remark 3.2.

When using Corollary 3.1 for hypothesis testing, we recommend calculating σ2m,p(⋅) using the value of α under the null, instead of ^α.

To apply the above results we need to calculate the factor σ2m,p, which boils down to calculating the entries of the matrix Λp given in equation (3.5). Unfortunately, this is only feasible when p=2 and becomes increasingly cumbersome as m increases. (The already tedious calculation for p=m=2 is given in Bennedsen et al., 2016, Appendix B.). For this reason, we recommend Monte Carlo estimation of σ2m,p; in fact, we suggest using the finite sample analogue of this factor. The procedure is detailed in Appendix B; in the next section we present an example of the output, when we study the effect of the choice of bandwidth, m.
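The Monte Carlo idea can be sketched as follows (the helper names are ours, and Appendix B may differ in details): simulate B fBm sample paths at the relevant α, apply the estimator to each, and take the sample variance of the estimates as the finite sample analogue of the asymptotic variance:

```python
import numpy as np

def fbm_path(n, H, rng):
    """Exact fBm sample on {1/n, ..., 1} via Cholesky factorization."""
    t = np.arange(1, n + 1) / n
    s, u = np.meshgrid(t, t)
    cov = 0.5 * (s**(2 * H) + u**(2 * H) - np.abs(s - u)**(2 * H))
    return np.linalg.cholesky(cov + 1e-12 * np.eye(n)) @ rng.standard_normal(n)

def alpha_hat(x, m, p=2.0):
    """Variogram-regression estimator of the fractal index."""
    n, lags = len(x), np.arange(1, m + 1)
    gam = [np.mean(np.abs(x[k:] - x[:-k]) ** p) for k in lags]
    return np.polyfit(np.log(lags / n), np.log(gam), 1)[0] / p - 0.5

def mc_variance(alpha, n, m, B=500, p=2.0, seed=0):
    """Monte Carlo approximation of the finite sample variance of alpha_hat."""
    rng = np.random.default_rng(seed)
    estimates = [alpha_hat(fbm_path(n, alpha + 0.5, rng), m, p) for _ in range(B)]
    return np.var(estimates)
```

Comparing the output for m=2 against m=5 at α=−0.2 reproduces the variance-reduction pattern discussed in Section 3.1.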

3.1 Choosing the bandwidth parameter

The choice of bandwidth parameter m is, in general, an open problem. Standard practice in the literature is to set m=2 (Gneiting et al., 2012, Section 2.3). Indeed, Constantine and Hall (1994) argue that the bias of the estimator increases with m, and Davies and Hall (1999) present simulation evidence that the optimal value, in terms of mean squared error, is m=2. Setting m=2 amounts to estimating α by drawing a straight line between only the two points closest to the origin, log^γp(1/n;X) and log^γp(2/n;X), when running the OLS regression in (3.2). While tempting from a bias viewpoint, we conjecture that relying on just two points in the regression can result in increased variance of the estimator. In what follows, we examine this in more depth. To be specific, we consider the effect that the bandwidth has on the estimator of the fractal index: first on the theoretical (finite sample) variance of ^α, as derived in Theorem 3.1 (Figure 2), and then on the finite sample bias and mean squared error of the estimator when applied to simulated paths of the various processes of Table 1 (Figure 3). For these investigations, we consider both α=−0.20 (rough case) and α=0.20 (smooth case).

Figure 2: Monte Carlo approximation (B=10000 replications) of the finite sample analogue of the variance of ^α, σ2m,p,n≈n−1σ2m,p, cf. Theorem 3.1. The true value of α is indicated above the plots. See Appendix B for details of the calculations.

Figure 2 studies the effect that the choice of bandwidth has on the variance of the estimator of α: we plot the approximation of the finite sample variance of ^α, σ2m,p,n, which is approximately equal to n−1σ2m,p, cf. Theorem 3.1. From the figure, we see that the choice of bandwidth indeed has an effect on the variance of the OLS estimator of α. Interestingly, the effect is very different in the rough case as compared to the smooth case. In the former, it is evident from the top left plot of Figure 2 that the variance is minimized by an intermediate value of m such as m=5 or m=10. To further investigate this, the top right plot shows the ratio between the finite sample variance when m=2 and when m∈{5,10,25}. Numbers less than one indicate that the variance of the estimator with m=2 is greater than the variance of the corresponding estimator with m>2, and vice versa. These ratios seem quite stable as a function of sample size n, and it is evident that, from a variance standpoint, it is preferable to choose an intermediate m>2; indeed, the variance of the estimator is reduced by approximately 40% when going from m=2 to m=5. These conclusions are reversed when we consider the smooth case, α=0.20, in the bottom row: here it seems that m=2 is optimal.

Figure 3: Monte Carlo approximation (B=10000 replications) of the finite sample bias (left) and mean squared error (MSE, right) of ^α, as function of the bandwidth m, for the processes of Table 1. We set p=2 and n=1000. The parameter values are given in the text.

We further investigate this through simulations as in Davies and Hall (1999): Figure 3 plots the bias (left) and mean squared error (right) of the estimator (3.4), as a function of the bandwidth m, for the five parametric processes of Table 1. To calculate the finite sample bias and mean squared error of the estimator, we simulate B=10000 instances of each process, each with n=1000 observations; the true value of the fractal index in this exercise is α=−0.2 (top row) and α=0.2 (bottom row). The scale parameter is set to β=1. For the Cauchy and Dagum processes we additionally set τ=1 and τ=0, respectively. When looking at the rough case, α=−0.2, the same conclusion as above emerges: even though the bias does increase with m, as expected, it is clear that the mean squared error is minimized for an m>2. In this case, i.e. for these parameter values and this sample size, the minimum is attained between m=5 and m=10 for all five processes. We again conclude that an intermediate value for the bandwidth is preferable in finite samples when α<0. The smooth case, α=0.2, also matches what we found above: indeed, we find that both bias and mean squared error increase with increasing m, so here m=2 seems optimal.

In conclusion, the evidence of this section suggests that when the underlying process is rough, the optimal choice of bandwidth is some m>2, and we recommend an intermediate value such as m=5. In contrast, when the process is smooth, m=2 is preferable. Although setting m=2 seems to be accepted practice in the literature, we believe that the rough case of α<0 is arguably more relevant in empirical applications. For this reason we suggest using an intermediate value for the bandwidth parameter, unless one has reason to believe the underlying data to be smooth.

3.2 Asymptotic theory in the presence of additive noise

Consider now the situation, where the observations of X, satisfying (1)–(3), are contaminated by additive noise; that is, instead of observing X, we observe the process Z, given by

Zj/n:=μ+Xj/n+uj,j=1,2,…,n,

(3.9)

where μ∈R is a constant and u={uj}_{j=1}^n is a Gaussian iid noise sequence with mean zero and variance σ2u:=Var(u1)≥0. (When σ2u=0, the noise is absent from the observations.)

Since we observe Z, and not X, what is relevant for us is the “contaminated”, or “noisy”, variogram, i.e. the variogram of the observation process Z:

γ2(h;Z)=E[|Zt+h−Zt|2]=γ2(h;X)+2σ2u=h2α+1L(h)+2σ2u,h∈R,

(3.10)

where the last equality follows from Assumption (1). From this we see that when σ2u>0, logγ2(h;Z) will not be linear in logh; hence the estimator (3.4) of α will not be applicable. Indeed, it is not hard to show that this estimator will be downward biased in the presence of noise, i.e. when applied to ^γp(⋅;Z). In fact, the following is true.

Proposition 3.3.

Suppose that the observations of a process Z are given by (3.9) with σ2u>0, where X satisfies assumption (1). Fix p>0, m∈N, and let ^α=^αp,m be the OLS estimator of α from (3.4) using the contaminated version of the empirical variogram ^γp(⋅;Z) in place of ^γp(⋅;X) in the regression (3.2). Now,

^α →P −1/2,  n→∞.

Proposition 3.3 shows that if the data are contaminated by noise, then estimates of the parameter α will be biased downwards towards −1/2, i.e. the lowest permissible value for α. In other words, if the data are contaminated by noise, then the estimator of α considered above, will lead one to conclude that the data are more rough than what is actually the case for the underlying process X. This is an important point to note for the practitioner: when finding evidence of roughness (i.e. α<0) in data, it is crucial to consider whether this is due to an intrinsic property of the underlying data generating mechanism or whether it could simply be the product of noise, e.g. measurement noise.
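The downward bias is easy to reproduce numerically; a sketch under our naming, with X a Brownian motion (α=0) and Gaussian measurement noise added to the observations:

```python
import numpy as np

def alpha_hat(x, m=5, p=2.0):
    """Non-robust OLS estimator of the fractal index (variogram regression)."""
    n, lags = len(x), np.arange(1, m + 1)
    gam = [np.mean(np.abs(x[k:] - x[:-k]) ** p) for k in lags]
    return np.polyfit(np.log(lags / n), np.log(gam), 1)[0] / p - 0.5

rng = np.random.default_rng(1)
n = 20000
bm = np.cumsum(rng.standard_normal(n)) / np.sqrt(n)   # alpha = 0
noisy = bm + 0.05 * rng.standard_normal(n)            # Z = X + u
clean_est = alpha_hat(bm)     # close to the true value 0
noisy_est = alpha_hat(noisy)  # dragged towards -1/2 by the noise
```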

Fortunately, it is possible to account for the noise when estimating α and arrive at a consistent estimator. For instance, Bennedsen et al. (2017a) suggest a noise-robust estimator based on a non-linear least squares regression; however, this estimator does not allow for the slowly varying function L and requires the interval over which the process is observed to grow. Here, therefore, we propose an alternative noise-robust estimator which is valid in our in-fill asymptotics setup and again relies on a simple OLS regression.

For an integer κ≥2, define

fp(h;Z,κ) := γp(κh;Z)2/p − γp(h;Z)2/p,  h>0.

By (3.10), the additive noise contribution 2σ2u cancels in this difference, and one can show that fp(h;Z,κ)=h2α+1L∗p(h), where the function L∗p is slowly varying at zero. From this, it is clear that the logarithm of fp(h;Z,κ) is – up to the slowly varying function L∗p – linear in logh. This motivates a linear regression as the one in (3.2) with log^fp in place of log^γp:

log^fp(k/n;Z,κ)=b∗+a∗log|k/n|+U∗k,n+ϵ∗k,n,k=1,2,…,m,

(3.12)

where

^fp(k/n;Z,κ) := ^γp(κk/n;Z)^(2/p) − ^γp(k/n;Z)^(2/p)

is the empirical estimate of the function f, which is feasible to calculate from the observations Zj/n. Define the noise robust estimate of α as

^α∗ := ^a∗OLS/2 − 1/2,

(3.13)

where ^a∗OLS is the OLS estimate of a∗ from the linear regression (3.12), analogous to (3.4) with ^fp in place of ^γp. We can prove the following.

Proposition 3.4.

Suppose that the observations of a process Z are given by (3.9) with σ2u≥0, where X satisfies assumption (1). Fix p>0, m∈N, and let ^α∗=^α∗p,m be the OLS estimator of α from (3.13). Now,

^α∗ →P α,  n→∞.

Remark 3.3.

Proposition 3.4 allows for σ2u=0, i.e. for there to be no noise in the observations. In other words, the robust estimator is a consistent estimator of α also in the absence of noise.
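The robust estimator is just as short to implement; a sketch with our naming, following (3.13):

```python
import numpy as np

def variogram(x, k, p=2.0):
    """Empirical p-th order variogram at lag k/n."""
    return np.mean(np.abs(x[k:] - x[:-k]) ** p)

def alpha_hat_robust(x, m=5, p=2.0, kappa=4):
    """Noise-robust OLS estimator: regress log f_hat_p(k/n; Z, kappa) =
    log(gamma_hat_p(kappa*k/n)^(2/p) - gamma_hat_p(k/n)^(2/p)) on
    log(k/n); the additive noise contribution cancels in the difference."""
    n = len(x)
    lags = np.arange(1, m + 1)
    f = np.array([variogram(x, kappa * k, p)**(2 / p)
                  - variogram(x, k, p)**(2 / p) for k in lags])
    slope = np.polyfit(np.log(lags / n), np.log(f), 1)[0]
    return slope / 2 - 0.5
```

Applied to noise-contaminated Brownian observations (true α=0), the estimate stays near zero, whereas the non-robust estimator is pulled towards −1/2.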

In Figure 4 we illustrate the use of Proposition 3.4 by calculating the bias and root mean squared error (RMSE) of the two OLS estimators given in (3.4) and (3.13), when applied to a process Z with σ2u>0. The details are provided in the caption of the figure. The former estimator is not robust to the noise in Z, while the latter is, per Proposition 3.4. It is clear how this manifests itself in a large bias in the OLS estimator (3.4). In fact, although the true value of the fractal index of the underlying process is α=−0.20, the mean OLS estimate coming from the non-robust estimator is −0.4608, i.e. almost at the lowest permissible value of −1/2. This is of course a consequence of Proposition 3.3. In contrast, the robust estimator (3.13) proposed in this section is practically unbiased for most values of the parameter κ, at least when κ≥4.

Although the results of this section hold for all integer κ≥2, the actual finite sample performance of the results can be quite sensitive to this tuning parameter, as also witnessed in Figure 4. The optimal choice of κ seems to depend on the number of observations n and the variance of the noise σ2u; an investigation into the exact way this is the case is beyond the scope of the present paper. In practice, we recommend that the researcher run some numerical experiments on simulated data under conditions similar to those of the practical experiment; simulation experiments such as the one in Figure 4 for example. We provide an example of how one can construct such a simulation experiment to arrive at a reasonable value for κ in Section 5.2, where we apply the robust estimator to a time series of financial prices.

The next result provides the central limit theorem, as it relates to the robust estimator.

Figure 4: Bias (left) and root mean squared error (RMSE, right) for the two OLS estimators (3.4) and (3.13); blue line and red line with crosses, respectively. The bias and RMSE were calculated from B=10000 Monte Carlo simulations. The underlying data generating process for X is n=2500 observations of an fBm with α=−0.20, while μ=1 and σ2u=0.05 were used for the noise sequence. The bandwidth is m=5.

Theorem 3.2.

Suppose that the observations of a process Z are given by (3.9) with σ2u≥0, where X satisfies assumption (1). Fix p>0, m∈N, and let ^α∗=^α∗p,m be the OLS estimator of α from (3.13). If (b) or (c) holds for X, we require ξ⋅min{p,1}>1/2, cf. assumption (SV). Now the following holds.

Remark 3.4.

A feasible central limit theorem in case 1 of Theorem 3.2 is straightforwardly constructed in the same way as in Corollary 3.1, including Monte Carlo estimation of σ2,∗m,p.

As shown in case 2 of Theorem 3.2, the presence of noise will unfortunately result in a convergence rate for ^α∗ slower than √n; indeed, the exact distribution of ^α∗ is difficult to derive and even harder to estimate feasibly.

A test for the presence of noise

Using the above, we can now construct a test for whether the observed time series Z contains noise or not. To be specific, we are interested in testing the null hypothesis

H0: σu=0  against the alternative  H1: σu>0.

(3.14)

Tests of this kind, in the context of time series of asset prices, were considered in Aït-Sahalia and Xiu (2018), where the authors develop a test for the presence of market microstructure noise in high frequency data. The test proposed here is similar in spirit to that of Aït-Sahalia and Xiu (2018), and in Section 5.2 we briefly consider testing for the presence of market microstructure noise in high frequency asset prices as well.

To devise the test, we consider the difference between the robust estimator ^α∗ from (3.13) and the usual (non-robust) estimator ^α from (3.4). From Propositions 3.1 and 3.4 it is immediately clear that, under H0, ^α∗−^α →P 0 as n→∞, while under H1, Propositions 3.3 and 3.4 imply that ^α∗−^α →P α+1/2>0.

Corollary 3.2.

The applicability of Corollary 3.2 for testing whether a fractal process is contaminated by noise is obvious.
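One simple way to operationalize the test numerically is to compare ^α∗−^α with Monte Carlo quantiles of its null distribution; the sketch below (all names ours) uses a Brownian-motion null for simplicity and is not the formal procedure of Corollary 3.2:

```python
import numpy as np

def _vario(x, k, p=2.0):
    return np.mean(np.abs(x[k:] - x[:-k]) ** p)

def alpha_naive(x, m=5, p=2.0):
    """Non-robust variogram-regression estimator of the fractal index."""
    n, lags = len(x), np.arange(1, m + 1)
    g = [np.log(_vario(x, k, p)) for k in lags]
    return np.polyfit(np.log(lags / n), g, 1)[0] / p - 0.5

def alpha_robust(x, m=5, p=2.0, kappa=4):
    """Noise-robust estimator based on differenced variograms."""
    n, lags = len(x), np.arange(1, m + 1)
    f = [np.log(_vario(x, kappa * k, p)**(2 / p) - _vario(x, k, p)**(2 / p))
         for k in lags]
    return np.polyfit(np.log(lags / n), f, 1)[0] / 2 - 0.5

def noise_test(x, B=200, level=0.05, seed=0):
    """Reject H0 (no noise) when alpha_robust - alpha_naive exceeds the
    (1 - level) Monte Carlo quantile of its null distribution, here
    simulated under a Brownian-motion null for simplicity."""
    rng = np.random.default_rng(seed)
    n = len(x)
    stat = alpha_robust(x) - alpha_naive(x)
    null = np.empty(B)
    for b in range(B):
        bm = np.cumsum(rng.standard_normal(n)) / np.sqrt(n)
        null[b] = alpha_robust(bm) - alpha_naive(bm)
    return stat > np.quantile(null, 1 - level), stat
```

For noise-contaminated observations the statistic is large and positive, since ^α is dragged towards −1/2 while ^α∗ is not.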

Remark 3.5.

Above we have assumed that the noise sequence u is Gaussian. However, one can show that all the results of Sections 3.2 and 3.2.1 apply to general iid noise sequences with finite variance when p=2. In other words, if the Gaussian assumption on the noise sequence u is not fulfilled – or seems too restrictive – then one should choose p=2 and apply the results of these sections.

To examine the finite sample properties of the central limit results presented above, we here conduct three small simulation studies and collect the results in Tables 2–4. In each study we let X be an fBm with Hurst index H=α+1/2, for various values of α, and simulate n observations on the interval [0,1]. Additional information on the exact simulation setups is given in the captions of the tables.

For a value α0∈(−1/2,1/4), Corollary 3.1 allows us to test the null hypothesis