For computing realized volatility, especially range-based volatility, trade (deal) prices are commonly used.
If Level I data is available, should trade prices still be used, or would another measure of the spot price (for example, the mid-price) be preferable?

Such measures could diminish the impact of the bid-ask spread, but what would the consequences be for the volatility measures?
Would the assumptions of the price process be violated, or quite the opposite: would the price process become closer to the theoretical one?

Assume that, for the range-based models, the price follows a Brownian motion with zero drift:

$dp_t = \sigma dW_t$, where $p = ln(P)$.

Which price measure would be better for them?
Does the answer change if we use different time windows (1, 5, 30, 60 minutes)?
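For concreteness, by realized volatility I mean the root of the sum of squared log returns over a window. A minimal sketch of how the sampling window enters, on prices simulated under the model above (all parameters hypothetical):

```python
import math
import random

random.seed(0)

# One 6.5-hour session of 1-second log prices under dp_t = sigma * dW_t
# (hypothetical per-second sigma; no drift, no microstructure noise).
sigma = 0.0002
n_seconds = int(6.5 * 3600)
p = [0.0]
for _ in range(n_seconds):
    p.append(p[-1] + sigma * random.gauss(0.0, 1.0))

def realized_vol(logp, step):
    """Root of the sum of squared log returns sampled every `step` points."""
    r = [logp[i] - logp[i - step] for i in range(step, len(logp), step)]
    return math.sqrt(sum(x * x for x in r))

# Under pure Brownian motion the estimate is window-invariant (up to
# sampling error); with bid-ask bounce, finer windows bias it upward.
for minutes in (1, 5, 30, 60):
    print(minutes, realized_vol(p, minutes * 60))
```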

There is no "right answer"; it depends on what exactly you're trying to do. If you need intraday estimates, look into intraday Garman-Klass (ask me and I'll tell you how to implement it); if you don't need intraday, do standard Garman-Klass, for which you'll need open, high, low, and close prices. If trade frequency is low, use mid-prices; if trade frequency is high, you can use trade prices alone. If you do intraday, I would bucket into 5 minutes.
–
frickskit May 24 '13 at 15:05
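For reference, the standard Garman-Klass estimator mentioned in the comment uses a single O/H/L/C bar per period; a minimal sketch with made-up prices:

```python
import math

def garman_klass_var(o, h, l, c):
    """Garman-Klass variance estimate for one O/H/L/C bar of prices."""
    return (0.5 * math.log(h / l) ** 2
            - (2.0 * math.log(2.0) - 1.0) * math.log(c / o) ** 2)

# Hypothetical daily bar; the result is a per-bar variance of the log
# price, so take the square root for a per-bar volatility.
var = garman_klass_var(o=100.0, h=102.0, l=99.0, c=101.0)
print(math.sqrt(var))
```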

I'm doing this intraday on tick data, and now I'm trying to adapt the methodology to a live data stream. The key idea was that, using Level 2 data, the error due to a change of trade side could be eliminated. If we take the open/high/low/close of the mid-price, wouldn't it be a more correct measure of the "true price" process that is modelled? There is no bid-ask spread in the model, so isn't it more correct to eliminate it? And why are trade prices better to use at high trade frequency?
–
Ilya May 27 '13 at 8:27

1 Answer

Which realized volatility you are attempting to measure is highly important in determining which prices and return series to use to compute it.

Here are a couple of ideas:

What are you attempting to measure: bid/offer spread volatility, traded-price variations, ...? Even if you are attempting to measure asset price variations, it can make a difference whether you use bid, offer, mid, or traded prices. Using bids or offers for this particular purpose can sometimes result in erratic moves (because quotes may fluctuate widely even though nobody trades on those bids and offers); mid-prices somewhat smooth that out, and traded prices are perhaps the most preferable in this regard. However, there are asset classes where you cannot easily get hold of traded prices, such as OTC currency price series. So, knowing exactly what you are attempting to measure will help to narrow down which price series to use.
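To illustrate the smoothing point, a toy quote stream (made-up numbers): a one-tick change on a single side of the book moves the mid by only half a tick, and the spread itself drops out of the mid entirely.

```python
# Hypothetical (bid, ask) quote stream; each update moves one side by a tick.
quotes = [(99.98, 100.02), (99.98, 100.03), (99.99, 100.03)]
mids = [(bid + ask) / 2.0 for bid, ask in quotes]
print(mids)  # each one-sided tick move shifts the mid by half a tick
```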

Which time compression do you target with your volatility measure? Are you dealing with tick-based data, compressed intraday data, or daily or weekly data? It can make a difference because some realized volatility models shine at capturing intraday variations, while others are better at measuring high-frequency price return variations or daily return volatility.

What specific volatility measure are you targeting? Garman-Klass, for example, specifies exactly which price series to use (open, high, low, close).

Taken together, these should let you determine exactly which price series to use.

I'm targeting intraday volatility patterns for volatility-process calibration and current-state estimation. I understand that Garman-Klass (as well as Parkinson, Meilijson, and Rogers-Satchell) specifies the measure that should be used, but, as I wrote in another comment, wasn't that methodology the result of the availability of O/H/L/C data? In the modelled process there is no bid-ask spread, so this standard measure carries that error. For example, on 5-minute candles the open/close can change substantially depending on the side of the corresponding deal. With the mid-price this effect is eliminated.
–
Ilya May 27 '13 at 8:41

@Ilya, I do not understand the second part of your comment. Which intraday volatility model do you apply? If it requires O/H/L/C data points, then just use those (mid-points of bid/offer, or trades); if your model dictates the use of compressed data points throughout the trading session, then compress as much as necessary to get sane readings. A 1-minute bucket where prices jump all over makes little sense; it is better to use a larger compression and end up with fewer buckets but better-quality data that produce meaningful results.
–
Matt Wolf May 27 '13 at 8:50
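The compression step described in the comment above can be sketched as plain tick-to-bar bucketing (hypothetical tick stream; timestamps in seconds from the open, 5-minute buckets):

```python
# Compress (timestamp, price) ticks into 5-minute O/H/L/C buckets.
ticks = [(10, 100.0), (70, 100.4), (150, 99.9), (310, 100.2), (340, 100.1)]
bucket_len = 300  # seconds per bucket

bars = {}
for t, px in ticks:
    key = t // bucket_len
    if key not in bars:
        bars[key] = [px, px, px, px]            # open, high, low, close
    else:
        o, h, l, _ = bars[key]
        bars[key] = [o, max(h, px), min(l, px), px]

for key in sorted(bars):
    print(key, bars[key])
```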

The key idea of the question is how the differences between real data and its theoretical model affect volatility estimators. I understand how the range-based statistics were developed. But the trade, bid/ask, and mid-price processes are quite different, and some of them could be better proxies for the modelled "true price" than others. Hypothetically, if during a 5-minute bucket neither the bid nor the ask price changed, and trades occurred at those prices, the range-based measures would depend on the spread size and on which side the opening and closing deals were made. In theory the volatility should be lower than estimated, I guess.
–
Ilya May 27 '13 at 13:07
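The hypothetical in the last comment is easy to simulate: hold the quotes fixed, let every trade print on the bid or the ask, and compare a range-based estimate (Garman-Klass here, purely as an example) on trade prices versus mid-prices. All numbers are made up.

```python
import math
import random

random.seed(1)

# Fixed quotes for the whole 5-minute bucket: the mid (the modelled
# "true price") never moves, but trades bounce between bid and ask.
bid, ask = 99.95, 100.05
trades = [random.choice((bid, ask)) for _ in range(100)]
mid = (bid + ask) / 2.0

def garman_klass_var(o, h, l, c):
    """Garman-Klass variance estimate for one O/H/L/C bar."""
    return (0.5 * math.log(h / l) ** 2
            - (2.0 * math.log(2.0) - 1.0) * math.log(c / o) ** 2)

gk_trades = garman_klass_var(trades[0], max(trades), min(trades), trades[-1])
gk_mid = garman_klass_var(mid, mid, mid, mid)

print(gk_trades)  # positive: pure spread bounce reads as volatility
print(gk_mid)     # zero: the mid-price bar removes the bounce
```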