Due to some economic/regime constraints, I only have access to non-full-tick data from an exchange.

To make the problem precise, full tick data $X$ is a series of $(t_i,p_i,v_i)$ for $0 \leq i \leq N$, where $t_i$ is the timestamp, $p_i$ is the price, and $v_i$ is the deal volume.

The data that I can actually see is a lower-resolution version $\hat{X}$ of $X$, in the sense that
I can only observe the market at a sequence $j_1 < j_2 < \ldots < j_m$
and get the data like: (the sequence is not necessarily deterministic or at fixed intervals)

Suppose I have only one source of $\hat{X}$. What is the best way to recover most of the missing ticks? I know this may be a bad question, as information has already been lost. I think I need to add some model assumptions and treat the problem from a Bayesian point of view. Are there any references for this?

Suppose I have two different sources of $\hat{X}$. Because of the random nature of the missing ticks, the two sources would differ. Is there any method to recover $X$ from them?

P.S. I think I can treat the tick data as a one-dimensional image, with the lower-resolution data as a pixelated version of the real image, and apply some image-processing technique to it. Any ideas?

2 Answers
2

If you have two sources, designate one as the primary feed and fill in its gaps from the secondary feed. Of course, you'll have to mind the timestamps when deciding whether a tick from the secondary feed is really missing from the primary one.
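A minimal sketch of this primary/secondary merge, in plain Python. The function name `merge_feeds`, the `tol` timestamp tolerance, and the duplicate rule (same price and volume, timestamps within `tol`) are assumptions made for illustration; a real feed may need exchange trade IDs or a different matching rule.

```python
from typing import List, Tuple

# Hypothetical tick layout: (timestamp, price, volume), as in the question.
Tick = Tuple[float, float, float]


def merge_feeds(primary: List[Tick], secondary: List[Tick],
                tol: float = 0.0) -> List[Tick]:
    """Keep every primary tick and add secondary ticks not seen in primary."""
    merged = list(primary)
    for t, p, v in secondary:
        # Assumption: a secondary tick duplicates a primary tick when price
        # and volume are equal and the timestamps differ by at most `tol`.
        duplicate = any(
            abs(t - pt) <= tol and p == pp and v == pv
            for pt, pp, pv in primary
        )
        if not duplicate:
            merged.append((t, p, v))
    # Stable sort: at equal timestamps, primary ticks stay ahead of fill-ins.
    merged.sort(key=lambda tick: tick[0])
    return merged


primary = [(1.0, 100.0, 5.0), (3.0, 101.0, 2.0)]
secondary = [(1.0, 100.0, 5.0), (2.0, 100.5, 1.0), (3.001, 101.0, 2.0)]
merged = merge_feeds(primary, secondary, tol=0.01)
```

Note that this rule would also collapse two genuine trades that share price, volume, and (near-)identical timestamps across feeds, so the tolerance has to be chosen with the feeds' clock skew in mind.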

Obviously, merging the two streams is harmless and should be done. But it's hard to advise you on which "interpolation" methods to use for generating the missing ticks without knowing why you need them. The reason is that any method will introduce a certain bias into the data, so it very much depends on what you are going to do with the altered data in the next step.
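To illustrate how the choice of method shapes the result, here is a small sketch comparing two common fill rules at a query time `t`: previous-tick (carry the last observed price forward) and linear interpolation between the bracketing ticks. The function names are hypothetical, and the sketch assumes strictly increasing timestamps. Linear interpolation uses the next observed price, so it leaks future information into the filled value, while previous-tick produces stale prices.

```python
from bisect import bisect_right
from typing import List, Optional


def previous_tick(times: List[float], prices: List[float],
                  t: float) -> Optional[float]:
    """Last observed price at or before t, or None if no tick exists yet."""
    i = bisect_right(times, t)
    return prices[i - 1] if i > 0 else None


def linear_interp(times: List[float], prices: List[float],
                  t: float) -> Optional[float]:
    """Linearly interpolate between the ticks bracketing t.

    Uses the NEXT observed tick, so the result is not known at time t.
    """
    i = bisect_right(times, t)
    if i == 0:
        return None            # no tick at or before t
    if i == len(times):
        return prices[-1]      # no later tick: fall back to the last price
    t0, t1 = times[i - 1], times[i]
    w = (t - t0) / (t1 - t0)
    return prices[i - 1] + w * (prices[i] - prices[i - 1])


times = [0.0, 10.0]
prices = [100.0, 102.0]
stale = previous_tick(times, prices, 5.0)   # carries 100.0 forward
peeked = linear_interp(times, prices, 5.0)  # midpoint of 100.0 and 102.0
```

Which of the two distortions is acceptable depends on the downstream use, for example a backtest generally cannot tolerate the look-ahead in the linear rule.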

Some links regarding the interpolation methods that you can find useful:

May I have some references/papers for those "interpolation" methods?
–
wonghang Feb 18 '13 at 3:19

1

In the book High-Frequency Trading: A Practical Guide to Algorithmic Strategies and Trading Systems by Irene Aldridge, specifically in Chapter 9 (Working with tick data), you can find the interpolation method. The book is available on Amazon, or maybe you can find it free on the web.
–
AlgoQuant Feb 18 '13 at 20:06

3

While interpolation can be useful at medium frequencies, it is hardly useful at the tick level. Arguably most of the information is contained in the fact that the "tick" happened ...
–
Ryogi Feb 19 '13 at 0:50

@Ryogi I fully agree. The only case where I see this being useful is testing a time series algorithm on tick data at different "densities" and checking its stability, etc.
–
Alexey Kalmykov Feb 19 '13 at 9:42
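The stability check mentioned in the last comment could be set up by thinning a tick series to several densities and rerunning the algorithm on each. The sketch below is one hypothetical way to do the thinning: the name `thin_ticks`, the independent keep-probability `keep_prob`, and the fixed `seed` are all assumptions, and real missing-tick patterns need not be independent across ticks.

```python
import random
from typing import List, Tuple

# Hypothetical tick layout: (timestamp, price, volume).
Tick = Tuple[float, float, float]


def thin_ticks(ticks: List[Tick], keep_prob: float, seed: int = 0) -> List[Tick]:
    """Keep each tick independently with probability keep_prob, in order."""
    rng = random.Random(seed)  # seeded so each density is reproducible
    return [tick for tick in ticks if rng.random() < keep_prob]


ticks = [(float(i), 100.0 + 0.1 * i, 1.0) for i in range(1000)]
# One thinned copy per density; feed each to the algorithm under test.
thinned = {q: thin_ticks(ticks, q) for q in (1.0, 0.5, 0.1, 0.0)}
```

Dropped ticks also drop their volume here; whether to reassign that volume to surviving ticks is a separate modelling choice.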