Binning Estimator

Here we describe an estimator based on fixed state-space partitioning. The approach performs uniform quantization of the time series and then estimates the entropy by approximating probabilities with the frequency of visitation of the quantized states. A time series $y$, realization of the generic process $Y$, is first normalized to have zero mean and unit variance, and then coarse-grained by spreading its dynamics over $Q$ quantization levels of amplitude $r = (y_{\max} - y_{\min})/Q$, where $y_{\min}$ and $y_{\max}$ are the minimum and maximum values of the normalized series. Quantization assigns to each sample the number of the level to which it belongs, so that the quantized time series $y^q$ takes values within the alphabet $\mathcal{A} = \{0, 1, \ldots, Q-1\}$. Uniform quantization of embedding vectors of dimension $d$ builds a uniform partition of the $d$-dimensional state space into $Q^d$ disjoint hypercubes of size $r$, such that all vectors $V$ falling within the same hypercube are associated with the same quantized vector $V^q$, and are thus indistinguishable within the tolerance $r$. The entropy is then estimated as:

$$H(V^q) = -\sum_{V^q \in \mathcal{A}^d} p(V^q) \log p(V^q) \qquad (1)$$

where the sum is extended over all vectors found in the available realization of the quantized series, and the probabilities $p(V^q)$ are estimated for each hypercube simply as the fraction of quantized vectors falling into the hypercube (i.e., the frequency of occurrence of $V^q$ within $\mathcal{A}^d$). According to this approach, the binning estimate of TE results from applying (1) to the four embedding vectors defined in equation (2), determined either by UE or by NUE.
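The quantization and frequency-based entropy estimate described above can be sketched in a few lines of Python. This is an illustrative sketch, not the toolbox code: the function names `quantize` and `binned_entropy`, the use of the natural logarithm, and the clipping of the maximum sample into the top level are assumptions made here. A constant series (zero variance) is not handled.

```python
import numpy as np

def quantize(y, n_levels):
    """Normalize y to zero mean and unit variance, then map each sample
    to the index of its quantization level in {0, ..., n_levels - 1}."""
    y = np.asarray(y, dtype=float)
    z = (y - y.mean()) / y.std()
    r = (z.max() - z.min()) / n_levels          # amplitude of one level
    q = np.floor((z - z.min()) / r).astype(int)
    # Assumption: the maximum sample is assigned to the top level.
    return np.clip(q, 0, n_levels - 1)

def binned_entropy(vectors_q):
    """Plug-in entropy (in nats) of quantized vectors, one vector per row.
    Probabilities are the relative frequencies of the visited hypercubes."""
    vectors_q = np.asarray(vectors_q)
    if vectors_q.ndim == 1:
        vectors_q = vectors_q[:, None]
    _, counts = np.unique(vectors_q, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))
```

For example, with `q = quantize(y, 6)`, the entropy of two-dimensional embedding vectors built from consecutive samples could be obtained as `binned_entropy(np.column_stack([q[1:], q[:-1]]))`. Only visited hypercubes contribute, so empty cells never enter the sum.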

In the NUE implementation, maximization of the mutual information between the component selected at the current step and the target variable (step (*) of the algorithm) was obtained by minimizing the corresponding CE, with the two entropy terms estimated through the application of (1). As for the LIN estimator, the randomization procedure applied to test candidate significance consisted of time-shifting the points of the series by a randomly selected lag (Quiroga, 2002).
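The selection step can be illustrated with a short Python sketch in which the conditional entropy is written as the difference of two binned entropies, H(target | cond) = H(target, cond) - H(cond), and the candidate giving the smallest value is retained. The names `conditional_entropy` and `pick_candidate` are hypothetical, inputs are assumed to be already quantized, and the significance test and termination criterion of the full procedure are deliberately left out.

```python
import numpy as np

def entropy(rows):
    """Plug-in entropy (nats) of quantized vectors, one vector per row."""
    _, counts = np.unique(rows, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))

def conditional_entropy(target_q, cond_q):
    """H(target | cond) as the difference of two binned entropies.
    target_q: 1-D array of length N; cond_q: 2-D array of shape (N, k), k >= 1."""
    joint = np.column_stack([target_q, cond_q])
    return entropy(joint) - entropy(cond_q)

def pick_candidate(target_q, selected_q, candidates_q):
    """Return (index, CE) of the candidate column that minimizes the CE of the
    target given the already selected columns plus that candidate."""
    ces = []
    for j in range(candidates_q.shape[1]):
        cond = np.column_stack([selected_q, candidates_q[:, j]])
        ces.append(conditional_entropy(target_q, cond))
    best = int(np.argmin(ces))
    return best, ces[best]
```

Because the entropy of the target does not depend on the candidate, minimizing this conditional entropy over candidates is equivalent to maximizing the mutual information between the target and the enlarged conditioning set.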

The statistical significance of the TE estimated through the UE BIN approach was assessed with the method of surrogate data, implemented by the time-shift procedure proposed in Vlachos (2010), Faes (2008), and Quiroga (2002). Specifically, the estimated TE is tested against its null distribution, formed by the values of TE computed on replications of the original series, where in each replication the source series is time-shifted by a randomly selected lag (larger than 20, set to exclude autocorrelation effects).
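A minimal Python sketch of such a time-shift surrogate test is given below. Several details are assumptions made for illustration and are not stated in the text: the shift is circular (`np.roll`), the lag is drawn uniformly so that the misalignment exceeds `min_lag` in both directions, the p-value uses the common (1 + count) / (1 + number of surrogates) convention, and `statistic` is a hypothetical user-supplied function standing in for the TE estimator.

```python
import numpy as np

def time_shift_test(statistic, source, target, n_surrogates=100, min_lag=20, seed=0):
    """Compare statistic(source, target) with its null distribution obtained
    by circularly time-shifting the source by random lags larger than min_lag."""
    source = np.asarray(source)
    target = np.asarray(target)
    n = len(source)
    if n <= 2 * min_lag + 1:
        raise ValueError("series too short for the requested minimum lag")
    rng = np.random.default_rng(seed)
    observed = statistic(source, target)
    null = np.empty(n_surrogates)
    for i in range(n_surrogates):
        # Lag drawn from [min_lag + 1, n - min_lag - 1].
        lag = rng.integers(min_lag + 1, n - min_lag)
        null[i] = statistic(np.roll(source, lag), target)
    # One-sided p-value: fraction of surrogate values at least as large as observed.
    p_value = (1 + np.sum(null >= observed)) / (n_surrogates + 1)
    return observed, null, float(p_value)
```

Shifting the whole source series preserves its own temporal structure while breaking its alignment with the target, which is why the lag is kept away from small values where residual coupling could survive.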