This notebook is an overview of Quantopian's slippage model for futures, VolatilityVolumeShare. If you aren't familiar with what slippage is, consider a simple example: buying futures contracts for coffee. We want to buy five contracts at \$500 apiece. We buy the first four, but by the time we buy the fifth, the price of the coffee contract has increased to \$503. This change in price is what is known as slippage. Slippage typically works against us: our own buying in a market increases demand, which in turn increases the price.

Our slippage model for futures should, within the backtester, slightly increase the price on a buy and slightly decrease the price on a sell. We can do this similarly to how we apply slippage to stocks. If $MI$ is our market impact, $P_0$ is the original price of the future, and $P$ is the price with slippage applied, then

$$P = P_0 \cdot \left(1 \pm MI\right)$$

with the sign working against the direction of the trade (plus for buys, minus for sells). The market impact itself is

$$MI = \eta \cdot \sigma \cdot \sqrt{\frac{\text{order size}}{\text{20-day ADV}}}$$

where $\sigma$ is the 20-day volatility and ADV is the average daily volume. The coefficient $\eta$ is estimated for each future by fitting, over minute bars,

$$|r| = \eta \cdot \sigma \cdot \sqrt{J}$$

where $r$ is the percentage change of the asset's price in each bar (which we take the absolute value of) and $J$ is $\frac{\text{market volume traded in bar}}{\text{20-day ADV}}$ for every bar. This method is known as Kyle's Lambda.
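As a rough sketch of how $\eta$ can be recovered, here is a regression through the origin on synthetic minute bars. All numbers here are made up for illustration; the notebook below runs the same kind of fit on real futures minute data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-minute data: J = bar volume / 20-day ADV
true_eta = 0.05
sigma = 0.12                                     # 20-day volatility
j = rng.uniform(1e-6, 1e-4, size=1000)           # fraction of ADV traded each bar
abs_returns = true_eta * sigma * np.sqrt(j)      # |r| = eta * sigma * sqrt(J)
abs_returns += rng.normal(0, 1e-6, size=1000)    # measurement noise

# Regression through the origin: |r| = eta * (sigma * sqrt(J))
x = (sigma * np.sqrt(j))[:, np.newaxis]
eta_hat, _, _, _ = np.linalg.lstsq(x, abs_returns, rcond=None)
print(eta_hat[0])  # close to 0.05
```

Because the model has no intercept, the design matrix is a single column and the least-squares slope is exactly the estimated $\eta$.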

Once we have a value of $\eta$ for each month in a certain time frame (5 years maximum for each future, in this case), we average these values together to get the final value of $\eta$ we will use for each continuous future. This notebook is meant to calculate these $\eta$ values. It can also be run without the volatility ($\sigma$) term.

Say we're trying to buy 100 front contracts of ES, the E-Mini S&P 500, in one minute. Suppose ES is trading at \$2000, with an ADV of 1.4 million contracts/day, a volatility of 9%, and $\eta = 0.047$. Our market impact is then $MI = 0.047 \cdot 0.09 \cdot \sqrt{\frac{100}{1,400,000}} = 0.00003575 = 0.3575 \ \text{bps}$ (a [basis point](https://en.wikipedia.org/wiki/Basis_point) is one-hundredth of one percent). Therefore, the price we would buy ES at here, according to our slippage model, is $ \$2000 \cdot \left(1 + 0.00003575\right) = \$2000.0715 $.
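The same arithmetic in code, using the numbers from the example above:

```python
import math

eta = 0.047          # estimated eta for ES
sigma = 0.09         # 20-day volatility
order_size = 100     # contracts bought in one minute
adv = 1_400_000      # 20-day average daily volume, in contracts

# Market impact: MI = eta * sigma * sqrt(order size / ADV)
mi = eta * sigma * math.sqrt(order_size / adv)
print(round(mi * 1e4, 4), 'bps')   # ~0.3575 bps

# Price with slippage applied on a buy
fill_price = 2000 * (1 + mi)
print(round(fill_price, 4))        # ~2000.0715
```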

```python
# Computes eta for a given root symbol, as described above
# This can take a while to run
def compute_eta(symbol):
    cf = continuous_future(symbol, offset=0, roll='volume', adjustment='mul')
    cf_end_date = cf.end_date - pd.Timedelta(days=1)
    cf_start_date = max(cf_end_date - max_days, cf.start_date, pd.Timestamp('2002-02-01', tz='UTC'))
    if cf_start_date > cf_end_date:
        return None

    market_data = history(
        cf,
        fields=['price', 'volume'],  # We want the forward-filled price
        frequency='minute',
        start_date=cf_start_date + pd.Timedelta(hours=12),
        end_date=cf_end_date + pd.Timedelta(hours=12),
    )
    market_data.index = market_data.index.tz_convert('EST')

    # Get daily volume & close
    daily_volume = market_data.volume.resample('1D').sum().dropna()
    daily_close = market_data.price.resample('1D').last().dropna()
    daily_returns = daily_close.pct_change().dropna()

    # Calculate 20 day volatility and ADV. Only include previous days.
    vol_20d = daily_returns.rolling(21).agg(lambda x: empyrical.annual_volatility(x[:-1]))
    adv_20d = daily_volume.rolling(21).agg(lambda x: np.mean(x[:-1]))

    # Calculate regression variables
    market_data['returns'] = market_data.price.pct_change()
    market_data['abs_returns'] = market_data.returns.abs()
    market_data['poadv'] = market_data.apply(
        lambda x: x.volume / daily_volume[x.name.normalize()]
        if x.name.normalize() in daily_volume else np.nan,
        axis=1,
    )
    market_data['sqrt_poadv'] = market_data.poadv.map(np.sqrt)
    market_data['vol_20d'] = market_data.apply(
        lambda x: vol_20d[x.name.normalize()]
        if x.name.normalize() in vol_20d else np.nan,
        axis=1,
    )
    market_data['vol_x_sqrt_poadv'] = market_data.vol_20d * market_data.sqrt_poadv

    # Delete rows where the previous row wasn't 1 minute earlier
    # We only want to capture minute-to-minute price changes
    index_series = market_data.index.to_series()
    rows_to_drop = index_series[index_series.diff() != pd.Timedelta(minutes=1)].index
    market_data = market_data.drop(rows_to_drop)

    # If volume is 0, the price shouldn't change
    market_data = market_data[market_data.volume > 0]

    if include_volatility:
        market_data = market_data.dropna(subset=['sqrt_poadv', 'abs_returns', 'vol_20d'])
    else:
        market_data = market_data.dropna(subset=['sqrt_poadv', 'abs_returns'])

    sample_size = len(market_data)
    print symbol, 'samples:', sample_size
    if sample_size < 2:  # Confirm the sample size is large enough
        return None

    etas = []
    days = pd.date_range(cf_start_date + pd.Timedelta(days=60), cf_end_date, freq=eta_freq)
    for day in days:
        sample_market_data = market_data[day - pd.Timedelta(days=60):day]
        if len(sample_market_data) < 2:
            return None
        if include_volatility:
            x = sample_market_data.vol_x_sqrt_poadv.values
            y = sample_market_data.abs_returns.values
            x = x[:, np.newaxis]
            slope, _, _, _ = np.linalg.lstsq(x, y)
        else:
            x = sample_market_data.sqrt_poadv.values
            y = sample_market_data.abs_returns.values
            x = x[:, np.newaxis]
            slope, _, _, _ = np.linalg.lstsq(x, y)
        etas.append(slope)

    return etas, sample_size
```
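One step of `compute_eta` that is easy to miss is the gap filter: a bar is kept only if the previous bar is exactly one minute earlier, so overnight and session gaps never contaminate the minute-to-minute returns. A minimal sketch with toy timestamps (the timestamps and prices are made up):

```python
import pandas as pd

idx = pd.to_datetime([
    '2014-01-02 09:31', '2014-01-02 09:32', '2014-01-02 09:33',
    '2014-01-02 09:40',  # gap: previous bar is 7 minutes earlier
    '2014-01-02 09:41',
])
df = pd.DataFrame({'price': [100.0, 100.1, 100.2, 101.0, 101.1]}, index=idx)

# Keep only rows whose previous row is exactly 1 minute earlier
index_series = df.index.to_series()
rows_to_drop = index_series[index_series.diff() != pd.Timedelta(minutes=1)].index
filtered = df.drop(rows_to_drop)
print(filtered.index)
```

Note that the first row is always dropped too (its `diff()` is `NaT`), which is fine since it has no prior bar to compute a return against.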

In [3]:

```python
all_etas = {}
all_sample_sizes = {}

# Iterate through each future
for symbol in root_symbols:
    try:
        res = compute_eta(symbol)
        if res is not None:
            all_etas[symbol] = res[0]
            all_sample_sizes[symbol] = res[1]
    except Exception as e:
        print symbol, ':', e

# Mean eta for each symbol
mean_eta = {symbol: np.mean(etas) for symbol, etas in all_etas.iteritems()}
```

HG samples: 1546475
HO samples: 1086430
GC samples: 1724718

In [4]:

```python
# Only take etas with a large enough sample size
# Also compute a sample-size weighted mean eta, as a fallback for thinly-traded names
min_sample_size = 10000

clean_etas = {}
weighted_sum = 0
total_weight = 0
for symbol, eta in mean_eta.iteritems():
    ss = all_sample_sizes[symbol]
    if eta < 1:  # The result must be reasonable
        weighted_sum += eta * ss
        total_weight += ss
        if ss > min_sample_size:
            clean_etas[symbol] = eta

grand_mean_eta = weighted_sum / total_weight
print 'Symbol etas:', clean_etas
print 'Default eta: ', grand_mean_eta
```