When implementing a factor in a trading algorithm, the complexity and wide range of parameters involved in basket selection and trading logic hinder our ability to evaluate the factor's alpha signal in isolation. Before we proceed to implement an algorithm, we want to know whether the factor has any predictive value.

In this analysis, we'll measure a factor's predictive value using the Spearman rank correlation between the factor value and various N day forward price movement windows over a large universe of stocks. This correlation is called the Information Coefficient (IC).
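As a quick illustration (not part of the tear sheet code), the IC for a single day and a single forward-return window is simply the Spearman rank correlation of factor values against forward returns across the universe on that day. The factor and return values below are made up:

import numpy as np
from scipy import stats

# Hypothetical factor values and 5-day forward returns for five stocks on one day
factor_values = np.array([0.8, 0.1, -0.3, 0.5, -0.9])
fwd_5d_returns = np.array([0.02, 0.00, -0.01, 0.01, -0.03])

ic, p_value = stats.spearmanr(factor_values, fwd_5d_returns)
print("5-day IC: %f" % ic)  # 1.0 here, since the two rankings agree perfectly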
This tear sheet takes a pipeline factor and attempts to answer the following questions, in order:

What is the sector-neutral rolling mean IC for our different forward price windows?

What are the sector-neutral factor decile mean returns for our different forward price windows?

How much are the contents of the top and bottom quintile changing each day?

What is the autocorrelation in sector-wise factor rankings?

What is IC decay (difference in IC for different forward price windows) for each sector?

What is the IC decay for each sector over time?

What are the factor quintile returns for each sector?

For more information on Spearman Rank correlation, check out this notebook from the Quantopian lecture series.

In the plots that are not disaggregated by sector, a sector adjustment has been applied to the forward price movements. You can think of this sector adjustment as incorporating the assumption of a sector-neutral portfolio constraint. If we are equally weighted in each sector, we'd want our factor to help us compare stocks within their own sectors. For example, if AAPL's 5-day forward return is 0.1% and the mean 5-day forward return for the Technology stocks in our universe is 0.5% over the same period, then AAPL's sector-adjusted 5-day return for this period is -0.4%.
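A minimal sketch of that adjustment (with made-up numbers, outside the tear sheet code): subtract each sector's mean forward return from every stock in that sector, mirroring what sector_adjust_forward_price_moves does below.

import pandas as pd

# Hypothetical one-day slice: two Technology names and two Energy names
df = pd.DataFrame({
    'sector_code': ['Tech', 'Tech', 'Energy', 'Energy'],
    '5_day_fwd_price_change': [0.001, 0.009, 0.002, -0.002],
})

# Demean the forward return within each sector
df['sector_adj_5d'] = df.groupby('sector_code')['5_day_fwd_price_change'] \
                        .transform(lambda x: x - x.mean())
print(df)
# The Tech mean is 0.5%, so the 0.1% name gets a sector-adjusted -0.4%,
# matching the AAPL example above.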

The autocorrelation and decile turnover figures are meant to be used as a measure of factor horizon. It is worth noting that these statistics are potentially misleading: our top-N liquidity constraint makes our universe dynamic, and this dynamic universe likely contributes to higher quantile turnover and lower rank autocorrelation than we would see in a static universe.
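For intuition, the top-quantile turnover on a given day is just the fraction of that day's top-quantile names that were not in the top quantile the previous day. A toy two-day example with hypothetical tickers:

# Hypothetical top-quantile membership on two consecutive days
top_day1 = {'AAPL', 'MSFT', 'XOM', 'JNJ'}
top_day2 = {'AAPL', 'MSFT', 'GE', 'PFE'}

new_names = top_day2 - top_day1  # names that entered the top quantile today
turnover = len(new_names) / float(len(top_day2))
print("Top-quantile turnover: %.0f%%" % (turnover * 100))  # 50%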

# NOTE: this notebook assumes the Quantopian research environment, where the
# following are typically imported in an earlier cell (not shown here):
#   import math, datetime
#   import numpy as np
#   import pandas as pd
#   import scipy as sp
#   from scipy import stats
#   from quantopian.pipeline import Pipeline, CustomFactor
#   from quantopian.pipeline.data import USEquityPricing, morningstar
#   from quantopian.pipeline.factors import Returns, SimpleMovingAverage
#   from quantopian.research import run_pipeline
# `symbols` and `get_pricing` are research-notebook builtins.


class SidFactor(CustomFactor):
    """
    Workaround to screen by sids in pipeline
    """
    inputs = []
    window_length = 1
    sids = []  # list, tuple and whatever np.asarray accepts

    def compute(self, today, assets, out):
        out[:] = np.in1d(assets, self.sids)


def create_sid_screen(sids):
    sid_factor = SidFactor()
    sid_factor.sids = sids
    return sid_factor.eq(True)


class ReturnsMarketExcess(CustomFactor):
    """
    Calculates the percent change in close price (market adjusted) over the
    given window_length.
    **Default Inputs**: [USEquityPricing.close]
    """
    params = ('market_sid',)
    inputs = [USEquityPricing.close]

    def compute(self, today, assets, out, close, market_sid):
        returns = (close[-1] - close[0]) / close[0]
        market_idx = assets.get_loc(market_sid)
        returns -= returns[market_idx]  # remove market returns
        out[:] = returns


def _beta(stock_prices, bench_prices):
    # `linregress` returns its results in the following order:
    # slope, intercept, r-value, p-value, stderr
    regr_results = stats.linregress(y=bench_prices, x=stock_prices)
    # alpha = regr_results[1]
    beta = regr_results[0]
    # r_value = regr_results[2]
    p_value = regr_results[3]
    # stderr = regr_results[4]

    # Check null hypothesis
    if p_value > 0.05:
        beta = 0.
    return beta


class ReturnsBetaExcess(CustomFactor):
    """
    Calculates the percent change in close price (beta adjusted) over the
    given window_length.
    **Default Inputs**: [USEquityPricing.close]
    """
    params = ('delta_days', 'market_sid',)
    inputs = [USEquityPricing.close]
    window_length = 60

    def compute(self, today, assets, out, close, delta_days, market_sid):
        returns = (close[delta_days:] - close[:-delta_days])  # absolute returns
        returns /= close[:-delta_days]                        # percentage returns
        market_idx = assets.get_loc(market_sid)
        market_returns = returns[:, market_idx]
        betas = np.apply_along_axis(_beta, 0, returns, market_returns)
        returns -= (returns[:, [market_idx]] * betas)  # remove returns due to beta
        out[:] = returns[-1]


def get_daily_perc_ret(sid_universe, start_date, end_date, ret_type='normal', market=None):
    """
    Creates a DataFrame containing daily percentage returns:
    normal, market_excess or beta_excess returns

    Parameters
    ----------
    sid_universe : list
        List of sids for which the returns are computed.
    start_date : string
        Starting date for returns computation.
    end_date : string
        End date for returns computation.
    ret_type : string
        Type of returns: normal, market_excess or beta_excess
    market : equity
        The market; if None is passed 'SPY' will be used
    """
    if market is None:
        market = symbols('SPY')

    mask = create_sid_screen(sid_universe + [market.sid])
    inputs = [USEquityPricing.open]
    daily_ret_columns = {
        'normal': Returns(inputs=inputs, window_length=2, mask=mask),
        'market_excess': ReturnsMarketExcess(inputs=inputs, window_length=2,
                                             market_sid=market.sid, mask=mask),
        'beta_excess': ReturnsBetaExcess(inputs=inputs, window_length=60, mask=mask,
                                         delta_days=2, market_sid=market.sid),
    }

    tmp_pipe = Pipeline(columns={'daily_perc_ret': daily_ret_columns[ret_type]})
    tmp_pipe.set_screen(mask)
    perc_ret_pipe = run_pipeline(tmp_pipe, start_date=start_date, end_date=end_date)
    perc_ret_pipe = perc_ret_pipe.unstack().fillna(0)
    perc_ret_pipe = perc_ret_pipe['daily_perc_ret']  # this drops the top-level column

    # pipeline is run in the morning of each day, before that day's open/close
    # prices exist, so we need to shift the returns to get that day's return
    perc_ret_pipe = perc_ret_pipe.shift(-1)[:-1]

    return perc_ret_pipe
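A quick sanity-check sketch of the returns helper, assuming the Quantopian research environment and that the symbols builtin accepts a list of tickers; the dates and tickers are illustrative only:

# Hypothetical check of get_daily_perc_ret: market-excess daily returns for two names
test_sids = [asset.sid for asset in symbols(['AAPL', 'MSFT'])]
rets = get_daily_perc_ret(test_sids,
                          start_date='2015-10-01',
                          end_date='2015-10-15',
                          ret_type='market_excess')
print(rets.head())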


class Liquidity(CustomFactor):
    inputs = [USEquityPricing.volume, USEquityPricing.close]
    window_length = 5

    def compute(self, today, assets, out, volume, close):
        out[:] = (volume * close).mean(axis=0)


class Sector(CustomFactor):
    inputs = [morningstar.asset_classification.morningstar_sector_code]
    window_length = 1

    def compute(self, today, assets, out, msc):
        out[:] = msc[-1]


def construct_factor_history(factor_cls, start_date='2015-10-1', end_date='2016-2-1',
                             factor_name='factor', top_liquid=1000,
                             universe_constraints=None, sector_names=None):
    """
    Creates a DataFrame containing daily factor values and sector codes for a
    liquidity constrained universe. The returned DataFrame can be used in the
    factor tear sheet.

    Parameters
    ----------
    factor_cls : quantopian.pipeline.CustomFactor
        Factor class to be computed.
    start_date : string or pd.datetime
        Starting date for factor computation.
    end_date : string or pd.datetime
        End date for factor computation.
    factor_name : string, optional
        Column name for factor column in returned DataFrame.
    top_liquid : int, optional
        Limit universe to the top N most liquid names each trading day.
        Based on trailing 5 days traded dollar volume.
    universe_constraints : num_expr, optional
        Pipeline universe constraint.

    Returns
    -------
    daily_factor : pd.DataFrame
        DataFrame with integer index and date, equity, factor, and sector
        code columns.
    """
    price = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=22)
    price_filter = (price >= 5.0)
    liquidity = Liquidity(mask=price_filter)
    liquidity_rank = liquidity.rank(ascending=False)
    ok_universe = (top_liquid > liquidity_rank)
    if universe_constraints is not None:
        ok_universe = ok_universe & universe_constraints

    factor = factor_cls(mask=ok_universe)
    sector = Sector(mask=ok_universe)
    ok_universe = ok_universe & factor.eq(factor) & sector.eq(sector)

    pipe = Pipeline()
    pipe.add(factor, factor_name)
    pipe.add(sector, 'sector_code')
    pipe.set_screen(ok_universe)

    daily_factor = run_pipeline(pipe, start_date=start_date, end_date=end_date)
    daily_factor = daily_factor.reset_index().rename(
        columns={'level_0': 'date', 'level_1': 'equity'})
    daily_factor = daily_factor[daily_factor.sector_code != -1]
    if sector_names is not None:
        daily_factor.sector_code = daily_factor.sector_code.apply(
            lambda x: sector_names[x])

    return daily_factor


def get_daily_returns(daily_factor, start_date, end_date,
                      extra_days_before=0, extra_days_after=0, ret_type='normal'):
    """
    Creates a DataFrame containing daily percentage returns:
    normal, market_excess or beta_excess returns

    Parameters
    ----------
    daily_factor : pd.DataFrame
        DataFrame with, at minimum, date, equity, factor, columns. Index can
        be integer or date/equity multiIndex. See construct_factor_history
        for more detail.
    start_date : string
        Starting date for returns computation.
    end_date : string
        End date for returns computation.
    ret_type : string
        Type of returns: normal, market_excess or beta_excess

    Returns
    -------
    price : pd.DataFrame
        DataFrame with date index and equity columns containing percentage
        price movements that can be normal, market excess or beta excess.
    """
    extra_days = math.ceil(extra_days_before * 365.0 / 252.0) + 5  # just to be sure
    start_date = datetime.datetime.strptime(start_date, "%Y-%m-%d") - \
        datetime.timedelta(days=extra_days)
    start_date = start_date.strftime("%Y-%m-%d")

    extra_days = math.ceil(extra_days_after * 365.0 / 252.0) + 5  # just to be sure
    end_date = datetime.datetime.strptime(end_date, "%Y-%m-%d") + \
        datetime.timedelta(days=extra_days)
    end_date = end_date.strftime("%Y-%m-%d")

    sid_universe = daily_factor['equity'].unique()
    sid_universe = map(lambda x: x.sid, sid_universe)

    daily_perc_ret = get_daily_perc_ret(sid_universe=sid_universe,
                                        start_date=start_date,
                                        end_date=end_date,
                                        ret_type=ret_type)
    return daily_perc_ret


def add_forward_price_movement(daily_factor, days=[1, 5, 10], prices=None):
    """
    Adds N day forward price movements (as percent change) to a factor value
    DataFrame.

    Parameters
    ----------
    daily_factor : pd.DataFrame
        DataFrame with, at minimum, date, equity, factor, columns. Index can
        be integer or date/equity multiIndex. See construct_factor_history
        for more detail.
    days : list
        Number of days forward to project price movement. One column will be
        added for each value.
    prices : pd.DataFrame, optional
        Pricing data to use in forward price calculation. Equities as columns,
        dates as index. If no value is passed, get_pricing will be called.

    Returns
    -------
    factor_and_fp : pd.DataFrame
        DataFrame with integer index and date, equity, factor, sector code
        columns and an arbitrary number of N day forward percentage price
        movement columns.
    """
    factor_and_fp = daily_factor.copy()
    if not isinstance(factor_and_fp.index, pd.core.index.MultiIndex):
        factor_and_fp = factor_and_fp.set_index(['date', 'equity'])

    if prices is None:
        start_date = factor_and_fp.index.levels[0].values.min()
        end_date = factor_and_fp.index.levels[0].values.max()
        equities = factor_and_fp.index.levels[1].unique()
        time_buffer = pd.Timedelta(days=max(days) + 5)
        prices = get_pricing(equities,
                             start_date=start_date,
                             end_date=end_date + time_buffer,
                             fields='open_price')

    col_n = '%s_day_fwd_price_change'
    for i in days:
        delta = prices.pct_change(i).shift(-i)
        factor_and_fp[col_n % i] = delta.stack()

    factor_and_fp = factor_and_fp.reset_index()

    return factor_and_fp


def sector_adjust_forward_price_moves(factor_and_fp):
    """
    Convert forward price movements to price movements relative to mean
    sector price movements. This normalization incorporates the assumption
    of a sector neutral portfolio constraint and thus allows the factor to
    be evaluated across sectors.

    For example, if AAPL 5 day return is 0.1% and the mean 5 day return for
    the Technology stocks in our universe was 0.5% in the same period, the
    sector adjusted 5 day return for AAPL in this period is -0.4%.

    Parameters
    ----------
    factor_and_fp : pd.DataFrame
        DataFrame with date, equity, factor, and forward price movement
        columns. Index should be integer. See add_forward_price_movement for
        more detail.

    Returns
    -------
    adj_factor_and_fp : pd.DataFrame
        DataFrame with integer index and date, equity, factor, sector code
        columns and an arbitrary number of N day forward percentage price
        movement columns, each normalized by sector.
    """
    adj_factor_and_fp = factor_and_fp.copy()
    pc_cols = [col for col in factor_and_fp.columns.values
               if 'fwd_price_change' in col]

    adj_factor_and_fp[pc_cols] = factor_and_fp.groupby(
        ['date', 'sector_code'])[pc_cols].apply(lambda x: x - x.mean())

    return adj_factor_and_fp


def factor_spearman_rank_IC(factor_and_fp, time_rule=None, by_sector=True,
                            factor_name='factor'):
    """
    Computes sector neutral Spearman Rank Correlation based Information
    Coefficient between factor values and N day forward price movements.

    Parameters
    ----------
    factor_and_fp : pd.DataFrame
        DataFrame with date, equity, factor, and forward price movement
        columns. Index should be integer. See add_forward_price_movement for
        more detail.
    time_rule : string, optional
        Time span to use in Pandas DateTimeIndex grouping reduction. See
        http://pandas.pydata.org/pandas-docs/stable/timeseries.html for
        available options.
    by_sector : boolean
        If True, compute IC separately for each sector.
    factor_name : string
        Name of factor column on which to compute IC.

    Returns
    -------
    ic : pd.DataFrame
        Spearman Rank correlation between factor and provided forward price
        movement columns. MultiIndex of date, sector.
    err : pd.DataFrame
        Standard error of computed IC. MultiIndex of date, sector.
    """
    def src_ic(x):
        cn = "%s_day_IC"
        ic = pd.Series()
        for days, col in zip(fwd_days, pc_cols):
            ic[cn % days] = sp.stats.spearmanr(x[factor_name], x[col])[0]
        ic['obs_count'] = len(x)
        return ic

    def src_std_error(rho, n):
        return np.sqrt((1 - rho ** 2) / (n - 2))

    fwd_days, pc_cols = get_price_move_cols(factor_and_fp)

    grpr = ['date', 'sector_code'] if by_sector else ['date']
    ic = factor_and_fp.groupby(grpr).apply(src_ic)
    obs_count = ic.pop('obs_count')
    err = ic.apply(lambda x: src_std_error(x, obs_count))

    if time_rule is not None:
        ic = ic.reset_index().set_index('date')
        err = err.reset_index().set_index('date')

        grpr = [pd.TimeGrouper(time_rule), 'sector_code'] if by_sector \
            else [pd.TimeGrouper(time_rule)]
        ic = ic.groupby(grpr).mean()
        err = err.groupby(grpr).agg(
            lambda x: np.sqrt((np.sum(np.power(x, 2)) / len(x))))
    else:
        if by_sector:
            ic = ic.reset_index().groupby(['sector_code']).mean()
            err = err.reset_index().groupby(['sector_code']).agg(
                lambda x: np.sqrt((np.sum(np.power(x, 2)) / len(x))))

    return ic, err


def quantile_bucket_factor(factor_and_fp, by_sector=True, quantiles=5,
                           factor_name='factor'):
    """
    Computes daily factor quantiles.

    Parameters
    ----------
    factor_and_fp : pd.DataFrame
        DataFrame with date, equity, factor, and forward price movement
        columns. Index should be integer. See add_forward_price_movement for
        more detail.
    by_sector : boolean
        If True, compute quantile buckets separately for each sector.
    quantiles : integer
        Number of quantile buckets to use in factor bucketing.
    factor_name : string
        Name of factor column on which to compute quantiles.

    Returns
    -------
    factor_and_fp_ : pd.DataFrame
        Factor and forward price movements with additional factor quantile
        column.
    """
    g_by = ['date', 'sector_code'] if by_sector else ['date']
    factor_and_fp_ = factor_and_fp.copy()
    factor_and_fp_['factor_percentile'] = factor_and_fp_.groupby(
        g_by)[factor_name].rank(pct=True)

    q_int_width = 1. / quantiles
    factor_and_fp_['factor_bucket'] = factor_and_fp_.factor_percentile.apply(
        lambda x: ((x - .000000001) // q_int_width) + 1)

    return factor_and_fp_


def quantile_bucket_mean_daily_return(quantile_factor, by_sector=False):
    """
    Computes mean daily returns for factor quantiles across provided forward
    price movement columns.

    Parameters
    ----------
    quantile_factor : pd.DataFrame
        DataFrame with date, equity, factor, factor quantile, and forward
        price movement columns. Index should be integer. See
        quantile_bucket_factor for more detail.
    by_sector : boolean
        If True, compute quintile bucket returns separately for each sector.
    quantiles : integer
        Number of quantile buckets to use in factor bucketing.

    Returns
    -------
    mean_returns_by_quantile : pd.DataFrame
        Sector-wise mean daily returns by specified factor quantile.
    """
    fwd_days, pc_cols = get_price_move_cols(quantile_factor)

    def daily_mean_ret(x):
        mean_ret = pd.Series()
        for days, col in zip(fwd_days, pc_cols):
            mean_ret[col] = x[col].mean()  # / days
        return mean_ret

    g_by = ['sector_code', 'factor_bucket'] if by_sector else ['factor_bucket']
    mean_ret_by_quantile = quantile_factor.groupby(
        g_by)[pc_cols].apply(daily_mean_ret)

    return mean_ret_by_quantile


def quantile_turnover(quantile_factor, quantile):
    """
    Computes the proportion of names in a factor quantile that were not in
    that quantile in the previous period.

    Parameters
    ----------
    quantile_factor : pd.DataFrame
        DataFrame with date, equity, factor, factor quantile, and forward
        price movement columns. Index should be integer. See
        quantile_bucket_factor for more detail.
    quantile : integer
        Quantile on which to perform turnover analysis.

    Returns
    -------
    quant_turnover : pd.Series
        Period by period turnover for that quantile.
    """
    quant_names = quantile_factor[quantile_factor.factor_bucket == quantile]
    quant_name_sets = quant_names.groupby(['date']).equity.apply(set)
    new_names = (quant_name_sets - quant_name_sets.shift(1)).dropna()
    quant_turnover = new_names.apply(lambda x: len(x)) / \
        quant_name_sets.apply(lambda x: len(x))

    return quant_turnover


def factor_rank_autocorrelation(daily_factor, time_rule='W', factor_name='factor'):
    """
    Computes autocorrelation of mean factor ranks in specified timespans.
    We must compare week to week factor ranks rather than factor values to
    account for systematic shifts in the factor values of all names or names
    within a sector. This metric is useful for measuring the turnover of a
    factor. If the value of a factor for each name changes randomly from
    week to week, we'd expect a weekly autocorrelation of 0.

    Parameters
    ----------
    daily_factor : pd.DataFrame
        DataFrame with integer index and date, equity, factor, and sector
        code columns.
    time_rule : string, optional
        Time span to use in factor grouping mean reduction. See
        http://pandas.pydata.org/pandas-docs/stable/timeseries.html for
        available options.
    factor_name : string
        Name of factor column on which to compute IC.

    Returns
    -------
    autocorr : pd.Series
        Rolling 1 period (defined by time_rule) autocorrelation of factor
        values.
    """
    daily_ranks = daily_factor.copy()
    daily_ranks[factor_name] = daily_factor.groupby(
        ['date', 'sector_code'])[factor_name].apply(
            lambda x: x.rank(ascending=True))

    equity_factor = daily_ranks.pivot(index='date', columns='equity',
                                      values=factor_name)
    if time_rule is not None:
        equity_factor = equity_factor.resample(time_rule, how='mean')

    autocorr = equity_factor.corrwith(equity_factor.shift(1), axis=1)

    return autocorr


def get_price_move_cols(x):
    pc_cols = [col for col in x.columns.values if 'fwd_price_change' in col]
    fwd_days = map(lambda x: int(x.split('_')[0]), pc_cols)
    return fwd_days, pc_cols


def get_ic_cols(x):
    return [col for col in x.columns.values if 'day_IC' in col]


def is_outlier(points, thresh=3.0):
    """
    Utility function to remove outliers, for a better graph visualization.
    Returns a boolean array with True if points are outliers and False
    otherwise.

    Parameters
    ----------
    points : An numobservations by numdimensions array of observations
    thresh : The modified z-score to use as a threshold. Observations with a
        modified z-score (based on the median absolute deviation) greater
        than this value will be classified as outliers.

    Returns
    -------
    mask : A numobservations-length boolean array.

    References
    ----------
    Boris Iglewicz and David Hoaglin (1993), "Volume 16: How to Detect and
    Handle Outliers", The ASQC Basic References in Quality Control:
    Statistical Techniques, Edward F. Mykytka, Ph.D., Editor.
    """
    if len(points.shape) == 1:
        points = points[:, None]
    median = np.median(points, axis=0)
    diff = np.sum((points - median) ** 2, axis=-1)
    diff = np.sqrt(diff)
    med_abs_deviation = np.median(diff)

    modified_z_score = 0.6745 * diff / med_abs_deviation

    return modified_z_score > thresh


def build_cumulative_returns_series(factor_and_fp, daily_perc_ret,
                                    days_before, days_after,
                                    day_zero_align=False):
    """
    An equity and date pair is extracted from each row in the input dataframe
    and for each of these pairs a cumulative return time series is built,
    starting 'days_before' days before and ending 'days_after' days after the
    date specified in the pair.

    Parameters
    ----------
    factor_and_fp : pd.DataFrame
        DataFrame with at least date and equity columns.
    daily_perc_ret : pd.DataFrame
        Pricing data to use in cumulative return calculation. Equities as
        columns, dates as index.
    day_zero_align : boolean
        Align returns at day 0 (timeseries is 0 at day 0).
    """
    ret_df = pd.DataFrame()
    for index, row in factor_and_fp.iterrows():
        timestamp, equity = row['date'], row['equity']
        timestamp_idx = daily_perc_ret.index.get_loc(timestamp)
        start = timestamp_idx - days_before
        end = timestamp_idx + days_after
        series = daily_perc_ret.ix[start:end, equity]
        ret_df = pd.concat([ret_df, series], axis=1, ignore_index=True)

    # Reset index to have the same starting point (from datetime to day offset)
    ret_df = ret_df.apply(lambda x: x.dropna().reset_index(drop=True), axis=0)
    ret_df.index = range(-days_before, days_after)

    # From daily percent returns to cumulative returns
    ret_df = (ret_df + 1).cumprod() - 1

    # Make returns be 0 at day 0
    if day_zero_align:
        ret_df -= ret_df.iloc[days_before, :]

    return ret_df
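Before the full tear sheet, here is a hedged sketch of how these helpers chain together (Quantopian research environment assumed; dates, universe size and the choice of the Liquidity factor are illustrative): build the factor history, attach forward price movements, sector-adjust them, then compute the IC.

# Hypothetical end-to-end check: sector-adjusted IC for the Liquidity factor
# defined above (any CustomFactor accepting only a `mask` argument works here).
daily_factor = construct_factor_history(Liquidity,
                                        start_date='2015-10-1',
                                        end_date='2015-12-1',
                                        factor_name='liquidity',
                                        top_liquid=500)

factor_and_fp = add_forward_price_movement(daily_factor, days=[1, 5, 10])
adj_factor_and_fp = sector_adjust_forward_price_moves(factor_and_fp)

ic, err = factor_spearman_rank_IC(adj_factor_and_fp,
                                  by_sector=False,
                                  factor_name='liquidity')
print(ic.head())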

def create_factor_tear_sheet(factor_cls,
                             factor_name='factor',
                             start_date='2015-10-1',
                             end_date='2016-2-1',
                             top_liquid=1000,
                             sector_names=None,
                             days=[1, 5, 10],
                             nquantiles=10,
                             ret_type='normal',  # normal, market_excess or beta_excess
                             days_before=36,
                             days_after=20,
                             ):

    factor = construct_factor_history(factor_cls,
                                      start_date=start_date,
                                      end_date=end_date,
                                      factor_name=factor_name,
                                      top_liquid=top_liquid,
                                      sector_names=sector_names)

    daily_perc_ret = get_daily_returns(factor, start_date, end_date,
                                       extra_days_before=days_before,
                                       extra_days_after=days_after,
                                       ret_type=ret_type)

    factor_and_fp = add_forward_price_movement(factor, days=days,
                                               prices=(daily_perc_ret + 1.0).cumprod())

    pattern_dict = {-2: 'HS', 2: 'IHS',
                    -1: 'BTOP', 1: 'BBOT',
                    -4: 'TTOP', 4: 'TBOT',
                    -3: 'RTOP', 3: 'RBOT'}

    for pattern, df in factor_and_fp.groupby(by='PatternFactor'):

        print "Pattern ", pattern_dict[pattern], " entries ", len(df.index)

        # Plot cumulative returns over time for each quantile
        plot_quantile_cumulative_return(df, daily_perc_ret,
                                        quantiles=nquantiles, by_quantile=False,
                                        factor_name=factor_name,
                                        days_before=days_before,
                                        days_after=days_after,
                                        std_bar=False)

        plot_quantile_cumulative_return(df, daily_perc_ret,
                                        quantiles=nquantiles, by_quantile=False,
                                        factor_name=factor_name,
                                        days_before=2, days_after=max(days) + 1,
                                        std_bar=True)

        # What are the sector-neutral factor decile mean returns for our
        # different forward price windows?
        plot_quantile_returns(df, by_sector=False, quantiles=nquantiles,
                              factor_name=factor_name)

        # As above but more detailed, we want to know the volatility of returns
        plot_quantile_returns_box(df, by_sector=False, quantiles=nquantiles,
                                  factor_name=factor_name)

from __future__ import division
from statsmodels.nonparametric.kernel_regression import KernelReg
from numpy import linspace
from scipy.signal import argrelextrema
from collections import defaultdict


def find_max_min(prices):
    prices_ = prices.copy()
    prices_.index = linspace(1., len(prices_), len(prices_))

    kr = KernelReg([prices_.values], [prices_.index.values],
                   var_type='c', bw=[1.8, 1])
    f = kr.fit([prices_.index.values])
    smooth_prices = pd.Series(data=f[0], index=prices.index)

    local_max = argrelextrema(smooth_prices.values, np.greater)[0]
    local_min = argrelextrema(smooth_prices.values, np.less)[0]

    price_local_max_dt = []
    for i in local_max:
        if (i > 1) and (i < len(prices) - 1):
            price_local_max_dt.append(prices.iloc[i - 2:i + 2].argmax())

    price_local_min_dt = []
    for i in local_min:
        if (i > 1) and (i < len(prices) - 1):
            price_local_min_dt.append(prices.iloc[i - 2:i + 2].argmin())

    prices.name = 'price'
    maxima = pd.DataFrame(prices.loc[price_local_max_dt])
    minima = pd.DataFrame(prices.loc[price_local_min_dt])
    max_min = pd.concat([maxima, minima]).sort_index()
    max_min.index.name = 'date'
    max_min = max_min.reset_index()
    max_min = max_min[~max_min.date.duplicated()]
    p = prices.reset_index()
    max_min['day_num'] = p[p['index'].isin(max_min.date)].index.values
    max_min = max_min.set_index('day_num').price

    return max_min


def find_patterns(max_min):
    patterns = defaultdict(list)

    for i in range(5, len(max_min) + 1):
        window = max_min.iloc[i - 5:i]

        # pattern must play out in less than 36 days
        if window.index[-1] - window.index[0] > 35:
            continue

        # Using the notation from the paper to avoid mistakes
        e1 = window.iloc[0]
        e2 = window.iloc[1]
        e3 = window.iloc[2]
        e4 = window.iloc[3]
        e5 = window.iloc[4]

        rtop_g1 = np.mean([e1, e3, e5])
        rtop_g2 = np.mean([e2, e4])

        # Head and Shoulders
        if (e1 > e2) and (e3 > e1) and (e3 > e5) and \
                (abs(e1 - e5) <= 0.03 * np.mean([e1, e5])) and \
                (abs(e2 - e4) <= 0.03 * np.mean([e1, e5])):
            patterns['HS'].append((window.index[0], window.index[-1]))

        # Inverse Head and Shoulders
        elif (e1 < e2) and (e3 < e1) and (e3 < e5) and \
                (abs(e1 - e5) <= 0.03 * np.mean([e1, e5])) and \
                (abs(e2 - e4) <= 0.03 * np.mean([e1, e5])):
            patterns['IHS'].append((window.index[0], window.index[-1]))

        # Broadening Top
        elif (e1 > e2) and (e1 < e3) and (e3 < e5) and (e2 > e4):
            patterns['BTOP'].append((window.index[0], window.index[-1]))

        # Broadening Bottom
        elif (e1 < e2) and (e1 > e3) and (e3 > e5) and (e2 < e4):
            patterns['BBOT'].append((window.index[0], window.index[-1]))

        # Triangle Top
        elif (e1 > e2) and (e1 > e3) and (e3 > e5) and (e2 < e4):
            patterns['TTOP'].append((window.index[0], window.index[-1]))

        # Triangle Bottom
        elif (e1 < e2) and (e1 < e3) and (e3 < e5) and (e2 > e4):
            patterns['TBOT'].append((window.index[0], window.index[-1]))

        # Rectangle Top
        elif (e1 > e2) and (abs(e1 - rtop_g1) / rtop_g1 < 0.0075) and \
                (abs(e3 - rtop_g1) / rtop_g1 < 0.0075) and \
                (abs(e5 - rtop_g1) / rtop_g1 < 0.0075) and \
                (abs(e2 - rtop_g2) / rtop_g2 < 0.0075) and \
                (abs(e4 - rtop_g2) / rtop_g2 < 0.0075) and \
                (min(e1, e3, e5) > max(e2, e4)):
            patterns['RTOP'].append((window.index[0], window.index[-1]))

        # Rectangle Bottom
        elif (e1 < e2) and (abs(e1 - rtop_g1) / rtop_g1 < 0.0075) and \
                (abs(e3 - rtop_g1) / rtop_g1 < 0.0075) and \
                (abs(e5 - rtop_g1) / rtop_g1 < 0.0075) and \
                (abs(e2 - rtop_g2) / rtop_g2 < 0.0075) and \
                (abs(e4 - rtop_g2) / rtop_g2 < 0.0075) and \
                (max(e1, e3, e5) > min(e2, e4)):
            patterns['RBOT'].append((window.index[0], window.index[-1]))

    return patterns


def _pattern_identification(prices, indentification_lag):
    max_min = find_max_min(prices)

    # we are only interested in the last pattern (if multiple patterns are
    # present), and the last min/max must have happened less than
    # "indentification_lag" days ago; otherwise it must have already been
    # identified, or it is too late to be useful
    max_min_last_window = None
    for i in reversed(range(len(max_min))):
        if (prices.index[-1] - max_min.index[i]) == indentification_lag:
            max_min_last_window = max_min.iloc[i - 4:i + 1]
            break

    if max_min_last_window is None:
        return np.nan

    # possibly identify a pattern in the selected window
    patterns = find_patterns(max_min_last_window)
    if len(patterns) != 1:
        return np.nan

    name, start_end_day_nums = patterns.iteritems().next()
    pattern_code = {'HS': -2, 'IHS': 2,
                    'BTOP': -1, 'BBOT': 1,
                    'TTOP': -4, 'TBOT': 4,
                    'RTOP': -3, 'RBOT': 3}

    return pattern_code[name]


class PatternFactor(CustomFactor):
    params = ('indentification_lag',)
    inputs = [USEquityPricing.close]
    window_length = 40

    def compute(self, today, assets, out, close, indentification_lag):
        prices = pd.DataFrame(close, columns=assets)
        out[:] = prices.apply(_pattern_identification, args=(indentification_lag,))
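Finally, a hedged usage sketch of the full tear sheet with the pattern factor. construct_factor_history instantiates the factor class with only a mask keyword, while PatternFactor also needs indentification_lag, so a functools.partial is one way to bridge the two; the dates and lag value here are illustrative, and the plot_* helpers are assumed to be defined in other cells of the notebook.

from functools import partial

# Bind the pattern-identification lag so the class can be instantiated with
# only a `mask` argument inside construct_factor_history.
pattern_factor_cls = partial(PatternFactor, indentification_lag=1)

create_factor_tear_sheet(pattern_factor_cls,
                         factor_name='PatternFactor',
                         start_date='2015-10-1',
                         end_date='2016-2-1',
                         top_liquid=1000,
                         days=[1, 5, 10],
                         nquantiles=10,
                         ret_type='normal',
                         days_before=36,
                         days_after=20)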