While there are a few different ways you could analyze a financial event, two strategies stand out as the easiest and most widely used. The first is to build an algorithm based on the event and compare its performance to the SPY (for those who haven’t seen it, I highly suggest taking a look here first). The second is to conduct an actual event study that measures the impact of an event on a stock’s price. That impact is measured through something called abnormal returns: the returns attributable to the event, relative to what the returns would normally have been. Understanding the abnormal returns around an event helps you judge whether the event can be a profitable, alpha-generating signal for trading. So while my algorithm (mentioned in the post above) generates more than 2 times the return of the SPY, I’m going to take a closer look at the event to understand HOW and WHY my algorithm might be doing that. In the process, you’re going to learn how to conduct an event study and replicate the results for your own datasets. I promise you’ll learn a lot and will walk away feeling a bit more comfortable with the new research platform.

Before I take you into all the code, I’m going to lay out the structure of this notebook so you have a general sense of where I’m going:

First, I’m going to spend some time loading in the data and writing a few helper functions that I’m going to use repeatedly throughout my notebook.

Next, I’ll be looking first at the average cumulative returns of all stocks around the time that an event was announced (you’ll see what I mean) and the average cumulative abnormal returns of all stocks in the same time period. I look at both in order to compare how the event does independently of a benchmark and how it’d fare if compared simply against the SPY.

Lastly, I’m going to look at the volatility of the average returns that I found in the previous step, just to get a sense of the noise level of this event.

In [14]:

"""Step One: Load our imports and data"""importnumpyasnpimportpandasaspdimportseabornassnsimportmatplotlib.pyplotaspyplotfromdatetimeimporttimedeltamin_date=pd.to_datetime("12/31/2006")#: EventVestor dataev_data_raw=local_csv('event_vestor_data_complete.csv')ev_data=local_csv('event_vestor_data_complete.csv',symbol_column='symbol',date_column='trade_date')#: Converting our Dates into a Series of Datetimes so we can do some date logic easilyev_data_raw['event_date']=pd.to_datetime(ev_data_raw['event_date'])ev_data_raw=ev_data_raw[(ev_data_raw['event_date']>min_date)]ev_data=ev_data[(ev_data.index>min_date)]ev_data=ev_data[(ev_data['symbol']!=0)]"""Get just the closing and open prices for our symbols"""ev_symbols=symbols(np.append(ev_data_raw['symbol'].unique(),['SPY']),handle_missing='ignore')data=get_pricing(ev_symbols,fields=['close_price'],start_date='2007-01-01',end_date='2015-02-01')

By now, I've loaded in essentially two pieces of data:

EventVestor's Share Buyback's Announcement Data

Stock pricing data for all valid tickers (close price)

Since I have the close price for all these securities from 2007 to 2015, what I'm going to do is look at a band around each ticker's share buyback announcement date and track the movement of its stock price. By band I mean a specific timeframe around the event (which I'm specifying as t=0).
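To make the band idea concrete, here's a small standalone sketch. The date and the ±3-day window are made up purely for illustration; the actual study below uses a 30-day window on real announcement dates.

```python
import pandas as pd

# Hypothetical announcement date; the real notebook pulls these from EventVestor.
event_date = pd.Timestamp("2010-06-15")

# Build a band of business days from t=-3 to t=+3 around the event.
band = pd.bdate_range(event_date - pd.tseries.offsets.BDay(3),
                      event_date + pd.tseries.offsets.BDay(3))

# t=0 is the event date itself; negative offsets are calendar days before it.
offsets = [(d - event_date).days for d in band]
```

The returns at each offset are then measured relative to the start of the band, which is what produces the cumulative-return curves plotted later.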

In [15]:

"""Step Two: Creating some helper functions to find our open prices and close prices"""defget_close_price(data,sid,current_date,day_number):#: If we're looking at day 0 just return the indexed dateifday_number==0:returndata['close_price'].ix[current_date][sid]#: Find the close price day_number away from the current_dateelse:#: If the close price is too far ahead, just get the last availabletotal_date_index_length=len(data['close_price'].index)#: Find the closest date to the target datedate_index=data['close_price'].index.searchsorted(current_date+timedelta(day_number))#: If the closest date is too far ahead, reset to the latest date possibledate_index=total_date_index_length-1ifdate_index>=total_date_index_lengthelsedate_index#: Use the index to return a close price that matchesreturndata['close_price'].iloc[date_index][sid]defget_first_price(data,starting_point,sid,date):starting_day=date-timedelta(starting_point)date_index=data['close_price'].index.searchsorted(starting_day)returndata['close_price'].iloc[date_index][sid]defremove_outliers(returns,num_std_devs):returnreturns[~((returns-returns.mean()).abs()>num_std_devs*returns.std())]defget_returns(data,starting_point,sid,date,day_num):#: Get stock pricesfirst_price=get_first_price(data,starting_point,sid,date)close_price=get_close_price(data,sid,date,day_num)#: Calculate returnsret=(close_price-first_price)/(first_price+0.0)returnret

The get_close_price function is definitely a little tricky at first. The basic gist is that Pandas provides a very helpful method called searchsorted that lets you look at the dates of an index and find the closest date, BUT it returns an index number, not the actual index. I then take that index number and use .iloc to get the close price that belongs to it.
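Here's a tiny standalone illustration of that searchsorted-plus-.iloc pattern. The index and prices are made-up values, not the real pricing data:

```python
import pandas as pd

# A small date-indexed price series standing in for data['close_price'].
idx = pd.DatetimeIndex(["2014-01-02", "2014-01-03", "2014-01-06"])
prices = pd.Series([100.0, 101.5, 99.8], index=idx)

# 2014-01-04 (a Saturday) isn't in the index. searchsorted returns the
# POSITION where it would be inserted, i.e. the slot of the next trading
# day (2014-01-06), not a date.
pos = idx.searchsorted(pd.Timestamp("2014-01-04"))

# .iloc turns that positional index back into an actual price.
closest_price = prices.iloc[pos]
```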

In [606]:

"""Step Three: Calculate average cumulative returns"""#: Dictionaries that I'm going to be storing calculated data in all_returns={}all_std_devs={}total_sample_size={}#: Create our range of day_numbers that will be used to calculate returnsstarting_point=30#: Looking from -starting_point till +starting_point which creates our timeframe bandday_numbers=[iforiinrange(-starting_point,starting_point)]forday_numinday_numbers:#: Reset our returns and sample size each iterationreturns=[]sample_size=0#: Get the return compared to t=0 fordate,rowinev_data.iterrows():sid=row.symbol#: Make sure that data exists for the datesifdatenotindata['close_price'].indexorsidnotindata['close_price'].columns:continuereturns.append(get_returns(data,starting_point,sid,date,day_num))sample_size+=1#: Drop any Nans, remove outliers, find outliers and aggregate returns and std devreturns=pd.Series(returns).dropna()returns=remove_outliers(returns,2)all_returns[day_num]=np.average(returns)all_std_devs[day_num]=np.std(returns)total_sample_size[day_num]=sample_size#: Take all the returns, stds, and sample sizes that I got and put that into a Seriesall_returns=pd.Series(all_returns)all_std_devs=pd.Series(all_std_devs)N=np.average(pd.Series(total_sample_size))

It's clear that a spike exists around t=0: roughly a 1% jump from the close of t=-1 to t=0, followed by a ~1% drift from t=0 through t=30.

Interestingly, you'll see that most of the spike begins at the end of trading on t=-1 and ends during the trading day of t=0. This is pretty common for a big event like this, since the fastest reactions happen immediately after the event is announced. In this case, the buyback announcement might've occurred after market close on t=-1, and by the time the market opens on t=0, most of that alpha is gone. That's where the drift comes in.

Immediately after the 1% spike in average return, there's this gentle drift upwards (also ~1%). This is what retail investors cash in on. This is what YOU can trade on.

So by now you might be saying, "Okay! Then let me trade on it!" Well, it isn't as simple as that. Remember that up till now, I've only shown you cumulative returns, not cumulative abnormal returns.

By definition, abnormal returns are the difference between a security's actual returns and its expected returns. In this case, I'm measuring how much of the returns are "triggered" by an event (a share buyback announcement). Here's the simplest version of how to calculate abnormal returns:

AR = Stock Return - (Beta*Market Return)
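As a quick sanity check of that formula, with made-up numbers (not from the dataset): a beta of 1.2 and a 3% stock return against a 1% market return leaves 1.8% unexplained by market movement.

```python
def abnormal_return(stock_return, market_return, beta):
    # AR = Stock Return - (Beta * Market Return)
    return stock_return - beta * market_return

# 0.03 - (1.2 * 0.01) = 0.018, i.e. a 1.8% abnormal return.
ar = abnormal_return(0.03, 0.01, 1.2)
```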

In [615]:

"""Comparing with the benchmark's cumulative returns"""all_returns={}benchmark_returns={}#: Create our range of day_numbers that will be used to calculate returnsstarting_point=30day_numbers=[iforiinrange(-starting_point,starting_point)]forday_numinday_numbers:#: Reset our returns and sample size each iterationreturns=[]b_returns=[]sample_size=0#: Get the return compared to t=0 fordate,rowinev_data.iterrows():sid=row.sid#: Make sure that data exists for the datesifdatenotindata['close_price'].indexorsidnotindata['close_price'].columns:continuereturns.append(get_returns(data,starting_point,sid,date,day_num))#: 8554 is the sid for the benchmarkb_returns.append(get_returns(data,starting_point,8554,date,day_num))#: Drop any Nans, remove outliers, find outliers and aggregate returns and std devall_returns[day_num]=np.average(remove_outliers(pd.Series(returns).dropna(),2))benchmark_returns[day_num]=np.average(pd.Series(b_returns).dropna())#: Plotxticks=[dfordinday_numbersifd%2==0]all_returns=pd.Series(all_returns)all_returns.plot(xticks=xticks,label="PSBAD")benchmark_returns=pd.Series(benchmark_returns)benchmark_returns.plot(xticks=xticks,label='Benchmark')pyplot.title("Comparing the benchmark's average returns around that time to PSBAD")pyplot.ylabel("% Cumulative Return")pyplot.xlabel("Time Window")pyplot.legend()pyplot.grid(b=None,which=u'major',axis=u'y')

In [17]:

"""Now plotting strictly the abnormal returns using a rolling 30 day beta"""defcalc_beta(stock,benchmark,price_history):""" Calculate our beta amounts for each security """stock_prices=price_history[stock].pct_change().dropna()bench_prices=price_history[benchmark].pct_change().dropna()aligned_prices=bench_prices.align(stock_prices,join='inner')bench_prices=aligned_prices[0]stock_prices=aligned_prices[1]bench_prices=np.array(bench_prices.values)stock_prices=np.array(stock_prices.values)bench_prices=np.reshape(bench_prices,len(bench_prices))stock_prices=np.reshape(stock_prices,len(stock_prices))iflen(stock_prices)==0:returnNonem,b=np.polyfit(bench_prices,stock_prices,1)returnm#: Create our range of day_numbers that will be used to calculate returnsab_all_returns={}ab_volatility={}starting_point=30day_numbers=[iforiinrange(-starting_point,starting_point)]forday_numinday_numbers:#: Reset our returns and sample size each iterationreturns=[]b_returns=[]sample_size=0#: Get the return compared to t=0 fordate,rowinev_data.iterrows():sid=row.sid#: Make sure that data exists for the datesifdatenotindata['close_price'].indexorsidnotindata['close_price'].columns:continueret=get_returns(data,starting_point,sid,date,day_num)b_ret=get_returns(data,starting_point,8554,date,day_num)""" Calculate beta by getting the last X days of data 1. Create a DataFrame containing the data for the necessary sids within that time frame 2. 
Pass that DataFrame into our calc_beta function in order to spit out a beta """history_index=data['close_price'].index.searchsorted(date)history_index_start=max([history_index-starting_point,0])price_history=data['close_price'].iloc[history_index_start:history_index][[sid,8554]]beta=calc_beta(sid,8554,price_history)ifbetaisNone:continue#: Calculate abnormal returnsabnormal_return=ret-(beta*b_ret)returns.append(abnormal_return)#: Drop any Nans, remove outliers, find outliers and aggregate returns and std devreturns=pd.Series(returns).dropna()returns=remove_outliers(returns,2)ab_volatility[day_num]=np.std(returns)ab_all_returns[day_num]=np.average(returns)

Just a few things to note: the same general pattern holds. There's the quick spike in price directly after the announcement and a general positive drift a few days after. The main difference is that in the case of the cumulative abnormal returns (where you're comparing against the SPY), there's a plateau and even a downward movement towards the end of the time frame. This is why, in the algorithm, the holding period was set to 7 days rather than something longer or shorter: to maximize capturing the drift after the buyback announcement.

What's also important to note here is that you should focus on the movement of the stock price, not necessarily its absolute numbers! For example, you get about a 0.5% abnormal drift after the buyback announcement (on average).

A buyback can affect companies differently, and it really does depend on investor perception. For example, if investors perceive that a buyback comes from a good place (e.g. internal executives really do believe the company is undervalued), the price can react positively. But there's the inverse, where the buyback is perceived negatively and the price reacts downwards instead. So in an effort to capture just how differently the stock price can react, I'm going to look at the volatility of returns after the event.

In [611]:

"""Plotting the same graph but with error bars"""all_std_devs.ix[:-1]=0pyplot.errorbar(all_returns.index,all_returns,xerr=0,yerr=all_std_devs,label="N=%s"%N)pyplot.grid(b=None,which=u'major',axis=u'y')pyplot.title("Cumulative Return from Share Buyback Announcements before and after event with error")pyplot.xlabel("Window Length (t)")pyplot.ylabel("Cumulative Return (r)")pyplot.legend()pyplot.show()

So as you can see, the volatility is quite large in either direction. This implies, as is true for many events, that there's a good bit of noise in there. It's something you'll have to take into account when creating an algorithm based on this event.
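That noise is also why the calculations above run everything through remove_outliers before averaging. Here's the same 2-standard-deviation filter from Step Two applied to a toy set of returns with one obvious outlier (the numbers are invented for illustration):

```python
import pandas as pd

# Same helper as in Step Two: drop values more than num_std_devs
# standard deviations away from the mean.
def remove_outliers(returns, num_std_devs):
    return returns[~((returns - returns.mean()).abs() > num_std_devs * returns.std())]

# Nine small daily-return-sized values plus one wild 90% "return".
returns = pd.Series([0.01, 0.02, -0.01, 0.015, 0.0,
                     0.005, -0.005, 0.012, 0.008, 0.9])

# The 0.9 observation sits well beyond 2 standard deviations and is dropped.
filtered = remove_outliers(returns, 2)
```

Note that with very small samples a single outlier can inflate the standard deviation enough to survive the filter, so the cutoff works best once each day's bucket has a reasonable number of observations.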

One thing you might find helpful is to narrow down the scope of your data. In previous algorithms, I've done that with the percent of shares bought back or the specific sector that a stock belongs to (e.g. Technology or Finance). Here, I'm going to do something a little different and compare year over year to see whether the year affects the volatility of a share buyback.

"""Step Three: Going through our same volatility and return calculations from above and findinganswers for different types of datasets"""starting_point=30day_numbers=[iforiinrange(-starting_point,starting_point)]defget_volatility_and_all_returns(ev_data_type,ab_volatility,ab_all_returns):forday_numinday_numbers:#: Reset our returns and sample size each iterationreturns=[]b_returns=[]sample_size=0#: Get the return compared to t=0 fordate,rowinev_data_type.iterrows():sid=row.symbol#: Make sure that data exists for the datesifdatenotindata['close_price'].indexorsidnotindata['close_price'].columns:continueret=get_returns(data,starting_point,sid,date,day_num)b_ret=get_returns(data,starting_point,8554,date,day_num)""" Calculate beta by getting the last X days of data 1. Create a DataFrame containing the data for the necessary sids within that time frame 2. Pass that DataFrame into our calc_beta function in order to spit out a beta """history_index=data['close_price'].index.searchsorted(date)history_index_start=max([history_index-starting_point,0])price_history=data['close_price'].iloc[history_index_start:history_index][[sid,8554]]beta=calc_beta(sid,8554,price_history)ifbetaisNone:continue#: Calculate abnormal returnsabnormal_return=ret-(beta*b_ret)returns.append(abnormal_return)#: Drop any Nans, remove outliers, find outliers and aggregate returns and std devreturns=pd.Series(returns).dropna()returns=remove_outliers(returns,2)ab_volatility[day_num]=np.std(returns)ab_all_returns[day_num]=np.average(returns)returnab_volatility,ab_all_returns#: Find volatility and return levels for both yearsab_volatility_9,ab_all_returns_9=get_volatility_and_all_returns(ev_data_9,{},{})ab_volatility_13,ab_all_returns_13=get_volatility_and_all_returns(ev_data_13,{},{})

Perhaps you could reduce that by looking at only big-cap companies, or those with a PE ratio greater than 10 but less than 20. Or maybe you even want to take out all technology and finance stocks. In my case, I used the percent of shares bought back as a way to filter down my securities. If you're curious about how I did that, check out the first notebook, which shows you some of the things I've just mentioned.

Share buyback announcements seem to have a significant impact on stock prices immediately following the event as well as in the days that follow (allowing retail investors to catch the drift). However, depending on the year you're trading in, the volatility of returns differs significantly, with years like 2009 being a lot more volatile than 2013. As always, the data is available through EventVestor (http://bit.ly/1zGbhXM) if you'd like to sample and test it out for yourselves.

And that concludes this notebook. Most of the code here is easily replicated and as always, the Quantopian team is here to answer any questions you might have.