11.1 – Co-integration of two time series

I guess this chapter will get a little complex. We will be skimming the surface of some higher-order statistical theory. I will try my best to stick to the practical stuff and avoid the fluff. I’ll try and explain these things from a trading point of view, but I’m afraid some amount of theory will be necessary for you to know.

Given the path ahead, I think it is necessary to recap what we have learned so far and put some order to it. Hence let me just summarize our journey so far –

From Chapter 1 to 7, we discussed a very basic version of a pair trade. We discussed this simply to lay a strong foundation for the higher-order pair trading technique, which is generally known as the relative value trade.

The relative value trade requires the use of linear regression

In linear regression, we regress a dependent variable, Y, against an independent variable, X.

When we regress – some of the outputs that are of interest are the intercept, slope, residuals, standard error, and the standard error of the intercept

The decision to classify a stock as dependent or independent really depends on the error ratio.

We calculate the error ratio by interchanging X and Y. The combination which offers the lowest error ratio defines which stock is X and which one is Y.

I hope you have read and understood everything that we have discussed up to this point. If not, I’d suggest you read the chapters again, get clarity, and then proceed.

Recollect, in the previous chapter, we discussed the residuals. In fact, I also mentioned that the bulk of the focus going forward will be on the residuals. It is time we study the residuals in more detail and try and establish the kind of behavior the residuals exhibit. In our attempt to do this, we will be introduced to two new terms – Cointegration and Stationarity.

Generally speaking, if two time series are ‘cointegrated’ (stock X and stock Y in our case), it means that the two stocks move together, and if there is a deviation from this movement, it is either temporary or can be attributed to a stray event. One can expect the two time series to revert to their regular orbit, i.e. converge and move together again, which is exactly what we want while pair trading. In other words, the pair that we choose to trade should be cointegrated.

So the question is – how do we evaluate if the two stocks are cointegrated?

Well, to check if the two stocks are cointegrated, we first need to run a linear regression on the two stocks, then take the residuals obtained from the regression, and check if the residuals are ‘stationary’.

If the residuals are stationary, it implies that the two stocks are cointegrated. If the two stocks are cointegrated, they move together, and therefore the ‘pair’ is ripe for tracking pair trading opportunities.
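The first half of that procedure can be sketched in a few lines of plain Python: fit Y on X, then collect the residuals that the stationarity check is later run on. This is only a minimal sketch; the prices and the function name `regression_residuals` are made up for illustration.

```python
def regression_residuals(x, y):
    """Fit y = intercept + slope * x by least squares; return the fit and residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual = actual Y minus the Y predicted from X.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return intercept, slope, residuals


# Hypothetical closing prices, for illustration only.
stock_x = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0]
stock_y = [50.0, 51.5, 50.5, 52.0, 54.0, 53.0]

intercept, slope, residuals = regression_residuals(stock_x, stock_y)
print(f"Y = {intercept:.3f} + {slope:.3f} * X")
print([round(r, 3) for r in residuals])
```

The `residuals` list is the time series whose stationarity decides whether the pair is worth tracking.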

Here is an interesting way to look at this – one can take any two time series and apply regression, and the regression algorithm will always throw out an output. How would one know if the output is reliable? This is where stationarity comes into play. For our purpose, the regression relation is reliable only if the residuals are stationary. If the residuals are not stationary, the regression relation shouldn’t be used.

Speculating and setting up trades on a cointegrated pair is a lot more meaningful and is largely independent of market direction.

So, essentially, this boils down to figuring out if the residuals are stationary or not.

At this point, I could straight away show you how to check if the residuals are stationary or not. There is a simple test called the ‘ADF test’ to do this, and frankly, this is all you need to know. However, I think you are better off if you spend a few minutes to understand what ‘Stationarity’ really means (without deep diving into the quants).

So, read the following section only if you are curious to know more, else go to the section which talks about the ADF test.

11.2 – Stationary and non-stationary series

A time series is considered ‘Stationary’ if it satisfies three simple statistical conditions. If the time series partially satisfies these conditions, like 2 out of 3 or 1 out of 3, then the stationarity is considered weak. If none of the three conditions are satisfied, then the time series is ‘non-stationary’.

The three simple statistical conditions are –

The mean of the series should be the same or within a tight range

The standard deviation of the series should be within a range

There should be no autocorrelation within the series – this means any particular value in the time series, say value ‘n’, should not be dependent on any other value before ‘n’. We will talk more about this at a later stage.

While pair trading, we only look for pairs which exhibit complete stationarity. Non-stationary series or weak stationary series will not work for us.

I guess it is best to take up an example (like a sample time series) and figure out what the above three conditions really mean and hopefully, that will help you understand ‘stationarity’ better.

For the sake of this example, I have two time series, with 9000 data points in each. I’ve named them Series A and Series B, and on this data, I will evaluate the above three stationarity conditions.

Condition 1 – The mean of the series should be the same or within a tight range

To evaluate this, I will split each of the time series data into 3 parts and calculate the respective mean for each part. The mean for all three different parts should be around the same value. If this is true, then I can conclude that the mean will more or less be the same even when new data points flow in the future.

So let us go ahead and do this. To begin with, I’m splitting the Series A data into three parts and calculating its respective means, here is how it looks –

Like I mentioned, I have 9000 data points in Series A and Series B. I have split Series A data points into 3 parts and as you can see, I’ve even highlighted the starting and ending cells for these parts.

The means for all three parts are similar, clearly satisfying the first condition.

I’ve done the same thing for Series B, here is how the mean looks –

Now as you can see, the mean for Series B swings quite wildly and therefore does not satisfy the first condition for stationarity.
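If you prefer code to a spreadsheet, the same split-and-compare check can be sketched in Python. The two series below are synthetic stand-ins generated with a fixed random seed (they are not the Series A and Series B from the snapshots), and the function names are mine.

```python
import random


def split_into_parts(series, parts=3):
    """Split a series into consecutive chunks of (nearly) equal size."""
    size = len(series) // parts
    chunks = [series[i * size:(i + 1) * size] for i in range(parts - 1)]
    chunks.append(series[(parts - 1) * size:])  # last chunk takes any remainder
    return chunks


def part_means(series, parts=3):
    """Mean of each chunk; for a stationary series these should be close."""
    return [sum(chunk) / len(chunk) for chunk in split_into_parts(series, parts)]


rng = random.Random(42)

# Stand-in for a stationary series: noise around a fixed level.
steady = [rng.gauss(0, 1) for _ in range(9000)]

# Stand-in for a non-stationary series: a random walk that drifts freely.
wandering = []
level = 0.0
for _ in range(9000):
    level += rng.gauss(0, 1)
    wandering.append(level)

print("steady    :", [round(m, 3) for m in part_means(steady)])
print("wandering :", [round(m, 3) for m in part_means(wandering)])
```

How close the three means need to be is a judgment call; the code only lays the numbers side by side.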

Condition 2 – The standard deviation should be within a range

I’m following the same approach here – I will go ahead and calculate the standard deviation for all the three parts for both the series and observe the values.

Here is the result obtained for Series A –

The standard deviation oscillates between 14-19%, which is quite ‘tight’ and therefore satisfies the 2nd stationarity condition.

Here is how the standard deviation works out for Series B –

Notice the difference? The standard deviation for Series B swings widely from one part to the next, so Series B is clearly not a stationary series. Series A, on the other hand, looks stationary at this point. We still need to evaluate the last condition, i.e. the autocorrelation bit, so let us go ahead and do that.
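Here is a small Python sketch of the same standard deviation check. The data is synthetic (one series with steady volatility, one whose volatility grows over time), and the max-to-min ratio is just one convenient way I chose to summarise how ‘tight’ the range is; it is not a formal test.

```python
import random
import statistics


def part_stdevs(series, parts=3):
    """Sample standard deviation of each consecutive chunk of the series."""
    size = len(series) // parts
    bounds = [(i * size, (i + 1) * size) for i in range(parts)]
    bounds[-1] = (bounds[-1][0], len(series))  # last chunk absorbs the remainder
    return [statistics.stdev(series[start:end]) for start, end in bounds]


def stdev_ratio(stdevs):
    """Largest chunk deviation divided by the smallest; 1.0 means identical."""
    return max(stdevs) / min(stdevs)


rng = random.Random(7)

# Made-up data: constant volatility versus volatility that grows over time.
steady_vol = [rng.gauss(0, 1) for _ in range(9000)]
growing_vol = [rng.gauss(0, 1 + 3 * i / 9000) for i in range(9000)]

for name, series in [("steady", steady_vol), ("growing", growing_vol)]:
    stdevs = part_stdevs(series)
    print(name, [round(s, 3) for s in stdevs], "ratio:", round(stdev_ratio(stdevs), 2))
```

A ratio close to 1 corresponds to the ‘within a range’ condition; a ratio that is several times larger is the kind of swing Series B shows.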

Condition 3 – There should be no autocorrelation within the series

In layman’s terms, autocorrelation is a phenomenon where a value in the time series depends on the values before it. A series with no autocorrelation is one where any given value is not really dependent on any other value before it.

For example, have a look at the snapshot below –

The 9th value in Series A is 29, and if there is no autocorrelation in this series, the value 29 is not really dependent on any values before it, i.e. the values from cell 2 to cell 8.

But the question is how do we establish this?

Well, there is a technique for this.

Assume there are 10 data points. I take the data from Cell 1 to Cell 9 and call this Series X, then take the data from Cell 2 to Cell 10 and call this Series Y. Now, calculate the correlation between Series X and Y. This is called the 1-lag correlation. The correlation should be close to 0.

I can do this for 2 lag as well, i.e. between Cell 1 to Cell 8, and then between Cell 3 to Cell 10. Again, the correlation should be close to 0. If this is true, then it is safe to assume that the series is not autocorrelated, and hence the 3rd condition for stationarity is satisfied.
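The lag technique above translates almost word for word into Python. This is a minimal sketch with function names of my choosing; for 10 data points and a lag of 1, `series[:-1]` is Cell 1 to Cell 9 and `series[1:]` is Cell 2 to Cell 10. The demo data is synthetic.

```python
import random
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lag_correlation(series, lag):
    """Correlation between the series and a copy of itself shifted by `lag` points."""
    return pearson(series[:-lag], series[lag:])


rng = random.Random(1)

# Made-up data: independent noise versus a random walk.
noise = [rng.gauss(0, 1) for _ in range(9000)]
walk = []
level = 0.0
for _ in range(9000):
    level += rng.gauss(0, 1)
    walk.append(level)

for name, series in [("noise", noise), ("walk", walk)]:
    print(name, "lag 1:", round(lag_correlation(series, 1), 3),
          "lag 2:", round(lag_correlation(series, 2), 3))
```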

I’ve calculated 2 lag correlation for Series A, and here is how it looks –

Remember, I’m subdividing Series A into two parts and creating two subseries i.e series X and series Y. The correlation is calculated on these two subseries. Clearly, the correlation is close to zero and with this, we can safely conclude that Time Series A is stationary.

Let’s do this for Series B as well.

I’ve taken a similar approach, and the correlation as you can see is quite close to 1.

So, as you can see, all the conditions for stationarity are met for Series A, which means the time series is stationary, while Series B is not.

I know that I’ve taken a rather unconventional approach to explaining stationarity and co-integration. After all, no statistical explanation is complete without those scary looking formulas. But this is a deliberate approach and I thought this would be the best possible way to discuss these topics, as eventually, our goal is to learn how to pair trade efficiently and not really deep dive into statistics.

Anyway, you may be wondering whether you really need to do all of the above to figure out if the time series (residuals) is indeed stationary. Well, like I said before, this is not required.

We only need to look at the results of something called the ‘ADF test’ to establish if the time series is stationary or not.

11.3 – The ADF test

The Augmented Dickey-Fuller test, or the ADF test, is perhaps one of the best techniques to test for the stationarity of a time series. Remember, in our case, the time series in consideration is the residuals series.

Basically, the ADF test formalises the checks we discussed above, including a multiple-lag process to account for autocorrelation within the series. Here is something you need to know – the output of the ADF test is not a definitive ‘Yes – this is a stationary series’ or ‘No – this is not a stationary series’. Rather, the output of the ADF test is a probability. Loosely speaking, it tells us how likely it is that the series is not stationary. (Strictly, it is the probability of getting a test result at least this strong if the series were in fact non-stationary, but the loose reading is good enough for our purpose.)

For example, if the output of the ADF test on a time series is 0.25, then, loosely, the series has a 25% chance of not being stationary or, in other words, a 75% chance of being stationary. This probability number is also called ‘The P value’.

To consider a time series stationary, the P value should be 0.05 (5%) or lower. In the same loose sense, this means the probability of the time series being stationary is 95% or higher.

Alright, so how do you run an ADF test?

Frankly, this is a highly complex process and unfortunately, I could not find a single source online which helps you run an ADF test for free. I do have an Excel sheet (which uses a paid plugin) to run an ADF test, but unfortunately, I cannot share it here. If I could, I would have.

If you are a programmer, I’ve been told that there are Python packages easily available to run an ADF test, so you could try that.
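For the programmers, here is a minimal sketch of the idea behind the test, written in plain Python so that it runs without any package. It computes only the basic (un-augmented) Dickey-Fuller statistic: regress the change in the series on its previous value and look at the t-statistic of that coefficient. The function name, the synthetic data, and the rough 5% cutoff of -2.86 (the large-sample Dickey-Fuller value for a regression with a constant) are assumptions for illustration. It does not produce a P value, and the critical values for regression residuals are stricter than for a raw series, so treat it as a teaching aid only.

```python
import random
from math import sqrt

# Assumed cutoff: large-sample 5% Dickey-Fuller value (constant, no trend).
APPROX_5PCT_CRITICAL = -2.86


def dickey_fuller_stat(series):
    """t-statistic of gamma in: change[t] = alpha + gamma * series[t-1] + error."""
    lagged = series[:-1]
    change = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(lagged)
    mean_x = sum(lagged) / n
    mean_c = sum(change) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (c - mean_c) for x, c in zip(lagged, change))
    gamma = sxy / sxx
    alpha = mean_c - gamma * mean_x
    sse = sum((c - alpha - gamma * x) ** 2 for x, c in zip(lagged, change))
    std_err_gamma = sqrt(sse / (n - 2) / sxx)
    if std_err_gamma == 0:
        raise ValueError("perfect fit: the t-statistic is undefined")
    return gamma / std_err_gamma


rng = random.Random(3)

# Made-up data: noise around a fixed level versus a random walk.
noise = [rng.gauss(0, 1) for _ in range(500)]
walk = []
level = 0.0
for _ in range(500):
    level += rng.gauss(0, 1)
    walk.append(level)

for name, series in [("noise", noise), ("walk", walk)]:
    stat = dickey_fuller_stat(series)
    verdict = "looks stationary" if stat < APPROX_5PCT_CRITICAL else "not convincing"
    print(f"{name}: statistic {stat:.2f} -> {verdict}")
```

A strongly negative statistic means the series keeps getting pulled back towards its mean. For real work, a library routine is the safer route; for instance, `adfuller` in Python’s `statsmodels.tsa.stattools` handles the extra lag terms and returns the test statistic together with a P value.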

But if you are a non-programmer like me, then you will be stuck at this stage. So here is what I will do: once in a week or 15 days, I will try and upload a ‘Pair Data’ sheet, which will contain the following information on the best possible combinations of pairs –

You will know which stock is X and which stock is Y

You will know the intercept and Beta of this combination

You will also know the p-value of the combination

The look-back period for generating this is 200 trading days. I’ve restricted this just to banking stocks, but hopefully, I can include more sectors going forward. To help you understand this better, here is the snapshot of the latest Pair Data sheet for banking stocks –

The first line suggests that Federal Bank as Y and PNB as X is a viable pair. This also means that the regression was run both ways (Federal as Y with PNB as X, and Federal as X with PNB as Y), the error ratio for both combinations was calculated, and Federal as Y with PNB as X was found to have the lower error ratio.

Once the order has been figured out (as in which one is Y and which one is X), the intercept and Beta for the combination are calculated. Finally, the ADF test is run and the P value is calculated. If you see, the P value for Federal Bank as Y and PNB as X is 0.365.

In other words, this is not a combination you should be dealing with, as the P value is far above the 0.05 cutoff; read loosely, the probability of the residuals being stationary is only 63.5%.

In fact, if you look at the snapshot above, you will find only 2 pairs which have the desired p-value, i.e. Kotak and PNB with a P value of 0.01, and HDFC and PNB with a P value of 0.037.

The p-values don’t usually change overnight. For this reason, I check the p-values once in 15 or 20 days and try and update them here.
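The screening step itself is a one-line filter. A small sketch, using only the three P values quoted above and the 0.05 cutoff; the dictionary layout and the function name are my own.

```python
P_VALUE_CUTOFF = 0.05

# P values quoted in the text for three of the banking pairs.
pair_p_values = {
    "Federal Bank / PNB": 0.365,
    "Kotak / PNB": 0.01,
    "HDFC / PNB": 0.037,
}


def tradable_pairs(p_values, cutoff=P_VALUE_CUTOFF):
    """Keep only the pairs whose ADF P value is at or below the cutoff."""
    return [pair for pair, p in p_values.items() if p <= cutoff]


print(tradable_pairs(pair_p_values))
```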

I think we have learned quite a bit in this chapter. A lot of information discussed here could be new for most of the readers. For this reason, I will summarize all the things you should know about Pair trading at this point –

The basic premise of pair trading

Basic overview of linear regression and how to perform one

In linear regression, we regress a dependent variable, Y, against an independent variable, X.

When we regress – some of the outputs that are of interest are the intercept, slope, residuals, standard error, and the standard error of the intercept

The decision to classify a stock as dependent or independent really depends on the error ratio.

We calculate the error ratio by interchanging X and Y. The combination which offers the lowest error ratio defines which stock is X and which one is Y.

The residuals obtained from the regression should be stationary. If they are stationary, then we can conclude that the two stocks are co-integrated

If the stocks are cointegrated, then they move together

Stationarity of a series can be evaluated by running an ADF test.

If you are not clear on any of the points above, then I’d suggest you give this another shot and start reading from Chapter 7.

In the next chapter, we will try and take up an example of a pair trade and understand its dynamics.

148 comments

Thanks Karthik.
Excellent writeup!!!
Is it possible to have the complete list of upcoming chapters to know where we are with regard this pair trade journey ?
Can we expect a few more chapters this month ? Sorry for being greedy…

I don’t plan for it in advance, but generally, go with the flow. To give you a rough idea, the next step would be to take up an example of a trade and try and put all the learning together. Hopefully, that will be exciting enough 🙂

Thank You Karthik sir,
Even though the ADF test is not available, you have taught us how to check stationarity using Excel by dividing the data into parts and calculating the mean, SD, and 2-lag correlation. But please mention how much variation in the mean and SD would represent a p-value of 0.05 (rough estimate).

For mean – I’d suggest a tight variation, not more than 3-5 points difference. For SD, technically you will have to look at the standard error of the standard deviation, but then, it may just get a little overboard. Stick to 5-10% at the most. This should result in a p-value of less than 0.05.

Thanks to you and Prakash for taking the pain to make us understand this chapter. Overall I am thoroughly enjoying this module. However, I have few questions in my mind while going thru’ this chapter. Hope you can clarify the doubts here.

1. You mentioned that the look back period is 200 trading days. When I am calculating the pair (let’s say PNB as x and Kotak Bank as Y), the Intercept coefficient I am arriving at is in the vicinity of 1111. However, in the sheet you shared it is around 1099. My data range is starting from 23rd June, 2017 till 13th Apr, 2018. Am I missing anything here. I am following the same procedure which you mentioned in chapter 9.

2. When I am calculating the p-value (using the python in-built packages), for the period as mentioned above – it is coming around .40 instead of .01. Not sure why such a huge difference. Can you please elaborate if there are any additional parameters go into calculating the p-value in your case.

Ok. Also, we have considered the data from 20th June 2017 to 10th Apr 2018. The intercept difference is due to that, I guess. Also, as you may have figured, in most ADF functions, one needs to give a lag. In our case it’s 5. The recommended value is the cube root of the number of data points (or thereabouts). Since we had 200 data points, the cube root is 5.8, and we decided to go with 5.

Thanks Prakash and Kartik..
For the p-value I use AmiBroker. Cointegration is not an inbuilt indicator for the p-value, so we have to send the data from AmiBroker out to Python. For that, search “how to calculate cointegration in amibroker” on marketcalls.in; there is a very good step-by-step explanation on that.
I find nifty/banknifty, ambujacem/acc and tatamtrdvr/tatamotors very stationary pairs to trade even on 60min chart too..
I keep searching stocks in same sectors only.

@kartik,
the p value for axis/icici showing 0.00 all time i look, what does it mean? Is it 100% probability that its mean reverting?

hi karthik
ami gives me data in this format, copy the links below and paste them in another tab to see the AmiBroker screens: http://prntscr.com/j9xawy http://prntscr.com/j9xc40 http://prntscr.com/j9xcgu http://prntscr.com/j9xct9
above are my favorite pairs. one can overlook the correlation data as i calculate 63 trading days correlation by the AmiBroker built-in function. but i took 252 trading days to calculate co-integration.
below is the correlation table link, which can run in AmiBroker with a simple AFL: http://prntscr.com/j9xeih
i have cointegration afl also but its not running properly otherwise we can just see the cointgration in tabular form in selected watchlist. so i keep looking cointegration in individual pairs only.

– while studying co-integration i find web pages for pure calculation of how to calculate co-integration, frankly i could not understand any of math symbol and calculation.

– regarding 0.5 or 0.05 about p-value, 0.05 is confirm. but the afl i m using with ami is simply outsource the data to python servers and displays the coint value to amibroker afl window. i think we should divide the displayed value with 10. because some good pair with my experiment showing the coint value of 0.20…so i m taking it as 0.02…and its working fine. any python coder can crack its afl and throw some more light about it.

– http://prntscr.com/jahlrd in this my data starts from 12/4/17 to 26/4/18 almost 252 days period..co-int value is 0.08 so i will take it as 0.008 and
– http://prntscr.com/jahp5z in this tatamtrdvr/tatamotors pair coint showing 0.18 so i m taking it as 0.018…u don’t believe that this pair is so tight spread that i m trading it on 60min basis and touch wood earning good money….pl click below link for hourly chart
– http://prntscr.com/jahsfo profit in hourly chart is less than day chart but there are plenty of trading opportunities..(on 5lac both side u can earn around 2500-brkrg in 2-3 days)

previously i started all good stocks pair, then after experience i narrow down it to good banking stock (total 104 pairs possible), but since last 6 month i narrowed down it further and i trade only nf/bnf, tatamtrdvr/tatamotors, hdfc/hdfcbank and acc/ambuja only.

besides theory any experience trader can tell after watching 2-5 yrs of daily chart that this pair is regularly mean-reverting or not?

Hey, Akash thanks for the insights. Nothing beats practical market experience 🙂
Btw, what makes you divide the p-value by 10?
I was not aware of TM DVR future, I’m sure it opens a window of opportunity.

Hi, Kartik,
dividing p-value by 10 was just my simple logic/intuition from motors/dvr exp…the long term chart showing its very tight spread and it regularly cross its mean, and coint value showing me 0.11 to 0.45 range, if we took this value this pair is not reliable for pair trade, but if we divide it by 10 we will get 0.011 and 0.045 and that would be v.good for pair trade. and second example was bn/n its coint showed me 0.15 to 0.68 but practically if we see that is also verygood pair. thats why i came on that conclusion that i should divide it with 10. i dont have any knowledge to write or read and understand AFL language… i believe in KISS. and i never regret for it. but i still not understand why axis/icici coint showing 0.00 value? that project is still under process…will let you know. mean time i m searching who has sound AFL coding understanding to crack the python AFL. hope this will help.

I m a regular student of Zerodha Varsity. I am wondering whether it is possible for you to have a separate chapter for “How Stock/Financial Market Operates” which cover basically the mechanism of stock market like Market Makers, Clearing Agents, etc. ( as there are many other components who operate on the back stage of Market and I’m just mentioning couple of them that I know. Hoping you will cover the rest) How they operate and who they are on the context of Indian market.

There is not so much stuff available online also on this subject. I personally think that one should have knowledge about the mechanism which will broad our knowledge and I believe knowledge is Power.

Sir I know this is not a proper question but this is just eating away at me. A few days back Airtel had announced its results and it was bad. But it was better than what the market was expecting. Still the next day, the share went up. What do you think caused this?

Sir what you just said only brings me a few more questions sir. I’m sorry to pester you like this.
1. You once said when you give market good news and bad news, it always reacts to good news first. By that logic, don’t you think the shares of Airtel and Axis should have gone down?
2. In the hindsight, do you think you could have predicted that even if those two companies posted bad earnings, it is going to go up? (I’m just asking that to see if seasoned traders can do that, since I had no clue it was a possibility. )
3. Is there a method to associate a particular news to its reaction to stock price. For example, in the above case, what were the factors that led to stock moving up?

I hugely appreciate what you’re doing to help fellow traders like myself. Varsity is a treasure and I encourage my friends to read it as well. Thanks in advance for the answer.

1) The market always looks at the future, Sundeep. So, they expect a better outlook for these stocks as they believe the worst could be over. But your guess is as good as mine
2) My colleague actually had a bet with another colleague that the stocks would go up the next day 🙂
3) This is largely dependent on your experience reading the markets

Fantastic series Karthik. I had not been here for a while and had to skim through to get here. You have successfully managed to keep it as a easy read. Hats off.

Considering that I am starting my journey as a full time trader n a month, I see myself coming here more often.
Question:
Do you have tools within kite to figure out cointegration and other analysis ? Are you considering bringing any capability around it?

Sir I have a very personal question to ask you. But since it relates to mindset of a good trader I decided to ask you anyway. How do you feel when your fellow trader made more money, assuming you started out with same amount of capital. I know I felt really bad when it happened to me. How do you deal with that?

Sundeep, this is personal. The way I react may be different from the way another person does. I think you should be happy since you can always check with your friend on what went right for him and learn from his success. End of the day, the only way to move ahead in markets is by having an open mind to learn and adapt. Good luck and keep learning 🙂

Sir you’ve written exhaustive text on trading using Technical and Quantitative methods. Can you write a module on trading using Fundamental analysis (based on earnings or news). If not, can you give some methodology on how to learn them ?

Yes sir that is exactly what I am doing right now. But the returns are all over the place. And I need some guidance. What books are there on the subject or if you can give me few pointers it would be very very helpful. Thank you.

Hello,
Thanks for providing wonderful modules in Varsity. I have some queries listed below:
When the next chapter will come?
How much time it will take to complete the entire module and how many more chapters will be added?
Can you please name some reference books or resources for a deeper understanding of Trading system and coding one by himself?

Hello sir
I have installed the EViews statistical package for a one-year trial period. In the “lag length” drop-down menu of the ADF test section there are many options available, like Schwarz Info Criterion, Hannan-Quinn criterion, Modified Akaike, and t-statistic, each giving a different P value for the same max lag of 15. You may see it here: http://prntscr.com/jdu0xn
Even when it gives a value below the threshold value of 0.05, the header reads as Null Hypothesis: Residual has a unit root. If the series has a unit root, how could it be a stationary series? I have taken a screenshot here: http://prntscr.com/jdu7m4
Thanks
Varsity student

Hello sir
My question is about updating the pair data everyday. I run regression analysis and copy the residual data and paste in another sheet where I analyse density curve. I’m repeating the same actions everyday. Can you suggest me some smart way to keep my excel sheet updated?
Next question: Residual, i get everyday, slightly differs than that of previous day albeit the difference is at third or fourth places after decimal. Should i paste the whole set of data or instead add one day data to the already existing column?
Thanks
Varsity student

I understand, Kumar. This actually needs some programming help and unfortunately, I can be of very little help in that perspective. You can update the latest close to get the latest position of the residuals.

Hello sir
Well, my quest for updating data fast has had some success. I learnt to use macros, but a macro runs on a fixed amount of data. I mean, if I recorded it to perform on 255 sets of data then it can’t run on 256 sets of data.
Now my excel sheet has become dynamic. Whenever i add new data (today’s close price) the oldest data in the column gets deleted on its own and i have the same number of data but different a starting date.
I want to know if there is any issue with such dynamic updating of data. Hope you got me 🙂
Thanks
Varsity student

So can both methods be used intraday as well? What will be the data series in that case? Will it be the 15 min close price in case of a 15 min chart? Which period would be more reliable, daily or intraday? And what should be the profit expectation in case of intraday, in percentage terms? Lastly, are there other pair trading methods apart from the two you showed us (btw, I found these two methods very informative and practical), and can you suggest some books or reading for the same? Thanks in advance

No Vinay, I would not suggest you do this for intraday. These pairs trades need time to evolve and this happens over 3-4 days. However, I have opened and closed pair trades on intraday basis, but this has happened due to luck and not design.

Good series on Pair Trading. Got me hooked!!
Just a small query..when you say data should be adjusted for bonus, split, dividend etc..Where do we get such data..I am Importing the data from NSE website..So can it be considered clean or else can you give any other source to obtain clean data?

I checked the NSE bhav copy which is published daily and does not have the adjusted price if you go back and pull the same file from the earlier dates like for example TCS whose price was recently got changed due to bonus .

Can you please share the link or tell me where to get the adjusted price ?

I am trying to reproduce all the steps which you mentioned in this blog. I have downloaded the excel sheet in which you have provided HDFC Bank and ICICI Bank data from 4th Dec 2015 until 4th Dec 2017. I have calculated slope, intercept, Standard Error and Standard Error of Intercept and finally the Error Ratio.

To calculate P value on time series data of residuals, I have used R language. There is a function adf.test() which executes ADF test on the given data. However, when I run the test, I receive the data as follows:
P value for residual of ICICI bank(Y) = 0.03729
P value for residual of HDFC bank(Y) = 0.08545

However, in your post, as you mentioned other results e.g. slope, intercept, Standard Error and Standard Error of Intercept and finally the Error Ratio, I could compare them to verify if the calculations that I am running are correct or not.

Can you please run the test on the same data and please confirm if the P values which I have received are correct or not? For reference, I run adf.test(c(The time series residual data here….)) function. Without passing any other arguments to this function. There are arguments by which Lag parameter can be defined. But I was not sure about that so ran the function with default arguments. Can you or someone from your team confirm if values which I have received are correct or not? If not then how exactly are they using R to get P values?

I got the latest excel sheet on 12th Jun 2018. Thanks for your pointer. Can you please clarify the From and To date of the data used for calculation of this sheet? So that I can use the same data and match the results precisely. Current I am using HDFC bank and ICICI bank data from 4th Dec 2015 until 4th Dec 2017 which you have shared in an excel sheet.

By reading your note for programmers in the following chapter I got that you are using last 200 days of data. Considering that you have data published last Pair data excel sheet which you have pointed out to me in your comment, I figured out that you are using last 200 days of data starting from 23th Aug 2017 till 12th Jun 2018. Now when I run my calculation on it, I could match values of beta, intercept, Std. Error and Sigma, precisely to decimal points. That gives my confidence that my calculations are correct. However, when I pass time series residual data of HDFC to adf.test function in R, the outcome is:

The p-value does not match with the excel sheet value of 0.2073132413. Can you please clarify how you guys are executing ADF test to get the number. If anyone in your team can tell me how to get to this number using R, that will be great.

Did you get any chance to look into the ADF test data and parameters. I am also facing the same issue as reported in the above post. When checked with the latest provided pair data excel, I could also match (from 23th Aug 2017 till 12th Jun 2018) all the values (beta, intercept, Std. Error and Sigma) precisely to decimal points except p-value. Could you please check on this.

I found one free excel add-in for ADF test. I have tested it but don’t know how useful the output is.
Can u please test it once, Just want to know it is the same we are looking for free ADF test.
To download, click on the link below: http://www.web-reg.de/adf_addin.html#

But Karthik I don’t know how to use python. Excel adf plug in will not take much of your time. Please help me, from last 2 months I stuck on adf test. And finally find a Excel plug in just run it once.
Please.

Dear Karthik,
As informed by someone, R Studio with the urca package is an excellent tool for the ADF test. There is a very simple programme to be written (credit YouTube). I guess you should share this solution with other viewers of Varsity, as it will be of great help to all of them. Instead of leaving the users at the end with the statement that they will be stuck at the ADF test, it is better to feed them this solution. It will be an added cause to your already running great cause.

Needless to say, thanx for this lovely module on pair trading, It was thrilling.

I have used it, but I have not verified the results yet. I request you to please run and verify it on your pair data sheet, if possible. I am sharing a YouTube link which will work for everyone: https://m.youtube.com/watch?v=mkHtP0nONJY

R studio is open source software. easy to download and install for free.

sincere request to u Karthik that pls verify the results and let us know if it is useful. Thnx in advance.

2 more queries Karthik,
1. running a script downloads dats for spot price but we r interested in futures. Analysis done on spot but trade to be taken on futures..is this wht we gonna do?
2. trade has to be initiated ONLY when Zscore touches nearly +2.5/-2.5 or it can also be initiated when it touches 2.6, 2.7 or 2.8?

We need them to value one stock versus the other. Remember, at the core of the strategy is linear regression where we try and explain the price of stock y (dependent) by using the price of stock x (independent).

Thanks for all the enlightening on financial modelling for pair trades. Few questions on the data for Hero motocorp and bajaj-auto i am trying to check for possible pair trade scenario for both companies with data count of 246 days for past 356 days till 14th december 2018.
-Having run the adf test in excel, with condition of constant only for unit root at levels got an adf score p value or prob of 0.153961 higher than 0.01
– Having run the same test with the test for unit root in first difference, the ADF result was positive for stationarity, with p values or prob values of 0.000000.

how should this be treated as there is stationarity at first difference while no stationarity at levels. should the adf data be considered only at levels and differences (1st, 2nd) ignored?

There are many type of trading systems.
1. News based
2. Single candlestick patterns like engulfing candle, dojis etc
Or patterns like double or triple tops and bottoms, head and shoulders, cup and handle etc
3. Combining indicators like RSI, MACD, Supertrend, Bollinger Bands, blah blah blah
4. Moving average crossovers
5. Trend following using S and R
6. Trend reversal or bottom fishing
7. Based on open interest and price relationship
8. Pair trading
9. Swing trade or intraday
10. Option trading
11. Just by watching price action
Many more…
Which is your favourite? I mean to say, with which have you been able to generate more consistent profits?

Thanks, Vijay. Hope to learn and grow from here 🙂
Apologies for not posting a reply earlier. Can you please share the technique of how you did the ADF test and also the test result file? Maybe you can mail that across to [email protected], addressing me. Thanks.

I have an account with Zerodha. You mentioned that you will upload the pair data sheet once in a week or 15 days. But I do not find this anywhere. Please help me find this so that I can trade based on a better p-value.

I have been trading all types of instruments, i.e. futures, options and equity spot, for the last ten years. But I have not been able to make a profit consistently and ultimately lost. I have knowledge of charts and tried different systems, but was not successful. So please give me suggestions on which method I should trade. I know trading is not a holy grail, but profit can be made consistently from trading. I am interested in short-term trading, not long-term. Please give me suggestions.