Tracking an Index with equal weight positions

I'd like to build a simple model that tracks an index, the Eurostoxx 50 (SX5E), with equal-weight positions (equal at the moment of the first order).

My idea is:

Get the historical composition of the index, daily

Add a column "MEMBERSHIP" to the CSV price file with a 0 or 1 indicating whether the security is a member of the index.

Rules should be something like this:

If ticker['MEMBERSHIP'] == 1 and ticker not in portfolio: buy at next open
If ticker['MEMBERSHIP'] == 0 and ticker in portfolio: exit long at next open
Size of each position: (portfolio mark-to-market + cash) / len(index_members)
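As a rough illustration of the rules above, here is a minimal sketch in plain Python, independent of BackTrader. The function name `rebalance_orders`, the tickers, and the equity figure are all hypothetical; the point is only to show the set logic and the equal-weight division.

```python
def rebalance_orders(members, portfolio, equity):
    """Return (buys, sells, target_value) for one bar.

    members   -- set of tickers flagged as index members on this bar
    portfolio -- set of tickers currently held
    equity    -- portfolio mark-to-market plus cash
    """
    buys = sorted(members - portfolio)    # member but not held: buy at next open
    sells = sorted(portfolio - members)   # held but no longer a member: exit at next open
    # Equal weight: split total equity across the current member count.
    target_value = equity / len(members) if members else 0.0
    return buys, sells, target_value


# Hypothetical example: four members, two holdings, one of them stale.
buys, sells, target_value = rebalance_orders(
    members={"AAA", "BBB", "CCC", "DDD"},
    portfolio={"AAA", "ZZZ"},
    equity=100_000.0,
)
print(buys, sells, target_value)
```

Note that the target value is computed from the last known mark-to-market, so the actual fill at the next open can differ from it.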

Is there a better (simpler) approach with BackTrader to accomplish something like this?

Obviously a better model would resize the positions to maintain equal weights, but let's keep things simple.

However, this could raise a problem. If you use target orders or the 'Portfolio Mark to market' from your question, BT uses the current bar's close price to calculate the value of the portfolio or of an individual stock, but the order is issued at the next open, where the price will be different. Say the price goes up at the next open: the cash required will be higher, and the trade will fail for some stocks due to insufficient cash. This problem is discussed in:

Get the historical composition of the index, daily
Add a column "MEMBERSHIP" to the CSV price file with a 0 or 1 indicating whether the security is a member of the index.

The thing here is that you seem to want to add well over 50 data feeds. No limitation there, just to make sure we are all on the same page. Each data feed will have the extra membership column, which will let you decide whether it is a target or not.

The problem here: survivorship bias. More likely than not, the stocks did not all start trading in the market (not the index) at the same time, which means that the system won't go into next until all assets and their associated indicators have enough data to deliver values.

@backtrader Thanks for your reply, I couldn't agree more.
In the real world this kind of rebalancing problem is often solved during market hours, when other players come into play and the buy/sell order sizes change accordingly; in the end the portfolio won't balance perfectly anyway. I see why people want to run backtests as close to the real world as possible, though history doesn't simply repeat itself. For my part, I just cap usage at 95% of total portfolio value so there is extra room for cash.
But yeah, thanks to the vast arsenal of BT, people can do whatever they want, I guess.
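To make the cash problem and the 95% buffer concrete, here is a small worked example in plain Python. The prices, the allotted cash, and the helper name `shares_for_target` are made up for illustration; this is not how BT sizes orders internally, only the arithmetic behind the issue.

```python
import math


def shares_for_target(target_value, last_close, buffer=0.95):
    """Whole shares to buy, sized on the last close with a cash buffer."""
    return math.floor(target_value * buffer / last_close)


target_value = 10_000.0   # cash allotted to one position (hypothetical)
last_close = 50.0         # price used for sizing
next_open = 51.0          # the order fills here, 2% higher

unbuffered = shares_for_target(target_value, last_close, buffer=1.0)
buffered = shares_for_target(target_value, last_close, buffer=0.95)

cost_unbuffered = unbuffered * next_open   # exceeds the allotted cash
cost_buffered = buffered * next_open       # fits within the allotted cash

print(unbuffered, cost_unbuffered, cost_unbuffered > target_value)
print(buffered, cost_buffered, cost_buffered <= target_value)
```

A 5% buffer only absorbs opening gaps of up to roughly 5%; a larger gap would still leave the order short of cash.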

To be sure we are on the same page, I'd like to specify how I would write the process if I had to code it from scratch not using BackTrader and dealing with realtime.

Let’s add a simple moving average system to complicate things a little bit.

For every day in test and for real time:

Ask the data provider (I use Bloomberg professional but I can also download data into CSV/Excel from it) for the composition of index "X" on that date. For Eurostoxx 50 it would give me a list of 50 tickers. Obviously this list will change over time; usually it is reviewed every three months. I could optimize the request so it is made only at the end of the quarter, but to be sure let’s ask Bloomberg for the membership data every single day.

Ask the data provider for the last n historical values (PX_LAST) of every ticker in the members list, and compute a moving average on each series. Let’s say a 50-day moving average.

System rules:

If a ticker is in this list but not in my portfolio, BUY it at next open if the last close is above its moving average.

If a ticker is in my portfolio but not in the list (so it's no longer a member of the index), SELL it at next open. I don’t want it to be part of my portfolio.

If a ticker is in my portfolio and is also in the index list, but the last close is below the moving average, SELL it at next open.

Size:

Compute the size of each new position according to some formula (equal weight to be simple but could be some more complex algorithm like minimum variance, risk parity…)
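The three system rules above can be sketched in plain Python as follows. This is only an illustration under assumptions: the function names `sma` and `signals`, the tickers, and the 3-day period in the example are hypothetical (the post uses 50 days), and sizing is left out.

```python
def sma(closes, period):
    """Simple moving average of the last `period` closes, or None if too short."""
    if len(closes) < period:
        return None
    return sum(closes[-period:]) / period


def signals(members, portfolio, closes_by_ticker, period=50):
    """Orders to send at the next open, as a dict ticker -> 'BUY' or 'SELL'."""
    orders = {}
    for ticker in sorted(members | portfolio):
        closes = closes_by_ticker.get(ticker, [])
        avg = sma(closes, period)
        above = avg is not None and closes[-1] > avg
        below = avg is not None and closes[-1] < avg
        if ticker in portfolio and ticker not in members:
            orders[ticker] = "SELL"    # held, but dropped from the index
        elif ticker in portfolio and below:
            orders[ticker] = "SELL"    # held member, last close below its SMA
        elif ticker in members and ticker not in portfolio and above:
            orders[ticker] = "BUY"     # member not held, last close above its SMA
    return orders


# Hypothetical data with a 3-day average to keep the example short.
closes_by_ticker = {
    "AAA": [10.0, 11.0, 12.0],
    "BBB": [12.0, 11.0, 10.0],
    "CCC": [10.0, 11.0, 12.0],
    "ZZZ": [5.0, 5.0, 5.0],
}
orders = signals({"AAA", "BBB", "CCC"}, {"BBB", "ZZZ"}, closes_by_ticker, period=3)
print(orders)
```

A ticker with fewer than `period` closes produces no moving-average signal, so it is neither bought nor sold on the SMA rule; it is still sold if it leaves the index.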

Now, if I only want to backtest and not deal with realtime trading, I have a big advantage.

I could ask Bloomberg (or any other data provider that can give me this information) the historical composition of the index I'm tracking for each day in the test. In the end, at the very last day, with a simple loop I can get all the tickers that have been members of the index in the period of time between test_start_date and test_end_date. This is why I thought I could simply add a column “membership” with a Y/N flag.
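A minimal sketch of that loop, assuming the daily compositions have already been downloaded into a dict keyed by date. The dates and tickers are invented, and the 0/1 flag plays the role of the Y/N "membership" column.

```python
from datetime import date

# Hypothetical daily compositions: date -> set of member tickers.
composition = {
    date(2020, 1, 2): {"AAA", "BBB"},
    date(2020, 1, 3): {"AAA", "BBB"},
    date(2020, 1, 6): {"AAA", "CCC"},   # BBB replaced by CCC
}

# Every ticker that was a member at any point in the test window.
universe = sorted(set().union(*composition.values()))

# One membership flag per (date, ticker): 1 if a member that day, else 0.
membership = {
    (day, ticker): int(ticker in members)
    for day, members in composition.items()
    for ticker in universe
}

print(universe)
print(membership[(date(2020, 1, 6), "BBB")], membership[(date(2020, 1, 6), "CCC")])
```

Each flag depends only on that day's composition, so writing it into the price file does not by itself leak future information; only the choice of which tickers to load uses the whole window.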

You’re right that not all the historical series would start on the same day. We have two options here:

Find a way to have BackTrader asking the CSV data provider, at each heartbeat, the composition of the index, the same as I’d do with Bloomberg in real time if I had to write the whole process from scratch. Is it possible with BackTrader?

Cheat a little bit with the benefit of hindsight. Before importing the data, align all the series with pandas so that they start on the same day. This could mean having to add another column to be sure I won’t buy/sell a ticker on fake data. Sure, I would have to pre-process all the data with pandas before feeding it to BackTrader, but it wouldn't be a big deal. And if done with care it won't damage the validity of the test.
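The second option could look roughly like this. The sketch uses plain Python rather than pandas to stay dependency-free, and the function name `align`, the `valid` flag, and the sample quotes are assumptions made for illustration; the same idea maps onto reindexing a DataFrame to a common calendar and adding a flag column.

```python
def align(series, calendar):
    """Pad a non-empty {date: close} series onto a common calendar.

    Returns rows (date, close, valid). Days without a real quote get a
    placeholder close and valid=0, so the strategy can refuse to trade on them.
    """
    first_close = series[min(series)]
    rows = []
    last = None
    for day in calendar:
        if day in series:
            last = series[day]
            rows.append((day, last, 1))          # real quote
        elif last is None:
            rows.append((day, first_close, 0))   # before the first quote: fake data
        else:
            rows.append((day, last, 0))          # gap: carry the last close forward
    return rows


# Hypothetical stock whose history starts one day after the common calendar.
calendar = ["2020-01-02", "2020-01-03", "2020-01-06"]
late_starter = {"2020-01-03": 20.0, "2020-01-06": 21.0}
rows = align(late_starter, calendar)
print(rows)
```

The padded closes before the first real quote are taken from the future, so any indicator computed across them is contaminated; the `valid` flag is what keeps the strategy from acting on those bars.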

Thanks, I am well aware of the problems of computing the size using the last close, the only data we know for sure, and of the possible slippage I would get at the next open (positive or negative). I guess the best option is, as you pointed out, to keep some free cash (5%?) and to prioritise the sell orders to be sure that all the buy orders get filled. I didn't mention it in my post only to keep things as simple as I could.

So in __init__ I create a list of all stocks and an attribute called lengthlist for every stock. Then on every bar, the length of each stock is checked and appended; if the length starts to grow, it means the data has started to be meaningful and can be used for later calculations. Everything in next() is just a repetition of prenext(), so your strategy will run even if not all the data has come into play.
This way you won't need to manipulate the original data or add extra lines.
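The original code is not reproduced here, so the following is only a guess at the length-tracking idea, written in plain Python. Ordinary lists stand in for data feeds (both support `len()`), and the names `live_feeds` and `length_history` are hypothetical; inside a strategy, a helper like this would presumably be called from both prenext() and next().

```python
def live_feeds(datas, length_history):
    """Record len() of each feed and return the indices whose length has grown.

    datas          -- sequence of objects supporting len() (stand-ins for feeds)
    length_history -- dict index -> list of lengths seen so far (mutated in place)
    """
    live = []
    for i, d in enumerate(datas):
        history = length_history.setdefault(i, [])
        previous = history[-1] if history else 0
        history.append(len(d))
        if history[-1] > previous:   # a new bar arrived for this feed
            live.append(i)
    return live


# Stand-in feeds: feed 0 has not started yet, feed 1 already has one bar.
feed_a, feed_b = [], [101.0]
length_history = {}

bar1 = live_feeds([feed_a, feed_b], length_history)
feed_a.append(50.0)
feed_b.append(102.0)
bar2 = live_feeds([feed_a, feed_b], length_history)
feed_b.append(103.0)          # only feed 1 delivers a bar this time
bar3 = live_feeds([feed_a, feed_b], length_history)
print(bar1, bar2, bar3)
```

Only the feeds returned for the current bar would then be considered for signals and sizing.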

Please let me know if there is a better or more intuitive way, as I think prenext() was originally there for indicators like SMA, to handle the bars before the required period is met. A switch to call next() even when not all data lengths are met would be very handy.

Thanks for sharing your code! It might be a simple and effective shortcut for the alignment problem. Still, I have to know whether a particular stock is a member of the index or not. Pre-processing the data would also have the added benefit of speeding up any ranking of the index members for portfolio selection. What do you think?

I think pre-processing is required if additional information is needed for stock selection.

What I did was put all the additional information into the volume line instead of adding new lines, since the default BT behaviour is to ignore volume anyway (I hope more complicated logic can be included in the future), and it seems intuitive to trade only when volume is not 0. So in your case the volume value is set to 1 if a particular stock is in the index. Of course you may want to encode additional information in the volume line, like tradability etc., but always watch out for look-ahead bias.

In my case I need to add about 3600 stocks to BT, and running a 5-year daily-bar backtest takes about 18 minutes. I preprocess the data with pandas and create a custom pandas feed. I have tested that having fewer lines actually speeds the process up a bit.

@nicola-prada
I think pre-processing is required if additional information is needed for stock selection.

Yes. The coolest part would be being able to also rank, day by day, only the stocks that are members of the index according to one or more indicators. And to use the rank position as a system parameter to generate buy and sell signals.
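A day-by-day ranking restricted to index members could be sketched like this in plain Python. The function name `rank_members`, the indicator values, and the `top_n` parameter are hypothetical; any indicator that yields one number per ticker per day would fit.

```python
def rank_members(indicator_by_ticker, members, top_n):
    """Rank current index members by indicator value, highest first.

    Returns (rank, selected): rank maps ticker -> 1-based position, and
    selected is the set of the top_n tickers. Ties are broken by ticker name.
    """
    eligible = {t: v for t, v in indicator_by_ticker.items() if t in members}
    ranked = sorted(eligible, key=lambda t: (-eligible[t], t))
    rank = {t: i + 1 for i, t in enumerate(ranked)}
    return rank, set(ranked[:top_n])


# Hypothetical indicator values for one day; ZZZ is not an index member.
indicator_by_ticker = {"AAA": 1.05, "BBB": 0.98, "CCC": 1.10, "ZZZ": 1.50}
rank, selected = rank_members(indicator_by_ticker, {"AAA", "BBB", "CCC"}, top_n=2)
print(rank, selected)
```

The rank position can then serve as the system parameter: for example, buy while a ticker is in `selected` and sell once it drops out.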