This is my first post, so please forgive me if it resembles a run-on sentence... If my delivery seems familiar, you might know me by my blog, MKTSTK.com. If so, you know I generally DGAF about dropping some truly valuable information on the community just for fun, so without further ado...

Today I wanted to present a little strategy that's been causing some buzz amongst the prop traders and hedge fundos I've been talking to so far. In case you were wondering if Quantopian gave you an edge, this should be proof positive of their commitment to attracting top-notch datasets. Please note that although this strat uses the fetcher API, soon you should be able to use the PsychSignal datasets I've developed directly in Quantopian, in addition to the PsychSignal Trader Mood API, which is already available for free for backtesting and live trading.

The strategy is driven by a daily datafeed I created called the HIVE-MIND. The Hive, as it's called for brevity's sake, is made of two distinct components: 1) the Hive-Bot and 2) the Hive-Net.

The Hive-Bot measures the activity of a symbol with respect to the social media landscape. The Hive-Bot transforms the multidimensional social message flow into a simple scale between 0 and 1.0, called the Social Anomaly Score (SAS). At the high end, 1.0 represents a frenzy level of activity related to a symbol. In the middle, 0.5 is meant to signify a normal social pattern (i.e. what is expected given the historical profile of the symbol over time). At the other extreme, a reading near 0.0 represents a low amount of interest.

My research has shown that the Hive-Bot's SAS is predictive of volatility and correlation. Thus it makes sense to use it as a market timing mechanism. There are many possible forms for this to take within a real trading strategy. One conceivable usage is to switch between mean reverting and momentum strategies. Despite many idiosyncrasies, trading strategies often break down into the simple categories of being levered to momentum or to mean reversion. If one could differentiate, a priori, between mean reverting and momentum periods in the market one could make a fortune... but how might you construct such a strategy?

The following presents a model which combines both mean reverting and momentum based strategies. While it is not levered by default, the strategy can choose to employ leverage when it is advantageous. The strategy uses the Hive-Bot's SAS to sense when to switch between mean reverting and momentum regimes. The strategy uses two different look-back windows to gauge momentum and trades around 50 equity ETFs.
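As a rough illustration of the switching logic described above, here is a minimal sketch. It is not the actual algorithm: the function name `target_weights` and the equal weighting within each side are assumptions, and the rule "SAS above the threshold means momentum, otherwise mean reversion" is inferred from the `sas_thresh` discussion further down this thread. The 0.66 threshold and the 1.0/-0.8 long/short tilt are the values mentioned later in the thread.

```python
from typing import Dict


def target_weights(
    sas: float,
    momentum: Dict[str, float],
    sas_thresh: float = 0.66,   # threshold quoted later in the thread
    long_w: float = 1.0,        # total long exposure (from the thread)
    short_w: float = -0.8,      # total short exposure (from the thread)
) -> Dict[str, float]:
    """Hypothetical regime switch: follow momentum when SAS is high, fade it otherwise."""
    # Assumption: SAS above the threshold selects the momentum regime.
    direction = 1.0 if sas > sas_thresh else -1.0

    # In the mean reverting regime the sign of every momentum score is flipped.
    signals = {sym: direction * score for sym, score in momentum.items()}
    longs = [sym for sym, s in signals.items() if s > 0]
    shorts = [sym for sym, s in signals.items() if s < 0]

    # Assumption: equal weights within the long side and within the short side.
    weights: Dict[str, float] = {}
    for sym in longs:
        weights[sym] = long_w / len(longs)
    for sym in shorts:
        weights[sym] = short_w / len(shorts)
    return weights


# Made-up momentum scores, for illustration only.
scores = {"SPY": 0.05, "QQQ": -0.02}
momentum_book = target_weights(0.90, scores)   # high SAS: trade with the trend
reversion_book = target_weights(0.30, scores)  # low SAS: trade against it
```

Passing `sas_thresh=-1` makes the SAS test always true, so the sketch degenerates to a pure momentum book, which matches the experiment suggested later in the thread.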

Since I'm new to Q, any feedback would be much appreciated. Happy Trading

Why don't you use the SAS for each symbol rather than just one for all for SPY? It would be helpful if the usefulness of the signal could be validated in isolation, outside of a backtest with a bunch of free parameters, perhaps using the "alpha lens" framework they just announced for Research.

Good question. My research into the Hive-Bot has shown that SPY's SAS exerts a powerful influence on the dependency structure of the market. Thus, there was good reason to think that SPY's SAS could be used as a global indicator / signal with value across a number of trading instruments. Moreover, I found that adding SAS to absolute momentum strategies improved their risk/return profiles. Thus, I viewed this strategy as an extension of the above lines of research. You can get a deeper look at some of the published results here.

That being said, no doubt this is just the tip of the iceberg. My plan is to post many, many more tests as we go forward. My hope is that the community will start to explore the possibilities in the Hive dataset as well. The idea here was to provide one of many possible examples of how to use the Hive-Mind in a trading algorithm. I think things will get truly interesting once we get the Hive-Net integrated as well and can start connecting the dots... using the network graphs directly in trading strategies on Quantopian.

Hello Tynan,
Is there any chance this algo is leaking SAS data from the future? You say in the text that your SAS is a daily datafeed, so I assume that the daily value for SAS would be calculated based on the social media from that day, rather than from the previous day. In your algo, you use data.current('spy_sas','SAS') pulled from the historical SAS records, suggesting that you're getting the current day's SAS value (day[0]) at 10am and trading on that knowledge, when what you would really have available at 10am would be the SAS value for day[-1]. This might explain some of the exceptional performance seen here... when SPY is going up well, the social media reflects this intraday, and hence a purchase around 10am based on the full day's social insights might be quite a good idea.
On a side note, I find that this algo performs considerably better by reducing the universe of ETFs substantially, to a handful or so. Also, of course, these returns are remarkably fictitious for any retail trader given the very large number of trades made. It's quite a fun algo to play with though, thanks for publishing.

no, the production hive is updated intraday, so this daily version is generated well in advance of the open to avoid exactly that bias

re: transactions

i'm using the default impact/commish models, which i think default to $0.0075 a share? that is more than double IB's US equity commissions, so I think if anything it's pretty conservative in that regard, although with some higher rate brokers daily rebalancing could probably get expensive. could always trade a super liquid subset if commish/liquidity is an issue. or just trade the e-mini using spy's signal (it will be very cool when futures are up and running on quantopian!)

Thanks Tynan, I am new here and missed the important default commission and slippage calculations.
But I'm still confused about historical SAS data. Correct me if I'm wrong, because this seems awkward.
When I get a price (say the "close") for a ticker for a given date, that will be the close that was recorded for that specific date. But if what you say is correct about the hive data, this historical SAS value you're pulling gives you the SAS value (i.e., the social media indicator value) for the previous day's social media, not the value calculated for the date specified. This seems to make your data arrangement different than most -- i.e., to get the SAS social media value for SPY as recorded for 2015-11-23, I would have to use the value for 2015-11-24!

Have a look at the attached notebook, specifically the trader mood data set columns. You'll notice there is an "asof_date: The date to which this data applies." and a second "timestamp" which is the date the data is available to trade. Quantopian adds this "timestamp" column to every data set in order to prevent future snooping and standardize non-standard data sets.

With the Hive data, Tynan is following the exact same convention Quantopian uses to timestamp the underlying Trader Mood data. The data is available after 4am UTC for trading the same day.
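To make the asof_date / timestamp distinction concrete, here is a small sketch using only the standard library. The record layout, the helper name `latest_known`, and the SAS values are made up for illustration; only the convention (a value applies to `asof_date` but becomes tradable at `timestamp`, after 4am UTC) comes from the posts above. Times are naive datetimes treated as UTC.

```python
from datetime import datetime
from typing import List, Optional, Tuple

# (asof_date, timestamp, sas): the value describes asof_date,
# but a backtest may only see it once timestamp has passed.
Record = Tuple[datetime, datetime, float]

# Made-up values; each day's reading is published at 04:00 UTC the next day.
records: List[Record] = [
    (datetime(2015, 11, 23), datetime(2015, 11, 24, 4, 0), 0.71),
    (datetime(2015, 11, 24), datetime(2015, 11, 25, 4, 0), 0.42),
]


def latest_known(rows: List[Record], now: datetime) -> Optional[Record]:
    """Return the most recently published record visible at `now`, if any."""
    visible = [row for row in rows if row[1] <= now]
    if not visible:
        return None
    return max(visible, key=lambda row: row[1])


# 10am New York time on 2015-11-24 is 15:00 UTC (EST is UTC-5 in November).
seen_at_10am = latest_known(records, datetime(2015, 11, 24, 15, 0))
```

In this sketch the 10am lookup on 2015-11-24 returns the record whose asof_date is 2015-11-23, so the value traded on is never computed from the same day's social media activity.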

Until Quantopian starts pulling HIVE data into the data store we will continue to make the data available via the fetcher API. Feel free to use it as much as you like. Let me check with Seong for an estimate as to when the data will make it into the store.

Hi James, thanks for that. The fetcher data in the example is only up until last week so I assume there will be no current data until it gets integrated? I assume it will be a premium data source?

I've quickly integrated the sample data above into an algorithm I've been getting ready to live trade and it's had positive results both in terms of return and DD so I'm keen to have more of a play around with it. But obviously it's ready when it's ready :)

One thing I'm not too sure about. So this data indicates the level of social media activity, not the mood, correct? I've had a brief play with the trader mood data and couldn't get consistent results from it, but this activity data seemed to have good results without much effort. That seems counterintuitive; wouldn't it be possible that a high SAS score could equally indicate a high level of negative social discussion and therefore signal a downturn? That doesn't seem to be the case in backtesting. I imagine there's more info in your whitepaper so I'll have a read on the weekend.

Hi Brian C.
I'm guessing, but for this particular universe and weights, the choice of 40 day SAS lookback and SAS threshold of .66 seem fairly optimized, not based on any particular rationale. Clone it yourself and try several nearby variants and I think you'll find it so (I did). My own playing with this algorithm seems to work best with a smaller universe of ETFs (e.g., try removing all but SPY and TQQQ for an extreme example with double the returns, but a concurrently high MDD and volatility). Personally, I'm bending this algorithm in other ways, like removing the short side (long only) and keeping it unleveraged. I'm more partial to finding parameters that are both not so optimized (ie, more of a plateau of decent returns in a 2D space of lookback and SAS threshold) and also display a better equity curve over the past couple of flat years. What I'd like best to find is a decent standalone momentum strategy where the SAS data improves it, rather than what is shown here as an example, which is a reasonably poor strategy without the SAS data (try setting sas_thresh to -1 and see the momentum strategy fail on its own). In some cases, I've found what appear to be decent momentum strategies (with sas_thresh set to -1) worsened by addition of a positive sas_thresh. These are intriguing data I'm continuing to explore.

With regards to the sas threshold, since the sas is bounded by 0 and 1, initially I looked at a simple partitioning: 0 to 1/3 was defined as “low”, 1/3 to 2/3 as “medium” and anything above 2/3 as “high”. This rough classification system proved useful, so the 0.66 follows from that, and thus I doubt it's optimized globally, although it seems like a good enough heuristic to start with. An extension of this strategy would be to make the sas threshold adaptive, maybe using some kind of ema or something more elaborate to vary the threshold in a sensible manner. It's also conceivable that the appropriate thresholds could vary based on the particular strategic use case, e.g. have diff thresholds for momentum and mean reversion.
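The partition above, and one possible reading of the "adaptive threshold" idea, can be sketched as follows. The function names, the handling of the exact boundary values, and the `margin` parameter are all assumptions made for illustration; the 40 day default only echoes the lookback mentioned earlier in the thread and is not a recommendation.

```python
from typing import Sequence


def classify_sas(sas: float) -> str:
    """Bucket a SAS reading into thirds of the [0, 1] range."""
    if not 0.0 <= sas <= 1.0:
        raise ValueError("SAS is expected to lie in [0, 1]")
    if sas < 1.0 / 3.0:
        return "low"
    if sas <= 2.0 / 3.0:  # assumption: the 2/3 boundary counts as "medium"
        return "medium"
    return "high"


def ema(values: Sequence[float], span: int) -> float:
    """Exponential moving average with alpha = 2 / (span + 1), seeded with the first value."""
    if not values:
        raise ValueError("need at least one value")
    alpha = 2.0 / (span + 1.0)
    smoothed = values[0]
    for value in values[1:]:
        smoothed = alpha * value + (1.0 - alpha) * smoothed
    return smoothed


def adaptive_threshold(sas_history: Sequence[float], span: int = 40, margin: float = 0.1) -> float:
    """Hypothetical rule: call SAS 'high' when it exceeds its own EMA by `margin`."""
    return min(1.0, ema(sas_history, span) + margin)
```

With a fixed 2/3 cut, a symbol whose SAS sits persistently high would be in the "high" bucket most of the time; a threshold tied to the recent average, as in `adaptive_threshold`, is one hedged way to flag only readings that are unusual relative to the recent past.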

Also, you could def run this strat with equal long/short weightings, but I know in practice it can be advantageous to tilt these strats slightly long because of the persistent bias towards the stock market, which has imposed a sort of penalty to shorts, hence the slight bias in weights 1.0/-0.8.

Firstly, an update on data: we are currently onboarding the SAS dataset with Q and hope to have it live mid-September.

In the meantime, after a slight learning curve (is it possible to get fetch_csv to work in a notebook?), I wanted to post a notebook that dives a little bit deeper into the SAS data. You see, sometimes when people hear that there is an intimate link between SAS and volatility, inquiring minds want to know if they can get the same value as the SAS by replicating it with the VIX (or the VIX futures curve). The thing about the VIX is that it is biased towards downside volatility. Nobody really wants/needs to hedge upside volatility (those shocks are GOOD) so using the VIX only captures part of the picture. If prices are ripping, realized volatility could be very high while the VIX is actually falling. Thus, this notebook takes a look at the contemporaneous correlation between the SAS (smoothed and unsmoothed) and the VIX futures curve.
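For readers without the notebook open, the comparison it describes boils down to a same-day correlation between two series, computed once on the raw SAS and once on a smoothed version. The sketch below uses only the standard library; the helper names, the trailing moving average as the smoother, and the numbers are all made up for illustration and are not taken from the notebook.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Trailing simple moving average; the output has len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]


# Made-up daily readings, aligned by date.
sas = [0.40, 0.55, 0.70, 0.65, 0.50, 0.80, 0.45]
vix_front = [14.0, 15.5, 18.0, 17.0, 15.0, 21.0, 14.5]

window = 3
raw_corr = pearson(sas, vix_front)
# The smoothed series starts at day window-1, so trim the other series to match.
smooth_corr = pearson(moving_average(sas, window), vix_front[window - 1:])
```

The same two calls can be repeated for each point on the futures curve to see where, if anywhere, the relationship is strongest.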

Even though the code is the same, you did run a different backtest. Your algo trades with a much smaller capital base, $10k vs my algo at $1MM

My guess is that this illustrates the effect of commissions on different sized portfolios. This strategy uses the default Q commission model, so there is a $1 minimum cost (or $0.0075 per share) associated with each trade.

This algo rebalances daily across roughly 50 ETFs, so at the $1 minimum per trade there's a potential $50 a day fixed commission cost. For a $1MM portfolio, $50 represents a 0.005% drag on returns. For a $10k portfolio, $50 is 0.5%!
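The arithmetic is easy to check. The sketch below assumes the commission figures quoted in this thread ($0.0075 per share with a $1 minimum per trade) and about 50 trades per day; the function names are made up for illustration.

```python
def trade_commission(shares: float, per_share: float = 0.0075, minimum: float = 1.0) -> float:
    """Commission for one trade: per-share rate with a floor, as quoted in the thread."""
    return max(minimum, shares * per_share)


def daily_drag_pct(portfolio_value: float, n_trades: int = 50, cost_per_trade: float = 1.0) -> float:
    """Daily commission cost as a percentage of portfolio value."""
    return 100.0 * n_trades * cost_per_trade / portfolio_value


# Small orders (under about 133 shares) hit the $1 floor.
small_order = trade_commission(100)       # 0.75 computed, floored to 1.0

drag_large = daily_drag_pct(1_000_000)    # $1MM portfolio: 0.005 (percent)
drag_small = daily_drag_pct(10_000)       # $10k portfolio: 0.5 (percent)
```

The fixed cost is the same $50 in both cases, so the drag on the $10k account is 100 times larger, and with daily rebalancing it compounds every trading day.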

hope you're still following this thread. I was also pretty interested in the predictive power of social media data. So I've done some research. As you linked earlier, there are 4 PsychSignal datasets on Q right now.

aggregated_twitter_withretweets_stocktwits
stocktwits (same as stocktwits_free, which is used in the tutorial notebook)
twitter_noretweets
twitter_withretweets

To figure out the differences, I put them all into alphalens. There are quite a few open questions about how to perform the alphalens test (Which time frame? Which variables from the datasets? Which values from the tearsheet to compare?)

My conclusions from the attached Notebook:
- I think bull_minus_bear is the field to choose (multiplied by -1, because the correlation is negative).
- The predictive power is strongest for short term forecasts (like one day), which kind of makes sense.
- I like the stocktwits data most. It's mostly green and it has a high IR. And at least for the chosen period its returns aren't negative.
- To not rely only on stocktwits data, I think tweets without retweets are also valuable (also kind of a good correlation).

One risk that you can't see in the notebook: if you shift the timeframe a bit, the results are definitely worse.

Tynan, you said they want to include the SAS Dataset mid-september, any updates on this?

hive.psychsocial.com subdomain is no longer live, and their website seems a bit moribund, with no blog posts since Sept 2016 which was when this thread was started. i have no insider knowledge, but no tweets from the company since June, no clear signs of activity. their CEO has a linkedin presence, but only thing of interest seems to be a move toward using this sort of data toward trading crypto (see decryptz.com).

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian.

In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.
