Monday, 14 August 2017

Thursday, 22 June 2017

Some more trading rules

It is a common misconception that the most important thing to have when you're trading, or investing, systematically is good trading rules. In fact it is much, much, much more important to have a good position management framework (as discussed in my first book) and to trade a diversified set of instruments. Combine those with a couple of simple trading rules, and you'll have a pretty decent system. Adding additional rules will improve your expected return, but with rapidly diminishing returns.

It's for this reason that only 2 out of the 75 posts I've published on this blog have been about trading rules (this on trend following and carry; and this one on my 'breakout' system). But ... if I look at my inbox or blog comments or my thread on the most common request is for me to "write about X"... where X is some trading rule I may have casually mentioned in passing that I use, but haven't written about it.

So I have mixed feelings writing this post (in which the metaphorical kimono will be completely opened- there are no more secret trading rules hiding inside my system). I'm hoping that this will satisfy the clamour for information about other trading rules that I run.  Of course it's also worth adding these rules to my open source python project pysystemtrade, since I hope that will eventually replace the legacy system I use for my own trading, and I won't want to do that unless I have a complete set of trading rules that matches what I currently use.

But I'd like to (re-)emphasise that there is much, much, much more to successful systems trading than throwing every possible trading rule into your back test and hoping for the best. Adding trading rules should be your last resort once you have a decent framework, and have done as much instrument diversification as your capital can cope with.

Pre-requisites: Although there is some messy pysystemtrade python code for this post here you don't need to use it. It will however be helpful to have a good understanding of my existing trading rules: Carry and EWMAC (Exponentially weighted Moving Average Crossover) which you can glean from my first book or this post - most of the rules I discuss here are built upon those two basic ideas.

PS You'll probably notice that I won't talk in detail about how you'd develop a new trading rule; but don't panic, that's the subject of this post.

Short volatility

I'm often asked "What do you think your trading edge is?" A tiresome question (don't ask it again if you want to stay in my good books). If I have any 'edge' it's that I've learned, the hard way, the importance of correct position sizing and sticking to your trading system. My edge certainly doesn't lie in creating novel trading rules. 

Instead the rules I use all capitalise on well known risk factors: momentum and carry for example. You'll sometimes see these called return factors but you don't get return without risk. Of course we all have different risk tolerances, but if you are happy to hold positions that the average investor finds uncomfortably risky, then you'll earn a risk premium (at least it will look like a premium if you use standard measures of risk when doing your analysis). A comprehensive overview of the world of return factors can be found in this excellent book or on this website.

One well known risk factor is the volatility premium. Simply put investors are terrified of the market falling, and bid up the price of options. This means that implied volatility (effectively the price of volatility implied by option prices) will on average be higher than expected realised volatility.

How can a systematic futures trader earn the volatility premium? You could of course build a full blown options trading strategy, like my ex AHL colleague. But this is a huge amount of work. A much simpler way is to just sell volatility futures (the US VIX, and European V2TX); in my framework that equates to using a constant forecast of -10, or what I call in my book the "no rule" trading rule (note because of position scaling we'll still have smaller positions when the volatility of volatility was higher, and vice versa).
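
A minimal sketch of why a constant forecast still gives positions that vary over time. The sizing function, the capital, the 20% volatility target, the divide-by-16 annualisation and the volatility figures are all illustrative assumptions, not values taken from this post:

```python
import pandas as pd

def position_from_forecast(forecast, instrument_vol, capital=100_000.0,
                           annual_vol_target=0.20, average_abs_forecast=10.0):
    """Hypothetical sizing rule: scale a forecast by a volatility scalar so
    that the risk taken stays roughly constant as instrument vol changes."""
    # Assumption: 256 trading days a year, so daily vol = annual vol / 16
    daily_cash_vol_target = capital * annual_vol_target / 16.0
    vol_scalar = daily_cash_vol_target / instrument_vol
    return vol_scalar * forecast / average_abs_forecast

# Made-up daily cash volatility of one vol futures contract in three regimes
instrument_vol = pd.Series([200.0, 400.0, 800.0],
                           index=["calm", "normal", "stressed"])

short_vol_forecast = -10.0  # the constant "no rule" forecast
positions = position_from_forecast(short_vol_forecast, instrument_vol)
print(positions)
```

The position is always short, but it halves each time the volatility of the contract doubles.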

And here is a nice picture showing a backtest of this rule:

"With hindsight Rob realised that starting his short vol strategy in late 2007 may not have been ideal timing...."

Earning this particular premium isn't for the faint hearted. You will usually earn a consistent return with occasional, horrific, drawdowns. This is what I call a negative skew / insurance selling strategy. Indeed based on monthly returns the skew of the above is a horror show -0.664. This isn't as bad as the underlying price series, because vol scaling helps improve skew, but still pretty ugly (on S&P500 using the same strategy it's a much nicer 0.36).

It is a good complement to the positive skew trend following rules that form the core of my system (carry is broadly skew neutral, depending on the asset class). For various reasons I don't recommend using the first contract when trading vol futures (in my data the back adjusted price is based on holding the second contract). One of these good reasons is that the skew is really, really bad on the first contract.

But... we already have trend following and carry in vol? Do we need a short bias as well?

I already include the VIX, and V2TX, in my trend following and carry strategies. That means to an extent I am already earning a volatility premium. 

How come? Well imagine you're holding the first VIX contract, due to expire in a month's time. The price of that (implied vol) will be higher than the current level of the VIX (which I'll call, inaccurately, spot vol), reflecting the desire of investors to pay up for protection against volatility in the next month. As the contract ages the price will drift down to spot levels, assuming nothing changes; a rolldown effect on futures prices. That's exactly what the carry strategy is designed to capture.

This isn't exactly the same as the implied versus spot vol premium; but it's very closely related.

Now consider trend following. Assuming you use back adjusted futures prices then in an environment when spot vol doesn't move, but in which there is negative rolldown for the reasons described above, then the back adjusted price will drift downwards. This will create a trend in which the trend following strategy will want to participate.

Arguably trend following and carry are actually better than being short vol, since they are reactive to changing conditions. In 2008 a short vol strategy would have remained stubbornly short in the face of rapidly rising vol levels. But trend following would have ended up going long vol (eventually, depending on the speed of the rule variation). Also in a crisis the vol curve tends to invert (further out vol becoming cheaper than nearer vol) - in this situation a carry strategy would buy vol.

The vol curve tends to invert in a crisis

So.... what happens if I throw carry and trend following back into the mix? Using the default optimisation method in pysystemtrade (Bayesian shrinkage) the short biased signal gets roughly a 10% weight (sticking to just VIX and V2X). That equates to an improvement in Sharpe Ratio on the overall account curve of the two vol futures of just 0.03, a difference that isn't statistically significant. And the skew gets absolutely horrific.

So... is this worth doing? I'll discuss this general issue at the end of the post. But on the face of it using trend following and carry on vol futures might be a better way of capturing the vol premium than just a fixed short bias. Using all three of course could be even better.

An aside: What about other asset classes?

An excellent question is why we don't incorporate a bias to other asset classes that are known to earn a risk premium; for example long equities (earning the equity risk premium) or long bonds (earning the term premium)?*

* I'm not convinced that there is a risk premium in Commodities; at best these might act as an inflation hedge but without a positive expected return. It's not obvious what premium you'd earn in FX, or which way round you should be to earn it.

This might make sense if all your capital was in systematic futures trading (which I don't recommend - it's extremely difficult to earn a regular income purely from trading). But I, like most people, own a chunk of shares and ETFs which nicely cover the equity and bond universe (and which pay relatively steady dividends which I'm happy to earn an income from). I don't really need any more exposure to these traditional asset classes.

And of course the short vol strategy has a relationship with equity prices; crashes in equities normally happen alongside spikes in the VIX / V2X (I deliberately say relationship here rather than correlation, since the relationship is highly non linear). Having both long equity and short vol in the same portfolio is effectively loading up massively on short black swan exposure.

Relative carry

The next rule I want to consider is also relatively simple - it's a relative version of the carry rule that I describe in my book and which is already implemented in pysystemtrade. As the authors of this seminal paper put it:

"For each global asset class, we construct a carry strategy that invests in high-carry securities while short selling low-carry instruments, where each instrument is weighted by the rank of its carry"

Remember that for carry the original forecast is quite noisy; to avoid that we need to smooth it. In my own system I use a fixed smooth of 90 business days (as many futures roll quarterly) for both absolute and relative carry.

Mathematically the relative carry measure for some instrument x will be:

Rx_t = Cx_t - median(Ca_t, Cb_t, ...) 

Where Ca_t is the smoothed carry forecast for some instrument a, Cb_t for instrument b and so on; where a,b, c....x are all in the same asset class. 

Note - some people will apply a further normalisation here to reflect periods when the carry values are tightly clustered within an asset class, or when they are further apart - the normalisation will ensure a consistent expected cross sectional standard deviation for the forecast. However this is leveraging up on weak information - not usually a good idea.
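
A rough sketch of the relative carry calculation in pandas. It assumes the 90 business day smooth is an exponentially weighted average (the post doesn't say which kind of smooth is used), and the instrument names and carry numbers are made up:

```python
import numpy as np
import pandas as pd

def relative_carry(raw_carry: pd.DataFrame, smooth_days: int = 90) -> pd.DataFrame:
    """raw_carry has one column per instrument, all from one asset class.
    Returns Rx_t = Cx_t - median(Ca_t, Cb_t, ...) for every instrument."""
    # Assumption: the smooth is an EWMA with a span of `smooth_days`
    smoothed = raw_carry.ewm(span=smooth_days, min_periods=1).mean()
    asset_class_median = smoothed.median(axis=1)
    # Subtract the cross sectional median from each instrument, row by row
    return smoothed.sub(asset_class_median, axis=0)

# Made-up noisy carry forecasts for three hypothetical bond futures
rng = np.random.default_rng(0)
dates = pd.bdate_range("2017-01-02", periods=300)
raw = pd.DataFrame(rng.normal(0.0, 5.0, size=(300, 3)), index=dates,
                   columns=["bond_a", "bond_b", "bond_c"])

rel = relative_carry(raw)
print(rel.tail())
```

With an odd number of instruments the one sitting at the median always gets a relative carry forecast of exactly zero.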

This rule isn't super brilliant by itself. Here it is, tested using the full set of futures in my dataset:

It clearly underperforms its cousin, absolute carry. More interestingly though the predictors look to be doing relatively different things (correlation is much lower than you might expect at around 0.6), and the optimisation actually gives the relative carry predictor around 40% of the weight when I just run a backtest with only these two predictors.

Lobbing together a backtest with both relative and absolute carry the Sharpe ratio is improved from 0.508 to 0.524 (monthly returns, annualised). Again hardly an earth shattering improvement, but it all helps.

Normalised momentum

Now for something completely different. Most trading rules rely on the idea of filtering the price series to capture certain features (the other school of thought within the technical analysis campus is that one should look for patterns, which I'm less enthusiastic about). For example an EWMAC trend following rule is a filter which tries to see trends in data. Filtering is required because price series are noisy, and a lot of that noise just contributes to potentially higher trading costs rather than giving us new information. 

But there is another approach - we could normalise the price series to make it less noisy, and then apply a filter to the resulting data. The normalised series is cleaner, and so the filters have less work to do.

The normalisation I use is the cumulative normalised return. So given a price series P_0, P_1 ... P_T the normalised return is:

R_t = (P_t - P_[t-1] ) / sigma (P_0.... P_t)

Where sigma is a standard deviation calculation. Also to avoid really low vol or bad prices screwing things up I apply a cap of 6.0 in absolute values on R_t. Then the normalised price on any given day t will be:

N_t = R_1 + R_2 + R_3 + .... R_t
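
A minimal sketch of the normalisation, under two assumptions that the formula above leaves open: sigma is taken to be the standard deviation of daily price changes, and it is estimated with an exponentially weighted window (the span of 35 days is purely illustrative). The price series is simulated:

```python
import numpy as np
import pandas as pd

def normalised_returns(price: pd.Series, vol_span: int = 35,
                       cap: float = 6.0) -> pd.Series:
    """R_t: daily price change divided by its volatility, capped at +/- cap."""
    price_change = price.diff()
    # Assumption: sigma is an exponentially weighted std of price changes
    vol = price_change.ewm(span=vol_span, min_periods=10).std()
    return (price_change / vol).clip(lower=-cap, upper=cap)

def normalised_price(price: pd.Series, vol_span: int = 35,
                     cap: float = 6.0) -> pd.Series:
    """N_t: cumulative sum of the normalised returns."""
    returns = normalised_returns(price, vol_span=vol_span, cap=cap)
    return returns.fillna(0.0).cumsum()

# Simulated random walk standing in for a back adjusted futures price
rng = np.random.default_rng(0)
dates = pd.bdate_range("2015-01-01", periods=500)
price = pd.Series(100.0 + np.cumsum(rng.normal(0.0, 1.0, 500)), index=dates)

R = normalised_returns(price)
N = normalised_price(price)
print(N.tail())
```

Days before the volatility estimate is available contribute nothing to the normalised price.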

NOTE: For scholars of financial history I've personally never seen this trading rule used elsewhere - it's something I dreamed up myself about three years ago. However it comes under the "too simple not to have been already thought of" category so I expect to see comments pointing out that this was invented by some guy, or gal, in 1952. If nobody does then I will not feel too embarrassed to call this "Carver's Normalised Momentum".

Perceptive readers will note:

  • You probably shouldn't use normalised prices to identify levels since the level of the price is stripped out by the normalisation.
  • These price series will not show exponential growth; the returns will be roughly normal rather than log normal. This is a good thing since over long horizons using prices that show exponential growth tends to screw up most filters since they don't know about exponential growth. Over relatively short horizons however it makes no difference.
  • Simple returns calculated using the change in normalised price can be directly compared and aggregated across different instruments, asset classes and time periods; something that you can't do with ordinary prices. We'll use this fact later.

Rather boringly I am now going to apply my favourite EWMAC filter to these normalised price series, although frankly you could apply pretty much anything you like to them. 

Minor point: The volatility normalisation stage of an EWMAC calculation [remember it's (ewma_fast - ewma_slow) / volatility] isn't strictly necessary when applied to normalised price series, which will have a constant expected volatility, but it's more hassle to take it out so I leave it in here.
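
A sketch of an EWMAC filter that can be applied to a normalised price series in the same way as to an ordinary price. The spans of 16 and 64 days and the 35 day volatility span are illustrative choices, and forecast scaling and capping are left out. The input here is a simulated upward drifting series standing in for a normalised price:

```python
import numpy as np
import pandas as pd

def ewmac(price: pd.Series, fast_span: int = 16, slow_span: int = 64,
          vol_span: int = 35) -> pd.Series:
    """(ewma_fast - ewma_slow) / volatility of daily price changes."""
    fast = price.ewm(span=fast_span, min_periods=1).mean()
    slow = price.ewm(span=slow_span, min_periods=1).mean()
    vol = price.diff().ewm(span=vol_span, min_periods=10).std()
    return (fast - slow) / vol

# Simulated series with a steady uptrend plus a little noise
rng = np.random.default_rng(0)
dates = pd.bdate_range("2016-01-01", periods=300)
trend = 0.1 * np.arange(300) + rng.normal(0.0, 0.05, 300)
norm_price = pd.Series(trend, index=dates)

forecast = ewmac(norm_price)
print(forecast.tail())
```

In an uptrend the fast average sits above the slow one, so the raw forecast is positive.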

Normalised momentum

Performance wise there isn't much to choose between normalised and the use of standard EWMAC on the actual price; but these things aren't perfectly correlated, and that can only be a good thing.

Aggregate momentum

It's generally accepted that momentum doesn't work that well on individual stocks. It does however sort of work on industries. And it is relatively better again when applied to country level equity indices. 

I have an explanation for this. The price of an individual equity is going to be related to the global equity risk premium, plus country specific, industry specific, and idiosyncratic firm specific factors. The global equity risk premium seems to show pretty decent trends. The other factors less so; and indeed by the time you are down to within industries mean reversion tends to dominate (though you might call it the value factor, which if per share fundamentals are unchanged amounts to the same thing).

Value type strategies then tend to work best when we're comparing similar assets, like equities in the same country and industry; also because accounting ratios are more comparable across two Japanese banks, than across a Japanese bank and a Belgian chocolate manufacturer. There is a more complete expounding of this idea in my new book, to be released later this year.

So trading equity index futures then means we're trying to pick up the momentum in global equity prices through a noisy measurement (the price of the equity index) with a dollop of mean reverting factor added on top.

If you follow this argument to its logical conclusion then the best places to see momentum will be at the global asset class level*. There we will have the best measure of the underlying risk factor, without any pesky mean reversion effects getting in the way.

* A future research project is to go even further. I could for example create super asset classes, like "all risky assets" [equities, vol, IMM FX which are all short USD in the numeraire, commodities...?] and "all safe assets" [bonds, precious metals, STIR, ...]. I could even try and create a single asset class using some kind of PCA analysis to identify the single most important global factor. 

How do we measure momentum at the asset class level? This is by no means a novel idea (see here) so there are plenty of suggestions out there. We could use benchmarks like MSCI world for equities, but that would involve dipping into another data source (and having to adjust because futures returns are excess returns, whilst MSCI world is a total return); and it's not obvious what we'd use for certain other asset classes. Instead I'm going to leverage off the idea of normalised prices and normalised returns which I introduced above.

The normalised return for an asset class at time t will be:

RA_t = median(Ra_t, Rb_t, Rc_t, ...)

Where Ra_t, Rb_t are the normalised returns for the individual instruments within that asset class (eg for equities that might include SP500 futures, EUROSTOXX and so on). You could take a weighted average, using market cap, or your own risk allocations to each instrument, but I'm not going to bother and just use a simple unweighted median, as in the formula above.

Then the normalised price for an asset class is just:

NA_t = RA_1 + RA_2 + RA_3 + .... RA_t

Next step is to apply a trend following filter to the normalised price... yes why not use EWMAC? 

Minor point of order - it's definitely worth keeping the volatility normalisation part of EWMAC here because the volatility of NA is not constant even when the volatility of each Na, Nb... is - if equities become less correlated then the volatility of NA will fall, and vice versa; as more assets are added to the data basket and diversification increases again the volatility of NA will fall. Indeed NA should have an expected volatility that is lower than the expected volatility of any of Na, Nb...

Having done that we have a forecast that will be the same for all instruments in a particular asset class. 
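
A sketch of the aggregation step: take the cross sectional median of the instruments' normalised returns, then cumulate. The three instruments and their returns are simulated, with a shared factor plus instrument specific noise so that each has roughly unit volatility:

```python
import numpy as np
import pandas as pd

def asset_class_normalised_price(instrument_norm_returns: pd.DataFrame) -> pd.Series:
    """NA_t: cumulated median of the instruments' normalised returns."""
    asset_class_return = instrument_norm_returns.median(axis=1)  # RA_t
    return asset_class_return.cumsum()                           # NA_t

# Simulated normalised returns: common factor plus idiosyncratic noise
rng = np.random.default_rng(0)
dates = pd.bdate_range("2014-01-01", periods=1000)
common = rng.normal(0.0, 0.5, size=(1000, 1))
idiosyncratic = rng.normal(0.0, np.sqrt(0.75), size=(1000, 3))
norm_returns = pd.DataFrame(common + idiosyncratic, index=dates,
                            columns=["equity_a", "equity_b", "equity_c"])

na_price = asset_class_normalised_price(norm_returns)
na_returns = na_price.diff().dropna()

print("instrument vols:", norm_returns.std().round(2).to_dict())
print("asset class vol:", round(na_returns.std(), 2))
```

The asset class series comes out less volatile than any single instrument, which is why the volatility normalisation inside EWMAC is worth keeping at this level.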

If I compare this to standard, and normalised, momentum:

... again performance wise not much to see here, but there is clearly diversification despite all three rules using EWMAC with identical speeds!

Cross sectional within assets

So we can improve our measure of momentum using aggregated returns across an asset class. This works because the price of an instrument within an asset class is affected by the global asset class underlying latent momentum, plus a factor that is mostly mean reverting. Won't it also make sense then to trade that mean reversion? In concrete terms if for example the NASDAQ has been outperforming the DAX, shouldn't we bet on that no longer happening?

Mathematically then, if NA_t is the normalised price for an asset class, and Nx_t is the normalised price for some instrument within that asset class, then the amount of outperformance (or if you prefer, Disequilibrium) over a given time horizon (tau, t) is:

Dx_t = [Nx_t - Nx_tau] - [NA_t - NA_tau]

Be careful of making t-tau too large as remember the slightly different properties of Nx and NA; the former has constant expected vol whilst the latter will, by construction, have lower and time varying vol. But also be careful of making it too small - you need sufficient time to estimate an equilibrium. A value of around 6 months probably makes sense.

And my personal favourite measure of mean reversion is a smooth of this out-performance:

- EWMA(Dx_t, span)

Where EWMA is the usual exponentially weighted moving average; this basically ensures we don't trade too much whilst betting on the mean reversion. The minus sign is there to show mean reversion is expected to occur (I prefer this explicit reminder, rather than reversing the stuff inside Dx).

Using my usual heuristic, finger in the air, combined with some fake data I concluded that a good value to use for the EWMA span was one quarter of the horizon length, t - tau.
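
The rule above can be sketched as follows. The horizon of 125 business days is an assumed stand-in for "around 6 months", forecast scaling and capping are omitted, and the two input series are artificial:

```python
import numpy as np
import pandas as pd

def cross_sectional_mean_reversion(nx: pd.Series, na: pd.Series,
                                   horizon: int = 125) -> pd.Series:
    """-EWMA(Dx_t, span), with Dx_t the outperformance of instrument x
    relative to its asset class over the last `horizon` days."""
    outperformance = (nx - nx.shift(horizon)) - (na - na.shift(horizon))  # Dx_t
    span = max(horizon // 4, 1)  # one quarter of the horizon length
    return -outperformance.ewm(span=span, min_periods=1).mean()

# Artificial example: the instrument grinds higher while its asset class is flat
dates = pd.bdate_range("2015-01-01", periods=400)
nx = pd.Series(0.1 * np.arange(400), index=dates)  # instrument normalised price
na = pd.Series(0.0, index=dates)                   # asset class normalised price

forecast = cross_sectional_mean_reversion(nx, na)
print(forecast.tail())
```

Because the instrument has outperformed its asset class, the forecast is negative: a bet that the outperformance will reverse.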

Here is an example for US 10 year bond futures. First of all the normalised prices:

Blue is US 10 year normalised price. Orange is the normalised price for all bond futures.

Let's plot the difference:

US 10 year bond future normalised price - Bond asset class normalised price

This is a classic mean reversion trade. For most of history there is beautiful mean reversion, and then the "taper tantrum" happens in 2013 and US bonds massively underperform. Now for the forecast:

Notice how the system first bets strongly on mean reversion occurring during the taper tantrum, but then re-estimates the equilibrium and cuts its bet. With any mean reversion system it's important to have some mechanism to stop the falling knife being caught; whether it be something simple like this, a formal test for a structural break, or a stop loss mechanism (also note that forecast capping does some work here).

What about performance? You know what - it isn't great:

Performance across all my futures markets of mean reversion rule

BUT this is a really nice rule to have, since by construction it's strongly negatively correlated with all the trend following rules we have (in case you have lost count there are now four!: original EWMAC, breakout, normalised momentum, and aggregate momentum; with just two carry rules - absolute and relative; plus the odd one out - short volatility). Rules that are negatively correlated are like buying an insurance policy - you shouldn't expect them to be profitable (because insurance companies make profits in the long run) but you'll be glad you bought them when your car is stolen.

In fact I wouldn't expect this rule to perform very well, since plenty of people have found that cross sectional momentum works sort of okay in some asset classes (read this: thank you my ex-colleagues at AHL) and this is doing the opposite (sort of). But strong negative correlation means we can afford to have a little slack in accepting a rule that isn't stellar in isolation (a negatively correlated asset with a positive expected return can be used to create a magic money machine).
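
A small worked example of why negative correlation buys some slack. It uses the standard formula for the Sharpe Ratio of a mix of two return streams with equal volatility; the Sharpe Ratios, correlations and the 80/20 risk weighting are made-up numbers, not backtest results:

```python
import math

def combined_sharpe(sr_a: float, sr_b: float, correlation: float,
                    weight_a: float = 0.5) -> float:
    """Sharpe Ratio of a two asset mix, assuming both return streams have
    the same volatility and the weights are risk weights summing to one."""
    weight_b = 1.0 - weight_a
    expected = weight_a * sr_a + weight_b * sr_b
    variance = (weight_a ** 2 + weight_b ** 2
                + 2.0 * weight_a * weight_b * correlation)
    return expected / math.sqrt(variance)

# A decent rule (SR 0.5) mixed with a weak one (SR 0.1), 80% / 20%
negatively_correlated = combined_sharpe(0.5, 0.1, correlation=-0.5, weight_a=0.8)
positively_correlated = combined_sharpe(0.5, 0.1, correlation=0.5, weight_a=0.8)

print(round(negatively_correlated, 3), round(positively_correlated, 3))
```

With these numbers the weak rule improves on the standalone 0.5 when it is negatively correlated, and drags the mix below 0.5 when it is positively correlated.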

Note: This rule is similar in spirit to the "Value" measure defined for commodity futures in this seminal paper (although the implementation in the paper isn't cross sectional). To reconcile this it's worth noting that momentum and value mostly operate on different time frequencies - in the paper the value measure is based on 5 year mean reversion [I use 6 months], whilst the authors use a 12 month measure for momentum [roughly congruent to my slowest variation].


Does adding these rules improve the performance of a basic trend following using EWMAC on price, plus carry strategy? It doesn't (I did warn you right at the start of the post!) but is it still worth doing? I use a variation of Occam's Razor when evaluating changes to my trading strategy. Does the change provide a statistically significant improvement in performance? If not is it worth the effort? (By the way I make exceptions for simplifying and instrument diversifying changes when applying these rules).

I'd expect there to be a small improvement in performance given these rules are diversifying, and given that there isn't enough evidence to suggest that these rules are better or worse than any of my existing rules, but in practice it actually comes out with slightly worse performance; although not with a statistically significant difference.

But I don't care. I have a Bayesian view that the 'true' Sharpe Ratio of the expanded set of rules is higher; even if one sample (the actual backtest) comes out slightly different, that doesn't dissuade me. I'm also a bit wary of relying on just one form of momentum rule to pick up trends in the future, even if it has been astonishingly successful in the past. I'd rather have some diversification.

Note if I had dropped any of the 'dud' rules like mean reversion, I'd be guilty of in sample implicit [over]fitting. Instead I choose to keep them in the backtest, and let the optimisation downweight them in as much as there was statistically significant evidence they weren't any good.

The new rules have less of a long bias to assets that have gone up consistently in the backtest period; so arguably they have more 'alpha' though I haven't formally judged that.

Although on the face of it there is no compelling case for adding all these extra rules I'm prepared to make an exception. Although I don't like making my system more complex without good reason there is complexity, and there is complexity. I would rather have (a) a relatively large number of simple rules combined in a linear way, with no fancy portfolio construction, than (b) a single rule which has an insane number of parameters and is used to determine expected returns in a full blown Markowitz optimisation.

So I'm going to be keeping all these numerous rule variations in my portfolio.

Monday, 15 May 2017

People are worried about the VIX

"Today the VIX traded below 10 briefly intraday. A pretty rare occurrence. Since 1993, there have been only 18 days where it traded below 10 intraday and only 9 days where it closed below 10." (source: some random dude on my linkedin feed)

... indeed 18 observations is a long.... long... way from anything close to a statistically significant sample size. (my response to random dude)

You can't move on the internet these days for scare stories about the incredibly low level of the VIX, a measure of US implied stock market volatility. Notably the VIX closed below 10 on a couple of days last week, although it has since slightly ticked up. Levels of the VIX this low are very rare - they've only happened on 11 days since 1990 (as of the date I'm writing this).

The VIX in all its glory

The message is that we should be very worried about this. The logic is simple - "Calm before a storm". Low levels of the VIX seem to presage scary stuff happening in the near future. Really low levels, then, must mean a very bad storm indeed.

Consider for example the VIX in early 2007:

Pootling around at 10 in late 2006, early 2007, the VIX responded to the failure of two Bear Stearns hedge funds which (as we know now) marked the beginning of the credit crunch. 18 months later there was a full blown panic happening.

This happened then, therefore it will happen again.

It struck me that this story is an example of what behavioural finance type people call narrative bias; the tendency of human beings to extrapolate single events into a pattern. But we need to use some actual statistics to see if we can really extend this anecdotal evidence into a full blown forecasting rule.

There has been some sensible attempt to properly quantify how worried we should be, most notably here on the FT alphaville site, but I thought it worth doing my own little analysis on the subject. Spoiler alert for the terminally lazy: there is probably nothing to be worried about. If you're going to read the rest of the post then along the way you'll also learn a little about judging uncertainty when forecasting, the effect of current vol on future price movements, and predicting volatility generally.

(Note: Explanations for the low level of the VIX abound, and self appointed finance "experts" can be found pontificating on this subject. It's also puzzling how the VIX is so low, when apparently serious sized traders are buying options on it in bucket load sized units (this guy thinks he knows why). I won't be dealing with this conundrum here. I'm only concerned about making money. To make money we just need to judge if the level of the VIX really has any predictive power. We probably don't need to know why the VIX is low.)

Does the level of VIX predict stock prices?

If this was an educational piece I'd work up to this conclusion gradually, but as it's clickbait I'll deal with the question everyone wants to know first (fully aware that most people will then stop reading).

This graph shows the distribution of rolling 20 business day (about one month) US stock returns since 1997:

(To be precise it's the return of the S&P 500 futures contract since I happened to have that lying around; strictly speaking you'd add LIBOR to these. The S&P data goes back to 1997. I've also done this analysis with actual US stock monthly returns going back to 1990. The results are the same - I'm only using the futures here as I have daily returns which makes for nicer, more granular, plots.) 

Important point here: this is an unconditional plot. It tells us how (un)predictable one month stock returns are in the absence of any conditioning information. Now let's add some conditioning information - the level of spot VIX:

I've split history in half - times when VIX was low (below 19.44%) shown in red, and when it was high (above 19.44%), which are in blue (overlaps are in purple). Things I notice about this plot are:

  • The average return doesn't seem to be any different between the two periods of history
  • The blue distribution is wider than the red one. In other words if spot VIX is high, then returns are likely to be more volatile. Really this is just telling us that implied vol (what the VIX is measuring) is a pretty good predictor of realised vol (what actually happens). I'll talk more about predicting vol, rather than the direction of returns, later in the post.
  • Digging in a bit more it looks like there are more bad returns in the blue period (negative skew to use the jargon)

The upshot of the first bullet point is that spot VIX doesn't predict future equity returns very well. In fact the average monthly return is 0.22% when vol is low, and 0.38% when vol is high; a difference of 0.16% a month. That doesn't seem like a big difference - and it's hard to see from the plot - but can we test that properly?
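
A sketch of the conditioning step on simulated data. It assumes the returns being conditioned are measured over the 20 business days following each VIX observation, and that "splitting history in half" means splitting at the median VIX level; the price and VIX series are random numbers, so the output says nothing about real markets:

```python
import numpy as np
import pandas as pd

def forward_returns(price: pd.Series, horizon: int = 20) -> pd.Series:
    """Return over the next `horizon` business days, as a fraction of today's price."""
    return price.shift(-horizon) / price - 1.0

def split_by_vix(returns: pd.Series, vix: pd.Series):
    """Split returns into low and high VIX halves around the median VIX level."""
    data = pd.concat({"ret": returns, "vix": vix}, axis=1).dropna()
    threshold = data["vix"].median()
    low = data.loc[data["vix"] < threshold, "ret"]
    high = data.loc[data["vix"] >= threshold, "ret"]
    return low, high, threshold

# Simulated stand-ins for an equity futures price and the spot VIX
rng = np.random.default_rng(1)
dates = pd.bdate_range("2000-01-03", periods=1000)
price = pd.Series(100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, 1000))), index=dates)
vix = pd.Series(rng.uniform(9.0, 40.0, 1000), index=dates)

low, high, threshold = split_by_vix(forward_returns(price), vix)
print(f"threshold {threshold:.2f}, mean low {low.mean():.4f}, mean high {high.mean():.4f}")
```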

Yes we can. This plot shows the distribution of the differences in averages:

This was produced by monte carlo: repeatedly comparing the difference between random independent draws from the two distributions. This is better than using something like a 't-test' which assumes a certain distribution.

A negative number here means that high VIX gives a higher return than low VIX. We already know this is true, but the distribution plot shows us that this difference is actually reasonably significant. In fact 94.4% of the differences above are below zero. That isn't quite at the 95% level that many statisticians use for significance testing, but it's close.

To put it another way we can be 94.4% confident that the expected return for a low VIX (below 20%) environment will be lower than that for days when VIX is high (above 20%).
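
One way the monte carlo comparison could be done is a simple bootstrap: resample each bucket with replacement, record the difference in means, and repeat. This is a sketch of that interpretation rather than the exact procedure behind the plot, and the two samples are simulated. Note that plain resampling treats observations as independent, which overlapping 20 day returns are not, so it will tend to overstate confidence:

```python
import numpy as np

def bootstrap_mean_difference(low, high, n_draws: int = 2000, seed: int = 0):
    """Distribution of mean(low) - mean(high) under resampling with replacement."""
    rng = np.random.default_rng(seed)
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    diffs = np.empty(n_draws)
    for i in range(n_draws):
        low_sample = rng.choice(low, size=low.size, replace=True)
        high_sample = rng.choice(high, size=high.size, replace=True)
        diffs[i] = low_sample.mean() - high_sample.mean()
    return diffs

# Simulated buckets in which the high VIX bucket really does have a higher mean
rng = np.random.default_rng(42)
low_vix_returns = rng.normal(0.0, 1.0, 500)
high_vix_returns = rng.normal(1.0, 1.0, 500)

diffs = bootstrap_mean_difference(low_vix_returns, high_vix_returns)
confidence = (diffs < 0).mean()  # share of draws where high VIX beats low VIX
print(f"confidence that high VIX has the higher mean: {confidence:.1%}")
```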

A moment's thought shows it would be surprising if we got a different result. In finance we expect that a higher return comes with higher risk. We know that when VIX is high returns will have a higher volatility. So it's not shocking that they also have a higher average return.

So a better way of testing this is to use risk adjusted returns. This isn't the place to debate the best way of risk adjusting returns, I'm going to use the Sharpe Ratio and that is that. Here I define the Sharpe as the 20 business day return divided by the volatility of that return, and then annualised.

(You can see now why using the futures contract is better, because to calculate Sharpe Ratios I don't need to deduct the risk free rate)
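
A sketch of that Sharpe Ratio definition. It assumes 252 business days in a year, so roughly 12.6 non-overlapping 20 day periods, and it uses the sample standard deviation; the returns are made-up numbers:

```python
import numpy as np

def annualised_sharpe(period_returns, periods_per_year: float = 252 / 20) -> float:
    """Mean 20 day return over the volatility of those returns, annualised.
    No risk free rate is deducted, as futures returns are already excess returns."""
    returns = np.asarray(period_returns, dtype=float)
    return returns.mean() / returns.std(ddof=1) * np.sqrt(periods_per_year)

# Five made-up 20 business day returns
example_returns = [0.01, 0.03, -0.01, 0.02, 0.00]
sr = annualised_sharpe(example_returns)
print(round(sr, 2))
```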

Now we've adjusted for risk there is little to choose between the high VIX and low VIX environments. In fact things have reversed, with low VIX having a higher Sharpe Ratio than high VIX. But the difference in Sharpes is just 0.04, which isn't very much.

We can only be 63% confident that low VIX is better than high VIX. This is little better than chance, which would be 50% confidence.

An important point: notice that although the difference in Sharpes isn't significant, we do know it with reasonably high confidence, as each bucket of observations (high or low VIX) is quite large. We can be almost 100% confident that the difference was somewhere between -0.04 and +0.04.

"Hang on a minute!", I hear you cry. The point now is that vol is really really low now. The analysis above is for VIX above and below 20%. You want to know what happens to stock returns when VIX is incredibly low - below 10%.

The conditional Sharpe Ratio for VIX below 10 is actually negative (-0.14) versus the positive Sharpe we get the rest of the time (0.14). Do we have a newspaper story here?

Here is the plot of Sharpe Ratios for very low VIX below 10% (red), and the rest of the time (blue):

But hang on, where are the red bars in the plot? Well, remember there are only a tiny number of observations where we see VIX below 10. You can just about make them out at the bottom of the plot. In statistics, when we have a small number of observations we can also be much less certain about any inference we can draw from them.

Here for example is the plot of the difference between the Sharpe Ratio of returns for very low VIX and 'normal' VIX.

Notice that the amount of uncertainty about the size of the difference is substantial. Earlier it was between -0.04 and 0.04, now it's between -1 and 0.5; a much larger range. To reiterate this is because one of the samples we're using to calculate the expected difference in Sharpe Ratios is very small indeed. It does look however as if there is a reasonable chance that returns are lower when VIX is low; we can be 86% confident that this is the case.
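The effect of one tiny bucket can be seen from the textbook formula for the standard error of a difference between two independent sample means. The sketch below uses made-up numbers (equal spread in both buckets, and hypothetical bucket sizes) and assumes independent observations, which is a simplification for overlapping returns.

```python
import math


def std_error_of_difference(std_a, n_a, std_b, n_b):
    """Standard error of the difference between two independent sample means."""
    return math.sqrt(std_a ** 2 / n_a + std_b ** 2 / n_b)


# Illustrative numbers only: same spread in both buckets, different bucket sizes
balanced = std_error_of_difference(1.0, 3000, 1.0, 3000)
tiny_bucket = std_error_of_difference(1.0, 11, 1.0, 6000)

# How much wider the uncertainty is when one bucket has only 11 observations
ratio = tiny_bucket / balanced
```

The smaller bucket dominates the result: however many observations sit in the large bucket, the uncertainty cannot fall much below the spread of the small bucket divided by the square root of its size.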

Perhaps we should do a "proper" quant investigation, and take the top and bottom 10% of VIX observations, plus the middle, and compare and contrast. That way we can get some more data. After all, although statistics can allow us to make inferences from tiny sample sizes (like the 11 days the VIX closed below 10), it doesn't mean we should.

The big blue area is obviously the middle of the VIX distribution; whilst the purple (actually red on blue) is relatively low VIX, and the green is relatively high VIX.

It's not obvious from the plot but there is actually a nice pattern here. When the VIX is very low the average SR is 0.071; when it's in the middle the SR is 0.139, and when it's really high the SR is 0.20. 
Comparing these numbers the differences are actually highly significant (99.3% chance mid VIX is better than low VIX, 98.4% chance high VIX is better than mid VIX, and 99.999% chance high VIX is better than low VIX).
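For anyone wanting to reproduce this kind of split, here is a minimal sketch of labelling each observation as low, mid or high using the bottom and top 10% of the VIX distribution. The function name and the synthetic series are hypothetical, and the cut-offs are taken from the whole sample, so this is an in-sample exercise rather than something you could have traded.

```python
import numpy as np
import pandas as pd


def bucket_by_quantile(vix, low_q=0.1, high_q=0.9):
    """Label each observation 'low', 'mid' or 'high' using full-sample quantiles."""
    low_cut, high_cut = vix.quantile([low_q, high_q]).to_numpy()
    labels = np.select(
        [vix <= low_cut, vix >= high_cut],  # conditions checked in order
        ["low", "high"],
        default="mid",
    )
    return pd.Series(labels, index=vix.index)


# Synthetic VIX levels, purely for illustration
vix_levels = pd.Series(np.linspace(9.0, 40.0, 100))
labels = bucket_by_quantile(vix_levels)
counts = labels.value_counts()
```

With labels in hand, something like `subsequent_returns.groupby(labels)` gives the returns in each bucket, from which a conditional Sharpe Ratio can be calculated.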

So it looks like there might be something here - an inverse relationship between VIX and future equity returns. However to be clear you should still expect to make money owning S&P 500 when the VIX is relatively low - just a little bit less money than normal. Buying equities when the VIX is above 30 also looks like a good strategy. It will be interesting to see if market talking heads start pontificating on that idea when, at some point, the VIX gets back to that level.

"Hang on another minute!!", I hear you unoriginally cry, again. The original story I told at the top of this post was about VIX spiking in February 2007, and the stock market reacting about 18 months later. Perhaps 20 business days is just too short a period to pick up the effect we're expecting. Let's use a year instead.

The results here are more interesting. The best time to invest is when VIX is very high (average SR in the subsequent year, 1.94). So the 'buy when everyone else is terrified' mantra is true. But the second best time to invest is when VIX is relatively low! (average SR 1.14). These are both higher Sharpes than what you get when the VIX is just middling (around 0.94). Again these are also statistically significant differences (low VIX versus average VIX is 97% confidence, the other pairs of tests are >99%).

I could play with permutations of these figures all day, and I'd be rightly accused of data mining. So let me summarise. Buying when the VIX is really high (say above 30) will probably result in you doing well, but you'll need nerves of steel to do it. Buying when the VIX is really low (say less than 15) might give you results that are a little worse than usual, or they might not.

However there is nothing special about the VIX being below 10. We just can't extrapolate from the tiny number of times it has happened and say anything concrete.

Does the level of VIX predict vol?

Whilst the VIX isn't that great for predicting the direction of equity markets, I noted in passing above that it looks like it's pretty good at predicting their future volatility.

We're still conditioning on low, middling, and high VIX here but the response variable is the annualised level of volatility over the subsequent 20 days. You can see that most of the red (turning purple) low VIX observations are on the left hand side of the plot - low VIX means vol will continue to be low. The green (high VIX) observations are spread out over a wider area, but they extend over to the far right.


Low VIX (below 12.5): Average subsequent vol 8%
Medium VIX: Average subsequent vol 12.3%
High VIX: Average subsequent vol 21.9%

The differences between these numbers are massively statistically significant (above 99.99%). I get similar numbers for trying to predict one year volatility.
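The response variable here needs a little care, because it must only use returns that come strictly after the day the VIX is observed. A minimal sketch, assuming daily percentage returns in a pandas Series and 256 business days per year (both assumptions, as is the function name):

```python
import numpy as np
import pandas as pd


def subsequent_vol(daily_returns, window=20, days_per_year=256):
    """Annualised standard deviation of the NEXT `window` daily returns.

    The trailing estimate at day t covers days t-window+1..t, so shifting it
    back by `window` days leaves day t with the estimate for t+1..t+window.
    """
    trailing = daily_returns.rolling(window).std() * np.sqrt(days_per_year)
    return trailing.shift(-window)


# Synthetic daily returns, purely for illustration
rng = np.random.default_rng(0)
daily_returns = pd.Series(rng.normal(0.0, 0.01, size=300))
forward_vol = subsequent_vol(daily_returns)
```

The last 20 values are missing, since there is no complete future window for them. The unshifted trailing estimate is the "recent vol" measure, so the same function covers both sides of the comparison.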

So it looks like the current low level of VIX means that prices probably won't move very much. 

Does the level of vol predict vol?

The VIX is a forward looking measure of future volatility, and it turns out a pretty good one. However there is an even simpler predictor of future vol, and that is recent vol. The level of the VIX, and the level of recent volatility, are very similar - their correlation is around 0.77.

Skipping to the figures, how well does recent vol (over the last 20 days) predict subsequent vol (over the next 20 days)?

Recent Vol less than 6.7%: Average subsequent vol 7.9%
Recent Vol between 6.7% and 21.7%: Average subsequent vol 13.2%
Recent Vol over 21.7%: Average subsequent vol 23.4%

These are also hugely significant differences (>99.99% probability). 

The best way of predicting volatility

Interestingly, if you use the VIX to try and predict what the VIX will be in one month's time you find it is also very good. Basically both recent vol and implied vol (as measured by the VIX) cluster - high values tend to follow high values, and vice versa. Over the longer run vol tends not to stay high, but will mean revert to more average levels - and this applies to both implied vol (so the VIX) and realised vol.

So a complete model for forecasting future volatility should include the following:

  1. recent vol (+ve effect)
  2. current implied vol (the VIX) (+ve)
  3. recent vol relative to long run average (-ve)
  4. recent level of spot VIX relative to long run average (-ve)
  5. (You can chuck in intraday returns and option smile if you have time on your hands)
However there is decreasing benefit from including each of these things. Recent vol does a great job of telling you what vol is probably going to be in the near future. Including the current level of the VIX improves your predictive power, but not very much.
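One way to measure the benefit of each extra predictor is to compare the fit of nested regressions. The sketch below does this for the first two items in the list only, on synthetic data built from a slowly mean reverting "true" vol. Every name and parameter is hypothetical, and the numbers it produces say nothing about real markets; it only shows the mechanics.

```python
import numpy as np


def r_squared(predictors, target):
    """In-sample R squared of an ordinary least squares fit with an intercept."""
    X = np.column_stack([np.ones(len(target)), predictors])
    coefs, *_ = np.linalg.lstsq(X, target, rcond=None)
    residuals = target - X @ coefs
    return 1.0 - residuals.var() / target.var()


# Synthetic data: a persistent vol process that mean reverts towards 15%
rng = np.random.default_rng(42)
n = 2000
true_vol = np.empty(n)
true_vol[0] = 0.15
for t in range(1, n):
    shock = 0.01 * rng.standard_normal()
    true_vol[t] = true_vol[t - 1] + 0.05 * (0.15 - true_vol[t - 1]) + shock
true_vol = np.abs(true_vol)

# Two noisy measures of today's vol, and the value we want to forecast
recent_vol = true_vol[:-1] + 0.02 * rng.standard_normal(n - 1)
implied_vol = true_vol[:-1] + 0.03 + 0.02 * rng.standard_normal(n - 1)
future_vol = true_vol[1:]

r2_recent = r_squared(recent_vol, future_vol)
r2_both = r_squared(np.column_stack([recent_vol, implied_vol]), future_vol)
```

In sample, adding a predictor can never reduce R squared, so the interesting question on real data is how small the improvement is, and whether it survives out of sample. Note also that if the long run average is a constant, the "relative to long run average" terms are collinear with the levels once an intercept is included, so they only add information when the average is itself estimated on a rolling basis.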


The importance of the VIX to future equity returns is somewhat overblown. It's just plain silly to say we can forecast anything from something that's only happened on a handful of occasions in the past (granted that the handful in question belongs to someone with 11 fingers). Low VIX might be a signal that returns will be a little lower than average in the short term, but it is by no means a sign that inevitable doom is fast approaching.

If there is a consistent lesson here it's that very high levels of VIX are a great buy signal. 

The VIX is also helpful for predicting future volatility - but if you have room in your life for just one forecasting rule, using recent realised vol is better.