Tuesday, 4 March 2025

Very.... slow... mean reversion .... and some thoughts on trading at different speeds

 Bit of a mixed bag post today. The golden thread connecting them is the idea that markets trend and mean revert at different frequencies.

- A review of the discussion around timeframes for momentum and mean reversion in 'Advanced Futures Trading Strategies', in light of this excellent recent paper (which I also discussed on the TTU podcast, here from 1:02:12 onwards).

- A mea culpa on the mean reversion strategies in 'Advanced Futures Trading Strategies'. TLDR - there is an error in the backtest and they don't work at least in the specified form.

- A new slow.... absolute mean reversion strategy inspired by a question from Paul Calluzzo on the aforementioned podcast episode.

Note: in this article I use the terms momentum and trend following interchangeably to both mean absolute momentum - not relative.


When do markets trend and mean revert?

When do markets trend? When do they not trend... perhaps even mean reverting? This is a very important question! 

You might think it would depend on the market, but actually there seem to be some fairly common patterns across many different instruments. Here is how I summarised my thoughts in my most recent book, Advanced Futures Trading Strategies (AFTS):

Multi-year horizons: Mean reversion sort of works (although the results are not statistically significant, as the value strategy in part three attests). Note: this value strategy is a relative value strategy that looks for mean reversion within asset classes. Such strategies are common in academic equity research.

Several months to one year horizon: Trend following works, but is not at its best (consider the slightly poorer results we get for EWMAC64 versus faster trend variations).

Several weeks to several month horizon: Trend works extremely well (consider the excellent performance of EWMAC8, EWMAC16 and EWMAC32).

Several days to one week: Trend is starting to work less well (EWMAC4 and especially EWMAC2 perform somewhat worse than slower variations, even before costs are deducted).

A few days: We might expect mean reversion to work?

Less than a day: We might expect trend to work?

Less than a second: Mean reversion works well (high frequency trading - HFT - is very profitable).      

Note that 'momentum works for months or years' and 'mean reversion / value works for years' is a very well known stylised fact which has been established in the literature for many decades; see for example this seminal paper. And given the existence of profitable CTAs with holding periods in the weekly to monthly range, it's hardly surprising that momentum works for shorter holding periods. Nor is the fact that HFT firms make a ton of money a secret.

However, in the region between high frequency trading and a horizon of a week or so I wasn't sure exactly what to expect, but I speculated that there would be a region where mean reversion would start to work (more on that later!), and I also thought trend following would work with holding periods in the 'few hours' range (mainly because there has been some sell side research on that). Note that since my own data is hourly at best, I couldn't really test anything with a horizon of less than about one day.

Fortunately someone came along to fill in this gap in our understanding, with this excellent paper:

"Trends and Reversion in Financial Markets on Time Scales from Minutes to Decades" by Sara A. Safari and Christof Schmidhuber

I won't summarise the paper in much detail (for example it has some interesting results around the relationship between trend strength and reversal), but they have the following pattern of results (from figure 10 in the paper):


Horizon over two years: Mean reversion works, becoming more effective at longer horizons. They used literally centuries of data to check this result. 

One week to two year horizon: Trend following works, but its effectiveness peaks at around one year.

One hour to one week horizon: Trend following works, getting gradually less effective as the time horizon shortens.

Two minute to 30 minute horizon: Mean reversion works, and is most effective at the 4-8 minute horizon


The key differences between my results and theirs only occur in my 'zone of speculation', where I was only guessing and they had actual evidence so let's go with them :-) In particular they have two 'crossing points' from when mean reversion stops working and momentum starts working (at just over two years, and somewhere between 30 minutes and one hour), giving the following broad ranges:

Horizon over two years: Mean reversion works.

One hour to two year horizon: Trend following works.

Two minute to 30 minute horizon: Mean reversion works

I, by contrast, had speculated that there was something more complicated going on. Even without evidence, Occam's razor would suggest you should prefer their results to mine.

Another difference is that when I looked at 'optimal points' for eg momentum I was concerned with Sharpe Ratio, but they are instead fitting a response function and seeing when it has the best statistical fit. Because of the Law Of Active Management, Sharpe Ratio (loosely) scales inversely with the square root of the time horizon for a given level of prediction accuracy. So if you are equally good at predicting one year trends and 3 month trends, the latter will have twice the Sharpe Ratio of the former. Hence there are good reasons why my optimal SR point is different from their optimal response point; all other things being equal the optimal SR is going to be at a shorter horizon.
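As a rough illustration of that square root scaling, here is a minimal sketch. The function name and the 'skill' parameters are hypothetical, and it assumes nothing more than SR being proportional to prediction accuracy divided by the square root of the horizon:

```python
import math


def relative_sharpe(slow_horizon, fast_horizon, slow_skill=1.0, fast_skill=1.0):
    """SR of the faster strategy relative to the slower one.

    Assumes (loosely) SR is proportional to skill / sqrt(horizon), with both
    horizons expressed in the same units (eg months).
    """
    return (fast_skill / slow_skill) * math.sqrt(slow_horizon / fast_horizon)


# Equal skill at one year (12 months) and at 3 months
ratio = relative_sharpe(slow_horizon=12, fast_horizon=3)
print(f"3 month SR is {ratio:.1f}x the one year SR")
```

If the faster horizon also has worse predictability, pass a lower fast_skill and the advantage shrinks accordingly.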

Combining the two pieces of research together, and thinking about what sorts of strategies we could be trading, we get this:

Horizon over two years: Mean reversion works. The optimal SR is probably quite flat for anything between three and ten years. Equity value, relative value within asset classes, and absolute mean reversion (of which more in a moment) are all nice strategies. But given their holding period you shouldn't expect high Sharpe Ratios unless you are Warren Buffett (hi Warren!). 

One to two years: Momentum will work but will be getting steadily worse as the timescale gets longer, both from a predictability perspective and a Sharpe Ratio viewpoint. Avoid.

Three months to one year horizon: Trend following works with high predictability, but is not at its highest Sharpe ratio due to the slow turnover. However, the advantage here is that this is a playing field that even retail punters with expensive trading costs can play in. Slower momentum strategies are all good.

Three weeks to three months horizon: Trend following probably has its optimal Sharpe Ratio somewhere in this region, depending on the asset class. Any medium speed momentum strategies are good, and nearly all futures traders can play in this area if they avoid a few very expensive instruments.

Several days to three weeks: Trend is starting to work less well (because the improvement from trading faster is being overwhelmed by the deficit in response) and trading costs will start to bite except for the very cheapest futures (see calculation below), traded with exemplary execution. On the upside, trend following models at this speed will have the highest positive skew. Trade selectively.

A few hours to several days: Trend still just about works but there are probably only a small number of futures where you can overcome the bid/offer costs (although I hear costs are very low in Crypto, and there might be US traders who get zero commission able to trade highly liquid ETFs like SPY); though I doubt it would be worth doing. As the authors note, strong trends also tend to reverse strongly in this region (see AFTS for my own confirmation of this effect). Against that there have been the sell side papers on this subject, but they seem to rely on gamma hedging effects which may not persist. Avoid.

1 hour to a few hours: The authors in the paper note that the very weak trend effect here can't overcome the tick size effect. Avoid.

Two minute to 30 minute horizon: Mean reversion works, and is most effective at the 4-8 minute horizon from a predictive perspective; although from a Sharpe Ratio angle it's likely the benefits of speeding up to a two minute trade window would overcome the slight loss in predictability. There is no possibility that you would be able to overcome trading costs unless you were passively filled, with all that implies (see below). Automating trading strategies at this latency - as you would inevitably want to do - requires some careful handling (although I guess consistently profitable manual scalpers do exist that aren't just roleplaying instagram content creators, I've never met one). Fast mean reversion is also of course a negatively skewed strategy so you will need deep pockets to cope with sharp drawdowns. Trade mean reversion but proceed with great care.

Less than a second to two minutes: Not covered in the paper, but I would speculate that mean reversion continues to work, and the pre-cost Sharpe Ratio would also continue to improve as the horizon falls. Proceed with even more care.

Less than a second: High frequency trading works, and clearly has a very high Sharpe Ratio, but this is not for the amateur.


Notes on costs: 

The very cheapest equity index future I trade has a cost of around 0.2bp assuming we execute market orders; and vol of around 20% a year, for a SR cost of  about 1bp. Median single instrument SR on the optimal trend strategy (holding period around 3 weeks) is around 0.30. Predictability, as a regression coefficient, from the linked paper is around 6.5% at 3 weeks; and around 1.8% at 2 days (a reduction of 3.6x). Time scaling would improve the SR by 2.7x so the net effect is a 25% fall in SR to around 0.22 for a two day forecast horizon. 

If we take a third of that 0.22 SR for costs (my 'speed limit'), then our annual cost budget is 0.07 SR or 700bp. At about 1bp of SR per trade that implies we could perhaps safely trade a couple of times a day, so a two day forecast horizon (which means trading once a day) is possible.

But the median future I get data for has a cost of around twenty times that, meaning a holding period of around two weeks is required to meet the 'speed limit'.
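The arithmetic in these notes can be sketched as follows. This is only a back of the envelope check, not my production code: the function names are made up, I assume 256 business days a year and take three weeks as 15 business days, and for the median instrument I use the three week SR of 0.30 as a rough stand-in for the pre-cost SR.

```python
import math

DAYS_PER_YEAR = 256  # assumption: business days in a year


def sr_cost_per_trade(cost_fraction, annual_vol):
    """Cost of one trade in Sharpe Ratio units (cost divided by annual vol)."""
    return cost_fraction / annual_vol


def rescaled_sharpe(base_sr, base_predictability, new_predictability, base_days, new_days):
    """Scale SR for a change in predictability and in forecast horizon."""
    accuracy_ratio = new_predictability / base_predictability
    time_scaling = math.sqrt(base_days / new_days)
    return base_sr * accuracy_ratio * time_scaling


def affordable_trades_per_year(pre_cost_sr, cost_sr, speed_limit=1 / 3):
    """Trades per year if at most speed_limit of the pre-cost SR goes on costs."""
    return speed_limit * pre_cost_sr / cost_sr


cheap_cost = sr_cost_per_trade(0.2e-4, 0.20)              # 0.2bp cost, 20% vol
two_day_sr = rescaled_sharpe(0.30, 0.065, 0.018, 15, 2)   # roughly 0.22 to 0.23
cheap_trades = affordable_trades_per_year(two_day_sr, cheap_cost)
median_trades = affordable_trades_per_year(0.30, 20 * cheap_cost)

print(f"SR cost per trade: {cheap_cost:.4f}")
print(f"Two day horizon SR: {two_day_sr:.3f}")
print(f"Cheapest future: {cheap_trades / DAYS_PER_YEAR:.1f} trades a day")
print(f"Median future: {median_trades:.0f} trades a year")
```

For the median future this comes out at about one trade a week, which is consistent with a holding period of around two weeks if the forecast horizon is roughly twice the trading interval.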

Do we have to pay the spread? Broadly speaking, if you are trading slowly, then you can afford to be more patient in your execution, using passive fills where possible (as I do myself). But as a fast trend follower who thinks the price is going to move away from you in the near future, it's probably harder to sit on your hands and wait. 

Alternatively if we are fast mean reverting traders then we can use passive fills by setting limits around where we think the equilibrium is. That of course runs the risk of adverse selection, but without doing this we are never going to make enough money to overcome the bid/offer if we're trading dozens of times a day. You may also still be liable to commissions unless you receive exchange rebates from providing liquidity. Note since we earn the bid/offer spread from passive fills, it might be that the best instruments for this strategy are those with wider rather than narrower bid/offers.



Forgive me father, for I have sinned against the gods of backtesting...

Now in AFTS I introduce two strategies which trade mean reversion, with horizons of around a week (since I'd speculated that would work). It included a very elegant way of including limit orders to passively execute, and the second strategy introduced a very nice trend following overlay. And it looked great! But that obviously isn't consistent with the findings above.

Well gentle reader, I screwed up. As I said in my book:

"But what jumps out from this table is the Sharpe ratio. It is impressively large, and the first we have seen in this book that is over two. In my career as a quantitative trader I have always had a long standing policy: I do not trust a back tested Sharpe ratio over two. There are certainly plenty of reasons not to trust this one. 

Firstly, the historical back test period, just over ten years, is shorter than I would like. There are good reasons to suppose that the last ten years included unusual market conditions that might just have favored this strategy. Secondly, it is hard to back test a strategy deploying limit orders that effectively trades continuously using hourly data. There may well be assumptions or errors in my code that make the results look better than they really would have been."

The underlined section (not underlined in the book!) is key here; basically there was an implicit forward fill in my backtest as I calculated the equilibrium price including today's closing price (which of course I wouldn't have known in the morning). The real backtest shows basically no statistically significant return at all.

The good news is that the basic technology of this strategy should work well, at least pre-costs, with a much shorter time horizon; although for all the reasons above I haven't tried it myself (though I know others that have).




A new slow absolute mean reversion strategy

Since I'm taking away one strategy, let me replace it with another. In AFTS strategy twenty two is a 'value' strategy, which bets on mean reversion over five year periods in relative terms against an asset class index. It has crappy SR (basically zero), but positive alpha and improved overall SR when added to trend and carry strategies. 

But on a recent TTU podcast, here, Paul Calluzzo asked me if I'd ever tested absolute mean reversion. Certainly I haven't on this blog. So let's do that.

I'll use a three year return for my forecast, which is slow enough to avoid the two year point where we know momentum probably still works; whilst being quick enough to avoid the death by sqrt(T) that will reduce my SR. We go long if the return is negative so:

Forecast = Price_t-3yrs - Price_t

To avoid the turnover being excessive (this is a slow forecast!), and because we should always vol scale:

Smoothed vol scaled Forecast = EWM_64(Forecast/ EW_std_dev(returns))
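Here is a minimal pure Python sketch of that calculation, for illustration only; it is not the code behind the results that follow. The assumptions are mine: 256 business days a year, 'returns' taken as daily price changes, a span of 32 for the volatility estimate, and EWM_64 read as an exponentially weighted mean with a span of 64. All function names are hypothetical.

```python
import math

BUSINESS_DAYS_PER_YEAR = 256  # assumption: trading days per year


def ewma(values, span):
    """Exponentially weighted mean; None inputs leave the running value unchanged."""
    alpha = 2.0 / (span + 1)
    out, state = [], None
    for v in values:
        if v is not None:
            state = v if state is None else alpha * v + (1 - alpha) * state
        out.append(state)
    return out


def ew_std(values, span):
    """Exponentially weighted standard deviation (incremental update)."""
    alpha = 2.0 / (span + 1)
    out, mean, var = [], None, 0.0
    for v in values:
        if mean is None:
            mean = v
        else:
            diff = v - mean
            mean += alpha * diff
            var = (1 - alpha) * (var + alpha * diff * diff)
        out.append(math.sqrt(var))
    return out


def slow_mean_reversion_forecast(prices, years=3, vol_span=32, smooth_span=64):
    """Smoothed, vol scaled forecast: long when the lookback return is negative."""
    lookback = years * BUSINESS_DAYS_PER_YEAR
    changes = [0.0] + [b - a for a, b in zip(prices, prices[1:])]
    vol = ew_std(changes, vol_span)
    raw = []
    for t, price in enumerate(prices):
        if t >= lookback and vol[t] > 0:
            raw.append((prices[t - lookback] - price) / vol[t])
        else:
            raw.append(None)  # not enough history yet
    return ewma(raw, smooth_span)


# Toy example: a steadily rising price should produce a negative (short) forecast
rising = [100 + 0.1 * t + math.sin(t) for t in range(1000)]
forecast = slow_mean_reversion_forecast(rising)
print(forecast[-1])
```

In practice you would also want to scale and cap the forecast before turning it into positions, which this sketch leaves out.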

Drumroll...

It's not..... great (SR -0.48), apart from perhaps a recent pickup. You could argue that as a lot of my data starts in 2013, and the first five year return occurs in 2018, that it's actually profitable for many instruments and we've just been unlucky in the instruments we've traded before. The median SR is -0.06 though which doesn't completely support that argument. 

But really it would appear that at least with this construction absolute mean reversion isn't as good as the relative mean reversion I tested in AFTS.

OK so we've dropped a strategy with an unfeasibly high backtested SR, and I've replaced it with one that has a very poor backtested SR. Unfair? Well, life isn't fair.


Summary

Good things to trade:

Horizon over two years:  Cross sectional mean reversion, but possibly not absolute mean reversion. And similar type things like equity relative value.

One to two years: Nothing*

Three months to one year horizon: Trend following of pretty much anything

Three weeks to three months horizon: Trend following; avoiding very expensive instruments.

Several days to three weeks: Trend following; only the very cheapest instruments.

One hour to several days: Nothing*

Less than 30 minute horizon: Mean reversion - the faster the better, but only with limit orders and with great care (the faster you are, the more care needs to be taken).

* or at least not outright momentum or mean reversion

Thursday, 6 February 2025

How much should we get paid for skew risk? Not as much as you think!

 A bit of a theme in my posts a few years ago was my 'battle' with the 'classic' trend followers, which can perhaps be summarised as:

Me: Better Sharpe!

Them: Yeah, but Skew!!

My final post on the subject (when I realised it was a futile battle, as we were playing on different fields - me on the field of empirical evidence, them on .... a different field) was this one, in which the key takeaway was this:

The backtest evidence shows that you can achieve a higher maximum CAGR with vol targeting, because it has a large Sharpe Ratio advantage that is only partly offset by its small skew disadvantage. For lower levels of relative leverage, at more sensible risk targets, vol targeting still has a substantially higher CAGR. The slightly worse skew of vol targeting does not become problematic enough to overcome the SR advantage, except at extremely high levels of risk; well beyond what any sensible person would run.

And another more recent post was on Bitcoin, and why your allocation to it would depend on your appetite for skew. 

With those in mind I recently came to the insight that I could use my framework of 'maximising expected geometric mean / final wealth at different quantile points of the expectation distribution given you can use leverage or not'* to give an intuitive answer to an intriguing question - probably one of the core questions in finance:

"What should the price of risk be?"

* or MEGMFWADQPOTED for short - looking actively for a better acronym - which I used in the Bitcoin post linked to above, but explain better in the first half of this post and also this one from a year ago

The whole academic risk factor literature assumes the price of risk often without much reasoning. We can work out the size of the exposure, and the risk of the factor, but that doesn't really justify its price. After all, academics spent a long time justifying the equity risk premium.

I think it would be fun to think about the price of different kinds of risk. Given the background above, I'll focus on skew (3rd moment) risk, but I will also briefly discuss standard deviation (2nd moment) risk. Generally speaking the idea is to answer the question "What additional Sharpe Ratio should an investor require for each unit of additional risk in the form of X?" Whilst this has certainly been covered by academics at some length, I think the approach of expressing risk preference as optimising for different distributional points is novel, and it means pretty graphs.

I'm going to assume you're familiar with the idea of maximising geometric return / CAGR / log(final wealth) at some distributional point (50% median or more conservative points like 10, 25%), to find some optimal level of leverage. If not, enjoy reading the prior work.


The "price" of standard deviation risk - with and without leverage

To an investor who can use leverage, for Gaussian normal returns, this is trivial. We want the highest Sharpe Ratio asset, irrespective of what its standard deviation is. Therefore the 'price' of standard deviation is zero. We don't mind getting additional standard deviation risk as long as it doesn't affect our Sharpe Ratio - we don't need a higher SR to compensate. Indeed in practice, we might prefer higher standard deviations since it will require less potential leverage that could be problematic if we are wrong about our SR estimates or assumptions about return distributions.

In classical Markowitz finance, to an investor who cannot use leverage, the price of standard deviation is negative. We will happily pay for higher risk in the form of a lower Sharpe Ratio. We want higher returns at all costs; that may come at the cost of higher standard deviation so we aren't fully compensated for the additional risk, but we don't care. This is the 'betting against beta' explanation from the classic Pedersen paper. Consider for example an investment with a mean of 5% and a standard deviation of 10% for a Sharpe Ratio of 0.5 (I set the risk free rate to zero without loss of generality). If the standard deviation doubles to 20%, but the mean only rises to 6%, well we'd happily take that higher mean. We'd even take it if the mean only increased by 0.00001%. That means the 'price' of higher standard deviation is not only negative, but a very big negative number.

But we are not maximising arithmetic mean. Instead we're maximising geometric mean, which is penalised by higher standard deviation. That means there will be some point at which the higher standard deviation penalty for greater mean is just too high. For the median point on the quantile distribution, which is a full Kelly investor, that will be once the standard deviation has gone above the Kelly optimal level. Until that point the price of risk will be negative; above it will turn positive.

Consider again an arbitrary investment with a mean of 5% and a standard deviation of 10%; SR = 0.5. If returns are Gaussian then the geometric mean will be 4.5%. The Kelly optimal risk is much higher at 50%, which means it's likely the local price of risk is still negative. So for example, if the standard deviation goes up to 20%, with the mean rising to say 6.5%, for a new (lower) SR of 0.325; we'd still end up with the same geometric mean of 4.5%. In this simple case the price of 10% units of risk is a SR penalty of 0.175; we are willing to pay 0.0175 units of SR for each 1% unit of standard deviation. 

If however the standard deviation goes up another 10%, then the maximum SR penalty for equal geometric mean we would accept is 0.025 units (getting us to a SR of 0.3, or returns of 9% a year on 30% standard deviation, equating again to a geometric mean of 4.5%); and for any further increase in standard deviation we will have to be paid SR units. This is because the standard deviation is now 30% and so is the SR; we are at the Kelly optimal point. We wouldn't want to take on any additional standard deviation risk unless it is at a higher SR, which will then push the Kelly optimal point upwards.

So we'd need to get paid SR units to push the standard deviation up to say 40%. With 40% standard deviation we'd only be interested in taking the additional risk if we could get a SR of 0.3125 to maintain the geometric mean at 4.5%. Something weird happens here however, since 40% is higher than the new Kelly optimal we can actually get a higher geometric mean if we used less risk (basically by splitting our investment between cash and the new asset). To actually want to use that 40% of risk the SR would trivially have to be 40%. For someone who is remaining fully invested the price of standard deviation risk once you hit the Kelly optimal is going to be 1:1 (1% of standard deviation risk requiring 0.01 of SR benefit).
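The numbers in this example can be checked with a few lines of code. This sketch assumes the usual Gaussian approximation that the geometric mean is the arithmetic mean less half the variance; the function names are my own.

```python
def geometric_mean(mean, stdev):
    """Gaussian approximation: arithmetic mean less half the variance."""
    return mean - 0.5 * stdev ** 2


def required_sharpe(target_geometric_mean, stdev):
    """SR needed at a given standard deviation to hit a target geometric mean."""
    return (target_geometric_mean + 0.5 * stdev ** 2) / stdev


base_gm = geometric_mean(0.05, 0.10)  # the 5% mean, 10% standard deviation asset

for stdev in (0.10, 0.20, 0.30, 0.40):
    sr = required_sharpe(base_gm, stdev)
    print(f"stdev {stdev:.0%}: required SR {sr:.4f}, required mean {sr * stdev:.2%}")
```

The required SR falls until the standard deviation reaches 30%, where SR and standard deviation are equal (the Kelly optimal point), and rises after that.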

That is all for a Kelly optimal investor, but how would using my probabilistic methodology with a lower quantile point than the median change this? Well clearly, that would penalise higher standard deviations more, lowering the point at which the price of standard deviation risk turns from negative to positive.

Because the interaction of leverage and Kelly optimal is complex and will depend on exactly how close the initial asset is to the cutoff point, I'm not going to do more detailed analysis on this as it would be time consuming to write, and to read, and not add more intuition than the above. Suffice to say there is a reason why I usually assume we can get as much leverage as required!


The "price" of skew - with leverage

Now let's turn to skew (and let's also drop the annoying lack of leverage which makes our life so complicated). The question we now want to answer is "What is the price of skew: how many additional points of SR do we need to compensate us for a unit change in skew, assuming we can freely use leverage? And how does this change at different distributional points?". Returning to the debate that heads this post; is an extra 0.50 units of skew worth a 0.30 drop in SR when we go from continuous to 'classical' trend following? We know that would only be the case if we were allowed to use a lot of leverage; which implies we were unlikely to be anything but a full Kelly optimising median distributional point investor. But at what distributional point does that sort of tradeoff become worth it?

To answer this, I'm going to recycle some code from this post and adapt it. That code uses a brute force technique, mixing Gaussian returns to produce returns with different levels of skewness and fat tailedness but with the same given Sharpe Ratio. We then bootstrap those returns at different leverage levels. That gives us a distribution of returns for each leverage level. We can then choose the optimal leverage that produces the maximum geometric return at a given distributional point (eg median for full Kelly, 10% to be conservative and so on). I then have an expected CAGR level at a given SR, for a given level of skew and fat tailedness. By modifying the SR, skew and fat tailedness I can see how the geometric return varies, and construct planes where the CAGR is constant. From that I can derive the price of skew (and fat tailedness, but I will look at that in a moment) in SR units at different distributional points. Phew!
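To make the idea concrete, here is a heavily simplified sketch of the general approach. It is not the code used for the results in this post. The assumptions are mine: a two component Gaussian mixture with occasional negative jumps, 256 business days a year, a small grid of leverage levels, and made up function names and parameter values.

```python
import math
import random
import statistics

DAYS = 256  # assumption: business days per year


def skewed_returns(n, target_sr, annual_vol, jump_prob, jump_mean, jump_std, rng):
    """Mix two Gaussians, then rescale so the sample has exactly target_sr."""
    raw = [
        rng.gauss(jump_mean, jump_std) if rng.random() < jump_prob else rng.gauss(0.0, 1.0)
        for _ in range(n)
    ]
    mean, std = statistics.fmean(raw), statistics.pstdev(raw)
    daily_vol = annual_vol / math.sqrt(DAYS)
    daily_mean = target_sr * annual_vol / DAYS
    return [daily_mean + daily_vol * (x - mean) / std for x in raw]


def sample_skew(returns):
    mean, std = statistics.fmean(returns), statistics.pstdev(returns)
    return statistics.fmean([((r - mean) / std) ** 3 for r in returns])


def annual_sharpe(returns):
    return statistics.fmean(returns) / statistics.pstdev(returns) * math.sqrt(DAYS)


def annual_log_growth(path, leverage):
    """Annualised log growth of a leveraged path (floored to avoid log of <= 0)."""
    total = sum(math.log(max(1e-9, 1.0 + leverage * r)) for r in path)
    return total * DAYS / len(path)


def optimal_leverage(returns, quantiles, leverages, n_paths, path_len, rng):
    """Leverage maximising growth at each quantile of the bootstrapped distribution."""
    paths = [rng.choices(returns, k=path_len) for _ in range(n_paths)]
    growth = {lev: sorted(annual_log_growth(p, lev) for p in paths) for lev in leverages}
    return {
        q: max(leverages, key=lambda lev: growth[lev][int(q * (n_paths - 1))])
        for q in quantiles
    }


rng = random.Random(42)
leverages = [1, 2, 3, 4, 5, 6, 7, 8]
returns = skewed_returns(2560, 0.5, 0.10, 0.02, -3.0, 2.0, rng)
best = optimal_leverage(returns, [0.1, 0.5], leverages, 100, 1280, rng)
print(f"skew {sample_skew(returns):.2f}, SR {annual_sharpe(returns):.2f}, leverage {best}")
```

To get from this to a price of skew you would repeat the exercise across a grid of SR and skew values, record the growth achieved at the optimal leverage, and look for combinations with equal growth; that repetition is where the hours of compute time go.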

(Be prepared to set aside many hours of compute time for this exercise if you want to replicate...)


The "price" of skew: Kelly investor

Let's begin by looking at the results for the Kelly maximiser who focuses on the median point of the distribution when calculating their optimal leverage. 

The plots show 'indifference curves' at which the geometric mean is approximately equal. Each coloured line is for a different level of geometric mean. The plots are 'cross plots' that show statistical significance and the median of a cloud of points, as due to the brute force approach there is a cloud of points underneath.

Even then, there is still some non monotonic behaviour. But hopefully the broad message is clear; for this sort of person skew is not worth paying much for! At most we might be willing to give up 4 SR basis points to go from a skew of -3 to +3, which is a pretty massive range.



The "price" of skew: very conservative investor

Now let's consider someone who is working at the 10% quantile point.

If anything these curves are slightly flatter; at most the price of skew might be a couple of basis points. The intuition for this is that these people are working at much lower levels of leverage. They are much less likely to see a penalty from high negative skew, or much of a benefit from a high positive skew.


The "price" of lower tail risk: Kelly investor

Now let's consider the lower tail risk. Remember, a ratio of 1 means we have a Gaussian distribution, and a value above 1 means the left tail is fatter.


This may seem surprising; with a more extreme left tail it looks like you can have a higher SR. But the improvement is modest again, perhaps 5bp of SR at most.


The "price" of lower tail risk: 10% percentile investor

Once again, investors at a lower point on the quantile spectrum are less affected by changes in tail risk, requiring perhaps 3bp of SR in compensation.


How does the optimal leverage / skew relationship change at different percentiles?

As we have the data we can update the plots done earlier and consider how optimal leverage changes with skew. First for the Kelly investor:




Here each coloured line is for a different SR. We can see that for the lowest SR the optimal leverage goes from around 2.7 to 3.7 between the largest negative and positive skews; and for the highest from around 4.2 to 5.6. This is the same result as the last post: leverage can be higher if skew is positive, but not that much higher (from skew of -2 to +2 we can leverage up by around a third).

Here is the 10% investor:




The optimal leverage is lower as you would expect, since we are scaredy cats. It looks like the leverage range is higher though; for the highest SR strategies we go from around 1.7 to 2.8; a two thirds increase. And for the lower SR the rise in optimal leverage is even more dramatic. 


 

One final cut of the data cake

Finally another way to slice the cake is to draw different coloured lines for each level of skew and then see how the geometric mean varies as we change Sharpe Ratio. First the Kelly guy:


This is really reinforcing the point that skew is second order compared to Sharpe Ratio. Each of the bunches of coloured lines is very close to each other. At the very lowest SR at around 0.52 we only get a modest improvement in CAGR going from skew of -2.4 (purple) to +2.4 (red). We get a bigger improvement in CAGR when we add around 3bp of SR and move along the x-axis. Hence 5 units of skew are worth less than 3bp in SR. It's only at relatively high levels of SR that skew becomes more valuable; perhaps 5bp of SR for each 5 units of skew.


Here is the 10% person:


As we noted before there is almost no benefit from skew for the conservative investor (coloured lines close together at each SR point), until SR ramps up. At the end 5 units of skew are worth the same as around 6bp of SR. 


Conclusion: Skew isn't as valuable as you might think

I started this post harking back to this question: is an extra 0.50 units of skew from 'traditional' trend following worth a 0.30 drop in SR? And the answer is, almost certainly not. The best price we get for skew is around 6bp for 5 units of skew. At that price, 0.5 units of skew should cost us less than 1bp in SR penalty. We're being charged about 50 times the correct price!!!
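For the avoidance of doubt, here is that arithmetic as code. It assumes that 'bp' of SR in this post means 0.01 units of Sharpe Ratio, which is the reading under which the 50 times figure works out; the variable names are mine.

```python
SR_POINT = 0.01  # assumption: one 'bp' of SR in this post is 0.01 SR units

best_price_per_unit_skew = 6 * SR_POINT / 5    # 6bp buys 5 units of skew
fair_penalty = 0.5 * best_price_per_unit_skew  # fair SR cost of 0.5 units of skew
charged_penalty = 0.30                         # SR given up for 'classical' trend following

overcharge = charged_penalty / fair_penalty
print(f"Fair penalty {fair_penalty:.3f} SR, overcharged by {overcharge:.0f}x")
```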

And this is for Kelly investors. For those with a lower risk tolerance, much of the time there is basically no significant benefit from skew.

That doesn't mean that you shouldn't know what your skew is, as it will affect your optimal leverage, particularly as we saw above if you are a conservative utility person (being such a person will also protect you if you think your skew or Sharpe ratio is better than it actually is, and that's no bad thing). And negatively skewed strategies à la LTCM with very low natural vol that have to be run at insane leverage will always be dangerous, particularly if you don't realise they are negatively skewed. 

But part of the problem with the original debate is a false argument by taking a true statement 'highly negatively skewed strategies are very dangerous with leverage' and extending it to 'you should be happy to suffer significantly lower Sharpe Ratio to get a marginally more positive skew' (which I have demonstrated is false). 

Anyway outside of that argument I think I have shown that to an extent the obsession with getting positive skew is a bit of an unhealthy one. Sure, get it if it's free, but don't pay much for it otherwise. 









Tuesday, 7 January 2025

Do less liquid assets trend better or is it that they are just more diversified?

 As most of you know, one of the many projects / things I am involved with is the TTU Systematic Investor podcast series where I'm one of the rotating cast of co-hosts.

On a recent episode (at 24:05) we discussed the reasons why 'alt' CTAs tend to do better than traditional CTAs. Examples of alt-CTAs mentioned in that segment are the Man-AHL Evolution fund which I was heavily involved with when I was at AHL, and the Florin Court product  which is run by some ex-AHL colleagues. 

(Other funds are available and this is not endorsement or financial advice which I am not regulated to provide. It may be utterly illegal of you to even be aware of these products in your jurisdiction never mind invest in them, and that is your problem not mine)

An 'alt-CTA' is one that trades non traditional markets, but in a traditional way (eg mostly by trend following). These could be less liquid futures markets, but are more likely to be non futures markets like options, OTC derivatives or cash equities. In this article I'm going to focus on the 'less liquid futures' definition of alt-, because that is the data I happen to have. This means that the analysis is also analogous to one of the classical issues in financial economics - the small cap effect in equities. 

In that episode I mentioned some research I had once done on that very topic; albeit many, many years ago, and that document certainly isn't available on my blog. So I thought it worth redoing this exercise.


Reasons why alt CTAs might do better

There are a number of reasons why one CTA might outperform another, but we're going to focus on just three here:

  • more diversification (the products they trade have lower correlations with each other, and/or nice co-skewness properties)
  • better pre-cost performance from the products they are trading
  • lower costs

Now of course we would expect higher costs from less liquid futures; the key question is whether we get enough extra pre-cost performance to compensate.... or no extra performance at all. In which case is the extra alt-CTA juice coming from the diversification properties of the alt-markets (either linear correlation or something funkier in the higher moments)? Or will my simple analysis fail to uncover any extra alt-performance, either because the alt-CTAs have some extra magic or because their special black magic power can only be found in non-futures markets. Or because they've just been lucky.

In any case we'll see if the equivalent of the 'small cap' effect in stocks is present in futures, or if it's something that was around in the past but has gone.

Note: There is some debate about whether the small cap effect, either outright, or in combination with the value effect, is still a thing. 


What we are measuring

We need a way of measuring:

  • the liquidity of futures
  • the trend following performance

To keep things simple, for trend following performance I'm going to use the Sharpe Ratio of an EWMAC16,64 trend following continuous forecast with my usual vol based position sizing. To calculate the Sharpe Ratio for a given period (eg a year), I'll use the annualised average daily percentage return divided by the expected annual percentage standard deviation. So this is a Sharpe Ratio based on the targeted vol, not the realised vol. This is because for short periods we might have a weak signal producing a high SR on a contract we didn't actually make any significant money out of. 
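To make the distinction concrete, here is a minimal sketch of the two ways of calculating a Sharpe Ratio. The function names, the 256 day annualisation and the 20% vol target are my assumptions for illustration, and the returns are made up; this is not lifted from my backtest code.

```python
from statistics import mean, stdev

DAYS_PER_YEAR = 256  # assumption: a common business-day annualisation convention


def vol_targeted_sr(daily_pct_returns, annual_vol_target_pct):
    """Annualised mean daily % return divided by the *expected* annual % vol."""
    return mean(daily_pct_returns) * DAYS_PER_YEAR / annual_vol_target_pct


def realised_sr(daily_pct_returns):
    """Conventional SR: annualised mean divided by annualised realised stdev."""
    annual_mean = mean(daily_pct_returns) * DAYS_PER_YEAR
    annual_stdev = stdev(daily_pct_returns) * DAYS_PER_YEAR ** 0.5
    return annual_mean / annual_stdev


# Hypothetical weak-signal year: tiny positions, so tiny but steady daily returns (in %)
tiny_returns = [0.001, 0.003] * 128

targeted = vol_targeted_sr(tiny_returns, annual_vol_target_pct=20.0)
realised = realised_sr(tiny_returns)

print(f"SR on targeted vol: {targeted:.4f}")
print(f"SR on realised vol: {realised:.1f}")
```

The realised vol version rewards the instrument for a year in which we barely had a position on; the targeted vol version doesn't.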

For futures liquidity, I'm going to use the 30 day rolling average of daily volume in $ million of annualised risk units for the contract that currently has the highest volume. That is the same measure I track daily here. And then I'm going to log(x) this volume, as these figures vary by many orders of magnitude.

Note: I currently set this measure at a minimum of $1.5 million to trade a given future. 

Note: The definition of $ annualised risk units is the number of contracts of volume, multiplied by the annual standard deviation in price units, multiplied by the $ value of each price unit.
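As a rough sketch, the liquidity measure looks something like this. The function names and example numbers are hypothetical, I'm assuming the rolling average is taken over contracts before converting to risk units, and the log is the natural log (consistent with a log(volume) of 0 meaning $1m).

```python
import math

MIN_VOLUME_MUSD = 1.5  # the minimum from the note above, in $m of annualised risk


def rolling_average(values, window=30):
    """Mean of the last `window` values (or of all of them if fewer exist)."""
    recent = values[-window:]
    return sum(recent) / len(recent)


def risk_volume_musd(daily_contracts, annual_price_stdev, usd_per_price_unit, window=30):
    """Contracts x annual stdev in price units x $ per price unit, in $ million."""
    avg_contracts = rolling_average(daily_contracts, window)
    return avg_contracts * annual_price_stdev * usd_per_price_unit / 1e6


# Hypothetical contract: 1,000 lots a day, annual stdev of 50 price units, $50 per unit
daily_contracts = [1000] * 30
volume = risk_volume_musd(daily_contracts, annual_price_stdev=50.0, usd_per_price_unit=50.0)
log_volume = math.log(volume)
tradeable = volume >= MIN_VOLUME_MUSD

print(f"volume = ${volume:.2f}m, log(volume) = {log_volume:.2f}, tradeable = {tradeable}")
```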

There could be other ways of measuring liquidity; for example open interest, or the cost of trading. I'm wary of using open interest since there are contracts with large open interest and small volume, and the reverse is also true. Personally I think unless you are a massive trader the size of the volume is more important than the open interest. I don't want to use cost of trading as a measure of liquidity, since I will be analysing that separately.

Normally when I do this kind of analysis, I exclude instruments for all kinds of reasons including because they are too expensive or illiquid to trade. In this case I don't want to do that. I will however exclude instruments in my data set that are:

  • Duplicates. For example, I don't analyse both the micro and mini S&P 500. The instruments in my dataset are those which meet my minimum requirements for liquidity but have the smallest contract size. Note that the definition of which is the duplicate contract to trade could have been different in the past. For example, immediately after the micro future came into being it wouldn't have met my requirements for liquidity, so I would have in practice used the mini future. This will affect the results in a small number of edge cases, but mostly for high volume instruments.
  • Ignored. These instruments either have garbage data, or they are spread instruments.

I won't exclude instruments that have:

  • Trading Restrictions - mostly ICE markets for which I don't have access to live data so don't currently trade, and certain US derivatives I'm banned from trading
  • 'Bad markets' - these are those that are too expensive or illiquid for me to trade - I want to see if there are size effects so I want to keep these. 

This gives me 205 instruments to analyse. Finally, I have around 12 years of data since I don't have volume data prior to 2013 in my dataset. 


Results across all years

Let's start by just plotting the average volume across all the available data, versus the pre-cost trend following p&l, by instrument.



That isn't especially suggestive of a strong relationship; although our eyes are drawn to the outlier in the top left (US housing equity sector if you care). If I do this as a 'bin cross' plot, which shows statistical significance (explained in more detail in chapter 12 of AFTS), then we can see there is really nothing there - in fact there is a slight tendency for very liquid markets to have a higher trend following SR:




What about costs?

Perhaps a slightly better relationship here - lower volume means higher costs - but not super consistent. There are instruments with very low volume but not bad costs, such as the CLP and CZK FX markets at the extreme left. However these costs are based on sampled bid-ask spreads so are unlikely to be indicative of what you could actually achieve trading any size.

The cross plot shows that very illiquid markets do indeed cost more, but beyond that the relationship is relatively non linear. There is a 'zone of increasing costs' up to around $20m of volume in annual risk units, but beyond that risk adjusted costs are relatively flat. Again, this applies to bid-ask spreads (and commissions) only, and for institutional size traders the 'zone of increasing costs' would apply to more instruments. 



Trend following p&l: Year by year results

This kind of market analysis has a fatal flaw; it doesn't account for the fact that some instruments will have been trading across the entire 12 year dataset whilst others will only have a few years of data. It also doesn't account for time series effects such as a given instrument seeing an increase or decrease in volume over the relevant period. To get around this, instead I'm going to break the results down into year by year results. So each point on the following scatter plot is the SR and volume for a given instrument and a given year. 

There is little point doing this for costs, since the costs in my backtest aren't actual costs, but here are the results for pre-cost returns. I haven't bothered with a scatter plot as it will be insanely noisy; here is the cross plot:


As with costs it does look like there is something there for very illiquid instruments; roughly those with less than $1m of volume units per day. But it's not statistically significant. The results incidentally survive the application of costs:

The median SR for log(volume) less than 0 (volume units < $1m per day) is 0.04 SR units higher even after costs, and the less robust mean SR is 0.12 units higher.
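For what it's worth, the comparison is nothing fancier than splitting the (instrument, year) observations at log(volume) = 0 and comparing medians and means. A minimal sketch follows; the rows are entirely made up to show the mechanics and are not my results.

```python
import math
from statistics import mean, median

# Hypothetical (instrument, year, volume in $m risk units, post-cost SR) rows.
# These numbers are invented purely for illustration.
rows = [
    ("A", 2020, 0.4, 0.30),
    ("A", 2021, 0.7, -0.10),
    ("B", 2020, 25.0, 0.10),
    ("B", 2021, 40.0, 0.05),
    ("C", 2021, 3.0, -0.20),
]


def split_by_log_volume(rows, threshold=0.0):
    """Return SRs for observations below, and at or above, the log(volume) threshold."""
    low = [sr for _, _, vol, sr in rows if math.log(vol) < threshold]
    high = [sr for _, _, vol, sr in rows if math.log(vol) >= threshold]
    return low, high


low, high = split_by_log_volume(rows)
median_gap = median(low) - median(high)  # robust to outliers
mean_gap = mean(low) - mean(high)        # less robust

print(f"median SR gap: {median_gap:.2f}, mean SR gap: {mean_gap:.2f}")
```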


Measuring diversification via IDM

OK so it looks like very illiquid markets might have a slight edge in performance. But this isn't enough to explain the outperformance of alt-CTAs (with all the caveats from before); I'd also like to look at diversification.

Expected linear diversification can be measured easily by using what I call the 'IDM'. Intuitively, it's the multiplication factor required to leverage up a portfolio of assets with some weightings and correlations. See any of my books on trading for details. A portfolio of assets with all correlations=1 will have an IDM of 1. A portfolio of N equally weighted assets with all correlations zero will have an IDM of sqrt(N).
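Here is a minimal sketch of that calculation, taking the IDM as 1 / sqrt(w.C.w') for weights w and correlation matrix C. The function name is mine and this is a bare bones illustration rather than production code; it checks the two special cases just mentioned, with equal weights.

```python
import math


def idm(weights, corr):
    """Diversification multiplier: 1 / sqrt(w C w') for weights w, correlations C."""
    n = len(weights)
    portfolio_variance = sum(
        weights[i] * corr[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    return 1.0 / math.sqrt(portfolio_variance)


n = 4
equal_weights = [1.0 / n] * n

# All correlations equal to one: no diversification at all
all_ones = [[1.0] * n for _ in range(n)]

# All correlations zero (ones on the diagonal only)
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

idm_perfectly_correlated = idm(equal_weights, all_ones)
idm_uncorrelated = idm(equal_weights, identity)

print(idm_perfectly_correlated, idm_uncorrelated)
```

With four equally weighted uncorrelated assets the second figure is sqrt(4).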

Note: We can also measure the actual diversification (which will confound both linear and non linear effects) by looking at the ratio between the portfolio SR and the SR of individual instruments - the Sharpe Ratio Ratio (SRR). This tends to be higher than we'd expect from looking at the IDM, as I note in AFTS and here; there is also another take from an ex colleague here. It's tricky to do here however as there are a lot of instruments jumping in and out of the portfolio.

So what I need to do is create portfolios of different liquidity instrument trend following sub-strategies and measure their diversification (not the correlation of the underlying returns!). An open question is how these portfolios are weighted. I will do this two ways: firstly with equal weights; secondly, using my handcrafting method (H/C) but in its simplest form with just correlations (but naturally, using out of sample optimisation). 

This will be a crude in sample test, where I look at the average volume over the entire period for which we have volume figures and then use that to split the portfolio into different buckets. That's fine because I'm trying to work out the why, not how this result could be exploited. I will use the final IDM (likely an overestimate given the IDM should increase as more instruments are added).

First by cutting off the portfolio at the median log(volume) of 2.8 (about $16m of daily volume units):

                          IDM EW                        IDM H/C 
Low volume                 2.30                           2.10
High volume                2.28                           2.14

That's... not very much difference. Here are the results as a time series, just to check it isn't a weird end of days effect:


Notice that IDMs fall over time, probably because correlations generally are rising. Earlier in the period when more diversification is available, the less liquid markets do better. But the differences aren't especially substantial.

But above it did seem that the better performance effect only kicked in once we were at very low volumes - below log(volume) of 0 (less than $1m in volume units). Let's go a bit more granular and cut our list of instruments into four groups of ~50 instruments, and for simplicity just look at handcrafted results:


Note the key is in log(volume) units. Note also that there isn't much going on here.


A very silly comparison

The one thing we haven't yet done is plot an account curve, so let's see what the portfolio p&l is like for each of the four buckets of liquidity (which essentially will confound both any improvement in per instrument trend following, plus the realised diversification both linear and non linear). To make this really silly, I'm going to do this for the whole of history despite only using volumes from 2013 to the present to decide which instrument goes in which bucket. This is a shocking idea for a huge number of reasons, almost too many to elucidate here.

With all that in mind, this is the strongest effect yet, with less liquid markets underperforming. However this is very likely to be luck; and it's confined mostly to the period prior to 1985 when the less liquid market sets probably contained only a few instruments which happened to do badly. After that there really isn't much in it.

Summary

On an individual market basis there is indeed a faint 'small cap' effect in futures, at least at this single speed. But it doesn't look like there is much of a difference in measurable diversification benefits. 

As I warned, none of this goes very far to explaining the puzzle of the alt-CTAs' outperformance, mainly because I don't really have the data to do this properly (so perhaps Man AHL or Florin could do so?) - the benefits of being an alt- aren't so much having a higher exposure to illiquid futures, as trading things that aren't futures at all.

Although perhaps it really was luck, since the outperformance has started to fade recently and the five year track records for say Evo and AHL Alpha are now very similar. That could be because the diversification benefit has fallen off in alts more than in liquid futures, or because the alt- markets have 'matured' and become less 'trendy'.

What we haven't done here is look at the effect of including less liquid instruments in an existing portfolio of liquid instruments; ceteris paribus that should be a good thing since my starting assumption is that more diversification is better, especially as many of the less liquid instruments are commodities rather than another flipping US bond future.

Perhaps I should rethink my very strict policy on what I trade (minimum liquidity of $1.5m volume units per day); after all one of the advantages of being a smaller trader is being able to trade less liquid markets, and not all of the instruments with that sort of volume are super expensive as one of the earlier plots showed.