Tuesday, 7 October 2025

Is the degradation of trend following performance a cohort effect, instrument decay, or an environmental problem?

It's probably bad luck to say this, but the most recent poor performance of CTAs and trend following managers this year appears to have been reversed. My own system is up over 12% since the nadir of the summer drawdown, and is now up for the year; admittedly only by 5.5%.

Nevertheless, it's true to say that trend following performance appears to have been degrading over the last few decades. If I can literally talk my own book (the book in question being Advanced Futures Trading Strategies - AFTS), then in one chapter I note:

Having said that, it does look like the returns from strategy nine are falling over time. The inflationary 1970s were particularly strong, with a total non-compounded return of over 500% over the decade (if we had been compounding, our returns would have been even more spectacular). We then made over 200% in each of the next three decades. But since 2010 our average return has roughly halved.

But were the better returns in the 1970s (and to a lesser extent the 80s, 90s and 00s) because we had a better environment for trend following, or because we had better instruments, or because instrument performance decays over time?

Let me explain. Back in 1972 when my backtest begins, there were just a few instruments. In the first few years only 11 instruments were around, out of the 100 or so in the list of liquid instruments I usually use for testing. And they were weird: seven were agricultural commodities, two were currencies and two were metals. Perhaps the better performance of trend following in the 1970s was because the instruments we had then were just better at trend following? Or, perhaps it's just that when an instrument is first traded it does very well, because there aren't many other smart people hanging around to extract 'alpha'?

So, let's see which of these three explanations is most likely.

I'm going to start with the fastest EWMAC 2,8 crossover I use. In AFTS I noted that the two fastest crossovers have suffered particularly bad degeneration in performance since about 1990. These returns are before costs, so the after cost performance would be even worse.

There are quite a few graphs in this format, so let me explain. Each line is a different cohort of instruments. So the blue line, for example, is all the instruments that began trading in the period 1971-1980 inclusive. On the y-axis is the average SR for those instruments in the five year period beginning on the date on the x-axis. So for example, for the instruments that began trading in the first ten years, their average SR from 1981-1985 was around 0.30; a little higher than the next cohort of instruments that had just come in.

* Important note. The date an instrument starts trading is the earliest point I have data for it. It may have been trading long before then.
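To make the graph construction concrete, here is a minimal sketch of the calculation as described above: bucket instruments into ten-year cohorts by their first year of data, then average the annualised Sharpe ratio across each cohort within each five-year window. The function names, the dict-of-arrays input layout and the 256 business day annualisation are assumptions for illustration; this is not the code behind the plots.

```python
import numpy as np

def annualised_sharpe(daily_returns, days_per_year=256):
    """Annualised Sharpe ratio of daily returns; NaN if it cannot be computed."""
    r = np.asarray(daily_returns, dtype=float)
    r = r[~np.isnan(r)]
    if len(r) < 2:
        return np.nan
    vol = r.std(ddof=1)
    if vol == 0:
        return np.nan
    return float(np.sqrt(days_per_year) * r.mean() / vol)

def cohort_sr_table(returns, years, first_year=1971, cohort_len=10, period_len=5):
    """
    returns: dict of instrument name -> array of daily strategy returns
    years:   dict of instrument name -> array of calendar years (same length)
    Result:  {cohort start year: {period start year: average SR in that period}}
    """
    cells = {}
    for name, rets in returns.items():
        rets = np.asarray(rets, dtype=float)
        yrs = np.asarray(years[name])
        # The first year with data defines the cohort (see the note above).
        first_data_year = int(yrs.min())
        cohort = first_year + cohort_len * ((first_data_year - first_year) // cohort_len)
        for period in range(first_year, int(yrs.max()) + 1, period_len):
            in_period = (yrs >= period) & (yrs < period + period_len)
            if not in_period.any():
                continue
            sr = annualised_sharpe(rets[in_period])
            if not np.isnan(sr):
                cells.setdefault(cohort, {}).setdefault(period, []).append(sr)
    # Average the per-instrument SRs inside each (cohort, period) cell.
    return {c: {p: float(np.mean(v)) for p, v in sorted(ps.items())}
            for c, ps in sorted(cells.items())}
```

Each inner dict is one line on the graph: the keys are the x-axis dates and the values the y-axis average SR.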

Let's now review three possible explanations for the reduction in trend following p&l over the last fifty years or so, and see which best matches the empirical results.

First of all we have a cohort effect explanation. Instruments which enter the dataset later have worse performance, so they drag down the average performance. This would look something like this:


You can see that the older an instrument is, the better its performance. To combat this effect we could just avoid trading newer instruments.

Next we have an instrument lifetime decay effect. Each instrument does well when it first enters the dataset, but then its performance decays as it ages. As a result, as more of the instruments become old and fewer are new, the average performance falls. An extreme version of that effect would look something like this (I've drawn the horizontal lines slightly apart for clarity, they should be on top of each other):

To deal with this we'd have to be constantly adding new instruments that haven't yet been affected by the influx of sophisticated traders looking for alpha. This is the opposite of what we'd do with a cohort effect.

Finally we have the general environment effect. All instruments have roughly similar performance in each period, which gets worse over time. This would result in something a bit like this (I've drawn these lines slightly apart for clarity, they should be on top of each other):

There is no solution here. We are buggered. We need to get out of the trend following game. Or at least hope that this is just a temporary setback; after all people have tested trend following rules over hundreds of years so a few decades of bad performance is nothing to worry about.
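The three stylised pictures can be written down as a toy grid of average SR by cohort and period. This is only a sketch of the shapes described above: the SR levels (0.4 falling to 0.1), the cohort and period dates, and the 20 year decay horizon are all made up for illustration.

```python
import numpy as np

def stylised_sr(hypothesis,
                cohorts=(1971, 1981, 1991, 2001, 2011),
                periods=tuple(range(1971, 2026, 5)),
                start=0.4, end=0.1):
    """SR[cohort, period] under one hypothesis; NaN before the cohort exists."""
    grid = np.full((len(cohorts), len(periods)), np.nan)
    for i, cohort in enumerate(cohorts):
        for j, period in enumerate(periods):
            if period < cohort:
                continue
            if hypothesis == "cohort":
                # Level depends only on when the instrument arrived.
                frac = i / (len(cohorts) - 1)
            elif hypothesis == "decay":
                # Level depends only on the age of the instrument
                # (hypothetically fully decayed after 20 years).
                frac = min((period - cohort) / 20.0, 1.0)
            elif hypothesis == "environment":
                # Level depends only on the calendar period.
                frac = j / (len(periods) - 1)
            else:
                raise ValueError(f"unknown hypothesis: {hypothesis}")
            grid[i, j] = start + (end - start) * frac
    return grid
```

Under "cohort" every row is flat but at a different level, under "decay" every row has the same downward shape shifted in time, and under "environment" every column holds a single value.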

Looking back at the EWMAC 2,8 graph, there is no support for the cohort effect. It looks like there is some evidence of instrument decay, most strikingly for the 1991 cohort. However this cohort only contains 7 instruments, making it one of the smallest, so the significance is questionable. Overall though, this does look like an environment effect with noise.

Let's step down the speed to EWMAC4,8:

Looks like a very similar picture. How about EWMAC8,16?
Again, apart from the 1991 cohort, this looks pretty much like a gradual decline due to environmental effects. Now let's turn to EWMAC16,64, which readers of AFTS will know is the best performing of my momentum indicators:
Feels like we're watching the same film over and over again, doesn't it?

EWMAC32,128


EWMAC64,256

Not quite as clear, but we're trading very slowly now so would expect more noise.

To summarise then, it looks like the decay of momentum performance has been solely down to general environmental effects, rather than a cohort effect, or the decay of instrument performance. This is bad, because it means we can't do anything about it by restricting ourselves to older or newer instruments. This is good, because instrument diversification really is our best chance of being profitable traders and making the most out of a weakening signal. I wouldn't want to suggest that you should do anything different than trading all the instruments you can get your hands on.

And to reiterate, let's hope this is a temporary situation. For me personally, as someone who isn't tied into a CTA box, I'm happy to continue trying to trade as many different risk / return factors as possible.

Bonus postscript: Here's carry!

This does look a little bit more like a decaying instrument story...



Monday, 8 September 2025

PCA analysis of Futures returns for fun and profit, part deux

In my previous post I discussed what would happen if you did the crazy thing of doing a PCA on the whole universe of futures across assets, rather than just within US equities or bonds like The Man would want you to. In this post I explore how we could do something useful with the resulting factors. There is some messy code here; to run all of it you'll need pysystemtrade, but you can exploit big chunks with your own data even if you don't have it.


The big problem: sign flipping

Before hitting some p&l generating activity, however, we first need to deal with an outstanding issue from the previous post.

TLDR, most of the time factor one is 'risk on /equities are go' and factor two is global interest rates; although not always. Factor sign flipping was a problem however (thanks to people below the line for that insight). So sometimes factor one was long equities, sometimes short equities. Sometimes it was something else entirely. 

As an example, remember this plot from part one? It's the factor exposure of the S&P 500 over time for factors 1 (0), 2 (1) and 3 (2).

Note there are 'blips' when we have a short exposure to factor 1, mostly in the period since 2008 when we're normally long factor 1. That's clearly a temporary sign flip. We probably want to get rid of those. But there is also the long period in the early 2000's when we're persistently short factor 1. That might be a 'sign flip'; but it could also be that factor one in this period was something more interesting than just 'long equity risk'. 

A couple of ideas spring to mind here. One is just smoothing the factor weights. That would easily solve the blips; and the smooth need only be a few weeks to get rid of them. But a longer smooth, of the length needed to get rid of the other periods, would reduce the information about the factors; in particular we'd be missing out on interesting times when something other than boring old risk on and off is driving the market.

Another bright idea I had was to reverse the sign on weights when the largest absolute value weight was negative. My expectation was that generally the largest weight on factor 1 (mostly risk on) would usually be equities, and when that factor flipped sign we'd flip it back again. However that didn't produce the expected results. If I were less lazy (and eager to get back to writing book #5), I'd probably do some research; eg I'm pretty sure the answer is somewhere in Gappy's new book but I haven't got there yet. 
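For reference, the sign rule described in the paragraph above is only a couple of lines. This is a minimal sketch with a hypothetical function name, and as noted it did not produce the expected results, so treat it as a record of the idea rather than a fix.

```python
import numpy as np

def align_sign_by_largest_weight(weights):
    """Flip a factor's weights if its largest absolute weight is negative."""
    w = np.asarray(weights, dtype=float)
    biggest = w[np.argmax(np.abs(w))]
    return -w if biggest < 0 else w
```

Applied to each factor at each refit date, this forces the dominant instrument to be held long, whatever that instrument happens to be at the time.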

In the end I decided to relax and ignore the sign flipping; I can do this because of the four ideas I outlined:

1- own the factors

2- trade the factors

3- buy assets with persistent alpha (+ve residual) 

4- mean revert the cumulative residual 

.... it's only really 1 and 2 that are affected by sign flipping. And I feel I already have things in my armoury for 1 and 2. For example my aggregate momentum signal (blogged about here, and also in my most recent book AFTS) is basically like 2, and on assets with a long bias that will also give us a chunk of 1 as well. 

<Sidebar * note to browser not actually HTML>

Arguably my relative momentum and long term mean reversion are also a bit like 3 and 4. Yet another idea is to build 'asset classes' using clustering as I did here, and then use those for the purposes of 1,2 and possibly 3 and 4. 

So we have three different ways of forming 'factors': exogenously determine asset classes, PCA, and clustering; and four different ways of trading each of them. Those won't give radically different results since clusters mostly follow asset classes, but they could be a little different.

<\Sidebar * see previous note>

But, I hear you cry, why can you flippantly ignore sign flipping when trading only the residuals? Well it's pretty simple; consider a standard APT type equation with a single PCA factor k and market i:

r_i,t = a_i + (b_i,k * r_k,t) + e_i,t

If we now do a sign flip, then the beta (b_i,k) will have a minus one in front of it, but the PCA return r_k,t will also have a minus one in front of it. These cancel, and estimation of both the persistent bias (alpha, a_i) and the temporary error (epsilon, e_i,t) will be unaffected.
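The cancellation is easy to check numerically. The sketch below uses made-up data and a hypothetical helper, fits the regression once against a factor and once against the same factor with its sign flipped, and compares the results.

```python
import numpy as np

def fit_alpha_beta(asset_returns, factor_returns):
    """OLS of asset returns on factor returns with an intercept.

    Returns (alpha, betas, residuals)."""
    X = np.column_stack([np.ones(len(factor_returns)), factor_returns])
    coefs, *_ = np.linalg.lstsq(X, asset_returns, rcond=None)
    residuals = asset_returns - X @ coefs
    return coefs[0], coefs[1:], residuals

# Synthetic example: one factor, one asset, arbitrary parameters.
rng = np.random.default_rng(42)
factor = rng.normal(size=250)
asset = 0.01 + 0.7 * factor + rng.normal(scale=0.5, size=250)

alpha, beta, resid = fit_alpha_beta(asset, factor)
alpha_flipped, beta_flipped, resid_flipped = fit_alpha_beta(asset, -factor)

# Beta changes sign; alpha and the residuals are numerically identical.
print(np.isclose(alpha, alpha_flipped))
print(np.allclose(beta, -beta_flipped))
print(np.allclose(resid, resid_flipped))
```

The same argument holds with several factors, since flipping any one column of the regressors only flips the matching coefficient.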


Trading the alpha

So we have two basic ideas; we generate our PCA and then run regressions that look like this:

r_i,t = a_i + (b_i,k * r_k,t) + ... + e_i,t

Where there are one or more PCA factors k. And then we either buy positive a_i and sell negative a_i; or we sell things with a recent positive cumulative e_i,t (and buy those where it is negative).

There are still many design questions to resolve here. How many PCA do we include? Too few, and we'll probably end up missing something interesting. Too many and there is a risk we'll end up without clear signals. Over what period should we estimate betas and alphas? Basically, how persistent are they likely to be? Over what period should we cumulate epsilon? Are there periods in which epsilon will be trending rather than mean reverting; eg assets that have outperformed their factor adjusted return will continue to do so (which will look an awful lot like buying positive alpha)?

For the PCA I'm going to keep it simple and initially use three PCA, which happens to be the most I can plot and get my head around it. I'm also going to stick to estimating my alphas and betas over a 12 month period, which is the arbitrary period I used before to estimate the PCA themselves (seems weird to use a different period). For the question of epsilon decay I will risk the wrath of the overfitting gods and do a time sensitivity analysis.

To summarise then: at the start of each month we look at the last 12 months of normalised returns, do a PCA, and then regress each instrument on the returns of each component. We then have an alpha intercept coefficient, and some betas (at most three, one for each PCA). We can see how well the alpha predicts returns in the following month(s). Then for the following month we can also calculate the residual of performance vs the fitted model. We can cumulate up these residuals and see how they forecast performance.
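The monthly procedure can be sketched in plain numpy. The function names are hypothetical, the PCA is done by eigen-decomposition of the correlation matrix of the window, and the component returns are taken as the weighted sum of normalised returns; those are assumptions about one reasonable way to do it, not a description of the actual code.

```python
import numpy as np

def pca_weights(norm_returns, n_components=3):
    """Portfolio weights (one column per component) from a T x N window."""
    corr = np.corrcoef(norm_returns, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)          # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]  # largest first
    return eigvecs[:, order]

def fit_month(window_returns, n_components=3):
    """Fit PCA on the window, then regress every instrument on the components.

    Returns (weights N x K, alphas N, betas K x N)."""
    W = pca_weights(window_returns, n_components)
    factor_returns = window_returns @ W               # T x K
    X = np.column_stack([np.ones(len(factor_returns)), factor_returns])
    coefs, *_ = np.linalg.lstsq(X, window_returns, rcond=None)
    return W, coefs[0], coefs[1:]

def out_of_sample_residuals(next_returns, W, alphas, betas):
    """Residuals for the following month, using last month's fitted model."""
    factor_returns = next_returns @ W
    fitted = alphas + factor_returns @ betas
    return next_returns - fitted

# Synthetic example: 6 instruments sharing one common driver.
rng = np.random.default_rng(1)
window = rng.normal(size=(260, 1)) + rng.normal(size=(260, 6))
following_month = rng.normal(size=(22, 1)) + rng.normal(size=(22, 6))

W, alphas, betas = fit_month(window)
residuals = out_of_sample_residuals(following_month, W, alphas, betas)
```

Here `alphas` is what would be tested as a predictor of next month's return, and `residuals` is what gets stacked up month by month for the mean reversion idea.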


Alpha

Let's start with the alphas. Here be a massive scatter plot:


Each point is the alpha calculated at the start of a given month for an instrument, and the normalised ex-post return for the following month. It looks like there might be a weak positive relationship there, so let's do some stats.

                           OLS Regression Results                            
==============================================================================
Dep. Variable:         ex_post_return   R-squared:                       0.006
Model:                            OLS   Adj. R-squared:                  0.006
Method:                 Least Squares   F-statistic:                     169.0
Date:                Mon, 08 Sep 2025   Prob (F-statistic):           1.62e-38
Time:                        11:46:09   Log-Likelihood:                 1910.6
No. Observations:               26311   AIC:                            -3817.
Df Residuals:                   26309   BIC:                            -3801.
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.0052      0.001      3.735      0.000       0.002       0.008
alpha          0.3144      0.024     12.999      0.000       0.267       0.362
==============================================================================

There: we have the classic undergrad stats exercise question "Why is my t-stat big but my R squared is low?". Answer: there is something here, but it's weak. This often happens if you use a large dataset (26,000 observations here).
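The arithmetic behind that answer is short. With a single regressor the F-statistic is the square of the slope's t-statistic, and R squared can be backed out from the t-statistic and the residual degrees of freedom, so a tiny R squared and a huge t-stat sit together quite happily once the sample is large. Using the numbers from the table above:

```python
import math

n_obs = 26311
t_stat = 12.999            # t-statistic on the alpha coefficient
df_resid = n_obs - 2       # one slope plus one intercept estimated

f_stat = t_stat ** 2                                  # about 169, as reported
r_squared = t_stat ** 2 / (t_stat ** 2 + df_resid)    # about 0.006, as reported
correlation = math.sqrt(r_squared)                    # about 0.08

print(round(f_stat, 1), round(r_squared, 4), round(correlation, 3))
```

So the correlation between alpha and next month's return is only about 0.08; it is the 26,000 observations that make it statistically unmistakable.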

To show this differently, the conditional ex-post average daily return over the following month with a positive alpha is 0.02, and with a negative alpha is -0.01 (both conditional subsets are roughly half the dataset overall). The t-statistic comparing these is a hefty 10, corresponding to a p-value of the order of 10^-26. So again, alpha definitely has an effect, but is that difference really that big? Hard to tell.

But we know that low R squared are pretty common in finance, so is this a problem? To test this I tried using the alphas as a forecast, and then calculated the Sharpe Ratio of each forecast. The median across all instruments is a SR of 0.1. Remember that trend following gives us around 0.3 to 0.4 for each instrument, so this isn't especially interesting.

It might be that I would get better results from a different lookback to calculate the alphas (remember we use one year), so I tried everything from a 1 month lookback to 10 years. What about using fewer, or more, principal components? Remember we're going with three. It turns out that a one year lookback is pretty optimal compared to shorter or longer; but using one PC is better than using two or more. Still the very best we can do is a one year lookback with one PC, and that gives us a SR of 0.12, which is hardly in wallet busting territory and also not significantly different from the result with three factors.


Trading the residual

Let's turn then to trading the residuals. We're going to cumulate up residuals over various periods and see how well that predicts future returns. To avoid a forward looking forecast, the residuals are calculated on the out of sample month following the point at which the model is fitted. Otherwise the regression coefficients would be forward looking, and hence so would the forecast.

Note that because the model changes slightly each month, the coefficients used to calculate residuals will also change slightly. Such is life. But we'll keep stacking up the residuals month by month even though they are using different models.

We now have 3 knobs to twiddle on our overfitting machine; lookback and number of PCs as before, but also the number of days we sum up residuals. To keep things relatively simple I will initially sum them up using 22 days (about a month of business days). So our base case is:
  • One year lookback to do PCA and calculate coefficients
  • 3 principal components
  • 22 days summing up of residuals; our forecast is minus the summed residuals
And jumping straight to Sharpe Ratio calculation like an impatient toddler, we get a SR of -0.04. The sign is wrong, showing that positive residuals lead to more positive performance, and the effect is also v.v.v. weak.
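As a sketch of the base case forecast, assuming the out of sample daily residuals for one instrument are already in a 1-d array (the function name is hypothetical):

```python
import numpy as np

def mean_reversion_forecast(residuals, n_days=22):
    """Forecast = minus the sum of the last n_days residuals.

    forecast[t] uses residuals up to and including day t, so it should only
    be applied to returns from day t+1 onwards. NaN until n_days are available.
    """
    r = np.asarray(residuals, dtype=float)
    csum = np.concatenate([[0.0], np.cumsum(r)])
    forecast = np.full(len(r), np.nan)
    # Rolling sum via differences of the cumulative sum.
    forecast[n_days - 1:] = -(csum[n_days:] - csum[:-n_days])
    return forecast
```

Changing `n_days` is the third knob mentioned above; a value of 5 corresponds roughly to the one week case.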

Does increasing the residual summing period work; eg mean reversion works over longer time periods? Nope. Anything up to a year is actually worse. Going down to a week (which would be v.v.v. costly to trade) does at least push the SR into positive territory, but only just.

Dropping to one PC (which was marginally better than for alpha above), changing the lookback on the PCA, .... nothing produces useful results. This idea is an already dead donkey that has been subsequently thrown off a cliff and then burned**.

** no actual donkeys were harmed in the creation of this blog post

 

Back to alpha

So we had a not too promising individual SR using alpha on an instrument level, but how does that look on a portfolio level? Surprisingly, quite good. Here are the combined results for 100 instruments:


That bad boy has a SR of 0.84! Some of that is diversification, but some of it is because a more accurately calculated median SR per instrument is 0.15, higher than the 0.10 calculated earlier (because, many reasons, like buffering and what not). Still, that's an extremely high realised diversification of over 5. Let's compare it to the 'gold standard' of single momentum models, EWMAC16,64 with the same instruments:


OK, not as good: even allowing for the different vol, ewmac comes in with a SR of 1.08. But their correlation is a relatively lowly 0.3. That suggests a modest allocation to alpha persistence will earn some money. Chucking 10% of your forecast weights into alpha persistence bumps up the SR of ewmac16,64 from 1.08 to 1.12. Going to the (arguably in sample fitted) best model with one principal component improves the SR of the alpha model by itself to 1.02, but also increases the correlation with momentum; so that the joint SR with 10% in alpha persistence and 90% in ewmac produces a pretty much unchanged SR of 1.13.
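A back of the envelope check on the size of that improvement: if we treat the two strategies as return streams with equal volatility and blend them 90/10, the combined SR follows directly from the two SRs and their correlation. This is an approximation, since the post blends forecast weights inside one system rather than mixing two finished return streams, and the function name is hypothetical.

```python
import math

def blended_sharpe(sr_a, sr_b, weight_b, correlation):
    """SR of a two-strategy blend, assuming both legs have equal volatility."""
    weight_a = 1.0 - weight_b
    mean = weight_a * sr_a + weight_b * sr_b
    vol = math.sqrt(weight_a ** 2 + weight_b ** 2
                    + 2 * weight_a * weight_b * correlation)
    return mean / vol

# ewmac16,64 (SR 1.08) with 10% in alpha persistence (SR 0.84), correlation 0.3
blend = blended_sharpe(1.08, 0.84, 0.10, 0.3)
print(round(blend, 2))
```

This comes out at roughly 1.13, in the same ballpark as the 1.12 from the full backtest, which suggests the gain is mostly plain diversification between two imperfectly correlated signals.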


Summary

Research in systematic trading tends to result in a lot of blind alleys. I thought this would be another one. Certainly the idea of mean reverting the errors, a classic from the equity stat arb crowd, doesn't really work in this context. However there does seem to be some modest performance gain to basic momentum from including a PCA derived alpha persistence model. The gain is small however, so it's debatable whether it's worth what would be quite a lot of additional work. Not a blind alley then, but not a very pleasant one to spend much time in.




Tuesday, 1 July 2025

PCA analysis of Futures returns for fun and profit, part #1

 I know I had said I wouldn't be doing any substantive blog posts because of book writing (which is going well, thanks for asking) but this particular topic has been bugging me for a while. And if you listened to the last episode of Top Traders Unplugged you will hear me mention this in response to a question. So it's an itch I feel I need to scratch. Who knows, it might lead to a profitable trading system.

Having said all that, this post will be quite short as it's really going to be an introduction to a series of posts.


Given factor analysis

So at its heart this is a post about factors. Factors are the source of returns, and of risk. This concept came from the land of equities, specifically the long short factor sorts beloved of Messrs Fama and French; and it also spawned an entire industry: the modern equity market neutral hedge funds (although Alfred Winslow Jones actually implemented the whole hedge fund idea whilst Fama and French were still in high school).

At its core then we have the idea of the APT risk model which is basically a linear regression:

r_i,t = a_i + B_1,i * r_1,t + ... + e_i,t

Where r_i,t is the return on asset i at time t, a_i is the alpha on asset i (assumed to be zero), B_1,i is the Beta of asset i on the first risk factor, r_1,t is the return of the first risk factor, there are more terms like this for any further factors, and e_i,t is an error term with mean zero. Strictly speaking the returns on both i and the risk factor should be excess returns with risk free rate deducted, but we're futures traders so that detail can be safely ignored.

In its simplest form with a single factor that is 'the market', this is basically just the OG CAPM/EMH, and B_1 is just Beta. In a more complex form we can include things like the sorted portfolios of Fama and French. Notice that risk and return are intrinsically linked here. The factor is assumed to be some kind of risk that we get paid a price for exposure to. That price is the B_N term.

(Should B_N be estimated in a time varying way? Perhaps. Although if you vol normalise everything first, you will find your B_N are much more stable, as well as being more interpretable).

Note that for both the market and the Fama French factors (FFF), the factors are given. To be precise, in both cases the factors consist of portfolios of the underlying assets, with some portfolio weights. For the market portfolio, those portfolio weights are (usually) market cap weights. For the FFF they are the +1 for top quartile, -1 for bottom quartile sort of thing. 


What can we do with factors?

Many things! The dual nature of factors as risk and return drivers leads them to multiple uses. So for example, we could own the factors. They are just portfolios, and going long if you think the factor will earn you a risk premium is not a bad idea. If you buy an S&P 500 ETF, well congratulations you have gone long the equity market beta factor. With the ability to go long and short we can own FFF as easily as the market factor. Indeed there are funds that allow you to get exposure to FFF factors or similar, though sometimes only on the long side. 

We could also trade the factors. My own work in my previous book, AFTS, suggests that 70% of the returns of a momentum portfolio come from trading an asset class index. That is an equal vol weighted rather than market cap weighted portfolio, but the overall effect is similar. Trading, i.e. market timing, the FFF or similar is a little more difficult and if you try to do it Cliff Asness will turn up at your house and hit you repeatedly with a stick.

If we treat the factors as risk we don't want, and we don't buy the idea of an efficient market, then we can buy high alpha / sell low alpha. If a stock looks like it has excess return, over and above what that market and FFF say it should have, then maybe it is a good bet? Although financial economists will scoff at you and say you are exposed to a risk that is not in your regression for which you are earning a risk premium, you can just point to your Porsche and explain in great detail how you don't care.

Perhaps we believe in the efficient market hypothesis in the long term, but not in the short term. We wouldn't trust those alphas to be persistent as far as we could throw them. But if we take the residual term, e, well that will most likely show a lovely mean reverting pattern when cumulated. So we can mean revert the residual. Big upward swings away from efficiency that we can short the asset on, and lovely downward pulls we can go long on.

There are more esoteric things people do with factors, mainly to do with risk management. You can for example use them to construct robust correlation matrices, hedging portfolios and what not. Risk management isn't my principal concern here, but that is still good to know.


PCA factor analysis

This is all lovely, especially in equities, but in futures things are a bit more mysterious. For starters, we can do things at an asset class level (which is closer in spirit to the equity market neutral world, although we're still at a level higher as our components are e.g. equity indices, not individual equities); but we can also uniquely do a 'whole market' look by considering futures as a whole.

We could probably take a stab at creating an 'asset class' factor in each market that would be like Beta, and indeed I did that in AFTS with my equal risk weighted index. We know that there are certain bellwether markets like the S&P 500 that we could use as proxies for 'the market' in individual asset classes.

But for futures as a whole, things are much harder. Is the 'market' really just long everything? Even VIX/VSTOXX where we know the risk premium is on the short side? My gut feeling is that our most important factor will be some kind of risk on/off, but then there will be times like 2022 when it would plausibly have been more inflation related. And what would the second factor be?

So we will switch tactics, and rather than use given factors, we will use discovered factors. The idea here is that data itself can tell us what the main latent drivers of returns are, if we just look hard enough. Sure in many cases that will give us the first factor as basically the market portfolio, but the subsequent factors will be more interesting. And in the specific case of futures, where we don't know what the likely factors are, it's going to be quite intriguing.

We use a PCA to discover these factors, with vol normalised returns as the starting point. For each factor we end up with a set of portfolio weights (can be long or short), which can then be helpful to interpret the factor. Note the weights are on vol normalised returns, which are more intuitive.
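Vol normalising is the step that everything else rests on, so here is a minimal sketch of one way to do it for a single instrument: divide each day's return by the previous day's estimate of volatility. The exponentially weighted estimator, the zero-mean assumption and the span of 32 days are all assumptions for illustration rather than the settings used for the plots.

```python
import numpy as np

def ewma_vol(returns, span=32):
    """Exponentially weighted volatility, assuming a zero mean return."""
    r = np.asarray(returns, dtype=float)
    if len(r) == 0:
        return r
    lam = 2.0 / (span + 1)
    var = np.empty(len(r))
    var[0] = r[0] ** 2
    for t in range(1, len(r)):
        var[t] = lam * r[t] ** 2 + (1 - lam) * var[t - 1]
    return np.sqrt(var)

def vol_normalise(returns, span=32):
    """Return divided by yesterday's vol estimate, so there is no look-ahead."""
    r = np.asarray(returns, dtype=float)
    out = np.full(len(r), np.nan)
    if len(r) < 2:
        return out
    lagged_vol = ewma_vol(r, span)[:-1]
    ok = lagged_vol > 0                    # leave NaN where vol is zero
    out[1:][ok] = r[1:][ok] / lagged_vol[ok]
    return out
```

Because every instrument ends up with roughly unit volatility, a PCA weight of 0.2 means the same amount of risk whether it sits on a bond or on natural gas, which is what makes the weights easier to interpret.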


Sidebar: PCA meta factor analysis (on strategy returns)

Just as a brief note, as I don't intend to cover this here, but it was touched on in the podcast. If we started with the returns of trading eg momentum on a bunch of instruments, rather than the underlying returns themselves, then that might be useful for someone who was thinking about replicating a hedge fund index or risk managing a CTA, or perhaps constructing a CTA where they have hedged out the principal component(s) of CTA risk. I've written about replication before, and I've already said I'm not really concerned with risk management here, so I won't talk about this again.


Some nice pictures

This won't be a long post, as I said, as I won't be looking at how to use the PCA returns now I have them. Instead I'm going to focus on visualising the PCAs and interpreting them. Which will be a bit of fun anyway. Methodological points: I used vol normalised returns and one year rolling windows to estimate my PCA. There is a debate to be had as to whether a year is best at compromising between having enough data and a stable result, or whether we need to adapt quicker to changing market conditions.

I estimated at least two, and up to N/2 PCA depending on how many markets N had data. I used 100 liquid futures markets with daily data back to 1970 where possible.

Let's start with the contribution to variance. This is how important each PCA is.

We can see that the first PCA for the last 12 months at least explains 18% of the variance, the second 12.5% and so on. In contrast if we did this for US equities we'd find the first PCA explained 50%, and for US bonds it would be 70%. There is a lot more going on here.
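The contribution to variance is just each eigenvalue of the correlation matrix as a share of the total. A minimal sketch, assuming a T x N array of vol normalised returns with no missing values and a hypothetical function name:

```python
import numpy as np

def explained_variance_share(norm_returns):
    """Share of total variance explained by each component, largest first."""
    corr = np.corrcoef(norm_returns, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)[::-1]   # eigvalsh returns ascending
    return eigvals / eigvals.sum()

# Synthetic example: 5 series driven mostly by one common factor.
rng = np.random.default_rng(7)
common = rng.normal(size=(500, 1))
data = common + 0.5 * rng.normal(size=(500, 5))
shares = explained_variance_share(data)
print(shares.round(2))
```

With N uncorrelated markets every component would explain 1/N of the variance, so 18% from the first component of 100 or so futures is a lot more than nothing, but far less than you get inside a single asset class.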

If we look at how the first two factors contributions vary over time:


... we can see that there has been a bit of a downward trend as more factors arrive in the sample, but more generally the first PCA does hover around 20% and the second around 10%. There are exceptions like 2008 where I would imagine a big risk off bet drove the market. The same was true during COVID.

What is the first PCA? Well currently it looks like this:

For clarity I've only included the top 20 and bottom 20 instruments by weight. Still you may be struggling to read the labels. The top markets are pretty much all stocks, with European equities getting a bigger weight. The S&P does just sneak into the top 20. The bottom 20 starts with VIX and VSTOXX, but mostly the weights here are quite small. So the first PCA right now is "Equity Beta, with a tilt towards Europe".

What about the second PCA?

The first 6 positive instruments are all US bonds, and nearly all the rest are government bonds of one flavour or another. Only EU-Utility stocks get to crash this party (interest rate sensitive?). On the short side we have some FX and quite a few energy futures. So this second factor is "Long bonds / Short energies"


PCA 3 is long a whole bunch of FX, which means it's short USD, and also some metals and random commodities. On the short side it's short EU-Health equities, CNHUSD FX and a whole bunch of European bonds. Feels a bit trade related. Shall we call this the Trump factor?

Anyway I could continue, but more intuitive would be to understand how these factors have changed over time. We'll pick some key markets with lots of history. We will then plot the weight each has in a given PCA over time. 

Here is the S&P 500:

We can see that it is mostly positive on PCA1 and negative on PCA2 and PCA3, but there are periods when that is not the case. The sharp drops in weighting suggest that perhaps we ought to run at something longer than a year, or use an EWMA of weights to smooth things out.

Here is the US 10 year:


Again, this mostly loads positive on PCA2 but not always. You can see the increase in correlation of bonds and equities happening as PCA1 creeps up in the last few years.

Here is the first PCA weighting in June 2004, one of those interesting periods.


You can see that it was all about currencies in that period; plus silver and gold, various other metals, bonds and energies. So very much a short Dollar, long metals trade.

We're nearly done for today. Last job is to plot the factors. Here are the cumulated returns for PCA1:



That looks a lot like a vol normalised equity market; note the drops in 2009 and 2020.

And here is PCA2:

Again that could plausibly be bonds, mostly up with the exception of the post 2022 period.

This suggests another research idea which is to use the S&P 500 and US interest rates as 'given factors' which might be more stable than using PCA. Still that would mean missing out on times like 2004 when other things were driving the market. 


What's next

Next step would be to look at some of those opportunities for factor use and misuse outlined above, and see if there is profit as well as fun in this game!