Monday, 8 September 2025

PCA analysis of Futures returns for fun and profit, part deux

In my previous post I discussed what would happen if you did the crazy thing of running a PCA on the whole universe of futures across assets, rather than just within US equities or bonds like The Man would want you to. In this post I explore how we could do something useful with the resulting factors. There is some messy code here; to run all of it you'll need psystemtrade, but you can exploit big chunks with your own data even if you don't have it.


The big problem: sign flipping

Before hitting some p&l generating activity, however, we first need to deal with an outstanding issue from the previous post.

TLDR, most of the time factor one is 'risk on /equities are go' and factor two is global interest rates; although not always. Factor sign flipping was a problem however (thanks to people below the line for that insight). So sometimes factor one was long equities, sometimes short equities. Sometimes it was something else entirely. 

As an example, remember this plot from part one? It's the factor exposure of the S&P 500 over time for factors 1 (0), 2 (1) and 3 (2).

Note there are 'blips' when we have a short exposure to factor 1, mostly in the period since 2008 when we're normally long factor 1. That's clearly a temporary sign flip. We probably want to get rid of those. But there is also the long period in the early 2000's when we're persistently short factor 1. That might be a 'sign flip'; but it could also be that factor one in this period was something more interesting than just 'long equity risk'. 

A couple of ideas spring to mind here. One is just smoothing the factor weights. That would easily solve the blips; and the smooth need only be a few weeks to get rid of them. But a longer smooth, of the length needed to get rid of the other periods, would reduce the information about the factors; in particular we'd be missing out on interesting times when something other than boring old risk on and off is driving the market.

Another bright idea I had was to reverse the sign on weights when the largest absolute value weight was negative. My expectation was that generally the largest weight on factor 1 (mostly risk on) would usually be equities, and when that factor flipped sign we'd flip it back again. However that didn't produce the expected results. If I were less lazy (and eager to get back to writing book #5), I'd probably do some research; eg I'm pretty sure the answer is somewhere in Gappy's new book but I haven't got there yet. 
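For what it's worth, the 'flip when the largest absolute weight is negative' rule is easy to write down. Here is a minimal sketch, assuming the PCA weights arrive as a components-by-instruments numpy array; `align_signs` is a made-up name for illustration, not anything in psystemtrade:

```python
import numpy as np

def align_signs(weights: np.ndarray) -> np.ndarray:
    """Flip each component (row) so that its largest-|weight| entry is positive."""
    weights = np.asarray(weights, dtype=float)
    # Column index of the largest absolute weight in each component
    idx = np.argmax(np.abs(weights), axis=1)
    signs = np.sign(weights[np.arange(weights.shape[0]), idx])
    signs[signs == 0] = 1.0  # an all-zero component is left untouched
    return weights * signs[:, None]

# Hypothetical weights: component 0 has its largest weight negative, component 1 does not
w = np.array([[-0.7, 0.1, 0.2],
              [0.3, 0.6, -0.1]])
aligned = align_signs(w)  # first row is flipped, second row is unchanged
```

As noted above, this rule did not give the expected results on the real data, so treat it as an illustration of the idea rather than a fix.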

In the end I decided to relax and ignore the sign flipping; I can do this because of the four ideas I outlined:

1- own the factors

2- trade the factors

3- buy assets with persistent alpha (+ve residual) 

4- mean revert the cumulative residual 

.... it's only really 1 and 2 that are affected by sign flipping. And I feel I already have things in my armoury for 1 and 2. For example my aggregate momentum signal (blogged about here, and also in my most recent book AFTS) is basically like 2, and on assets with a long bias that will also give us a chunk of 1 as well. 

<Sidebar * note to browser not actually HTML>

Arguably my relative momentum and long term mean reversion are also a bit like 3 and 4. Yet another idea is to build 'asset classes' using clustering as I did here, and then use those for the purposes of 1,2 and possibly 3 and 4. 

So we have three different ways of forming 'factors': exogenously determine asset classes, PCA, and clustering; and four different ways of trading each of them. Those won't give radically different results since clusters mostly follow asset classes, but they could be a little different.

<\Sidebar * see previous note>

But, I hear you cry, why can you flippantly ignore sign flipping when trading only the residuals? Well it's pretty simple; consider a standard APT type equation with a single PCA k and market i:

r_i,t = a_i + (b_i,k * r_k,t) + e_i,t

If we now do a sign flip, then the beta (b_i,k) will have a minus one in front of it, but the PCA return r_k,t will also have a minus one in front of it. These cancel, and estimation of both the persistent bias (alpha, a_i) and the temporary error (epsilon, e_i,t) will be unaffected.
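If you don't trust the algebra, a quick check on made-up data shows the same thing. This is a sketch with simulated returns and a hypothetical helper `fit`; nothing here comes from the actual dataset:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 250  # roughly a year of daily returns (simulated)
factor = rng.normal(size=n)
market = 0.01 + 0.8 * factor + 0.1 * rng.normal(size=n)

def fit(y: np.ndarray, x: np.ndarray):
    """OLS of y on a constant and x; returns (alpha, beta, residuals)."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef[0], coef[1], resid

alpha, beta, resid = fit(market, factor)
alpha_f, beta_f, resid_f = fit(market, -factor)  # same factor with its sign flipped

# beta changes sign; alpha and the residuals are identical up to float error
print(np.isclose(alpha, alpha_f), np.isclose(beta, -beta_f), np.allclose(resid, resid_f))
```

The beta picks up the minus sign, and the alpha and residuals come out the same either way, which is why ideas 3 and 4 can shrug off sign flipping.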


Trading the alpha

So we have two basic ideas; we generate our PCA and then run regressions that look like this:

r_i,t = a_i + (b_i,k * r_k,t) + ... + e_i,t

Where there are one or more PCA k.... And then we either buy positive a_i and sell negative; or we sell things with recent cumulative positive e_i (and buy those where it is negative).

There are still many design questions to resolve here. How many PCA do we include? Too few, and we'll probably end up missing something interesting. Too many and there is a risk we'll end up without clear signals. Over what period should we estimate betas and alphas? Basically, how persistent are they likely to be? Over what period should we cumulate epsilon? Are there periods in which epsilon will be trending rather than mean reverting; eg assets that have outperformed their factor adjusted return will continue to do so (which will look an awful lot like buying positive alpha)?

For the PCA I'm going to keep it simple and initially use three PCA, which happens to be the most I can plot and get my head around it. I'm also going to stick to estimating my alphas and betas over a 12 month period, which is the arbitrary period I used before to estimate the PCA themselves (seems weird to use a different period). For the question of epsilon decay I will risk the wrath of the overfitting gods and do a time sensitivity analysis.

To summarise then: at the start of each month we look at the last 12 months of normalised returns, do a PCA, and then regress each instrument on the returns of each component. We then have an alpha intercept coefficient, and some betas (at most three, one for each PCA). We can see how well the alpha predicts returns in the following month(s). Then for the following month we can also calculate the residual of performance vs the fitted model. We can cumulate up these residuals and see how they forecast performance.
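The monthly fitting step summarised above can be sketched roughly as follows. This is not the messy psystemtrade code; it is a numpy-only illustration in which the function names, the SVD-based PCA, and the choice to build factor returns from uncentred returns are all assumptions, and the data is simulated:

```python
import numpy as np

def fit_window(returns: np.ndarray, n_components: int = 3):
    """returns: (days, instruments) normalised returns for the lookback window."""
    centred = returns - returns.mean(axis=0)
    # PCA via SVD of the centred returns; rows of vt are the component weights
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    weights = vt[:n_components]               # (k, instruments)
    factor_returns = returns @ weights.T      # (days, k), assumption: uncentred
    X = np.column_stack([np.ones(len(returns)), factor_returns])
    # One regression per instrument, all solved in a single call
    coef, *_ = np.linalg.lstsq(X, returns, rcond=None)
    alphas, betas = coef[0], coef[1:]         # (instruments,), (k, instruments)
    return weights, alphas, betas

def out_of_sample_residuals(next_returns, weights, alphas, betas):
    """Residuals for the following month, using the already fitted model."""
    factor_returns = next_returns @ weights.T
    fitted = alphas + factor_returns @ betas
    return next_returns - fitted

rng = np.random.default_rng(1)
lookback = rng.normal(size=(250, 5))    # simulated 12 months, 5 instruments
next_month = rng.normal(size=(22, 5))   # simulated following month
weights, alphas, betas = fit_window(lookback)
resid = out_of_sample_residuals(next_month, weights, alphas, betas)
```

Rolling this forward one month at a time gives a panel of alphas (one per instrument per month) and a stacked series of out of sample residuals.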


Alpha

Let's start with the alphas. Here be a massive scatter plot:


Each point is the alpha calculated at the start of a given month for an instrument, and the normalised ex-post return for the following month. It looks like there might be a weak positive relationship there, so let's do some stats.

                           OLS Regression Results                            
==============================================================================
Dep. Variable:         ex_post_return   R-squared:                       0.006
Model:                            OLS   Adj. R-squared:                  0.006
Method:                 Least Squares   F-statistic:                     169.0
Date:                Mon, 08 Sep 2025   Prob (F-statistic):           1.62e-38
Time:                        11:46:09   Log-Likelihood:                 1910.6
No. Observations:               26311   AIC:                            -3817.
Df Residuals:                   26309   BIC:                            -3801.
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.0052      0.001      3.735      0.000       0.002       0.008
alpha          0.3144      0.024     12.999      0.000       0.267       0.362
==============================================================================

There: we have the classic undergrad stats exercise question "Why is my t-stat big but my R squared is low?". Answer: there is something here, but it's weak. This often happens if you use a large dataset (26,000 observations here).
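The link between the two numbers is mechanical. For a regression with a single explanatory variable, R squared = t^2 / (t^2 + residual degrees of freedom), and the F-statistic is just t^2. Plugging in the figures from the table above (the variable names are only for illustration):

```python
n_obs = 26311
t_stat = 12.999          # t-statistic on alpha from the regression table
df_resid = n_obs - 2     # intercept plus one slope

r_squared = t_stat**2 / (t_stat**2 + df_resid)
f_stat = t_stat**2

print(round(r_squared, 3), round(f_stat, 1))  # 0.006 169.0
```

So with around 26,000 observations a t-statistic of 13 is entirely consistent with an R squared of well under 1%.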

To show this differently, the conditional ex-post average daily return over the following month with a positive alpha is 0.02, and with a negative alpha is -0.01 (both conditional subsets are roughly half the dataset overall). The t-statistic comparing these is a hefty 10, corresponding to a p-value of the order of 10^-26. So again, alpha definitely has an effect, but is that difference really that big? Hard to tell.

But we know that low R squared are pretty common in finance, so is this a problem? To test this I tried using the alphas as a forecast, and then calculated the Sharpe Ratio of each forecast. The median across all instruments is a SR of 0.1. Remember that trend following gives us around 0.3 to 0.4 for each instrument, so this isn't especially interesting.
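As a rough idea of what 'calculate the Sharpe Ratio of a forecast' means, here is a stripped down sketch. It is not how psystemtrade does it: there is no vol targeting, buffering or costs, the function name is made up, and the 256 business days per year used for annualising is an assumption:

```python
import numpy as np

def forecast_sharpe(forecast, returns, periods_per_year: int = 256) -> float:
    """Annualised SR from holding yesterday's forecast against today's return."""
    forecast = np.asarray(forecast, dtype=float)
    returns = np.asarray(returns, dtype=float)
    pnl = forecast[:-1] * returns[1:]   # lag the forecast by one day
    return float(np.sqrt(periods_per_year) * pnl.mean() / pnl.std(ddof=1))

# Toy example: a forecast that happens to have the right sign every day
toy_returns = np.array([0.01, -0.01, 0.02, -0.02, 0.01])
toy_forecast = np.array([-1.0, 1.0, -1.0, 1.0, 0.0])
toy_sr = forecast_sharpe(toy_forecast, toy_returns)
```

Doing this per instrument with the alpha as the forecast, and taking the median across instruments, is the kind of calculation behind the figure quoted above.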

It might be that I would get better results from a different lookback to calculate the alphas (remember we use one year), so I tried everything from a 1 month lookback to 10 years. What about using fewer, or more, principal components? Remember we're going with three. It turns out that a one year lookback is pretty optimal compared to shorter or longer; but using one PC is better than using two or more. Still, the very best we can do is a one year lookback with one PC, and that gives us a SR of 0.12, which is hardly in wallet busting territory and also not significantly different from the result with three factors.


Trading the residual

Let's turn then to trading the residuals. We're going to cumulate up residuals over various periods and see how well that predicts future returns. To avoid a forward looking forecast, the residuals are calculated on the out of sample month following the point at which the model is fitted. Otherwise the regression coefficients would be forward looking, and hence so would the forecast.

Note that because the model changes slightly each month, the coefficients used to calculate residuals will also change slightly. Such is life. But we'll keep stacking up the residuals month by month even though they are using different models.

We now have 3 knobs to twiddle on our overfitting machine; lookback and number of PCs as before, but also the number of days we sum up residuals. To keep things relatively simple I will initially sum them up using 22 days (about a month of business days). So our base case is:
  • One year lookback to do PCA and calculate coefficients
  • 3 principal components
  • 22 days summing up of residuals; our forecast is minus the summed residuals
And jumping straight to Sharpe Ratio calculation like an impatient toddler, we get a SR of -0.04. The sign is wrong, showing that positive residuals lead to more positive performance, and the effect is also v.v.v. weak.
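The base case forecast is simple enough to sketch. This assumes the out of sample daily residuals for one instrument have already been stacked into a single array; `residual_forecast` is a hypothetical name:

```python
import numpy as np

def residual_forecast(residuals, window: int = 22) -> np.ndarray:
    """Minus the rolling sum of the last `window` daily residuals."""
    residuals = np.asarray(residuals, dtype=float)
    csum = np.cumsum(np.insert(residuals, 0, 0.0))
    out = np.full(residuals.shape, np.nan)   # NaN until a full window is available
    out[window - 1:] = csum[window:] - csum[:-window]
    return -out

# Toy input: a constant residual of 1 per day for 30 days
toy_forecast = residual_forecast(np.ones(30))  # NaN for 21 days, then -22.0
```

The forecast on a given day only uses residuals up to and including that day, so it should be traded against the following day's return.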

Does increasing the residual summing period work; eg mean reversion works over longer time periods? Nope. Anything up to a year is actually worse. Going down to a week (which would be v.v.v. costly to trade) does at least push the SR into positive territory, but only just.

Dropping to one PC (which was marginally better for alpha above), changing the lookback on the PCA, .... nothing produces useful results. This idea is an already dead donkey that has been subsequently thrown off a cliff and then burned**.

** no actual donkeys were harmed in the creation of this blog post

 

Back to alpha

So we had a not too promising individual SR using alpha on an instrument level, but how does that look on a portfolio level? Surprisingly, quite good. Here are the combined results for 100 instruments:


That bad boy has a SR of 0.84! Some of that is diversification, but some of it is because a more accurately calculated median SR per instrument is 0.15, higher than the 0.10 calculated earlier (because, many reasons, like buffering and what not). Still, that's an extremely high realised diversification multiplier of over 5 (0.84 / 0.15 = 5.6). Let's compare it to the 'gold standard' of single momentum models, EWMAC16,64 with the same instruments:


OK, the alpha model is not as good: even accounting for the different vol, ewmac comes in with a SR of 1.08. But their correlation is a relatively lowly 0.3. That suggests a modest allocation to alpha persistence will earn some money. Chucking 10% of your forecast weights into alpha persistence bumps up the SR of ewmac16,64 from 1.08 to 1.12. Going to the (arguably in sample fitted) best model with one principal component improves the SR of the alpha model by itself to 1.02, but also increases the correlation with momentum, so the joint SR with 10% in alpha persistence and 90% in ewmac is pretty much unchanged at 1.13.
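A back of the envelope check on the first of those numbers, using the standard formula for the Sharpe Ratio of a two asset blend. This treats the 10% / 90% split as weights on two return streams with equal volatility, which is an assumption (the post applies the weights to forecasts, not returns), so only expect it to land in the same ballpark:

```python
import math

def blended_sharpe(sr_a: float, sr_b: float, weight_b: float, correlation: float) -> float:
    """SR of a blend of two equal-vol return streams with the given correlation."""
    weight_a = 1.0 - weight_b
    expected = weight_a * sr_a + weight_b * sr_b
    risk = math.sqrt(weight_a**2 + weight_b**2 + 2 * weight_a * weight_b * correlation)
    return expected / risk

# ewmac SR 1.08, alpha persistence SR 0.84, correlation 0.3, 10% in alpha
blend = blended_sharpe(1.08, 0.84, 0.10, 0.3)
print(round(blend, 2))  # 1.13
```

That is close to the 1.12 from the backtest, which suggests the improvement is mostly what you'd expect from adding a lower SR but weakly correlated return stream.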


Summary

Research in systematic trading tends to result in a lot of blind alleys. I thought this would be another one. Certainly the idea of mean reverting the errors, a classic from the equity stat arb crowd, doesn't really work in this context. However there does seem to be some modest performance gain to basic momentum from including a PCA derived alpha persistence model. The gain is small however, so it's debatable whether it's worth what would be quite a lot of additional work. Not a blind alley then, but not a very pleasant one to spend much time in.



