Monday, 29 June 2026

Rolling, rolling, rolling.... updating statistical estimates yes or no

 The mega blog post series on portfolio optimisation continues!

A couple of posts ago, here, I looked at formally testing for structural breaks in parameter estimates, in particular important parameters like the Sharpe Ratio (SR), because stuff like this happens:


This is the pre-cost performance of the momentum4 rule on CORN. The formal test found a structural break in 1989.

It's fair to say the structural break stuff didn't work that well. But there may be a much easier way of dealing with the non-stationarity of these estimates, and that's to use rolling estimates. For example, if we were to use a 10 year rolling estimate of SR then by the mid 1990s we would conclude that this was a money losing rule. We could also use a rolling estimate for correlation, though as correlations are stable enough over periods of five years or more this wouldn't affect things much.

Of course I wouldn't be so crude as to use a mere rolling window; instead I'd use an exponential window. As usual I'm going to specify this using the span parameter of the pandas ewm function. A 10 year span has a 3.5 year half life; i.e. the 10 year EWM is roughly equivalent (same half life) to a 7 year simple moving average (SMA).
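The span to half life arithmetic can be checked in a few lines. This is a minimal sketch, assuming the usual pandas convention that alpha = 2 / (span + 1), daily data with 256 business days a year, and that an N year SMA is "equivalent" when N is twice the half life. The function names are purely illustrative.

```python
import math

BUSINESS_DAYS_PER_YEAR = 256  # assumption: daily data, 256 business days a year


def ewm_halflife_years(span_years, periods_per_year=BUSINESS_DAYS_PER_YEAR):
    """Half life in years of an EWM given its span, using alpha = 2 / (span + 1)."""
    span = span_years * periods_per_year
    alpha = 2.0 / (span + 1.0)
    halflife_periods = math.log(0.5) / math.log(1.0 - alpha)
    return halflife_periods / periods_per_year


def equivalent_sma_years(span_years):
    """Length of the SMA whose midpoint sits at the EWM half life."""
    return 2.0 * ewm_halflife_years(span_years)


for span_years in (5, 10, 20, 30, 40):
    print(span_years,
          round(ewm_halflife_years(span_years), 1),
          round(equivalent_sma_years(span_years), 1))
```

For a 10 year span this gives a half life of roughly 3.5 years, and so an SMA of roughly 7 years.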


The test

Regular readers will know exactly what to expect here, but for those that aren't regular here is how I test this procedure to be as sure as possible there is no luck involved.

  • Select 10, 20, 30 or 40 years of in sample data (it doesn't make sense to apply exponentially weighted [EW] estimation to shorter periods)
  • Select 1 or 5 years of out of sample data
  • Pick a random instrument, choosing only from instruments with enough history for the in sample plus out of sample period (between 11 and 45 years)
  • Randomly pick N=9 forecasting rules from those available (the same as in previous posts)
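Drawing a single subsample along the lines of the list above might look something like this. It's only a sketch: the instrument histories and rule names are made up for illustration, and the function name and return structure are hypothetical rather than anything from the actual test code.

```python
import random


def draw_subsample(in_sample_years, out_of_sample_years,
                   history_years, rule_names, rng, n_rules=9):
    """Pick one random instrument with enough history, and n_rules random rules."""
    required_years = in_sample_years + out_of_sample_years
    eligible = [name for name, years in history_years.items()
                if years >= required_years]
    if not eligible:
        raise ValueError("No instrument has enough history")
    return {
        "instrument": rng.choice(eligible),
        "rules": rng.sample(rule_names, n_rules),  # sampled without replacement
        "in_sample_years": in_sample_years,
        "out_of_sample_years": out_of_sample_years,
    }


# Hypothetical inputs: years of history per instrument, and a pool of rules
history_years = {"CORN": 50, "INSTRUMENT_B": 25, "INSTRUMENT_C": 8}
rule_names = [f"rule_{i}" for i in range(20)]

rng = random.Random(42)
print(draw_subsample(10, 1, history_years, rule_names, rng))
```

With 40 years in sample and 5 out of sample, only instruments with at least 45 years of history would be eligible.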

Then for each of those subsamples:

  • Cycle through using an exponentially weighted [EW] span of 5,10,20,30 years; and no span (use all available history). For shorter in sample periods the EW results using longer spans will be very similar to those without EW estimation.
  • Estimate SR using the EW span.
  • Estimate correlation using all the in sample data (we could use an EW span here, but correlations are sufficiently stable that it won't unduly affect results).
  • Use fixed shrinkage levels (estimated here): SR shrinkage 0.5, correlation 0.75 (since we'll always have at least five years of in sample data we don't need to worry about the higher levels of shrinkage required when we have insufficient data). The results won't be much different with any vaguely similar shrinkage; you could argue we'd need more shrinkage with shorter EW spans but I am not going to test this.
  • Run in sample optimisation and out of sample optimisation on all the options above
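To make the SR estimation step concrete, here is a rough sketch of an EW Sharpe Ratio estimate followed by shrinkage. It assumes daily returns, 256 business days a year, taking the final value of the pandas EW mean and standard deviation, and shrinking towards the average SR across rules. Those are illustrative assumptions and the function names are hypothetical; the data is random noise.

```python
import numpy as np
import pandas as pd

BUSINESS_DAYS_PER_YEAR = 256  # assumption: daily returns, 256 business days a year


def ew_sharpe_ratio(daily_returns, span_years=None):
    """Annualised SR from daily returns; span_years=None weights all history equally."""
    if span_years is None:
        mean, std = daily_returns.mean(), daily_returns.std()
    else:
        window = daily_returns.ewm(span=span_years * BUSINESS_DAYS_PER_YEAR)
        # Use the final value of the EW mean and EW standard deviation
        mean, std = window.mean().iloc[-1], window.std().iloc[-1]
    return np.sqrt(BUSINESS_DAYS_PER_YEAR) * mean / std


def shrink_towards_average(sr_estimates, shrinkage=0.5):
    """Pull each SR towards the cross-rule average; shrinkage=1 makes them all equal."""
    return (1.0 - shrinkage) * sr_estimates + shrinkage * sr_estimates.mean()


# Fake data: nine rules, ten years of random daily returns
rng = np.random.default_rng(0)
returns = pd.DataFrame(rng.normal(0.0002, 0.01, size=(2560, 9)),
                       columns=[f"rule_{i}" for i in range(9)])

for span in (5, 10, 20, 30, None):
    raw = returns.apply(ew_sharpe_ratio, span_years=span)
    print(span, shrink_towards_average(raw, shrinkage=0.5).round(2).to_dict())
```

With only ten years of data the longer spans should give estimates close to the unweighted one, which is the point made in the first bullet above.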

Finally once we have all our subsamples:

  • Get the median SR from the distribution of subsamples
  • Find the optimal EWM span with the highest median SR
  • Test to see if that optimum is significantly higher than the others
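As a sketch of that final comparison: take the median SR per span, find the best, and test it against each alternative. The test below is a paired bootstrap on the medians, which is an assumption chosen for illustration and not necessarily the test behind the p-values in this post. The results dictionary is filled with random numbers, and the function names are hypothetical.

```python
import random
import statistics


def median_sr_by_span(results):
    """results maps span -> list of out of sample SRs, one per subsample."""
    return {span: statistics.median(srs) for span, srs in results.items()}


def bootstrap_pvalue(best, other, n_boot=2000, seed=0):
    """Share of resamples in which the best option's median fails to beat the other."""
    rng = random.Random(seed)
    n = len(best)
    failures = 0
    for _ in range(n_boot):
        # Resample subsamples in pairs, since every span sees the same subsamples
        idx = [rng.randrange(n) for _ in range(n)]
        if (statistics.median(best[i] for i in idx)
                <= statistics.median(other[i] for i in idx)):
            failures += 1
    return failures / n_boot


# Fake results: 200 subsamples for each span (999999 means use all history)
rng = random.Random(1)
results = {span: [rng.gauss(0.02, 0.5) for _ in range(200)]
           for span in (5, 10, 20, 30, 999999)}

medians = median_sr_by_span(results)
best_span = max(medians, key=medians.get)
for span in results:
    pvalue = (float("nan") if span == best_span
              else bootstrap_pvalue(results[best_span], results[span]))
    print(span, round(medians[span], 3), pvalue)
```

The output has the same shape as the tables that follow: one row per span, the median SR, and a p-value that is NaN for the optimal option.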

As I also did in my last post I'm going to see if the figures are different without any implicit fitting. To achieve this I include the opposite of a given trading rule as a candidate; and then when I come to do optimisation I pick the version that has a positive SR (there are no costs, so the SR will be identical with a negative sign).
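In code, the idea is something like the following. This is a sketch with made-up returns, an unannualised SR, and hypothetical function names: the sign is chosen on in sample data only, and the same sign is then applied out of sample.

```python
import statistics


def sharpe_ratio(returns):
    """Unannualised SR: mean over standard deviation of the returns."""
    return statistics.mean(returns) / statistics.stdev(returns)


def in_sample_sign(in_sample_returns):
    """+1 to trade the rule as is, -1 to trade its opposite."""
    return 1 if sharpe_ratio(in_sample_returns) >= 0 else -1


def apply_sign(returns, sign):
    """With no costs, the opposite rule simply has the returns negated."""
    return [sign * r for r in returns]


# Made-up pre-cost returns for a rule that lost money in sample
in_sample = [-0.02, 0.01, -0.03, -0.01, 0.005]
out_of_sample = [0.01, -0.02, -0.01]

sign = in_sample_sign(in_sample)
candidate_in_sample = apply_sign(in_sample, sign)
candidate_out_of_sample = apply_sign(out_of_sample, sign)
print(sign, sharpe_ratio(in_sample), sharpe_ratio(candidate_in_sample))
```

The candidate going into the optimisation always has a non-negative in sample SR, but nothing guarantees it will still be positive out of sample.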


10 years in sample, one year out of sample

We only have five options to consider so we can do this in a simple table.

         SR  pvalue
5 0.026 0.065
10 0.021 0.026
20 0.033 NaN
30 0.030 0.042
999999 0.017 0.131

Each row is a different EW span; '999999' means the entire in sample period was used. The first column is the median out of sample Sharpe Ratio for each option. The second column is the p-value for a test of the optimal option against the relevant option. NaN marks the optimal option, and lower values (say below 0.05) mean the optimal option is significantly better than that alternative. We can see that a 20 year span is optimal, and it's a little better than the other alternatives but not significantly better than using the entire in sample period.

Do the results differ when we don't preselect only the 'correct' rules?


SR pvalue
5 0.076 0.188
10 0.093 NaN
20 0.085 0.186
30 0.087 0.169
999999 0.083 0.236

Nothing is really significant there.


10 years in sample, five years out of sample


SR pvalue
5 0.201 0.009
10 0.215 NaN
20 0.214 0.078
30 0.211 0.277
999999 0.199 0.271

SR pvalue
5 0.093 0.000
10 0.112 NaN
20 0.104 0.339
30 0.104 0.466
999999 0.110 0.533

Here we do get better performance with anything more than 5 years.

20 years in sample, one year out of sample

          SR  pvalue
5 -0.154 0.892
10 -0.170 0.687
20 -0.163 0.610
30 -0.164 0.647
999999 -0.138 NaN


          SR  pvalue
5 -0.202 0.006
10 -0.135 0.011
20 -0.110 0.029
30 -0.097 0.059
999999 -0.055 NaN

Longer estimates are better.

20 years in sample, five years out of sample


SR pvalue
5 0.093 0.000
10 0.119 0.000
20 0.134 0.026
30 0.132 0.043
999999 0.149 NaN

          SR  pvalue
5 0.061 0.000
10 0.111 0.003
20 0.129 0.070
30 0.131 0.071
999999 0.140 NaN

Yes, longer estimates are better.


30 years in sample, one year out of sample


SR pvalue
5 -0.073 0.0
10 -0.078 0.0
20 -0.031 0.0
30 -0.005 0.0
999999 0.042 NaN

SR pvalue
5 -0.132 0.007
10 -0.152 0.000
20 -0.156 0.000
30 -0.104 0.000
999999 0.000 NaN


Same story, slightly different numbers.

30 years in sample, five years out of sample

          SR  pvalue
5 -0.038 0.0
10 -0.031 0.0
20 -0.017 0.0
30 -0.006 0.0
999999 0.008 NaN


SR pvalue
5 -0.050 0.001
10 -0.046 0.003
20 -0.038 0.003
30 -0.029 0.003
999999 -0.017 NaN


40 years in sample, one year out of sample


SR pvalue
5 -0.135 0.145
10 -0.113 0.023
20 -0.110 0.001
30 -0.090 NaN
999999 -0.150 0.947

Again, we basically want a very long estimate.

          SR  pvalue
5 -0.159 NaN
10 -0.224 0.044
20 -0.236 0.028
30 -0.238 0.072
999999 -0.254 0.061

That was a little unexpected.

40 years in sample, five years out of sample


SR pvalue
5 -0.049 0.0
10 -0.040 0.0
20 -0.029 0.0
30 -0.021 0.0
999999 -0.015 NaN


SR pvalue
5 -0.097 0.000
10 -0.064 0.000
20 -0.016 0.006
30 -0.006 NaN
999999 -0.029 0.703


Conclusion

I'm a big believer in publishing (well, blogging) research even if it doesn't produce a positive result. And certainly it looks like you don't really gain anything from using exponentially weighted estimates of Sharpe Ratios for optimisation, versus the simpler alternative of using all the data. Still, there is that nagging feeling that we should at least have the option of dropping something that hasn't worked for a while, which implies a very slow EWM. A 30 year EWM span has a 10.3 year half life, the same as a 20 year or so SMA; whilst a 40 year EWM span is equivalent to a 28 year SMA.


