Monday, 8 June 2026

Forecasting statistical estimates when data gets real

 This is my third post in a series about optimisation and fitting. In my previous post I used random data to calibrate and evaluate many portfolio optimisation techniques. It's worth quoting in full from that post:

Random data is not real data: Well duh. But why is this important? Because random data is drawn from a fixed and well behaved distribution. This means the optimiser only has to discover / estimate the parameters of that distribution as more data is revealed to it. But real data doesn't have a fixed and known distribution. It doesn't actually have any distribution at all. We just model it hoping it does.

To summarise then, random data from a fixed distribution differs from real data in three important ways:

  • There is no distribution! We just assume there is one.
  • The distribution (which doesn't exist) is not known, and thus it's likely the distribution we assume is the wrong one. This is especially true for modelling underlying financial price returns with joint Gaussian models.
  • The distribution (which again, doesn't exist) isn't fixed, but can change over time.

And it has one thing in common:

  • The parameters of the distribution are unknown and have to be learned over time.

In this post I'm going to explore this learning process for two key statistical estimates: correlations and Sharpe Ratios. What I am interested in is how much wider of the mark our estimates for these two things are likely to be for real data vs random data. This obviously has important implications for optimisation.


Let's look at a plot.



This is for random data generated by a process with a true SR of 1. It shows the evolution of the SR and its statistical distribution as it is re-estimated each year. There is a burn in year which is missing, and then in the first year we can see our estimate of the SR using all available data so far (in orange), and the SR for the current year (in blue). You can see that the orange line is lagged by a year, as it is purely out of sample and always a year behind. I've then used the orange line to estimate the theoretical sampling distribution of the Sharpe Ratio for a one year period, and constructed a confidence interval of 1.96 standard deviations either side of the orange line (so about 95%); these are the green and red lines.

Note: The theoretical standard deviation of the sampling distribution of the Sharpe Ratio, assuming i.i.d. returns, is sqrt[(1+0.5SR^2)/N] where N is the number of periods.
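Here is a minimal sketch of that formula in Python. The function name and the convention of 256 business days per year are my assumptions for illustration; the formula itself works in per-period units, so an annualised SR has to be converted to a per-period SR first and the result scaled back up.

```python
import math

DAYS_PER_YEAR = 256  # assumption: business days used to annualise


def sharpe_ratio_std_error(sr_per_period: float, n_periods: int) -> float:
    """Std dev of the SR sampling distribution, assuming i.i.d. returns.

    Both the input SR and the result are in per-period units.
    """
    return math.sqrt((1.0 + 0.5 * sr_per_period**2) / n_periods)


# One year of daily returns from a process with an annualised SR of 1
daily_sr = 1.0 / math.sqrt(DAYS_PER_YEAR)
daily_std_error = sharpe_ratio_std_error(daily_sr, DAYS_PER_YEAR)
annualised_std_error = daily_std_error * math.sqrt(DAYS_PER_YEAR)
print(f"Annualised std error of a one year SR estimate: {annualised_std_error:.3f}")
```

With these assumptions the one year standard error comes out close to 1.0 in annualised SR units, which is why a single year tells us so little about the true SR.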

Broadly speaking, if our estimates are correct then we'd hope to see around 1/20 of the blue points outside the red and green lines, and around 19/20 on the inside. There are 40 years of data here, so we'd expect about two breaches (40 x 1/20); we go outside the range twice, which is roughly what we'd expect.

Another way of measuring this is to look at our error term, normalised by our standard deviation. This will be equal to:

[(SR estimate for year N) - (SR estimate using years 0...N-1)] / (SR sampling std dev estimated from years 0...N-1)

If I take the square of this, average over all years, and then take the square root, I get the normalised root mean squared error. This comes out at 0.998 for all the data above.
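The calculation can be sketched as follows. This is not the code behind the plots: it assumes daily Gaussian returns, 256 business days per year, and a daily standard deviation of 1%, and all the function names are made up for illustration. The exact RMSE will depend on the random seed.

```python
import numpy as np

DAYS_PER_YEAR = 256  # assumption: business days per year


def annualised_sr(daily_returns: np.ndarray) -> float:
    """Annualised Sharpe Ratio of a series of daily returns."""
    return float(np.sqrt(DAYS_PER_YEAR) * daily_returns.mean() / daily_returns.std(ddof=1))


def one_year_sr_std(annual_sr: float) -> float:
    """Theoretical std dev of a one year SR estimate, in annualised units."""
    daily_sr = annual_sr / np.sqrt(DAYS_PER_YEAR)
    daily_std_error = np.sqrt((1.0 + 0.5 * daily_sr**2) / DAYS_PER_YEAR)
    return float(daily_std_error * np.sqrt(DAYS_PER_YEAR))


def normalised_errors(daily_returns: np.ndarray) -> np.ndarray:
    """Error of each year's SR versus the estimate from all prior years."""
    n_years = len(daily_returns) // DAYS_PER_YEAR
    errors = []
    for year in range(1, n_years):  # year 0 is the burn in year
        split = year * DAYS_PER_YEAR
        history = daily_returns[:split]
        this_year = daily_returns[split : split + DAYS_PER_YEAR]
        forecast = annualised_sr(history)
        errors.append((annualised_sr(this_year) - forecast) / one_year_sr_std(forecast))
    return np.array(errors)


def normalised_rmse(errors: np.ndarray) -> float:
    """Square, average over all years, then take the square root."""
    return float(np.sqrt(np.mean(errors**2)))


rng = np.random.default_rng(42)
daily_std = 0.01  # assumption: 1% daily standard deviation
daily_mean = 1.0 * daily_std / np.sqrt(DAYS_PER_YEAR)  # gives a true annual SR of 1
returns = rng.normal(daily_mean, daily_std, size=40 * DAYS_PER_YEAR)
errors = normalised_errors(returns)
rmse = normalised_rmse(errors)
print(f"{len(errors)} yearly errors, normalised RMSE = {rmse:.3f}")
```

Taking the absolute value of each entry in errors gives the per-year error series, and applying normalised_rmse to an expanding window of them gives a running RMSE.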



The blue line in this plot shows the absolute value of the error term for each year. The orange line shows the RMSE. You can see this gradually declining over time and settling in at around 0.85.

Here are the same two plots for a correlation pair estimate:





Again the RMSE tends to end up around 0.86.
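The post doesn't spell out which sampling distribution sits behind the correlation bands. One standard choice, and purely an assumption on my part here, is the Fisher z-transformation: for roughly Gaussian data, atanh(r) is approximately normal with standard deviation 1/sqrt(n - 3). A sketch, with a hypothetical function name:

```python
import math


def fisher_correlation_interval(
    correlation: float, n_observations: int, z_score: float = 1.96
) -> tuple[float, float]:
    """Approximate confidence interval for a correlation estimate.

    Assumes the correlation is strictly between -1 and 1 and that the
    Fisher approximation is good enough for the data in question.
    """
    if n_observations <= 3:
        raise ValueError("Need more than 3 observations")
    z = math.atanh(correlation)
    half_width = z_score / math.sqrt(n_observations - 3)
    # Transform back so the bounds are on the correlation scale
    return math.tanh(z - half_width), math.tanh(z + half_width)


lower, upper = fisher_correlation_interval(0.5, 256)
print(f"95% interval for a correlation of 0.5 over 256 days: {lower:.3f} to {upper:.3f}")
```

The interval is not symmetric around the estimate, and it gets narrower as the correlation approaches +1 or -1.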

Incidentally, we can also do these plots for longer periods. Here is the RMSE evolution for a SR estimate looking ahead over the next 5 years:

The RMSE here is a little higher, at around 1.0.


Now, let's look at some real data. I'm going to use the p&l from trading the US 10 year bond with a 16,64 day EWMAC. Let's begin by trying to forecast the SR one year ahead:


Even without calculating the error we can see that there are more boundary breakages than before with random data. Here is the error:

Notice that it is higher than before (around 1.25, or about sqrt(2) times bigger than the random data RMSE) and doesn't slowly converge as it did with random data; instead it stays roughly constant (ignoring the initial period of luck at the start).

We get a similar picture for 5 years:

What about correlations? Let's look at the correlation between this slow momentum on 10 year US bonds, and the carry rule on the same instrument:


Wow, that's noisy. The RMSE will be off the charts. What about over 5 years?

Ouch. If we look at the correlation between two variations of the same trading rule, EWMAC64,256 and EWMAC32,128 - which are naturally highly correlated - then it's not much better:

Again the RMSE would be in double digits.

Those might be flukes, so let's look at lots of random results. I'm going to pick an instrument and trading rule randomly, and measure its final RMSE number. I will then generate some random returns of the same length from the same SR distribution (by measuring the full sample SR for the relevant instrument/rule pairing), and measure the RMSE for those. I will then select another rule from the same instrument, get the correlation of the two p&l streams, and generate some more random returns with the given expected correlation. Finally I will measure the correlation RMSE for the two sets of real returns, and the two sets of random returns.

If I consider the ratio [RMSE real data / RMSE random data] (both for next one year); then the median of this over a few thousand randomly selected trading strategy components is 1.06 for Sharpe Ratios, and for correlations around 5.6. 

In simple terms, we are a little bit worse at forecasting Sharpe Ratios in real data one year ahead than we would be with random data, but a LOT worse with correlations.

Partly this is because we are pretty terrible at forecasting SR one year ahead anyway even with a stable underlying distribution; we don't do much worse with real data. However it does seem that correlations are far more unstable in reality than in randomly generated data. Note that these are correlations for trading strategy component returns. In some cases they are mathematically related (eg EWMAC of different speeds) and could be derived with some assumptions, a pencil, and a napkin. They are certainly more stable than the returns of the underlying instruments themselves (think about the changing correlation of stocks and bonds in different inflation environments). 

(Note: These numbers are about the same for five years ahead and also ten years ahead)

Recall from the prior post that the optimal shrinkage on correlations is zero with random data. We can now see why with actual data we'd probably want to opt for some correlation shrinkage: purely because the sampling error is much larger in practice. That is the empirical finding of the EPO paper. It does feel a bit weird, since up to now my gut feeling has been that we have to shrink means a lot because they are much harder to forecast, and because they have an outsized effect on portfolio weights compared to differences in correlation. Whilst the latter is still true, it seems the former is not.

Food for thought. Anyway, the next step is to repeat the 'Ultimate Fitting Championships' battle, but this time with real data.

 
















