I've talked at some length before about the question of fitting forecast weights, the weights you use to allocate risk amongst different signals used to trade a particular instrument. Generally I've concluded that there isn't much point wasting time on this, for example consider my previous post on the subject here.
However it's an itch I keep wanting to scratch, and in particular there are three things I'd like to look at which I haven't considered before:
- I've generally used ALL my data history, weighted equally. But there are known examples of trading rules which just stop working during the backtested period, for example faster momentum pre-cost (see my last book for a discussion).
- I've generally used Sharpe ratio as my indicator of performance of choice. But one big disadvantage of it is that it will tend to favour rules with more positive Beta exposure on markets that have historically gone up.
- I've always used a two step process where I first fit forecast weights, and then instrument weights. This separation makes things easier. But we can imagine many examples where it would produce a suboptimal performance.
To deal with these, I'm going to look at three ideas:
- Exponential weighting, with more recent performance getting a higher weight.
- Using alpha rather than Sharpe ratio to fit.
- A kitchen sink approach where both instrument and forecast weights are fitted together.
Note I have a longer term project in mind where I re-consider the entire structure of my trading system, but that is a big job, and I want to put in place these changes before the end of the UK tax year, when I will also be introducing another 50 or so instruments into my traded universe, something that would require some fitting of some kind to be done anyway.
Exponential weighting
Here is the 2nd most famous 'hockey stick' graph in existence:
(From my latest book, Advanced Futures Trading Strategies AFTS)
Focusing on the black lines, which show the net performance of the two fastest EWMAC trading rules across a portfolio of 102 futures contracts, there's a clear pattern. Prior to 1990 these rules do pretty well, then afterwards they flatline (EWMAC4 in a very clear hockey stick pattern) and do badly (EWMAC2).
I discuss some reasons why this might have happened in the book, but that isn't what concerns us now. What bothers me is this; if I allocate my portfolio across these trading strategies using all the data since 1970 then I'm going to give some allocation to EWMAC4 and even a little bit to EWMAC2. But does that really make sense, to put money in something that's been flat / money losing for over 30 years?
Fitting by use of historic data is a constant balance between using more history, to get more robust statistically significant results, and using more recent data that is more likely to be relevant and also accounts for alpha decay. The right balance depends on both the holding period of our strategies (HFT traders use months of data, I should certainly be using decades), and also the context (to predict instrument standard deviation, I use something equivalent to using about a month of returns, whereas for this problem a much longer history would be appropriate).
Now I am not talking about going crazy and doing something daft like allocating everything to the strategy that did best last week, but it does seem reasonable to use something like a 15 year halflife when estimating means and Sharpe Ratios of trading strategy returns.
That would mean I'd currently be giving about 86% of any weighting to the period after 1990, compared to about 62% now with equal weighting. So it's not a pure rolling window; the distant past still has some value, but the recent past is more important.
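A quick sketch of where percentages like these come from. It assumes continuous exponential decay, a history starting in 1970 and data ending around the start of 2024; the end year, the function name and the variable names are illustrative assumptions rather than anything taken from the backtest itself.

```python
import math


def weight_share_after(cutoff_year, start_year, end_year, halflife_years=None):
    """Share of total estimation weight given to data after cutoff_year.

    halflife_years=None means every year is weighted equally.
    """
    recent_span = end_year - cutoff_year
    full_span = end_year - start_year
    if halflife_years is None:
        return recent_span / full_span
    decay = math.log(2) / halflife_years
    # Weight on data of age a is exp(-decay * a); integrate over each span.
    recent_weight = 1 - math.exp(-decay * recent_span)
    total_weight = 1 - math.exp(-decay * full_span)
    return recent_weight / total_weight


# Assumed dates: history from 1970, evaluated at the start of 2024.
equal_share = weight_share_after(1990, 1970, 2024)
exp_share = weight_share_after(1990, 1970, 2024, halflife_years=15)
print(f"equal weighting: {equal_share:.0%}, 15 year halflife: {exp_share:.0%}")
```

Under those assumptions the two shares come out in the low 60s and mid 80s percent respectively, which is in line with the figures quoted above.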
Using alpha rather than Sharpe Ratio to fit
One big difference between Quant equity people and Quant futures traders is that the former are obsessed with alpha. They get mugs from their significant others with 'world's best alpha generator' on them for Christmas. They wear jumpers with the alpha symbol on. You get the idea. Beta is something to be hedged out. Much of the logic is that we're probably getting our daily diet of Beta exposure elsewhere, so the holistic optimal portfolio will consist of our existing Beta plus some pure alpha.
Quant futures traders are, broadly speaking, more concerned with outright return. I'm guilty of this myself. Look at the Sharpe Ratio in the backtest. Isn't it great? And you know what, that's probably fine. The correlation of a typical managed futures strategy with equity/bond 60:40 is pretty low. So most of our performance will be alpha anyway.
However evaluating different trading strategies on outright performance is somewhat problematic. Certain rules are more likely to have a high correlation with underlying markets. Typically this will include carry in assets where carry is usually positive (eg bonds), and slower momentum on anything that has most usually gone up in the past (eg equities)*. To an extent some of this is fine since we want to collect some risk premia, but if we're already collecting those premia elsewhere in long only portfolios**, why bother?
* This also means that any weighting of instrument performance will be biased towards things that have gone up in the past - not a problem for me right now as I generally ignore it, but could be a problem if we adopt a 'kitchen sink' approach as I will discuss later.
** Naturally 'Trend following plus nothing' people will prefer to collect their risk premia inside their trend following portfolios, but they are an exception. I note in passing that for a retail investor who has to pay capital gains when their futures roll, it is likely that holding futures positions is an inferior way of collecting alpha.
I'm reminded of a comment by an old colleague of mine* on investigating different trading rules in the bond sector (naturally evalutin. After several depressing weeks he concluded that 'Nothing I try is any better than long only'.
*Hi Tom!
So in my latest book AFTS (sorry for the repeated plugs, but you're reading this for free so there has to be some advertising and at least it's more relevant than whatever clickbait nonsense the evil algo would serve up to you otherwise) I did redress this slightly by looking at alpha and not just outright returns. For example my slowest momentum rule (EWMAC64,256) has a slightly higher SR than one of my fastest (EWMAC8,32), but an inferior alpha even after costs.
Which benchmark?
Well this idea of using alpha is all very well, but what benchmark are we regressing on to get it? This isn't US equities now mate, you can't just use the S&P 500 without thinking. Some plausible candidates are:
- The S&P 500.
- The 60:40 portfolio that some mythical investor might own as well as this, or a more tailored version to my own requirements. This would be roughly equivalent to long everything on a subset of markets, with sector risk weights of about 80% in equities and 20% in bonds. Frankly this wouldn't be much different to the S&P 500.
- The 'long everything' portfolio I used in AFTS, which consists of all my futures with a constant positive fixed forecast (the system from chapter 4, as readers will know).
- A long only portfolio just for the sector a given instrument is trading in.
- A long only position just on the given instrument we are trading.
There are a number of things to consider here. What is the other portfolio that we hold? It might well be the S&P 500 or just the magnificent 7; it's more likely to consist of a globally diversified bunch of bonds and stocks; it's less likely to have a long only cash position in some obscure commodities contract.
Also not all things deliver risk premia in their naked Beta outfits. Looking at median long only constant forecast SR in chapter 3 of AFTS, they appear lower in the non financial assets (0.07 in ags, 0.27 in metals and 0.32 in energy; versus 0.40 in short vol, 0.46 in equity and 0.59 in bonds; incidentally FX is also close to zero at 0.09, but EM apart there's no reason why we should earn a risk premium here). This implies we should be veering towards loading up on Beta in financials and Alpha in non financials.
But it's hard to disaggregate what is the natural risk premium from holding financial assets, versus what we've earned just from a secular downtrend in rates and inflation that lasted for much of the 50 odd years of the backtest. Much of the logic for doing this exercise is because I'm assuming that these long only returns will be lower in the future because that secular trend has now finished.
Looking at the alpha just on one instrument will make it a bit weird when comparing alphas across different instruments. It might sort of make more sense to do the regression on a sector Beta. This would be more analogous to what the equity people do.
On balance I think the 'long everything' benchmark I used in AFTS is the best compromise. Because trends have been stronger in equities and bonds it will be reasonably correlated to 60:40 anyway. Regressing against this will thus give a lower Beta and potentially better Alpha for instruments outside of those two sectors.
One nice exercise to do is to then see what a blend of long everything and the alpha optimised portfolio looks like. This would allow us to include a certain amount of general Beta into the strategy. We probably shouldn't optimise for this.
Optimising with alpha
We want to allocate more to strategies with higher alpha. We also want that alpha to be statistically significant. We'll get more statistical significance with more observations, and/or a better fit to the regression.
Unlike with means and Sharpe Ratios, I don't personally have any well developed methodologies, theories, or heuristics, for allocating weights according to alpha or significance of alpha. I did consider developing a new heuristic, and wasted a bit of time with toy formulae involving the product of (1 - p_value) and alpha.
But I quickly realised that it's fairly easy to adapt work I have done on this before. Instead of using naked return streams, we use residual return streams; basically the return left over after subtracting Beta*benchmark return. We can then divide this by the target risk (the annualised standard deviation) to get a Sharpe Ratio, which is then plugged in as normal.
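The residual idea can be sketched as follows. This is a minimal illustration, not the code actually used: the function name, the 256 business days per year, and the choice between dividing by a fixed target volatility or by the realised volatility of the residuals are all assumptions.

```python
import numpy as np


def residual_sharpe(strategy_returns, benchmark_returns,
                    target_annual_vol=None, periods_per_year=256):
    """Return (beta, residual returns, annualised SR of the residuals)."""
    strat = np.asarray(strategy_returns, dtype=float)
    bench = np.asarray(benchmark_returns, dtype=float)
    # OLS slope of strategy on benchmark (regression with an intercept).
    beta = np.cov(strat, bench)[0, 1] / np.var(bench, ddof=1)
    # What is left after removing the benchmark exposure; the alpha stays in.
    residual = strat - beta * bench
    annual_mean = residual.mean() * periods_per_year
    if target_annual_vol is None:
        # Assumption: fall back to the realised vol of the residuals.
        annual_vol = residual.std(ddof=1) * np.sqrt(periods_per_year)
    else:
        annual_vol = target_annual_vol
    return beta, residual, annual_mean / annual_vol


# Synthetic daily returns, purely for illustration.
rng = np.random.default_rng(0)
bench = rng.normal(0.0003, 0.01, 5000)
strat = 0.5 * bench + rng.normal(0.0002, 0.005, 5000)
beta, residual, resid_sr = residual_sharpe(strat, bench)
print(f"beta {beta:.2f}, residual SR {resid_sr:.2f}")
```

The resulting residual Sharpe Ratio can then be fed into whatever weighting scheme would otherwise have consumed the outright Sharpe Ratio.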
How does this fit into an exponential framework? There are a number of ways of doing this, but I decided against the complexity of writing code (which would be slow) to do my regression in a full exponential way. Instead I estimate my Betas on first an expanding, then a rolling, 30 year window (which trivially has a 15 year half life). I don't expect Betas to vary that much over time. I estimate my alphas (and hence Sharpe ratios) with a 15 year half life on the residuals. Betas are re-estimated every year, and the most up to date estimate is then used to correct returns in the past year (otherwise the residual returns would change over time which is a bit weird and also computationally more expensive).
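A rough sketch of that procedure using pandas. It reflects one reading of the description (a Beta estimated at each year end on at most 30 years of data, applied once to that year's returns and then left alone); the function names, the 256 day year and the synthetic data are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

DAYS_PER_YEAR = 256  # assumed number of business days per year


def yearly_betas(strat, bench, max_window_years=30):
    """Beta at each year end, using at most the last max_window_years of data."""
    data = pd.concat({"strat": strat, "bench": bench}, axis=1).dropna()
    years = data.index.year
    betas = {}
    for year in sorted(set(years)):
        in_window = (years <= year) & (years > year - max_window_years)
        window = data[in_window]
        betas[year] = window["strat"].cov(window["bench"]) / window["bench"].var()
    return pd.Series(betas)


def frozen_residuals(strat, bench, betas):
    """Correct each year's returns once with that year's beta, then freeze them."""
    data = pd.concat({"strat": strat, "bench": bench}, axis=1).dropna()
    beta_by_day = pd.Series(data.index.year.to_numpy(), index=data.index).map(betas)
    return data["strat"] - beta_by_day * data["bench"]


def ew_sharpe(residuals, halflife_years=15):
    """Annualised Sharpe Ratio from exponentially weighted mean and std."""
    ewm = residuals.ewm(halflife=halflife_years * DAYS_PER_YEAR)
    return (ewm.mean() / ewm.std()) * np.sqrt(DAYS_PER_YEAR)


# Synthetic data, purely for illustration.
rng = np.random.default_rng(1)
dates = pd.bdate_range("2000-01-03", periods=800)
bench = pd.Series(rng.normal(0.0003, 0.01, len(dates)), index=dates)
strat = 0.3 * bench + rng.normal(0.0002, 0.002, len(dates))

betas = yearly_betas(strat, bench)
residuals = frozen_residuals(strat, bench, betas)
latest_sr = ew_sharpe(residuals).iloc[-1]
print(betas.round(3).to_dict(), round(latest_sr, 2))
```

Because each year's residuals are computed once and never revisited, extending the backtest by a year only requires one new regression rather than a recalculation of the whole history.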
Kitchen sink
I've always done my optimisation in a two step process. First, what is the best way to forecast the price of this market (what is the best allocation across trading rules, i.e. what are my forecast weights)? Second, how should I put together a portfolio of these forecasters (what is the best allocation across instruments, i.e. what are my instrument weights)?
Partly that reflects the way my trading strategy is constructed, but this separation also makes things easier. But it does reflect a forecasting mindset, rather than a 'diversified set of risk premia' mindset. Under the latter mindset, it would make sense to do a joint optimisation where the individual 'lego bricks' are ONE trading rule and ONE instrument.
It strikes me that this is also a much more logical approach once we move to maximising alpha rather than maximising Sharpe Ratio.
Of course there are potential pain points here. Even for a toy portfolio of 10 trading rules and 50 instruments we are optimising 500 assets. But the handcrafting approach of top down optimisation ought to be able to handle this fairly easily (we shall see!).
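To make the size of the joint problem concrete, here is a small sketch of how the 'lego bricks' multiply up. The rule and instrument names are hypothetical placeholders, and the pairwise correlation count is simply the number of distinct pairs among the joint assets.

```python
from itertools import product

# Hypothetical placeholder names for a toy portfolio.
rules = [f"rule_{i}" for i in range(10)]
instruments = [f"instrument_{j}" for j in range(50)]

# Two step approach: optimise over rules, then separately over instruments.
two_step_sizes = (len(rules), len(instruments))

# Kitchen sink: every (instrument, rule) pair is an asset in its own right.
joint_assets = [f"{instrument}:{rule}"
                for instrument, rule in product(instruments, rules)]

# Number of distinct pairwise correlations the joint optimiser has to consider.
n_correlations = len(joint_assets) * (len(joint_assets) - 1) // 2

print(two_step_sizes, len(joint_assets), n_correlations)
```

The jump from optimisations over 10 and 50 assets to one over 500 assets is why a top down, clustering based method is attractive here.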
Testing
Setup
Let's think about how to set up some tests for these ideas. For speed and interpretation I want to keep things reasonably small. I'm going to use my usual five outright momentum EWMAC trading rules, plus 1 carry (correlations are pretty high here, I will use carry60), plus one of my skew rules (skewabs180 for those who care), asset class mean reversion - a value type rule (mrinasset1000), and both asset class momentum (assettrend64) and relative momentum (relmomentum80). My expectation is that the more RV type rules - relative momentum, skew, value - will get a higher weight than when we are just considering outright performance. I'm also expecting that the very fastest momentum will have a lower weight when exponential weighting is used.
The rules' profitability is shown above. You can see that we're probably going to want to have less MR (mean reversion), as it's rubbish; and also if we update our estimates for profitability we'd probably want less faster momentum and relative momentum. There is another hockey stick from 2009 onwards when many rules seem to flatten off somewhat.
For instruments, to avoid breaking my laptop with repeated optimisation of 200+ instruments I kept it simple and restricted myself to only those with at least 20 years of trading history. There are 39 of these old timers:
'BRE', 'CAD', 'CHF', 'CORN', 'DAX', 'DOW', 'EURCHF', 'EUR_micro', 'FEEDCOW', 'GASOILINE', 'GAS_US_mini', 'GBP', 'GBPEUR', 'GOLD_micro', 'HEATOIL', 'JPY', 'LEANHOG', 'LIVECOW', 'MILK', 'MSCISING', 'MXP', 'NASDAQ_micro', 'NZD', 'PALLAD', 'PLAT', 'REDWHEAT', 'RICE', 'SILVER', 'SOYBEAN_mini', 'SOYMEAL', 'SOYOIL', 'SP400', 'SP500_micro', 'US10', 'US20', 'US5', 'WHEAT', 'YENEUR', 'ZAR'
On the downside there is a bit of a sector bias here (12 FX, 11 Ags, 6 equity, 4 metals, and only 3 bonds and 3 energy), but that also gives more work for the optimiser (FWIW my full set of instruments is biased towards equities, so you can't really win).
For my long only benchmark used for regressions I'm going to use a fixed forecast of +10, which in layman's terms means it's a risk parity type portfolio. I will set the instrument weights using my handcrafting method, but without any information about Sharpe Ratio, just correlations. IDM is estimated on backward looking data of course.
I will then have something that roughly resembles my current system (although clearly with fewer markets and trading rules, and without using dynamic optimisation of positions). I also use handcrafting, but I fit forecast weights and instrument weights separately, again without using any information on performance just correlations.
I then check the effect of introducing the following features:
- 'SR' Allowing Sharpe Ratio information to influence forecast and instrument weights
- 'Alpha' Using alpha rather than Sharpe Ratio
- 'Short' Using a 15 year halflife rather than all the data to estimate Sharpe Ratios and correlations
- 'Sink' Estimating the weights for forecast and instrument weights at the same time
Apart from SR and alpha which are mutually exclusive, this gives me the following possible permutations:
- Baseline: Using no performance information
- 'SR'
- 'SR+Short'
- 'Sink'
- 'SR+Sink'
- 'SR+Short+Sink'
- 'Alpha'
- 'Alpha+Short'
- 'Alpha+Sink'
- 'Alpha+Short+Sink'
Long only benchmark
Baseline
The massive weight to mrinasset is due to the fact it is very diversifying, and we are only using correlations here. But mrinasset isn't very good, so smuggling in outright performance would probably be a good thing to do.
SR of this thing is 0.98 and costs are a bit higher as we'd expect at 0.75% annualised. Always amazing how well just a simple diversified system can do. The Beta to our long only model is just 0.09 (again probably due to that big dollop of mean reversion which is slightly negative Beta if anything), so perhaps unsurprising the net alpha is 18.8% a year (dividing by the vol gets to a SR of 0.98 again just on the alpha). BUT...
'SR'
I'm now going to allow the fitting process for both forecast and instrument weights to use Sharpe ratio. Naturally I'm doing this in a sensible way so the weights won't go completely nuts.
Let's have a look at the forecast weights for comparison:
momentum4 0.112
momentum8 0.065
momentum16 0.068
momentum32 0.075
momentum64 0.072
assettrend64 0.122
relmomentum80 0.097
mrinasset1000 0.135
skewabs180 0.105
carry60 0.149
We can see that money losing MR has a lower weight, and in general the non trendy part of the portfolio has dropped from about half to under 40%. But we still have lots of faster momentum as we're using the whole period to fit.
Instrument weights by asset class meanwhile look like this:
{'Ags': 0.202, 'Bond': 0.317, 'Equity': 0.191, 'FX': 0.138, 'Metals': 0.0700, 'OilGas': 0.0820}
Not dramatic changes, but we do get a bit more of the winning asset classes.
'SR+Short'
Forecast weights:
momentum4 0.095
momentum8 0.055
momentum16 0.059
momentum32 0.067
momentum64 0.066
relmomentum80 0.095
skewabs180 0.117
carry60 0.142
mrinasset1000 0.183
There's definitely been a shift out of faster momentum, and into things that have done better recently such as skew. We are also seeing more MR, which seems counterintuitive. My initial theory was that this is because MR becomes more diversifying over time, and this is indeed the case.
'Sink+SR+Short'
Interlude - clustering when everything is optimised together
'Alpha'
Results
                     SR   beta   r_SR  H1_SR  H1_beta  H1_r_SR  H2_SR  H2_beta  H2_r_SR
LONG_ONLY         0.601  1.000  0.000  0.740    1.000    0.000  0.191    1.000    0.000
BASELINE          0.983  0.094  0.940  1.246    0.094    1.192  0.197    0.058    0.188
SR                1.087  0.308  0.932  1.298    0.338    1.084  0.458    0.151    0.438
SR_short          1.089  0.324  0.927  1.318    0.353    1.100  0.375    0.166    0.350
sink              1.025  0.323  0.871  1.252    0.330    1.055  0.356    0.260    0.319
SR_sink           0.975  0.362  0.804  1.167    0.378    0.947  0.388    0.265    0.349
SR_short_sink     0.929  0.384  0.753  1.121    0.395    0.899  0.323    0.305    0.277
alpha             0.993  0.199  0.891  1.212    0.220    1.069  0.345    0.081    0.336
alpha_short       1.023  0.238  0.904  1.237    0.249    1.082  0.367    0.160    0.343
alpha_sink_short  0.888  0.336  0.735  1.090    0.336    0.903  0.257    0.299    0.210
Coda: Forecast or instrument weights?
                     SR   beta   r_SR  H1_SR  H1_beta  H1_r_SR  H2_SR  H2_beta  H2_r_SR
LONG_ONLY         0.601  1.000  0.000  0.740    1.000    0.000  0.191    1.000    0.000
BASELINE          0.983  0.094  0.940  1.246    0.094    1.192  0.197    0.058    0.188
SR                1.087  0.308  0.932  1.298    0.338    1.084  0.458    0.151    0.438
SR_forecasts      1.173  0.330  1.002  1.416    0.364    1.179  0.441    0.154    0.420
SR_instruments    0.948  0.118  0.892  1.179    0.121    1.106  0.259    0.076    0.248
To clarify, you're using Sharpe on an expanding window to adjust weights? It is interesting that there was no benefit to adjusting instrument weightings this way!
What if you used a rolling 48 or 96-month window or such for rules? I've done some testing on systems that showed a benefit to selecting the best subsystems based on this criterion - though not for futures. Unfortunately they still had severe look-back bias as the systems were created post the test data...
48/96 months is *way* too short to get statistical significance
Thank you. I've seen the same return degradation you found in futures in equity selection and TAA (ETF) asset class rotation and was hoping to also find a solution.
I'm a moron though here - is it simply the number of price points that account for it being okay for high frequency traders to alter signals in only weeks, but not for us to use 48 months? So for 3 weeks a high frequency trader would have 7,200 price points with minute data, whereas 48 months only accounts for 1,460 price points using daily data?
Another mistake, 252 days * 4 years is only 1,008.
"Is it simply the number of price points that account for it being okay for high frequency traders to alter signals in only weeks, but not for us to use 48 months?" To be precise, it's the amount of information; broadly speaking, the number of decisions you are making. So if I was to test my strategy on tick data, that wouldn't give me more information since my holding period is still ~1 month. Also the statistical significance grows with the square root of time.
Thank you for your help. I went back to my program with statsmodels and your advice. It showed significance vs 60-40, but when comparing to the best individual strategies there was no significance and very low power with both the expanding and rolling windows (even though I was aggregating 12 out of 96). I think I basically over-engineered sorting an Excel sheet by Sharpe and pretending I picked a few of the best back in 1970.
Very interesting post, sir. Quick question: is there any statistical 'incompleteness' or hazard from using p-values as opposed to, say, confidence intervals for comparing SR to alpha in your returns estimates?
With regard to actual implementation: you are using your hand-crafted weights and then applying a formula to exponentially weight a Sharpe ratio adjustment over time? Is this somehow applied to and adjusts the dataframe that forecast_weights_for_instruments() pulls (and just starts with handcrafted weights set in a config file)? Would you consider releasing the formula used and code? Thank you!
Yeah it's 'handcrafting' but not done by hand. You're welcome to look at the code, it's horrible though: https://gist.github.com/robcarver17/58b3668407fdbd05954c34373c63d9ed
Wow, thank you very much. Quite a bit of work went into this paper/post!
I tried to understand the code and what's happening in PySystemTrade, but had a bit of trouble:
Is this right?
1) Taking an expanding window of returns over the rolling standard deviation to calculate Sharpe (*not* a rolling window of returns).
2) Applying the ew_lookback to the above (i.e. 15 years)
3) Adjusting weights depending on confidence
Sorry about the above. I was trying to implement a similar setup in my own (far, far, far simpler) Python scripts. I finally simply exponentially weighted the mean() and std() and created the Sharpe, then ranked. I did see a consistent effect across three different systems that ranked 90+ TAA (etf) strategies and two systems that used a variety of equity systems (many with decade plus out of sample).
For fun I also tried to expand on your tests here. There appeared to be a small Sharpe bump from only applying sr_equalize False to forecast and not instruments (an idea you alluded to in the end) and significant bumps from separately testing the full rule set with these limited instruments, and testing the full instrument set with these limited rules - not surprising of course.
Great post as usual Rob, thanks so much. I've been playing around with this as well. I generated the unweighted performance of each trading rule; for the fast trading rules I filtered out the markets that were too expensive (so it only reflects performance on markets that were viable to trade), then ran those weekly returns through your full handcrafting code included in Pysystemtrade and below is what popped out. Interesting results and some interesting divergences from the manually handcrafted weights. I believe your full handcrafting script already includes Sharpe (along with correlations) as a weighting criterion similar to the Sharpe only test here, no?
accel16: 0.02581
accel32: 0.00000
accel64: 0.00000
assettrend16: 0.07681
assettrend2: 0.00684
assettrend32: 0.00822
assettrend4: 0.00129
assettrend64: 0.00745
assettrend8: 0.01186
breakout10: 0.00475
breakout160: 0.00029
breakout20: 0.00667
breakout320: 0.05110
breakout40: 0.00606
breakout80: 0.01203
carry10: 0.00082
carry125: 0.10106
carry30: 0.00274
carry60: 0.02006
momentum16: 0.00184
momentum32: 0.02651
momentum4: 0.00121
momentum64: 0.00255
momentum8: 0.00398
mrinasset1000: 0.15134
normmom16: 0.01894
normmom2: 0.00121
normmom32: 0.03793
normmom4: 0.00138
normmom64: 0.01305
normmom8: 0.00953
relcarry: 0.00000
relmomentum10: 0.09064
relmomentum20: 0.10128
relmomentum40: 0.00223
relmomentum80: 0.00759
skewabs180: 0.01640
skewabs365: 0.07140
skewrv180: 0.03902
skewrv365: 0.05809
Tree:
Contains 3 sub portfolios
  [0] Contains 3 sub portfolios
    [0][0] Contains 3 sub portfolios
      [0][0][0] Contains ['relmomentum40', 'relmomentum80']
      [0][0][1] Contains 3 sub portfolios
        [0][0][1][0] Contains ['breakout80', 'momentum16', 'normmom16']
        [0][0][1][1] Contains ['assettrend16']
        [0][0][1][2] Contains ['accel64']
      [0][0][2] Contains 3 sub portfolios
        [0][0][2][0] Contains 2 sub portfolios
          [0][0][2][0][0] Contains ['breakout160', 'momentum32', 'normmom32']
          [0][0][2][0][1] Contains ['assettrend32']
        [0][0][2][1] Contains ['momentum64', 'normmom64']
        [0][0][2][2] Contains ['assettrend64', 'breakout320']
    [0][1] Contains 3 sub portfolios
      [0][1][0] Contains ['assettrend8', 'momentum8', 'normmom8']
      [0][1][1] Contains ['breakout40']
      [0][1][2] Contains ['accel32']
    [0][2] Contains 3 sub portfolios
      [0][2][0] Contains ['assettrend2', 'breakout10', 'normmom2']
      [0][2][1] Contains 2 sub portfolios
        [0][2][1][0] Contains ['assettrend4', 'momentum4', 'normmom4']
        [0][2][1][1] Contains ['breakout20']
      [0][2][2] Contains ['accel16']
  [1] Contains 3 sub portfolios
    [1][0] Contains 3 sub portfolios
      [1][0][0] Contains ['carry10', 'carry30', 'carry60']
      [1][0][1] Contains ['carry125']
      [1][0][2] Contains ['relcarry']
    [1][1] Contains ['relmomentum10', 'relmomentum20']
    [1][2] Contains 2 sub portfolios
      [1][2][0] Contains ['skewabs180', 'skewabs365']
      [1][2][1] Contains ['skewrv180', 'skewrv365']
  [2] Contains ['mrinasset1000']
You could try the residual SR to weight, instead of the alphas.
Hi Rob,
Thank you for your, as always, very informative post.
I'm curious what you think about novel portfolio optimization models like Mean Variance Skewness Kurtosis Efficient optimization (MVSKE).
This method captures more moments of the return distribution, but I'm uncertain about its practical value compared to its computational complexity.
Do you think the additional insights from considering skewness and kurtosis are worth the extra effort?
Best regards,
Nikolay
No