Friday, 12 May 2023

Clustering trading rule p&l

I recently upgraded my live production system to include all the extra instruments I've added recently. I also did a little consolidation of trading rules, simplifying things slightly by removing some rules that didn't really have much allocation, and adding a couple from my new book. As usual I set the instrument weights and forecast weights using my handcrafting methodology, which is basically a top down method that involves clustering things into groups in a hierarchical fashion.

In my backtests I do this clustering using the correlation matrix as a guide, but for production weights I use heuristics. So for instruments I say things like 'bonds are probably more correlated with each other than with other assets' and form the clusters initially as asset classes. And for forecast weights, which allocate across trading rules, I say things like 'momentum type trading rules are probably more correlated with each other', so I end up using a hierarchy like this:

  • Convergent (eg carry and mean reversion), Divergent (eg momentum)
  • Generic trading rule (eg EWMAC)
  • Specific trading rule variation (eg EWMAC2,8)

Now I recently tested this clustering method for instruments in this blog post. OK, it was 17 months ago, but it felt recent to me. Basically I used a clustering methodology and threw in the actual correlation matrix to see how the grouping turned out. It was quite interesting. So I thought it would be worth doing a similar thing with forecast weights. Effectively I am sense checking my heuristic guidelines to see if they are completely nuts, or vaguely okay.

Some code.
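
For the curious, here is a rough sketch of the clustering step. This is not the linked code itself: it assumes we already have a pandas DataFrame corr holding the correlation matrix of trading rule returns (building that matrix is discussed in the next section), and the choice of scipy with Ward linkage is purely illustrative.

import numpy as np
import pandas as pd
import scipy.cluster.hierarchy as sch

def cluster_rules(corr: pd.DataFrame, n_clusters: int) -> dict:
    # Convert correlations to distances: highly correlated rules are 'close'
    distances = 1.0 - corr.values
    # scipy wants a condensed (1-D upper triangle) distance matrix
    condensed = distances[np.triu_indices_from(distances, k=1)]
    # Build the hierarchical tree, then cut it into exactly n_clusters groups
    tree = sch.linkage(condensed, method="ward")
    labels = sch.fcluster(tree, t=n_clusters, criterion="maxclust")
    # Collect rule names by cluster label
    clusters: dict = {}
    for rule, label in zip(corr.columns, labels):
        clusters.setdefault(label, []).append(rule)
    return clusters

# e.g. cluster_rules(corr, 2) should reproduce the N=2 split shown below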


Getting the correlation matrix

Well you might think this is easy, but it's not. The correlation matrix here is the correlation of returns for a given set of trading rules and variations. But returns of what? A single instrument, like the S&P 500? That obviously may be unrepresentative of the sample generally, and we're not going to do this exercise for the 200+ instruments I have in my dataset now. What about the correlation of average returns taken across a bunch of instruments, or perhaps the average of the correlations taken across the same bunch - note these aren't quite the same thing. For example an average of correlations will give every instrument the same weight, whereas an average of returns will give a higher weighting to instruments with more data history.
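
To make that distinction concrete, here's a toy illustration with made-up numbers (not my actual returns): instrument B has a much shorter history, so it gets a smaller implicit weight in the correlation of average returns, but an equal weight in the average of correlations.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.date_range("2015-01-01", periods=1000)

def fake_returns(n, index):
    # two correlated rule return series driven by a common factor
    common = rng.normal(size=n)
    return pd.DataFrame({"rule1": common + rng.normal(size=n),
                         "rule2": common + rng.normal(size=n)}, index=index)

inst_A = fake_returns(1000, dates)           # full history
inst_B = fake_returns(100, dates[-100:])     # short history

# (a) average of per-instrument correlations: each instrument counts equally
avg_of_corrs = np.mean([inst_A["rule1"].corr(inst_A["rule2"]),
                        inst_B["rule1"].corr(inst_B["rule2"])])

# (b) correlation of cross-instrument average returns: instrument A dominates,
#     because only it has data on the first 900 days
rule1_avg = pd.concat([inst_A["rule1"], inst_B["rule1"]], axis=1).mean(axis=1)
rule2_avg = pd.concat([inst_A["rule2"], inst_B["rule2"]], axis=1).mean(axis=1)
corr_of_avgs = rule1_avg.corr(rule2_avg)

print(avg_of_corrs, corr_of_avgs)   # similar numbers here, but not the same measure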

And if we are doing averaging, do we just do a simple average - which will be biased since 37% of my futures are equities? Or do we use the instrument weights?

The good news is it probably won't make too much difference. Given enough history, the correlation of trading rule forecast returns is pretty similar across instruments. But we probably want to avoid overweighting certain asset classes, or equally weighting instruments without much history. So I'm going to go for taking the return correlation of portfolios for each trading rule. Each portfolio consists of the same trading rule being traded in all the instruments I trade, weighted by my actual instrument weights. 

Now I don't actually trade all rules in all instruments, because of trading costs, and sometimes because the instrument has certain flaws, but what we are trying to get here is as much information as possible to build a robust correlation matrix. I will also use pre-cost returns; not that it will make any difference, but the point here is to discover how similar rules are to each other, which doesn't depend on costs.

Finally note that I have 135 instruments with instrument weights, because some of my 208 are duplicates (eg micro and mini S&P 500), or I can't legally trade them, or for some other reason.
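In code, the chosen approach might look something like the sketch below; pre_cost_returns (a dict of dicts of pandas Series, keyed by instrument then rule) and instrument_weights are hypothetical stand-ins for the relevant pysystemtrade objects, not the real API.

import pandas as pd

def rule_portfolio_returns(pre_cost_returns: dict,
                           instrument_weights: dict,
                           rule_names: list) -> pd.DataFrame:
    # For each rule, build a portfolio of that rule traded across all
    # instruments, weighted by the actual instrument weights
    portfolios = {}
    for rule in rule_names:
        weighted = [pre_cost_returns[instrument][rule] * weight
                    for instrument, weight in instrument_weights.items()]
        # sum across instruments; min_count=1 keeps dates with no data as NaN
        portfolios[rule] = pd.concat(weighted, axis=1).sum(axis=1, min_count=1)
    return pd.DataFrame(portfolios)

# the correlation matrix used for the clustering:
# corr = rule_portfolio_returns(pre_cost_returns, instrument_weights, rule_names).corr()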


Results: N=2

Let's kick things off then:

Cluster 1 'convergent'
['mrinasset160', 'carry10', 'carry30', 'carry60', 'carry125', 'relcarry',
'skewabs365', 'skewabs180', 'skewrv365', 'skewrv180']
Cluster 2 'divergent'
['breakout10', 'breakout20', 'breakout40', 'breakout80', 'breakout160', 'breakout320',
'relmomentum10', 'relmomentum20', 'relmomentum40', 'relmomentum80', 
'assettrend2', 'assettrend4', 'assettrend8', 'assettrend16', 
'assettrend32', 'assettrend64', 
'normmom2', 'normmom4', 'normmom8', 'normmom16', 'normmom32', 'normmom64', 
'momentum4', 'momentum8', 'momentum16', 'momentum32', 'momentum64', 
'accel16', 'accel32', 'accel64']


An absolutely perfect convergent vs divergent split. The labels by the way are added by me, not the code.


Results: N=3


Cluster 1 'convergent' (unchanged)
['mrinasset160', .... ]

Cluster 2 'fast divergent'
['breakout10', 'breakout20',
'relmomentum10', 'relmomentum20', 'relmomentum40', 'relmomentum80', 
'assettrend2', 'assettrend4', 
'normmom2', 'normmom4', 
'momentum4', 'accel16']

Cluster 3 'medium and slow divergent'
['breakout40', 'breakout80', 'breakout160', 'breakout320',
'assettrend8', 'assettrend16', 'assettrend32', 'assettrend64', 
'normmom8', 'normmom16', 'normmom32', 'normmom64', 
'momentum8', 'momentum16', 'momentum32', 'momentum64', 
'accel32', 'accel64']

This is why we are doing this exercise: we've just discovered something interesting. Fast momentum-like trading rules have more in common with other fast momentum-like rules than they do with slow variations of themselves.

Results: N=4

The convergent group now splits, with asset mean reversion separating from skew and carry:

Cluster 1 'convergent mean reversion'
['mrinasset160']
Cluster 2 'convergent skew and carry'
['carry10', 'carry30', 'carry60', 'carry125', 'relcarry', 'skewabs365', 'skewabs180', 'skewrv365', 'skewrv180']
Cluster 3 'fast divergent' (unchanged)
['breakout10', 'breakout20', ....]
Cluster 4 'medium and slow divergent' (unchanged)
['breakout40', 'breakout80', ....]


Results: N=5

Now it's the turn of the (relatively) slow divergent to be split up:

Cluster 1 'convergent mean reversion' (unchanged)
['mrinasset160', 'mrwrings4']
Cluster 2 'convergent skew and carry' (unchanged)
['carry10', 'carry30', 'carry60', ....]
Cluster 3 'fast divergent' (unchanged)
['breakout10', 'breakout20', ....]
Cluster 4 'slow divergent'
['breakout160', 'breakout320',
'assettrend32', 'assettrend64', 
'normmom32', 'normmom64', 
'momentum32', 'momentum64']
Cluster 5 'medium speed divergent'
['breakout40', 'breakout80',
'assettrend8', 'assettrend16', 
'normmom8', 'normmom16', 
'momentum8', 'momentum16', 
'accel32', 'accel64']

Again it's the speed of trading that is the differentiator here, not the trading rule.


Results: N=6

We break relative momentum off from its counterparts in what was previously cluster 3:

Cluster 1 'convergent mean reversion' (unchanged)
['mrinasset160']
Cluster 2 'convergent skew and carry' (unchanged)
['carry10', 'carry30', 'carry60', ...]
Cluster 3 'relative momentum'
['relmomentum10', 'relmomentum20', 'relmomentum40', 'relmomentum80']
Cluster 4 'fast divergent ex. relative momentum'
['breakout10', 'breakout20',
'assettrend2', 'assettrend4', 
'normmom2', 'normmom4', 
'momentum4', 'accel16']
Cluster 5 'slow divergent' (unchanged - was cluster 4)
['breakout160', 'breakout320', ...]
Cluster 6 'medium speed divergent' (unchanged - was cluster 5)
['breakout40', 'breakout80', ...]

Results: N=7

And now acceleration comes away from the other slow rules:

Cluster 1 'convergent mean reversion' (unchanged)
['mrinasset160']
Cluster 2 'convergent skew and carry' (unchanged)
['carry10', 'carry30', 'carry60', ...]
Cluster 3 'relative momentum' (unchanged)
['relmomentum10', 'relmomentum20', 'relmomentum40', 'relmomentum80']
Cluster 4 'fast divergent' (unchanged)
['breakout10', 'breakout20'...
]
Cluster 5 'slow divergent ex. accel'
['breakout160', 'breakout320', 'assettrend32', 'assettrend64', 'normmom32', 'normmom64', 'momentum32', 'momentum64']
Cluster 6 'slow acceleration'
['accel32', 'accel64']
Cluster 7 'medium speed divergent' (unchanged - was cluster 6)
['breakout40', 'breakout80',...
]

Results: N=8

Skew and carry separate:

Cluster 1 'convergent mean reversion' (unchanged)
['mrinasset160']
Cluster 2 'skew'
['skewabs365', 'skewabs180', 'skewrv365', 'skewrv180']
Cluster 3 'carry'
['carry10', 'carry30', 'carry60', 'carry125', 'relcarry']
Cluster 4 'relative momentum' (unchanged)
['relmomentum10', 'relmomentum20', 'relmomentum40', 'relmomentum80']
Cluster 5 'fast divergent' (unchanged)
['breakout10', 'breakout20'...
]
Cluster 6 'slow divergent ex. accel'
['breakout160', 'breakout320',...]
Cluster 7 'slow acceleration' (unchanged)
['accel32', 'accel64']
Cluster 8 'medium speed divergent' (unchanged)
['breakout40', 'breakout80',...
]

Results: N=11

Let's skip ahead a bit, and also show all the trading rule variations in each group for this final iteration:

Cluster 1 'slow asset mean reversion'
['mrinasset160']
Cluster 2 'skew'
['skewabs365', 'skewabs180', 'skewrv365', 'skewrv180']
Cluster 3 'carry'
['carry10', 'carry30', 'carry60', 'carry125', 'relcarry']
Cluster 4 'fast relative momentum'
['relmomentum10', 'relmomentum20']
Cluster 5 'slow relative momentum'
['relmomentum40', 'relmomentum80']
Cluster 6 'divergent speed 2'
['breakout20', 'assettrend4', 'normmom4', 'momentum4']
Cluster 7 'divergent speed 1 (fastest)'
['breakout10', 'assettrend2', 'normmom2', 'accel16']
Cluster 8 'divergent speed 5 (slowest)'
['breakout160', 'breakout320', 'assettrend32', 'assettrend64', 'normmom32', 'normmom64', 'momentum32', 'momentum64']
Cluster 9 'slow acceleration'
['accel32', 'accel64']
Cluster 10 'divergent speed 4'
['breakout80', 'assettrend16', 'normmom16', 'momentum16']
Cluster 11 'divergent speed 3'
['breakout40', 'assettrend8', 'normmom8', 'momentum8']

A new hierarchy for handcrafting trading rules

With that all in mind, a better hierarchy would be something a bit like this:

  • Convergent
    • Mean reversion
    • Skew
      • Equal split between 4 skew rules
    • Carry
      • Outright carry
      • Relative carry
  • Divergent
    • Speed 1 (fastest: turnover > 45)
      • acceleration - nothing fast enough
      • relmomentum10
      • other trend
        • breakout10
        • assettrend2
        • normmom2
        • momentum4
    • Speed 2 (22 < turnover < 45)
      • acceleration16
      • relmomentum20
      • other trend
        • breakout20
        • assettrend4
        • normmom4
        • momentum8
    • Speed 3 (12 < turnover < 22)
      • acceleration32
      • relmomentum40
      • other trend
        • breakout40
        • assettrend8
        • normmom8
        • momentum16
    • Speed 4 (7 < turnover < 12)
      • acceleration64
      • relmomentum80
      • other trend
        • breakout80
        • assettrend16
        • normmom16
        • momentum32
    • Speed 5 (4 < turnover < 7)
      • other trend
        • breakout160
        • assettrend32
        • normmom32
        • momentum64
    • Speed 6 (slowest: turnover < 4)
      • other trend
        • breakout320
        • assettrend64
        • normmom64

As you can see I (roughly) used turnovers to group the divergent rules. These groupings aren't quite right, but I thought it better to go for some nice neat sequences, and it doesn't exactly follow how the clustering above works either. But this would certainly be a better way of doing things than grouping entirely by trading rule, which as we've seen doesn't make sense for divergent rules, where speed is more important.
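Purely to illustrate the mechanics, here's how a hierarchy like the one above could be encoded and turned into weights with a naive equal split at each level. The real handcrafting method also adjusts for within-group correlations, costs and so on; the structure and names here are just for illustration, not my production configuration.

# Illustrative only: encode (part of) the proposed hierarchy as nested dicts,
# then cascade weights down by splitting equally at each level
example_hierarchy = {
    "convergent": {
        "mean_reversion": ["mrinasset160"],
        "skew": ["skewabs365", "skewabs180", "skewrv365", "skewrv180"],
        "carry": {
            "outright": ["carry10", "carry30", "carry60", "carry125"],
            "relative": ["relcarry"],
        },
    },
    "divergent": {
        "speed1": {
            "relmomentum": ["relmomentum10"],
            "other_trend": ["breakout10", "assettrend2", "normmom2", "momentum4"],
        },
        # ... speeds 2 to 6 follow the same pattern
    },
}

def cascade_equal_weights(node, weight=1.0) -> dict:
    # Split the weight equally among the children at each level of the tree
    if isinstance(node, list):
        return {rule: weight / len(node) for rule in node}
    weights = {}
    for child in node.values():
        weights.update(cascade_equal_weights(child, weight / len(node)))
    return weights

# e.g. cascade_equal_weights(example_hierarchy) gives each top-level group 50%,
# then splits that 50% equally among its sub-groups, and so on down to the rules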

Now of course there are a lot of caveats with this. First of all, it's entirely in sample; but given how stable and persistent the correlations of trading rule returns are over time, we'd probably get very similar results with a purely backward looking approach. Secondly, we're ignoring things like costs, and the possibility that some rules may do better than others, but we can deal with that when we actually use the above structure to set forecast weights.


7 comments:

  1. Hi Rob, I take it the splitting of 'speeds' into several groups is to at least partially mitigate the need for truncating & resampling returns data, as it concerns stability / spurious correlations? Did you research corr stability per some iteration of returns frequency intra- or inter-group, to set the upper & lower bounds of frequency per-group?

    Reply: Handcrafting generally is a technique for ensuring robustness, but the stability of correlations of trading rules is actually much better than that of the underlying instruments.

  2. Hi Rob, thanks as always for providing such excellent insight into your processes. It's greatly appreciated. Sorry if I'm slow on the uptake here, but does this clustering exercise factor into your forecast weights for pysystemtrade, or are those determined and updated periodically by the generic optimiser with the expanding window methodology?

    Reply: Yes it does - I actually fit my weights with 'manual' handcrafting, and I used this to determine the groupings.

  3. Hello Rob, thank you for another fascinating article. Question: how do you get from the above groupings to your final weights for each instrument? Do you somehow feed the groups into Pysys and let it do the work, or do you do the remainder manually also? By remainder I mean dropping variations that are too expensive for a particular instrument and tilting the allocations to take account of performance. While you might assume equal after-cost performance when allocating instrument weights, I am figuring you will want to down weight losing rules when allocating forecasts.

    Reply: "Do you somehow feed the groups into Pysys and let it do the work" - yes that's right. https://github.com/robcarver17/pysystemtrade/issues/1162 But I don't take account of performance.

    Reply: It may be a good idea to start with clusters as above, then apply a cost penalty, then possibly a light performance benefit.

