Tuesday 18 January 2022

Clustering and correlations

Happy new year!

A very quick post from me this month - I'm trying to get ready for teaching next week and also cracking on with my latest book. On the Systematic Trader podcast I recently discussed using a clustering algorithm to group instruments.

Using my software, pysystemtrade, I can get a correlation matrix of asset returns and cluster it quite easily. The correlation matrix I'm using is as of now (January 2022), and is estimated with a ~6 month halflife.
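For readers without pysystemtrade to hand, here is a minimal sketch of one way to get an exponentially weighted correlation matrix with pandas. It is not the pysystemtrade code: the function name, the 125 business day halflife (roughly six months) and the synthetic returns are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def latest_ew_correlation(returns: pd.DataFrame, halflife: float = 125.0) -> pd.DataFrame:
    """Most recent exponentially weighted correlation matrix.

    `halflife` is measured in rows; ~125 business days is roughly 6 months.
    """
    # Pairwise EW correlations: rows are (date, instrument), columns are instruments
    all_corr = returns.ewm(halflife=halflife, min_periods=20).corr(pairwise=True)
    # Keep only the block for the final date
    return all_corr.xs(returns.index[-1], level=0)


# Synthetic daily returns for three made-up instruments (illustration only)
rng = np.random.default_rng(0)
common = rng.normal(size=500)
dates = pd.bdate_range("2020-01-01", periods=500)
returns = pd.DataFrame(
    {
        "A": common + 0.1 * rng.normal(size=500),
        "B": common + 0.1 * rng.normal(size=500),
        "C": rng.normal(size=500),
    },
    index=dates,
)
corr_matrix = latest_ew_correlation(returns)
print(corr_matrix.round(2))
```

A and B share a common driver so they should come out highly correlated, while C is independent noise.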

So all I did was run this for different group sizes, and see what instruments were in each group.

All the code for this post is here and the clustering code is here - it uses scipy.cluster.
If you want to use this code yourself without the full drama of pysystemtrade, just use the function get_list_of_clusters_for_non_boring_correlation_matrix and pass it a numpy array correlation matrix.


With N=2

clusters = cluster_correlation_matrix(corr_matrix,2)
Cluster 1
Equity: ['AEX', 'BEL20', 'BOVESPA', .... 'SP500_micro', 'TOPIX']
OilGas: ['BRENT-LAST', 'CRUDE_W_mini', 'GASOILINE', 'HEATOIL']
Bond: ['BTP', 'BTP3']
Ags: ['BBCOMM']
FX: ['AUD', 'BRE', 'CAD', 'GBP', 'INR', 'IRS', 'KRWUSD', 'MXP', 
     'NOK', 'NZD', 'RUR', 'YENEUR']
Metals: ['COPPER', 'PALLAD', 'PLAT']
Sector: ['EU-AUTO', 'EU-BANKS', 'EU-BASIC',... 'US-TECH', 'US-UTILS']

Cluster 2
Equity: ['FTSEINDO']
OilGas: ['ETHANOL', 'GAS-LAST', 'GAS_US_mini']
Ags: ['BUTTER', 'CHEESE', 'CORN', .... 'SOYOIL', 'WHEAT']
Bond: ['BOBL', 'BONO', 'BUND', 'BUXL', 'CH10', ....'USIRS5', 'USIRS5ERIS']
FX: ['CHF', 'CNH', 'CZK', 'EUR', 'EURCHF', 'GBPEUR', 'JPY', 'SEK', 'SGD']
Vol: ['V2X', 'VIX', 'VNKI']
Metals: ['ALUMINIUM', 'BITCOIN', 'ETHEREUM', 'GOLD_micro', 'IRON', 'SILVER']

OK, this is interesting. The first cluster is about 58% of the total and contains equities, Italian bonds, a few random metals, and some FX markets which are mostly pretty dodgy EM stuff (yes, including GBPUSD!). Let's call this the risk on cluster.

Cluster two contains most Ags markets and most of the rest of the bond markets. It also contains vol (which remember is a safe haven - if VIX goes up then it's because the world is going to hell). It seems reasonable that this is mostly a risk off cluster.

There are some weird oddities - crypto risk off, really? And the FX split is a bit weird in places - JPY risk off, yes; CHF yes agreed, but CZK... ?


With N=3


N=3
clusters = cluster_correlation_matrix(corr_matrix,N)
display_clusters(system, clusters)
Cluster 1
Equity: ['AEX', 'BEL20', .... 'SP400', 'SP500_micro', 'TOPIX']
OilGas: ['BRENT-LAST', 'CRUDE_W_mini', 'GASOILINE', 'HEATOIL']
Bond: ['BTP', 'BTP3']
Ags: ['BBCOMM']
FX: ['AUD', 'BRE', 'CAD', .... 'RUR', 'YENEUR']
Metals: ['COPPER', 'PALLAD', 'PLAT']
Sector: ['EU-AUTO', 'EU-BANKS', 'EU-BASIC', ... 'US-TECH', 'US-UTILS']

Cluster 2
Vol: ['V2X', 'VIX', 'VNKI']
FX: ['CNH', 'SGD']

Cluster 3
Equity: ['FTSEINDO']
OilGas: ['ETHANOL', 'GAS-LAST', 'GAS_US_mini']
Ags: ['BUTTER', 'CHEESE', ... 'SOYMEAL', 'SOYOIL', 'WHEAT']
Bond: ['BOBL', 'BONO', .... 'USIRS5', 'USIRS5ERIS']
FX: ['CHF', 'CZK', 'EUR', 'EURCHF', 'GBPEUR', 'JPY', 'SEK']
Metals: ['ALUMINIUM', 'BITCOIN', 'ETHEREUM', 'GOLD_micro', 'IRON', 'SILVER']

Cluster 1 is now risk on, cluster 3 is risk off. We also have a new cluster, containing the vol markets plus - randomly - a couple of FX markets. These were originally in cluster 2.

With N=4

N=4
clusters = cluster_correlation_matrix(corr_matrix,N)
display_clusters(system, clusters)

Cluster 1
Equity: ['AEX', 'BEL20', ... 'SP500_micro', 'TOPIX']
OilGas: ['BRENT-LAST', 'CRUDE_W_mini', 'GASOILINE', 'HEATOIL']
Bond: ['BTP', 'BTP3']
Ags: ['BBCOMM']
FX: ['AUD', 'BRE', ..., 'NZD', 'RUR', 'YENEUR']
Metals: ['COPPER', 'PALLAD', 'PLAT']
Sector: ['EU-AUTO', 'EU-BANKS', .... 'US-TECH', 'US-UTILS']

Cluster 2
Vol: ['V2X', 'VIX', 'VNKI']
FX: ['CNH', 'SGD']

Cluster 3
Equity: ['FTSEINDO']
OilGas: ['ETHANOL', 'GAS-LAST', 'GAS_US_mini']
Ags: ['BUTTER', 'CHEESE', 'CORN',.... 'SOYMEAL', 'SOYOIL', 'WHEAT']
FX: ['CHF', 'CZK', 'EUR', 'EURCHF', 'SEK']
Metals: ['ALUMINIUM', 'BITCOIN', 'ETHEREUM', 'GOLD_micro', 'IRON', 'SILVER']

Cluster 4
Bond: ['BOBL', 'BONO',.... 'USIRS5', 'USIRS5ERIS']
FX: ['GBPEUR', 'JPY']

Clusters 1 (risk on) and 2 (vol + Asian FX) are unchanged, but the risk off cluster has now split into cluster 4 (most of the bonds, and therefore very much still risk off) and cluster 3 (which is now looking more like a generic commodities cluster).


With N=5

Cluster 1
Equity: ['AEX', 'BEL20', 'CAC', 'DAX', .... 'SP400', 'SP500_micro', 'TOPIX']
Sector: ['EU-AUTO', 'EU-BANKS', ... 'US-STAPLES', 'US-TECH']

Cluster 2
Equity: ['BOVESPA', 'FTSECHINAA', 'FTSECHINAH', 'FTSETAIWAN', 'KOSDAQ', 'MSCIASIA']
OilGas: ['BRENT-LAST', 'CRUDE_W_mini', 'GASOILINE', 'HEATOIL']
Bond: ['BTP', 'BTP3']
Ags: ['BBCOMM']
FX: ['AUD', 'BRE', 'CAD', ... 'RUR', 'YENEUR']
Metals: ['COPPER', 'PALLAD', 'PLAT']
Sector: ['JP-REALESTATE', 'US-PROPERTY', 'US-REALESTATE', 'US-UTILS']

Cluster 3
Vol: ['V2X', 'VIX', 'VNKI']
FX: ['CNH', 'SGD']

Cluster 4
Equity: ['FTSEINDO']
OilGas: ['ETHANOL', 'GAS-LAST', 'GAS_US_mini']
Ags: ['BUTTER', 'CHEESE', 'CORN',.... 'SOYOIL', 'WHEAT']
FX: ['CHF', 'CZK', 'EUR', 'EURCHF', 'SEK']
Metals: ['ALUMINIUM', 'BITCOIN', 'ETHEREUM', 'GOLD_micro', 'IRON', 'SILVER']

Cluster 5
Bond: ['BOBL', 'BONO', ... 'USIRS10', 'USIRS5', 'USIRS5ERIS']
FX: ['GBPEUR', 'JPY']


Unchanged from before are cluster 5 (mostly bonds / risk off), cluster 4 (random commodities), and cluster 3 (vol + odd FX). But the giant risk on cluster has split. Cluster 1 is now entirely composed of equities (plus sectors). Cluster 2 has most of the Asian equities, plus the other sort of risk on'y things: Italian bonds, most FX, some random (low beta?) sectors, a few metals.


With N=6

Existing:
Cluster 1: Core equities risk on
Equity: ['AEX', 'BEL20',... 'SP500_micro', 'TOPIX']
Sector: ['EU-AUTO', 'EU-BANKS', ... 'US-STAPLES', 'US-TECH']

Cluster 2: Random risk on
Equity: ['BOVESPA', 'FTSECHINAA', 'FTSECHINAH', 'FTSETAIWAN', 'KOSDAQ', 'MSCIASIA']
OilGas: ['BRENT-LAST', 'CRUDE_W_mini', 'GASOILINE', 'HEATOIL']
Bond: ['BTP', 'BTP3']
Ags: ['BBCOMM']
FX: ['AUD', 'BRE', 'CAD', ... 'NOK', 'NZD', 'RUR', 'YENEUR']
Metals: ['COPPER', 'PALLAD', 'PLAT']
Sector: ['JP-REALESTATE', 'US-PROPERTY', 'US-REALESTATE', 'US-UTILS']

Cluster 3: Vol + Asian FX
Vol: ['V2X', 'VIX', 'VNKI']
FX: ['CNH', 'SGD']

Cluster 4: Random commodities
Equity: ['FTSEINDO']
OilGas: ['ETHANOL', 'GAS-LAST', 'GAS_US_mini']
Ags: ['BUTTER', 'CHEESE', ... 'SOYMEAL', 'SOYOIL', 'WHEAT']
FX: ['CHF', 'CZK', 'EUR', 'EURCHF', 'SEK']
Metals: ['ALUMINIUM', 'BITCOIN', 'ETHEREUM', 'GOLD_micro', 'IRON', 'SILVER']

New clusters:
Cluster 5: Non US and German bonds, that aren't Italian
Bond: ['BONO', 'CH10', 'JGB', 'KR10', 'KR3', 'OAT']

Cluster 6: US and German Bonds plus odd FX
Bond: ['BOBL', 'BUND', ... 'USIRS5', 'USIRS5ERIS']
FX: ['GBPEUR', 'JPY']

With N=7

New clusters:
Cluster 2: Risk on randomness mostly FX
Metals: ['PALLAD', 'PLAT']
FX: ['AUD', 'BRE', 'CAD', 'GBP', 'INR', 'IRS', 'MXP', 'NOK', 'NZD', 'RUR']
Equity: ['BOVESPA']
Sector: ['JP-REALESTATE', 'US-PROPERTY', 'US-REALESTATE', 'US-UTILS']

Cluster 3: Risk on randomness Asian equity, oil, italian bonds...
Equity: ['FTSECHINAA', 'FTSECHINAH', 'FTSETAIWAN', 'KOSDAQ', 'MSCIASIA']
OilGas: ['BRENT-LAST', 'CRUDE_W_mini', 'GASOILINE', 'HEATOIL']
Ags: ['BBCOMM']
Bond: ['BTP', 'BTP3']
FX: ['KRWUSD', 'YENEUR']
Metals: ['COPPER']


With N=8


New clusters:
Cluster 4: US/EU vol
Vol: ['V2X', 'VIX']

Cluster 5: JP Vol, Asian FX
Vol: ['VNKI']
FX: ['CNH', 'SGD']

With N=9

New clusters:

Cluster 8: German and US bonds
Bond: ['BOBL', 'BUND', 'BUXL', 'EDOLLAR', 'SHATZ', 'US10', 'US10U', 'US2', 'US20', 'US3', 'US30', 'US5', 'USIRS10', 'USIRS5', 'USIRS5ERIS']
FX: ['JPY']

Cluster 9: GBPEUR
FX: ['GBPEUR']

With N=10 (and a full list of instruments in each cluster)

Cluster 1: Core equity risk on
Equity: ['AEX', 'BEL20', 'CAC', 'DAX', 'DJSTX-SMALL',
'DOW', 'EU-DIV30', 'EURO600', 'EUROSTX', 'EUROSTX-LARGE',
'EUROSTX-SMALL', 'KOSPI', 'MSCISING', 'MUMMY',
'NASDAQ_micro', 'NIFTY', 'NIKKEI', 'NIKKEI400',
'OMX', 'R1000', 'RUSSELL', 'SMI', 'SMI-MID',
'SP400', 'SP500_micro', 'TOPIX']
Sector: ['EU-AUTO', 'EU-BANKS', 'EU-BASIC', 'EU-CHEM',
'EU-CONSTRUCTION', 'EU-DJ-TELECOM', 'EU-FOOD',
'EU-HEALTH', 'EU-HOUSE', 'EU-INSURE', 'EU-MEDIA',
'EU-MID', 'EU-OIL', 'EU-REALESTATE', 'EU-RETAIL',
'EU-TECH', 'EU-TRAVEL', 'EU-UTILS', 'EUROSTX200-LARGE',
'US-DISCRETE', 'US-ENERGY', 'US-FINANCE', 'US-HEALTH',
'US-INDUSTRY', 'US-MATERIAL', 'US-STAPLES', 'US-TECH']

Cluster 2: Risk on mostly FX
Metals: ['PALLAD', 'PLAT']
FX: ['AUD', 'BRE', 'CAD', 'GBP', 'INR', 'IRS', 'MXP', 'NOK', 'NZD', 'RUR']
Equity: ['BOVESPA']
Sector: ['JP-REALESTATE', 'US-PROPERTY', 'US-REALESTATE', 'US-UTILS']

Cluster 3 (new): Italian bonds, Asian equity and FX
Bond: ['BTP', 'BTP3']
FX: ['KRWUSD']
Equity: ['FTSETAIWAN', 'KOSDAQ']

Cluster 4 (new): Asian equities, FX, Some energies
Equity: ['FTSECHINAA', 'FTSECHINAH', 'MSCIASIA']
OilGas: ['BRENT-LAST', 'CRUDE_W_mini', 'GASOILINE', 'HEATOIL']
Ags: ['BBCOMM']
FX: ['YENEUR']
Metals: ['COPPER']

Cluster 5: US/EU vol
Vol: ['V2X', 'VIX']

Cluster 6: Asian vol, Asian FX
Vol: ['VNKI']
FX: ['CNH', 'SGD']

Cluster 7: Random commodities, risk off
Equity: ['FTSEINDO']
OilGas: ['ETHANOL', 'GAS-LAST', 'GAS_US_mini']
Ags: ['BUTTER', 'CHEESE', 'CORN', 'COTTON', 'FEEDCOW',
'LEANHOG', 'LIVECOW', 'LUMBER', 'MILK', 'MILKDRY',
'MILKWET', 'OATIES', 'REDWHEAT', 'RICE', 'SOYBEAN',
'SOYMEAL', 'SOYOIL', 'WHEAT']
FX: ['CHF', 'CZK', 'EUR', 'EURCHF', 'SEK']
Metals: ['ALUMINIUM', 'BITCOIN', 'ETHEREUM', 'GOLD_micro', 'IRON', 'SILVER']

Cluster 8: Other bonds
Bond: ['BONO', 'CH10', 'JGB', 'KR10', 'KR3', 'OAT']

Cluster 9: German and US bonds
Bond: ['BOBL', 'BUND', 'BUXL', 'EDOLLAR', 'SHATZ',
'US10', 'US10U', 'US2', 'US20', 'US3', 'US30',
'US5', 'USIRS10', 'USIRS5', 'USIRS5ERIS']
FX: ['JPY']

Cluster 10: GBPEUR
FX: ['GBPEUR']

As a hierarchy

  • Risk on
    • Core equities (cluster 1)
    • Mostly FX (cluster 2)
    • Random risk on commodities 
      • Italian bonds, Asian equity and FX (cluster 3)
      • Asian equities, FX, some energies (cluster 4)
  • Risk off
    • Generic risk off commodities (cluster 7)
    • Bonds + flight to safety FX
      • US and German bonds (clusters 9 and 10)
      • Non US/German/Italian bonds (cluster 8)
    • Vol + Asian FX
      • US/EU vol (cluster 5)
      • Asian vol, FX (cluster 6)
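The reason the groups can be laid out as a tree like this is that, with an agglomerative method, going from N to N+1 clusters only ever splits one existing cluster. Here is a small sketch that checks this on synthetic data using scipy's cut_tree; the six made-up instruments, the factor loadings, the 1 - correlation distance and average linkage are all assumptions for illustration rather than what pysystemtrade does.

```python
import numpy as np
from scipy.cluster.hierarchy import cut_tree, linkage
from scipy.spatial.distance import squareform

# Synthetic returns for six hypothetical instruments: two "factors" plus noise
rng = np.random.default_rng(42)
factors = rng.normal(size=(1000, 2))
loadings = np.array([[1, 0], [1, 0], [1, 0.3], [0, 1], [0, 1], [0.3, 1]])
returns = factors @ loadings.T + 0.5 * rng.normal(size=(1000, 6))
corr = np.corrcoef(returns, rowvar=False)

# One tree, built once from a correlation-based distance
dist = 1.0 - corr
np.fill_diagonal(dist, 0.0)
tree = linkage(squareform(dist, checks=False), method="average")

# Cut the same tree at several values of N: one column of labels per N
sizes = [2, 3, 4]
labels = cut_tree(tree, n_clusters=sizes)


def is_nested(coarse, fine):
    """True if every fine cluster sits inside exactly one coarse cluster."""
    return all(len(set(coarse[fine == k])) == 1 for k in set(fine))


for col, n in enumerate(sizes[1:], start=1):
    print(f"N={n} nests inside N={sizes[col - 1]}:",
          is_nested(labels[:, col - 1], labels[:, col]))
```

Cutting one tree at several levels is also cheaper than re-clustering from scratch for every N.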


Conclusion

Well that was an interesting exercise, which just goes to show that the traditional grouping of asset classes may not make as much sense as you think. 

To reiterate, these groupings will change over time. 

5 comments:

  1. Do you plan to replace the handcrafting method with these clusters?

    Replies
    1. Actually this code is the same code used by the handcrafting method. The difference is the correlations used here are for underlying instrument returns, whereas normally when optimising instrument weights I'm using the correlations of trading sub-strategies for each instrument. Not sure the results would be that different though.

  2. For this exercise it's effectively just long, so I didn't do anything - negative is fine.

    For trading substrategy returns I don't take the absolute value, but I do floor at zero (here's the critical line https://github.com/robcarver17/pysystemtrade/blob/master/sysdata/config/defaults.yaml#L186)

    I'd expect the correlation of trading substrategy returns to be (a) lower in magnitude than underlying instruments and (b) all positive. And if they aren't positive, it makes sense to set them to zero rather than -(-big number) = +big number.

  3. Hi Robert, one question for you. How is this exercise if we have several trading sessions "Flat/not in the market". Is there a way to reward/punish a strategy for being out of the market vs others? As always, thank you for your incredible work.

    Replies
    1. Do you mean in terms of performance evaluation?

