## Tuesday 18 January 2022

### Clustering and correlations

Happy new year!

A very quick post from me this month - I'm trying to get ready for teaching next week and also cracking on with my latest book. On the Systematic Trader podcast I recently discussed using a clustering algorithm to group instruments.

Using my software, pysystemtrade, I can get a correlation matrix of asset returns and cluster it quite easily. The correlation I'm using is as of now (January 2022) and has a ~6 month halflife.
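As a rough illustration of what a correlation estimate with a ~6 month halflife looks like, here is a minimal pandas sketch. It is not the pysystemtrade implementation: the function name `ew_correlation_sketch`, the 125 business day halflife, the use of daily returns and the made-up data are all assumptions for the example.

```python
import numpy as np
import pandas as pd


def ew_correlation_sketch(returns: pd.DataFrame, halflife_days: int = 125) -> pd.DataFrame:
    """Most recent exponentially weighted correlation matrix of the columns.

    halflife_days=125 is an assumed stand-in for "about six months" of
    business days; it is not taken from pysystemtrade.
    """
    # Pairwise EW correlations for every date, indexed by (date, instrument)
    ew_corr = returns.ewm(halflife=halflife_days, min_periods=20).corr()
    # Keep only the matrix for the final date, i.e. the estimate "as of now"
    return ew_corr.loc[returns.index[-1]]


# Made-up data: A and B share a common factor, C is independent noise
rng = np.random.default_rng(0)
dates = pd.bdate_range("2021-01-01", periods=300)
factor = rng.normal(size=300)
example_returns = pd.DataFrame(
    {
        "A": factor + 0.3 * rng.normal(size=300),
        "B": factor + 0.3 * rng.normal(size=300),
        "C": rng.normal(size=300),
    },
    index=dates,
)
corr_now = ew_correlation_sketch(example_returns)
print(corr_now.round(2))
```

A shorter halflife makes the estimate react faster to recent moves, at the cost of more noise.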

So all I did was run this for different group sizes, and see what instruments were in each group.

All the code for this post is here and the clustering code is here - it uses scipy.cluster.
If you want to use this code yourself without the full drama of pysystemtrade, just use the function get_list_of_clusters_for_non_boring_correlation_matrix and pass it a numpy array correlation matrix.

## With N=2

`clusters = cluster_correlation_matrix(corr_matrix,2)`
```
Cluster 1
Equity: ['AEX', 'BEL20', 'BOVESPA', .... 'SP500_micro', 'TOPIX']
OilGas: ['BRENT-LAST', 'CRUDE_W_mini', 'GASOILINE', 'HEATOIL']
Bond: ['BTP', 'BTP3']
Ags: ['BBCOMM']
FX: ['AUD', 'BRE', 'CAD', 'GBP', 'INR', 'IRS', 'KRWUSD', 'MXP', 'NOK', 'NZD', 'RUR', 'YENEUR']
Sector: ['EU-AUTO', 'EU-BANKS', 'EU-BASIC',... 'US-TECH', 'US-UTILS']
Cluster 2
Equity: ['FTSEINDO']
OilGas: ['ETHANOL', 'GAS-LAST', 'GAS_US_mini']
Ags: ['BUTTER', 'CHEESE', 'CORN', .... 'SOYOIL', 'WHEAT']
Bond: ['BOBL', 'BONO', 'BUND', 'BUXL', 'CH10', ....'USIRS5', 'USIRS5ERIS']
FX: ['CHF', 'CNH', 'CZK', 'EUR', 'EURCHF', 'GBPEUR', 'JPY', 'SEK', 'SGD']
Vol: ['V2X', 'VIX', 'VNKI']
Metals: ['ALUMINIUM', 'BITCOIN', 'ETHEREUM', 'GOLD_micro', 'IRON', 'SILVER']
```
OK, this is interesting. The first cluster is about 58% of the total and contains equities, Italian bonds, a few random metals, and some FX markets, which are mostly pretty dodgy EM stuff (yes, including GBPUSD!). Let's call this the risk on cluster.

Cluster two contains most Ags markets and most of the rest of the bond markets. It also contains vol (which, remember, is a safe haven - if VIX goes up then it's because the world is going to hell). It seems reasonable that this is mostly a risk off cluster.

There are some weird oddities - crypto risk off, really? And the FX split is a bit weird in places - JPY risk off, yes; CHF yes agreed, but CZK... ?

## With N=3

```
N=3
clusters = cluster_correlation_matrix(corr_matrix,N)
display_clusters(system, clusters)
```
```
Cluster 1
Equity: ['AEX', 'BEL20', .... 'SP400', 'SP500_micro', 'TOPIX']
OilGas: ['BRENT-LAST', 'CRUDE_W_mini', 'GASOILINE', 'HEATOIL']
Bond: ['BTP', 'BTP3']
Ags: ['BBCOMM']
FX: ['AUD', 'BRE', 'CAD', .... 'RUR', 'YENEUR']
Metals: ['COPPER', 'PALLAD', 'PLAT']
Sector: ['EU-AUTO', 'EU-BANKS', 'EU-BASIC', ... 'US-TECH', 'US-UTILS']
Cluster 2
Vol: ['V2X', 'VIX', 'VNKI']
FX: ['CNH', 'SGD']
Cluster 3
Equity: ['FTSEINDO']
OilGas: ['ETHANOL', 'GAS-LAST', 'GAS_US_mini']
Ags: ['BUTTER', 'CHEESE', ... 'SOYMEAL', 'SOYOIL', 'WHEAT']
Bond: ['BOBL', 'BONO', .... 'USIRS5', 'USIRS5ERIS']
FX: ['CHF', 'CZK', 'EUR', 'EURCHF', 'GBPEUR', 'JPY', 'SEK']
Metals: ['ALUMINIUM', 'BITCOIN', 'ETHEREUM', 'GOLD_micro', 'IRON', 'SILVER']
```

Cluster 1 is still risk on, and the risk off cluster is now cluster 3. The new cluster 2 contains the vol markets plus - randomly - a couple of FX markets. These were all in the risk off cluster when N=2.

## With N=4

```
N=4
clusters = cluster_correlation_matrix(corr_matrix,N)
display_clusters(system, clusters)
```
```
Cluster 1
Equity: ['AEX', 'BEL20', ... 'SP500_micro', 'TOPIX']
OilGas: ['BRENT-LAST', 'CRUDE_W_mini', 'GASOILINE', 'HEATOIL']
Bond: ['BTP', 'BTP3']
Ags: ['BBCOMM']
FX: ['AUD', 'BRE', ..., 'NZD', 'RUR', 'YENEUR']
Metals: ['COPPER', 'PALLAD', 'PLAT']
Sector: ['EU-AUTO', 'EU-BANKS', .... 'US-TECH', 'US-UTILS']
Cluster 2
Vol: ['V2X', 'VIX', 'VNKI']
FX: ['CNH', 'SGD']
Cluster 3
Equity: ['FTSEINDO']
OilGas: ['ETHANOL', 'GAS-LAST', 'GAS_US_mini']
Ags: ['BUTTER', 'CHEESE', 'CORN',.... 'SOYMEAL', 'SOYOIL', 'WHEAT']
FX: ['CHF', 'CZK', 'EUR', 'EURCHF', 'SEK']
Metals: ['ALUMINIUM', 'BITCOIN', 'ETHEREUM', 'GOLD_micro', 'IRON', 'SILVER']
Cluster 4
Bond: ['BOBL', 'BONO',.... 'USIRS5', 'USIRS5ERIS']
FX: ['GBPEUR', 'JPY']
```
Clusters 1 (risk on) and 2 (vol + Asian FX) are unchanged, but the risk off cluster has now split into cluster 4 (most of the bonds, and therefore very much still risk off) and cluster 3 (which is now looking more like a generic commodities cluster).

## With N=5

```
Cluster 1
Equity: ['AEX', 'BEL20', 'CAC', 'DAX', .... 'SP400', 'SP500_micro', 'TOPIX']
Sector: ['EU-AUTO', 'EU-BANKS', ... 'US-STAPLES', 'US-TECH']
Cluster 2
Equity: ['BOVESPA', 'FTSECHINAA', 'FTSECHINAH', 'FTSETAIWAN', 'KOSDAQ', 'MSCIASIA']
OilGas: ['BRENT-LAST', 'CRUDE_W_mini', 'GASOILINE', 'HEATOIL']
Bond: ['BTP', 'BTP3']
Ags: ['BBCOMM']
FX: ['AUD', 'BRE', 'CAD', ... 'RUR', 'YENEUR']
Metals: ['COPPER', 'PALLAD', 'PLAT']
Sector: ['JP-REALESTATE', 'US-PROPERTY', 'US-REALESTATE', 'US-UTILS']
Cluster 3
Vol: ['V2X', 'VIX', 'VNKI']
FX: ['CNH', 'SGD']
Cluster 4
Equity: ['FTSEINDO']
OilGas: ['ETHANOL', 'GAS-LAST', 'GAS_US_mini']
Ags: ['BUTTER', 'CHEESE', 'CORN',.... 'SOYOIL', 'WHEAT']
FX: ['CHF', 'CZK', 'EUR', 'EURCHF', 'SEK']
Metals: ['ALUMINIUM', 'BITCOIN', 'ETHEREUM', 'GOLD_micro', 'IRON', 'SILVER']
Cluster 5
Bond: ['BOBL', 'BONO', ... 'USIRS10', 'USIRS5', 'USIRS5ERIS']
FX: ['GBPEUR', 'JPY']
```

Unchanged from before are cluster 5 (mostly bonds / risk off), cluster 4 (random commodities) and cluster 3 (vol + odd FX). But the giant risk on cluster has split. Cluster 1 is now entirely composed of equities (plus sectors). Cluster 2 has most of the Asian equities, plus the other sort of risk on'y things: Italian bonds, most FX, some random (low beta?) sectors, a few metals.

## With N=6

```
Existing:
Cluster 1: Core equities risk on
Equity: ['AEX', 'BEL20',... 'SP500_micro', 'TOPIX']
Sector: ['EU-AUTO', 'EU-BANKS', ... 'US-STAPLES', 'US-TECH']
Cluster 2: Random risk on
Equity: ['BOVESPA', 'FTSECHINAA', 'FTSECHINAH', 'FTSETAIWAN', 'KOSDAQ', 'MSCIASIA']
OilGas: ['BRENT-LAST', 'CRUDE_W_mini', 'GASOILINE', 'HEATOIL']
Bond: ['BTP', 'BTP3']
Ags: ['BBCOMM']
FX: ['AUD', 'BRE', 'CAD', ... 'NOK', 'NZD', 'RUR', 'YENEUR']
Metals: ['COPPER', 'PALLAD', 'PLAT']
Sector: ['JP-REALESTATE', 'US-PROPERTY', 'US-REALESTATE', 'US-UTILS']
Cluster 3: Vol + Asian FX
Vol: ['V2X', 'VIX', 'VNKI']
FX: ['CNH', 'SGD']
Cluster 4: Random commodities
Equity: ['FTSEINDO']
OilGas: ['ETHANOL', 'GAS-LAST', 'GAS_US_mini']
Ags: ['BUTTER', 'CHEESE', ... 'SOYMEAL', 'SOYOIL', 'WHEAT']
FX: ['CHF', 'CZK', 'EUR', 'EURCHF', 'SEK']
Metals: ['ALUMINIUM', 'BITCOIN', 'ETHEREUM', 'GOLD_micro', 'IRON', 'SILVER']

New clusters:
Cluster 5: Non US and German bonds, that aren't Italian
Bond: ['BONO', 'CH10', 'JGB', 'KR10', 'KR3', 'OAT']
Cluster 6: US and German Bonds plus odd FX
Bond: ['BOBL', 'BUND', ... 'USIRS5', 'USIRS5ERIS']
FX: ['GBPEUR', 'JPY']
```

## With N=7

```
New clusters:
Cluster 2: Risk on randomness mostly FX
Metals: ['PALLAD', 'PLAT']
FX: ['AUD', 'BRE', 'CAD', 'GBP', 'INR', 'IRS', 'MXP', 'NOK', 'NZD', 'RUR']
Equity: ['BOVESPA']
Sector: ['JP-REALESTATE', 'US-PROPERTY', 'US-REALESTATE', 'US-UTILS']
Cluster 3: Risk on randomness Asian equity, oil, italian bonds...
Equity: ['FTSECHINAA', 'FTSECHINAH', 'FTSETAIWAN', 'KOSDAQ', 'MSCIASIA']
OilGas: ['BRENT-LAST', 'CRUDE_W_mini', 'GASOILINE', 'HEATOIL']
Ags: ['BBCOMM']
Bond: ['BTP', 'BTP3']
FX: ['KRWUSD', 'YENEUR']
Metals: ['COPPER']
```

## With N=8

```
New clusters:
Cluster 4: US/EU vol
Vol: ['V2X', 'VIX']
Cluster 5: JP Vol, Asian FX
Vol: ['VNKI']
FX: ['CNH', 'SGD']
```

## With N=9

```
New clusters:
Cluster 8: German and US bonds
Bond: ['BOBL', 'BUND', 'BUXL', 'EDOLLAR', 'SHATZ', 'US10', 'US10U', 'US2', 'US20', 'US3', 'US30', 'US5', 'USIRS10', 'USIRS5', 'USIRS5ERIS']
FX: ['JPY']
Cluster 9: GBPEUR
FX: ['GBPEUR']
```

## With N=10 (and a full list of instruments in each cluster)

```
Cluster 1: Core equity risk on
Equity: ['AEX', 'BEL20', 'CAC', 'DAX', 'DJSTX-SMALL',
         'DOW', 'EU-DIV30', 'EURO600', 'EUROSTX', 'EUROSTX-LARGE',
         'EUROSTX-SMALL', 'KOSPI', 'MSCISING', 'MUMMY',
         'NASDAQ_micro', 'NIFTY', 'NIKKEI', 'NIKKEI400',
         'OMX', 'R1000', 'RUSSELL', 'SMI', 'SMI-MID',
         'SP400', 'SP500_micro', 'TOPIX']
Sector: ['EU-AUTO', 'EU-BANKS', 'EU-BASIC', 'EU-CHEM',
         'EU-CONSTRUCTION', 'EU-DJ-TELECOM', 'EU-FOOD',
         'EU-HEALTH', 'EU-HOUSE', 'EU-INSURE', 'EU-MEDIA',
         'EU-MID', 'EU-OIL', 'EU-REALESTATE', 'EU-RETAIL',
         'EU-TECH', 'EU-TRAVEL', 'EU-UTILS', 'EUROSTX200-LARGE',
         'US-DISCRETE', 'US-ENERGY', 'US-FINANCE', 'US-HEALTH',
         'US-INDUSTRY', 'US-MATERIAL', 'US-STAPLES', 'US-TECH']

Cluster 2: Risk on mostly FX
Metals: ['PALLAD', 'PLAT']
FX: ['AUD', 'BRE', 'CAD', 'GBP', 'INR', 'IRS', 'MXP', 'NOK', 'NZD', 'RUR']
Equity: ['BOVESPA']
Sector: ['JP-REALESTATE', 'US-PROPERTY', 'US-REALESTATE', 'US-UTILS']

Cluster 3 (new): Italian bonds, Asian equity and FX
Bond: ['BTP', 'BTP3']
FX: ['KRWUSD']
Equity: ['FTSETAIWAN', 'KOSDAQ']

Cluster 4 (new): Asian equities, FX, Some energies
Equity: ['FTSECHINAA', 'FTSECHINAH', 'MSCIASIA']
OilGas: ['BRENT-LAST', 'CRUDE_W_mini', 'GASOILINE', 'HEATOIL']
Ags: ['BBCOMM']
FX: ['YENEUR']
Metals: ['COPPER']

Cluster 5: US/EU vol
Vol: ['V2X', 'VIX']

Cluster 6: Asian vol, Asian FX
Vol: ['VNKI']
FX: ['CNH', 'SGD']

Cluster 7: Random commodities, risk off
Equity: ['FTSEINDO']
OilGas: ['ETHANOL', 'GAS-LAST', 'GAS_US_mini']
Ags: ['BUTTER', 'CHEESE', 'CORN', 'COTTON', 'FEEDCOW',
      'LEANHOG', 'LIVECOW', 'LUMBER', 'MILK', 'MILKDRY',
      'MILKWET', 'OATIES', 'REDWHEAT', 'RICE', 'SOYBEAN',
      'SOYMEAL', 'SOYOIL', 'WHEAT']
FX: ['CHF', 'CZK', 'EUR', 'EURCHF', 'SEK']
Metals: ['ALUMINIUM', 'BITCOIN', 'ETHEREUM', 'GOLD_micro', 'IRON', 'SILVER']

Cluster 8: Other bonds
Bond: ['BONO', 'CH10', 'JGB', 'KR10', 'KR3', 'OAT']

Cluster 9: German and US bonds
Bond: ['BOBL', 'BUND', 'BUXL', 'EDOLLAR', 'SHATZ',
       'US10', 'US10U', 'US2', 'US20', 'US3', 'US30',
       'US5', 'USIRS10', 'USIRS5', 'USIRS5ERIS']
FX: ['JPY']

Cluster 10: GBPEUR
FX: ['GBPEUR']
```

## As a hierarchy

• Risk on
  • Core equities (cluster 1)
  • Mostly FX (cluster 2)
  • Random risk on commodities
    • Italian bonds, Asian equity and FX (cluster 3)
    • Asian equities, FX, some energies (cluster 4)
• Risk off
  • Generic risk off commodities (cluster 7)
  • Bonds + flight to safety FX
    • US and German bonds (cluster 9 and 10)
    • Non US/German/Italian bonds (cluster 8)
  • Vol + Asian FX
    • US/EU vol (cluster 5)
    • Asian vol, FX (cluster 6)
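A hierarchy like this can also be read straight off the clustering tree rather than assembled by hand from successive values of N. The sketch below walks a scipy linkage tree and prints the members of each branch with indentation. The function `describe_tree`, the `1 - correlation` distance and the four made-up instrument names are assumptions for illustration only.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, to_tree
from scipy.spatial.distance import squareform


def describe_tree(node, names, max_depth, depth=0):
    """Indented lines listing the members of each branch, down to max_depth."""
    # pre_order() returns the ids of all the leaves underneath this node
    members = sorted(names[i] for i in node.pre_order())
    lines = ["  " * depth + ", ".join(members)]
    if not node.is_leaf() and depth < max_depth:
        for child in (node.get_left(), node.get_right()):
            lines += describe_tree(child, names, max_depth, depth + 1)
    return lines


# Made-up example with two obvious groups
tree_names = ["EQUITY_A", "EQUITY_B", "BOND_A", "BOND_B"]
tree_corr = np.array(
    [
        [1.0, 0.9, 0.1, 0.0],
        [0.9, 1.0, 0.0, 0.1],
        [0.1, 0.0, 1.0, 0.8],
        [0.0, 0.1, 0.8, 1.0],
    ]
)
tree_distance = 1.0 - tree_corr
np.fill_diagonal(tree_distance, 0.0)
root = to_tree(linkage(squareform(tree_distance, checks=False), method="average"))

hierarchy_lines = describe_tree(root, tree_names, max_depth=1)
print("\n".join(hierarchy_lines))
```

Each scipy merge joins exactly two branches, so the printed tree is always binary; a grouping with three children under one parent, like the risk off branch above, corresponds to two successive splits.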

## Conclusion

Well that was an interesting exercise, which just goes to show that the traditional grouping of asset classes may not make as much sense as you think.

To reiterate, these groupings will change over time.

1. Do you plan to replace the handcrafting method with these clusters?

1. Actually this code is the same code used by the handcrafting method. The difference is the correlations used here are for underlying instrument returns, whereas normally when optimising instrument weights I'm using the correlations of trading sub-strategies for each instrument. Not sure the results would be that different though.

2. For this exercise it's effectively just long, so I didn't do anything - negative is fine.

For trading substrategy returns I don't take the absolute value, but I do floor at zero (here's the critical line https://github.com/robcarver17/pysystemtrade/blob/master/sysdata/config/defaults.yaml#L186)

I'd expect the correlation of trading substrategy returns to be (a) lower in magnitude than underlying instruments and (b) all positive. And if they aren't positive, it makes sense to set them to zero rather than -(-big number) = +big number.
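To make the difference between flooring at zero and taking the absolute value concrete, here is a tiny numpy sketch. The function name `floor_correlations_sketch` and its `floor` argument are hypothetical and do not reproduce the pysystemtrade configuration linked above; the `1 - correlation` distance is likewise just an assumed way of showing how a clustering step might see the result.

```python
import numpy as np


def floor_correlations_sketch(corr: np.ndarray, floor: float = 0.0) -> np.ndarray:
    """Replace any correlation below `floor` with `floor`."""
    return np.maximum(np.asarray(corr, dtype=float), floor)


# Made-up pair of sub-strategies with a negative correlation
negative_corr = np.array([[1.0, -0.6], [-0.6, 1.0]])

floored = floor_correlations_sketch(negative_corr)
absolute = np.abs(negative_corr)

# Assumed distance measure: 1 - correlation
print("distance after flooring:", 1.0 - floored[0, 1])
print("distance after abs():", 1.0 - absolute[0, 1])
```

Flooring treats the pair as unrelated, whereas the absolute value would treat two negatively correlated strategies as if they were quite similar.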

3. Hi Robert, one question for you. How is this exercise if we have several trading sessions "Flat/not in the market". Is there a way to reward/punish a strategy for being out of the market vs others? As always, thank you for your incredible work.

1. Do you mean in terms of performance evaluation?
