Friday, 14 December 2018

Portfolio construction through handcrafting: implementation

This post is all about handcrafting; a method for doing portfolio construction which human beings can do without computing power, or at least with a spreadsheet. The method aims to achieve the following goals:
  • Humans can trust it: intuitive and transparent method which produces robust weights
  • Can be easily implemented by a human in a spreadsheet
  • Can be back tested
  • Grounded in solid theoretical foundations
  • Takes account of uncertainty in data estimates
  • Decent out of sample performance
  • Addresses the problem of allocating capital to assets on a long only basis, or to trading strategies. It won't be suitable for a long /short portfolio.

This is the third in a series of posts on the handcrafting method.
  1. The first post can be found here, and it motivates the need for a method like this.
  2. In the second post I build up the various components of the method, and discuss why they are needed. 
  3. In this, the third post, I'll explain how you'd actually apply the method step by step, with code. 
  4. Post four will test the method with artificial data
  5. The final post will use real data

This will be a 'twin track' post; in which I'll outline two implementations:
  • a spreadsheet based method suitable for small numbers of assets where you need to do a one-off portfolio for live trading rather than repeated backtest. It's also great for understanding the intuition of the method - a big plus point of this technique.
  • a python code based method. This uses (almost) exactly the same method, but can be backtested (the difference is that the grouping of assets is done manually in the spreadsheet based method, but automatically here based on the correlation matrix). The code can be found here; although this will live within the pysystemtrade ecosystem I've deliberately tried to make it as self contained as possible so you could easily drop this out into your own framework.

The demonstration

To demonstrate the implementation I'm going to need some data. This won't be the full blown real data that I'll be using to test the method properly, but we do need *something*. It needs to be an interesting data set; with the following characteristics:
  • different levels of volatility (so not a bunch of trading systems)
  • hierarchy of 3 levels (more would be too complex for the human implementation, fewer wouldn't be a stern enough test)
  • not too many assets such that the human implementation is too complex

I'm going to use long only weekly returns from the following instruments: BOBL, BUND, CORN, CRUDE_W, EURODOLLAR, GAS_US, KR10, KR3, US10, US20; from 2014 to the present (since for some of these instruments I only have data for the last 5 years).

Because this isn't a proper test I won't be doing any fancy rolling out of sample optimisation, just a single portfolio.

The descriptive statistics can be found here. The python code which gets the data (using pysystemtrade), is here.

(I've written the handcrafting functions to be standalone; when I come to testing them with real data I'll show you how to hook these into pysystemtrade)

Overview of the method

Here are the stages involved in the handcrafting method. Note there are a few options involved:
  1. (Optional if using a risk target, and automated): partition the assets into high and low volatility
  2. Group the assets hierarchically (if step 1 is followed, this will form the top level grouping). This will be done either by (i) an automated clustering algorithm or (ii) human common sense.
  3. Calculate volatility weights within each group at the lowest level, proceeding upwards. These weights will either be equal, or use the candidate matching technique described in the previous post.
  4. (Optionally) Calculate Sharpe Ratio adjustments. Apply these to the weights from step 3.
  5. Calculate diversification multipliers for each group. Apply these to the weights from step 4.
  6. Calculate cash weights using the volatility of each asset.
  7. (Optionally) if a risk target was used with a manual method, partition the top level groups into high and low volatility.
  8. (Optionally) if a risk target was supplied; use the technique outlined in my previous post to ensure the target is hit.

Spreadsheet: Group the assets hierarchically

A suggested grouping is here. Hopefully it's fairly self explanatory. There could be some debate about whether Eurodollar and bonds should be glued together, but part of doing it this way was to see if the diversification multiplier fixes this potential mistake.

Spreadsheet: Calculate volatility weights

The calculations are shown here.

Notice that for most groups there are only one or two assets, so things are relatively trivial. Then at the top level (level 1) we have three assets, so things are a bit more fun. I use a simple average of correlations to construct a correlation matrix for the top level groups. Then I use a weighted average of two candidate matrices to work out the required weights for the top level groups.

The weights come out as follows:
  • Developed market bonds, which we have a lot of, 3.6% each for a total of 14.4%
  • Emerging market bonds (just Korea), with 7.2% each for a total of 14.4%
  • Energies get 10.7% each, for a total of 21.4%
  • Corn gets 21.4%
  • Eurodollar gets 28.6%

Spreadsheet: Calculate Sharpe Ratio adjustments (optionally)

Adjustments for Sharpe Ratios are shown in this spreadsheet. You should follow the calculations down the page, as they are done in a bottom up fashion. I haven't bothered with interpolating the heuristic adjustments, instead I've just used VLOOKUP to match the closest adjustment row. 

Spreadsheet: Calculate diversification multipliers (DM)

DM calculations are shown in this sheet. DMs are quite low in bonds (where the assets in each country are highly correlated), but much higher in commodities. The final set of changes is particularly striking; note the reallocation from the single instrument rates group (initial weight 30.7%, falls to 24.2%) to commodities (initial weight 29%, rises to 36.5%).

Spreadsheet: Calculate cash weights

(Almost) finally we calculate our cash weights, in this spreadsheet. Notice the huge weight to low volatility Eurodollar. 

Spreadsheet: Partition into high and low volatility 

(optional: if risk target used with manual method)

If we're using a risk target we'll need to partition our top level groups (this is done automatically with python, but spreadsheet people are allowed to choose their own groupings). Let's choose an arbitrary risk target: 10%. This should be achievable since the average risk of our assets is 10.6%

This is the average volatility of each group (calculated here):

Bonds: 1.83%
Commodities: 14.6%
Rates: 0.89%

So we have:

High vol: commodities
Low vol: Rates and bonds

(Not a massive surprise!!)

Spreadsheet: Check risk target is hit, adjust weights if required

(optional: with risk target)

The natural risk of the portfolio comes out at 1.09% (calculated here). Let's explore the possible scenarios:
  • Risk target lower than 1.09%, eg 1%: We'd need to add cash to the portfolio. Using the spreadsheet with a 1% risk target you'd need to put 8.45% of your portfolio into cash; with the rest going into the constructed portfolio.
  • Risk target higher than 1.09% with leverage allowed: You'd need to apply a leverage factor; with a risk target of 10% you'd need a leverage factor of 9.16
  • Risk target higher than 1.09% without leverage: You'd need to constrain the proportion of the portfolio that is allocated to low risk assets (bonds and rates). The spreadsheet shows that this comes out at 31.4% cash weight, with the rest in commodities. I've also recalculated the weights with this constraint to show how it comes out.
And here are those final weights (to hit 10% risk with no leverage):

BOBL 2.17%
BUND 0.78%
US10 0.44%
US20 0.23%
KR3 7.25%
KR10 1.86%
EDOLLAR 18.67%
CORN 36.67%
CRUDE_W 19.47%
GAS_US 12.45%

Python code

The handcrafting code is here. Although this file will ultimately be dumped into pysystemtrade, it's designed to be entirely self contained so you can use it in your own applications.

The code expects weekly returns, and for all assets to be present. It doesn't do rolling optimisation, or averages over multiple assets. I need to write code to hook it into pysystemtrade, and to achieve these various objectives.

The only input required is a pandas data frame returns with named columns containing weekly returns. The main object you'll be interacting with is called Portfolio

Simplest use case, to go from returns to cash weights without risk targeting:


I won't document the API or methodology fully here, but hopefully you will get the idea.

Python: Partition the assets into high and low volatility

(If using a risk target, and automated)

Let's try with a risk target of 10%:

p=Portfolio(returns, risk_target=.1)

Out[575]: [Portfolio with 7 instruments, Portfolio with 3 instruments]

Out[576]: Portfolio with 7 instruments
Out[577]: ['BOBL', 'BUND', 'EDOLLAR', 'KR10', 'KR3', 'US10', 'US20']

Out[578]: ['CORN', 'CRUDE_W', 'GAS_US']

So all the bonds get put into one group, the other assets into another. Seems plausible.

Using an excessively high risk target is a bad idea:

p=Portfolio(returns, risk_target=.3)
Not many instruments have risk higher than target; portfolio will be concentrated to hit risk target
Out[584]: [Portfolio with 9 instruments, Portfolio with 1 instruments]

This is an even worse idea:

p=Portfolio(returns, risk_target=.4)
Exception: Risk target greater than vol of any instrument: will be impossible to hit risk target
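
The outputs above are consistent with a simple rule: any instrument whose own volatility is above the risk target goes into the high volatility group, and everything else goes into the low volatility group. Here's a minimal sketch of that rule. It's inferred from the behaviour shown rather than copied from the handcrafting code, `partition_by_vol` is a made up name, and the standard deviations are the ones that appear in the cash weights table later in this post:

```python
# Annualised standard deviations, copied from the cash weights table in this post
STDEV = {
    "BOBL": 0.018965, "BUND": 0.048575, "CORN": 0.184449, "CRUDE_W": 0.257816,
    "EDOLLAR": 0.008936, "GAS_US": 0.349904, "KR10": 0.039105, "KR3": 0.010875,
    "US10": 0.045470, "US20": 0.090863,
}


def partition_by_vol(stdev, risk_target):
    """Hypothetical helper: split instrument names into (low vol, high vol) lists."""
    high = sorted(name for name, vol in stdev.items() if vol > risk_target)
    low = sorted(name for name, vol in stdev.items() if vol <= risk_target)
    if not high:
        # Without leverage, portfolio risk can't exceed the risk of the riskiest asset
        raise ValueError("Risk target greater than vol of any instrument")
    return low, high


low_vol, high_vol = partition_by_vol(STDEV, risk_target=0.1)
print(low_vol)
print(high_vol)
```

With a 10% target this gives the same seven and three instrument groups as above; with 30% only GAS_US is left in the high volatility group, and with 40% nothing is.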

The forced partitioning into two top level groups will not happen if leverage is allowed, or no risk target is supplied:

p=Portfolio(returns) # no risk target
Natural top level grouping used
[Portfolio with 7 instruments,
 Portfolio with 2 instruments,
 Portfolio with 1 instruments]
p=Portfolio(returns, risk_target=.3, allow_leverage=True)
Natural top level grouping used
[Portfolio with 7 instruments,
 Portfolio with 2 instruments,
 Portfolio with 1 instruments]

Python: Group the assets hierarchically

Here's an example when we're allowing the grouping to happen naturally:
Natural top level grouping used
Out[48]:
[' Contains 3 sub portfolios',
 ['... Contains 3 sub portfolios',
  ["...... Contains ['KR10', 'KR3']"],
  ["...... Contains ['EDOLLAR', 'US10', 'US20']"],
  ["...... Contains ['BOBL', 'BUND']"]],
 ["... Contains ['CRUDE_W', 'GAS_US']"],
 ["... Contains ['CORN']"]]

We have three top level groups: interest rates, energies, and Ags. The interest rate group is further divided into second level groupings by country: Korea, US and Germany.

Here's an example when we're doing a partition by risk:

p=Portfolio(returns, risk_target=.1)
Applying partition to hit risk target
Partioning into two groups to hit risk target of 0.100000

[' Contains 2 sub portfolios',
 ['... Contains 3 sub portfolios',
  ["...... Contains ['KR10', 'KR3']"],
  ["...... Contains ['EDOLLAR', 'US10', 'US20']"],
  ["...... Contains ['BOBL', 'BUND']"]],
 ["... Contains ['CORN', 'CRUDE_W', 'GAS_US']"]]

There are now two top level groups as we saw above.

If you're a machine learning enthusiast who wishes to play around with the clustering algorithm, then the heavy lifting of the clustering algo is all done in this method of the portfolio object:

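# assumes: import scipy.cluster.hierarchy as sch
def _cluster_breakdown(self):

    # each row of the correlation matrix is treated as a point;
    # pdist measures the distances between those points
    X = self.corr_matrix.values
    d = sch.distance.pdist(X)
    # build the hierarchy using complete linkage
    L = sch.linkage(d, method='complete')

    # play with this line at your peril!!!
    # with criterion='maxclust' the second argument caps the number of clusters
    ind = sch.fcluster(L, MAX_CLUSTER_SIZE, criterion='maxclust')

    return list(ind)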
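
If you want to experiment without the rest of the Portfolio object, here's a self contained sketch of the same three scipy calls. The correlation matrix is a made up toy example, and `cluster_labels` and `max_clusters` are illustrative names rather than anything in the handcrafting code:

```python
import numpy as np
import scipy.cluster.hierarchy as sch
from scipy.spatial.distance import pdist

# Toy correlation matrix (made up): assets 0 and 1 move together, as do 2 and 3
corr = np.array([
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
])


def cluster_labels(corr_matrix, max_clusters):
    """Give each asset a cluster label, forming at most max_clusters clusters."""
    distances = pdist(corr_matrix)  # Euclidean distance between rows
    links = sch.linkage(distances, method="complete")
    return list(sch.fcluster(links, max_clusters, criterion="maxclust"))


labels = cluster_labels(corr, max_clusters=2)
print(labels)
```

Assets with similar rows in the correlation matrix end up with the same label, so here the first two assets share one label and the last two share another.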

However I've found the results to be very similar regardless of the method used.

Python: Calculate volatility weights

p=Portfolio(returns, use_SR_estimates=False)  # turn off SR estimates for now
Natural top level grouping used
[' Contains 3 sub portfolios',
 ['... Contains 3 sub portfolios',
  ["...... Contains ['KR10', 'KR3']"],
  ["...... Contains ['EDOLLAR', 'US10', 'US20']"],
  ["...... Contains ['BOBL', 'BUND']"]],
 ["... Contains ['CRUDE_W', 'GAS_US']"],
 ["... Contains ['CORN']"]]

Let's look at a few parts of the portfolio. Firstly the very simple single asset Corn portfolio:

# Just Corn, single asset
Out[54]: [1.0]

The Energy portfolio is slightly more interesting with two assets; but this will default to equal volatility weights:

# Just two assets, so goes to equal vol weights
Out[55]: [0.5, 0.5]

Only the US bonds (and STIR) portfolio has 3 assets, and so will use the candidate matching algorithm:

# The US bond group is the only interesting one
          EDOLLAR      US10      US20
EDOLLAR  1.000000  0.974097  0.872359
US10     0.974097  1.000000  0.924023
US20     0.872359  0.924023  1.000000
# Pretty close to equal weighting
Out[57]: [0.28812193544790643, 0.36572016685796049, 0.34615789769413313]

Python: Calculate Sharpe Ratio adjustments (optionally)

p=Portfolio(returns) # by default Sharpe Ratio adjustments are on unless we turn them off

Let's examine a simple two asset portfolio to see how these work:

# Let's look at the energies portfolio
Out[61]: Portfolio with 2 instruments
# first asset is awful, second worse
Out[63]: array([-0.55334564, -0.8375069 ])

# Would be equal weights, now tilted towards first asset
Out[62]: [0.5399245657079913, 0.46007543429200887]

# Can also see this information in one place
                      CRUDE_W    GAS_US
Raw vol (no SR adj)  0.500000  0.500000
Vol (with SR adj)    0.539925  0.460075
Sharpe Ratio        -0.553346 -0.837507 
Portfolio containing ['CRUDE_W', 'GAS_US'] instruments  

Python: Calculate diversification multipliers

Natural top level grouping used

# not much diversification for bonds /rates within each country
Out[67]: 1.0389170782708381  #korea
Out[68]: 1.0261371453175774  #US bonds and STIR
Out[69]: 1.0226377699075955  # german bonds
# Quite decent when you put them together though
p.sub_portfolios[0].div_mult
Out[64]: 1.2529917422729928

# Energies group only two assets but quite uncorrelated
Out[65]: 1.2787613327950775

# only one asset in corn group
Out[66]: 1.0
# Not used in the code but good to know
Out[71]: 2.0832290180687183
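
These figures are consistent with the usual formula for a diversification multiplier: one over the square root of w'Cw, where w is the vector of (Sharpe Ratio adjusted) volatility weights within a group and C is the correlation matrix of its members. Here's a minimal sketch that assumes this formula; `diversification_multiplier` is a made up name, and the inputs are the US bonds and STIR weights and correlations quoted elsewhere in this post:

```python
import math


def diversification_multiplier(weights, corr):
    """1 / sqrt(w' C w) for a list of weights and a correlation matrix (list of lists)."""
    n = len(weights)
    variance = sum(
        weights[i] * weights[j] * corr[i][j] for i in range(n) for j in range(n)
    )
    return 1.0 / math.sqrt(variance)


# US bonds and STIR group, in the order EDOLLAR, US10, US20
weights = [0.292898, 0.361774, 0.345328]  # vol weights with SR adjustment
corr = [
    [1.000000, 0.974097, 0.872359],
    [0.974097, 1.000000, 0.924023],
    [0.872359, 0.924023, 1.000000],
]

dm = diversification_multiplier(weights, corr)
print(round(dm, 4))
```

This should come out at about 1.026, in line with the value shown above for the US bonds and STIR group.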

Python: Aggregate up sub-portfolios

The portfolio in the python code is built up in a bottom up fashion. Let's see how this happens, by focusing on the 10 year US bond.

Natural top level grouping used

First the code calculates the vol weight for US bonds and rates, including a SR adjustment:

                      EDOLLAR      US10      US20
Raw vol (no SR adj)  0.288122  0.365720  0.346158
Vol (with SR adj)    0.292898  0.361774  0.345328
Sharpe Ratio         0.218935  0.164957  0.185952 
 Portfolio containing ['EDOLLAR', 'US10', 'US20'] instruments  

This portfolio then joins the wider bond portfolio (here in column '1' - there are no meaningful names for parts of the wider portfolio - the code doesn't know this is US bonds):

                                  0         1         2
Raw vol (no SR adj or DM)  0.392114  0.261486  0.346399
Vol (with SR adj no DM)    0.423425  0.162705  0.413870
SR                         0.985267  0.192553  1.185336
Div mult                   1.038917  1.026137  1.022638 
 Portfolio containing 3 sub portfolios aggregate 

The Sharpe Ratios, raw vol, and vol weights shown here are for the groups that we're aggregating together here. So the raw vol weight on US bonds is 0.26. To see why look at the correlation matrix:
          0         1         2
0  1.000000  0.493248  0.382147
1  0.493248  1.000000  0.715947
2  0.382147  0.715947  1.000000
You can see that US bonds (asset 1) are more highly correlated with both asset 0 and asset 2 than those two are with each other. So the US group gets a lower raw weight. It also has a far worse Sharpe Ratio, so gets further downweighted relative to the other countries.

We can now work out what the weight of US 10 year bonds is amongst bonds as a whole:


                       BOBL      BUND   EDOLLAR      KR10       KR3      US10  \
Vol wt in group    0.519235  0.480765  0.292898  0.477368  0.522632  0.361774   
Vol wt. of group   0.413870  0.413870  0.162705  0.423425  0.423425  0.162705   
Div mult of group  1.022638  1.022638  1.026137  1.038917  1.038917  1.026137   
Vol wt.            0.213339  0.197533  0.047473  0.203860  0.223189  0.058636   
Vol wt in group    0.345328  
Vol wt. of group   0.162705  
Div mult of group  1.026137  
Vol wt.            0.055971   
 Portfolio containing 3 sub portfolios 

The first row is the vol weight of the asset within its group; we've already seen this calculated. The next row is the vol weight of the group as a whole; again we've already seen the figures for US bonds calculated above. After that is the diversification multiplier for the US bond group. Finally we can see the volatility weight of US 10 year bonds in the bond group as a whole; equal to the vol weight within the group, multiplied by the vol weight of the group, multiplied by the diversification multiplier of the group; and then renormalised to add up to 1.
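
That calculation is easy to check by hand. Here's a short sketch using the numbers from the table above; the function name `aggregate_vol_weights` is made up for illustration and isn't part of the handcrafting code:

```python
# Rows copied from the table above for the seven bond and rates instruments
vol_wt_in_group = {
    "BOBL": 0.519235, "BUND": 0.480765, "EDOLLAR": 0.292898, "KR10": 0.477368,
    "KR3": 0.522632, "US10": 0.361774, "US20": 0.345328,
}
vol_wt_of_group = {
    "BOBL": 0.413870, "BUND": 0.413870, "EDOLLAR": 0.162705, "KR10": 0.423425,
    "KR3": 0.423425, "US10": 0.162705, "US20": 0.162705,
}
div_mult_of_group = {
    "BOBL": 1.022638, "BUND": 1.022638, "EDOLLAR": 1.026137, "KR10": 1.038917,
    "KR3": 1.038917, "US10": 1.026137, "US20": 1.026137,
}


def aggregate_vol_weights(in_group, of_group, div_mult):
    """Multiply the three rows together, then renormalise so the weights sum to 1."""
    raw = {name: in_group[name] * of_group[name] * div_mult[name] for name in in_group}
    total = sum(raw.values())
    return {name: wt / total for name, wt in raw.items()}


vol_wt = aggregate_vol_weights(vol_wt_in_group, vol_wt_of_group, div_mult_of_group)
print(round(vol_wt["US10"], 4), round(vol_wt["BOBL"], 4))
```

The results should agree with the final 'Vol wt.' row of the table, up to rounding of the inputs.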

Finally we're ready to construct the top level group, in which the bonds as a whole is asset '0'. First the correlation matrix:

notUsedYet = p.volatility_weights
          0         1         2
0  1.000000 -0.157908 -0.168607
1 -0.157908  1.000000  0.016346
2 -0.168607  0.016346  1.000000

All these assets, bonds [0], energies [1], and corn [2] are pretty uncorrelated, though bonds might just have the edge:

                                  0         1         2
Raw vol (no SR adj or DM)  0.377518  0.282948  0.339534
Vol (with SR adj no DM)    0.557443  0.201163  0.241394
SR                         1.142585 -0.871979 -0.801852
Div mult                   1.252992  1.278761  1.000000 
 Portfolio containing 3 sub portfolios aggregate 

Now to calculate the final weights:


                       BOBL      BUND      CORN   CRUDE_W   EDOLLAR    GAS_US  \
Vol wt in group    0.213339  0.197533  1.000000  0.539925  0.047473  0.460075   
Vol wt. of group   0.557443  0.557443  0.241394  0.201163  0.557443  0.201163   
Div mult of group  1.252992  1.252992  1.000000  1.278761  1.252992  1.278761   
Vol wt.            0.124476  0.115254  0.201648  0.116022  0.027699  0.098863   
                       KR10       KR3      US10      US20  
Vol wt in group    0.203860  0.223189  0.058636  0.055971  
Vol wt. of group   0.557443  0.557443  0.557443  0.557443  
Div mult of group  1.252992  1.252992  1.252992  1.252992  
Vol wt.            0.118945  0.130224  0.034212  0.032657   
 Portfolio containing 3 sub portfolios 

We've now got the final volatility weights. Here's another way of viewing them:

# First remind ourselves of the volatility weights
dict([(instr,wt) for instr,wt in zip(p.instruments, p.volatility_weights)])
{'BOBL': 0.12447636469041611,
 'BUND': 0.11525384132670763,
 'CORN': 0.20164774158721335,
 'CRUDE_W': 0.11602155610023207,
 'EDOLLAR': 0.027698823230085486,
 'GAS_US': 0.09886319534295436,
 'KR10': 0.11894543449866347,
 'KR3': 0.13022374999090081,
 'US10': 0.034212303586599956,
 'US20': 0.032656989646226771}
The most striking difference to the spreadsheet is that by lumping Eurodollar in with the other US bonds it has a much smaller vol weight. German and Korean bonds have gained as a result; the energies and Corn are pretty similar.

Python: Calculate cash weights

dict([(instr,wt) for instr,wt in zip(p.instruments, p.cash_weights)])
Natural top level grouping used
{'BOBL': 0.21885945926487166,
 'BUND': 0.079116240615862948,
 'CORN': 0.036453365347104472,
 'CRUDE_W': 0.015005426640542012,
 'EDOLLAR': 0.10335586678017628,
 'GAS_US': 0.009421184504702888,
 'KR10': 0.10142345423259323,
 'KR3': 0.39929206844323878,
 'US10': 0.025088747004851766,
 'US20': 0.011984187166055982}

Obviously the less risky assets like 3 year Korean bonds and Eurodollar get a larger cash weight. It's also possible to see how these were calculated from the final volatility weights:
                  BOBL      BUND      CORN   CRUDE_W   EDOLLAR    GAS_US  \
Vol weights   0.124476  0.115254  0.201648  0.116022  0.027699  0.098863   
Std.          0.018965  0.048575  0.184449  0.257816  0.008936  0.349904   
Cash weights  0.218859  0.079116  0.036453  0.015005  0.103356  0.009421   
                  KR10       KR3      US10      US20  
Vol weights   0.118945  0.130224  0.034212  0.032657  
Std.          0.039105  0.010875  0.045470  0.090863  
Cash weights  0.101423  0.399292  0.025089  0.011984   
 Portfolio containing 10 instruments (cash calculations) 
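
The table can be reproduced directly: divide each volatility weight by the standard deviation, then normalise so the results add up to 1. A small sketch using the figures from the table (the function name is made up for illustration):

```python
# 'Vol weights' and 'Std.' rows copied from the table above
vol_weights = {
    "BOBL": 0.124476, "BUND": 0.115254, "CORN": 0.201648, "CRUDE_W": 0.116022,
    "EDOLLAR": 0.027699, "GAS_US": 0.098863, "KR10": 0.118945, "KR3": 0.130224,
    "US10": 0.034212, "US20": 0.032657,
}
stdev = {
    "BOBL": 0.018965, "BUND": 0.048575, "CORN": 0.184449, "CRUDE_W": 0.257816,
    "EDOLLAR": 0.008936, "GAS_US": 0.349904, "KR10": 0.039105, "KR3": 0.010875,
    "US10": 0.045470, "US20": 0.090863,
}


def cash_weights_from_vol_weights(vol_weights, stdev):
    """Divide each vol weight by its standard deviation, then normalise to sum to 1."""
    raw = {name: vol_weights[name] / stdev[name] for name in vol_weights}
    total = sum(raw.values())
    return {name: wt / total for name, wt in raw.items()}


cash = cash_weights_from_vol_weights(vol_weights, stdev)
print(round(cash["KR3"], 3), round(cash["EDOLLAR"], 3))
```

The values should match the 'Cash weights' row to within the rounding of the inputs.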

Python: Check risk target is hit, adjust weights if required

(optional: with risk target)

The natural risk of the unconstrained portfolio is quite low: 1.59% (a bit higher than the spreadsheet version, since we haven't allocated as much to Eurodollar)

Natural top level grouping used
Out[82]: 0.015948015324395711

Let's explore the possible scenarios:
  • Risk target lower than 1.59%, eg 1%: We'd need to add cash to the portfolio. 
p=Portfolio(returns, risk_target=.01)

# if cash weights add up to less than 1, must be including cash in the portfolio

Calculating weights to hit a risk target of 0.010000
Natural top level grouping used
Too much risk 0.372963 of the portfolio will be cash
Out[84]: 0.62703727056889502

# check risk target hit
Out[85]: 0.01

With a 1% risk target you'd need to put 37.3% of your portfolio into cash; with the rest going into the constructed portfolio.
  • Risk target higher than 1.59% with leverage allowed, eg 10%
p=Portfolio(returns, risk_target=.1, allow_leverage=True)

# If sum of cash weights>1 we must be using leverage
Calculating weights to hit a risk target of 0.100000
Natural top level grouping used
Not enough risk leverage factor of 6.270373 applied
Out[87]: 6.2703727056889518

# check target hit
Out[88]: 0.10000000000000001

You'd need to apply a leverage factor; with a risk target of 10% you'd need a leverage factor of 6.27
  • Risk target higher than 1.59% without leverage: 
p=Portfolio(returns, risk_target=.1)
Calculating weights to hit a risk target of 0.100000
Not enough risk, no leverage allowed, using partition method
Applying partition to hit risk target
Partitioning into two groups to hit risk target of 0.100000
Need to limit low cash group to 0.005336 (vol) 0.323992 (cash) of portfolio to hit risk target of 0.100000
Applying partition to hit risk target
Partitioning into two groups to hit risk target of 0.100000

# look at cash weights
dict([(instr,wt) for instr,wt in zip(p.instruments, p.cash_weights)])
{'BOBL': 0.07548008030352539,
 'BUND': 0.027285547606928903,
 'CORN': 0.3285778602871447,
 'CRUDE_W': 0.19743348662518673,
 'EDOLLAR': 0.035645291049388697,
 'GAS_US': 0.15010566887898191,
 'KR10': 0.034978842111056153,
 'KR3': 0.13770753839879318,
 'US10': 0.0086525875783564771,
 'US20': 0.0041330971606378854}

# check risk target hit
Out[91]: 0.10001663416516968

In this case the code has to constrain the proportion of the portfolio that is allocated to low risk assets (bonds and rates); here they are limited to a cash weight of 32.4%, with the rest going into commodities.

What's next

In the next post I'll test the method (in its back testable python format - otherwise (a) the results could arguably be forward looking, and (b) I have now seen more than enough spreadsheets for 2018 thank you very much) against some alternatives. It could take me a few weeks to post this, as I will be somewhat busy with Christmas, university, and book writing commitments!

Friday, 7 December 2018

Portfolio construction through handcrafting: The method

This post is all about handcrafting; a method for doing portfolio construction which human beings can do without computing power (although realistically you'd probably need a spreadsheet unless you're some kind of weird masochist). The method aims to achieve the following goals:

  • Humans can trust it: intuitive and transparent method which produces robust weights
  • Can be easily implemented by a human in a spreadsheet
  • Can be back tested
  • Grounded in solid theoretical foundations
  • Takes account of uncertainty in data estimates
  • Decent out of sample performance
  • Addresses the problem of allocating capital to assets on a long only basis, or to trading strategies. It won't be suitable for a long /short portfolio.

This is the second in a series of posts on the handcrafting method. The first one can be found here, and it motivates the need for a method like this.

In this post I'll build up the various components of the method, and discuss why they are needed. Then in the next post I'll explain how you'd actually apply the method step by step, with code. The final two posts will test the method using artificial and real data respectively.

A brief note about HRP

This method does share some common ground with hierarchical risk parity (HRP)*. 

* To reiterate I did come up with the method independently of HRP: I first became aware of HRP in early 2017, but I've been using handcrafting in one form or another for several years and my book containing the concept was written in 2014 and 2015, and then published in October 2015. Having said that the core idea of handcrafting is that it reflects the way humans naturally like to do portfolio construction, so I'm not sure anyone can really claim ownership of the concept. 

Indeed in some ways you could think of it as HRP broken down into steps that make intuitive sense to human beings and can be implemented using only a spreadsheet. However as you'll see there are some fundamental differences involved.

Equal weights

Everyone loves equal weights. High falutin academic finance types like them (see here). More importantly human beings like them too.  Benartzi and Thaler  find that when people do diversify, they do so in a naive fashion. They provide evidence that in 401(k) plans, many people seem to use strategies as simple as allocating 1/n of their savings to each of the n available investment options, whatever those options are.

Even people who know a *lot* about portfolio optimisation like them:

"I should have computed the historical covariance of the asset classes and drawn an efficient frontier…I split my contributions 50/50 between bonds and equities." (Harry Markowitz)

Equal weights should be the starting point for any portfolio allocation. If we know literally nothing about the assets in question, why then equal weighting is the logical choice. More formally if:
  • Sharpe Ratios are statistically indistinguishable / completely unpredictable
  • Standard deviations are statistically indistinguishable / completely unpredictable
  • Correlations  are statistically indistinguishable / completely unpredictable
... then equal weights are the optimal portfolio. For highly diversified indices like the S&P 500 equal weights are going to do pretty well; since these things are close enough to the truth not to matter. But in many other cases they won't.

[An aside: I use Sharpes, standard deviations and correlations rather than the usual formulation of means and covariances. I do this for two reasons;

  1. the uncertainty and predictability of the inputs I prefer is different - for example it is usually fair to say at least across asset classes that you should expect equal Sharpe Ratio, but assuming equal mean is far too heroic 
  2. human beings are better at intuitively understanding my preferred inputs - with experience you will learn that a Sharpe of 0.5 is good and 1.0 is better; that an annual standard deviation of 16% means that on average you'll make or lose about 1% of your capital a day; that a correlation of 0.2 is low and 0.95 is really high. Interpreting variances or worse covariances is much harder; comparing means across asset classes is silly.]

Inverse volatility weighting

The most heroic of the assumptions listed above is this:
  • Standard deviations are statistically indistinguishable / completely unpredictable
They're not. Standard deviations in general are the most predictable characteristic of an asset's returns. Their sampling uncertainty is lower than for any other estimate. So an equal weighted portfolio for the S&P 500 (standard deviation ~16% a year) and US 2 year bonds (standard deviation ~2% a year) doesn't make sense. It will get most of its risk from equities.

The solution is to use inverse volatility weighting. We give each asset a weight of 1/s where s is the standard deviation. Except for certain special cases this won't produce nice weights, so we'll have to normalise these to sum up to 1.

For S&P 500 & US 2 year bonds we'd get weights of 1/.16 and 1/.02, which once normalised is 11.1% and 88.9%. 

We now have two different ways of thinking about weightings: cash weightings and volatility weightings. For our simple example the cash weightings are 11.1% & 88.9%, and the volatility weightings are 50% & 50%.

  • To convert from volatility weighting to cash weighting we divide by the standard deviation of each asset, and then normalise weights back to 1. 
  • To convert from cash to volatility weighting we multiply by the standard deviation, and then normalise weights back to 1.
Note: if allocating capital to trading strategies then this step will not be necessary. If built properly trading strategies should target a given long term risk level. Small differences in the actual risk level should not be used to vary the portfolio weights that they are allocating.
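
Both conversions are one liners. Here's a small sketch using the S&P 500 and US 2 year bond example; the function names are made up for illustration:

```python
def normalise(weights):
    """Scale a list of weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def vol_to_cash(vol_weights, stdevs):
    """Divide by each standard deviation, then normalise."""
    return normalise([w / s for w, s in zip(vol_weights, stdevs)])


def cash_to_vol(cash_weights, stdevs):
    """Multiply by each standard deviation, then normalise."""
    return normalise([w * s for w, s in zip(cash_weights, stdevs)])


stdevs = [0.16, 0.02]  # S&P 500, US 2 year bonds
cash = vol_to_cash([0.5, 0.5], stdevs)
print([round(w, 3) for w in cash])  # [0.111, 0.889]
print([round(w, 3) for w in cash_to_vol(cash, stdevs)])  # [0.5, 0.5]
```

Equal volatility weights are just the special case of inverse volatility weighting: start with 50% and 50%, and the cash weights of 11.1% and 88.9% drop out.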

Inverse volatility weighting with risk targeting

There is one fatal flaw with the inverse volatility approach. Consider the portfolio of 11.1% in S&P 500 and 88.9% in US 2 year bonds; this portfolio will give us risk of between 2.5% and 3.5% depending on the correlation. For most people that's *way* too little risk. Or to be precise for most people that's *way* too little return: if each asset had a Sharpe Ratio of 0.5 (which is rather punchy for a long only asset) that would give an expected excess return of around 1.8%.

That's fine if you can use leverage; then all you need to do is use a leverage factor of target risk divided by expected risk. For example if you're targeting 10% risk and the correlation is 0.5, giving risk of 3.1%, then you'd apply a leverage factor of 3.2.
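
These risk figures come from the standard two asset portfolio variance formula. Here's a quick sketch to check them; the function name is made up for illustration:

```python
import math


def two_asset_risk(w1, sigma1, sigma2, rho):
    """Standard deviation of a two asset portfolio with cash weights w1 and 1 - w1."""
    w2 = 1.0 - w1
    variance = (
        (w1 * sigma1) ** 2
        + (w2 * sigma2) ** 2
        + 2.0 * w1 * w2 * rho * sigma1 * sigma2
    )
    return math.sqrt(variance)


w_equity = 1.0 / 9.0  # the 11.1% inverse volatility cash weight on the S&P 500
for rho in (0.0, 0.5, 1.0):
    print(rho, round(two_asset_risk(w_equity, 0.16, 0.02, rho), 4))

risk = two_asset_risk(w_equity, 0.16, 0.02, 0.5)
print(round(0.10 / risk, 2))  # leverage factor needed for a 10% risk target
```

With a correlation of 0.5 the risk is about 3.1%, so the leverage factor comes out at a little over 3.2.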

[The topic of which is the correct risk target falls outside of the scope of this method - this is covered in slides 80 to 91 of this talk]

What if you can't use leverage? Then you're going to have to constrain your allocation to low risk assets. So we need to calculate the portfolio weights that would give us the right level of risk. Incoming LaTeX...
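
For the two asset case the calculation can be written out as follows. This is a sketch in assumed notation, derived from the standard two asset portfolio variance formula: w is the cash weight on the first asset, 1 - w the weight on the second, \rho the correlation between them and \sigma_T the risk target. Setting portfolio risk equal to the target gives a quadratic in w:

```latex
\sigma_T^2 = w^2 \sigma_1^2 + (1 - w)^2 \sigma_2^2 + 2 w (1 - w) \rho \sigma_1 \sigma_2

a w^2 + b w + c = 0, \qquad
a = \sigma_1^2 + \sigma_2^2 - 2 \rho \sigma_1 \sigma_2, \quad
b = 2 \rho \sigma_1 \sigma_2 - 2 \sigma_2^2, \quad
c = \sigma_2^2 - \sigma_T^2

w = \frac{-b \pm \sqrt{b^2 - 4 a c}}{2 a}
```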

[the negative solution will be invalid as long as the risk of the first asset \sigma_1 is greater than that of the second \sigma_2 but that's just ordering]

The above might look a bit horrific but it can be implemented in google sheets or open office spreadsheet (other, more expensive, spreadsheets are apparently available). I'd also argue that it is intuitive - or at least it's very easy to get intuition by playing around with the spreadsheet. After 20 minutes even the dumbest MBA will understand the intuition....

* There will be precisely two digs at MBAs in this post. It's nothing personal. Some of my best friends have MBAs. Actually that isn't true, I don't have any friends with MBAs, but I'm sure that people with MBAs are harmless enough.

Let's try it with our simple example. From the sheet with a portfolio risk target of 10% and a correlation of 0.5 we get a weight of 59.8% on the first asset (the S&P 500), 40.2% on the second (US 2 year bonds). Translated into volatility weightings that's 92.3% on the S&P 500 and 7.7% on the US 2 year.
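
If you'd rather check this outside a spreadsheet, here's a minimal sketch that solves for the weight directly. It assumes the standard two asset variance formula and takes the positive root of the resulting quadratic; the function name is made up for illustration:

```python
import math


def weight_on_riskier_asset(sigma1, sigma2, rho, risk_target):
    """Cash weight on asset 1 so that a two asset portfolio hits risk_target."""
    a = sigma1 ** 2 + sigma2 ** 2 - 2.0 * rho * sigma1 * sigma2
    b = 2.0 * rho * sigma1 * sigma2 - 2.0 * sigma2 ** 2
    c = sigma2 ** 2 - risk_target ** 2
    discriminant = b ** 2 - 4.0 * a * c
    if discriminant < 0:
        raise ValueError("No weight hits this risk target")
    # Positive root; a result above 1 means the target can't be hit without leverage
    return (-b + math.sqrt(discriminant)) / (2.0 * a)


sigma_equity, sigma_bond = 0.16, 0.02
w = weight_on_riskier_asset(sigma_equity, sigma_bond, rho=0.5, risk_target=0.10)
cash = [w, 1.0 - w]

# Translate the cash weights into volatility weights
raw = [cash[0] * sigma_equity, cash[1] * sigma_bond]
vol = [r / sum(raw) for r in raw]
print([round(x, 3) for x in cash], [round(x, 3) for x in vol])
```

The cash weight on the S&P 500 comes out at just under 60%, and the volatility weight at a little over 92%.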

Some obvious points:

  • For very high risk targets the spreadsheet will produce a value for the weight on the first asset above 1. Clearly you can't achieve risk of 17% without leverage if the riskiest asset you have has risk of 16%.
  • For very low risk targets the spreadsheet will produce inefficient weights or errors, eg for a target of 2.5% it will want to put 94% in the US 2 year bond; but this would produce a lower return than allocating 18.8% of your portfolio to cash and the rest to the inverse volatility portfolio. See this second sheet.

So in practice (where 'the constructed portfolio' is the inverse vol portfolio with equal volatility weights):

  • If you can use leverage, then use a leverage factor = (risk target / natural risk of constructed portfolio)
  • If you can't use leverage:
    • If the risk target is lower than the 'natural' risk of the constructed portfolio, then use the constructed portfolio and add cash as required; proportion of cash = (natural risk - risk target) / natural risk. Then multiply the natural weights of the constructed portfolio by (1- proportion of cash).
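    • If the risk target is higher than the 'natural' risk but no higher than the risk of the riskiest asset, shift the allocation towards the riskier assets using the closed form weights above.
    • If the risk target is higher than the risk of the riskiest asset in the constructed portfolio... no feasible solution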
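The leverage and cash rules are really the same scaling applied in different directions. Here is a rough sketch under that reading; the function names are hypothetical, and the case where you can't use leverage but want more than the natural risk is deliberately left as an error because it needs the reallocation approach instead.

```python
import math


def two_asset_risk(w_1, sd_1, sd_2, corr):
    """Standard deviation of a portfolio with cash weight w_1 in asset 1."""
    w_2 = 1 - w_1
    variance = (
        (w_1 * sd_1) ** 2
        + (w_2 * sd_2) ** 2
        + 2 * w_1 * w_2 * corr * sd_1 * sd_2
    )
    return math.sqrt(variance)


def scale_to_risk_target(natural_weights, natural_risk, risk_target, can_use_leverage):
    """Return (scaled cash weights, proportion held in cash).

    A negative cash proportion means borrowing, i.e. leverage.
    """
    factor = risk_target / natural_risk  # this is the leverage factor
    if factor > 1 and not can_use_leverage:
        raise ValueError("Target above natural risk: reallocate to riskier assets")
    scaled = [w * factor for w in natural_weights]
    return scaled, 1 - factor


# Inverse volatility cash weights for S&P 500 (16%) and US 2 year (2%)
natural_weights = [1 / 9, 8 / 9]
natural_risk = two_asset_risk(natural_weights[0], 0.16, 0.02, corr=0.5)

leveraged, borrowing = scale_to_risk_target(natural_weights, natural_risk, 0.10, True)
unlevered, cash = scale_to_risk_target(natural_weights, natural_risk, 0.025, False)

print(round(natural_risk, 4), round(sum(leveraged), 2), round(cash, 3))
```

Note that 1 - factor is the same thing as (natural risk - risk target) / natural risk, so the cash rule and the leverage rule don't need separate code.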

Hierarchical grouping

We know what to do if we can forecast volatility, but what about correlations? The parameter uncertainty on correlations is a little higher than volatility, and they aren't quite as easy to forecast, but in many portfolio problems it's going to be madness to ignore correlations. 

[In something like the S&P 500 the industry sectors have roughly the same number of constituents, so ignoring correlation structure isn't going to be the end of the world]

Let's move up to a three asset example:

  • S&P 500: standard deviation 16%
  • US 2 year bonds: standard deviation 2%
  • US 5 year bonds: standard deviation 4%

Correlations - between S&P 500 and each bond: 0%, between the bonds: 95%

Doing equal volatility weighting would put 1/3 of our portfolio risk into each of these three assets. So two thirds of our portfolio risk* would be in bonds. This doesn't make a lot of sense.

* not quite. I'll relax this approximation later.

The way to deal with this is by creating groups. Groups contain assets that are similar, with similarity measured by correlation. Grouping can be done purely by humans, or if you want to back test the method it can be done automatically [and I'll explore how in the next post]. The results probably won't be especially different, however it's important we can back test this methodology, even if we end up using human determined groups in live trading (which indeed is what I do myself).

In this case the groups are obvious: a group of two bonds, and a group containing a single stock.

Once we've got our groups together we follow the procedure: allocate to the groups, then allocate within the groups, and then multiply out to find the final weights.

Allocate to the groups:
  • 50% volatility weighting in bonds
  • 50% volatility weighting in equities
Then allocate within groups:
  • Bonds:
    • 50% volatility weighting in 2 years
    • 50% volatility weighting in 5 years
  • Equities
    • 100% in S&P 500
To find the final weights:

  • 2 year bonds: 50% * 50% = 25%
  • 5 year bonds: 50% * 50% = 25%
  • S&P 500 equities: 50% * 100% = 50%
These are volatility weightings - we'd need to divide by risk and normalise to get cash weights.
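That last step (divide by risk, then normalise) can be sketched in a few lines. The function name and dictionary layout are just illustrative choices.

```python
def cash_weights_from_vol_weights(vol_weights, sds):
    """Divide each volatility weighting by the asset's risk, then normalise."""
    raw = {name: vol_weights[name] / sds[name] for name in vol_weights}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


vol_weights = {"US 2 year": 0.25, "US 5 year": 0.25, "S&P 500": 0.50}
sds = {"US 2 year": 0.02, "US 5 year": 0.04, "S&P 500": 0.16}

cash = cash_weights_from_vol_weights(vol_weights, sds)
for name, weight in cash.items():
    print(name, round(weight, 3))
```

The low risk 2 year bond ends up with by far the largest cash weight, even though it only has a quarter of the volatility weighting.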

Hierarchical grouping

This method can also be applied hierarchically. Consider the following  

  • S&P 500
  • US 2 year bonds
  • US 5 year bonds
  • German 5 year bonds

There could be some debate about how these should be grouped, but let's go with the following:

  • Bonds
    • US bonds
      • 2 year bonds
      • 5 year bonds
    • German bonds
      • 5 year bonds
  • Equities
    • US equities
      • S&P 500

This gives the following weights (check you can see where these came from yourself):
  • 2 year US bonds: 50% * 50% * 50% = 12.5%
  • 5 year US bonds: 50% * 50% * 50% = 12.5%
  • 5 year German bonds: 50% * 50% * 100% = 25%
  • S&P 500 equities: 50% * 100% * 100% = 50%
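If you want to check the multiplying out, a small recursive sketch does it. The nested dictionary is just my own way of writing down the hierarchy (a value of None marks an individual asset), and it assumes equal volatility weights at every level, as in the example.

```python
def hierarchical_vol_weights(tree, weight=1.0):
    """Split `weight` equally across the members of each group, recursively."""
    weights = {}
    share = weight / len(tree)
    for name, subtree in tree.items():
        if subtree is None:
            # An individual asset: it keeps its share of the group weight
            weights[name] = share
        else:
            # A sub-group: pass its share down to its own members
            weights.update(hierarchical_vol_weights(subtree, share))
    return weights


tree = {
    "Bonds": {
        "US bonds": {"US 2 year": None, "US 5 year": None},
        "German bonds": {"German 5 year": None},
    },
    "Equities": {"US equities": {"S&P 500": None}},
}

weights = hierarchical_vol_weights(tree)
print(weights)
```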

Diversification multiplier correction

Consider the following correlation matrix for three arbitrary assets:

           A             B             C
A        1.0             0.5           0.0
B        0.5             1.0           0.0
C        0.0             0.0           1.0

We'd probably split this into two groups: assets A and B (quite similar), and asset C (quite different). Assuming they all have the same standard deviation (say 10%), the standard deviation of the sub portfolio AB (with 50% in each) will be 8.66%; lower than 10% because the correlation is 0.5. The standard deviation of the sub portfolio C which only has one asset is just 10%.

If we put half our portfolio into AB and half into C, then we'll actually have over allocated to portfolio C since it's riskier. To correct for this we need to work out the diversification multiplier; this will produce the correct result without having to go through the slightly more complex task of working out the estimated risk of each portfolio, resulting in a more intuitive process. 

The diversification multiplier for a group is:

Or if you prefer a spreadsheet, there is one here

The diversification multiplier is just the ratio of the risk of a single asset to the risk of the group. With larger groups it will get larger. With lower correlations it will get larger. Again play with the spreadsheet, and even the most moronic MBA will eventually get the intuition here.

The multiplier for group AB comes out at 1.155, and for group C by construction it's one.

To apply the diversification multiplier, we start with giving equal volatility weightings to each group. We then multiply the raw weight of each group by its diversification multiplier. Then, because our weights will now add up to more than 1 we normalise them.

Looking at this spreadsheet you can see this produces weights of 53.6% in group AB, and 46.4% in group C. A and B end up getting 26.8% each, and C gets 46.4%. 
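The multiplier formula isn't reproduced in the text here. As a sketch, assuming it is 1 / sqrt(w' H w), where w is the vector of volatility weights within the group and H is the group's correlation matrix (this reproduces the 1.155 figure for group AB), the whole correction looks like this:

```python
import math


def diversification_multiplier(weights, corr):
    """1 / sqrt(w' H w) for volatility weights w and correlation matrix H."""
    n = len(weights)
    variance = sum(
        weights[i] * weights[j] * corr[i][j] for i in range(n) for j in range(n)
    )
    return 1.0 / math.sqrt(variance)


dm_ab = diversification_multiplier([0.5, 0.5], [[1.0, 0.5], [0.5, 1.0]])
dm_c = diversification_multiplier([1.0], [[1.0]])

# Start with equal volatility weights on each group, multiply by the
# group's diversification multiplier, then normalise back to 100%
raw = {"AB": 0.5, "C": 0.5}
adjusted = {"AB": raw["AB"] * dm_ab, "C": raw["C"] * dm_c}
total = sum(adjusted.values())
group_weights = {name: value / total for name, value in adjusted.items()}

print(round(dm_ab, 3), {k: round(v, 3) for k, v in group_weights.items()})
```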

Correlations and uncertainty: Candidate matching

(So far the method is going to produce pretty similar results to HRP but at this point we start to deviate)

Let's return to the simple example:

           A             B             C
A        1.0             0.5           0.0
B        0.5             1.0           0.0
C        0.0             0.0           1.0

It's probably fine to assume we can forecast volatility perfectly, as we effectively do by using the inverse volatility weighting method. I'm less confident doing so with correlations. Two sided 95% confidence intervals for a correlation of 0.0 with 100 observations are -0.2 to +0.2. For a correlation of 0.5 they are a bit narrower: 0.34 to 0.63. That's a fair bit of uncertainty; for the group AB the diversification multiplier could be anything between 1.11 and 1.22.
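The text doesn't say how those intervals were calculated. A common approach, and the one I'm assuming in this sketch, is the Fisher transformation, which gives very similar numbers:

```python
import math


def correlation_confidence_interval(corr, n_obs, z_crit=1.96):
    """Two sided interval for a correlation using the Fisher z transformation.

    z_crit=1.96 corresponds to a 95% interval.
    """
    z = math.atanh(corr)
    standard_error = 1.0 / math.sqrt(n_obs - 3)
    lower = math.tanh(z - z_crit * standard_error)
    upper = math.tanh(z + z_crit * standard_error)
    return lower, upper


low_zero, high_zero = correlation_confidence_interval(0.0, 100)
low_half, high_half = correlation_confidence_interval(0.5, 100)

# Diversification multiplier for two equally weighted assets: 1/sqrt(0.5 + 0.5*corr)
dm_if_corr_high = 1.0 / math.sqrt(0.5 + 0.5 * high_half)
dm_if_corr_low = 1.0 / math.sqrt(0.5 + 0.5 * low_half)

print(round(low_zero, 2), round(high_zero, 2))
print(round(low_half, 2), round(high_half, 2))
print(round(dm_if_corr_high, 2), round(dm_if_corr_low, 2))
```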

In my first book, Systematic Trading, I suggest a heuristic method to allocate to groups with non equal correlations (for equal correlations it still makes sense to use equal volatility weights, even once uncertainty is considered). The method works for groups of up to three assets; and does so by trying to match the correlation matrix to a limited number of possible candidate matrices.

There are 7 candidates, essentially forming the possible combinations (order unimportant) when correlations can take one of three values: 0, 0.5 and 0.9. Once the candidate matrix has been identified you just read off the appropriate weights. These weights have been calculated allowing for the uncertainty of correlations. For our simple example which matches a candidate exactly this method would produce weights of 30% in A and B, and 40% in C. 

Notice these are a little different from the weights I calculated above (26.8% in A and B, and 46.4% in C), the difference being that the heuristic weights allow for the uncertainty of correlations, and so are slightly closer to 1/N.

A variation of this method which I don't discuss in my book is to use interpolation. Consider this correlation matrix:

           A             B             C
A        1.0             0.4           0.0
B        0.4             1.0           0.0
C        0.0             0.0           1.0

This doesn't match any of the candidates... so what should we do?

First we need to calculate a similarity value between the correlation matrix and each candidate. A good way of doing this is to measure the distance using a differential version of the Frobenius norm (don't panic this is just the root of the sum of the squared differences between matching correlation values). A similarity value would then be the inverse of the distance. Once we have a bunch of similarities we normalise them so they form weights. We then take a weighted average of the weights proposed by the candidate matrices.

Here is a simple example showing how we'd deal with the matrix above. I've included just a couple of candidates, one with all zero correlations, and one which is identical to the first example (with 0.5 and zeros in it).

In theory this is possible with a spreadsheet, but in practice you probably wouldn't bother with this interpolation method if you were running this by hand.
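If you do want to see the interpolation spelt out, here is a rough sketch using just the two candidates mentioned above. It assumes the assets are already in the same order as in each candidate, counts each pair of assets once when measuring distance, and uses equal weights for the all zero candidate (as you would for any equal correlation group). The function names are mine.

```python
import math


def distance(corr_a, corr_b):
    """Root of the sum of squared differences between matching correlations."""
    n = len(corr_a)
    return math.sqrt(
        sum(
            (corr_a[i][j] - corr_b[i][j]) ** 2
            for i in range(n)
            for j in range(i + 1, n)
        )
    )


def interpolate_weights(corr, candidates):
    """candidates is a list of (candidate correlation matrix, candidate weights)."""
    distances = [distance(corr, cand_corr) for cand_corr, _ in candidates]
    for dist, (_, cand_weights) in zip(distances, candidates):
        if dist == 0:
            return list(cand_weights)  # exact match: just read off the weights
    similarities = [1.0 / dist for dist in distances]
    total = sum(similarities)
    n_assets = len(candidates[0][1])
    return [
        sum(sim / total * w[k] for sim, (_, w) in zip(similarities, candidates))
        for k in range(n_assets)
    ]


all_zero = ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], [1 / 3, 1 / 3, 1 / 3])
first_example = ([[1, 0.5, 0], [0.5, 1, 0], [0, 0, 1]], [0.30, 0.30, 0.40])

corr = [[1, 0.4, 0], [0.4, 1, 0], [0, 0, 1]]
weights = interpolate_weights(corr, [all_zero, first_example])
print([round(w, 3) for w in weights])
```

The matrix with 0.4 in it is four times closer to the 0.5 candidate than to the all zero candidate, so the result is an 80/20 blend of their weights.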

What makes a good group (or subgroup, or sub-subgroup...)

A good group is one which contains similar correlations. These don't have to be especially high, although it's likely that the most granular groupings will contain highly correlated assets. 

But for the purposes of portfolio construction we don't really care if a portfolio contains assets that all have zero correlation with each other, or which have 99% correlation with each other; in both cases the optimal portfolio will be equal volatility weights (assuming - still - that we can't predict Sharpe Ratios).

Indeed by the time we get to the highest level groupings (bonds and equities in the simple example above) the correlations will probably be relatively low.

Failing that we'd like to see groups that match closely to the candidates above.

Some grouping suggestions for humans

For long only portfolios of assets the following groups probably make sense (not all categories will be available in all products):
  • Equities
    • Emerging markets
      • Countries
        • Sectors
          • Firms
    • Developed markets
      • Countries
        • Sectors
          • Firms
  • Bonds
    • Developed markets
      • Country
        • Corporate
          • High yield
          • Crossover
          • Investment grade
        • Government
          • Emerging
          • Developed
        • Inflation linked 
    • Emerging markets
      • ....
  • Commodities
    • Ags
      • Grains
        • Wheat
        • Soy
          • Soybean
          • Soymeal
          • Soy oil
        • ....
      • Softs
      • Meats
    • Metals
      • Precious
      • Base
    • Energies
      • Oil and products
        • Crude
        • Gasoline
        • ....
      • Gas
  • FX
    • Developed
    • Emerging
  • Volatility
    • Country
  • Interest rates
    • Country
      • Point on the curve
  • Crypto (if you really must...!)
  • Alts
For trading strategies it usually makes sense to group in the following hierarchy:

  • Style, eg momentum, carry, ...
    • Specific trading rule, eg moving average crossover, breakout, ...
      • Variation of trading rule, eg 2/8 moving average, 4/16 moving average, ...

We'll consider the joint grouping of trading strategies over multiple instruments in the final post, as this is an empirical question (for myself I currently allocate weights to strategies for a single instrument, then allocate to instruments - but you could do it the other way round, or jointly).

There is more detail on possible groupings in my first and second books. 

Grouping by machine

To back-test we need to be able to create groups automatically. Fortunately there are numerous techniques for doing so. This is a nice notebook that shows one possible method using out of the box scipy functions (not mine!). We can set a cluster size threshold at 3, allowing us to use the 'candidate matching' technique (as this works for 3x3 correlations).

Risk targeting with groups

Let's return to a simple 3 asset portfolio:
  • S&P 500: standard deviation 16%
  • US 2 year bonds: standard deviation 2%
  • US 5 year bonds: standard deviation 4%

Correlations - between S&P 500 and each bond: 0%, between the bonds: 95%

The grouped weights for this lot come out at: 
  • 2 year bonds: 50% * 50% = 25%; cash weighting 57.1%
  • 5 year bonds: 50% * 50% = 25%; cash weighting 28.6%
  • S&P 500 equities: 50% * 100% = 50%; cash weighting 14.3%

The natural risk of this bad boy is a pathetic 3.21%. So we're back to the problem we had earlier - without access to leverage this is waaaaay too low for most people. However the nice closed form formula that we used earlier to solve this problem only works for 2 assets.

To deal with this we need to partition the portfolio into two parts: a high risk part, and a low risk part. The low risk part will contain all the assets with risk less than our risk target. The high risk component has all the assets with risk higher than our risk target. We then treat these as our two top level groups; and allocate within them as normal. Finally we use the original technique above, where our two 'assets' will be the two groups.

For this simple example (see here) let's suppose we're targeting 10% risk. The low risk group happens to contain all the bonds, whilst the high risk group contains all the equities. We do our allocation in the normal way, and end up with a bond portfolio which has a standard deviation of 2.63%, and an equity portfolio with risk of 16%. Then using the closed form formula we get a solution of 62.2% in equities and 37.8% in bonds (cash weightings).
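A useful sanity check is to work forwards rather than backwards: take the weights and confirm the risk they produce. The sketch below (the function name is mine) computes the natural risk of the grouped portfolio, the risk of the bond group on its own, and the risk of the final portfolio once the 37.8% in bonds is split between the 2 year and 5 year in their within-group cash proportions of 2/3 and 1/3.

```python
import math


def portfolio_risk(cash_weights, sds, corr):
    """Standard deviation of a portfolio given cash weights, risks and correlations."""
    n = len(cash_weights)
    variance = sum(
        cash_weights[i] * cash_weights[j] * sds[i] * sds[j] * corr[i][j]
        for i in range(n)
        for j in range(n)
    )
    return math.sqrt(variance)


# Order: S&P 500, US 2 year, US 5 year
sds = [0.16, 0.02, 0.04]
corr = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.95], [0.0, 0.95, 1.0]]

# Grouped portfolio before risk targeting: cash weights 14.3%, 57.1%, 28.6%
natural_risk = portfolio_risk([1 / 7, 4 / 7, 2 / 7], sds, corr)

# Bond group on its own: equal volatility weights mean 2/3 and 1/3 in cash terms
bond_group_risk = portfolio_risk([2 / 3, 1 / 3], sds[1:], [[1.0, 0.95], [0.95, 1.0]])

# After risk targeting: 62.2% equities, 37.8% bonds split 2/3 and 1/3
targeted_risk = portfolio_risk([0.622, 0.378 * 2 / 3, 0.378 / 3], sds, corr)

print(round(natural_risk, 4), round(bond_group_risk, 4), round(targeted_risk, 4))
```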

Warning: if we have too few assets in the high or low risk group then we could end up with a portfolio which is unnecessarily concentrated. 

Other constraints

The last technique was effectively a use of constraints. It's trivial to impose either cash weighted or volatility weighted constraints in any part of a handcrafted portfolio.

Including Sharpe Ratios

We've dealt with volatility and correlations; however we're still assuming that Sharpe Ratios are identical across all assets. Sharpe Ratios are very hard to predict, and their parameter uncertainty is substantial. But we might have some useful information about Sharpe Ratios, some conditioning information. 

For long only asset allocations we might want to use some kind of trading signal; value, momentum or whatever. You might believe in CAPM and want to use the residual part of Beta that isn't covered by inverse volatility weights. You might not believe in CAPM and want to use a 'betting against Beta' thing. You might believe that you are the king of the discretionary stock pickers, and want to make your own forecasts.

For trading strategy allocations I personally prefer to keep potentially poor strategies in my back test, and then allow the back test to weed them out (otherwise my back test results will be inflated by the absence of any ideas I've thrown away). But to do this we need to include historic Sharpe Ratio as an input into the optimisation.

But to what degree should we alter weights according to Sharpe Ratios? Have a gander at this picture:

The slightly odd x-axis of the graph shows the relative Sharpe Ratio of an asset, versus the average Sharpe Ratio (SR) in its group. The y-axis shows the suggested weight of an asset conditional on that SR, versus the typical weight in its group.

The blue line is a stylised version of what the Markowitz optimiser tends to do; either allocate nothing or everything as the Sharpe changes just a little. 

A more sensible idea might be to adjust weights proportionally with the Kelly criterion as shown with the red line. Assuming the average SR of the group is 0.5, we give something double the normal weight if it has double the average SR, the same weight as the average if it has the same SR as normal, and a zero weight if it has a zero SR. 

The red line assumes we know exactly what the SR is. We don't. If we factor in a reasonable amount of uncertainty we end up with the yellow line. Even if the estimated SR is pretty shocking we'd still end up with some weight on the asset.

This is a nice heuristic, which does a pretty good job in the face of massive uncertainty.

To use it (as outlined in "Systematic Trading" and "Smart Portfolios") you begin with some asset weights (volatility normalised works better and is more intuitive). You then multiply these by the conditioned multiplier depending on the Sharpe Ratio. Some weights will go up, some down. You then normalise the weights. 
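The multiply and normalise step is easy to sketch. I don't have the numbers behind the yellow line here, so this example uses the proportional red line (multiplier = SR / average SR, floored at zero) purely as a stand in; with the uncertainty adjusted multipliers the weights would move much less. It also assumes the group's average SR is positive.

```python
def adjust_for_sharpe_ratio(weights, sharpe_ratios):
    """Multiply each weight by a Sharpe Ratio based multiplier, then normalise.

    Uses the proportional (red line) multiplier as a stand in for the
    uncertainty adjusted one.
    """
    average_sr = sum(sharpe_ratios.values()) / len(sharpe_ratios)
    adjusted = {}
    for name, weight in weights.items():
        multiplier = max(sharpe_ratios[name] / average_sr, 0.0)
        adjusted[name] = weight * multiplier
    total = sum(adjusted.values())
    return {name: value / total for name, value in adjusted.items()}


# Three hypothetical assets with equal starting volatility weights
weights = {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3}
sharpe_ratios = {"A": 1.0, "B": 0.5, "C": 0.0}  # group average is 0.5

new_weights = adjust_for_sharpe_ratio(weights, sharpe_ratios)
print({name: round(w, 3) for name, w in new_weights.items()})
```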

What's next

In the next post I'll explain how you actually apply the handcrafting method, both manually, and with some python code.

Wednesday, 5 December 2018

Portfolio construction through handcrafting: motivating

I've talked around a type of portfolio construction called "Handcrafting" for some time now, in both of my first two books, and in the odd blog post. I thought it would be useful to explain how the technique works in a more thorough and complete series of blog posts, and also share some code that implements the method.

I intend to do five posts on this topic. The first, which you won't be surprised to hear is this one, will set the scene and motivate the use of this particular technique. For the second post I'll explain how each component of the methodology works, and why it is there. In the third I'll explain in some detail how to implement handcrafting, including the relevant code. The fourth post will do some testing with artificial data, whilst the final post will empirically test with real data.

In the rest of this post I'll wax lyrical and philosophical about how I think we should do portfolio construction, and why.

Portfolio optimisers: their use and misuse in the financial industry

The use of portfolio optimisers by human beings working in the financial industry tends to follow a well worn path. In stage one the human is ignorant and/or dismissive of optimisers, preferring to use gut feel – a purely subjective method for choosing portfolio weights. All the human baggage of cognitive biases will be allowed freely into the portfolio.

Stage two sees the optimiser used unthinkingly and its outputs taken as gospel, as if printed on immovable stone tablets. The tendency of the optimiser to produce extreme weights is ignored or forgiven until it leads to a significant out of sample under performance. At that point external pressure from investors applied via senior management usually leads to extensive manual overrides of what is supposed to be a purely systematic portfolio weighting process.

After some bad experiences we arrive at stage three, where “ugly hacks” are used to avoid extreme or otherwise unsatisfactory weights. These hacks include applying constraints and modifying the parameter estimates used by the optimiser until they give the “right” result.

The term “ugly hacks” is only used by those who have progressed beyond stage three. In contemporary usage “ugly hacks” go by alternative monikers: “tweaks”, “adjustments” and “robustification” (a word my spell checker has rightly flagged up as non existent). Regardless of their name the effect is the same. We end up with portfolio weights effectively determined by human gut feel, with the optimisation tortured until it gives the “right” answer.

In stage four we are still trying to force the optimiser to produce satisfactory weights but a more sophisticated armoury is employed, including (but not limited to): Bayesian techniques including Black-Litterman, parametric or non parametric bootstrapping of weights or the efficient frontier, inverse volatility weighting, risk parity, clustering, partial correlations, the use of higher moments, alternative utility functions, neural networks and machine learning. With the right set of data the appropriate technique will provide portfolio weights which can be justified theoretically, are pleasing to investors, and with the added bonus that you blind any doubter with large dollops of science.

Although these stages are well defined it would be misleading to assume there is a straightforward linear progression. Most retail investors will remain at stage one forever. Keeping them company are a few dinosaur era fund managers from the pre-computer age. Stage one may be bypassed entirely by quantitatively trained newcomers arriving directly from non financial industries, or with recently minted advanced degrees that are long on numeric methods but short on common sense application. They may also skip over stage three, hurrying towards the seductive sophistication of stage four.

The gradual journey from stages one to four can also be seen in the evolution of individual firms, as well as for the entire finance industry. A neophyte employee arriving in a firm which is already at stage four will find their own development accelerated. This is not always a good thing. Serving a long apprenticeship in stage two can give you a healthy scepticism of the portfolio optimisation process and an intuitive understanding of its shortcomings. Without this the use of advanced stage four technology may be as dangerous as teaching a novice how to drive in a 250mph supercar.

Stage one: Human defined portfolios "gut feel"

Portfolios created by humans have the advantage of being readily accepted by the humans that built them. Though they may lack the theoretical underpinnings of portfolios created at later stages they also lack the extreme weights that naive mean variance optimisation often delivers. However humans have a poor track record when it comes to making financial decisions using subjective judgement unaided by computing technology or economic theory.

Classical models of financial markets assume that human beings are all knowing, hyper-rational beings who make their decisions within a utility maximising framework. In reality there is considerable evidence that this model of human behaviour is completely and utterly wrong. The field of behavioural economics seeks to explain the numerous instances of apparently irrational behaviour that we see in financial markets.

I'd be loopy to try and summarise that literature here, so from a survey paper here are the main characteristics of investor behaviour which relate directly to the problem of portfolio optimisation:

  • Insufficient diversification
  • Naive diversification*
  • Excessive trading
  • A reluctance to sell assets at a loss
  • The purchase of ‘attention getting’ stocks

* “Naive diversification” is worth explaining, as it also directly relates to the concept of handcrafting. Benartzi and Thaler find that when people do diversify, they do so in a naive fashion. They provide evidence that in 401(k) plans, many people seem to use strategies as simple as allocating 1/n of their savings to each of the n available investment options, whatever those options are. 

Humans will not automatically choose to hold or rebalance portfolios in a theoretically optimal fashion, and instead will buy portfolios that aren’t sufficiently diversified and are concentrated in ‘story’ stocks, which they will then over-trade. Therefore we need to use formal portfolio optimisers to protect ourselves from our own cognitive flaws.

Stage two: The naive use and misuse of portfolio optimisers

A systematic method for portfolio optimisation is a necessary condition for serious research into portfolio weighting. Unlike subjective methods it can be automated and properly backtested. The method can be run at set intervals over historical data, with each iteration looking backwards so it is not polluted with future information. This isn’t possible with subjective methods. Even if we could persuade a human to repeatedly optimise portfolio weights it would be difficult to erase the knowledge of future events from their minds. The temptation to reduce the weight to stocks in late 1999 and 2007 would be too hard to resist.

As we know the problem of finding optimal portfolio weights was famously “solved” by Harry Markowitz in 1952. Thousands of trees have already died explaining the mean variance optimisation technique that Markowitz expounded, so I do not feel the need to kill any more.

For our purposes the important features of mean variance optimisation are that:
  • We require parameter estimates for the mean, standard deviation, and correlation of the relevant assets
  • Small differences in these estimates can result in highly unstable, extreme, weights.

In practice it is very difficult to forecast the required parameters, or even to know what they should have been in the past. I've discussed this extensively in the past (most recently with this talk). 

We cannot know the optimal portfolio weights with any certainty. More importantly unsatisfactory weights, such as 0% and 100% in a two asset problem, are common. These weights are intuitively unattractive to humans, and they are also more likely to perform badly in out of sample testing.

Stage three: Why ugly hacks are bad

Even under highly unrealistic laboratory conditions mean variance optimisation fails to deliver robust portfolio weights that will be acceptable to humans. The first instinct of many people is to bludgeon the weights until they look intuitively “right”.

The simplest hack is to introduce portfolio constraints. Don’t like to see a zero allocation to bonds? Then set a minimum 10% or 20% weight. Uncomfortable with a 100% allocation to stocks? Then set a maximum weight of perhaps 90% or 80%. We can even introduce a quantitative method to determine constraints: for N assets, set the minimum at x(1/N) and the maximum at y(1/N) where x<1 and y>1.

A more sophisticated technique is to adjust the input parameters; tweaking the mean vector and covariance matrix until you get the result you like. Some skill and intuition is required here. Indeed trying these adjustments is an excellent way to gain a deep understanding

Ugly hacks have some poor attributes. From a technical basis introducing constraints usually makes the optimisation less stable. Most optimisers use penalty functions to apply constraints. This results in highly non-linear gradient functions. Simple grid search optimisers are not subject to this problem, but they are also extremely slow and impractical for use with more than a few assets. Adjusting the inputs does not affect the optimisation function, and so is a more stable approach.

But these technical problems are not the elephant in this particular room. Both forms of hack have one significant failing: they sneak in human subjective judgement, with all its potential failings. Even if the optimisation is done on a rolling series of out of sample optimisations the constraints are normally set for the whole period.

Stage four: Why complex optimisation techniques are not the answer

Any self respecting financial quant will feel deep uncertainty when they use stage three techniques. Firstly, to anyone who spends a few moments in introspection it is obvious that we are “cheating”. Secondly, they are too trivial. It would make more sense to protect our future employment prospects by introducing a more complex method.

A small industry has grown up over the last few decades, devoted to fixing the basic mean variance optimisation in various ways. This is not a survey paper on the subject, but here are some of the methods that have been used. This first group still uses all the inputs but tries to modify them in some way:
  • Bayesian (Bayes-Stein): adjust one or more of the inputs to reflect the uncertainty of information, by finding the weighted average of a prior parameter value and the estimated parameter. 
  • Bootstrapping (non-parametric): repeatedly resample the portfolio history and find the optimal set of weights for each sample. Then take an average of the weights across samples.
  • Bootstrapping (parametric): estimate a distribution from the data history, resample the distribution, find the optimal weights for each sample, and take an average of the weights.
  • Bootstrapping (Michaud parametric): estimate a distribution from the data history, resample the distribution, find the efficient frontier for each sample, take an average of the efficient frontiers across samples, and then find the optimal point on the averaged frontier.
  • Partial correlations. Change the correlations so that they are partial correlations. This tends to shrink correlations away from +1 and -1, and so produces a more robust result.
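To make the first of these concrete, here is a toy sketch of the 'weighted average of a prior and an estimate' idea. The Sharpe Ratio estimates, the prior and the shrinkage factor are all made up for illustration; choosing them sensibly is the hard part.

```python
def shrink(estimate, prior, shrinkage):
    """Weighted average of a prior value and an estimate.

    shrinkage=0 uses the estimate as is; shrinkage=1 ignores it completely.
    """
    return shrinkage * prior + (1 - shrinkage) * estimate


# Hypothetical estimated Sharpe Ratios for three assets
estimates = [0.9, 0.2, 0.4]
prior = sum(estimates) / len(estimates)  # shrink towards the common average

shrunk = [shrink(sr, prior, shrinkage=0.75) for sr in estimates]
print([round(sr, 3) for sr in shrunk])
```

With full shrinkage every asset gets the same Sharpe Ratio, which is the same thing as overriding the estimates altogether.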
This second group effectively ignores one or more of the inputs from the optimisation (and I'd highly recommend this paper to help you think about how they fit together - particularly figure 2).
  • Override one or more of the estimated inputs, for example by equalising all portfolio means or Sharpe Ratios. This overlaps with many of the following techniques (and it's also effectively Bayesian with full shrinkage)
  • Equal weighting: Equally weight all instruments in a portfolio. 
  • Risk parity: Equally weight each asset class by its volatility contribution to the portfolio. This method is silent on how allocations should be achieved within asset classes. This portfolio may undershoot many investors' risk appetite so it's often used with leverage.
  • Inverse volatility weighting: Set weights of all assets equal to the inverse of their relative volatility (notice that this is not the same as risk parity except for certain special cases, and it ignores correlations). This method needs to be combined with other methods if means or correlations are to be used. This portfolio may undershoot many investors' risk appetite. 
  • Minimum variance: Mean variance using only the covariance matrix (correlations and volatility), and with an objective function of minimum risk. This portfolio may undershoot many investors' risk appetite.
  • Maximum diversification. Does what it says on the tin. This portfolio may undershoot many investors' risk appetite.
  • HRP (hierarchical risk parity): As already discussed this is a hierarchical equivalent of risk parity, where we first group assets then assign risk parity weights within and across groups.
Notice a general theme here: once we step out of the cosy world of optimising a fairly homogeneous group of assets like the S&P 500 into something like an asset allocation problem it becomes more likely that portfolios which don't maximise risk versus return will end up seriously undershooting investors' target risk. I'll discuss how this is solved in part two of this series.

There is a third group of ideas which involves maximising something else (Sharpe Ratio, geometric return, higher moments of the utility function); but these really lie outside the scope of what we're thinking about here.

Complex optimisation techniques are very appealing to quantitative portfolio analysts and their managers. Humans tend to assume that if something is sufficiently complicated it must be correct; and the more complicated it is, the better. All of the techniques above are more robust than stage two mean variance optimisation and less likely to produce extreme weights.

However not all the techniques shown correct properly for parameter uncertainty. Used correctly the following methods are best at dealing with parameter uncertainty: overriding inputs, Bayesian optimisation and boot-strapping. Risk parity, minimum variance and inverse volatility weighting do not use expected returns which are the largest source of uncertainty, but do not deal with the uncertainty of correlation or standard deviation estimates.

But these methods also aren’t perfect. Non-parametric boot-strapping does not always deliver robust weights with limited amounts of data, though it has the benefit of requiring no additional parameters. In contrast parametric boot-strapping requires you to select an appropriate distribution and estimate the parameters for it. A joint Gaussian distribution is insufficient for most financial data. Trying to estimate higher moments like (co-) skewness and (co-) kurtosis is tricky; they are easily influenced by one or two outliers, especially if data is limited. Dealing with autocorrelation in return series also makes bootstrapping harder.

Bayesian optimisation requires you to come up with appropriate priors that contain no forward looking information, and a shrinkage parameter that is appropriate to the length and noise level in the data. These additional tuneable parameters transform the problem from one of portfolio optimisation to methodology optimisation.

Often a series of experiments is necessary to determine which combination of technique and tuneable methodology parameter gives the best result. It’s possible in theory to calibrate a shrinkage parameter on a rolling basis using only backward looking data but in practice methodology optimisation is rarely done automatically using a pure out of sample approach.

A number of proper portfolio optimisations are done on a pure out of sample basis; but then the best method is usually selected by the researcher based on their performance over the entire data-set. Implicit in-sample fitting has just crept into our scientifically rigorous process via the back door.

The other significant problem is that the more complicated the process, the less intuitive and transparent it is to humans. Humans are less likely to accept weights that are not intuitive and transparent, which leads to the problems I outlined earlier. With experience the process of naive stage two mean variance optimisation is reasonably intuitive. Almost all of the techniques above are more complicated and therefore less intuitive than naive mean variance.

But there are degrees of difficulty here. After extensive practice Bayesian methods can seem reasonably intuitive (they are to me at least!). Risk parity, inverse volatility and clustering make logical sense, although they have to be combined with other techniques to produce reasonable results. For this reason I will be using inverse volatility and clustering as part of the handcrafting method. In contrast it is not obvious to humans how the precise weights out of a boot-strapping process have been derived, even if the process is simple to describe.

I haven’t discussed any of the highly sophisticated techniques that are currently in vogue amongst practitioners: Artificial intelligence, machine learning or neural networks. I do not feel qualified to comment on their efficacy or robustness. But it is clear that they are far worse than any of the methods mentioned above when it comes to opaqueness.

Some desirable attributes of portfolio construction

A portfolio (or any systematic methodology) is no good if a human being isn't happy with it. At the first sign of any loss a human will dump a portfolio that they never liked to begin with, and start meddling. This will completely detract from the point of making trading and investment decisions in a systematic way.

Hence, we can define some idealised characteristics for portfolio optimisation methodologies.

Firstly, a good method for portfolio optimisation is one that will be accepted and understood by human beings. This is important because humans will be tempted to override the portfolio weights if they find them undesirable, or if the resulting portfolio underperforms. Specifically we require that:

  1. The portfolio weights are acceptable to human beings. Human beings like weights that make sense, and they also dislike both extreme weights and weights that move around a lot (the latter also lead to increased trading costs). 
  2. The process by which the weights are derived is transparent to human beings
  3. The process is intuitively obvious
Importantly it is not sufficient that the person performing the optimisation finds the weights acceptable, and the process transparent and intuitive. Any stakeholder with the power to force the weights to be changed must also be satisfied, including investors and senior fund managers.

An ideal portfolio should also satisfy certain theoretical conditions:
  4. The portfolio weights are free of cognitive biases
  5. They should satisfy generally accepted principles of financial theory, for example: diversified portfolios are better
  6. The weights should be robust to problems such as parameter uncertainty and optimisation instability.
(The last of these points might not be interesting to most humans, but they're necessary for me personally!)

Finally there are advantages to running a portfolio process which is systematic: it can be back-tested against historic data avoiding in-sample over fitting, is repeatable and transferable, can be automated hence reducing costs, can be applied to large portfolios, and its likely behaviour can be calibrated and understood. This gives our final conditions:

  7. The portfolio process can be systematically applied
  8. The process does not use in-sample data for both fitting and testing portfolio weights

Finally it's clearly the case that the performance of any methodology, properly measured (out of sample or in live trading), should not be statistically inferior to that of some competing methodology. But I'll address this in the final two posts of this series.

Which conditions are met at each stage?

For each stage, advantages are listed first, followed by disadvantages.

Stage one: gut feel

Advantages:
1: Weights are acceptable to human beings.
2 & 3: Process is intuitive and transparent – if you set the weights
6: Instability is not usually a problem

Disadvantages:
2: Process is not intuitive or transparent – if they are someone else's weights
4: Cognitive biases are present
5: Financial theory is often ignored
6: Parameter uncertainty is rarely taken into account
7 & 8: Cannot be back-tested – always in sample.

Stage two: Naive optimisation

Advantages:
2: The process is transparent
3: If you are experienced, the process is reasonably intuitive
4 & 5: Weights are free of cognitive biases, and conform to financial theory
7 & 8: Easily backtested

Disadvantages:
1: Weights are unacceptable to human beings
3: The process may not be intuitive for non specialists.
6: The process isn’t robust and will perform badly out of sample

Stage three: ugly hacking

Advantages:
1: With enough hacking we can usually find acceptable weights.
2: The process is simple enough to be intuitive to some.
6: Hacking parameter estimates usually makes the process more robust (unknowingly).

Disadvantages:
2: The process isn’t very transparent
3: The process may not be intuitive.
4 & 5: Cognitive biases and financial heresy can sneak in via the back door
6: Applying constraints usually makes the process less robust
7 & 8: It is difficult in practice to run this as a systematic back test; effectively in sample optimisation is sneaked in via the back door

Stage four: advanced technology

Advantages:
1: With the right technique we can usually find acceptable weights
4 & 5: Weights are free of cognitive biases and standard financial theory is respected
6: Many techniques respect parameter uncertainty
7: The process can be systematically applied
8: When used correctly in sample fitting is avoided.

Disadvantages:
2: The process is very unlikely to be intuitive
3: To non-experts the process is not transparent. In some cases (e.g. neural networks) the process is opaque for everyone.
6: Certain techniques do not account for parameter uncertainty
7 & 8: “Methodology shopping”* - searching for a precise technique which produces the best outcome – can lead to implicit in sample fitting.

(* A simple example of “methodology shopping” is running a series of Bayesian optimisations in which the shrinkage parameter is allowed to vary; and then selecting the best shrinkage parameter based on out of sample portfolio performance – which means in practice that the optimal shrinkage has been selected using all the available data effectively fitting in sample.)

Very brief intro to handcrafting

I'll explain handcrafting in inordinate detail in the second post of this series, but for new readers the basic idea is that you split your portfolio into groups in a hierarchical fashion, and then apply equal risk weighting within each group.

(There are also quite a few twists on the basic idea of equal risk weighting, which I'll explain further in the next post)
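The basic idea can be sketched in a few lines of Python. This is only a minimal illustration, assuming the groups are supplied by hand as nested lists and that risk weights are simply split equally at each level of the hierarchy; it ignores all of the twists mentioned above. The function name `equal_risk_weights` and the asset names are hypothetical and are not taken from pysystemtrade.

```python
def equal_risk_weights(group, weight=1.0):
    """Recursively split `weight` equally among the members of `group`.

    `group` is either an asset name (str) or a list whose members are
    asset names or further sub-groups.
    Returns a dict mapping asset name -> risk weight.
    """
    if isinstance(group, str):
        return {group: weight}
    weights = {}
    share = weight / len(group)
    for member in group:
        weights.update(equal_risk_weights(member, share))
    return weights

# Hypothetical hierarchy: bonds versus equities,
# with equities split into the US and a European sub-group
hierarchy = [
    ["US_BOND", "DE_BOND"],
    ["US_EQ", ["UK_EQ", "DE_EQ"]],
]

risk_weights = equal_risk_weights(hierarchy)
```

With this hierarchy the two bonds and `US_EQ` each get a risk weight of 0.25, while the two European equity markets get 0.125 each. These are risk weights rather than cash weights: turning them into cash weights would also need volatility estimates.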

This may sound quite a lot like HRP, but in fairness I did come up with the method independently* and there are some important differences. In particular it's possible to implement a "handcrafted" portfolio entirely by hand without any computing power, hence the name, and this is the core idea at the heart of handcrafting.

(* I first became aware of HRP in early 2017, but I've been using handcrafting in one form or another for several years and my book containing the concept was written in 2014 and 2015, and then published in October 2015. Having said that the core idea of handcrafting is that it reflects the way humans naturally like to do portfolio construction, so I'm not sure anyone can really claim ownership of the concept. I certainly won't be patenting it!)

Handcrafting is essentially a heuristic method rather than an optimisation, although it does use the same inputs as any optimisation would: the expected distribution of asset returns, plus any relevant constraints.

So... handcrafting

To finish this post let me explain why I think handcrafting, unlike the other methods we've discussed, meets all the criteria I've outlined above. Before that let me just explain that there are effectively two types of handcrafting:

  • A 'humane' method that human beings can use to produce 'one off' intuitive portfolios without requiring any advanced technology
  • An automated method that can be used to back test the methodology over repeated time periods
The main difference between these methods is in the choice of groupings of assets. With the humane method this is done subjectively; with the automated method it's done objectively. A human could apply the automated method of course; but it would be extremely painful. Conversely you probably couldn't automate the discretionary opinions of human beings.  Anyway on to the characteristics that we require:

  1. The portfolio weights are acceptable to human beings: Weights won't be extreme, and except when assets move in or out of the portfolio are mostly pretty stable.
  2. The process by which the weights are derived is transparent to human beings: a human being can follow the handcrafting process because they can actually do it themselves (in its humane form) on the back of a beer mat or napkin. This is the key strength of handcrafting compared to stage 2, 3 and 4 portfolio optimisation.
  3. The process is intuitively obvious: although there are a lot of steps to the full handcrafting process (as we'll discover in the next post), each step can clearly be understood and explained.
  4. The portfolio weights are free of cognitive biases: True of the automated method but might not be true for the humane method
  5. They should satisfy generally accepted principles of financial theory, for example: diversified portfolios are better. Yes: the core of handcrafting is to create the portfolio with the most diversification. The heuristics that I'll use in handcrafting are all supported by solid theoretical and empirical foundations.
  6. The weights should be robust to problems such as parameter uncertainty and optimisation instability: As will become clear in the next post this is definitely true of handcrafting
  7. The portfolio process can be systematically applied: Entirely true for the automated method
  8. The process does not use in-sample data for both fitting and testing portfolio weights: Entirely true for the automated method, may not be completely true for the humane method.

What's next

In the second post I'll explain the handcrafted method in more detail, and share some code with you.