Friday, 7 December 2018

Portfolio construction through handcrafting: The method

This post is all about handcrafting; a method for doing portfolio construction which human beings can do without computing power (although realistically you'd probably need a spreadsheet unless you're some kind of weird masochist). The method aims to achieve the following goals:

  • Humans can trust it: intuitive and transparent method which produces robust weights
  • Can be easily implemented by a human in a spreadsheet
  • Can be back tested
  • Grounded in solid theoretical foundations
  • Takes account of uncertainty in data estimates
  • Decent out of sample performance
  • Addresses the problem of allocating capital to assets on a long only basis, or to trading strategies. It won't be suitable for a long/short portfolio.

This is the second in a series of posts on the handcrafting method.
  1. The first post can be found here, and it motivates the need for a method like this.
  2. In this post I build up the various components of the method, and discuss why they are needed. 
  3. In the next post, I'll explain how you'd actually apply the method step by step, with code. 
  4. Post four will test the method with real data

A brief note about HRP


This method does share some common ground with hierarchical risk parity (HRP)*. 



* To reiterate I did come up with the method independently of HRP: I first became aware of HRP in early 2017, but I've been using handcrafting in one form or another for several years and my book containing the concept was written in 2014 and 2015, and then published in October 2015. Having said that the core idea of handcrafting is that it reflects the way humans naturally like to do portfolio construction, so I'm not sure anyone can really claim ownership of the concept. 

Indeed in some ways you could think of it as HRP broken down into steps that make intuitive sense to human beings (and that can be implemented using only a spreadsheet). However as you'll see there are some fundamental differences involved.


Equal weights


Everyone loves equal weights. High falutin academic finance types like them (see here). More importantly human beings like them too. Benartzi and Thaler find that when people do diversify, they do so in a naive fashion. They provide evidence that in 401(k) plans, many people seem to use strategies as simple as allocating 1/n of their savings to each of the n available investment options, whatever those options are.

Even people who know a *lot* about portfolio optimisation like them:

"I should have computed the historical covariance of the asset classes and drawn an efficient frontier…I split my contributions 50/50 between bonds and equities." (Harry Markowitz)

Equal weights should be the starting point for any portfolio allocation. If we know literally nothing about the assets in question, why then equal weighting is the logical choice. More formally if:
  • Sharpe Ratios are statistically indistinguishable / completely unpredictable
  • Standard deviations are statistically indistinguishable / completely unpredictable
  • Correlations are statistically indistinguishable / completely unpredictable
... then equal weights are the optimal portfolio. For highly diversified indices like the S&P 500 equal weights are going to do pretty well; since these things are close enough to the truth not to matter. But in many other cases they won't.

[An aside: I use Sharpes, standard deviations and correlations rather than the usual formulation of means and covariances. I do this for two reasons;

  1. the uncertainty and predictability of the inputs I prefer is different - for example it is usually fair to say at least across asset classes that you should expect equal Sharpe Ratio, but assuming equal mean is far too heroic 
  2. human beings are better at intuitively understanding my preferred inputs - with experience you will learn that a Sharpe of 0.5 is good and 1.0 is better; that an annual standard deviation of 16% means that on average you'll make or lose about 1% of your capital a day; that a correlation of 0.2 is low and 0.95 is really high. Interpreting variances or worse covariances is much harder; comparing means across asset classes is silly.]


Inverse volatility weighting


The most heroic of the assumptions listed above is this:
  • Standard deviations are statistically indistinguishable / completely unpredictable
They're not. Standard deviations in general are the most predictable characteristic of an asset's returns. Their sampling uncertainty is lower than for any other estimate. So an equal weighted portfolio for the S&P 500 (standard deviation ~16% a year) and US 2 year bonds (standard deviation ~2% a year) doesn't make sense. It will get most of its risk from equities.

The solution is to use inverse volatility weighting. We give each asset a weight of 1/s where s is the standard deviation. Except for certain special cases this won't produce nice weights, so we'll have to normalise these to sum up to 1.

For S&P 500 & US 2 year bonds we'd get weights of 1/.16 and 1/.02, which once normalised is 11.1% and 88.9%. 

We now have two different ways of thinking about weightings: cash weightings and volatility weightings. For our simple example the cash weightings are 11.1% & 88.9%, and the volatility weightings are 50% & 50%.


  • To convert from volatility weighting to cash weighting we divide by the standard deviation of each asset, and then normalise weights back to 1. 
  • To convert from cash to volatility weighting we multiply by the standard deviation, and then normalise weights back to 1.
Note: if allocating capital to trading strategies then this step will not be necessary. If built properly trading strategies should target a given long term risk level. Small differences in the actual risk level should not be used to vary the portfolio weights that they are allocating.
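The two conversions can be sketched in a few lines of Python. This is just an illustration of the bullets above using the S&P 500 / US 2 year example; the function names are my own inventions rather than part of any library.

```python
def normalise(weights):
    """Scale a list of weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def vol_to_cash(vol_weights, stdevs):
    """Divide each volatility weight by its standard deviation, then normalise."""
    return normalise([w / s for w, s in zip(vol_weights, stdevs)])


def cash_to_vol(cash_weights, stdevs):
    """Multiply each cash weight by its standard deviation, then normalise."""
    return normalise([w * s for w, s in zip(cash_weights, stdevs)])


stdevs = [0.16, 0.02]  # S&P 500, US 2 year bonds
cash = vol_to_cash([0.5, 0.5], stdevs)
print([round(w, 3) for w in cash])                       # [0.111, 0.889]
print([round(w, 3) for w in cash_to_vol(cash, stdevs)])  # [0.5, 0.5]
```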


Inverse volatility weighting with risk targeting


There is one fatal flaw with the inverse volatility approach. Consider the portfolio of 11.1% in S&P 500 and 88.9% in US 2 year bonds; this portfolio will give us risk of between 2.5% and 3.5% depending on the correlation. For most people that's *way* too little risk. Or to be precise for most people that's *way* too little return: if each asset had a Sharpe Ratio of 0.5 (which is rather punchy for a long only asset) that would give an expected excess return of around 1.8%.

That's fine if you can use leverage; then all you need to do is use a leverage factor of target risk divided by expected risk. For example if you're targeting 10% risk and the correlation is 0.5, giving risk of 3.1%, then you'd apply a leverage factor of 3.2.

[The topic of which is the correct risk target falls outside of the scope of this method - this is covered in slides 80 to 91 of this talk]

What if you can't use leverage? Then you're going to have to constrain your allocation to low risk assets. So we need to calculate the portfolio weights that would give us the right level of risk. Incoming LaTeX...


[the negative solution will be invalid as long as the risk of the first asset \sigma_1 is greater than that of the second \sigma_2 but that's just ordering]

The above might look a bit horrific but it can be implemented in google sheets or open office spreadsheet (other, more expensive, spreadsheets are apparently available). I'd also argue that it is intuitive - or at least it's very easy to get intuition by playing around with the spreadsheet. After 20 minutes even the dumbest MBA* will understand the intuition....

* There will be precisely two digs at MBAs in this post. It's nothing personal. Some of my best friends have MBAs. Actually that isn't true, I don't have any friends with MBAs, but I'm sure that people with MBAs are harmless enough.

Let's try it with our simple example. From the sheet with a portfolio risk target of 10% and a correlation of 0.5 we get a weight of 59.8% on the first asset (the S&P 500), 40.2% on the second (US 2 year bonds). Translated into volatility weightings that's 92.3% on the S&P 500 and 7.7% on the US 2 year.
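If you'd rather check those numbers in Python than in a spreadsheet, here is a minimal sketch. It assumes the weight w on the first asset solves the standard two asset variance equation, target^2 = w^2*s1^2 + (1-w)^2*s2^2 + 2*w*(1-w)*rho*s1*s2, and takes the positive root of the resulting quadratic. The function name and argument names are mine.

```python
import math


def weight_for_risk_target(target, s1, s2, rho):
    """Cash weight on asset 1 (the riskier asset) for a fully invested two asset portfolio.

    Rearranging the two asset variance equation gives a*w^2 + b*w + c = 0.
    """
    a = s1 ** 2 + s2 ** 2 - 2 * rho * s1 * s2
    b = 2 * rho * s1 * s2 - 2 * s2 ** 2
    c = s2 ** 2 - target ** 2
    discriminant = b ** 2 - 4 * a * c
    if discriminant < 0:
        raise ValueError("risk target is below the minimum achievable risk")
    return (-b + math.sqrt(discriminant)) / (2 * a)  # positive root


w_equity = weight_for_risk_target(target=0.10, s1=0.16, s2=0.02, rho=0.5)
w_bond = 1 - w_equity
print(round(w_equity, 3), round(w_bond, 3))  # 0.598 0.402

# Translate the cash weights into volatility weightings
risk_equity, risk_bond = w_equity * 0.16, w_bond * 0.02
print(round(risk_equity / (risk_equity + risk_bond), 3))  # 0.923
```

A weight above 1 means the target can't be reached without leverage; the sketch doesn't guard against the degenerate case where both assets have the same risk and a correlation of 1.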

Some obvious points:

  • For very high risk targets the spreadsheet will produce a value for the weight on the first asset above 1. Clearly you can't achieve risk of 17% without leverage if the riskiest asset you have has risk of 16%.
  • For very low risk targets the spreadsheet will produce inefficient weights or errors, eg for a target of 2.5% it will want to put 94% in the US 2 year bond; but this would produce a lower return than allocating 18.8% of your portfolio to cash and the rest to the inverse volatility portfolio. See this second sheet.

So in practice (where 'the constructed portfolio' is the inverse vol portfolio with equal volatility weights):

  • If you can use leverage, then use a leverage factor = (risk target / natural risk of constructed portfolio)
  • If you can't use leverage:
    • If the risk target is lower than the 'natural' risk of the constructed portfolio, then use the constructed portfolio and add cash as required; proportion of cash = (natural risk - risk target) / natural risk. Then multiply the natural weights of the constructed portfolio by (1- proportion of cash).
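    • If the risk target is higher than the 'natural' risk, but no higher than the risk of the riskiest asset, then use the closed form formula above to constrain the allocation to low risk assets.
    • If the risk target is higher than the risk of the riskiest asset in the constructed portfolio... no feasible solution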
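As a rough sketch, the decision rules above could be written like this in Python. The function name, the returned labels and the 3.08% natural risk figure (the inverse vol portfolio at a correlation of 0.5) are my own illustration.

```python
def risk_target_plan(natural_risk, target, riskiest_asset_risk, can_use_leverage):
    """Return (action, number): a leverage factor, a cash proportion, or None."""
    if can_use_leverage:
        return "leverage", target / natural_risk
    if target <= natural_risk:
        return "add cash", (natural_risk - target) / natural_risk
    if target > riskiest_asset_risk:
        return "no feasible solution", None
    # In between: shift weight towards the riskier assets (the two asset formula)
    return "reweight towards riskier assets", None


natural_risk = 0.0308  # inverse vol portfolio of S&P 500 and US 2 year, correlation 0.5

print(risk_target_plan(natural_risk, 0.10, 0.16, can_use_leverage=True))

action, cash = risk_target_plan(natural_risk, 0.025, 0.16, can_use_leverage=False)
natural_weights = [0.111, 0.889]
final_weights = [w * (1 - cash) for w in natural_weights]
print(action, round(cash, 3), [round(w, 3) for w in final_weights])
# add cash 0.188 [0.09, 0.722]
```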



Hierarchical grouping


We know what to do if we can forecast volatility, but what about correlations? The parameter uncertainty on correlations is a little higher than volatility, and they aren't quite as easy to forecast, but in many portfolio problems it's going to be madness to ignore correlations. 

[In something like the S&P 500 the industry sectors have roughly the same number of constituents, so ignoring correlation structure isn't going to be the end of the world]

Let's move up to a three asset example:

  • S&P 500: standard deviation 16%
  • US 2 year bonds: standard deviation 2%
  • US 5 year bonds: standard deviation 4%

Correlations - between S&P 500 and each bond: 0%, between the bonds: 95%

Doing equal volatility weighting would put 1/3 of our portfolio risk into each of these three assets. So two thirds of our portfolio risk* would be in bonds. This doesn't make a lot of sense.

* not quite. I'll relax this approximation later.

The way to deal with this is by creating groups. Groups contain assets that are similar, with similarity measured by correlation. Grouping can be done purely by humans, or if you want to back test the method it can be done automatically [and I'll explore how in the next post]. The results probably won't be especially different, however it's important we can back test this methodology, even if we end up using human determined groups in live trading (which indeed is what I do myself).

In this case the groups are obvious: a group of two bonds, and a group containing a single stock.

Once we've got our groups together we follow the procedure: allocate to the groups, then allocate within the groups, and then multiply out to find the final weights.

Allocate to the groups:
  • 50% volatility weighting in bonds
  • 50% volatility weighting in equities
Then allocate within groups:
  • Bonds:
    • 50% volatility weighting in 2 years
    • 50% volatility weighting in 5 years
  • Equities
    • 100% in S&P 500
To find the final weights:

  • 2 year bonds: 50% * 50% = 25%
  • 5 year bonds: 50% * 50% = 25%
  • S&P 500 equities: 50% * 100% = 50%
These are volatility weightings - we'd need to divide by risk and normalise to get cash weights.
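Here is the same three asset calculation as a short Python sketch, including the final conversion to cash weights. The dictionary layout is just one convenient way of writing the groups down.

```python
group_vol_weights = {"Bonds": 0.5, "Equities": 0.5}
within_group_vol_weights = {
    "Bonds": {"US 2 year": 0.5, "US 5 year": 0.5},
    "Equities": {"S&P 500": 1.0},
}
stdevs = {"US 2 year": 0.02, "US 5 year": 0.04, "S&P 500": 0.16}

# Multiply out: weight of the group times weight within the group
vol_weights = {
    asset: group_vol_weights[group] * weight
    for group, members in within_group_vol_weights.items()
    for asset, weight in members.items()
}

# Divide by risk and normalise to get cash weights
raw_cash = {asset: weight / stdevs[asset] for asset, weight in vol_weights.items()}
total = sum(raw_cash.values())
cash_weights = {asset: raw / total for asset, raw in raw_cash.items()}

for asset in vol_weights:
    print(asset, vol_weights[asset], round(cash_weights[asset], 3))
# US 2 year 0.25 0.571
# US 5 year 0.25 0.286
# S&P 500 0.5 0.143
```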


Hierarchical grouping


This method can also be applied hierarchically. Consider the following assets:

  • S&P 500
  • US 2 year bonds
  • US 5 year bonds
  • German 5 year bonds

There could be some debate about how these should be grouped, but let's go with the following:

  • Bonds
    • US bonds
      • 2 year bonds
      • 5 year bonds
    • German bonds
      • 5 year bonds
  • Equities
    • US equities
      • S&P 500

This gives the following weights (check you can see where these came from yourself):
  • 2 year US bonds: 50% * 50% * 50% = 12.5%
  • 5 year US bonds: 50% * 50% * 50% = 12.5%
  • 5 year German bonds: 50% * 50% * 100% = 25%
  • S&P 500 equities: 50% * 100% * 100% = 50%
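With more levels it is easier to let a small recursive function do the multiplying out. This sketch assumes equal volatility weights at every level and represents the hierarchy as nested dictionaries, with a list of asset names at the bottom of each branch; that representation is my own choice.

```python
def equal_weights(tree, weight=1.0):
    """Split `weight` equally between the children at each level of the hierarchy."""
    if isinstance(tree, list):  # lowest level group: a list of asset names
        return {asset: weight / len(tree) for asset in tree}
    result = {}
    for subtree in tree.values():
        result.update(equal_weights(subtree, weight / len(tree)))
    return result


hierarchy = {
    "Bonds": {
        "US bonds": ["US 2 year", "US 5 year"],
        "German bonds": ["German 5 year"],
    },
    "Equities": {
        "US equities": ["S&P 500"],
    },
}

weights = equal_weights(hierarchy)
print(weights)
# {'US 2 year': 0.125, 'US 5 year': 0.125, 'German 5 year': 0.25, 'S&P 500': 0.5}
```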


Diversification multiplier correction


Consider the following correlation matrix for three arbitrary assets:

           A             B             C
A        1.0             0.5           0.0
B        0.5             1.0           0.0
C        0.0             0.0           1.0


We'd probably split this into two groups: assets A and B (quite similar), and asset C (quite different). Assuming they all have the same standard deviation (say 10%), the standard deviation of the sub portfolio AB (with 50% in each) will be 8.66%; lower than 10% because the correlation is 0.5. The standard deviation of the sub portfolio C which only has one asset is just 10%.

If we put half our portfolio into AB and half into C, then we'll actually have over allocated to portfolio C since it's riskier. To correct for this we need to work out the diversification multiplier; this will produce the correct result without having to go through the slightly more complex task of working out the estimated risk of each portfolio, resulting in a more intuitive process. 

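The diversification multiplier for a group is:

$$ DM = \frac{1}{\sqrt{w^\top H w}} $$

where w is the vector of volatility weights within the group, and H is the correlation matrix of the assets in the group.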

Or if you prefer a spreadsheet, there is one here

The diversification multiplier is just the ratio of the risk of a single asset to the risk of the group (here 10% divided by 8.66%). With larger groups it will get larger. With lower correlations it will get larger. Again play with the spreadsheet, and even the most moronic MBA will eventually get the intuition here.

The multiplier for group AB comes out at 1.155, and for group C by construction it's one.

To apply the diversification multiplier, we start with giving equal volatility weightings to each group. We then multiply the raw weight of each group by its diversification multiplier. Then, because our weights will now add up to more than 1 we normalise them.

Looking at this spreadsheet you can see this produces weights of 53.6% in group AB, and 46.4% in group C. A and B end up getting 26.8% each, and C gets 46.4%. 
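For those who prefer code to spreadsheets, here is a sketch of the same correction in Python. It assumes the multiplier is one over the square root of w'Hw, where w holds the volatility weights within the group and H is the group's correlation matrix; that reproduces the 1.155 quoted above. The function name is mine.

```python
import math


def diversification_multiplier(vol_weights, corr):
    """1 / sqrt(w' H w) for volatility weights w and correlation matrix H."""
    n = len(vol_weights)
    variance = sum(
        vol_weights[i] * vol_weights[j] * corr[i][j]
        for i in range(n)
        for j in range(n)
    )
    return 1.0 / math.sqrt(variance)


dm_ab = diversification_multiplier([0.5, 0.5], [[1.0, 0.5], [0.5, 1.0]])
dm_c = diversification_multiplier([1.0], [[1.0]])
print(round(dm_ab, 3), dm_c)  # 1.155 1.0

# Equal volatility weights for each group, multiplied by the DM, then normalised
raw = [0.5 * dm_ab, 0.5 * dm_c]
group_weights = [r / sum(raw) for r in raw]
print([round(w, 3) for w in group_weights])  # [0.536, 0.464]

# A and B each get half of the AB group
print(round(group_weights[0] / 2, 3))  # 0.268
```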


Correlations and uncertainty: Candidate matching


(So far the method is going to produce pretty similar results to HRP but at this point we start to deviate)

Let's return to the simple example:


           A             B             C
A        1.0             0.5           0.0
B        0.5             1.0           0.0
C        0.0             0.0           1.0

It's probably fine to assume we can forecast volatility perfectly, as we effectively do by using the inverse volatility weighting method. I'm less confident doing so with correlations. Two sided 95% confidence intervals for a correlation of 0.0 with 100 observations are -0.2 to +0.2. For a correlation of 0.5 they are a bit narrower: 0.34 to 0.63. That's a fair bit of uncertainty; for the group AB the diversification multiplier could be anything between 1.11 and 1.22.
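These intervals can be roughly reproduced with the Fisher z-transformation, which is a standard approximation for the sampling distribution of a correlation. I'm assuming that approach here for illustration (the original calculation may have been done differently), and the function name is my own.

```python
import math


def correlation_confidence_interval(rho, n_obs, z=1.96):
    """Approximate two sided interval for a correlation using the Fisher transformation."""
    centre = math.atanh(rho)
    half_width = z / math.sqrt(n_obs - 3)
    return math.tanh(centre - half_width), math.tanh(centre + half_width)


for rho in (0.0, 0.5):
    lower, upper = correlation_confidence_interval(rho, n_obs=100)
    print(rho, round(lower, 2), round(upper, 2))
# 0.0 -0.2 0.2
# 0.5 0.34 0.63

# Diversification multiplier for two equally weighted assets at each end of the interval
lower, upper = correlation_confidence_interval(0.5, n_obs=100)
dm_range = [1 / math.sqrt(0.5 + 0.5 * rho) for rho in (upper, lower)]
print([round(dm, 2) for dm in dm_range])  # [1.11, 1.22]
```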

In my first book, Systematic Trading, I suggest a heuristic method to allocate to groups with non equal correlations (for equal correlations it still makes sense to use equal volatility weights, even once uncertainty is considered). The method works for groups of up to three assets; and does so by trying to match the correlation matrix to a limited number of possible candidate matrices.

There are 7 candidates, essentially forming the possible combinations (order unimportant) when correlations can take one of three values: 0, 0.5 and 0.9. Once the candidate matrix has been identified you just read off the appropriate weights. These weights have been calculated allowing for the uncertainty of correlations. For our simple example, which matches a candidate exactly, this method would produce weights of 30% in A and B, and 40% in C. 

Notice these are a little different from the weights I calculated above (26.8% in A and B, and 46.4% in C), the difference being that the heuristic weights allow for the uncertainty of correlations, and so are slightly closer to 1/N.

A variation of this method which I don't discuss in my book is to use interpolation. Consider this correlation matrix:


           A             B             C
A        1.0             0.4           0.0
B        0.4             1.0           0.0
C        0.0             0.0           1.0

This doesn't match any of the candidates... so what should we do?

First we need to calculate a similarity value between the correlation matrix and each candidate. A good way of doing this is to measure the distance using a differential version of the Frobenius norm (don't panic this is just the root of the sum of the squared differences between matching correlation values). A similarity value would then be the inverse of the distance. Once we have a bunch of similarities we normalise them so they form weights. We then take a weighted average of the weights proposed by the candidate matrices.

Here is a simple example showing how we'd deal with the matrix above. I've included just a couple of candidates, one with all zero correlations, and one which is identical to the first example (with 0.5 and zeros in it).

In theory this is possible with a spreadsheet, but in practice you probably wouldn't bother with this interpolation method if you were running this by hand.
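It's a lot less painful in code. Here is a sketch in Python using the same two candidates: the all zero matrix (equal correlations, so equal weights of one third each) and the 0.5 / zero matrix (30%, 30%, 40%). The function names are mine, and I sum the squared differences over the whole matrix; counting each pair only once scales every distance by the same factor, so the normalised similarities would be unchanged.

```python
import math


def frobenius_distance(m1, m2):
    """Root of the sum of squared differences between matching correlation values."""
    return math.sqrt(
        sum((a - b) ** 2 for row1, row2 in zip(m1, m2) for a, b in zip(row1, row2))
    )


def interpolate_weights(corr, candidates):
    """candidates is a list of (candidate correlation matrix, candidate weights) pairs."""
    distances = [frobenius_distance(corr, matrix) for matrix, _ in candidates]
    for distance, (_, weights) in zip(distances, candidates):
        if distance == 0:  # exact match: just read off the candidate weights
            return list(weights)
    similarities = [1 / d for d in distances]
    total = sum(similarities)
    n_assets = len(candidates[0][1])
    return [
        sum(s / total * weights[i] for s, (_, weights) in zip(similarities, candidates))
        for i in range(n_assets)
    ]


all_zero = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
half_and_zero = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]]
candidates = [(all_zero, [1 / 3, 1 / 3, 1 / 3]), (half_and_zero, [0.3, 0.3, 0.4])]

observed = [[1.0, 0.4, 0.0], [0.4, 1.0, 0.0], [0.0, 0.0, 1.0]]
weights = interpolate_weights(observed, candidates)
print([round(w, 3) for w in weights])  # [0.307, 0.307, 0.387]
```

With only these two candidates the 0.5 matrix is four times closer than the all zero matrix, so it gets 80% of the say.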


What makes a good group (or subgroup, or sub-subgroup...)


A good group is one which contains similar correlations. These don't have to be especially high, although it's likely that the most granular groupings will contain highly correlated assets. 

But for the purposes of portfolio construction we don't really care if a portfolio contains assets that all have zero correlation with each other, or which have 99% correlation with each other; in both cases the optimal portfolio will be equal volatility weights (assuming - still - that we can't predict Sharpe Ratios).

Indeed by the time we get to the highest level groupings (bonds and equities in the simple example above) the correlations will probably be relatively low.

Failing that we'd like to see groups that match closely to the candidates above.


Some grouping suggestions for humans


For long only portfolios of assets the following groups probably make sense (not all categories will be available in all products):
  • Equities
    • Emerging markets
      • Countries
        • Sectors
          • Firms
    • Developed markets
      • Countries
        • Sectors
          • Firms
  • Bonds
    • Developed markets
      • Country
        • Corporate
          • High yield
          • Crossover
          • Investment grade
        • Government
          • Emerging
          • Developed
        • Inflation linked 
    • Emerging markets
      • ....
  • Commodities
    • Ags
      • Grains
        • Wheat
        • Soy
          • Soybean
          • Soymeal
          • Soy oil
        • ....
      • Softs
      • Meats
    • Metals
      • Precious
      • Base
    • Energies
      • Oil and products
        • Crude
        • Gasoline
        • ....
      • Gas
  • FX
    • Developed
    • Emerging
  • Volatility
    • Country
  • Interest rates
    • Country
      • Point on the curve
  • Crypto (if you really must...!)
  • Alts
For trading strategies it usually makes sense to group in the following hierarchy:

  • Style, eg momentum, carry, ...
    • Specific trading rule, eg moving average crossover, breakout, ...
      • Variation of trading rule, eg 2/8 moving average, 4/16 moving average, ...

We'll consider the joint grouping of trading strategies over multiple instruments in the final post, as this is an empirical question (for myself I currently allocate weights to strategies for a single instrument, then allocate to instruments - but you could do it the other way round, or jointly).

There is more detail on possible groupings in my first and second books. 


Grouping by machine


To back-test we need to be able to create groups automatically. Fortunately there are numerous techniques for doing so. This is a nice notebook that shows one possible method using out of the box scipy functions (not mine!). We can set a cluster size threshold at 3, allowing us to use the 'candidate matching' technique (as this works for 3x3 correlations).


Risk targeting with groups


Let's return to a simple 3 asset portfolio:
  • S&P 500: standard deviation 16%
  • US 2 year bonds: standard deviation 2%
  • US 5 year bonds: standard deviation 4%

Correlations - between S&P 500 and each bond: 0%, between the bonds: 95%

The grouped weights for this lot come out at: 
  • 2 year bonds: 50% * 50% = 25%; cash weighting 57.1%
  • 5 year bonds: 50% * 50% = 25%; cash weighting 28.6%
  • S&P 500 equities: 50% * 100% = 50%; cash weighting 14.3%

The natural risk of this bad boy is a pathetic 3.21%. So we're back to the problem we had earlier - without access to leverage this is waaaaay too low for most people. However the nice closed form formula that we used earlier to solve this problem only works for 2 assets.

To deal with this we need to partition the portfolio into two parts: a high risk part, and a low risk part. The low risk part will contain all the assets with risk less than our risk target. The high risk component has all the assets with risk higher than our risk target. We then treat these as our two top level groups; and allocate within them as normal. Finally we use the original technique above, where our two 'assets' will be the two groups.

For this simple example (see here) let's suppose we're targeting 10% risk. The low risk group happens to contain all the bonds, whilst the high risk group contains all the equities. We do our allocation in the normal way, and end up with a bond portfolio which has a standard deviation of 2.63%, and an equity portfolio with risk of 16%. Then using the closed form formula we get a solution of 62.2% in equities and 37.8% in bonds (cash weightings).
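The whole calculation can be sketched in Python as follows. As before I'm assuming the closed form is the positive root of the two asset variance equation, with the equity group and the bond group as the two 'assets' and a correlation of zero between them. Function and variable names are mine.

```python
import math


def portfolio_risk(cash_weights, stdevs, corr):
    """Standard deviation of a portfolio given cash weights, risks and correlations."""
    n = len(cash_weights)
    variance = sum(
        cash_weights[i] * cash_weights[j] * stdevs[i] * stdevs[j] * corr[i][j]
        for i in range(n)
        for j in range(n)
    )
    return math.sqrt(variance)


def weight_for_risk_target(target, s1, s2, rho):
    """Cash weight on the riskier 'asset' 1 so that the portfolio hits the target risk."""
    a = s1 ** 2 + s2 ** 2 - 2 * rho * s1 * s2
    b = 2 * rho * s1 * s2 - 2 * s2 ** 2
    c = s2 ** 2 - target ** 2
    return (-b + math.sqrt(b ** 2 - 4 * a * c)) / (2 * a)


# Low risk group: equal volatility weights in the two bonds, converted to cash weights
bond_stdevs = [0.02, 0.04]  # US 2 year, US 5 year
raw = [0.5 / s for s in bond_stdevs]
bond_cash = [r / sum(raw) for r in raw]
bond_risk = portfolio_risk(bond_cash, bond_stdevs, [[1.0, 0.95], [0.95, 1.0]])
print(round(bond_risk, 4))  # 0.0263

# High risk group: just the S&P 500
equity_risk = 0.16

w_equity = weight_for_risk_target(0.10, equity_risk, bond_risk, rho=0.0)
final_cash = {
    "S&P 500": w_equity,
    "US 2 year": (1 - w_equity) * bond_cash[0],
    "US 5 year": (1 - w_equity) * bond_cash[1],
}
print({asset: round(w, 3) for asset, w in final_cash.items()})
# {'S&P 500': 0.622, 'US 2 year': 0.252, 'US 5 year': 0.126}
```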

Warning: if we have too few assets in the high or low risk group then we could end up with a portfolio which is unnecessarily concentrated. 


Other constraints


The last technique was effectively a use of constraints. It's trivial to impose either cash weighted or volatility weighted constraints in any part of a handcrafted portfolio.


Including Sharpe Ratios


We've dealt with volatility and correlations; however we're still assuming that Sharpe Ratios are identical across all assets. Sharpe Ratios are very hard to predict, and their parameter uncertainty is substantial. But we might have some useful information about Sharpe Ratios, some conditioning information. 

For long only asset allocations we might want to use some kind of trading signal; value, momentum or whatever. You might believe in CAPM and want to use the residual part of Beta that isn't covered by inverse volatility weights. You might not believe in CAPM and want to use a 'betting against Beta' thing. You might believe that you are the king of the discretionary stock pickers, and want to make your own forecasts.

For trading strategy allocations I personally prefer to keep potentially poor strategies in my back test, and then allow the back test to weed them out (otherwise my back test results will be inflated by the absence of any ideas I've thrown away). But to do this we need to include historic Sharpe Ratio as an input into the optimisation.

But to what degree should we alter weights according to Sharpe Ratios? Have a gander at this picture:


The slightly odd x-axis of the graph shows the relative Sharpe Ratio of an asset, versus the average Sharpe Ratio (SR) in its group. The y-axis shows the suggested weight of an asset conditional on that SR, versus the typical weight in its group.

The blue line is a stylised version of what the Markowitz optimiser tends to do; either allocate nothing or everything as the Sharpe changes just a little. 

A more sensible idea might be to adjust weights proportionally with the Kelly criterion as shown with the red line. Assuming the average SR of the group is 0.5, we give something double the normal weight if it has double the average SR, the same weight as the average if it has the same SR as normal, and a zero weight if it has a zero SR. 

The red line assumes we know exactly what the SR is. We don't. If we factor in a reasonable amount of uncertainty we end up with the yellow line. Even if the estimated SR is pretty shocking we'd still end up with some weight on the asset.

This is a nice heuristic, which does a pretty good job in the face of massive uncertainty.

To use it (as outlined in "Systematic Trading" and "Smart Portfolios") you begin with some asset weights (volatility normalised works better and is more intuitive). You then multiply these by the conditioned multiplier depending on the Sharpe Ratio. Some weights will go up, some down. You then normalise the weights. 
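The mechanics of that last step look like this in Python. Be warned that for illustration I've used the simple proportional multiplier (the red line: SR divided by the group average SR, floored at zero), not the uncertainty adjusted yellow line, whose values aren't reproduced here. In practice you'd swap in the flatter multiplier, which would leave some weight on the asset with zero SR. The function name and the example Sharpe Ratios are made up.

```python
def sharpe_adjusted_weights(vol_weights, sharpe_ratios):
    """Multiply weights by a Sharpe Ratio multiplier, then normalise.

    Assumption: multiplier = SR / average SR, floored at zero (the proportional
    rule). Also assumes the group's average SR is positive.
    """
    average_sr = sum(sharpe_ratios) / len(sharpe_ratios)
    multipliers = [max(sr / average_sr, 0.0) for sr in sharpe_ratios]
    adjusted = [w * m for w, m in zip(vol_weights, multipliers)]
    total = sum(adjusted)
    return [a / total for a in adjusted]


# Three assets with equal volatility weights; the group average SR is 0.5
weights = sharpe_adjusted_weights([1 / 3, 1 / 3, 1 / 3], [1.0, 0.5, 0.0])
print([round(w, 3) for w in weights])  # [0.667, 0.333, 0.0]
```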


What's next


In the next post I'll explain how you actually apply the handcrafting method, both manually, and with some python code.

5 comments:

  1. Hey Rob, long time reader of your posts, love almost every one of them! (especially as a quant trader that is still in the learning stage) I was a bit confused about the section regarding the candidate matching to correct for correlation uncertainty, specifically in how exactly you constructed the candidate matrices. Could you elaborate on constructing and matching the observed correlation matrix to the candidate matrix?

    Thanks in advance!

  2. when you say "If we put half our portfolio into AB and half into C, then we'll actually have over allocated to portfolio AB since it's riskier." I assume you mean "... over allocated to portfolio C since it's risker". If not, I am confused.

    Replies
    1. Yes my mistake, now fixed. Good spot, thanks.
