Wednesday 5 December 2018

Portfolio construction through handcrafting: motivating

I've talked around a type of portfolio construction called "Handcrafting" for some time now, in both of my first two books, and in the odd blog post. I thought it would be useful to explain how the technique works in a more thorough and complete series of blog posts, and also share some code that implements the method.

This is the first of six posts on the handcrafting method.
  1. The first post motivates the need for a method like this.
  2. In the second post I build up the various components of the method, and discuss why they are needed. 
  3. In the third post, I'll explain how you'd actually apply the method step by step, with code. 
  4. Post four will test the method with real data.
  5. Bonus: post five explains how to account for uncertainty in asset Sharpe Ratios (effectively an appendix to post two)
  6. Bonus: post six explains how I come up with subportfolio weights that account for correlation uncertainty (effectively an appendix to post two)
In the rest of this post I'll wax lyrical and philosophical about how I think we should do portfolio construction, and why.


Portfolio optimisers: their use and misuse in the financial industry


The use of portfolio optimisers by human beings working in the financial industry tends to follow a well-worn path. In stage one the human is ignorant and/or dismissive of optimisers, preferring to use gut feel – a purely subjective method for choosing portfolio weights. All the human baggage of cognitive biases will be allowed freely into the portfolio.

Stage two sees the optimiser used unthinkingly and its outputs taken as gospel, as if printed on immovable stone tablets. The tendency of the optimiser to produce extreme weights is ignored or forgiven until it leads to significant out of sample underperformance. At that point external pressure from investors, applied via senior management, usually leads to extensive manual overrides of what is supposed to be a purely systematic portfolio weighting process.

After some bad experiences we arrive at stage three, where “ugly hacks” are used to avoid extreme or otherwise unsatisfactory weights. These hacks include applying constraints and modifying the parameter estimates used by the optimiser until they give the “right” result.

The term “ugly hacks” is only used by those who have progressed beyond stage three; whilst still in contemporary use they go by alternative monikers: “tweaks”, “adjustments” and “robustification” (a word my spell checker has rightly flagged as non-existent). Regardless of their name the effect is the same: we end up with portfolio weights effectively determined by human gut feel, with the optimisation tortured until it gives the “right” answer.

In stage four we are still trying to force the optimiser to produce satisfactory weights, but a more sophisticated armoury is employed, including (but not limited to): Bayesian techniques including Black-Litterman, parametric or non-parametric bootstrapping of weights or the efficient frontier, inverse volatility weighting, risk parity, clustering, partial correlations, the use of higher moments, alternative utility functions, neural networks and machine learning. With the right set of data the appropriate technique will provide portfolio weights which can be justified theoretically and are pleasing to investors, with the added bonus that you can blind any doubter with large dollops of science.

Although these stages are well defined, it would be misleading to assume there is a straightforward linear progression. Most retail investors will remain at stage one forever. Keeping them company are a few dinosaur era fund managers from the pre-computer age. Stage one may be bypassed entirely by quantitatively trained newcomers arriving directly from non-financial industries, or with recently minted advanced degrees that are long on numeric methods but short on common sense application. They may also skip over stage three, hurrying towards the seductive sophistication of stage four.

The gradual journey from stages one to four can also be seen in the evolution of individual firms, as well as in the entire finance industry. A neophyte employee arriving in a firm which is already at stage four will find their own development accelerated. This is not always a good thing. Serving a long apprenticeship in stage two can give you a healthy scepticism of the portfolio optimisation process and an intuitive understanding of its shortcomings. Without this the use of advanced stage four technology may be as dangerous as teaching a novice how to drive in a 250mph supercar.


Stage one: Human-defined portfolios ("gut feel")


Portfolios created by humans have the advantage of being readily accepted by the humans that built them. Though they may lack the theoretical underpinnings of portfolios created at later stages, they also lack the extreme weights that naive mean variance optimisation often delivers. However, humans have a poor track record when it comes to making financial decisions using subjective judgement, unaided by computing technology or economic theory.

Classical models of financial markets assume that human beings are all knowing, hyper-rational beings who make their decisions within a utility maximising framework. In reality there is considerable evidence that this model of human behaviour is completely and utterly wrong. The field of behavioural economics seeks to explain the numerous instances of apparently irrational behaviour that we see in financial markets.

I'd be loopy to try and summarise that literature here, so from a survey paper here are the main characteristics of investor behaviour which relate directly to the problem of portfolio optimisation:

  • Insufficient diversification
  • Naive diversification*
  • Excessive trading
  • A reluctance to sell assets at a loss
  • The purchase of ‘attention getting’ stocks

* “Naive diversification” is worth explaining, as it also directly relates to the concept of handcrafting. Benartzi and Thaler  find that when people do diversify, they do so in a naive fashion. They provide evidence that in 401(k) plans, many people seem to use strategies as simple as allocating 1/n of their savings to each of the n available investment options, whatever those options are. 

Humans will not automatically choose to hold or rebalance portfolios in a theoretically optimal fashion, and instead will buy portfolios that aren’t sufficiently diversified and are concentrated in ‘story’ stocks, which they will then over-trade. Therefore we need to use formal portfolio optimisers to protect ourselves from our own cognitive flaws.

Stage two: The naive use and misuse of portfolio optimisers


A systematic method for portfolio optimisation is a necessary condition for serious research into portfolio weighting. Unlike subjective methods it can be automated and properly backtested. The method can be run at set intervals over historical data, with each iteration looking backwards so it is not polluted with future information. This isn’t possible with subjective methods. Even if we could persuade a human to repeatedly optimise portfolio weights it would be difficult to erase the knowledge of future events from their minds. The temptation to reduce the allocation to stocks in late 1999 and 2007 would be too hard to resist.

As we know, the problem of finding optimal portfolio weights was famously “solved” by Harry Markowitz in 1952. Thousands of trees have already died explaining the mean variance optimisation technique that Markowitz expounded, so I do not feel the need to kill any more.

For our purposes the important features of mean variance optimisation are that:
  • We require parameter estimates for the mean, standard deviation, and correlation of the relevant assets
  • Small differences in these estimates can result in highly unstable, extreme weights.

In practice it is very difficult to forecast the required parameters, or even to know what they should have been in the past. I've discussed this extensively in the past (most recently with this talk). 

We cannot know the optimal portfolio weights with any certainty. More importantly unsatisfactory weights, such as 0% and 100% in a two asset problem, are common. These weights are intuitively unattractive to humans, and they are also more likely to perform badly in out of sample testing.
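
To see just how fragile this is, here's a minimal sketch in Python (not production code; the numbers are made up, and I'm simply asking scipy to maximise the Sharpe Ratio of a long only, fully invested, two asset portfolio):

```python
import numpy as np
from scipy.optimize import minimize

def max_sharpe_weights(means, stdev, corr):
    # Long only, fully invested mean variance optimisation (maximise Sharpe Ratio)
    sigma = np.diag(stdev) @ corr @ np.diag(stdev)
    n = len(means)

    def neg_sharpe(weights):
        return -(weights @ means) / np.sqrt(weights @ sigma @ weights)

    return minimize(
        neg_sharpe,
        np.full(n, 1.0 / n),
        bounds=[(0.0, 1.0)] * n,
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
    ).x

corr = np.array([[1.0, 0.9], [0.9, 1.0]])  # two highly correlated assets
stdev = np.array([0.10, 0.10])             # identical volatilities

print(max_sharpe_weights(np.array([0.05, 0.05]), stdev, corr))  # roughly [0.5, 0.5]
print(max_sharpe_weights(np.array([0.05, 0.06]), stdev, corr))  # roughly [0.0, 1.0]
```

Bumping the second asset's expected return by a single percentage point – a difference we could never estimate with any statistical confidence – moves the weights from 50:50 to (nearly) all in on the second asset.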



Stage three: Why ugly hacks are bad


Even under highly unrealistic laboratory conditions mean variance optimisation fails to deliver robust portfolio weights that will be acceptable to humans. The first instinct of many people is to bludgeon the weights until they look intuitively “right”.

The simplest hack is to introduce portfolio constraints. Don’t like to see a zero allocation to bonds? Then set a minimum weight of 10% or 20%. Uncomfortable with a 100% allocation to stocks? Then set a maximum weight of perhaps 90% or 80%. We can even introduce a quantitative method to determine constraints: for N assets, set the minimum weight at x(1/N) and the maximum at y(1/N), where x<1 and y>1.
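
In code the quantitative constraint rule above is nothing more than a change to the bounds passed to the optimiser. A minimal sketch (again not production code; the values of x and y are arbitrary, and the covariance matrix is made up):

```python
import numpy as np
from scipy.optimize import minimize

def constrained_max_sharpe(means, sigma, x=0.5, y=2.0):
    # Each weight must lie between x*(1/N) and y*(1/N), with x<1 and y>1
    n = len(means)
    bounds = [(x / n, min(1.0, y / n))] * n

    def neg_sharpe(weights):
        return -(weights @ means) / np.sqrt(weights @ sigma @ weights)

    return minimize(
        neg_sharpe,
        np.full(n, 1.0 / n),
        bounds=bounds,
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
    ).x

sigma = np.array([[0.01, 0.009], [0.009, 0.01]])
print(constrained_max_sharpe(np.array([0.05, 0.06]), sigma))  # roughly [0.25, 0.75]
```

With x=0.5 and y=2 in a two asset problem no weight can fall below 25%, so the corner solution from before becomes [0.25, 0.75] – which looks nicer, but only because we forced it to.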

A more sophisticated technique is to adjust the input parameters, tweaking the mean vector and covariance matrix until you get the result you like. Some skill and intuition is required here. Indeed, trying these adjustments is an excellent way to gain a deep understanding of how the optimiser behaves.

Ugly hacks have some poor attributes. On a technical level, introducing constraints usually makes the optimisation less stable. Most optimisers use penalty functions to apply constraints, which results in highly non-linear gradient functions. Simple grid search optimisers are not subject to this problem, but they are also extremely slow and impractical for use with more than a few assets. Adjusting the inputs does not affect the optimisation function, and so is a more stable approach.

But these technical problems are not the elephant in this particular room. Both forms of hack have one significant failing: they sneak in human subjective judgement, with all its potential failings. Even if the optimisation is done as a rolling series of out of sample optimisations, the constraints are normally set for the whole period.


Stage four: Why complex optimisation techniques are not the answer


Any self respecting financial quant will feel deep unease when they use stage three techniques. Firstly, to anyone who spends a few moments in introspection it is obvious that we are “cheating”. Secondly, these hacks are too trivial. It would make more sense to protect our future employment prospects by introducing a more complex method.

A small industry has grown up over the last few decades, devoted to fixing the basic mean variance optimisation in various ways. This is not a survey paper on the subject, but here are some of the methods that have been used. This first group still uses all the inputs but tries to modify them in some way:
  • Bayesian (Bayes-Stein): adjust one or more of the inputs to reflect the uncertainty of information, by finding the weighted average of a prior parameter value and the estimated parameter. 
  • Bootstrapping (non-parametric): repeatedly resample the portfolio history and find the optimal set of weights for each sample. Then take an average of the weights across samples (there's a sketch of this just after the list).
  • Bootstrapping (parametric): estimate a distribution from the data history, resample the distribution, find the optimal weights for each sample, and take an average of the weights.
  • Bootstrapping (Michaud parametric): estimate a distribution from the data history, resample the distribution, find the efficient frontier for each sample, take an average of the efficient frontiers across samples, and then find the optimal point on the averaged frontier.
  • Partial correlations. Change the correlations so that they are partial correlations. This tends to shrink correlations away from +1 and -1, and so produces a more robust result.
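To make the first group more concrete, here's a minimal sketch of non-parametric bootstrapping (my own toy illustration, not the code I actually use; the max Sharpe Ratio optimiser and the fake return data are purely for demonstration):

```python
import numpy as np
from scipy.optimize import minimize

def optimal_weights(returns):
    # Max Sharpe Ratio, long only, fully invested; returns is a T x N matrix
    means = returns.mean(axis=0)
    sigma = np.cov(returns, rowvar=False)
    n = returns.shape[1]

    def neg_sharpe(weights):
        return -(weights @ means) / np.sqrt(weights @ sigma @ weights)

    return minimize(
        neg_sharpe,
        np.full(n, 1.0 / n),
        bounds=[(0.0, 1.0)] * n,
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
    ).x

def bootstrapped_weights(returns, n_samples=100, seed=None):
    # Resample rows (time periods) with replacement, optimise each sample,
    # then average the weights across samples
    rng = np.random.default_rng(seed)
    t = returns.shape[0]
    samples = [optimal_weights(returns[rng.integers(0, t, size=t)])
               for _ in range(n_samples)]
    return np.mean(samples, axis=0)

# Made up daily returns for three assets:
fake_returns = np.random.default_rng(0).normal(0.0005, 0.01, size=(500, 3))
print(bootstrapped_weights(fake_returns, n_samples=50, seed=1))
```

Because each resample sees a slightly different history, the averaged weights are pulled away from the extreme corners that any single optimisation would produce.
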
This second group effectively ignores one or more of the inputs from the optimisation (and I'd highly recommend this paper to help you think about how they fit together - particularly figure 2).
  • Override one or more of the estimated inputs, for example by equalising all portfolio means or Sharpe Ratios. This overlaps with many of the following techniques (and it's also effectively Bayesian with full shrinkage)
  • Equal weighting: Equally weight all instruments in a portfolio. 
  • Risk parity: Equally weight each asset class by its volatility contribution to the portfolio. This method is silent on how allocations should be achieved within asset classes. This portfolio may undershoot many investors' risk appetite, so it's often used with leverage.
  • Inverse volatility weighting: Set weights of all assets equal to the inverse of their relative volatility (notice that this is not the same as risk parity except for certain special cases, and it ignores correlations). This method needs to be combined with other methods if means or correlations are to be used. This portfolio may undershoot many investors' risk appetite (there's a sketch of this just after the list).
  • Minimum variance: Mean variance using only the covariance matrix (correlations and volatility), and with an objective function of minimum risk. This portfolio may undershoot many investors' risk appetite.
  • Maximum diversification: Does what it says on the tin. This portfolio may undershoot many investors' risk appetite.
  • HRP (hierarchical risk parity): As already discussed this is a hierarchical equivalent of risk parity, where we first group assets then assign risk parity weights within and across groups.
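Inverse volatility weighting, by contrast, is simple enough to fit in a couple of lines (a sketch with made up volatilities):

```python
import numpy as np

def inverse_vol_weights(stdev):
    # Weights proportional to 1/volatility; means and correlations are ignored
    inv_vol = 1.0 / np.asarray(stdev)
    return inv_vol / inv_vol.sum()

# A bond at 5% vol gets twice the weight of each equity at 10% vol:
print(inverse_vol_weights([0.05, 0.10, 0.10]))  # [0.5, 0.25, 0.25]
```
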
Notice a general theme here: once we step out of the cosy world of optimising a fairly homogeneous group of assets like the S&P 500 into something like an asset allocation problem, it becomes more likely that portfolios which don't maximise risk versus return will end up seriously undershooting investors' target risk. I'll discuss how this is solved in part two of this series.

There is a third group of ideas which involves maximising something else (Sharpe Ratio, geometric return, higher moments of the utility function); but these really lie outside the scope of what we're thinking about here.

Complex optimisation techniques are very appealing to quantitative portfolio analysts and their managers. Humans tend to assume that if something is sufficiently complicated it must be correct; and the more complicated it is, the better. All of the techniques above are more robust than stage two mean variance optimisation and less likely to produce extreme weights.

However, not all of the techniques shown correct properly for parameter uncertainty. Used correctly, the following methods are best at dealing with parameter uncertainty: overriding inputs, Bayesian optimisation and bootstrapping. Risk parity, minimum variance and inverse volatility weighting avoid using expected returns (which are the largest source of uncertainty), but they do not deal with the uncertainty of correlation or standard deviation estimates.

But these methods also aren’t perfect. Non-parametric bootstrapping does not always deliver robust weights with limited amounts of data, though it has the benefit of requiring no additional parameters. In contrast parametric bootstrapping requires you to select an appropriate distribution and estimate the parameters for it. A joint Gaussian distribution is insufficient for most financial data, and trying to estimate higher moments like (co-)skewness and (co-)kurtosis is tricky; they are easily influenced by one or two outliers, especially if data is limited. Dealing with autocorrelation in return series also makes bootstrapping harder.

Bayesian optimisation requires you to come up with appropriate priors that contain no forward looking information, and a shrinkage parameter that is appropriate to the length and noise level in the data. These additional tuneable parameters transform the problem from one of portfolio optimisation to methodology optimisation.
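
The basic shrinkage operation itself is trivial – just a weighted average, as in the sketch below (made up numbers, shrinking estimated means towards a prior in which all means are equal). The hard part is everything around it: choosing the prior and the shrinkage parameter without peeking at the future.

```python
import numpy as np

def shrink(estimate, prior, shrinkage):
    # Weighted average of a prior and an estimate;
    # shrinkage=0 trusts the data completely, shrinkage=1 ignores it
    return shrinkage * np.asarray(prior) + (1 - shrinkage) * np.asarray(estimate)

estimated_means = np.array([0.05, 0.10])
prior_means = np.full(2, 0.075)   # e.g. a prior that all means are equal
print(shrink(estimated_means, prior_means, shrinkage=0.5))  # [0.0625, 0.0875]
```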

Often a series of experiments is necessary to determine which combination of technique and tuneable methodology parameter gives the best result. It’s possible in theory to calibrate a shrinkage parameter on a rolling basis using only backward looking data but in practice methodology optimisation is rarely done automatically using a pure out of sample approach.

A number of candidate portfolio optimisations may each be run properly, on a pure out of sample basis; but the best method is then usually selected by the researcher based on performance over the entire data-set. Implicit in-sample fitting has just crept into our scientifically rigorous process via the back door.

The other significant problem is that the more complicated the process, the less intuitive and transparent it is to humans. Humans are less likely to accept weights that are not intuitive and transparent, which leads to the problems I outlined earlier. With experience the process of naive stage two mean variance optimisation is reasonably intuitive. Almost all of the techniques above are more complicated and therefore less intuitive than naive mean variance.

But there are degrees of difficulty here. After extensive practice Bayesian methods can seem reasonably intuitive (they are to me at least!). Risk parity, inverse volatility and clustering make logical sense, although all of them have to be combined with other techniques to produce reasonable results. For this reason I will be using inverse volatility and clustering as part of the handcrafting method. In contrast it is not obvious to humans how the precise weights out of a bootstrapping process have been derived, even if the process is simple to describe.

I haven’t discussed any of the highly sophisticated techniques that are currently in vogue amongst practitioners: Artificial intelligence, machine learning or neural networks. I do not feel qualified to comment on their efficacy or robustness. But it is clear that they are far worse than any of the methods mentioned above when it comes to opaqueness.


Some desirable attributes of portfolio construction


A portfolio (or any systematic methodology) is no good if a human being isn't happy with it. At the first sign of any loss a human will dump a portfolio that they never liked to begin with, and start meddling. This completely defeats the point of making trading and investment decisions in a systematic way.

Hence, we can define some idealised characteristics for portfolio optimisation methodologies.

Firstly, a good method for portfolio optimisation is one that will be accepted and understood by human beings. This is important because it means humans will not be tempted to override the portfolio weights if they find them surprising, or if the resulting portfolio underperforms. Specifically we require that:

  1. The portfolio weights are acceptable to human beings. Human beings like weights that make sense, and they also dislike both extreme weights and weights that move around a lot (the latter also lead to increased trading costs). 
  2. The process by which the weights are derived is transparent to human beings
  3. The process is intuitively obvious
Importantly it is not sufficient that the person performing the optimisation finds the weights acceptable, and the process transparent and intuitive. Any stakeholder with the power to force the weights to be changed must also be satisfied, including investors and senior fund managers.

An ideal portfolio should also satisfy certain theoretical conditions:
  4. The portfolio weights are free of cognitive biases
  5. They should satisfy generally accepted principles of financial theory, for example: diversified portfolios are better
  6. The weights should be robust to problems such as parameter uncertainty and optimisation instability.
(The last of these points might not be interesting to most humans, but it's necessary for me personally!)

Finally there are advantages to running a portfolio process which is systematic: it can be back-tested against historic data, avoiding in-sample overfitting; it is repeatable and transferable; it can be automated, hence reducing costs; it can be applied to large portfolios; and its likely behaviour can be calibrated and understood. This gives our final conditions:


  7. The portfolio process can be systematically applied
  8. The process does not use in-sample data for both fitting and testing portfolio weights


Finally, it's clearly the case that any methodology, with its performance properly measured (out of sample or in live trading), should produce returns which are not statistically inferior to those of competing methodologies. But I'll address this in the final two posts of this series.


Which conditions are met at each stage?


For each stage, advantages are listed first, followed by disadvantages.

Stage one: gut feel

1: Weights are acceptable to human beings.
2 & 3: Process is intuitive and transparent – if you set the weights
6: Instability is not usually a problem
2 & 3: Process is not intuitive or transparent – if they are someone else's weights
4: Cognitive biases are present
5: Financial theory is often ignored
6: Parameter uncertainty is rarely taken into account
7 & 8: Cannot be back-tested – always in sample.


Stage two: Naive optimisation


2: The process is transparent
3: If you are experienced, reasonably intuitive
4 & 5: Weights are free of cognitive biases, and conform to financial theory
7 & 8: Easily backtested 
1: Weights are unacceptable to human beings
3: The process may not be intuitive for non specialists.
6: The process isn’t robust and will perform badly out of sample


Stage three: ugly hacking

1: With enough hacking we can usually find acceptable weights.
2: The process is simple enough to be intuitive to some.
6: Hacking parameter estimates usually makes the process more robust (unknowingly). 
2: The process isn’t very transparent
3: The process may not be intuitive.
4 & 5: Cognitive biases and financial heresy can sneak in via the back door
6: Applying constraints usually makes the process less robust
7 & 8: It is difficult in practice to run this as a systematic back test; effectively in sample optimisation is sneaked in via the back door



Stage four: advanced technology


1: With the right technique we can usually find acceptable weights
4 & 5: Weights are free of cognitive biases and standard financial theory is respected
6: Many techniques respect parameter uncertainty
7: The process can be systematically applied
8: When used correctly in sample fitting is avoided.
2: The process is very unlikely to be intuitive
3: To non-experts the process is not transparent. In some cases (eg neural networks) the process is opaque for everyone.
6: Certain techniques do not account for parameter uncertainty
7 & 8: “Methodology shopping”* - searching for a precise technique which produces the best outcome – can lead to implicit in sample fitting.

(* A simple example of “methodology shopping” is running a series of Bayesian optimisations in which the shrinkage parameter is allowed to vary, and then selecting the best shrinkage parameter based on out of sample portfolio performance – which means in practice that the optimal shrinkage has been selected using all the available data, effectively fitting in sample.)


Very brief intro to handcrafting


I'll explain handcrafting in inordinate detail in the second post of this series, but for new readers the basic idea is that you split your portfolio into groups in a hierarchical fashion, and then apply equal risk weighting within each group.

(There are also quite a few twists on the basic idea of equal risk weighting, which I'll explain further in the next post)
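
To give a flavour of the general idea (and only a flavour: this heavily stripped down sketch ignores correlations within groups and all of the twists just mentioned, so it is not the actual method): equal weight across groups, and inverse volatility – a crude form of equal risk weighting – within each group.

```python
import numpy as np

def toy_handcrafted_weights(groups, stdev):
    # Heavily simplified sketch: equal weight across groups, then
    # inverse volatility (a crude form of equal risk) within each group.
    # groups: dict of group name -> list of asset names
    # stdev:  dict of asset name -> annualised standard deviation
    weights = {}
    group_weight = 1.0 / len(groups)
    for assets in groups.values():
        inv_vol = np.array([1.0 / stdev[asset] for asset in assets])
        within_group = inv_vol / inv_vol.sum()
        for asset, w in zip(assets, within_group):
            weights[asset] = group_weight * w
    return weights

# Made up example: two asset classes
groups = {"bonds": ["US10", "Bund"], "equities": ["SP500", "DAX", "FTSE"]}
stdev = {"US10": 0.08, "Bund": 0.06, "SP500": 0.16, "DAX": 0.20, "FTSE": 0.16}
print(toy_handcrafted_weights(groups, stdev))
```

The point is that every number in the output can be reproduced with a pencil and the back of a napkin, which is where the "hand" in handcrafting comes from.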

This may sound quite a lot like HRP, but in fairness I did come up with the method independently* and there are some important differences. In particular it's possible to implement a "handcrafted" portfolio entirely by hand without any computing power, hence the name, and this is the core idea at the heart of handcrafting.

I first became aware of HRP in early 2017, but I've been using handcrafting in one form or another for several years and my book containing the concept was written in 2014 and 2015, and then published in October 2015. Having said that the core idea of handcrafting is that it reflects the way humans naturally like to do portfolio construction, so I'm not sure anyone can really claim ownership of the concept. I certainly won't be patenting it!

Handcrafting is essentially a heuristic method rather than an optimisation, although it does use the same inputs as any optimisation would: the expected distribution of asset returns, plus any relevant constraints.


So... handcrafting


To finish this post let me explain why I think handcrafting, unlike the other methods we've discussed, meets all the criteria I've outlined above. Before that let me just explain that there are effectively two types of handcrafting:

  • A 'humane' method that human beings can use to produce 'one off' intuitive portfolios without requiring any advanced technology
  • An automated method that can be used to back test the methodology over repeated time periods
The main difference between these methods is in the choice of groupings of assets. With the humane method this is done subjectively; with the automated method it's done objectively. A human could apply the automated method of course, but it would be extremely painful. Conversely, you probably couldn't automate the discretionary opinions of human beings. Anyway, on to the characteristics that we require:

  1. The portfolio weights are acceptable to human beings: Weights won't be extreme, and, except when assets move in or out of the portfolio, they are mostly pretty stable.
  2. The process by which the weights are derived is transparent to human beings: a human being can follow the handcrafting process because they can actually do it themselves (in its humane form) on the back of a beer mat or napkin. This is the key strength of handcrafting compared to stage 2, 3 and 4 portfolio optimisation.
  3. The process is intuitively obvious: although there are a lot of steps to the full handcrafting process (as we'll discover in the next post), each step can clearly be understood and explained.
  4. The portfolio weights are free of cognitive biases: True of the automated method, but might not be true for the humane method.
  5. They should satisfy generally accepted principles of financial theory, for example: diversified portfolios are better. Yes: the core of handcrafting is to create the portfolio with the most diversification. The heuristics that I'll use in handcrafting are all supported by solid theoretical and empirical foundations.
  6. The weights should be robust to problems such as parameter uncertainty and optimisation instability: As will become clear in the next post, this is definitely true of handcrafting.
  7. The portfolio process can be systematically applied: Entirely true for the automated method.
  8. The process does not use in-sample data for both fitting and testing portfolio weights: Entirely true for the automated method, may not be completely true for the humane method.



What's next


In the second post I'll explain the handcrafted method in more detail, and share some code with you.

