This Blog is Systematic: How much should we get paid for skew risk? Not as much as you think!

A bit of a theme in my posts a few years ago was my 'battle' with the 'classic' trend followers, which can perhaps be summarised as:

Me: Better Sharpe!

Them: Yeah, but Skew!!

My final post on the subject (when I realised it was a futile battle, as we were playing on different fields - me on the field of empirical evidence, them on .... a different field) was this one, in which the key takeaway was this:

The backtest evidence shows that you can achieve a higher maximum CAGR with vol targeting, because it has a large Sharpe Ratio advantage that is only partly offset by its small skew disadvantage. For lower levels of relative leverage, at more sensible risk targets, vol targeting still has a substantially higher CAGR. The slightly worse skew of vol targeting does not become problematic enough to overcome the SR advantage, except at extremely high levels of risk; well beyond what any sensible person would run.

And another more recent post was on Bitcoin, and why your allocation to it would depend on your appetite for skew.

With those in mind I recently came to the insight that I could use my framework of 'maximising expected geometric mean / final wealth at different quantile points of the expectation distribution given you can use leverage or not'* to give an intuitive answer to an intriguing question - probably one of the core questions in finance:

"What should the price of risk be?"

* or MEGMFWADQPOTED for short - looking actively for a better acronym - which I used in the Bitcoin post linked to above, but explain better in the first half of this post and also this one from a year ago

The whole academic risk factor literature assumes the price of risk, often without much reasoning. We can work out the size of the exposure, and the risk of the factor, but that doesn't really justify its price. After all, academics spent a long time justifying the equity risk premium.

I think it would be fun to think about the price of different kinds of risk. Given the background above, I thought only about skew (3rd moment) risk but I will also briefly discuss standard deviation (2nd moment) risk. Generally speaking the idea is to answer the question "What additional Sharpe Ratio should an investor require for each unit of additional risk in the form of X?" Whilst this has certainly been covered by academics at some length, I think the approach of expressing risk preference as optimising for different distributional points is novel, and it means pretty graphs.

I'm going to assume you're familiar with the idea of maximising geometric return / CAGR / log(final wealth) at some distributional point (the 50% median, or more conservative points like 10% or 25%), to find some optimal level of leverage. If not, enjoy reading the prior work.

The "price" of standard deviation risk - with and without leverage

To an investor who can use leverage, for Gaussian normal returns, this is trivial. We want the highest Sharpe Ratio asset, irrespective of what its standard deviation is. Therefore the 'price' of standard deviation is zero. We don't mind getting additional standard deviation risk as long as it doesn't affect our Sharpe Ratio - we don't need a higher SR to compensate. Indeed in practice, we might prefer higher standard deviations since they will require less leverage, which could be problematic if we are wrong about our SR estimates or assumptions about return distributions.
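
To see why the price is zero, here is a minimal sketch, assuming Gaussian returns and the usual approximation that geometric mean is roughly the arithmetic mean less half the variance. The function names are mine and purely illustrative. Two assets with the same SR of 0.5 but different standard deviations end up with the same maximum geometric mean once each is levered to its own Kelly optimum:

```python
def geometric_mean(mean, stdev, leverage=1.0):
    """Approximate geometric mean of Gaussian returns at a given leverage.

    Uses the approximation: geometric mean ~ mean - variance / 2.
    """
    levered_mean = leverage * mean
    levered_stdev = leverage * stdev
    return levered_mean - 0.5 * levered_stdev ** 2


def kelly_leverage(mean, stdev):
    """Leverage that maximises the approximate geometric mean."""
    return mean / stdev ** 2


def max_geometric_mean(mean, stdev):
    """Geometric mean when running at the Kelly optimal leverage."""
    return geometric_mean(mean, stdev, kelly_leverage(mean, stdev))


# Same Sharpe Ratio (0.5), different standard deviations
low_vol = max_geometric_mean(mean=0.05, stdev=0.10)
high_vol = max_geometric_mean(mean=0.10, stdev=0.20)

print(f"Kelly leverage, low vol asset:  {kelly_leverage(0.05, 0.10):.2f}")
print(f"Kelly leverage, high vol asset: {kelly_leverage(0.10, 0.20):.2f}")
print(f"Max geometric mean: {low_vol:.4f} vs {high_vol:.4f}")
```

Under this approximation the maximum geometric mean is just SR squared over two, so standard deviation drops out entirely; all it changes is how much leverage you need to get there.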

In classical Markowitz finance, to an investor who cannot use leverage, the price of standard deviation is negative. We will happily pay for higher risk in the form of a lower Sharpe Ratio. We want higher returns at all costs; that may come at the cost of higher standard deviation so we aren't fully compensated for the additional risk, but we don't care. This is the 'betting against beta' explanation from the classic Pedersen paper. Consider for example an investment with a mean of 5% and a standard deviation of 10% for a Sharpe Ratio of 0.5 (I set the risk free rate to zero without loss of generality). If the standard deviation doubles to 20%, but the mean only rises to 6%, well we'd happily take that higher mean. We'd even take it if the mean only increased by 0.00001%. That means the 'price' of higher standard deviation is not only negative, but a very big negative number.

But we are not maximising arithmetic mean. Instead we're maximising geometric mean, which is penalised by higher standard deviation. That means there will be some point at which the higher standard deviation penalty for greater mean is just too high. For the median point on the quantile distribution, which is a full Kelly investor, that will be once the standard deviation has gone above the Kelly optimal level. Until that point the price of risk will be negative; above it will turn positive.

Consider again an arbitrary investment with a mean of 5% and a standard deviation of 10%; SR = 0.5. If returns are Gaussian then the geometric mean will be 4.5%. The Kelly optimal risk is much higher at 50%, which means it's likely the local price of risk is still negative. So for example, if the standard deviation goes up to 20%, with the mean rising to say 6.5%, for a new (lower) SR of 0.325; we'd still end up with the same geometric mean of 4.5%. In this simple case the price of 10% units of risk is a SR penalty of 0.175; we are willing to pay 0.0175 units of SR for each 1% unit of standard deviation.

If however the standard deviation goes up another 10%, then the maximum SR penalty for equal geometric mean we would accept is 0.025 units (getting us to a SR of 0.3, or returns of 9% a year on 30% standard deviation, equating again to a geometric mean of 4.5%); and for any further increase in standard deviation we will have to be paid SR units. This is because the standard deviation is now 30% and the SR is 0.3; we are at the Kelly optimal point. We wouldn't want to take on any additional standard deviation risk unless it is at a higher SR, which will then push the Kelly optimal point upwards.

So we'd need to get paid SR units to push the standard deviation up to say 40%. With 40% standard deviation we'd only be interested in taking the additional risk if we could get a SR of 0.3125 to maintain the geometric mean at 4.5%. Something weird happens here however: since 40% is higher than the new Kelly optimal, we could actually get a higher geometric mean if we used less risk (basically by splitting our investment between cash and the new asset). To actually want to use that 40% of risk the SR would trivially have to be 0.40. For someone who is remaining fully invested the price of standard deviation risk once you hit the Kelly optimal is going to be 1:1 (1% of standard deviation risk requiring 0.01 of SR benefit).
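
The arithmetic in this unleveraged example can be checked with a few lines. This is just a sketch under the same Gaussian approximation (geometric mean is roughly mean less half the variance), with function names of my own choosing. It finds the Sharpe Ratio needed at each standard deviation to hold the geometric mean at 4.5%:

```python
def required_mean(target_geometric_mean, stdev):
    """Arithmetic mean needed to hit a target geometric mean.

    Inverts the approximation: geometric mean ~ mean - variance / 2.
    """
    return target_geometric_mean + 0.5 * stdev ** 2


def required_sharpe(target_geometric_mean, stdev):
    """Sharpe Ratio needed to hit a target geometric mean (risk free rate = 0)."""
    return required_mean(target_geometric_mean, stdev) / stdev


target = 0.045  # geometric mean of the original 5% mean, 10% stdev asset

for stdev in (0.10, 0.20, 0.30, 0.40):
    mean = required_mean(target, stdev)
    sharpe = required_sharpe(target, stdev)
    print(f"stdev {stdev:.0%}: required mean {mean:.2%}, required SR {sharpe:.4f}")
```

The required SR comes out at 0.5, 0.325, 0.3 and 0.3125 respectively. It bottoms out at 30% standard deviation, which is exactly where the SR equals the standard deviation: the Kelly optimal point. Below that we pay SR for risk, above it we need to be paid.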

That is all for a Kelly optimal investor, but how would using my probabilistic methodology with a lower quantile point than the median change this? Well clearly, that would penalise higher standard deviations more, lowering the level of standard deviation at which the price of risk flips from negative to positive.

Because the interaction of leverage and Kelly optimal is complex and will depend on exactly how close the initial asset is to the cutoff point, I'm not going to do more detailed analysis on this as it would be time consuming to write, and to read, and not add more intuition than the above. Suffice to say there is a reason why I usually assume we can get as much leverage as required!

The "price" of skew - with leverage

Now let's turn to skew (and let's also drop the annoying lack of leverage which makes our life so complicated). The question we now want to answer is "What is the price of skew: how many additional points of SR do we need to compensate us for a unit change in skew, assuming we can freely use leverage? And how does this change at different distributional points?". Returning to the debate that heads this post; is an extra 0.50 units of skew worth a 0.30 drop in SR when we go from continuous to 'classical' trend following? We know that would only be the case if we were allowed to use a lot of leverage; which implies we were unlikely to be anything but a full Kelly optimising median distributional point investor. But at what distributional point does that sort of tradeoff become worth it?

To answer this, I'm going to recycle some code from this post and adapt it. That code uses a brute force technique, mixing Gaussian returns to produce returns with different levels of skewness and fat tailedness, but with the same given Sharpe Ratio. We then bootstrap those returns at different leverage levels. That gives us a distribution of returns for each leverage level. We can then choose the optimal leverage that produces the maximum geometric return at a given distributional point (eg median for full Kelly, 10% to be conservative and so on). I then have an expected CAGR level at a given SR, for a given level of skew and fat tailedness. By modifying the SR, skew and fat tailedness I can see how the geometric return varies, and construct planes where the CAGR is constant. From that I can derive the price of skew (and fat tailedness, but I will look at that in a moment) in SR units at different distributional points. Phew!
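
For anyone who wants a feel for the mechanics, here is a heavily simplified sketch of that pipeline. To be clear, this is not the code from the post: the function names, the mixture parameters (a 2% chance of a 'jump' draw), the 256 day year, the 16% annual standard deviation and the tiny sample sizes are all assumptions I've picked so that it runs in a second or two using only the standard library.

```python
import math
import random
import statistics

DAYS_PER_YEAR = 256  # assumed number of business days in a year


def mixed_gaussian_returns(n_days, jump_prob, jump_mean, jump_stdev, rng):
    """Mostly N(0, 1) draws, with an occasional draw from a 'jump' Gaussian."""
    returns = []
    for _ in range(n_days):
        if rng.random() < jump_prob:
            returns.append(rng.gauss(jump_mean, jump_stdev))
        else:
            returns.append(rng.gauss(0.0, 1.0))
    return returns


def skewness(values):
    """Sample skew: the average cubed z-score."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    return sum(((v - mean) / stdev) ** 3 for v in values) / len(values)


def rescale_to_sharpe(raw, annual_sr, annual_stdev):
    """Shift and scale daily returns to a target SR; this leaves skew unchanged."""
    daily_stdev = annual_stdev / math.sqrt(DAYS_PER_YEAR)
    daily_mean = annual_sr * annual_stdev / DAYS_PER_YEAR
    mean = statistics.fmean(raw)
    stdev = statistics.pstdev(raw)
    return [daily_mean + daily_stdev * (x - mean) / stdev for x in raw]


def annualised_geometric_mean(returns, leverage):
    """CAGR of a levered daily return series; -100% if the account is wiped out."""
    try:
        log_growth = sum(math.log1p(leverage * r) for r in returns)
    except ValueError:  # a levered daily loss of 100% or more
        return -1.0
    return math.exp(log_growth / len(returns) * DAYS_PER_YEAR) - 1.0


def optimal_leverage(returns, quantile, leverages, n_boot, rng):
    """Leverage maximising CAGR at a given quantile of the bootstrap distribution."""
    n = len(returns)
    # Reuse the same bootstrap samples for every leverage level
    samples = [rng.choices(returns, k=n) for _ in range(n_boot)]
    best_leverage, best_value = None, -math.inf
    for leverage in leverages:
        outcomes = sorted(annualised_geometric_mean(s, leverage) for s in samples)
        value = outcomes[int(quantile * (n_boot - 1))]
        if value > best_value:
            best_leverage, best_value = leverage, value
    return best_leverage, best_value


# Negatively skewed returns: occasional large losses mixed into N(0, 1)
raw = mixed_gaussian_returns(5 * DAYS_PER_YEAR, 0.02, -4.0, 2.0, random.Random(42))
returns = rescale_to_sharpe(raw, annual_sr=1.0, annual_stdev=0.16)
leverages = [float(x) for x in range(1, 11)]

# Same seed for both calls, so both quantiles see the same bootstrap samples
kelly_lev, kelly_cagr = optimal_leverage(returns, 0.5, leverages, 100, random.Random(1))
cautious_lev, cautious_cagr = optimal_leverage(returns, 0.1, leverages, 100, random.Random(1))

print(f"skew of returns: {skewness(returns):.2f}")
print(f"median (Kelly) investor: leverage {kelly_lev}, CAGR {kelly_cagr:.1%}")
print(f"10% quantile investor:   leverage {cautious_lev}, CAGR {cautious_cagr:.1%}")
```

The real exercise repeats this over a grid of SR, skew and tail ratios with far more bootstrap draws, then looks for combinations that give the same CAGR. That grid is where the compute time goes.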

(Be prepared to set aside many hours of compute time for this exercise if you want to replicate...)

The "price" of skew: Kelly investor

Let's begin by looking at the results for the Kelly maximiser who focuses on the median point of the distribution when calculating their optimal leverage.

The plots show 'indifference curves' at which the geometric mean is approximately equal. Each coloured line is for a different level of geometric mean. The plots are 'cross plots' that show statistical significance and the median of a cloud of points, as due to the brute force approach there is a cloud of points underneath.

Even then, there is still some non monotonic behaviour. But hopefully the broad message is clear; for this sort of person skew is not worth paying much for! At most we might be willing to give up 4 SR basis points to go from a skew of -3 to +3, which is a pretty massive range.

The "price" of skew: very conservative investor

Now let's consider someone who is working at the 10% quantile point.

If anything these curves are slightly flatter; at most the price of skew might be a couple of basis points. The intuition for this is that these people are working at much lower levels of leverage. They are much less likely to see a penalty from high negative skew, or much of a benefit from a high positive skew.

The "price" of lower tail risk: Kelly investor

Now let's consider the lower tail risk. Remember, a ratio of 1 means we have a Gaussian distribution, and a value above 1 means the left tail is fatter.

This may seem surprising; with a more extreme left tail it looks like you can have a higher SR. But the improvement is modest again, perhaps 5bp of SR at most.

The "price" of lower tail risk: 10% quantile investor

Once again, investors at a lower point on the quantile spectrum are less affected by changes in tail risk, requiring perhaps 3bp of SR in compensation.

How does the optimal leverage / skew relationship change at different percentiles?

As we have the data we can update the plots done earlier and consider how optimal leverage changes with skew. First for the Kelly investor:

Here each coloured line is for a different SR. We can see that for the lowest SR the optimal leverage goes from around 2.7 to 3.7 between the largest negative and positive skews; and for the highest from around 4.2 to 5.6. This is the same result as the last post: leverage can be higher if skew is positive, but not that much higher (from skew of -2 to +2 we can leverage up by around a third).

Here is the 10% investor:

The optimal leverage is lower as you would expect, since we are scaredy cats. It looks like the leverage range is higher though; for the highest SR strategies we go from around 1.7 to 2.8; a two thirds increase. And for the lower SR the rise in optimal leverage is even more dramatic.

One final cut of the data cake

Finally another way to slice the cake is to draw different coloured lines for each level of skew and then see how the geometric mean varies as we change Sharpe Ratio. First the Kelly guy:

This is really reinforcing the point that skew is second order compared to Sharpe Ratio. Each of the bunches of coloured lines is very close to each other. At the very lowest SR at around 0.52 we only get a modest improvement in CAGR going from skew of -2.4 (purple) to +2.4 (red). We get a bigger improvement in CAGR when we add around 3bp of SR and move along the x-axis. Hence 5 units of skew are worth less than 3bp in SR. It's only at relatively high levels of SR that skew becomes more valuable; perhaps 5bp of SR for each 5 units of skew.

Here is the 10% person:

As we noted before there is almost no benefit from skew for the conservative investor (coloured lines close together at each SR point), until SR ramps up. At the end 5 units of skew are worth the same as around 6bp of SR.

Conclusion: Skew isn't as valuable as you might think

I started this post harking back to this question: is an extra 0.50 units of skew from 'traditional' trend following worth a 0.30 drop in SR? And the answer is, almost certainly not. The best price we get for skew is around 6bp for 5 units of skew. At that price, 0.5 units of skew should cost us less than 1bp in SR penalty. We're being charged about 50 times the correct price!!!
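
Spelling out that arithmetic, on the assumption (implied by the figures above, rather than stated anywhere) that 1bp of SR means 0.01 SR units:

```python
SR_UNITS_PER_BP = 0.01  # assumption: 1bp of SR = 0.01 SR units

best_price_bp = 6.0     # roughly 6bp of SR ...
skew_units_bought = 5.0  # ... for 5 units of skew

price_per_skew_unit_bp = best_price_bp / skew_units_bought

# What 0.5 units of skew ought to cost at that price
skew_on_offer = 0.5
fair_cost_bp = price_per_skew_unit_bp * skew_on_offer

# What the switch to 'classical' trend following actually charges
sr_drop = 0.30
charged_bp = sr_drop / SR_UNITS_PER_BP

overcharge_multiple = charged_bp / fair_cost_bp

print(f"fair cost of {skew_on_offer} units of skew: {fair_cost_bp:.1f}bp of SR")
print(f"actual cost: {charged_bp:.0f}bp of SR")
print(f"overcharge: about {overcharge_multiple:.0f}x")
```

So the fair cost is 0.6bp against an actual charge of 30bp, which is where the factor of 50 comes from.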

And this is for Kelly investors. For those with a lower risk tolerance, much of the time there is basically no significant benefit from skew.

That doesn't mean that you shouldn't know what your skew is, as it will affect your optimal leverage, particularly as we saw above if you are a conservative utility person (being such a person will also protect you if you think your skew or Sharpe ratio is better than it actually is, and that's no bad thing). And negatively skewed strategies à la LTCM with very low natural vol that have to be run at insane leverage will always be dangerous, particularly if you don't realise they are negatively skewed.

But part of the problem with the original debate is a false argument by taking a true statement 'highly negatively skewed strategies are very dangerous with leverage' and extending it to 'you should be happy to suffer significantly lower Sharpe Ratio to get a marginally more positive skew' (which I have demonstrated is false).

Anyway outside of that argument I think I have shown that to an extent the obsession with getting positive skew is a bit of an unhealthy one. Sure, get it if it's free, but don't pay much for it otherwise.

Thursday, 6 February 2025
