This Blog is Systematic: How to write a tweet that gets over 300k views; and why diversification is probably good

Well that blew up:

https://x.com/investingidiocy/status/2032438612961165409

At eight words this is almost certainly* my most viewed and liked tweet ever (although I have nearly 25k followers, so that's 18,000 or so that didn't like it). Short, pithy, funny; I should retire from my Xmaxxing game right now (just kidding; there are still plenty of gamblers, crypto nuts and MAGA idiots waiting patiently to be educated; and I have a million more dad jokes to inflict on the unsuspecting public).

* X doesn't provide statistics to non premium users, but I am pretty sure it's up there

Obviously the point I was making was about diversification. Buying 50 stocks is better than buying 2 or 3 because of said diversification, unless certain - quite challenging - conditions are met (I'll be spending much of this post quantifying exactly what those conditions are).

Now it was very interesting to read the negative replies (the positive ones were just lovely, thanks). They generally fell into variations of the following:

"Something something asymmetry" (which a few kind people explained to me; obviously as a trend follower, ex-professional options trader and occasionally lottery ticket buyer I didn't know there could be such a thing as an asymmetric bet)
"You just have to choose the right three stocks". A couple of people were even kind enough to share the list of stocks that will definitely go up in the future.
"If you are so smart, why aren't you as rich as Druck?"
What the tweet means, but didn't actually say, is: "Yes, you should buy 50 stocks and ensure 2 or 3 of them are home runs"
Again what the tweet meant, but didn't say, is: "you should use a Taleb barbell type strategy and build a portfolio mostly consisting of boring stuff; plus a few carefully selected lottery tickets"
Also what Druck apparently does, but again wasn't in the tweet, is use stops which means it's fine and dandy to have concentrated positions.
"Risk and reward you idiot". Thank you for kindly explaining the basic premise of modern portfolio theory (MPT) in such a succint way, I weirdly hadn't come across this idea before in over a quarter of a century of studying and researching portfolio optimisation.
"Diversification might make sense once you are rich, but not if you are trying to build wealth."

Whilst some of the above points might seem facile to the cognoscenti who are reading this post, several of them actually raise some very interesting points about a core argument in finance: the trade off between skill (if you have enough, then diversification is bad), and luck (if you don't have enough of this, then diversification is always better); as well as the role of risk and access or not to leverage.

Hence this post - the ultimate diversification-good-or-bad post! I'll whizz through quite quickly from first principles some stuff that will be familiar to most of you (though definitely new to many of the people who replied above if they are reading this); and end up in more novel territory I have been considering a lot lately - the best way of quantifying and comparing uncertain outcomes; which feeds into optimal back testing practice.

Some irrelevant stuff

Let's first dismiss some of the problems above as mostly irrelevant:

"Something something asymmetry": All bets in finance are to an extent asymmetric, eithier good (positive skew) or bad (negattive). Arguably long only bets in stocks which have limited downside but unlimited upside fall into this category (though a distribution of stock returns will have fat negative tails for return periods of less than a year; beyond that you get saved by autocorrelation). But an asymmetric diversified portfolio of bets will outperform a non diversified asymmetric portfolio unless certain conditions are bet: asymmetry doesn't affect the basic argument here. Apart from lottery tickets and maybe options (which I don't think Druck uses, and the persistent buying of which will lose you money in the form of paying the variance premium) the degree of asymettry isn't known in advance, so we are left with the problem that unless we can forecast the future to some degree of accuracy diversification is always better.
Also what Druck apparently does, but wasn't in the tweet, is use stops which means it's fine to have concentrated positions: Using stops will just add positive skew (asymmetry) to strategy returns; plus if assets have trends at the appropriate horizon this will be at no cost or even improve the average return. A diversified portfolio of bets with stops will outperform a non diversified portfolio of bets with stops; so again, this doesn't change the basic question we're trying to answer. As someone whose diversified trend following system with implicit stops trades over 200 assets, I am clearly putting my money where my big mouth is on this one.
"If you are so smart, why aren't you as rich as Druck?": Putting aside that Druck isn't actually putting his entire fund into three stocks, as discussed in the many of the replies to the post, my immediate response to this is if Druck is so smart, why isn't he as rich as numerous other people I could mention who do use massive amounts of diversification? One data point doesn't really prove anything. It could just be that he is highly skilled, or very lucky; two explanations I will explore at length in the post proper.

With that in mind we are left with three interesting arguments to debate:

"All" you need to do is pick better stocks; eithier as part of a small concentrated portfolio, a larger one or a barbell setup.
More risk means more return
Diversification makes more sense once you are rich

Risk 'n' Leverage

One key point we need to address is whether we are working in an environment with leverage or without it (although if our assets' volatility is sufficiently high, that will not matter). With leverage, diversification will improve returns by more than without.

To explain you first need to know that my favourite formula in finance is that Kelly optimal risk is equal to your Sharpe Ratio; if your SR is 0.5 then your optimal risk is 50%. My next favourite formula, which you can use to derive my first favourite formula, is the Gaussian approximation* for geometric return AKA CAGR:

m - 0.5s^2

where m is the arithmetic mean and s is the standard deviation. My third favourite formula is that risk will be reduced by a factor of sqrt(N) if you multiply the number of assets in your portfolio by N, and all your assets are independent; higher correlations will lead to smaller reductions. As a rule there are diminishing benefits to diversification, as it gets harder to find assets with low correlations, especially if your universe is just US stocks. But there will always be benefits from diversifying as long as correlations never get to 1.

* By the way with positive skew the CAGR will be higher than it is for Gaussian; with the reverse true for negative skew, but to reiterate, that doesn't change the overall findings we will get.
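As a sketch of how the second formula leads to the first, suppose (as in the text) the Gaussian approximation holds and a portfolio with Sharpe Ratio SR is scaled to run at risk s. The symbols g (geometric return) and r_f (risk free rate) are introduced here purely for illustration, and borrowing is assumed to cost the risk free rate:

```latex
\begin{aligned}
m &= r_f + \mathrm{SR}\cdot s \\
g(s) &= m - \tfrac{1}{2}s^2 = r_f + \mathrm{SR}\cdot s - \tfrac{1}{2}s^2 \\
\frac{dg}{ds} &= \mathrm{SR} - s = 0 \quad\Rightarrow\quad s^{*} = \mathrm{SR} \\
g(s^{*}) &= r_f + \tfrac{1}{2}\,\mathrm{SR}^2
\end{aligned}
```

So under these assumptions the risk that maximises geometric return is equal to the Sharpe Ratio, and the best achievable CAGR depends only on the risk free rate and the SR.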

Let's make some assumptions for now just to illustrate the effect of having more leverage. We have a choice between N=3 assets, and N=3*(4^2) = 48 assets (close enough to fifty). The correlation of those assets is 0.5 (seems reasonable enough for US stocks drawn from multiple sectors, but we'll explore other values later). We also assume we give those assets equal weights.

From favourite formula number three the maximum reduction in standard deviation will be 4 if they are independent, but they are not so it's more like 1.14 (the ratio of 1.40 and 1.22 which are the improvements from N=1 going to N=48 and N=3 respectively). For example if the standard deviation of each stock is 30%, then it will be 30%/1.22 = 24.5% for a portfolio of 3 stocks and 30%/1.4 = 21.4% for 48 stocks.
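The risk reduction figures above can be reproduced with a few lines of Python. This is a minimal sketch rather than the linked gist: it assumes equal weights, identical standard deviations and a single average correlation between every pair of assets, and the function name is made up for illustration.

```python
import math


def diversification_multiplier(n_assets: int, avg_correlation: float) -> float:
    """Factor by which equal weighting n equally risky assets divides the standard deviation.

    Hypothetical helper: assumes every pair of assets has the same correlation.
    """
    # Portfolio variance as a fraction of single asset variance:
    # 1/N from the assets' own variance plus (1 - 1/N) * rho from the covariances.
    variance_fraction = 1.0 / n_assets + (1.0 - 1.0 / n_assets) * avg_correlation
    return 1.0 / math.sqrt(variance_fraction)


for n in (3, 48):
    dm = diversification_multiplier(n, 0.5)
    print(f"N={n}: risk divided by {dm:.2f}, portfolio std dev {0.30 / dm:.1%}")

# With independent assets, 16 times as many assets divides risk by sqrt(16) = 4
print(f"Independent, 16x the assets: {diversification_multiplier(16, 0.0):.2f}")
```

With a correlation of 0.5 the multipliers come out at roughly 1.22 and 1.40, which is where the 24.5% and 21.4% figures come from.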

Let's also suppose for now we can't predict future Sharpe Ratios, and the expected SR is 0.25 on each asset, and the risk free rate is the current 3.75%. Then each stock will have an excess return of 0.25*30% = 7.5% which equates to a total return of 11.25%. That seems optimistic, but we're just playing with numbers for now.

An unleveraged portfolio of 3 stocks will have an average return of 11.25% and a standard deviation of 24.5%; 48 stocks will come in at 11.25% and 21.4% respectively. The Sharpe Ratios on those should be 1.22*0.25 = 0.306 and 1.4*0.25 = 0.35. Check you get those numbers if you calculate the SR manually.
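For anyone who would rather check the Sharpe Ratios in code than by hand, here is a small sketch. It uses the assumptions stated above (30% standard deviation, per asset SR of 0.25, correlation of 0.5, risk free rate of 3.75%); the constant and function names are illustrative only.

```python
import math

RISK_FREE = 0.0375     # risk free rate used in the text
ASSET_STDEV = 0.30     # per stock standard deviation
ASSET_SR = 0.25        # per stock Sharpe Ratio
CORRELATION = 0.5      # assumed average pairwise correlation


def portfolio_stats(n_assets):
    """Return (total return, standard deviation, Sharpe Ratio) for equal weights."""
    # Equal weighting identical assets leaves the mean return unchanged
    total_return = RISK_FREE + ASSET_SR * ASSET_STDEV
    variance_fraction = 1 / n_assets + (1 - 1 / n_assets) * CORRELATION
    stdev = ASSET_STDEV * math.sqrt(variance_fraction)
    # Manual SR: excess return divided by standard deviation
    sharpe = (total_return - RISK_FREE) / stdev
    return total_return, stdev, sharpe


for n in (3, 48):
    total_return, stdev, sharpe = portfolio_stats(n)
    print(f"N={n}: return {total_return:.2%}, std dev {stdev:.1%}, SR {sharpe:.3f}")
```

The manual calculation (excess return over standard deviation) should agree with the shortcut of multiplying the per asset SR by 1.22 or 1.40.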

There is a messy spreadsheet you can play with here: https://docs.google.com/spreadsheets/d/12Y06FAxSW6Ii-dDg4TM1H9g-ZBHTUt6mi1a8Htskeuo/edit?usp=sharing Make a copy, do not ask for edit access. This won't allow you to calculate the reduction in risk from diversification, I used python for that. Here is the messy python code: https://gist.github.com/robcarver17/85238b6d538d6e3acedccfc1cc5a76c8

What about a leveraged portfolio? The optimal leverage to run these at will be risk targets of 30.6% and 35.0% for 3 and 48 stocks. Assuming for the moment we are mad enough to do that, we will end up with leverage ratios of 30.6/24.5 = 1.25 and 35/21.4 = 1.64; and after subtracting borrowing costs we get returns of 13.1% and 16.0%, excess returns of 9.37% and 12.2%; and Sharpe Ratios of 0.306 and 0.35 since SR is unaffected by leverage.

Now using my 2nd favourite formula, we can work out the expected geometric returns. They come in at:

- Unleveraged 3 stocks: 8.25%

- Unleveraged 48 stocks: 8.95%

- Leveraged 3 stocks: 8.44%

- Leveraged 48 stocks: 9.87%

In simple terms, we can't 'eat' risk reductions in the form of improved annual returns without leverage. But lower risk does improve CAGR by 70bp. With leverage we can convert all of our lower risk into higher returns, and thus boost CAGR even more: by over 140bp.
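The four geometric returns can be recomputed with the sketch below. It is not the linked gist: it assumes borrowing costs exactly the risk free rate, that the leveraged investor runs at full Kelly (risk target equal to SR), and it uses illustrative names throughout.

```python
import math

RISK_FREE, ASSET_STDEV, ASSET_SR, CORRELATION = 0.0375, 0.30, 0.25, 0.5


def expected_cagr(n_assets, leveraged):
    """Gaussian approximation m - 0.5*s^2 for an equal weighted portfolio."""
    variance_fraction = 1 / n_assets + (1 - 1 / n_assets) * CORRELATION
    stdev = ASSET_STDEV * math.sqrt(variance_fraction)
    excess = ASSET_SR * ASSET_STDEV            # unleveraged excess return
    sharpe = excess / stdev
    # Full Kelly sets the risk target equal to the SR; otherwise leverage is 1
    leverage = sharpe / stdev if leveraged else 1.0
    # Assumption: borrowing costs the risk free rate, so only excess return scales
    mean = RISK_FREE + leverage * excess
    risk = leverage * stdev
    return mean - 0.5 * risk ** 2


for leveraged in (False, True):
    for n in (3, 48):
        label = "Leveraged" if leveraged else "Unleveraged"
        print(f"{label} {n} stocks: {expected_cagr(n, leveraged):.2%}")
```

The gap between the two unleveraged figures is the roughly 70bp quoted above, and the gap between the two leveraged figures is a little over 140bp.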

Note the maths above for unleveraged traders would be better if:

The individual assets were riskier; at over 50% the optimal leverage would be below 1 even for 48 stocks and we would improve CAGR 'as if' we had leverage.
The per asset Sharpe Ratios were lower; if below 0.15 then the risk target even on the larger portfolio would be sufficiently low that again the optimal leverage would be below 1 even for 48 stocks

The fact that diversification improves CAGR even for unleveraged traders is probably one of the most important yet not widely known results in finance.

Of course the key number in all of the above is the average correlation between stocks; if it's 0.7 then the CAGR improvement falls from 70bp to 42 bp unleveraged; and from 143bp to just 52bp leveraged. A more realistic scenario is if you have a choice between picking three relatively undiversified stocks* and 48, then the larger portfolio will inevitably include several assets that have a higher correlation, bringing your average correlation up, and the diversification benefits down. However it's impossible for the benefit of diversification to go to zero if we assume identical standard deviations and Sharpe Ratios.
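To see how the benefit shrinks as correlation rises, here is a sketch that recomputes the CAGR gap between 3 and 48 stocks for any average correlation. It relies on the same assumptions as before (equal weights, identical 30% standard deviations and 0.25 Sharpe Ratios, borrowing at the risk free rate), and the function name is hypothetical.

```python
RISK_FREE, ASSET_STDEV, ASSET_SR = 0.0375, 0.30, 0.25


def cagr_gaps_bp(correlation, n_small=3, n_large=48):
    """Return (unleveraged gap, leveraged gap) in basis points of CAGR."""

    def variance_fraction(n):
        return 1 / n + (1 - 1 / n) * correlation

    def unleveraged(n):
        variance = ASSET_STDEV ** 2 * variance_fraction(n)
        return RISK_FREE + ASSET_SR * ASSET_STDEV - 0.5 * variance

    def leveraged(n):
        # At full Kelly the CAGR is r_f + 0.5 * SR^2, where the portfolio
        # SR^2 is the asset SR^2 divided by the variance fraction
        return RISK_FREE + 0.5 * ASSET_SR ** 2 / variance_fraction(n)

    return (1e4 * (unleveraged(n_large) - unleveraged(n_small)),
            1e4 * (leveraged(n_large) - leveraged(n_small)))


for rho in (0.3, 0.5, 0.7, 0.9):
    unlev, lev = cagr_gaps_bp(rho)
    print(f"correlation {rho}: unleveraged +{unlev:.0f}bp, leveraged +{lev:.0f}bp")
```

Under these assumptions the gap stays positive for any correlation below 1, which is the point made in the last sentence above.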

Risk 'n' Reward you idiot

You can hopefully already see that the 'you take higher risk therefore you get more reward' without diversification just isn't relevant here. It is true - and a truism of the efficient frontier in CAPM which assumes constant SR - that you can get an improvement to CAGR by adding riskier assets if you are an unleveraged investor and the SR is sufficiently high that you can put more risk on without going past the optimal Kelly point (if you are leveraged you don't care since you can always achieve your required risk target even if assets have a very low standard deviation).

But a larger portfolio of riskier assets will still have a higher geometric return than a smaller portfolio of assets with the same risk!

Basically these people are mixing up the two key kinds of risk in MPT: specific and systematic risk. The choice of the universe of possible assets sets a given risk 'background' systematic level. In certain circumstances you can sometimes but not always improve CAGR by choosing a universe of assets with higher risk. But you can always improve it further by subsequently diversifying your portfolio so you hold more of the universe, and thus reducing your specific risk.

Taking a higher risk through not diversifying means you are not being rewarded for taking more risk; you only get that reward if you trade riskier assets.

Diversification is only for rich people, take one

You can hopefully also see that the 'diversification is only for rich people' argument doesn't apply either. One of the truisms of financial economics is that rich people should have safer portfolios. In practice that might mean running their portfolio risk at half Kelly rather than full Kelly (where risk target = SR). But changing your risk target, whilst it will for example reduce the benefit from using leverage, doesn't affect the improvements you would get from diversifying!

E.g. if I rerun the numbers above with a 50% instrument risk (so leverage doesn't help, simplifying the argument) I get these numbers:

- Unleveraged 3 stocks: 8.44% CAGR with 30.6% risk

- Unleveraged 48 stocks: 9.87% CAGR with 35% risk

It still makes more sense to diversify, even with a higher risk target.

If I now set the maximum risk target for a rich person trying to protect their wealth at 20%, then the advantage of diversification is diminished but it is still there:

- Unleveraged 3 stocks: 7.87% CAGR

- Unleveraged 48 stocks: 8.75% CAGR

Again the logic is clear: just because a poorer person can take higher risk doesn't mean they shouldn't diversify. Not diversifying means you won't benefit from the CAGR boost that comes from the lower risk that diversification brings. A poor person who can't use leverage should probably invest in a universe of riskier stocks (assuming they don't end up above Kelly optimal risk in doing so), but a diversified basket of such stocks will be better than just three.
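The two sets of numbers in this section can be reproduced with a short sketch. It assumes an investor who cannot borrow but can hold cash at the risk free rate (so leverage is capped at 1), runs at the Kelly risk target where possible, and optionally applies a lower maximum risk target; the function name and its parameters are illustrative.

```python
import math

RISK_FREE, ASSET_SR, CORRELATION = 0.0375, 0.25, 0.5


def capped_cagr(n_assets, asset_stdev, max_risk=None):
    """Return (expected CAGR, risk actually run) for an investor who can't borrow."""
    variance_fraction = 1 / n_assets + (1 - 1 / n_assets) * CORRELATION
    stdev = asset_stdev * math.sqrt(variance_fraction)
    sharpe = ASSET_SR / math.sqrt(variance_fraction)
    # Kelly risk target is the SR, but without leverage we can't exceed
    # the natural risk of the fully invested portfolio
    risk = min(sharpe, stdev)
    if max_risk is not None:
        risk = min(risk, max_risk)   # e.g. a wealth preserving 20% cap
    return RISK_FREE + sharpe * risk - 0.5 * risk ** 2, risk


for cap in (None, 0.20):
    for n in (3, 48):
        cagr, risk = capped_cagr(n, asset_stdev=0.50, max_risk=cap)
        print(f"cap={cap}, N={n}: CAGR {cagr:.2%} with {risk:.1%} risk")
```

In both cases the 48 stock portfolio comes out ahead, because the higher portfolio SR is worth something at any risk target.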

There are more subtleties with this poor vs rich that we will come on to later in the post.

A first stab at calculating the skill premium required

With the above back of the envelope figures we can work out the skill we would need as stock pickers to overcome the loss of diversification. To close the 70bp CAGR gap for an unleveraged investor between N=3 and N=48, with the initial assumption of a correlation of 0.5 and SR=0.25, would require you to lift your SR from 0.25 with 48 assets to 0.2735 with only three assets. That is a SR difference of 0.0235.

Actually the premium is slightly larger; if you had 45 assets with a SR of 0.2485, and three assets with a SR of 0.2735, then the per asset SR on your three asset portfolio would be 0.2735 and on your 48 asset portfolio it would average out at 0.25. You need to be able to find three assets with a SR that is about 0.025 higher than the rest of your universe. Or to put it in terms more people would understand, you need to find three assets whose expected average annual return is 0.75% higher than the rest of your universe (0.025 times the 30% standard deviation). That doesn't sound that hard. If correlations were 0.7 rather than 0.5, then you just need an advantage of 0.45%.
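The break-even Sharpe Ratio can be solved directly from the Gaussian CAGR approximation. The sketch below does that for an unleveraged investor under the stated assumptions; it gives the first figure (the SR needed versus a universe average of 0.25), not the slightly larger premium versus the other 45 stocks. The function name is made up for illustration.

```python
RISK_FREE, ASSET_STDEV, UNIVERSE_SR = 0.0375, 0.30, 0.25


def breakeven_sr(correlation, n_small=3, n_large=48):
    """Per asset SR the small portfolio needs to match the large one's CAGR."""

    def variance(n):
        return ASSET_STDEV ** 2 * (1 / n + (1 - 1 / n) * correlation)

    # CAGR of the diversified portfolio at the universe average SR
    target = RISK_FREE + UNIVERSE_SR * ASSET_STDEV - 0.5 * variance(n_large)
    # Solve r_f + SR * stdev - 0.5 * variance(n_small) = target for SR
    return (target - RISK_FREE + 0.5 * variance(n_small)) / ASSET_STDEV


for rho in (0.5, 0.7):
    sr = breakeven_sr(rho)
    extra_return = (sr - UNIVERSE_SR) * ASSET_STDEV
    print(f"correlation {rho}: SR {sr:.4f}, extra annual return {extra_return:.2%}")
```

With a correlation of 0.5 this gives the 0.2735 quoted above; the required edge over the universe average is simply half the difference in portfolio variance, divided by the asset standard deviation.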

The role of modelled uncertainty

45bp to 75bp of extra annual returns doesn't sound that hard. But in this exercise so far we've been completely ignoring the role of uncertainty. We don't actually know any of the numbers I have thrown about in these formulae, except for N of course (assuming we can count to three, and keep counting to 48). In practice standard deviations and correlation estimates are broadly predictable enough that we can treat them as forecastable (and it's only really if we are using leverage all the way up to the Kelly optimal that we should care that much about the exact figures). But forecasting Sharpe Ratios is much harder. And that is to ignore the fact that financial returns aren't really drawn from Gaussian distributions with linear correlations. We need to think about the effect all this will have on our results, and we will do so for the rest of the post.

For now though, let us indeed assume that financial returns are drawn from linearly related Gaussian distributions with stable distributions, whose parameters we know. However we can't assume that this will guarantee outperformance from a given portfolio with superior theoretical risk adjusted returns. This is because luck will still play a part in the form of modelled variance. Expected performance will only be achieved in the very long run AND/OR across the average of many possible future outcomes.

This means that for relatively short periods of time (not the long run, as in the long run we're all dead), and/or for conservative points of the expectation of the future, we'd have more of a preference for portfolios that are safer and have less 'sparse' portfolio weights. An example of a sparse portfolio weight is if you have say 48 assets available but decide to only allocate to say three of them because they have a higher expected SR/return. So sparse is just another word for concentrated.

Note: This obviously ties into proper robust portfolio optimisation for backtesting purposes

Sparseness is penalised in the short run because there is more chance of either very good or very bad outcomes than with non sparse weights, and thus larger variance between outcomes. For example, suppose you pick only three stocks. There is a risk that you happen to pick three crap stocks, and also a possibility that you pick three amazing ones. That is a much larger dispersion of outcomes than if you have a 48 stock portfolio.

Let's look at a picture:

Each of the grey lines shows the cumulative mean returns over a year for 1000 different bootstraps. The setup here is an average correlation of 0.5, 3 assets, risk free rate 3.75%, per asset SR 0.2735 and per asset standard deviation 30% with no leverage. This would equate to an average arithmetic return of 12.0% (the expected CAGR remember comes out to 8.95%). The black line shows the average across bootstraps, and that indeed comes out roughly where we would expect it (equivalent to a CAGR of 8.6%). But there is huge dispersion. If we convert each of the bootstrap outcomes into a CAGR figure we can get a histogram:

The MEAN of this distribution is 8.65%, which is the same as if we had taken the daily mean return across all the distributions and worked out its CAGR. The median however is much lower, and comes in around 5.35%. The 25% quantile is a loss: -10.8%.

What if we did the same thing but with 48 assets, whose per instrument SR is just 0.25? We already know that the lower per asset SR exactly balances out the extra diversification, with an expected CAGR of 8.95%, the same as before. Here is the histogram:

The upside is capped, but the downside is also lower. As a result the 25% quantile is better: -9%.

If I produce much longer time series, then things don't deviate as much. For 20 years of data rather than one, the 25% quantile with three assets is a CAGR of 1.56%. For 48 assets, with the lower per asset SR, it's still higher: 3%.

Note: you have probably seen this sort of exercise before in bonds vs equities comparisons. If you care about say the 25% quantile then you'd probably prefer a portfolio with more bonds or cash for shorter time periods to avoid the potential bear market in stocks that would happen in any given year; whereas for longer time periods we expect the returns of stocks to be sufficiently high that even at the 25% quantile you'd prefer more stocks.

The TLDR on this section is that a more concentrated portfolio will have more dispersion of results, especially over short periods. So even if your stockpicking talent is enough to overcome the loss of diversification on average; you will need an even bigger advantage to be confident that you'll beat a more diversified portfolio say 3/4 of the time.
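The dispersion effect can be illustrated with a simple Monte Carlo. To be clear, this is a sketch under my own assumptions and not the code behind the figures quoted above: it draws Gaussian daily returns from a one factor model, assumes 256 business days a year, compounds daily returns into a CAGR, and uses hypothetical names. The exact numbers will therefore differ from those in the post, but the ordering of the results should not.

```python
import numpy as np


def simulate_cagr(n_assets, asset_sr, n_paths=10_000, n_days=256,
                  asset_stdev=0.30, correlation=0.5, risk_free=0.0375, seed=0):
    """Simulated CAGR of an equal weighted portfolio, one value per path."""
    rng = np.random.default_rng(seed)
    days_per_year = 256                      # assumption: business days in a year
    daily_mean = (risk_free + asset_sr * asset_stdev) / days_per_year
    daily_stdev = asset_stdev / np.sqrt(days_per_year)
    # One factor model: each asset is sqrt(rho) * common shock + sqrt(1 - rho) * own shock.
    # With equal weights the average of N own shocks is Gaussian with variance
    # (1 - rho) / N, so we draw that average directly rather than every asset.
    common = rng.standard_normal((n_paths, n_days))
    own_average = rng.standard_normal((n_paths, n_days)) * np.sqrt((1 - correlation) / n_assets)
    portfolio = daily_mean + daily_stdev * (np.sqrt(correlation) * common + own_average)
    years = n_days / days_per_year
    return np.prod(1 + portfolio, axis=1) ** (1 / years) - 1


for n, sr in ((3, 0.2735), (48, 0.25)):
    cagr = simulate_cagr(n, sr)
    print(f"N={n}: median {np.median(cagr):.1%}, "
          f"25% quantile {np.quantile(cagr, 0.25):.1%}, "
          f"std dev of outcomes {cagr.std():.1%}")
```

Passing a larger n_days (for example 20 * 256) shows how the gap between quantiles narrows over longer horizons.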

Diversification is for the rich, take two

Notice that in the histograms above we are capping the upside more with a diversified portfolio. If the two different sized portfolios above had equal per asset SR then the 99% percentile of the two over one year comes in at CAGR of 86% for three assets, and 73.9% for 48. That's versus medians of 5.0% and 5.8% respectively (with equal per asset SR diversification gets an edge). For me that tiny chance of a higher upside isn't worth the loss I get on average.

Of course over longer time periods, and considering even tinier top percentiles, you would get starker differences. But if your utility function is "I need to make more than X in time T, and if I don't make that number I place zero value on lower wealth" then sure, you would ultimately want to buy riskier assets, use more leverage if possible, and have a more concentrated portfolio. That is one weird utility function though! It's perhaps a very Gen Z influencer view of the world; you need to be rich or risk losing almost all of your money trying, with no other outcome making any sense. It doesn't make sense to this Gen X'er for sure. If your utility function is "I want as much money as possible in expectation (the median), or in the worst 1%, 10%, 25%... percentile outcome" then you should absolutely be diversifying.
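The 'I need to make more than X' utility function can be made concrete with a rough calculation of the probability of hitting a wealth target. This sketch assumes log returns are Gaussian, treats m - 0.5s^2 as the expected log growth rate, uses equal per asset SR of 0.25 for both portfolios, and the function name and target multiples are purely illustrative.

```python
import math
from statistics import NormalDist


def prob_beating_target(n_assets, wealth_multiple, years=1.0, asset_sr=0.25,
                        asset_stdev=0.30, correlation=0.5, risk_free=0.0375):
    """Approximate chance that wealth grows by at least wealth_multiple."""
    variance = asset_stdev ** 2 * (1 / n_assets + (1 - 1 / n_assets) * correlation)
    # Assumption: expected log growth per year is approximated by m - 0.5 * s^2
    growth = risk_free + asset_sr * asset_stdev - 0.5 * variance
    z = (math.log(wealth_multiple) - growth * years) / math.sqrt(variance * years)
    return 1 - NormalDist().cdf(z)


for multiple in (1.05, 2.0):
    p3 = prob_beating_target(3, multiple)
    p48 = prob_beating_target(48, multiple)
    print(f"Target {multiple}x in a year: 3 stocks {p3:.2%}, 48 stocks {p48:.2%}")
```

For a modest target the diversified portfolio is more likely to get there; only for an extreme target, such as doubling your money in a year, does the concentrated portfolio have the better (but still tiny) chance.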

The role of forecast error

Now let's assume we don't know the SR. This is more realistic! The way I have dealt with this before is in the context of a backtest where we need to discover the hidden parameters of a distribution that we assume is stable. That doesn't make as much sense here. Instead let's suppose we have a universe of 48 assets, three of which have a higher SR than the others. We don't know for sure which three are special! How special is special? Let's say a SR of 0.5 rather than the ~0.25 we have been assuming. That might not sound much of an improvement but over 20 years that means compounding to a 15x gain rather than a 3.3x gain.

We begin with the assumption of zero skill. That means we are unlikely to pick all 3 good stocks with a portfolio of three, or even just one. Of course with a portfolio of 48 we will always get the 3 good stocks, but their greatness will be diluted by them keeping company with 45 other shitty stocks. Just how shitty? If you set the 45 stocks to have a SR of 0.23, and the three good stocks at 0.5, then the 48 stock portfolio will have an average per stock SR of 0.247.

We can now generate bazillions of backtests just like before. In each one we pick 3 or 48 stocks from a pool of 48. The results for 48 stocks will be very similar to those from before, since we always end up with the same per stock average of around 0.247. The results for 3 stocks will vary though. The average per asset SR across multiple bootstraps will be 0.247, but it will vary a lot on an individual basis. We have a (3/48)*(2/47)*(1/46) = 0.006% chance of picking three amazing stocks and having a per asset SR of 0.5. There is a (45/48)*(44/47)*(43/46) = 82% chance of picking no good stocks and having a per asset SR of 0.23. That isn't the worst of it of course, since we will also have less diversification which will also punish our expected CAGR.
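The zero skill odds are just draws without replacement, so they can be checked exactly. A small sketch using only the standard library (the constant and function names are illustrative):

```python
from math import comb

N_UNIVERSE, N_GOOD, N_PICKED = 48, 3, 3
SR_GOOD, SR_OTHER = 0.5, 0.23


def prob_good_picks(k):
    """Chance that exactly k of the picked stocks are good ones, with no skill."""
    ways = comb(N_GOOD, k) * comb(N_UNIVERSE - N_GOOD, N_PICKED - k)
    return ways / comb(N_UNIVERSE, N_PICKED)


expected_sr = 0.0
for k in range(N_PICKED + 1):
    per_asset_sr = (k * SR_GOOD + (N_PICKED - k) * SR_OTHER) / N_PICKED
    expected_sr += prob_good_picks(k) * per_asset_sr
    print(f"{k} good stocks: probability {prob_good_picks(k):.4%}, per asset SR {per_asset_sr:.3f}")

print(f"Average per asset SR across all possible picks: {expected_sr:.3f}")
```

This also shows the intermediate cases: picking exactly one good stock happens about 17% of the time, and exactly two less than 1% of the time.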

Here are some statistics of the CAGR distribution, running a one year backtest:

| | 3 assets | 48 assets |
|---|---|---|
| Median | 4.5% | 5.2% |
| 25% percentile | -11.6% | -8.6% |
| 99% percentile | 85.6% | 72.1% |

As you would expect the 48 asset portfolio is superior, except at the very extreme right tail.

Now what happens if we have some skill? We calculate the numbers again using a skill ratio: R. A skill ratio of one means no skill, and just the normal chance of getting a high SR asset on any draw. A skill ratio of two means on our first draw the chance of getting a high SR asset is 2*(3/48) not (3/48). Notice the skill ratio doesn't affect the 48 asset portfolio, since that will always include 45 duffers and 3 special ones.
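One way the skill ratio could be implemented is sketched below. The post only pins down the first draw, so this is a guess at the mechanics rather than the actual method: on every draw it scales the luck-only chance of a high SR asset (3 in 48 on the first draw) by R, capped at 100%. The function name and parameters are hypothetical.

```python
import random


def count_good_picks(skill_ratio, rng, n_universe=48, n_good=3, n_picked=3):
    """Draw a portfolio without replacement and return how many good stocks it holds."""
    good_left, total_left, good_picked = n_good, n_universe, 0
    for _ in range(n_picked):
        # Assumption: skill multiplies the luck-only probability on every draw
        p_good = min(1.0, skill_ratio * good_left / total_left)
        if rng.random() < p_good:
            good_picked += 1
            good_left -= 1
        total_left -= 1
    return good_picked


rng = random.Random(42)
n_trials = 50_000
for skill_ratio in (1, 3, 9):
    picks = [count_good_picks(skill_ratio, rng) for _ in range(n_trials)]
    no_good = sum(1 for p in picks if p == 0) / n_trials
    average_good = sum(picks) / n_trials
    print(f"R={skill_ratio}: no good stocks {no_good:.1%} of the time, "
          f"average {average_good:.2f} good stocks out of 3")
```

Each simulated portfolio's per asset SR then follows from the number of good stocks it contains, which is what would feed into the CAGR backtests.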

What skill ratio R is required for the median of the 3 assets to be the same as that for 48 assets? There is no closed form for this, we just have to play around until we get the right number. A skill ratio of 3 works pretty well (these are bootstrapped numbers so the outcome will vary slightly):

| | 3 assets with skill R=3 | 48 assets |
|---|---|---|
| Median | 5.2% | 5.2% |
| 25% percentile | -10.0% | -8.6% |
| 99% percentile | 86.8% | 72.1% |

With a skill ratio of 9 we can match the 25% percentile:

| | 3 assets with skill R=9 | 48 assets |
|---|---|---|
| Median | 7.6% | 5.2% |
| 25% percentile | -8.6% | -8.6% |
| 99% percentile | 90.3% | 72.1% |

How likely is a skill ratio of 9 in practice? That means our probability of picking one good stock is 56%, when the odds of doing it just by luck are just over 6%. I don't know, that sounds like an astonishing amount of skill to me. Yes some people will pick all three good stocks just by luck - 0.006% of the time, or one in 16,667 people. One of them could even be called Bill.

The above analysis is with a 0.50 correlation; what would it look like with 0.7?

| | 3 assets | 48 assets |
|---|---|---|
| Median | 3.9% | 4.3% |
| 25% percentile | -13.5% | -12.0% |
| 99% percentile | 94.9% | 85.4% |

Then with skill, R>1:

| | 3 assets R=1.7 | 3 assets R=4 |
|---|---|---|
| Median | 4.3% | 5.5% |
| 25% percentile | -13.3% | -12.0% |
| 99% percentile | 95.0% | 95.1% |

TLDR: For a three asset portfolio to be better 3/4 of the time than a more diversified portfolio, when we don't know for sure which assets have the best SR, we need to be very skilled in picking stocks: between four and nine times better than pure luck. If we are concerned only with the average median outcome, then we 'only' need a skill level around two to three times better than pure luck.

The role of other uncertainty

At this point I have to stop, since there is no easy way to model the other sources of uncertainty: parameter changes and distributional uncertainty. Suffice to say that in the real world it will be even harder than it is for the toy models above to have sufficient stockpicking skill to overcome the benefits of diversification.

Summary

My conclusions are unchanged - diversification always pays in expectation and at the lower tail of distributions unless you are very, very skilled. But hopefully reading this post has made the reasons why clearer.


Tuesday, 17 March 2026