Although this post will make more sense if you've read the book, it can also be read independently as I'll be dropping brief explanations in as we go. Hopefully it will whet your appetite!

You can get the code you need from here:

https://github.com/robcarver17/systematictradingexamples/blob/master/optimisation.py

The code also includes a function for generating "expanding window", "rolling window" and "in sample" back test time periods which could be useful for general fitting.

### The problem

*

*You can think of this as a synthetic constant maturity bond, or what you'd get if you held the 20 year US bond future and also earned interest on the cash you saved from getting exposure via a derivative.*

Some of the issues I will explore in this post are:

- This is a **backtest** that we're running here - a historical simulation. So how do we deal with the fact that 10 years ago we wouldn't have had data from 2005 to 2015? How much data should we use to fit?
- These assets have quite different volatility. How can we express our portfolio weights in a way which accounts for this?
- Standard portfolio optimisation techniques produce very unstable and extreme weights. Should we use them, or another method like **bootstrapping** which takes account of the noise in the data?
- Most of the instability in weights comes from having slightly different estimates of the mean return. Should we just assume all assets have the same mean return?

### In sample

Let's begin by doing some simple **in sample** testing. Here we cheat, and assume we have all the data at the start.

I'm going to do the most 'vanilla' optimisation possible:

opt_and_plot(data, "in_sample", "one_period", equalisemeans=False, equalisevols=False)

This is a very boring plot, but it shows that we would have put 78% of our portfolio into US 20 year bonds and 22% into S&P500, with nothing in NASDAQ. Because we're cheating we have the same information throughout the backtest so the weights don't change. We haven't accounted for the uncertainty in our data; nor done anything with our estimated means - this is just vanilla 'one period' optimisation - so the weights are pretty extreme.

Let's deal with the first problem - different volatility. In my book I use the technique of **volatility normalisation** to make sure that the assets we are optimising weights for have the same expected risk. That isn't the case here. Bonds are much less volatile than stocks. To compensate for this they have a much bigger weight.

We can change the optimisation function so it does a type of normalisation; measure the standard deviation of returns in the dataset and change all the returns so they have some arbitrary annualised risk (20% by default). This has the effect of turning the covariance matrix into a correlation matrix.
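As a rough sketch of what this normalisation step does (this isn't the actual code from optimisation.py; the function name and the 256-business-day annualisation factor are my assumptions):

```python
import pandas as pd

def equalise_vols(returns: pd.DataFrame, target_ann_vol: float = 0.20) -> pd.DataFrame:
    """Rescale each column of daily returns so that its annualised
    standard deviation equals target_ann_vol (20% by default)."""
    BDAYS_IN_YEAR = 256  # assumed annualisation convention
    ann_vols = returns.std() * (BDAYS_IN_YEAR ** 0.5)
    return returns * (target_ann_vol / ann_vols)
```

After this transformation the covariance matrix of the scaled returns is, up to the common 20% scale, just the correlation matrix - which is the point made above.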

opt_and_plot(data, "in_sample", "one_period", equalisemeans=False, equalisevols=True)

Now things are looking slightly more reasonable. The weights we are seeing here are 'risk allocations'; they are conditional on the assets having the same volatility. Even if we aren't lucky enough to have assets like that it's more intuitive to look at weights in this vol adjusted space.

However it's still a pretty extreme portfolio. Poor NASDAQ doesn't get a look in. A very simple way of dealing with this is to throw away the information we have about expected mean returns, and assume all assets have the same mean return (notice that as we have equalised volatility this is the same as assuming the same Sharpe ratio for all assets; and indeed this is actually what the code does).

opt_and_plot(data, "in_sample", "one_period", equalisemeans=True, equalisevols=True)

Now we have something I could actually live with. The only information we're using here is correlations; clearly bonds are uncorrelated with equities and get almost half the weight (which is what they'd get with **handcrafting** - the simple, no computer required, method I discuss in my book). S&P 500 is, for some reason, slightly less diversifying than NASDAQ in this dataset, and gets a slightly higher weight.

However what if our assets do have different expected returns, and in a statistically significant way? A better way of doing the optimisation is not to throw away the means, but to use **bootstrapping**. With bootstrapping we pull returns out of our data at random (500 times in this example); do an optimisation on each sample of returns, and then take an average of the weights from each sample.

opt_and_plot(data, "in_sample", "bootstrap", equalisemeans=False, equalisevols=True, monte_carlo=500)

Notice the weights are 'wiggling' around slightly. This is because although the code is using the same data (as we're optimising in sample), it's doing a new set of 500 optimisations each year, and each will be slightly different due to the randomness of each sample. If I'd used a smaller value for monte_carlo then there would be even more noise. I quite like this 'wiggliness' - it exposes the underlying uncertainty in the data.

Looking at the actual weights they are similar to the previous example with no means, although NASDAQ (which did really badly in this sample) is slightly downweighted. In this case using the distribution of average returns (and correlations, for what it's worth) hasn't changed our minds very much. There isn't a statistically significant difference in the returns of these three assets over this period.
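In code, the bootstrap idea sketched above looks roughly like this (a sketch only - the function and parameter names are mine, not those used in optimisation.py, and any one-period optimiser can be plugged in):

```python
import numpy as np

def bootstrap_weights(returns, optimise_one_period, n_draws=500,
                      sample_length=250, seed=42):
    """Average the optimal weights over many resampled sets of returns.
    `returns` is an (n_days, n_assets) array; `optimise_one_period`
    maps a sample of returns to a vector of portfolio weights."""
    rng = np.random.default_rng(seed)
    all_weights = []
    for _ in range(n_draws):
        # draw days at random, with replacement
        idx = rng.integers(0, len(returns), size=sample_length)
        all_weights.append(optimise_one_period(returns[idx]))
    return np.mean(all_weights, axis=0)
```

Averaging over many noisy draws is what smooths out the extreme weights a single 'one period' optimisation produces; fixing the random seed is one way to make the result reproducible from run to run.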

### Rolling window

To begin with let's use 'one period' optimisation with a lookback of a single year.

opt_and_plot(data, "rolling", "one_period", rollyears=1, equalisemeans=False, equalisevols=True)

As I explain at length in my book one year is wholly inadequate to give you significant information about returns. Notice how unstable and extreme these weights are. What about 5 years?

opt_and_plot(data, "rolling", "one_period", rollyears=5, equalisemeans=False, equalisevols=True)

These are a little more stable, but still very extreme. In practice you usually need a lot more than 5 years of data to do any kind of optimisation, and chapter 3 of my book expands on this point.

I won't show the results for bootstrapping with a rolling window; this is left as an exercise for the reader.

### Expanding window

It's my preference to use an **expanding window** (sometimes called *anchored fitting*). Here we use all the data that we have available as we step through each year. So our window gets bigger over time.
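The three schemes (in sample, rolling, expanding) differ only in which slice of history each period's fit is allowed to see. A minimal sketch of a window-generating function (illustrative names and conventions, not the actual function in the repository):

```python
def fit_windows(n_periods, method="expanding", rollperiods=5):
    """Return one (fit_start, fit_end) pair per period, where the fit
    uses data[fit_start:fit_end]. 'in_sample' cheats and sees everything;
    'rolling' sees only the last rollperiods periods; 'expanding'
    (anchored) sees all data up to the current period."""
    if method == "in_sample":
        return [(0, n_periods)] * n_periods
    if method == "expanding":
        return [(0, end) for end in range(1, n_periods + 1)]
    if method == "rolling":
        return [(max(0, end - rollperiods), end) for end in range(1, n_periods + 1)]
    raise ValueError("method must be in_sample, rolling or expanding")
```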

opt_and_plot(data, "expanding", "one_period", equalisemeans=False, equalisevols=True)

These weights are more stable as we get more data; by the end of the period we're only adding 7% more information so it doesn't affect the results that much. However the weights are still extreme. Adding more data to a one-shot optimisation is only helpful up to a point.

Let's go back to the bootstrapped method. This is my own personal favourite optimisation method:

opt_and_plot(data, "expanding", "bootstrap", equalisemeans=False, equalisevols=True)

Things are a bit hairy in the first year* but the weights quickly settle down to non extreme values, gradually adjusting as we get more data as the window expands.

*

*I'm using 250 days - about a year - of data in each bootstrap sample (you can change this with the monte_length parameter). With the underlying sample also only a year long this is pushing things to their limit - I normally suggest you use a window size around 10% of the total data. If you must optimise with only a year of data then you should probably use samples of around 25 business days. However my simple code doesn't support a varying window size; though it would be easy to use the 10% guideline, e.g. by adding monte_length=int(0.1*len(returns_to_bs.index)) to the start of the function bootstrap_portfolio.*

Just to reinforce the point that these are 'risk weightings' here is the same optimisation done with the actual 'cash' weights and no normalisation of volatility:

opt_and_plot(data, "expanding", "bootstrap", equalisemeans=False, equalisevols=False)

### Conclusion

I hope this has been useful both to those who have bought my book, and those who haven't *yet* bought it (I'm feeling optimistic!). If there is any python code that I've used to write the book you would like to see, let me know.

Bootstrapping of returns isn't always a good idea. Such "take the mean over various optimisation problems" approaches are often not useful for sparse portfolios. Assume you have a universe of 500 stocks but can only invest in, say, 50 at the same time. Obviously taking your average over sparse portfolios would destroy this. On top of this it's computationally demanding, and it takes some effort to reproduce exactly the same results (that's important in a backtest). The noise around your average will cause further artificial trading costs. The method you are suggesting reminds me of a book by Michaud. You may want to cite his work. Having the same mean return is just a more extreme form of shrinkage. I guess you could look into a less brutal shrinkage.

Hi Thomas

Great comments.

You're right bootstrapping isn't great for sparse portfolios. However the kinds of portfolios I deal with in my book aren't sparse; generally you'd have an investment in all of them. Still the point is well made.

I think the issues of computational demand are less problematic than they were in the past; 100 or 200 monte carlo runs is enough to get pretty good results.

One could use some kind of buffering or smoothing to reduce the jumps in weights. However it's probably better to overstate trading costs in a backtest. In reality one would probably use the final weights from the backtest; and then recheck annually that they were still pretty close to the latest bootstrapped values. It's also worth bearing in mind that these jumps are much smaller than you'd get from most other methods, except one with a massive amount of shrinkage.

I actually like getting slightly different results when I run a backtest. I think it reminds us that any backtest is just a single random sample from an unknown universe. But then I'm weird like that :-)

I think this is a bit different from Michaud who does something a bit more sophisticated than me, resampling the efficient frontier rather than the weights of a single optimal point. In my book I credit Jobson and Korkie who I think came up with this non parametric method in the 1980's. I'm happy to recredit them here.

I've also used shrinkage in the past. It does require a bit more skill / work, as you need to (a) come up with a prior and (b) decide how much to shrink given the amount of noise in the data. It's my experience that it's easier to get things wrong with shrinkage methods than with bootstrapping.

Hi Rob

This question may be somewhat related to the previous question.

Let's say I'm systematically trading a portfolio of stocks and I have $ available to open N more positions.

From the universe of many/many stocks what metric should I use to select the stocks in order to maintain a balanced (low correlation) portfolio?

What metric am I trying to minimize/maximize?

I feel like I've seen this question posed and answered other places and that maybe it's a standard portfolio composition/optimization question ... but I'm not sure.

Thanks

So you already have X positions, and you want to open up N more. I'm assuming you don't want to do anything with the X positions you already have. This means you are only optimising part of the portfolio.

Basically you want to do a standard optimisation, but hold the weights you already have constant.

If you just want to minimise variance (max diversification if everything is vol normalised), then you can throw away the mean information.

Suppose you had $A and you now have $B more, and you had weights w1, w2, w3, ... So you have (B / [A+B]) of your portfolio left to allocate and (A / [A+B]) will remain fixed.

Then you need to work out the new effective weights R*w1, R*w2, ... where R = (A / [A+B]).

Then to apply the weights you change this line (137 in the .py file):

bounds=[(0.0,1.0)]*number_assets

to bounds=old_weights + [(0.0,1.0)]*number_other_assets

Where number_other_assets is the number of extra positions (N) you could open, and old_weights=[(R*w1-epsilon, R*w1+epsilon), (R*w2-epsilon, R*w2+epsilon), ...]

Where epsilon is small, but bigger than the tolerance (tol=0.00001); perhaps tol*2.
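Putting that together, the bounds list might be built like this (a hypothetical helper, not code from the repository; R and epsilon are as described above):

```python
def build_bounds(old_weights, n_new_assets, ratio, epsilon=2e-5):
    """Bounds for an optimiser that keeps existing positions (almost)
    fixed while freely allocating the rest to new assets.
    `ratio` is R = A / (A + B); epsilon should be slightly bigger than
    the solver tolerance (e.g. tol*2)."""
    # pin each existing weight inside a tight band around R*w
    fixed = [(ratio * w - epsilon, ratio * w + epsilon) for w in old_weights]
    # new assets are free to take any weight between 0 and 1
    free = [(0.0, 1.0)] * n_new_assets
    return fixed + free
```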

Hi Rob - thanks for your feedback.

Hmmm ... interesting. I think you answered my question but am not 100% sure ...

The actual scenario I'm trying to understand how to automate is:

1) I have an empty portfolio and I want to add positions to it until it's full (however I measure that)

2) I close a position(s) and want to add position(s) to the portfolio until it's full

I assume I wouldn't want to fill the portfolio with highly correlated positions - e.g., all semiconductor stocks or if trading futures all equity futures or grains.

I'd like to write some code to automate both 1) and 2) above but am not quite getting the picture of what metric I should be optimizing .. is it covariance?

Maybe I'm over complicating things and a simple heuristic like never have > N% of the portfolio in a single sector (semiconductor, equity futures, grains, etc) would suffice.

The engineer in me wants to optimize some number to make myself feel good that the portfolio is mathematically "well balanced" however you define that.

Hi Rob

The metric you're trying to optimise is Sharpe ratio, just like a standard optimisation; if all assets have the same expected Sharpe ratio and volatility then that will just be a function of the correlation matrix (if my tired old brain is correct, the weights will always be proportional to the inverse of that matrix); i.e. the minimum variance / maximum diversified portfolio.
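That claim can be checked numerically: with equal Sharpe ratios and volatilities, the unconstrained mean-variance solution is proportional to the row sums of the inverse correlation matrix, w ∝ C⁻¹1. A quick illustration (not code from the post; the constrained optimiser can differ when weights hit their bounds):

```python
import numpy as np

def max_diversification_weights(corr):
    """Unconstrained minimum variance weights for vol-normalised assets:
    proportional to the inverse correlation matrix times a vector of ones."""
    raw = np.linalg.inv(corr) @ np.ones(len(corr))
    return raw / raw.sum()
```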

I think the code I have posted will deal with situation (2). But let's try and think of a step by step method, something like:

We have an existing portfolio, with a set of weights, and 'space' for more (i.e. the weights don't add up to 1).

We assume we want to add an asset to that portfolio, with some given weight (so this is a bit different, we're not finding the weights)

To make things easy let's ignore the 'space' we have except what we need for this asset. This makes the problem recursive.

So we have existing weights w1, w2, ..., wn-1. And we have decided to include something with weight wn.

So the problem becomes which new asset N will give me the highest expected sharpe ratio, given weights w1....wn for my portfolio and existing assets 1....n-1.

I could do this with an optimisation, but it's quite unstable. So in practice let's just iterate over all possible assets in my universe which I could add, calculate the Sharpe ratio for each resulting portfolio, and pick the best.

So the solution is something like this, assuming the total size of the portfolio is S, and all assets will have equal weights:

0) Add your first asset. This should be the asset with the lowest average correlation to other assets. You now have 1- 1/S of your portfolio left to allocate.

1) Find the asset which gives you the highest sharpe ratio, given a particular covariance / correlation matrix given a weight on the new asset of 1/S, and existing weights unchanged.

2) Repeat step 1, but stop when the portfolio is 'full' (we have S assets)

If you close one or more positions, repeat step 1 until the portfolio is 'full' again.

Does that make sense? Of course simple heuristics are good too. The 'handcrafted' method is a simple heuristic; and I actually use that, not bootstrapping, in my 'live' portfolio weights.

Hi Rob,

I think situation 2 is really the same as 1. E.g., I was stopped out of all positions at once and I now need to refill the whole portfolio.

I'm going to go through the steps you listed and see if I understand:

0) Add your first asset. This should be the asset with the lowest average correlation to other assets. You now have 1- 1/S of your portfolio left to allocate.

0.1) I assume the portfolio can contain S positions and for this exercise each are equally weighted as 1/S ... OK?

0.2) I assume I'm calculating the correlation between the price series (closes) of the potential new asset and those already in the portfolio ... correct?

1) Find the asset which gives you the highest sharpe ratio, given a particular covariance / correlation matrix given a weight on the new asset of 1/S, and existing weights unchanged.

1.1) I've already selected an asset (the min avg correlation asset) in step 0 and added it to the portfolio ... correct?

1.2) Re "Find the asset which gives you the highest sharpe ratio".

Can you elaborate a bit on what's going in this step as I'm not sure what I'm "finding" here and what I do with it once found. I.e. what's the output of this step as you've already added a new asset in step 0?

1.3) If I'm calculating the Sharpe ratio of asset a-sub-n I assume I'm using the returns of this asset as would have been produced by the system... correct?

Re "The 'handcrafted' method is a simple heuristic; and I actually use that, not bootstrapping, in my 'live' portfolio weights."

- What is the 'handcrafted' method?

Thanks

No sorry I haven't explained it properly. I'll try again, renumbering the steps for clarity and changing the way the weights work (so they don't correspond to my prior explanation. Just wipe that from your mind. Forget I said it.)

Step 0:

You start with an empty portfolio and some number of potential assets P. We want to find S assets to fill our portfolio. P>>S. We have returns for all P assets, so we can construct a correlation matrix (I'm assuming volatility normalisation, and the same expected average return; as we're just focusing on maximum diversification here).

Step 1:

Then in step 1 you add one asset - your first - the most diversifying. To find this asset you get the correlation matrix of all potential assets. Find the average correlation of all assets with all other assets (this is just the average of each column in the correlation matrix, after you've removed the '1's). Pick the asset with the lowest average correlation.

We now have one asset in our portfolio, and S-1 assets left to find out of a pool of P-1. The weight of this asset - for now - is 100%

In step 2 we take our current portfolio. This consists of N assets (here N=1) with existing weights W1....WN. By definition all the existing weights add up to 100% (here W1=100%).

We're going to add another asset. To make space for this we give that asset a weight of (1/(N+1)). All existing assets must also have a weight of (1/(N+1)). To achieve this we need to multiply the existing weights by N/(N+1). For the trivial case of N=1, the new thing gets a weight of 1/2, and we multiply the existing weight W1=100% by 1/2. So the original asset has a weight of 50%, and the new one a weight of 50%.

W1=.5, W2=.5

Okay so now we look at all the assets left over (P-1 in the trivial case). We calculate the expected portfolio sharpe for:

- a portfolio of the original asset with weight 50%, and the first possible candidate asset with weight 50%

- a portfolio of the original asset with weight 50%, and the next possible candidate asset with weight 50%

....

- a portfolio of the original asset with weight 50%, and the last possible candidate asset with weight 50%

We find which of these portfolios has the highest sharpe ratio. We then select the candidate asset which forms part of that portfolio.

We now have two assets in our portfolio, and S-2 assets left to find out of a pool of P-2. Each asset has 50% weight.

Step 3 is very similar to step 2

Our current portfolio consists of N assets (here N=2) with existing weights W1....WN. (here W1=50%, W2=50%).

We're going to add another asset. For N=2, the new thing gets a weight of 1/3, and we multiply the existing weights by 2/3. So the original assets have weights of 50%*2/3 = 33%, and the new one a weight of 33%.

W1=.333, W2=.333, W3=.333

Okay so now we look at all the assets left over (P-2). We calculate the expected portfolio sharpe for:

- a portfolio of the original asset with weight 33%, the second asset with weight 33%, and the first possible candidate asset with weight 33%

....

We find which of these portfolios has the highest sharpe ratio. We then select the candidate asset which forms part of that portfolio.

Step 4 - and so on, until our portfolio has S elements.

I hope that makes more sense. It would probably be more concise to write it in code, but I'm feeling lazy.
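For what it's worth, the steps above could be sketched in Python roughly as follows (an illustrative sketch only: the names are made up, ties are broken by lowest asset index, and this is not the code from the repository):

```python
import numpy as np

def iteratively_build_portfolio(corr, S):
    """Greedily pick S equally weighted assets out of P candidates,
    maximising diversification (equivalent to maximising the Sharpe
    ratio when all assets share the same expected return and vol).
    `corr` is the P x P correlation matrix of all candidate assets."""
    P = len(corr)
    # Step 1: start with the asset with the lowest average correlation
    avg_corr = (corr.sum(axis=0) - 1.0) / (P - 1)  # drop the diagonal '1'
    chosen = [int(np.argmin(avg_corr))]
    # Steps 2+: add whichever remaining asset gives the lowest portfolio
    # variance under equal weights, i.e. the highest Sharpe ratio here
    while len(chosen) < S:
        best, best_var = None, np.inf
        for candidate in sorted(set(range(P)) - set(chosen)):
            trial = chosen + [candidate]
            w = np.ones(len(trial)) / len(trial)  # every asset gets 1/(N+1)
            var = w @ corr[np.ix_(trial, trial)] @ w
            if var < best_var:
                best, best_var = candidate, var
        chosen.append(best)
    return chosen
```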

Aha - you either haven't bought my book or you haven't got to chapter 4 yet! Sorry, I'm going to be awkward and say if you want to learn about the handcrafted method you're going to have to spend some money, or read a bit. Suffice to say it's a simple but effective heuristic method for portfolio optimisation.

First, no I haven't yet purchased your book but have it on my list after reading the review on the Reading The Markets blog - my go-to source for market book reviews.

OK, I almost 100% completely understand ... almost! I'm going to ask a few questions step by step.

Will start with Step 1 now as I have to think a bit about Step 2 before posing questions. Those will probably come tomorrow.

Step-1:

What two time series are you using to calculate the correlation? Assume we're using daily soybeans, corn for this example.

a) Daily closing values of beans, corn

b) Daily returns of beans, corn

c) Log(daily returns of beans), Log(daily returns of corn)

d) something else

Hi Robert,

Glad to hear you are a potential purchaser.

I'd probably use weekly % returns. The reason for % is that it's more stable over time. The reason I use weekly is that using daily tends to understate correlations especially when you're trading across time zones.
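For example, with pandas that correlation might be computed along these lines (an illustrative sketch, not code from the post):

```python
import pandas as pd

def weekly_return_correlation(prices: pd.DataFrame) -> pd.DataFrame:
    """Correlation matrix of weekly % returns: % returns are more stable
    over time, and weekly sampling avoids the understated correlations
    you can get from daily data across time zones."""
    weekly_prices = prices.resample("W").last()
    weekly_returns = weekly_prices.pct_change().dropna()
    return weekly_returns.corr()
```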

Hi Rob,

Thanks for the correlation info.

Re Step 1:

It seems odd, at least to me, that if we're trying to build a maximally diversified portfolio, as mentioned in Step 0, the correlation matrix (built in Step 0) is used only once, in Step 1, and then never again.

It seems that as it's used to pick the initial market and nothing else, one could (seemingly) just pick the initial market at random and get similar results ... yes?

Re Step 2:

Is the method to calculate the portfolio Sharpe detailed in your book?

I'm not exactly sure how to calculate this and it may be a bit involved for this Q&A.

Thanks -- Robert

Hi Robert

You know what, I'm going to do what I should have done to begin with, which is write some code to make the method clear. When I've done that I'll write a little post. Give me a few days...

Rob

Thanks Rob. No hurry as this is one of those things on my "I wonder how to do this" list.

Regards, -- Robert

Hi Robert

Try this

https://github.com/robcarver17/systematictradingexamples/blob/master/iteratedoptimisation.py

Hi Rob,

If one were to tackle this problem by optimizing a portfolio consisting of all of the available assets, and then selecting the subset which had the highest weights in the "universe portfolio" (in your code example, optimize a portfolio of all three assets, and then select the two which had the highest weights), how would that compare to your iteration approach? Would that yield different results and why?

The code fragment works purely on maximum diversification, so let's assume you use the same approach and throw away the asset with the least diversification. They would get similar results, but because of path dependence they wouldn't necessarily end up in the same place.

Hi!

I'm a complete beginner in developing automated trading strategies. The most "sophisticated" tool that I have access to when testing my strategies is walk forward testing, where I can test my strategies on unseen data. Do you think this tool is "enough" for testing the robustness of my strategies?

Thanks for taking your time.


Actually having a clumsy method for testing is probably a good thing. It will make it much more painful to fit multiple variations of a trading rule, and thus reduce the incentive to do so.

Hi Rob!

DeleteWould you be so kind and elaborate what you mean by " a clumsy method"?

Thanks in advance.

The method you described in your original post would be clumsy. Alternatively imagine something that automatically sweeps through your data, fitting automatically and spitting out an overfitted trading rule at the end.

Hi!

When testing strategies - what do you think is the most powerful way to test whether the strategies have a chance of working in real-time trading?

I am not sure I understand the question. Perhaps you can elucidate or give me an example. But I can think of a few ways to make it less likely you will make money in real time trading:

- overfitting

- fitting in sample

- not accounting properly for costs

Any method which avoids these pitfalls will have a better chance of producing good results than one that makes these errors.

Rob

Hi!

Alright - I've read a lot about overfitting. What do you think is the most common way to overfit a system?

About fitting in sample - are "out-of-sample" tests a good way to respond to this matter?

Thanks for answering.

There are two common ways to overfit.

One is where you do it in an 'implicit' way. This is where you manually backtest your system, look at the entire account curve, change it a bit to improve its performance, and repeat.

The second is where you explicitly fit in sample using a complicated method and/or too many parameters. For example suppose you use a neural network to fit on 20 years of data, reserving one year of data for out of sample testing. By luck your network did well on the out of sample year. But this is a meaningless fluke. Chances are you still have a horrendously overfitted strategy.

You should always use expanding or rolling windows for out of sample testing. This will help avoid explicit overfitting.

Hi Rob,

Thanks for this great resource. I have read through your book but I need to read it again to fully grasp all the concepts. I was going to post a question on how exactly to derive the Forecast Scalars in Table 49, p. 285, but I have now found the example spreadsheet on your support site which does exactly that. So thanks!

Keep up the good work.

Andy

Hi Rob,

Enjoying your book and blog. In your book you refer to volatility normalizing your time series before optimizing portfolio weights. How exactly are you doing your vol normalizing?

Thanks!

If I'm optimising the weights of trading rules or sets of trading rules for each instrument then I don't need to do anything - the expected standard deviation will be identical.

If I'm doing it for assets then I would probably measure the standard deviation of the assets over the last couple of years (although in practice for these examples in this post I took the easy option of measuring standard deviation for the whole period before the point in the backtest where I'm calculating the weights).

Thanks, Rob. Just as a follow-up, so are you merely scaling the assets' time series until their historical vols are equal?

Yes

Dear Rob,

DeleteWhat lookback would you suggest to use when voltility scaling a trading rule forecast? In the book you say "recent stdev", then in spreadsheets there is default 36 EMA. Should i take into account the speed the rule trades when normalizing it?

Also, I don't quite understand how you calculate the stddev for the two trading rule forecasts you've provided:

You've got variance = EMA36(ret^2), missing the -EMA36(ret)^2 term, with the return itself being not a true return, but a price difference. Is there any justification for this approach?

Thank you!

In the past I've looked at using a different vol scaling for different speed rules. It doesn't seem to make much sense / make much difference.

When calculating a trading rule forecast for EWMAC we have a price difference in the numerator, so the standardisation in the denominator should also be in price difference units (I assume by 'a true return' you mean % return)

Dear Robert,

Thanks for the wonderful work within your book and here. I'm struggling to grasp the idea of bootstrapping, so please clarify if I'm thinking correctly:

1. Case of in-sample bootstrapping.

opt_and_plot(data, "in_sample", "bootstrap", equalisemeans=False, equalisevols=True, monte_carlo=500)

So you have roughly 2500 data points (10 years) of returns. What you do is draw 250 returns at random with replacement (for year one) and calculate their statistics (mean, std, correlation matrix), then you do that 500 times and take the average of those statistics. Then you do the same thing for years 2 to 10, is this correct?

2. For the case of expanding window

opt_and_plot(data, "expanding", "bootstrap", equalisemeans=False, equalisevols=True).

Pretty much everything is the same - draw 250 samples at random with replacement, first out of 250 data points (1 yr), do that 500 times, find the statistic averages; then out of 500 data points (2 yrs), do that 500 times, find averages, etc. Is this correct?

3. From your personal experience, is an expanding window better than a rolling window?

Thank You!

Yes what you have written is correct.

I prefer expanding windows unless you have a *lot* of data (say 50 years plus)

Hi Robert

ReplyDeleteI really enjoy your book and going to implement this methods on my trading.

In your book on page 167 you describe that we are going to share our capital across a portfolio of subsystems. However, from my backtesting results I see that my subsystems are not always traded. For each of the subsystems there are periods (sometimes pretty long) where there are no trades. If I'm going to use bootstrapping on these "gapped" subsystems, is my result then realistic?

Thanks

Kris

I'm curious: How long are these periods? What kind of trading rules are you using?

It's fair to say the framework works best when you're trading most markets most of the time (I think I'm normally in about half of my markets).

To answer your question I guess the solution here is to use a longer bootstrap window. I default to a year, but if there is a high probability of not getting any returns for a particular asset with a one year draw, then you should increase that.

In the extreme case you can actually have a window size greater than your data length, if you sample with replacement.

Hi Rob,

The periods vary from 1-2 months to sometimes more than a year. But I must confess that I use only one variant of a Trend Following system at this moment. Since I'm new to trading, my first focus was on building my own data management & backtest system and expanding this with a portfolio framework. When this is done, I'm going to focus on the implementation of strategies.

So I think your statement is right that with a mix of strategies there are fewer gaps.

Kris

Hi Rob,

In determining the asset weights for your own system, do you constrain all weights in your optimisation such that they are bound by zero and 1, meaning "no short sales" based on the weights? I believe this is what you do, as your trading rules determine long/short positions. Am I right?

Yes. It makes no sense to give a trading rule or subsystem a negative weight.

One question I am struggling with is the following: if I am performing a mean-variance Markowitz optimisation (via bootstrapping), I believe this will put a downward bias on the weights of winning shorts, so that they are a smaller part of the portfolio even though they are profitable. Why? Because if we constrain the weights to be positive when setting the instrument weights, the assets with negative returns (which could have been profitable shorts) will have very low/zero weights in the optimisations. What is your take on this?

I'm confused. Either (a) you're trying to create a long/short portfolio, in which case weights aren't constrained to be positive, or (b) you're creating a long only portfolio. In case (a) profitable shorts will have just as much chance of getting a decent negative weight as profitable longs. In case (b) they'll get a zero weight, but then you're running a portfolio where you can't go short, or it doesn't make sense to (see previous comment).

I am attempting to apply your approach, in that I (i) determine the weights allocated to each instrument and (ii) determine the weights allocated to each rule. For both (i) and (ii) I apply mean-variance (m/v) optimisation via bootstrapping. Both are assumed to be long only, but the trading rules in (ii) can include shorts. If I allocate instrument weights in (i) via m/v, some instruments will receive a zero weight if they yielded negative returns. However, the system would be more profitable if we included instruments with negative returns, as the trading rules from (ii) would allow us to go short these assets. Is this correct? Also, are there any instruments in your portfolio that receive zero instrument weights?

Step 1: We create trading rules to forecast prices for each instrument. Forecasts can be long or short. (chapter 7 of my book)

Step 2: We combine our forecasts to get one forecast per instrument. This is an optimisation problem, where the weights are the forecast weights, and the inputs are the returns of each trading rule. Weights are bounded below at zero and total 100% (chapter 8)

Step 3: We now have a portfolio of *subsystems*, miniature trading systems, one per instrument. We allocate our capital amongst this portfolio. This is an optimisation problem: the weights are the amount of capital in each instrument subsystem, and the returns are the amount of money each subsystem makes. Weights are bounded below at zero and total 100% (chapter 11)

Notice that at no stage do we allocate capital directly to positions in instruments. Therefore we never allocate a negative weight. A short in an instrument will arise if its combined forecast is negative.

If a trading rule makes good forecasts on the short side for a particular instrument, then it will get a higher weight, and we'll happily short the relevant instrument.

In theory it's possible for an instrument subsystem or trading rule to get a zero weight through bootstrapping, but only if its performance is incredibly bad and it's highly correlated to another rule / subsystem.
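The three steps can be caricatured in a few lines of Python. This is purely my illustrative sketch (names and numbers invented, not code from the book); it just shows how a positive instrument weight can still produce a short position.

```python
def combined_forecast(rule_forecasts, forecast_weights):
    """Step 2: weighted sum of per-rule forecasts for one instrument."""
    return sum(w * f for w, f in zip(forecast_weights, rule_forecasts))

def subsystem_positions(forecasts, instrument_weights, capital):
    """Step 3: capital per subsystem, signed by the combined forecast
    (scaled so a forecast of +10 means a full-sized long position)."""
    return {name: capital * instrument_weights[name] * (f / 10.0)
            for name, f in forecasts.items()}

# one instrument: two rules say long (+10), one says short (-5)
fc = combined_forecast([10, 10, -5], [0.4, 0.4, 0.2])  # net long forecast
positions = subsystem_positions({"US20": fc, "SP500": -10},
                                {"US20": 0.6, "SP500": 0.4},
                                capital=100000)
# SP500 ends up short, despite its positive instrument weight
```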

Hi Rob

Just to say I think your work has been an eye-opener to how the hedge fund world operates. So greatly appreciated.

I have no financial or programming background but have been trying to implement your methods in C sharp, which I know a little. So I have done the variance covariance matrix on random portfolios with different characteristics etc etc. I understand the bootstrapping principle to select the rules but come to an abrupt stop when it comes to optimisation.

The code you use in your example uses method='SLSQP' as part of the markosolver function. Is this something you have coded yourself in Python, or is it something freely available to users of Python?

The other question I have is that the code you exhibit is really just the overall structure. I presume you have coded the detail behind the framework, or is a lot of this already pre-coded "library" code available to users of Python?

Many thanks

Chris

SLSQP (sequential least squares programming) is part of a standard Python package, scipy. Clearly I had to code a lot of stuff, but the Python pandas library handles things like storing and manipulating time series, and I certainly don't fancy programming my own non-linear optimiser.
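For anyone searching: the optimiser lives in scipy.optimize. A stripped-down sketch of a long-only SLSQP call - this is my own minimal minimum-variance example, not the actual function from the post, which also works with estimated means:

```python
import numpy as np
from scipy.optimize import minimize

def min_variance_weights(cov):
    """Long-only, fully-invested minimum-variance weights via SLSQP."""
    n = cov.shape[0]
    start = np.ones(n) / n                    # equal-weight starting point
    bounds = [(0.0, 1.0)] * n                 # no short sales
    cons = [{"type": "eq", "fun": lambda w: w.sum() - 1.0}]  # sum to 100%
    res = minimize(lambda w: w @ cov @ w, start,
                   method="SLSQP", bounds=bounds, constraints=cons)
    return res.x

cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])                # toy 2-asset covariance matrix
w = min_variance_weights(cov)
print(w.round(3))                             # roughly [0.727, 0.273]
```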

Thanks Rob

Looks like I would save a lot of time by learning python and having a look at your open source code! Can't be that difficult! Will download python tonight!

Chris

Hello Rob,

Do you consider/discuss research into an 'optimal window length', as a function of Sharpe ratio or another metric, for expanding window fitting? If so, in which blog article (or part of your book) is it located? If not, what's your critique of the prospect of doing so?

Also, would it be technically correct to call expanding or rolling window testing 'cross-validation'?

I've never thought about "optimal window length". If you're trading slowly (holding period of weeks or months) then the optimal window length is infinity (or at least far more data than you probably have). If you're a high frequency trader then it's probably a few months.

So as a rule of thumb: optimal window length = holding period * a very big number

Technically speaking an expanding or moving window isn't cross validation, but it is "good" in the sense that our training and testing data are always different.

Thanks for commenting - as far as the question of an optimal window length, I got the notion from a blog here (no affiliation with them whatsoever): robotwealth.com/optimal-data-windows-for-training-a-machine-learning-model-for-financial-prediction/

Not apples-to-apples, as they're going over cross-validation, and on a different trading strategy than what you might use...

The result for optimal window length in that article isn't statistically significant - all the account curves shown are indistinguishable from each other and from noise.

Hi Rob, thanks for the interesting book. Can you comment on the use of fx-adjusted returns versus local returns in the asset allocation example? When bootstrapping weights on a portfolio similar to the book's, as a GBP investor I see FTSE and UK bond weights increase for fx-adjusted returns. I think this probably makes sense, since the bulk of the assets are not GBP, so this diversifies the currency exposure. I appreciate that one could borrow the foreign currencies or use fx derivatives, but that is not available to all accounts. If you go down the route of foreign borrowing, do people account for the cost of repatriating profits and losses ('quanto' in derivatives terminology) in their Sharpe ratios?

Essentially I'm assuming that we mark to market all foreign currency p&l daily; in other words there is an implicit assumption that (a) there is no margin, (b) there is no interest or fx risk on capital, and (c) all returns made in foreign currency are immediately repatriated on a daily basis.

You will appreciate that properly accounting for real world effects on all these things is very difficult, so the question is: how much does it matter?

In terms of a more real world example if I look at the effect of FX + net interest on my own account over the last three years the figures are: 1.6%, 3.2%, -1.1%. That compares to p&l swings of 0% to 50+%.

So the effect is small compared to the variability of returns overall. Indeed because of the way my account is structured, with equities funding cash for futures margin, the effect of FX is probably higher than for an all cash funded account.

I'm currently re-estimating the weights every 21 days using bootstrapping.

I'm taking gross returns that are all normalised/scaled to a min/max of +/-20. Then for each day I take all the historic returns data available up to that date across all (17) assets, and from them randomly sample and combine 20 x 256 day blocks to create one long block. I then make 200 MC runs across that long block and average the samples (256 days each) to get the weights.

When doing this using 200 MC runs, I noticed some fairly big variations between re-estimations. A weight might be 0.15 on one occasion but then 0.5 on the next. I realise this isn't necessarily 'wrong' (returns change) but as an experiment I tried upping the MC runs to 1000 and found that the largest change for individual weights between re-estimations dropped to max of about 0.1 (e.g. 0.15 might become 0.25). (I tried various other numbers of MC runs and found that 1000 was about as 'good' as it got - negligible further change reduction above 1000 and pretty much a linear increase below that down to 200 MC iterations.)

This prompted me to wonder if there was a conceptually 'better' way to do this:

1. Simply increase the MC runs per estimation to 1000

2. Calculate an average of the average weights from each estimation (200 runs) either cumulatively or on a rolling window basis and use that average of averages for the weights

3. Both of the above

Would much value your view on this (even if you think it is just a futile exercise in false precision). Many thanks.

Note: I'm using C not Python (so extra MC cycles aren't very computationally expensive) and the weights are based on the same 6 sets of EMA lengths as in your book, but I'm applying those lengths to low pass filters instead of EMAs (because I noticed that the correlation between LP filters of the same periods as the EMAs was considerably lower).

Increasing the number of MC runs will improve things, but asymptotically (decreasing returns to more runs). I'm not sure whether taking an average of an average is better... I think in the limit and on average the average of X runs of N length should have the same properties as a run of X*N.

So if you can afford it computationally then yes, just increase the number of MC runs.

Rob - thanks for the useful posts, books, etc. Long-time lurker here. Quick(ish) question: when bootstrapping returns, what's the best way to keep the serial correlation (and other such properties, I guess)? Block re-sampling or something else?

Yes, block sampling is the best way.
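A sketch of block resampling, in case it helps anyone: instead of drawing single days, draw contiguous runs of days so that short-run serial correlation survives inside each block. The 20-day block length here is an arbitrary choice for illustration, not a recommendation from the post.

```python
import numpy as np

def block_bootstrap(returns, n_days, block_length=20, seed=None):
    """Resample contiguous blocks of returns (with replacement) until
    we have n_days of data, preserving within-block autocorrelation."""
    rng = np.random.default_rng(seed)
    blocks = []
    total = 0
    while total < n_days:
        start = rng.integers(0, len(returns) - block_length + 1)
        blocks.append(returns[start:start + block_length])
        total += block_length
    return np.concatenate(blocks)[:n_days]

data = np.random.default_rng(0).normal(0, 0.01, 1000)
resampled = block_bootstrap(data, n_days=256, seed=1)
print(len(resampled))  # 256
```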

DeleteHi Rob, I just wanted to double check I was correctly calculating the returns from individual EMA parameter sets. I'm simply taking the volatility standardised asset returns for each day for each asset and multiplying them by the normalised (i.e. scaled +/-20) predictions generated by each EMA parameter set (i.e. (2,8), (4,16)) for each day. Is that correct? (I'm coding this in C rather than using your Python version)

Apologies for the v basic question, but I'm getting v odd portfolio weights from Markowitz for the slowest two parameter sets when only those two alone are being bootstrapped. They are both always zero, even when the mean returns (as calculated above) for both are positive.

Many thanks.

That sounds about right, but without seeing your code it's hard to be sure (and please don't show it to me - I don't look at other people's code, and that goes doubly so for C).
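For what it's worth, my reading of the calculation being described, with made-up numbers (note the forecast must be lagged, so each day's p&l only uses information available at the time):

```python
import numpy as np

rng = np.random.default_rng(0)
vol_norm_returns = rng.normal(0, 1.0, 256)           # daily return / daily vol
forecast = np.clip(rng.normal(5, 10, 256), -20, 20)  # a stand-in forecast, capped at +/-20

# lag the forecast one day: today's p&l uses yesterday's forecast
lagged = np.roll(forecast, 1)
lagged[0] = 0.0

# divide by 10 so an average-strength forecast gives one unit of risk
rule_returns = vol_norm_returns * lagged / 10.0
print(rule_returns.shape)  # (256,)
```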

Hi Rob, I wonder what your thoughts are on incorporating the autoregressive nature of returns in your bootstrapping? How would you go about this? IIRC there are methods which use the residuals of an ARIMA process in the bootstrapping.

Block bootstrapping.

DeleteHi Rob,

I managed to write my own bootstrapper. But currently it has the constraint that all of the instruments must have history going back to a certain point in time (2009).

I suppose that's fine as long as I'm using well-established instruments, but I was hoping to bootstrap some stocks and ETFs that were launched later, say as late as 2015 or so.

So obviously, I cannot estimate correlations between these instruments over the period 2009 through 2014. And any bootstrap iteration that includes any sliver of that date range cannot use the new instruments.

Do you know of a way to bootstrap with a combination of the established and newer instruments yet still have relatively stable weights?

Basically, if I need a weight for a particular instrument in a period (I do a rolling optimisation, so early on in the sample that may not be an issue) and it is absent from the current historical data up to my reference date, then I allocate a pro-rata weight. For example, suppose I have 4 instruments, 2 with data, 2 without. The two with data will get whatever weights the optimisation gives them. The other 2 will get 25% weights. I call these 'cleaned' weights. You can also be a bit more conservative and only allocate, say, half the pro-rata allocation, which in this case would give you 12.5% in each of the instruments with no data.

The issue then becomes what to do when we start to get data for a given instrument where there was none before. The bootstrapped weights will be biased downwards for the instruments without much data, because they don't appear in the samples very much. You can argue this is the 'right' behaviour, since we allocate less to things we don't understand yet. But this is inconsistent with equally weighting when we know nothing at all!

One solution is to apply an exponentially weighted smooth to your weights. I do this anyway to reduce trading costs when weights are re-estimated in a backtest. But the smooth will have to be rather slow to deal with the problem.

Another is to use a linear function to transition between pro-rata 'clean' weights and weights from a sample. So with zero years of data you'd use clean weights; with, say, 5 years of data you'd use the weights from the sample; and with 2.5 years of data you'd use an average of those two sets of weights.
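The pro-rata 'cleaning' described a few comments up can be sketched like this (hypothetical helper name; this is the equal-shares version rather than the conservative half-share variant):

```python
def clean_weights(weights, have_data):
    """Give each instrument without data an equal 1/N share, and scale
    the optimised weights of the rest into the remaining capital."""
    n = len(weights)
    missing = [i for i in range(n) if not have_data[i]]
    if not missing:
        return list(weights)
    each = 1.0 / n                               # equal share per missing asset
    remaining = 1.0 - each * len(missing)        # capital left for assets with data
    total = sum(weights[i] for i in range(n) if have_data[i])
    return [each if i in missing else weights[i] * remaining / total
            for i in range(n)]

# 4 instruments: 2 optimised to 80%/20%, 2 with no data yet
print(clean_weights([0.8, 0.2, 0.0, 0.0], [True, True, False, False]))
# -> [0.4, 0.1, 0.25, 0.25]
```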

Wow, thanks for the info! This leaves room for much experimentation.

Hi Rob,

Thanks for your excellent post.

For the volatility normalisation, how do you apply the "risk weightings" in a real portfolio? Is any adjustment necessary, given that the returns were adjusted by volatility beforehand?

If your risk weightings are r1, r2, ..., then you divide each one by the volatility of that asset to get raw cash weights, and then renormalise the resulting cash weights so they add up to 100%.

So w1 = r1 / s1, ... and then renormalise all w please?

By renormalise I mean: suppose you had risk weights of 50% in each asset, and the risks were 10% and 20% respectively. The raw cash weights are 50%/10% = 500% and 50%/20% = 250%. The total is 750%. Now I divide by that: 500/750 = 66.7% and 250/750 = 33.3%.

(If memory serves, in 'Smart Portfolios' I suggest you multiply by the ratio of target standard deviation to asset risk. But it doesn't really matter, as the target cancels out.)
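The worked example above in code form (a trivial sketch, but it makes the renormalisation step explicit):

```python
def cash_weights(risk_weights, vols):
    """Divide each risk weighting by that asset's volatility, then
    renormalise so the cash weights sum to 100%."""
    raw = [r / s for r, s in zip(risk_weights, vols)]
    total = sum(raw)
    return [x / total for x in raw]

# 50% risk weight in each asset; volatilities of 10% and 20%
w = cash_weights([0.5, 0.5], [0.10, 0.20])
print([round(x, 4) for x in w])  # [0.6667, 0.3333]
```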

Hi Rob,

I hope you're doing well.

I've been exploring the portfolio optimization techniques outlined in your work, specifically the bootstrapping with replacement approach combined with an expanding window. This has led me to a couple of questions regarding the methodology and its application:

Monte Carlo Length Adjustment: In the context of using an expanding window for bootstrapping, would it make sense to increase the length of each Monte Carlo simulation (monte_length) proportionally to the size of the expanding window? This adjustment could potentially capture more of the evolving data characteristics over time. I'm curious about your perspective on this approach.

Optimization Frequency: From the analysis presented, it seems that the optimizer's output weights are kept static on a yearly basis. Could you share more about the rationale behind this decision? Was the frequency of running the optimizer determined based on empirical analysis, theoretical considerations, or a discretionary choice?

Application to Sparse Data: My current project involves data with a weekly frequency, but I only have around 600 weeks' worth of data. Given the relatively limited dataset:

How would you suggest adjusting the frequency of recalculating forecast weights?

Are there any recommendations for setting the window size for the expanding window and the number of bootstrap runs in this context?

Thank you for your time and for sharing your expertise!

"would it make sense to increase the length of each Monte Carlo simulation (monte_length) proportionally to the size of the expanding window? " Yes in fact I would use the rule monte_length = max(length of window, 1 year).

Delete"Optimization Frequency: From the analysis presented, it seems that the optimizer's output weights are kept static on a yearly basis. Could you share more about the rationale behind this decision?" it's very unlikely that we would get any interesting new information with less than an additional year of data and this slows things down a lot. In reality I fit these weights less frequently than annually- almost never.

Delete"Application to Sparse Data: My current project involves data with a weekly frequency, but I only have around 600 weeks' worth of data."

(Pedantically, this isn't sparse data but data with limited history; not quite the same thing.)

" Given the relatively limited dataset:

How would you suggest adjusting the frequency of recalculating forecast weights?" I wouldn't.

"Are there any recommendations for setting the window size for the expanding window and the number of bootstrap runs in this context?" No again I'd use something like max(1 year, available data) for window size and # of bootstraps as many as possible without killing your CPU (though with smaller window sizes you will find you don't need as many bootstraps as there are fewer unique combinations of samples).