Friday 19 November 2021

Mr Greedy and the Tale of the Minimum Tracking Error Variance - Part two

My last blog post was about a new method for a daily dynamic optimisation of portfolios with limited capital, to allow them to trade large numbers of instruments. 

(Although I normally write my blog posts to be self contained, you'll definitely have to read the previous one for this to make any sense!)

Subsequent to writing that post I implemented the method, and quickly ran into some problems. Basically, the damn thing was trading too much. Fortunately there is a single parameter which controls trading speed in the model - the shadow cost. I turned the shadow cost up to a very high value and spent a few days investigating what had gone wrong.

Broadly speaking, I found that:

  • My measure of turnover wasn't suitable for 'sparse' portfolios, i.e. portfolios where most instruments have a zero position at any given time
  • My measure of risk adjusted costs also wasn't suitable
  • The true turnover of the dynamic optimised portfolio was much higher than I'd realised
  • The true trading costs were thus also much higher
  • Thus the results in my previous post were misleading and wrong

After discussing with Doug (the original source of this idea) I realised I'd been missing a step: position buffering. This is something I did in my original static system to slow down my trading behaviour, but which was missing here. 

In this post I explain:

  • Some other bad things I did
  • Why my approximations for trading costs and turnover were wrong, and how I fixed them
  • Why increasing the shadow cost doesn't help
  • Why position buffering is a good thing and how it works in the old 'static' system
  • How to create a position buffering method for dynamic optimisation
  • How I calibrated the position buffering and shadow cost for the dynamic model
  • How I used shrinkage to make the results more stable
  • A more appropriate benchmark
  • Some revised results using the new method, and more accurate statistics

So there's a lot there, but this has been quite an extensive (and at times painful!) piece of research.


Some other bad things I did


As well as the stuff I've already mentioned, I did some bad things when implementing the model in production. Basically, I was so excited about the concept that I rushed it. Here's a list of things I did:

  • Not thinking more carefully about the interpretation of the shadow cost
  • Not using common code between sim and production (this didn't cause serious problems, but did make it harder to work out what was going on)
  • Not paper trading for long enough
I ought to know better, so this just goes to show that even supposedly experienced and smart people make mistakes...



My (bad) approximations for trading costs and turnover


I'm a big fan of using risk adjusted measures for trading costs and other things. Here's how I used to work out trading costs:

  • Work out the current cost of trading
  • Divide that by current vol to calculate the cost in SR units
  • Calculate the turnover in contracts
  • Calculate the normalised turnover: dividing the historic turnover in contracts by the current average position, and then taking an average
  • The average position assumes we have a constant forecast of +10
  • Multiply the normalised turnover by the cost per trade in SR units to get an annual cost
  • I do a similar calculation for rolling costs, assuming I roll an average position worth of contracts N times a year (where N is the number of rolls each year)
The big flaw in this, with respect to sparse positions, is the average position. Clearly our average position is going to be much smaller than we think, which means our turnover will look unnaturally low, and our costs unrealistically low. Here's the new method I use instead:

  • Work out the current cost of trading in actual $$$ terms
  • For each historic trade, multiply that by the number of contracts traded
  • For rolling costs, assume we roll a number of contracts equal to the number of contracts we held in the period before the roll (gives a more reliable figure)
  • Deflate these figures according to the difference in price volatility between then and now
  • So for example, if the vol of S&P 500 is 20% with a price of 5000, that's a price vol of 1000
  • If that figure was 200 at some point in the past, then we'd divide the historic cost by 5
This still risk adjusts costs, but in such a way that it will reflect sparse positions.
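To make that concrete, here's a minimal sketch of the deflation step in Python. All the names here are my own, for illustration only (this isn't the production code), and I'm assuming price vol is supplied as price multiplied by annualised percentage vol:

import pandas as pd

def deflated_trading_costs(trades: pd.Series,
                           current_cost_per_contract: float,
                           price_vol: pd.Series) -> pd.Series:
    # trades: signed trade sizes in contracts, indexed by date
    # price_vol: price * annualised % vol, e.g. 5000 * 20% = 1000 for the S&P 500
    # Scale today's cost back in time: if price vol was a fifth of today's,
    # we assume the cost of trading a contract was also a fifth of today's
    vol_ratio = price_vol / price_vol.iloc[-1]
    historic_cost_per_contract = current_cost_per_contract * vol_ratio
    aligned_costs = historic_cost_per_contract.reindex(trades.index, method="ffill")
    return trades.abs() * aligned_costs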

(Incidentally, I still use the old SR based method to calculate the cost of a trading rule, for which we don't really know the exact number of contracts)

We don't use turnover anymore, but it would still be nice to know what it is. It's hard to calculate a turnover per instrument with sparse positions, but we can meaningfully calculate the turnover for an entire system. We do this by calculating the turnover in contracts at the portfolio level (after dynamic optimisation, or after buffering for the static system), dividing each of these by the expected average position (assuming a forecast of +10, and accounting for instrument weights and IDM), and then adding these up. 
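Here's a sketch of that calculation, again with illustrative names, and assuming roughly 256 business days in a year:

import pandas as pd

BUSINESS_DAYS_IN_YEAR = 256

def portfolio_level_turnover(positions: pd.DataFrame,
                             average_positions: pd.DataFrame) -> float:
    # positions: daily positions in contracts, one column per instrument
    #            (after optimisation, or after buffering for the static system)
    # average_positions: expected average position per instrument, given a +10
    #                    forecast and accounting for instrument weights and IDM
    daily_trades = positions.diff().abs()
    # normalise each instrument's trades by its expected average position
    normalised_trades = daily_trades / average_positions
    # sum across instruments, then annualise the average daily figure
    return normalised_trades.sum(axis=1).mean() * BUSINESS_DAYS_IN_YEAR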


Why shadow cost doesn't help - much


So when I first calculated this new measure of average turnover for the dynamic system, it was quite a bit higher than I expected.

"No problem" I thought "I'll increase the shadow cost".

Remember the shadow cost is a multiplier on the cost penalty applied to the dynamic optimisation. Increasing it ought to slow the system down.

And it does... but not by as much as I'd hoped, and with a decreasing effect for higher shadow costs. Digging under the hood of the greedy algo, the reason for this is that the algo always starts at zero (in the absence of corner case constraints) and then starts changing position in the direction of the sign of the optimal unrounded position.

So it will always end up with the same sign of position as the underlying optimal position. To put it another way, every time the underlying forecast changes sign, we'll definitely trade regardless of the size of the shadow cost. All the shadow cost will do is reduce trading in instruments where we haven't seen a sign change.
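To see the mechanics, here's a stylised version of the greedy algorithm - heavily simplified from the previous post, and purely illustrative. Starting from zero, it repeatedly adds one contract in the direction of the optimal unrounded position, wherever that most improves utility:

import numpy as np

def greedy_positions(optimal_unrounded: np.ndarray, utility) -> np.ndarray:
    # Start at zero and step one contract at a time towards the unrounded
    # optimal, so final positions always share its sign
    positions = np.zeros_like(optimal_unrounded)
    direction = np.sign(optimal_unrounded)
    while True:
        best_utility = utility(positions)
        best_index = None
        for i in range(len(positions)):
            candidate = positions.copy()
            candidate[i] += direction[i]
            candidate_utility = utility(candidate)
            if candidate_utility > best_utility:
                best_utility = candidate_utility
                best_index = i
        if best_index is None:
            return positions  # no single-contract step improves utility
        positions[best_index] += direction[best_index]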

No biggie, except for the uber-portfolio I was running in production with over 110 instruments. Let's say their average holding period is a conservative two months; so they change sign roughly every 42 business days. That means on any given day we'd see around three instruments changing sign, and trading. This imposes a lower limit on turnover, no matter how much you crank up the shadow cost. 

There isn't a lot you can do about this, except perhaps switch to a different optimisation method. I did look at this, but I really like the intuitive behaviour of the greedy algo.

Another - smaller - effect is that the more instruments you have, the higher your IDM, and hence the higher your turnover will be (as a rule of thumb, if you double your leverage on an individual instrument, you'll also double your turnover).

So it's important to do any testing with a sufficiently large portfolio of instruments relative to capital.


Position buffering in a static system


Anyway the solution, after discussion with Doug, revolves around something in my old 'static' system that's missing from the new dynamic system: position buffering. 

Doug isn't doing exactly what I've decided to do, but the principle of applying a buffer was his idea, so he is due credit.

Position buffering works something like this: suppose we calculate an optimal (unrounded) position of long +2.3 contracts (2 contracts rounded). We then measure a buffer around this, which for the sake of argument we'll say is 2 contracts wide. So the position plus buffer will run from (lower level) 1.3 contracts to (upper level) 3.3 contracts. Rounded, that's +1 to +3 contracts. 

We then compare this to our current position. If that's between 1 and 3, we're good. We do nothing. However, let's suppose our current position is +4 contracts. Then we'd sell... not 2 contracts, bringing us back to the (rounded) optimal of +2, but instead a single contract, bringing us to +3... the edge of the buffer.

Buffering reduces our turnover without affecting our performance, as long as the buffer isn't too wide. Calibrating the exact width of the buffer is quite involved (it's related to the interplay of forecast speed and costs for a given instrument), so I've always used a simple 10% of average position (the position I'd get with an average forecast of +10).
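Here's a minimal sketch of that buffering rule in Python (the names are my own illustration, not my actual production code; I've written the buffer as a fraction of average position either side of the optimal, matching the worked example above where a 2-contract-wide buffer spans one contract each side):

def buffered_position(optimal_position: float,
                      current_position: int,
                      average_position: float,
                      buffer_fraction: float = 0.10) -> int:
    # Buffer spans buffer_fraction * average_position either side of the optimal
    half_width = buffer_fraction * average_position
    lower = round(optimal_position - half_width)
    upper = round(optimal_position + half_width)
    if lower <= current_position <= upper:
        return current_position  # inside the buffer: do nothing
    # outside the buffer: trade only as far as the nearest edge
    return upper if current_position > upper else lower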


Position buffering in a dynamic system


How do we impose a buffer on the dynamic system? We could try and set up the optimisation so it treats all positions within the buffer for each instrument as having the same utility... but that would be darned messy. And as we're not doing a grid search which tests every possible integer contract holding, we wouldn't necessarily use the buffer in a sensible way.

Instead we need to do the buffering at a portfolio level. And we'll do the buffering on our utility function: the tracking error.

So the basic idea is this: if the tracking error of the current portfolio is less than some buffer level, we'll stick with the current portfolio (='inside the buffer'). If it isn't, we'll trade in such a way to bring our tracking error down to the buffered level (='trading to the edge of the buffer').

We actually do this as a three step process. In step one we calculate the tracking error of the current portfolio versus the portfolio with unrounded optimal positions (let's call this tracking error/unrounded). If this is less than the buffer, we stick with the current positions. No optimisation is needed. This step doesn't affect what happens next, except to speed things up by reducing the number of times we need to do a full blown optimisation.

Step two: we do the optimisation, which gives a portfolio consisting of integer positions. Step three: we now calculate a second tracking error: the tracking error of the currently held portfolio versus the integer positions (tracking error/rounded). By construction, this tracking error will be lower than tracking error/unrounded. 
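For clarity, the tracking error in all of these steps is just the annualised standard deviation of the difference between two sets of positions, expressed as risk weights. A sketch, assuming we already have a covariance matrix of instrument returns:

import numpy as np

def tracking_error_std(weights_a: np.ndarray,
                       weights_b: np.ndarray,
                       covariance: np.ndarray) -> float:
    # Annualised std dev of a portfolio long weights_a and short weights_b
    difference = weights_a - weights_b
    variance = difference @ covariance @ difference
    # NB: a non positive semi-definite matrix can make this variance negative;
    # see the shrinkage section below
    return np.sqrt(variance)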

We now calculate an adjustment factor, which is a function of tracking error/rounded (which I'm now going to rename x) and the buffer (b):

Adjustment factor = max((x - b)/x, 0)

We then multiply all our required trades (from current to integer optimised) by the adjustment factor. And then round them, obviously.

So if the tracking error is less than the buffer, we don't trade. Otherwise we do a proportion of the required trade. The further away we are from the buffer, the more of the trade we'll do. For a very large x, we'd do almost all of our trade.

Note that (ignoring the fact we need to round trades) doing this will bring our tracking error back to the edge of the buffer.
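Putting steps two and three together with the adjustment factor, the portfolio-level buffering looks something like this sketch (illustrative names again):

import numpy as np

def apply_tracking_error_buffer(current_positions: np.ndarray,
                                optimised_positions: np.ndarray,
                                tracking_error_rounded: float,
                                buffer: float) -> np.ndarray:
    # x is tracking error/rounded; b is the buffer level
    x, b = tracking_error_rounded, buffer
    if x <= b:
        return current_positions  # inside the buffer: no trades
    adjustment_factor = (x - b) / x  # equals max((x - b) / x, 0) since x > b here
    required_trades = optimised_positions - current_positions
    # do a proportion of each required trade, then round to whole contracts
    return current_positions + np.round(adjustment_factor * required_trades)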



Calibrating buffer size and shadow cost


Note the buffer doesn't replace the shadow cost; it's in addition to it. The shadow cost is handy since it penalises the costs of trading according to the different costs of each instrument. Nevertheless, both methods will slow down our trading so we have to consider their joint effect.

We could do this with an old school, in sample, grid search of parameters. Instead let's use some common sense.

Firstly, with an average target risk of 25%, a 10% buffer around that means our dynamic buffer size should be around 1.25% (not the 2.5% you might expect: the buffer here is expressed differently, as a distance from the optimal rather than a full width).

Now consider the shadow cost. In my prior post I said that a shadow cost of 10 'looked about right', but a few moments' thought reveals it is not. Remember the utility function is tracking error minus costs. Tracking error is measured in units of annualised standard deviation. How much does a tracking error of 1% lose us in terms of expected return? Difficult to say, but let's make some assumptions:

  • Sharpe Ratio 1.0
  • Target annual standard deviation 25%
  • Target annual return = 1.0 * 25% = 25%
  • Suppose we're running at half the risk we want, so our tracking error will be 12.5%
  • In this case we'll also be missing out on ~12.5% of annual return
  • If SR = 1, then tracking error is effectively in annualised return units
Now bear in mind that the utility function subtracts the cost of a given optimisation, which is a daily cost; we need to annualise this. So a shadow cost of 250 - roughly the number of business days in a year - would annualise costs and put them in the same units as the tracking error.
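So the utility being maximised looks something like this sketch (the exact cost penalty lives inside the greedy algorithm; names are illustrative):

def portfolio_utility(tracking_error_annualised: float,
                      daily_trading_cost: float,
                      shadow_cost: float = 250) -> float:
    # tracking error is in annualised std dev units (~ annualised return
    # units if SR = 1); the cost of a single day's trades is multiplied by
    # a shadow cost of ~250 business days to put it in the same units
    return -(tracking_error_annualised + shadow_cost * daily_trading_cost)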

To recap then:

  • Shadow cost: 250
  • Buffer size: 0.0125


Shrinkage and the mystery of the non positive semi-definite matrix


In the discussion 'below the line' of the previous post it was suggested that the turnover in the correlation matrix might be responsible for the additional costs. Basically, if the correlation estimate was to change enough it would cause the optimal portfolio to change a lot as well.

It was suggested I try shrinking the correlation matrix, which would result in fewer correlation changes. But the correlation estimate is also quite slow moving, so I didn't think shrinkage would be worthwhile. However I then discovered another problem, which led me down quite the rabbit hole. 

Essentially, once I started examining the results of my optimisation more carefully, I realised that for large (100+ instrument) portfolios there were instances when my optimisation just broke, as it couldn't evaluate the utility function. One cause was inputting costs of NaN, which was easily fixed by making my cost deflation function more accurate. But I also got errors trying to find the standard deviation of the tracking error portfolio. 

So it turns out that pandas doesn't actually guarantee to produce positive semi-definite correlation matrices, which means that sometimes the tracking error portfolio can have a negative variance. I experimented with trying to find the nearest PSD matrix - it's very slow, too slow for backtesting, though possibly worth having as a last line of defence. I tried tweaking the parameters of the exponential correlation estimate; I even tried going back to vanilla non-exponential correlation estimates, but I still ended up with non-PSD matrices. 

What eventually came to the rescue, just as I was about to give up, was shrinking the correlation matrix. For reasons that are too boring to go into here (but try here), shrinkage is a good way of fixing PSD issues. And shrinking the correlation matrix isn't really a bad thing in this particular application: it makes it less likely that we'll put on weird positions just because correlations are especially high or low.
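A sketch of the shrinkage itself, using the identity matrix as the prior (the prior and the shrinkage factor shown here are illustrative choices, not necessarily what I settled on):

import numpy as np

def shrink_correlation(corr: np.ndarray, shrinkage: float = 0.5) -> np.ndarray:
    # Blend the estimated correlation matrix with the identity matrix.
    # This pulls all the eigenvalues towards 1, so a large enough
    # shrinkage factor will repair a mildly non-PSD estimate
    prior = np.eye(corr.shape[0])
    return shrinkage * prior + (1 - shrinkage) * corr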

(If anyone has come across this problem with pandas, and has a solution, I'd like to hear it...)


How should we assess this change?


Yet another flaw in the previous post was an excessive reliance on looking at Sharpe Ratios to see if performance improved, comparing a plain 'rounded' portfolio with an optimised portfolio. 

However, there is an element of luck involved here. For example, if there was an instrument with a low contract size which still had positions even for a small unoptimised portfolio, and which had a higher SR than anything else, then the simple rounded portfolio would outperform the optimised portfolio. 

A better set of metrics would be:
  • The correlation in returns between an unrounded and the rounded optimised portfolio
  • The level of costs paid in SR units
  • The total portfolio level turnover (see notes above)
We'd want to check the SR wasn't completely massacred by the dynamic optimisation, but a slight fall in SR shouldn't by itself be a go/no go decision.



The static benchmark

It could be argued that a better benchmark would not be the simple rounded static portfolio with say 100 instruments; but instead the 30 or so instrument static portfolio you'd actually be able to trade with say $500K in your account. Again, there is an element of luck here, depending on which set of 30 or so instruments you choose. But it would give a good indication of what sort of turnover we are aiming for, and whether we had the costs about right.

So I reran the process described here, and came up with the following list of static instruments for a $500K portfolio:

instrument_list = ['DOW', 'MUMMY', 'FTSECHINAA', 'OAT', 'NIFTY', 'KOSDAQ', 'SP500_micro', 'NIKKEI',
                   'BOBL', 'KR10', 'KR3', 'EDOLLAR', 'US10U', 'US3',
                   'SOYOIL', 'WHEAT', 'LIVECOW', 'LEANHOG',
                   'CNH', 'YENEUR', 'RUR', 'BRE', 'JPY', 'MXP', 'NZD', 'CHF',
                   'BITCOIN', 'IRON', 'SILVER', 'GOLD_micro',
                   'CRUDE_W_mini', 'GASOILINE', 'GAS_US_mini',
                   'VIX']
Notice I've grouped these by asset class:
  • Equities
  • Bonds/STIR
  • Ags
  • Currencies
  • Metals
  • Energies
  • Vol
The results are different from last time, as I've added more instruments to my data set, and also excluded some instruments which are too expensive or illiquid, having recently spent some time putting together a systematic process for identifying those.

For benchmarking purposes I allowed the instrument weights to be fitted in both the dynamic and static cases; however, both sets of portfolios have forecast weights that are fitted in sample, for speed.


Results versus benchmark


First let's get a feel for the set of instruments we are playing with. We have 114 instruments in total, reflecting my constantly expanding universe, with data history that looks like this:

For the 34 instruments in the benchmark portfolio, it's worth bearing in mind that prior to 2000 we have fewer than 15 instruments, and so the early part of the backtest might be worth a pinch or two of salt:




Now in reality we can't hold 114 instruments at once, so in the dynamic optimised portfolio we end up holding around a quarter to a third of what's available on any given day:


Whilst in the benchmark portfolio we mostly hold positions in around 90-100% of what we can trade:

Let's dig into the optimiser some more and make sure it's doing its job. Here is the position series for the S&P 500 micro. The blue line shows the position we'd have on without rounding. Naturally the green buffered position follows this quite closely. We actually end up with the orange line after optimisation; it follows along, but at times clearly prefers to get its risk from elsewhere. This is a cheap future, so the turnover is relatively high:

Another plot I used in the previous post was to see how closely the expected risk tracked between the unrounded and optimised portfolios. It's still a very close match:
Remember one of our key metrics was the return correlation between the unrounded and optimised portfolios. This comes in at a very unshabby 0.986.

Let's look at some summary statistics. The Sharpe Ratio of the optimised portfolio is 1.336 gross, 1.287 net. So that's a SR cost loss of ~0.05 a year. In other words, we'd expect to lose 1/20 of our vol target (1/20 of 25% = 1.25%) in costs annually; the actual average cost is 1.7% a year. The portfolio level turnover is 51.1 times a year. 

In comparison the benchmark portfolio hits a Sharpe Ratio of 1.322 gross, 1.275 net; again a SR cost loss of ~0.05. Actual average cost is slightly lower at 1.45% a year. Turnover is 37.5 times a year. I'll discuss those figures in more detail later.

Incidentally, the (unachievable!) Sharpe Ratio of the unrounded portfolio with 114 instruments is 1.359. So we're only losing a couple of hundredths of SR when holding about a quarter to a third of the available instruments. You can see how close the account curves are here:
Orange: Unrounded, Blue: Optimised

Another way of visualising the costs: if I plot the costs with the sign flipped and multiplied by 10, and then add on the net account curve, you can see that we are losing less than half of our gross performance with that 10x multiplier - so less than 5% in reality. Also, the compounded costs are steady and linear, indicating a nice consistency over time.

Let's compare that to the benchmark:


Although the proportion of costs is slightly lower, you can see here that costs have increased over time. And in fact the annual costs over the last 15 years have been around 2%: higher than the 1.7% achieved by the dynamic system. 

So if we're comparing the dynamic system with the static benchmark:

  • In recent years it has lower costs
  • But probably higher turnover
  • The historic net Sharpe Ratio is effectively identical

The turnover isn't a massive deal, as long as we have accurately estimated our trading costs: we'll do a few more trades, but on average they will be in cheaper markets.

However, to reiterate, the results of the static system are very much more a matter of luck in market selection, particularly for the early backtest when it only has a few markets. Overall I'd still rather have the dynamic system - with the opportunity to catch market movements in over 100 markets and counting - than the static system, where if I miss a market that happens to move I will kick myself.


Summary

"I've prodded and poked this methodology in backtesting, and I'm fairly confident it's working well and does what I expect it to." ....

.... is something I wrote in my last post. And I was wrong! However, after a lot more prodding and poking, which has had the side effect of tightening up a lot of my backtesting framework and making me think a lot, I'm now much more confident. And indeed this system is now trading 'live'.

That was the seventh post on the best way to trade futures with a small account size, and - barring another unexpected problem - that's all folks!