Friday 4 December 2020

Dynamic trend following

As most of you know I have a regular(ish) gig talking on the Top Traders Unplugged systematic investor podcast, every month or so with Niels Kaastrup-Larsen and Moritz Seibert. 

Anyway on the most recent episode we got chatting about whether open or closed equity should matter when trading a position. More broadly, should your history of trading a position affect how you trade it now, or is it only what's happened in the market that matters?

Moritz and I had a bit of a debate about this; I'm a big fan of running my system on a 'stateless' basis where the only thing that matters is the market price. My logic is that the market does not know what my position is or has been, or how much profit I've made. That means if I'm using a stop loss, the size of the stop loss will remain the same regardless of what's happened to my p&l since I opened the trade.

Moritz on the other hand, seemed to imply that you should change your trading tactics depending on how the position has played out. The basic idea is that initially you should have pretty tight stops, and once you've made a decent profit you should increase the stop so that the position can 'breath'. Then you have a better chance of hitting a home run if the trade lasts a long time, without being stopped out early when you've just made a profit. These are dynamic stop losses, that adjust throughout the life of a trade.

I followed this up with a twitter thread where I clarified my thinking and got some interesting feedback. I also promised to do some more research. This blogpost is that research. But it's not just about that.

That's because this idea is closely related to another perpetual bone of contention  polite discussion between myself and Moritz, which is whether positions and stop losses should be adjusted as volatility changes. I like dynamic vol control: adjusting position sizes as vol changes. His preference is for no adjustment, for the same reason that if the market is getting riskier you've got a better chance of having a big up trade. 

What these two things have in common is that, intuitively at least, the 'purer' trend following tactic (no dynamic vol, but dynamic stop losses) should lead to more positive skew.

I would like to check that intution, and also see if this is an example of 'the no free lunch effect' (whereby you can only get better skew by giving up Sharpe Ratio). In plain english, what effect do these two changes have on Sharpe Ratio and skew? Then at least we can make an informed decision based on our preferences.


Discrete and continuous trading systems

Before we start, I need to make one thing clear. There is one significant characteristic of all trading systems: they are eithier discrete or continuous. A discrete trading system works like this:

  1. Something happens ('entry rule')
  2. We open a trade
  3. (Optionally) we make adjustments to the trade
  4. Something else happens ('exit rule'). A common exit rule is a stop loss.
  5. We close the trade
That's probably 99.9% of retail trading systems right there. Most of you will know systems like that. Some of you will recognise this as the 'Starter System' from my third book, 'Leveraged Trading'

What I actually use is a continuous system:

  1. We calculate an optimal position that we want to take
  2. We compare it to the position we currently have
  3. We adjust to get to our optimal position by trading 
There are no 'trades' here, just positions that vary in size. This is the system described in part four of 'Leveraged Trading', and it's also the 'Staunch Systems Trader' from my first book, 'Systematic Trading'.

Now the whole discussion about dynamic vol control and dynamic stop losses doesn't make any sense in the context of continuous systems: they automatically dynamically control vol, and they don't really have stop losses (but since optimal positions are based only on price movements they are definitely state-less). 

So for a like for like comparision, we're going to have to use a discrete benchmark system, and it won't be an astonishing surprise to hear I will be using the starter system from 'Leveraged Trading'. Briefly, the starter system uses a 16,64 moving average to open positions, and a 0.5x annual standard deviation stop loss to close them. So it doesn't do dynamic vol control (not because that's optimal, but because it's simpler and the starter system has to be as simple as possible), but it also has fixed stops: no dynamic stop loss eithier.

If you are too cheap to buy my books, then I describe the important elements of that system in the series of posts that begins here.

We can consider four variations of the Starter System:

- No dynamic vol control, no dynamic stop loss: Simple starter system as presented in the book Leveraged Trading
- Dynamic vol control, no dynamic stop loss: My preference
- No dynamic vol control, dynamic stop loss ('position breathing'): 'purest' form of trend following
- Dynamic vol control, dynamic stop loss: 'double dynamic'

Code for the starter system

To implement the starter system (all code can be found in this gist) using pysystemtrade we do the following:

  • use a single MAV rule with a binary forecast
  • replace the positionSize stage with something that:
    • calculates a 'preliminary position' which is just the binary position scaled for vol
    • adjusts this preliminary position using the function stoploss to create discrete trades
Now, I'm not going to bore you with thousands of lines of python in this code, but it's worth looking at the function stoploss in some detail. 

def stoploss(price, vol, raw_position, dynamic_vol=False, dynamic_SL = False):

assert all(vol.index == price.index)
assert all(price.index == raw_position.index)

# assume all lined up
simple_system_position = simpleSysystemPosition(
new_position_list = []

for iday in range(len(price)):
current_price = price[iday]
current_vol = vol[iday]

if simple_system_position.no_current_position:
# no position, check for signal
original_position_now = raw_position[iday]
new_position = simple_system_position.no_position_check_for_trade(original_position_now,
current_price, current_vol)
new_position = simple_system_position.position_on_check_for_close(
current_price, current_vol)


new_position_df = pd.DataFrame(new_position_list, raw_position.index)

return new_position_df
'raw_position' is the position we'd have if we were trading  continuously. It's the position that will be taken when a new trade is opened.

Let's have a look at the workhorse class simpleSysystemPosition.

If we haven't got a trade, we run this logic. Remember 'original position_now' is the unfiltered position; basically the sign of the forecast multiplied by a precalculated position size:

def no_position_check_for_trade(self, original_position_now, current_price, current_vol):
assert self.no_current_position
if np.isnan(original_position_now):
# no signal
return 0.0

if original_position_now ==0.0:
return 0.0

# potentially going long / short
# check last position to avoid whipsaw
if self.previous_position != 0.0:
# same way round avoid whipsaw
if sign(
original_position_now) == sign(self.previous_position):
return 0.0

self.initialise_trade(original_position_now, current_price, current_vol)
return original_position_now

The only point of interest is that we don't put a trade on if our previous trade was in the same direction. This is discussed in Leveraged Trading.

What if we have a position on already?

def position_on_check_for_close(self, current_price, current_vol):
assert not self.no_current_position

new_position = self.vol_adjusted_position(current_vol)

time_to_close_trade =self.check_if_hit_stoploss(current_vol)

if time_to_close_trade:

return new_position
def check_if_hit_stoploss(self, current_vol):
stoploss_gap = self.stoploss_gap(current_vol)

sign_position = sign(self.current_position)
if sign_position == 1:
# long
time_to_close_trade = self.check_if_long_stop_hit(stoploss_gap)
# short
time_to_close_trade = self.check_if_short_stop_hit(stoploss_gap)

return time_to_close_trade
def check_if_long_stop_hit(self, stoploss_gap):
threshold = self.hwm - stoploss_gap
time_to_close_trade = self.current_price < threshold

return time_to_close_trade

def check_if_short_stop_hit(self, stoploss_gap):
threshold = self.hwm + stoploss_gap
time_to_close_trade = self.current_price > threshold

return time_to_close_trade

I will delve a bit more into the calculations for stop loss gap and vol adjustment below, since these depend on what flavour of system we are running.

Dynamic vol control

So what is 'dynamic vol control'? Basically it's adjusting open positions as vol changes. 

(I'm assuming that we always set our initial position according to the vol when the trade is opened. I explore the consequences of not doing that here.)

World has got riskier? Then your position should be smaller. Things chilled out? Bigger position is called for.

Note, and this is really important, if you're going to adjust your vol you must also adjust your stop loss. Since I set stop loss initially at 0.5xannual standard deviation, if vol doubles then the stop loss gap will double (get wider), if it halves then the gap will also half (get tighter). 

Why is this so important? Well, suppose you halve your position, but don't widen your stop loss gap. Your risk on the trade is going to be too large; and you'll end up getting stopped out prematurely. Basically the stop loss and the postion size need to stay in synch, as I discussed in the series of posts that begins here.

Here's the relevant code from our uber-class, simpleSystemPosition:

def vol_adjusted_position(self, current_vol):
initial_position = self.initial_position
if self.dynamic_vol:
vol_adjusted_position = (self.initial_vol / current_vol) * initial_position
return vol_adjusted_position
return initial_position
def stoploss_gap(self, current_vol):
xfactor = self.Xfactor

if self.dynamic_vol:
vol = current_vol
vol = self.initial_vol

stoploss_gap = vol * xfactor

return stoploss_gap

Xfactor will be 8 here, because we use a 0.5x annual standard deviation stop loss to close positions and vol here is daily, so the multiple becomes 16 (square root of 256~approx # of trading days per year) multiplied by 0.5 = 8

(Of course we'll allow Xfactor to vary when we use dynamic stop losses)

Dynamically adjusting stop loss for p&l

Now what about the dynamic stop loss? Remember we want to have a smaller stop when we enter a trade, but if we make profits we need to increase the stop. I thought about this for five minutes and came up with the following:

The y-axis is the X-factor to use. The x-axis is a measure of vol normalised profitability: it's the profit or loss in price points divided by the initial vol when the trade was put on. As you can see from the excellent drawing, we use an X-factor of 2 when the trade is put on, or if we're losing money. If we make profits, the X-factor gradually increases to 8 (which is the default fixed setting) and goes no higher.

I haven't fitted this, all I did was plot the statistic for vol-normalised trade profit for one instrument (Eurodollar) to get a feel for what sort of values it has. And since I already had an 8 parameter, I thought I might as well have another one.

Here's simpleSystemPosition again:
def Xfactor(self):
if self.dynamic_SL:
return self.dynamic_xfactor()
return fixed_xfactor()

def dynamic_xfactor(self):
pandl_vol_units = self.vol_adjusted_profit_since_trade_points()
return dynamic_xfactor(pandl_vol_units)

# outside the class
def fixed_xfactor():
return 8.0

def dynamic_xfactor(pandl_vol_units):
if pandl_vol_units<=PANDL_LOWER_CUTOFF:
elif pandl_vol_units>PANDL_UPPER_CUTOFF:
Whilst this may not resemble what any real person actually does in their trading system, it does at least do what dynamic stop losses are supposed to do: let positions 'breath' once they're in profit.

PS: A stateless way of letting positions breath

Something that occured to me, but I didn't test, is that you can implement a dynamic stop without having to calculate previous p&l. For example you could measure the length or strength of a trend, thus creating something that was consistent with my conviction that 'the market doesn't know what my profit is, or when I put a trade on'. 

If you put a gun to my head and said I had to do a dynamic stop loss, then this is how I'd do it.

Evaluating the results

OK, so we're going to measure the Sharpe Ratio, and the skew of our p&l series. We're trying to see if going dynamic makes more money, or changes the skew, or both. Then depending on our preferences we can decide what suits us.

There are two ways of evaluating your p&l: by trade, or by time period. Retail investors usually use 'per trade', professionals use time periods 'per day', 'per month' or 'per year'. 

NB I nearly always use time period P&L, because I'm a bleedin' pro mate.

Does it matter? 

YES. In fact the skew of trades and returns is substantially different for trend following systems, as I shall now illustrate with Eurodollar and the base system (the exact system and instrument is irrelevant for the moment as I'm not comparing results; you'd see similar effects):

instrument_code = "EDOLLAR"
pandl_returns_capital = pandl_capital(instrument_code, system, method_used="returns")
pandl_trades_capital = pandl_capital(instrument_code, system, method_used="trades")

What I'm showing you here are the cumulated returns from returns (orange line) and trades (blue line), zoomed in. Obviously they line up at the point each trade is closed, but the blue line is more jagged since it only cumulates at the point when a trade is closed.

Now let's look at the distributions; first daily returns:
(The plot has some extremes removed for both tails). 
As I discussed at some length in this post from last year, the skew of daily returns p&l will often be slightly negative except perhaps for very fast systems. Here the skew is -0.24. Now what about trades?

Wow! That is some seriously positive skew: 2.84 (I've cutoff the plot at the right tail). That is what we'd expect, because we're trend followers. 

Clearly we need to consider both kinds of p&l here when considering skew, as they will probably give substantially different answers. And for return p&l skew we need to think about different time periods, as again they will give different results. Sharpe ratio we'll just measure on daily return p&l; Sharpe Ratio for trades doesn't make a lot of sense.

So our measures will be:
  • Sharpe Ratio (based on daily returns and annualised)
  • Skew (based on daily returns)
  • Skew (weekly returns)
  • Skew (monthly returns)
  • Skew (using trades)
For all these figures I'm going to use gross returns, with the statistics averaged across instruments, giving all equal weights. So don't be surprised to see quite low Sharpe Ratios, these are instrument SR not portfolio SR.

(The reason for not taking costs into account is that I'm not using the main pysystemtrade accounting functions here; they can't do trade by trade p&l, and I want to refactor them before I add that functionality. This will make dynamic vol control - which leads to more position adjustment trades - look a little better than it is, although in practice buffering will reduce the actual amount of extra trading by a considerable amount)


Neithier 0.19
Dynamic vol 0.25
Dynamic SL 0.05
Both         0.07

So beginning with Sharpe Ratio we can see that dynamic vol adds value to the basic starter system: this confirms the result in my book 'Leveraged Trading'. However the dynamic stop loss is a big loser. Adding dynamic vol control in as well improves it slightly, but clearly a lot of positions are being closed before the dynamic vol does anything interesting or useful.

        Skew trades
Neithier 1.95
Dynamic vol 1.87
Dynamic SL 3.07
Both         4.00

Now looking at the skew for trade p&l, we can see that dynamic vol does indeed reduce the positive skew. Clearly it's cutting the position on too many big winners just as things are getting interesting. But the reduction isn't massive. In contrast dynamic stop loss massively increases the skew, as you might expect; from all those positions that start to lose being cut quickly. Interestingly the highest skew of all comes from doing both; there must be some weird interaction going on here.

        Daily return skew
Neithier 0
Dynamic vol    -0.11
Dynamic SL 0.4
Both        -0.02

Now let's consider the skew of return p&l, first for daily returns. There's basically no skew for the simplest system. Adding dynamic vol reduces the skew a little, as for trades. For dynamic stop loss the skew is boosted, as we'd probably expect. However the increase in skew is nowhere near as dramatic as it is for trades. Finally doing both results in the two dynamic effects pretty much offsetting each other.

        Weekly return skew
Neithier 0.34
Dynamic vol -0.02
Dynamic SL 0.35
Both         -0.01

        Monthly return skew
Neithier 0.73
Dynamic vol 0.08
Dynamic SL 0.77
Both         0.05

Considering weekly and monthly returns, we can see that a similar pattern emerges. The basic system has positive skew which increases as we go slower (as discussed in this post). Adding dynamic vol control reduces the skew, by a fair bit. Adding a dynamic stop loss increases it, although only by a very small amount. Doing both results in something pretty similar to dynamic vol control.


This post has confirmed my intuition:

  • Adding dynamic vol control reduces positive skew, but adds Sharpe. The effect is starkest for trade by trade p&l, and for monthly returns.
  • Adding dynamic stop losses increases positive skew, but drastically reduces SR. The skew bonus is very high for trade p&l, but relatively modest for weekly and monthly returns.

Now to be fair, I don't know exactly what other trend followers do in their trading system, since unlike me many of them don't have the freedom to open source it*. So I don't know if these tests accurately reflect what 'purer' trend followers are doing, especially when it comes to dynamic stop loss (dynamic vol control is less contentious since there is only really one way of doing it: you could change the measurement of vol, frequency of changes and the use of buffering; none of which will affect the results very much). 

* 'Yes I know I'm charging you 2 and 20 for my black box, and I know I put the entire thing on github, but information want's to be free dammn it!'

It's most likely that they are running a milder version of what I've used, something that doesn't affect the SR anywhere near as much as the dynamic stop loss I outlined here. However, this will also lead to a smaller skew bonus: the no free lunch hypothesis is confirmed again, you can't buy more skew without selling SR. You can probably test this by changing the parameters of the dynamic skew function.

But the point of this post isn't to say 'this is the perfect way'. It's to show you the possible trade offs. What you will do will depend on your preferences for SR and skew. You could whip out Occams' razor and say you should run the simplest system, which has very good skew properties at weekly and monthly returns. Or you could take the view (like moi) that the extra SR of dynamic vol control is worth the complexity and reduction in skew. Or you could go hell bent for skew, and do dynamic stop losses (but not dynamic vol). 

The only thing that doesn't make sense is doing both! That's ideologically inconsistent, over complicated, and also pretty crap.

Frankly, it's up to you. To me the most interesting thing for me about this exercise has been the contrast between the skew of trade p&l, and return p&l. You can be doing something that massively bumps up your positive skew on trade p&l (like very aggressive dynamic stop loss adjustment), and think you are being a 'purer' trend follower, and then when you look at your monthly return you see there is no meaningful skew effect :-(

Tuesday 3 November 2020

Improving the use of correlations in portfolio optimisation when handcrafting

Remember the handcrafting method, which I described in this series of posts? 

Motivating portfolio construction
Adjusting portfolio weights for Sharpe Ratios

All very nice, all very theoretically grounded, except for one thing: the 'candidate matrices'. Remember, what we do is group our portfolio into subportfolios of 2 or 3 assets, and find some initial risk weights (following which there is some funny business involved IDMs and SR adjustments). 

For two assets the risk weights are trivial (50% each), and then for three assets we have the problem of coming up with some portfolio weights that account for different correlations, and yet are robust to the uncertainty in correlation estimates.

Note: I account for uncertainty in Sharpe Ratios. I ignore uncertainty in volatility estimates, as that is relatively small and has minimal effect on portfolio weights.

I do this by coming up with 'candidate' matrices. Basically a series of correlation matricies that cover the possible combinations of three assets. Each of these then has a matching set of portfolio weights. We find the closest correlation matrix and voila, we have the right weights! (possibly with some extrapolation if no matrix is perfect).

Now this has always been a slightly unsatisfactory part of the method, which reflects it's roots as a heuristic method used by humans requiring minimal computation. As humans, it is nice to glance at correlation matrices and say 'ah this looks like a classic #3 matrix. I have some nice robust weights for this in a draw somewhere'. But as I've made the transition towards fully systematizing and automating this method, we can do better.

To be precise, we can do what we need with the SR adjustments. In the original post, with the method taken from my first book, this was a somewhat vague method that took no precise account for how much uncertainty there is in SR estimates. In my subsequent post on the subject, I did exactly that: computed the expected distribution of parameter estimates, and took the average portfolio weight over various points

So in this post, I'm going to take a similar approach. To whit, I will replace the 'candidate matching' stage in handcrating with the following:

  • Get central estimates for correlations in a given asset portfolio
  • Get the amount of time for which those correlations are estimated over
  • Calculate the parameter distribution for the correlations
  • Optimise the portfolio weights over different points on the correlation distribution
  • Take an average of those portfolio weights

I'd recommend reading this post even if you're not a fan of the handcrafting method. It will give you some interesting insight into the uncertainty of correlation estimates, their effect on portfolio weights, and (bonus feature!) the optimal amount of time to estimate correlations over.


Let's start by setting ourselves up with some boiler plate optimisation code that will give us some weights for a three asset portfolio, based purely on the correlations (I'll assume SR=0.5 and some arbitrary standard deviation for all three assets).

from scipy.optimize import minimize
import numpy as np

def optimise_for_corr_matrix(corr_matrix):
mean_list = [.05]*3
std = [.1]*3

return optimise_inner(mean_list, corr_matrix, std)

def optimise_inner(mean_list, corr_matrix, std):
stdev_list = [std]*len(mean_list)
sigma = sigma_from_corr_and_std(stdev_list, corr_matrix)

return optimise_with_sigma(sigma, mean_list)

def optimise_with_sigma(sigma, mean_list):

mus = np.array(mean_list, ndmin=2).transpose()
number_assets = sigma.shape[1]
start_weights = [1.0 / number_assets] * number_assets

# Constraints - positive weights, adding to 1.0
bounds = [(0.0, 1.0)] * number_assets
cdict = [{'type': 'eq', 'fun': addem}]
ans = minimize(
start_weights, (sigma, mus),
weights = ans['x']

return weights

def neg_SR(weights, sigma, mus):
# Returns minus the Sharpe Ratio (as we're minimising)
weights = np.matrix(weights)
estreturn = (weights * mus)[0, 0]
std_dev = (variance(weights, sigma)**.5)

return -estreturn / std_dev

def variance(weights, sigma):
# returns the variance (NOT standard deviation) given weights and sigma
return (weights * sigma * weights.transpose())[0, 0]

def sigma_from_corr_and_std(stdev_list, corrmatrix):
stdev = np.array(stdev_list, ndmin=2).transpose()
sigma = stdev * corrmatrix * stdev

return sigma

def addem(weights):
# Used for constraints
return 1.0 - sum(weights)

labelledCorrelations = namedtuple("labelledCorrelationList", 'ab ac bc')

def three_asset_corr_matrix(labelled_correlations):
:return: np.array 2 dimensions, size
ab = labelled_correlations.ab
ac =
bc = labelled_correlations.bc

m = [[1.0, ab, ac], [ab, 1.0, bc], [ac, bc, 1.0]]
m = np.array(m)
return m

Let's check a few stylised examples:

>>> optimise_for_corr_matrix(three_asset_corr_matrix(labelledCorrelations(ab=0,ac=0,bc=0)))
array([0.33333333, 0.33333333, 0.33333333])
>>> optimise_for_corr_matrix(three_asset_corr_matrix(
array([0.25627057, 0.25627056, 0.48745887])
>>> optimise_for_corr_matrix(three_asset_corr_matrix(
array([0.        , 0.49999999, 0.50000001])
>>> optimise_for_corr_matrix(three_asset_corr_matrix(
array([0.33333333, 0.33333333, 0.33333333])

That all looks pretty sensible.

Correlation uncertainty

Now, what is the correlation uncertainty given some sample size and the correlation value? I could paste in some LaTex, but instead let me point you towards the wiki page, and show you some code that implements it:

def get_correlation_distribution_point(corr_value, data_points, conf_interval):
fisher_corr = fisher_transform(corr_value)
point_in_fisher_units = \
get_fisher_confidence_point(fisher_corr, data_points, conf_interval)

point_in_natural_units = inverse_fisher(point_in_fisher_units)

return point_in_natural_units

def fisher_transform(corr_value):
return 0.5*np.log((1+corr_value) / (1-corr_value)) # also arctanh

def get_fisher_confidence_point(fisher_corr, data_points, conf_interval):
if conf_interval<0.5:
confidence_in_fisher_units = fisher_confidence(data_points, conf_interval)
point_in_fisher_units = fisher_corr - confidence_in_fisher_units
elif conf_interval>0.5:
confidence_in_fisher_units = fisher_confidence(data_points, 1-conf_interval)
point_in_fisher_units = fisher_corr + confidence_in_fisher_units
point_in_fisher_units = fisher_corr

return point_in_fisher_units

def fisher_confidence(data_points, conf_interval):
data_point_root =fisher_stdev(data_points)
conf_point = get_confidence_point(conf_interval)

return data_point_root * conf_point

def fisher_stdev(data_points):
data_point_root = 1/((data_points-3)**.5)
return data_point_root

def get_confidence_point(conf_interval):
conf_point = norm.ppf(1-(conf_interval/2))

return conf_point

def inverse_fisher(fisher_corr_value):
return (np.exp(2*fisher_corr_value) - 1) / (np.exp(2*fisher_corr_value) + 1)

Let's have a play with this. First of all, the 0.5 point should give us the central estimate for correlation:

>>> get_correlation_distribution_point(0.9, 100, 0.5)
>>> get_correlation_distribution_point(0.0, 100, 0.5)

We can back out lower and upper estimates:

>>> get_correlation_distribution_point(0.0, 100, 0.05)
>>> get_correlation_distribution_point(0.0, 100, 0.95)

These get wider if we use less data:

>>> get_correlation_distribution_point(0.0, 10, 0.05)
>>> get_correlation_distribution_point(0.0, 10, 0.95)

The distribution gets narrower and slightly assymetric at extremes:

>>> get_correlation_distribution_point(0.99, 100, 0.05)
>>> get_correlation_distribution_point(0.99, 100, 0.95)

And so on.

Three asset portfolio weights under conditions of portfolio uncertainty

For a three asset portfolio we can now back out the optimised portfolio weights given the relevant points on the correlation distributions:

def optimise_for_corr_matrix_with_uncertainty(corr_matrix, conf_intervals, data_points):
labelled_correlations = extract_asset_pairwise_correlations_from_matrix(corr_matrix)
labelled_correlation_points = calculate_correlation_points_from_tuples(labelled_correlations, conf_intervals, data_points)
corr_matrix = three_asset_corr_matrix(labelled_correlation_points)
weights = optimise_for_corr_matrix(corr_matrix)

return weights

def extract_asset_pairwise_correlations_from_matrix(corr_matrix):
ab = corr_matrix[0][1]
ac = corr_matrix[0][2]
bc = corr_matrix[1][2]

return labelledCorrelations(ab=ab, ac=ac, bc=bc)

def calculate_correlation_points_from_tuples(labelled_correlations, conf_intervals, data_points):
correlation_point_list = [get_correlation_distribution_point(corr_value, data_points, confidence_interval)
for corr_value, confidence_interval in
zip(labelled_correlations, conf_intervals)]
labelled_correlation_points = labelledCorrelations(*correlation_point_list)

return labelled_correlation_points

Let's try this out:

>>> corr_matrix = three_asset_corr_matrix(labelledCorrelations(0,0,0))
>>> conf_intervals = labelledCorrelations(.5,.5,.5)
>>> optimise_for_corr_matrix_with_uncertainty(corr_matrix, conf_intervals, 100)
array([0.33333333, 0.33333333, 0.33333333])
# Makes sense, taking the median point off the distribution just recovers equal weights
>>> conf_intervals = labelledCorrelations(.5,.5,.95)
>>> optimise_for_corr_matrix_with_uncertainty(corr_matrix, conf_intervals, 100)
array([0.37431944, 0.31284028, 0.31284028])
# with a higher confidence point for correlation BC we put more in asset A
>>> conf_intervals = labelledCorrelations(.5,.5,.05)
>>> optimise_for_corr_matrix_with_uncertainty(corr_matrix, conf_intervals, 100)
array([0.286266, 0.356867, 0.356867])
# with a lower correlation for BC we put less in asset A
>>> optimise_for_corr_matrix_with_uncertainty(corr_matrix, conf_intervals, 10)
array([0.15588428, 0.42205786, 0.42205786])
# With less data the confidence intervals get wider

Seems to be working. Now I need to use the same technique as in the previous post on SR adjustments, which is get the portfolio weights over a number of points on the distribution, and then take an average. 

def optimised_weights_given_correlation_uncertainty(corr_matrix, data_points, p_step=0.1):
dist_points = np.arange(p_step, stop=(1-p_step)+0.000001, step=p_step)
list_of_weights = []
for conf1 in dist_points:
for conf2 in dist_points:
for conf3 in dist_points:
conf_intervals = labelledCorrelations(conf1, conf2, conf3)
weights = optimise_for_corr_matrix_with_uncertainty(corr_matrix, conf_intervals, data_points)

array_of_weights = np.array(list_of_weights)
average_weights = np.nanmean(array_of_weights, axis=0)

return average_weights

Slightly niggly thing here is that there are three inputs, so I need to iterate over 3 sets of distributions. Hence I'm using a 'p-step' of 0.1 (meaning we have to do 9^3 optimisations); using smaller values doesn't change the results (whilst being exponentially slower!), larger values are a bit too coarse.

Recalculating the candidate matrices

Now I'm going to recalculate the candidate matrices used in the original handcrafting. This involves doing this kind of thing:

>>> corr_matrix = three_asset_corr_matrix(labelledCorrelations(0.,0.,0.))
# get raw optimised weights, no uncertainty
>>> optimise_for_corr_matrix(corr_matrix)
array([0.33333333, 0.33333333, 0.33333333])
# get optimised weights with uncertainty, using data_periods = 5
>>> optimised_weights_given_correlation_uncertainty(corr_matrix, 5)
array([0.33333332, 0.33333346, 0.33333323])

Here are the results. First the raw weights with no uncertainty (first 3 columns are pairwise correlations, next 3 are weights):

Corr AB Corr AC Corr BC Raw A Raw B Raw C

0 0.5 0 0.286 0.428 0.286

0 0.9 0 0.256 0.487 0.256

0.5 0 0.5 0.5 0 0.5

0 0.5 0.9 0.5 0.5 0

0.9 0 0.9 0.5 0 0.5

0.5 0.9 0.5 0.263 0.473 0.263

0.9 0.5 0.9 0.5 0 0.5

0 0 0 0.33 0.33 0.33

Now for the original handcrafted weights:

Corr AB Corr AC Corr BC H/C A H/C B H/C C

0 0.5 0 0.3 0.4 0.3

0 0.9 0 0.27 0.46 0.27

0.5 0 0.5 0.37 0.26 0.27

0 0.5 0.9 0.45 0.45 0.1

0.9 0 0.9 0.39 0.22 0.39

0.5 0.9 0.5 0.29 0.42 0.29

0.9 0.5 0.9 0.42 0.16 0.42

0 0 0 0.33 0.33 0.33

Now let's do the new stuff, with 5 data points (anything less than 4 will break, for obvious reasons):

Corr AB Corr AC Corr BC 5dp A 5dp B 5dp C

0 0.5 0 0.298 0.404 0.298

0 0.9 0 0.264 0.472 0.264

0.5 0 0.5 0.379 0.241 0.379

0 0.5 0.9 0.462 0.349 0.189

0.9 0 0.9 0.44 0.12 0.44

0.5 0.9 0.5 0.279 0.441 0.279

0.9 0.5 0.9 0.411 0.178 0.411

0 0 0 0.33 0.33 0.33

Some observations:

  • For 'nice' matrices (like equal correlations or the first row: 0,.5,0) the results of both the original handcrafting and the new method are very close to the raw optimisation
  • In most cases the original handcrafting and the new method give very similar results. The main exceptions are the fourth row and the third row. 

Now let's see how things change with more data, say data_points=20:

Corr AB Corr AC Corr BC 20dp A 20dp B 20dp C

0 0.5 0 0.278 0.444 0.278

0 0.9 0 0.252 0.496 0.252

0.5 0 0.5 0.437 0.125 0.437

0 0.5 0.9 0.497 0.443 0.06

0.9 0 0.9 0.5 0 0.5

0.5 0.9 0.5 0.256 0.489 0.256

0.9 0.5 0.9 0.4955 0.009 0.4955

0 0 0 0.33 0.33 0.33

For the 'nicer' correlation matrices, not very much happens as we have more data. However for the more extreme versions, like rows 3 and 4, we get closer to the raw optimised weights (since we can be more confident that the central correlation estimates are correct). In fact for row 5 we end up with the raw weights, and the penultimate row isn't far off. And finally data_points=100:

Corr AB Corr AC Corr BC 100dp A 100dp B 100dp C

0 0.5 0 0.283 0.434 0.283

0 0.9 0 0.252 0.496 0.252

0.5 0 0.5 0.471 0.059 0.471

0 0.5 0.9 0.5 0.5 0

0.9 0 0.9 0.5 0 0.5

0.5 0.9 0.5 0.256 0.489 0.256

0.9 0.5 0.9 0.5 0 0.5

0 0 0 0.33 0.33 0.33

We're pretty much at the same point as the raw weights here, although row 3 hasn't quite got there.

How many data points should we use?

Now, I don't know about you, but I'm quite happy with the weights produced by the datapoints=5, and less so with the others. This presents us with a problem, since I use weekly data to estimate correlations, which means we'd usually have anything from 52 to potentially 2000+ data points. It doesn't seem logical that having a years data would make you certain enough about correlations to completely drop an asset from your portfolio.

OK, let's take a detour for a moment. What are we trying to do here? We're not really interested in the uncertainty of past correlations. We're interested in the uncertainty of future correlations, given the data we have about the past. The sampling distribution of the past is just a proxy for this. In particular, it assumes that the latent correlation structure is static (as if!), and that joint Gaussian normal is a good model for financial data (if only!). 

Now for Sharpe Ratios, where the sampling distribution is massive, and we're appalling at predicting the future, this is a pretty good approximation. Not so for correlations. Consider the following:
This is an estimate of the correlation between S&P 500 and US 10 year bond futures, estimated using all historic data on an expanding window basis. The red and the blue lines show the upper and lower 95% confidence intervals for the correlation at the start of a given year. The green line shows the actual correlation.

Now if the confidence intervals were any good, we'd expect the green line to be outside of the red and blues about 1 in 20 years. The actual figure is 15 out of 18 years. That's not great. What's not helping is there is a secular downtrend in the correlation, but even without that things are poor. Even without that, the true 95% range for correlations is probably around 0.5 rather than the 0.14 distance between the red and the green lines at the end.

Here's a similar picture for the highly correlated US 5 year and 10 year bond futures:

Although the confidence intervals are tighter (since the correlation is closer to 1) the record is just as bad: 14 misses in 18 years. No secular trend here, but the typical pattern of highly correlated assets: correlations break down in a crisis (1999, 2008). Again the true 95% range is probably 0.04 rather than the 0.01 show here.

What all this means is that using the parameter distribution error as a proxy for future uncertainy of correlations will probably underestimate the likely uncertainty by roughly a factor of 4.

There are a number of solutions to this. Most are hacky, for example, imposing some kind of minimum weight. Or dividing the time_periods by some arbitrary number. We could do something more sophisticated, like actually trying to estimate (on a rolling basis!) the true estimation error (outturn versus forecast distribution).

I'm going to do something that's fairly simple, and only a little bit hacky. I'm literally going to multiply the uncertainty by 4:

def fisher_confidence(data_points, conf_interval):
data_point_root =fisher_stdev(data_points)*4
conf_point = get_confidence_point(conf_interval)

return data_point_root * conf_point
Note the precise value will depend on the level of correlation, since we should strictly use the fisher equivalent of 4, but this seems to work well enough.

Let's see if this works. Here's the original 95% confidence intervals for correlations of -0.25 and 0.96 (roughly what we have in the plots above), with 1000 data points (about 20 years of weekly data):

>>> get_correlation_distribution_point(-0.25, 1000, 0.025)
>>> get_correlation_distribution_point(-0.25, 1000, 0.975)
>>> get_correlation_distribution_point(.96, 1000, 0.025)
>>> get_correlation_distribution_point(.96, 1000, 0.975)

Now with the new fudge factor applied:

>>> get_correlation_distribution_point(-0.25, 1000, 0.025)
>>> get_correlation_distribution_point(-0.25, 1000, 0.975)
>>> 0.028523195562247212
>>> get_correlation_distribution_point(.96, 1000, 0.025)
>>> get_correlation_distribution_point(.96, 1000, 0.975)
These are much closer to the likely forecasting errors we'd see in reality. Let's look at the candidate correlations using the revised method, with 100 data points (so for comparision with the final table above), or about 2 years of weekly data:

Corr AB Corr AC Corr BC 100dp A 100dp B 100dp C
0 0.5 0 0.285 0.43 0.285
0 0.9 0 0.253 0.494 0.253
0.5 0 0.5 0.404 0.192 0.404
0 0.5 0.9 0.495 0.386 0.118
0.9 0 0.9 0.498 0.003 0.498
0.5 0.9 0.5 0.261 0.4777 0.261
0.9 0.5 0.9 0.451 0.097 0.451
0 0 0 0.333 0.333 0.333

In most cases we've got weights that are pretty robust looking, and much closer to the original handcrafted versions than the weights we had before. Now with 1000 data points (about 20 years):

Corr AB Corr AC Corr BC 1000dp A 1000dp B 1000dp C
0 0.5 0 0.282 0.437 0.282
0 0.9 0 0.253 0.494 0.253
0.5 0 0.5 0.464 0.071 0.464
0 0.5 0.9 0.5 0.5 0
0.9 0 0.9 0.5 0 0.5
0.5 0.9 0.5 0.256 0.489 0.256
0.9 0.5 0.9 0.5 0 0.5
0 0 0 0.333 0.333 0.333

OK, so there are more zeros here than before. It looks like with 20 years there is enough evidence to produce relatively extreme weights for certain kinds of portfolio. 
Much as it pains me, I think a further hack is required here. I'm just uncomfortable with allocating 0 to anything (although arguably if the entire portfolio is large enough, the odd zero won't matter too much). Even if the weight is small, there is the opportunity for the IDM or SR scaling to increase it.

Let's apply a minimum weight of 9%, which in a three asset portfolio is pretty underweight (so for example in the 40 or so futures contracts I trade, if all assets are in groups of 3 - which they're not - the minimum instrument weight would be 0.7%).

def apply_min_weight(average_weights):
weights_with_min = [min_weight(weight) for weight in average_weights]
adj_weights = weights_sum_to_total(weights_with_min)

return adj_weights

def min_weight(weight):
if weight<0.1:
return 0.1
return weight

def weights_sum_to_total(weights_with_min):
sum_weights = np.sum(weights_with_min)
adj_weights = weights_with_min / sum_weights
return adj_weights

Ignoring the horrible use of constants rather than variables, why is the 0.1 there not 0.09? Well if you try using this with weights of [0.5,0.5,0] you will get: 

>>> apply_min_weight(average_weights)
array([0.45454545, 0.09090909, 0.45454545])
... because of the effect of making the weights add up to 1.

It isn't pretty is it? These fixed numbers, 4 and 0.1, smack of being slightly arbitrary, and the 0.1 is just a personal preference with no scientific grounding whatsoever. Still you have the code. Feel free to do this differently. I won't judge you. Much.

As a final note this new technique has a huge advantage over the old candidate matching method. We can use it for any correlation matrix. That means we don't have to do the candidate matching and extrapolation. We just feed in the correlation matrix directly and out will pop the appropriate weights.

Note: in theory you could use this technique for any size of correlation matrix, but the time involved would increase exponentially.

In particular, it will generate much more sensible weights than raw optimisation when we have negative correlations (something that I completely ignored when generating the candiate matrices).

Should we use all of our data to estimate correlations?

There's a common tradeoff in a lot of the work I do: should we use more data and get a robust estimate, or less data and be more responsive? For example, should we use shorter or longer moving average crossovers? (Answer: probably both). 

In the space of estimating the parameters of a Gaussian normal distribution: mean, standard deviation and correlation, the answers are different. For means (or Sharpe Ratios), I am a fan of using as much data as possible (although there is some evidence that faster trend following no longer works as well as it used to, especially for equity indices). For volatility, as a rule of thumb something like a 30 day estimate of volatility works much better than using more data.

Sharpe Ratios are hard to predict so we use a lot of data. Vol is relatively easy to predict so we can use less data and update our estimates frequently. Logically correlations, which fall somewhere between those two stools in terms of predictability, could be candidates for more updating. But, in my current handcrafting code, I use all the available data to estimate correlation matrices.

Under the new method the length of data used will have a big effect. A shorter window will result in less extreme weights. Hey, we might even be able to avoid using the horrible hack of minimum weight constraints. However we shouldn't arbitrarily reduce the correlation lookback purely to get a more robust outcome, unless the evidence justifies it.

So let's get some understanding of the optimal length of data history to use for predicting correlations. I'll consider three portfolios. Each contains two highly correlated assets, and another that probably isn't so correlated:

  • Underlying instrument returns for different instruments
  • Trading subsystem returns for different instruments
  • Returns of trading rules
from systems.provided.futures_chapter15.basesystem import futures_system

system = futures_system()

## underlying returns
instrument_list = system.get_instrument_list()
perc_returns = [system.rawdata.get_percentage_returns(instrument_code) for instrument_code in instrument_list]
perc_returns_df = pd.concat(perc_returns, axis=1)
perc_returns_df.columns = instrument_list

## subsystem returns
subsys_returns = [system.accounts.pandl_for_subsystem(instrument_code) for instrument_code in instrument_list]
subsys_returns_df = pd.concat(subsys_returns, axis=1)
subsys_returns_df.columns = instrument_list

## trading rule returns
def get_rule_returns(instrument_code):
rules = list(system.rules.trading_rules().keys())
rule_returns = [system.accounts.pandl_for_instrument_forecast(instrument_code, rule) for rule in rules]
rule_returns_df = pd.concat(rule_returns, axis=1)
rule_returns_df.columns = rules

return rule_returns_df
use_returns = perc_returns_df # change as required
use_returns = use_returns[pd.datetime(1998,1,1):] # common timestamp for fair comparison
use_returns = use_returns.resample("5B").sum() # good enough approx for weekly returns

Before we dive into the code, we need to answer a question: Over what period are we trying to predict correlations? It makes sense to forecast volatility over the next 30 days, since that is roughly our average holding period and we adjust our positions according to vol on a day to day basis. In a long only portfolio quarterly rebalancing is common, so for the underlying instrument perc_returns we'll try and predict the next 3 months (13 weeks) correlations.

We could do monthly rebalancing, but predicting 5 weeks of weekly correlation estimates is probably a mugs game, and most long only portfolios don't have turnover that is as high as all that. 

But correlations are used to define instrument and trading rule weights which we keep for a while - in backtesting they are refitted annually. So let's use try and predict next years correlations for trading rule and subsystem returns.

Here's the functions we need:

def get_forecast_and_future_corr(Nweeks_back, Nweeks_forward):
forecast = get_historic_correlations(Nweeks_back)
future = get_future_correlations(Nweeks_forward)

pd_result = merge_forecast_and_future(forecast, future, Nweeks_forward)

return pd_result

def merge_forecast_and_future(forecast, future, Nweeks_forward):
assets = forecast.columns # should be the same won't check
pd_result = []
for asset in assets:
result_for_asset = pd.concat([forecast[asset], future[asset]], axis=1)
# remove tail with nothing
result_for_asset = result_for_asset[:-Nweeks_forward]

# remove overlapping periods which bias R^2
selector = range(0, len(result_for_asset.index), Nweeks_forward)
result_for_asset = result_for_asset.iloc[selector]

result_for_asset.columns = ['forecast', 'turnout']

pd_result = pd.concat(pd_result, axis=0)

return pd_result

def get_future_correlations(Nweeks_forward):
corr = get_rolling_correlations(use_returns, Nweeks_forward)
corr = corr.ffill()
future_corr = corr.shift(-Nweeks_forward)

return future_corr

def get_historic_correlations(Nweeks_back):
corr = get_rolling_correlations(use_returns, Nweeks_back)
corr = corr.ffill()

return corr

def get_rolling_correlations(use_returns, Nperiods):
roll_df = use_returns.rolling(Nperiods, min_periods=4).corr()
perm_names = get_asset_perm_names(use_returns)
roll_list = [get_rolling_corr_for_perm_pair(perm_pair, roll_df) for perm_pair in perm_names]
roll_list_df = pd.concat(roll_list, axis=1)
roll_list_df.columns = ["%s/%s" % (asset1, asset2) for (asset1, asset2) in perm_names]

return roll_list_df

def get_asset_perm_names(use_returns):
asset_names = use_returns.columns
permlist = []
for asset1 in asset_names:
for asset2 in asset_names:
if asset1==asset2:
pairing = [asset1, asset2]
if pairing in permlist:
if pairing in permlist:


return permlist

def get_rolling_corr_for_perm_pair(perm_pair, roll_df):
return roll_df[perm_pair[0]][:,perm_pair[1]]

Just for fun, here are some one year rolling correlations for the three types of portfolios. I've used the returns for US10, US5 and SP500. For trading rules I've used the returns of EWMAC16, EWMAC32 and carry (for S&P 500 as it happens, but the results will be pretty similar on any instrument).

get_rolling_correlations(use_returns, 52).plot()

First underlying returns:

Now trading subsystems:

Finally trading rules:

I leave the interpretation of these plots as an exercise for the reader. But the main point is that correlations do move around and there are some clustering states; whether they are predictable by using more frequent estimates is a good question.

Now let's return to the question of how much data we should use when predicting correlations:

Nweeks_forward = 52 # use 13 weeks for underlying returns, 52 for others

import statsmodels.api as sm
import matplotlib.pyplot as pyplot
pyplot.rcParams.update({'font.size': 16})

Nweeks_list = [4, 7, 13, 26,52, 104, 208, 520]
r_squared = []
for Nweeks_back in Nweeks_list:
pd_result = get_forecast_and_future_corr(Nweeks_back, Nweeks_forward)

pd_result = pd_result.dropna()

x = pd_result.forecast
x = sm.add_constant(x)
y = pd_result.turnout
est = sm.OLS(y, x).fit()
r2 = est.rsquared_adj

ax = pyplot.gca()
ax.scatter(Nweeks_list, r_squared)

Here are the results for underlying assets (perc_returns_df, Nweeks_forward = 13):

To explain: the x-axis is the number of weeks we're looking back (log scale), whilst doing a regression on how correlation ends up being in the next 13 weeks, on what correlation was in the lookback period. The y-axis is the resulting R squared of the regression. Higher R squared is good! And the results are interesting: it looks like around 2 years (100 weeks) is the optimum length of time to predict 13 weeks correlations, although anything from 1 year upwards is fine.

And here's subsystem returns (subsys_returns_df, Nweeks_forward = 52):

Again, it looks like we should use at least a couple of years of data, although there is a benefit from using even more (which makes sense since we're trying to forecast a year ahead rather than just 3 months). And finally here are the trading rules (rule_returns_df, Nweeks_forward = 52), for which we need to tweak our code to use the average R squared across instruments:

Nweeks_forward = 52 # use 13 weeks for underyling returns, 52 for others

r_squared = []
for Nweeks_back in Nweeks_list:
all_r2 = []
for instrument_code in instrument_list:
print("%s/%s" % (instrument_code, Nweeks_back))
use_returns = get_rule_returns(instrument_code)
use_returns = use_returns[pd.datetime(1998, 1, 1):] # common timestamp
use_returns = use_returns.resample("5B").sum() # good enough approx for weekly returns

pd_result = get_forecast_and_future_corr(Nweeks_back, Nweeks_forward)

pd_result = pd_result.replace([np.inf, -np.inf], np.nan)
pd_result = pd_result.dropna()

x = pd_result.forecast
x = sm.add_constant(x)
y = pd_result.turnout
est = sm.OLS(y, x).fit()
r2_this_instr = est.rsquared_adj

r2 = np.mean(all_r2)

Again we need to use at least 5 years, and perhaps 10 or more years of data.

Let's recap our results. We're primarily interested here in setting instrument and rule weights, so we'll put the underlying instrument results to one side (although they're not that different - the sweet spot is 2 years):

  • A lookback of less than 2 years is likely to do a relatively poor job of forecasting future correlations over the next year
  • Lookbacks of more than 10 years may do a little better at forecasting, but from above we know they will produce weights that aren't as robust even after applying a fudge factor
  • Shorter lookbacks may also result in assets moving between groups, which will produce changes in weights regardless of how robust our method is.
In conclusion then I'd recommend using 10 years of data, or say 500 data points, in the handcrafting correlation estimates.

Update 9th November 2020:
In the original post I suggested using a 2 year window. The only real reason to use a shorter window was because I was uncomfortable with the zero weights produced by the longer windows. But I already 'solved' that with the minimum weight. In fact, the data is telling me that with 10 years of data you can be darn sure that the correct portfolio does indeed have a zero weight in some cases. It strikes me as better to use the explicit hack, rather than achieving it through the back door and pretending it's the correct thing to do.

Note that where we to consider the uncertainty of Sharpe ratios and correlations jointly,that would justify having at least some weight even when an asset is highly correlated as there would be some outcomes when having a zero weight would be suboptimal. This could be shown by doing a double pass; stepping through SR values and correlation values together, and optimising across them all. Whilst this is a neat idea it would slow things down a lot and we'd lose the intuition of the two step process (here are the weights for this correlation matrix. now lets' adjust them for SR). Something for a future blogpost.

Plugging into pysystemtrade

Let's now plug the new code into pysystemtrade, replacing the original methodology. Incidentally, the python code I've used so far can be found here.

Basically in the handcrafting code we replace get_weights_using_candidate_method(cmatrix) with code we've already developed (although there are a couple of minor tweaks because it reuses existing optimisation functions). I've also increased the p-step to 0.25 since it doesn't affect the results, and results in a 5^3 improvement in speed (we only do 9 optimisations per correlation matrix; at confidence intervals of 0.25, 0.5 and 0.75 per each of the three assets).

The optimisation code is a bit slow, but I need to refactor it first before I try and speed it up.

Does it work? Let's try with a quick and dirty 3 asset portfolio. 

To make the results more interesting I am using a 2 year lookback here, rather than the 10 years suggested above.

First some forecast weights:
from systems.provided.futures_chapter15.estimatedsystem import futures_system

system = futures_system()
system.config.instruments = ['US10', 'US5', 'SP500']
system = futures_system(config = system.config)


These seem kind of sensible. The weight to the middle EWMAC is lower, because it is in a 3 asset group, and it is highly correlated with the other two EWMAC. It looks like the minimum weight is kicking here at times. There is some movement from year to year, possibly due to the shorter lookback on correlations, but nothing too extreme; however it will be more stable once we move to the recommended 10 year lookback.

Now lookee at some instrument weights:

Prety much what you'd expect (bearing in mind the different data lengths).


That's it. I think the handcrafting method is probably about as good as I can get it now. Hope it's been an interesting journey.

Note: I will be updating the earlier posts on handcrafting to reflect this new methodology.