Remember the handcrafting method, which I described in this series of posts?

Motivating portfolio construction

Methodology

Implementing

Testing

Adjusting portfolio weights for Sharpe Ratios

All very nice, all very theoretically grounded, except for one thing: the 'candidate matrices'. Remember, what we do is group our portfolio into subportfolios of 2 or 3 assets, and find some initial risk weights (following which there is some funny business involved IDMs and SR adjustments).

For two assets the risk weights are trivial (50% each), and then for three assets we have the problem of coming up with some portfolio weights that account for different correlations, and yet are robust to the uncertainty in correlation estimates.

*Note: I account for uncertainty in Sharpe Ratios. I ignore uncertainty in volatility estimates, as that is relatively small and has minimal effect on portfolio weights.*

I do this by coming up with 'candidate' matrices. Basically a series of correlation matricies that cover the possible combinations of three assets. Each of these then has a matching set of portfolio weights. We find the closest correlation matrix and voila, we have the right weights! (possibly with some extrapolation if no matrix is perfect).

Now this has always been a slightly unsatisfactory part of the method, which reflects it's roots as a heuristic method used by humans requiring minimal computation. As humans, it is nice to glance at correlation matrices and say 'ah this looks like a classic #3 matrix. I have some nice robust weights for this in a draw somewhere'. But as I've made the transition towards fully systematizing and automating this method, we can do better.

To be precise, we can do what we need with the SR adjustments. In the original post, with the method taken from my first book, this was a somewhat vague method that took no precise account for how much uncertainty there is in SR estimates. In my subsequent post on the subject, I did exactly that: computed the expected distribution of parameter estimates, and took the average portfolio weight over various points

So in this post, I'm going to take a similar approach. To whit, I will replace the 'candidate matching' stage in handcrating with the following:

- Get central estimates for correlations in a given asset portfolio
- Get the amount of time for which those correlations are estimated over
- Calculate the parameter distribution for the correlations
- Optimise the portfolio weights over different points on the correlation distribution
- Take an average of those portfolio weights

I'd recommend reading this post even if you're not a fan of the handcrafting method. It will give you some interesting insight into the uncertainty of correlation estimates, their effect on portfolio weights, and (bonus feature!) the optimal amount of time to estimate correlations over.

Setup

Let's start by setting ourselves up with some boiler plate optimisation code that will give us some weights for a three asset portfolio, based purely on the correlations (I'll assume SR=0.5 and some arbitrary standard deviation for all three assets).

`from scipy.optimize import minimize`

import numpy as np

def optimise_for_corr_matrix(corr_matrix):

mean_list = [.05]*3

std = [.1]*3

return optimise_inner(mean_list, corr_matrix, std)

def optimise_inner(mean_list, corr_matrix, std):

stdev_list = [std]*len(mean_list)

sigma = sigma_from_corr_and_std(stdev_list, corr_matrix)

return optimise_with_sigma(sigma, mean_list)

def optimise_with_sigma(sigma, mean_list):

mus = np.array(mean_list, ndmin=2).transpose()

number_assets = sigma.shape[1]

start_weights = [1.0 / number_assets] * number_assets

# Constraints - positive weights, adding to 1.0

bounds = [(0.0, 1.0)] * number_assets

cdict = [{'type': 'eq', 'fun': addem}]

ans = minimize(

neg_SR,

start_weights, (sigma, mus),

method='SLSQP',

bounds=bounds,

constraints=cdict,

tol=0.00001)

weights = ans['x']

return weights

def neg_SR(weights, sigma, mus):

# Returns minus the Sharpe Ratio (as we're minimising)

weights = np.matrix(weights)

estreturn = (weights * mus)[0, 0]

std_dev = (variance(weights, sigma)**.5)

return -estreturn / std_dev

def variance(weights, sigma):

# returns the variance (NOT standard deviation) given weights and sigma

return (weights * sigma * weights.transpose())[0, 0]

def sigma_from_corr_and_std(stdev_list, corrmatrix):

stdev = np.array(stdev_list, ndmin=2).transpose()

sigma = stdev * corrmatrix * stdev

return sigma

def addem(weights):

# Used for constraints

return 1.0 - sum(weights)

labelledCorrelations = namedtuple("labelledCorrelationList", 'ab ac bc')

def three_asset_corr_matrix(labelled_correlations):

"""

:return: np.array 2 dimensions, size

"""

ab = labelled_correlations.ab

ac = labelled_correlations.ac

bc = labelled_correlations.bc

m = [[1.0, ab, ac], [ab, 1.0, bc], [ac, bc, 1.0]]

m = np.array(m)

return m

Let's check a few stylised examples:

`>>> optimise_for_corr_matrix(three_asset_corr_matrix(labelledCorrelations(ab=0,ac=0,bc=0)))`

`array([0.33333333, 0.33333333, 0.33333333])`

>>> optimise_for_corr_matrix(three_asset_corr_matrix(labelledCorrelations(ab=0.9,ac=0,bc=0)))

`array([0.25627057, 0.25627056, 0.48745887])`

>>> optimise_for_corr_matrix(three_asset_corr_matrix(labelledCorrelations(ab=0.9,ac=0.9,bc=0)))

`array([0. , 0.49999999, 0.50000001])`

>>> optimise_for_corr_matrix(three_asset_corr_matrix(labelledCorrelations(ab=0.9,ac=0.9,bc=0.9)))

`array([0.33333333, 0.33333333, 0.33333333])`

That all looks pretty sensible.

## Correlation uncertainty

Now, what is the correlation uncertainty given some sample size and the correlation value? I could paste in some LaTex, but instead let me point you towards the wiki page, and show you some code that implements it:

def get_correlation_distribution_point(corr_value, data_points, conf_interval):

fisher_corr = fisher_transform(corr_value)

point_in_fisher_units = \

get_fisher_confidence_point(fisher_corr, data_points, conf_interval)

point_in_natural_units = inverse_fisher(point_in_fisher_units)

return point_in_natural_units

def fisher_transform(corr_value):

return 0.5*np.log((1+corr_value) / (1-corr_value)) # also arctanh

def get_fisher_confidence_point(fisher_corr, data_points, conf_interval):

if conf_interval<0.5:

confidence_in_fisher_units = fisher_confidence(data_points, conf_interval)

point_in_fisher_units = fisher_corr - confidence_in_fisher_units

elif conf_interval>0.5:

confidence_in_fisher_units = fisher_confidence(data_points, 1-conf_interval)

point_in_fisher_units = fisher_corr + confidence_in_fisher_units

else:

point_in_fisher_units = fisher_corr

return point_in_fisher_units

def fisher_confidence(data_points, conf_interval):

data_point_root =fisher_stdev(data_points)

conf_point = get_confidence_point(conf_interval)

return data_point_root * conf_point

def fisher_stdev(data_points):

data_point_root = 1/((data_points-3)**.5)

return data_point_root

def get_confidence_point(conf_interval):

conf_point = norm.ppf(1-(conf_interval/2))

return conf_point

def inverse_fisher(fisher_corr_value):

return (np.exp(2*fisher_corr_value) - 1) / (np.exp(2*fisher_corr_value) + 1)

Let's have a play with this. First of all, the 0.5 point should give us the central estimate for correlation:

`>>> get_correlation_distribution_point(0.9, 100, 0.5)`

0.9

>>> get_correlation_distribution_point(0.0, 100, 0.5)

0.0

We can back out lower and upper estimates:

`>>> get_correlation_distribution_point(0.0, 100, 0.05)`

-0.1964181176820594

>>> get_correlation_distribution_point(0.0, 100, 0.95)

0.19641811768205936

These get wider if we use less data:

`>>> get_correlation_distribution_point(0.0, 10, 0.05)`

-0.6296263003883293

>>> get_correlation_distribution_point(0.0, 10, 0.95)

0.6296263003883295

The distribution gets narrower and slightly assymetric at extremes:

`>>> get_correlation_distribution_point(0.99, 100, 0.05)`

0.9851477380139932

>>> get_correlation_distribution_point(0.99, 100, 0.95)

0.9932723911926727

And so on.

## Three asset portfolio weights under conditions of portfolio uncertainty

For a three asset portfolio we can now back out the optimised portfolio weights given the relevant points on the correlation distributions:

def optimise_for_corr_matrix_with_uncertainty(corr_matrix, conf_intervals, data_points):

labelled_correlations = extract_asset_pairwise_correlations_from_matrix(corr_matrix)

labelled_correlation_points = calculate_correlation_points_from_tuples(labelled_correlations, conf_intervals, data_points)

corr_matrix = three_asset_corr_matrix(labelled_correlation_points)

weights = optimise_for_corr_matrix(corr_matrix)

return weights

def extract_asset_pairwise_correlations_from_matrix(corr_matrix):

ab = corr_matrix[0][1]

ac = corr_matrix[0][2]

bc = corr_matrix[1][2]

return labelledCorrelations(ab=ab, ac=ac, bc=bc)

def calculate_correlation_points_from_tuples(labelled_correlations, conf_intervals, data_points):

correlation_point_list = [get_correlation_distribution_point(corr_value, data_points, confidence_interval)

for corr_value, confidence_interval in

zip(labelled_correlations, conf_intervals)]

labelled_correlation_points = labelledCorrelations(*correlation_point_list)

return labelled_correlation_points

Let's try this out:

`>>> corr_matrix = three_asset_corr_matrix(labelledCorrelations(0,0,0))`

>>> conf_intervals = labelledCorrelations(.5,.5,.5)

>>> optimise_for_corr_matrix_with_uncertainty(corr_matrix, conf_intervals, 100)

array([0.33333333, 0.33333333, 0.33333333])

# Makes sense, taking the median point off the distribution just recovers equal weights

`>>> conf_intervals = labelledCorrelations(.5,.5,.95)`

>>> optimise_for_corr_matrix_with_uncertainty(corr_matrix, conf_intervals, 100)

array([0.37431944, 0.31284028, 0.31284028])

# with a higher confidence point for correlation BC we put more in asset A

`>>> conf_intervals = labelledCorrelations(.5,.5,.05)`

>>> optimise_for_corr_matrix_with_uncertainty(corr_matrix, conf_intervals, 100)

array([0.286266, 0.356867, 0.356867])

# with a lower correlation for BC we put less in asset A

`>>> optimise_for_corr_matrix_with_uncertainty(corr_matrix, conf_intervals, 10)`

array([0.15588428, 0.42205786, 0.42205786])

`# With less data the confidence intervals get wider`

Seems to be working. Now I need to use the same technique as in the previous post on SR adjustments, which is get the portfolio weights over a number of points on the distribution, and then take an average.

def optimised_weights_given_correlation_uncertainty(corr_matrix, data_points, p_step=0.1):

dist_points = np.arange(p_step, stop=(1-p_step)+0.000001, step=p_step)

list_of_weights = []

for conf1 in dist_points:

for conf2 in dist_points:

for conf3 in dist_points:

conf_intervals = labelledCorrelations(conf1, conf2, conf3)

weights = optimise_for_corr_matrix_with_uncertainty(corr_matrix, conf_intervals, data_points)

list_of_weights.append(weights)

array_of_weights = np.array(list_of_weights)

average_weights = np.nanmean(array_of_weights, axis=0)

return average_weights

Slightly niggly thing here is that there are three inputs, so I need to iterate over 3 sets of distributions. Hence I'm using a 'p-step' of 0.1 (meaning we have to do 9^3 optimisations); using smaller values doesn't change the results (whilst being exponentially slower!), larger values are a bit too coarse.

## Recalculating the candidate matrices

Now I'm going to recalculate the candidate matrices used in the original handcrafting. This involves doing this kind of thing:

`>>> corr_matrix = three_asset_corr_matrix(labelledCorrelations(0.,0.,0.))`

# get raw optimised weights, no uncertainty

>>> optimise_for_corr_matrix(corr_matrix)

array([0.33333333, 0.33333333, 0.33333333])

# get optimised weights with uncertainty, using data_periods = 5

>>> optimised_weights_given_correlation_uncertainty(corr_matrix, 5)

array([0.33333332, 0.33333346, 0.33333323])

Here are the results. First the raw weights with no uncertainty (first 3 columns are pairwise correlations, next 3 are weights):

Corr AB Corr AC Corr BC Raw A Raw B Raw C

0 0.5 0 0.286 0.428 0.286

0 0.9 0 0.256 0.487 0.256

0.5 0 0.5 0.5 0 0.5

0 0.5 0.9 0.5 0.5 0

0.9 0 0.9 0.5 0 0.5

0.5 0.9 0.5 0.263 0.473 0.263

0.9 0.5 0.9 0.5 0 0.5

0 0 0 0.33 0.33 0.33

Now for the original handcrafted weights:

Corr AB Corr AC Corr BC H/C A H/C B H/C C

0 0.5 0 0.3 0.4 0.3

0 0.9 0 0.27 0.46 0.27

0.5 0 0.5 0.37 0.26 0.27

0 0.5 0.9 0.45 0.45 0.1

0.9 0 0.9 0.39 0.22 0.39

0.5 0.9 0.5 0.29 0.42 0.29

0.9 0.5 0.9 0.42 0.16 0.42

0 0 0 0.33 0.33 0.33

Now let's do the new stuff, with 5 data points (anything less than 4 will break, for obvious reasons):

Corr AB Corr AC Corr BC 5dp A 5dp B 5dp C

0 0.5 0 0.298 0.404 0.298

0 0.9 0 0.264 0.472 0.264

0.5 0 0.5 0.379 0.241 0.379

0 0.5 0.9 0.462 0.349 0.189

0.9 0 0.9 0.44 0.12 0.44

0.5 0.9 0.5 0.279 0.441 0.279

0.9 0.5 0.9 0.411 0.178 0.411

0 0 0 0.33 0.33 0.33

Some observations:

- For 'nice' matrices (like equal correlations or the first row: 0,.5,0) the results of both the original handcrafting and the new method are very close to the raw optimisation
- In most cases the original handcrafting and the new method give very similar results. The main exceptions are the fourth row and the third row.

Now let's see how things change with more data, say data_points=20:

Corr AB Corr AC Corr BC 20dp A 20dp B 20dp C

0 0.5 0 0.278 0.444 0.278

0 0.9 0 0.252 0.496 0.252

0.5 0 0.5 0.437 0.125 0.437

0 0.5 0.9 0.497 0.443 0.06

0.9 0 0.9 0.5 0 0.5

0.5 0.9 0.5 0.256 0.489 0.256

0.9 0.5 0.9 0.4955 0.009 0.4955

0 0 0 0.33 0.33 0.33

For the 'nicer' correlation matrices, not very much happens as we have more data. However for the more extreme versions, like rows 3 and 4, we get closer to the raw optimised weights (since we can be more confident that the central correlation estimates are correct). In fact for row 5 we end up with the raw weights, and the penultimate row isn't far off. And finally data_points=100:

Corr AB Corr AC Corr BC 100dp A 100dp B 100dp C

0 0.5 0 0.283 0.434 0.283

0 0.9 0 0.252 0.496 0.252

0.5 0 0.5 0.471 0.059 0.471

0 0.5 0.9 0.5 0.5 0

0.9 0 0.9 0.5 0 0.5

0.5 0.9 0.5 0.256 0.489 0.256

0.9 0.5 0.9 0.5 0 0.5

0 0 0 0.33 0.33 0.33

## How many data points should we use?

*past*correlations. We're interested in the uncertainty of

*future*correlations, given the data we have about the past. The sampling distribution of the past is just a proxy for this. In particular, it assumes that the latent correlation structure is static (as if!), and that joint Gaussian normal is a good model for financial data (if only!).

Although the confidence intervals are tighter (since the correlation is closer to 1) the record is just as bad: 14 misses in 18 years. No secular trend here, but the typical pattern of highly correlated assets: correlations break down in a crisis (1999, 2008). Again the true 95% range is probably 0.04 rather than the 0.01 show here.

`def fisher_confidence(data_points, conf_interval):`

data_point_root =fisher_stdev(data_points)*4

conf_point = get_confidence_point(conf_interval)

return data_point_root * conf_point

*Note the precise value will depend on the level of correlation, since we should strictly use the fisher equivalent of 4, but this seems to work well enough.*

`>>> get_correlation_distribution_point(-0.25, 1000, 0.025)`

-0.31528119007914723

>>> get_correlation_distribution_point(-0.25, 1000, 0.975)

-0.18236395028476593

>>> get_correlation_distribution_point(.96, 1000, 0.025)

0.9540384541438645

>>> get_correlation_distribution_point(.96, 1000, 0.975)

0.9652020604250287

>>> get_correlation_distribution_point(-0.25, 1000, 0.025)

-0.4925007507489921

>>> get_correlation_distribution_point(-0.25, 1000, 0.975)

>>> 0.028523195562247212

>>> get_correlation_distribution_point(.96, 1000, 0.025)

0.9304815655421306

>>> get_correlation_distribution_point(.96, 1000, 0.975)

0.9771329891421301

def apply_min_weight(average_weights):

weights_with_min = [min_weight(weight) for weight in average_weights]

adj_weights = weights_sum_to_total(weights_with_min)

return adj_weights

def min_weight(weight):

if weight<0.1:

return 0.1

else:

return weight

def weights_sum_to_total(weights_with_min):

sum_weights = np.sum(weights_with_min)

adj_weights = weights_with_min / sum_weights

return adj_weights

`>>> apply_min_weight(average_weights)`

array([0.45454545, 0.09090909, 0.45454545])

*any*correlation matrix. That means we don't have to do the candidate matching and extrapolation. We just feed in the correlation matrix directly and out will pop the appropriate weights.

*Note: in theory you could use this technique for any size of correlation matrix, but the time involved would increase exponentially.*

## Should we use all of our data to estimate correlations?

- Underlying instrument returns for different instruments
- Trading subsystem returns for different instruments
- Returns of trading rules

from systems.provided.futures_chapter15.basesystem import futures_system

system = futures_system()

## underlying returns`del(system.config.instrument_weights)`

perc_returns = [system.rawdata.get_percentage_returns(instrument_code) for instrument_code in instrument_list]

instrument_list = system.get_instrument_list()

perc_returns_df = pd.concat(perc_returns, axis=1)

perc_returns_df.columns = instrument_list

## subsystem returns

subsys_returns = [system.accounts.pandl_for_subsystem(instrument_code) for instrument_code in instrument_list]

subsys_returns_df = pd.concat(subsys_returns, axis=1)

subsys_returns_df.columns = instrument_list

## trading rule returns`def get_rule_returns(instrument_code):`

rules = list(system.rules.trading_rules().keys())

rule_returns = [system.accounts.pandl_for_instrument_forecast(instrument_code, rule) for rule in rules]

rule_returns_df = pd.concat(rule_returns, axis=1)

rule_returns_df.columns = rules

return rule_returns_df

`use_returns = perc_returns_df # change as required`

`use_returns = use_returns[pd.datetime(1998,1,1):] # common timestamp for fair comparison`

use_returns = use_returns.resample("5B").sum() # good enough approx for weekly returns

*We could do monthly rebalancing, but predicting 5 weeks of weekly correlation estimates is probably a mugs game, and most long only portfolios don't have turnover that is as high as all that.*

`def get_forecast_and_future_corr(Nweeks_back, Nweeks_forward):`

forecast = get_historic_correlations(Nweeks_back)

future = get_future_correlations(Nweeks_forward)

pd_result = merge_forecast_and_future(forecast, future, Nweeks_forward)

return pd_result

def merge_forecast_and_future(forecast, future, Nweeks_forward):

assets = forecast.columns # should be the same won't check

pd_result = []

for asset in assets:

result_for_asset = pd.concat([forecast[asset], future[asset]], axis=1)

# remove tail with nothing

result_for_asset = result_for_asset[:-Nweeks_forward]

# remove overlapping periods which bias R^2

selector = range(0, len(result_for_asset.index), Nweeks_forward)

result_for_asset = result_for_asset.iloc[selector]

result_for_asset.columns = ['forecast', 'turnout']

pd_result.append(result_for_asset)

pd_result = pd.concat(pd_result, axis=0)

return pd_result

def get_future_correlations(Nweeks_forward):

corr = get_rolling_correlations(use_returns, Nweeks_forward)

corr = corr.ffill()

future_corr = corr.shift(-Nweeks_forward)

return future_corr

def get_historic_correlations(Nweeks_back):

corr = get_rolling_correlations(use_returns, Nweeks_back)

corr = corr.ffill()

return corr

def get_rolling_correlations(use_returns, Nperiods):

roll_df = use_returns.rolling(Nperiods, min_periods=4).corr()

perm_names = get_asset_perm_names(use_returns)

roll_list = [get_rolling_corr_for_perm_pair(perm_pair, roll_df) for perm_pair in perm_names]

roll_list_df = pd.concat(roll_list, axis=1)

roll_list_df.columns = ["%s/%s" % (asset1, asset2) for (asset1, asset2) in perm_names]

return roll_list_df

def get_asset_perm_names(use_returns):

asset_names = use_returns.columns

permlist = []

for asset1 in asset_names:

for asset2 in asset_names:

if asset1==asset2:

continue

pairing = [asset1, asset2]

if pairing in permlist:

continue

pairing.reverse()

if pairing in permlist:

continue

permlist.append(pairing)

return permlist

def get_rolling_corr_for_perm_pair(perm_pair, roll_df):

return roll_df[perm_pair[0]][:,perm_pair[1]]

`Nweeks_forward = 52 # use 13 weeks for underlying returns, 52 for others`

import statsmodels.api as sm

import matplotlib.pyplot as pyplot

pyplot.rcParams.update({'font.size': 16})

Nweeks_list = [4, 7, 13, 26,52, 104, 208, 520]

r_squared = []

for Nweeks_back in Nweeks_list:

print(Nweeks_back)

pd_result = get_forecast_and_future_corr(Nweeks_back, Nweeks_forward)

pd_result = pd_result.dropna()

x = pd_result.forecast

x = sm.add_constant(x)

y = pd_result.turnout

est = sm.OLS(y, x).fit()

r2 = est.rsquared_adj

r_squared.append(r2)

ax = pyplot.gca()

ax.scatter(Nweeks_list, r_squared)

ax.set_xscale('log')

`Nweeks_forward = 52 # use 13 weeks for underyling returns, 52 for others`

r_squared = []

for Nweeks_back in Nweeks_list:

print(Nweeks_back)

all_r2 = []

for instrument_code in instrument_list:

print("%s/%s" % (instrument_code, Nweeks_back))

use_returns = get_rule_returns(instrument_code)

use_returns = use_returns[pd.datetime(1998, 1, 1):] # common timestamp

use_returns = use_returns.resample("5B").sum() # good enough approx for weekly returns

pd_result = get_forecast_and_future_corr(Nweeks_back, Nweeks_forward)

pd_result = pd_result.replace([np.inf, -np.inf], np.nan)

pd_result = pd_result.dropna()

x = pd_result.forecast

x = sm.add_constant(x)

y = pd_result.turnout

est = sm.OLS(y, x).fit()

r2_this_instr = est.rsquared_adj

all_r2.append(r2_this_instr)

r2 = np.mean(all_r2)

r_squared.append(r2)

- A lookback of less than 2 years is likely to do a relatively poor job of forecasting future correlations over the next year
- Lookbacks of more than 10 years may do a little better at forecasting, but from above we know they will produce weights that aren't as robust even after applying a fudge factor
- Shorter lookbacks may also result in assets moving between groups, which will produce changes in weights regardless of how robust our method is.

*Update 9th November 2020:*

## Plugging into pysystemtrade

*The optimisation code is a bit slow, but I need to refactor it first before I try and speed it up.*

*To make the results more interesting I am using a 2 year lookback here, rather than the 10 years suggested above.*

`from systems.provided.futures_chapter15.estimatedsystem import futures_system`

system = futures_system()

system.config.instruments = ['US10', 'US5', 'SP500']

system = futures_system(config = system.config)

x=system.combForecast.get_forecast_weights("US10")

x.plot()

### Summary

*Note: I will be updating the earlier posts on handcrafting to reflect this new methodology.*