Friday 2 July 2021

Talking to the dead / simple heuristic position selection / small account problems - part four / EPIC FAIL #2

Over the last few posts I've been grappling with the difficulties of trading futures with a retail-sized account. I've tried a couple of things so far - a complex dynamic optimisation (here and here) where I try and optimise the portfolio every day in the knowledge that I can only take integer positions, and then a simpler static approach where I try to pick the best fixed set of instruments to trade given my account size - and then trade them.

In this post I return to a dynamic approach (choosing the best positions to hold each day from a very large set of instruments), but this time I'm going to use much simpler heuristic methods. I use the term heuristic to mean something you could explain to an elderly relative: let's call them Auntie Barbara.

I used to have an Auntie Barbara, but she died a long time ago. If there is an afterlife, and if they have internet there, and if she subscribes to this blog: Hi!




I've written this post to be fairly self-contained (I can't really expect Auntie Barbara to read all the previous posts, she will be too busy playing tennis with Marilyn Monroe or something) and also a bit simpler to follow than the previous three.



The setup


Here's the setup again. I have a universe currently of 48 futures markets I'd like to trade (for now - in practice I'm adding new instruments every few days, and in an ideal world there are around 150 I'd like to trade if I could). If I backtest their performance it looks great (this is just with the set of trading rules in chapter 15 of my book, 3 EWMAC + carry; but I do allow the instrument weights to be optimised):



That's a Sharpe Ratio of 1.18, pretty good for two trading rules (ewmac and carry). Oh the power of diversification...

Not only does it make money, it also (on average) has good risk targeting. Here's the rolling annualised standard deviation (which comes in at 22.2% on average, slightly under the target):
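For anyone who wants to reproduce that kind of plot, here is a minimal sketch of a rolling annualised standard deviation in pandas. It is not the code used for the chart: the function name, the one-year window, the 256 business days per year convention and the simulated returns are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

BUSINESS_DAYS_IN_YEAR = 256  # assumption: a common convention, not stated in this post


def rolling_annualised_std(daily_returns: pd.Series, window_days: int = 256) -> pd.Series:
    """Rolling standard deviation of daily returns, scaled up to annual terms."""
    daily_std = daily_returns.rolling(window_days, min_periods=window_days).std()
    return daily_std * np.sqrt(BUSINESS_DAYS_IN_YEAR)


# Toy data: simulated daily returns with a daily standard deviation of 0.25 / 16,
# i.e. roughly 25% a year (an arbitrary number chosen for the example)
rng = np.random.default_rng(seed=0)
dates = pd.bdate_range("2015-01-01", periods=2000)
toy_returns = pd.Series(rng.normal(0.0, 0.25 / 16.0, size=len(dates)), index=dates)

rolling_vol = rolling_annualised_std(toy_returns)
average_vol = rolling_vol.mean()  # the equivalent of the 'on average' figure quoted above
```

The first `window_days - 1` values are NaN because the window isn't full yet; the average skips them.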




Auntie Barbara (AB): "Great! You always were a little smart alec. Can I get back to my jacuzzi now? I've got James Dean and Heath Ledger waiting for me."

* Auntie is communicating with me from the spirit world via telnet, hence the Courier typeface

Sorry Auntie, I cheated slightly there. That's the performance if I can take fractional futures positions, or equivalently what I could do with many millions of dollars.

This is what it looks like if I trade it with $100K (about £80K: this particular FX rate is roughly unchanged since my Auntie died)

I normally use $500K for these tests - but I'm trying to make the results starker.

AB "Why does it start going wrong, weirdly, not long after I've died? Are you saying this is my fault?"


Not at all! No, to begin with there are only a few instruments in the data. Then as more are added, we struggle to take positions in every instrument due to rounding. We end up with many instruments that have no position at all; the positions we end up making (or losing) money from just happen to be those with relatively small contract sizes.

So the portfolio becomes more concentrated, and in expectation (and also in reality here), has worse performance. It also undershoots its risk due to all that 'wasted' capacity of the instruments which can't take a position. There are many instruments here that we are just collecting data for, but can't hope to ever take a position in.
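A toy illustration of the rounding problem, assuming ideal positions scale linearly with capital. The function name, instrument names and position sizes are all invented for the example; none of them come from the backtest.

```python
def positions_for_capital(positions_per_million: dict, capital: float) -> dict:
    """Scale ideal fractional positions linearly with capital, then round to whole contracts."""
    scale = capital / 1_000_000
    return {code: float(round(position * scale))
            for code, position in positions_per_million.items()}


# Hypothetical ideal (fractional) positions for a $1 million account.
# The instrument names and numbers are made up for illustration only.
ideal_positions_per_million = {
    "BIG_CONTRACT": 0.9,
    "MEDIUM_CONTRACT": 3.2,
    "SMALL_CONTRACT": 14.0,
}

large_account = positions_for_capital(ideal_positions_per_million, capital=10_000_000)
small_account = positions_for_capital(ideal_positions_per_million, capital=100_000)

print(large_account)  # {'BIG_CONTRACT': 9.0, 'MEDIUM_CONTRACT': 32.0, 'SMALL_CONTRACT': 140.0}
print(small_account)  # {'BIG_CONTRACT': 0.0, 'MEDIUM_CONTRACT': 0.0, 'SMALL_CONTRACT': 1.0}
```

With $10 million every instrument gets a position close to its ideal size. With $100K only the instrument with the smallest contract gets a position at all, and that position is well short of the 1.4 contracts we'd ideally hold.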

Now look at the rolling realised standard deviation again:


We're systematically undershooting, especially in more recent years when I have a lot more instruments in the dataset. The risk is also 'lumpier', reflecting the close to binary nature of the system.


AB "Hang on, I've just read your last couple of posts again. Or tried to. What happens if you do some kind of fancy dynamic optimisation on your positions each day?"

That doesn't work and is way too complicated.

AB "And what if you just select a group of markets and trade with those?"

Well if I use the 16 instruments I identified in my last post as suitable for a $100K account I get these results:


The smaller set of markets is handicapped by having data that starts later, but if I account for that:


AB "When does that data start now?"

The 11th May 1974

AB "Ah - that's your birthday. Coincidence?"

Well actually the data starts on 22nd April 1974, but that's close enough.

That feels slightly like cheating since they're identified using some forward looking information, but if I selected any 16 instruments on a rolling basis using any vaguely sensible methodology I'd expect on average to get similar results.

Basically we make up some of the ground on the full 40+ instrument portfolio compared to the rounded situation, but we never quite manage it (although the green curve looks as good, it's actually got a lower SR and underperforms in more recent years as we get more and more instruments in the full portfolio). In expectation 16 instruments, no matter how carefully chosen, will underperform 50; never mind 150.



The simplest possible approach?


AB "Well it's obvious what you should do"

Is it?

AB "Do you remember when you were a boy, and you'd invite all your friends to your birthday parties?"

I'm 47 Auntie Barbara. I'm not 100% sure what I did last Thursday.

AB "Well just bear with me then. Suppose you had 50 friends, and you could only invite 16 to your party. What would you do?"

I'd.... well I'd pick my favourite 16 friends (this is hypothetical! What kind of person has fifty 'friends'?).

AB "Now suppose you had a birthday party every single day. What would happen?"

Well... I suppose I'd pick whoever was my favourite 16 friends on that day. But, with respect, what on earth, (sorry insensitive), what the hell (worse!), what in heaven does this have to do with the problem at hand?

AB "Hasn't the penny dropped yet? I thought you were a smart alec."

OK it has finally dropped. What I need to do is just hold positions in the 16 instruments that have the strongest absolute forecast on that day.

AB "Someone give the boy a medal"



I choose to ignore that. Let's see some code:


import datetime
from copy import copy

import numpy as np
import pandas as pd

## PositionSizing, output, diagnostic, portfolioWeights, progressBar and
## arg_not_supplied all come from pysystemtrade


class newPositionSizing(PositionSizing):

    @output()
    def get_subsystem_position(self, instrument_code: str) -> pd.Series:
        all_positions = self.get_pd_df_of_subsystem_positions()
        return all_positions[instrument_code]

    @diagnostic()
    def get_pd_df_of_subsystem_positions(self) -> pd.DataFrame:
        all_forecasts = self.get_all_forecasts()
        list_of_dates = all_forecasts.index

        list_of_positions = []
        previous_days_positions = portfolioWeights()
        p = progressBar(len(list_of_dates))
        for passed_date in list_of_dates:

            positions = self.get_subsystem_positions_for_day(passed_date, previous_days_positions)
            list_of_positions.append(positions)
            previous_days_positions = copy(positions)
            p.iterate()

        p.finished()

        df_of_positions = pd.DataFrame(list_of_positions)
        df_of_positions.index = list_of_dates

        return df_of_positions

    def get_subsystem_positions_for_day(self,
                                        passed_date: datetime.datetime,
                                        previous_days_positions: portfolioWeights = arg_not_supplied) -> portfolioWeights:

        if previous_days_positions is arg_not_supplied:
            previous_days_positions = portfolioWeights()
        forecasts = self.get_forecasts_for_day(passed_date)

        initial_positions_all_capital = self.get_initial_positions_for_day_using_all_capital(passed_date)

        positions = calculate_positions_for_day(previous_days_positions=previous_days_positions,
                                                forecasts=forecasts,
                                                initial_positions_all_capital=initial_positions_all_capital)
        list_of_instruments = self.parent.get_instrument_list()
        positions = positions.with_zero_weights_for_missing_keys(list_of_instruments)

        return positions

    def get_initial_positions_for_day_using_all_capital(self, passed_date: datetime.datetime) -> portfolioWeights:
        all_positions = self.get_all_initial_positions_using_all_capital()
        all_positions_on_day = all_positions.loc[passed_date]

        return portfolioWeights(all_positions_on_day.to_dict())

    def get_forecasts_for_day(self, passed_date: datetime.datetime) -> portfolioWeights:
        all_forecasts = self.get_all_forecasts()

        todays_forecasts = all_forecasts.loc[passed_date]

        return portfolioWeights(todays_forecasts.to_dict())

    @diagnostic()
    def get_all_forecasts(self) -> pd.DataFrame:
        instrument_list = self.parent.get_instrument_list()
        forecasts = [self.get_combined_forecast(instrument_code)
                     for instrument_code in instrument_list]

        forecasts_as_pd = pd.concat(forecasts, axis=1)
        forecasts_as_pd.columns = instrument_list
        forecasts_as_pd = forecasts_as_pd.ffill()

        return forecasts_as_pd

    @diagnostic()
    def get_all_initial_positions_using_all_capital(self) -> pd.DataFrame:
        instrument_list = self.parent.get_instrument_list()
        positions = [self.get_initial_position_using_all_capital(instrument_code)
                     for instrument_code in instrument_list]

        positions_as_pd = pd.concat(positions, axis=1)
        positions_as_pd.columns = instrument_list
        positions_as_pd = positions_as_pd.ffill()

        return positions_as_pd

    @diagnostic()
    def get_initial_position_using_all_capital(self, instrument_code: str) -> pd.Series:

        self.log.msg(
            "Calculating subsystem position for %s" % instrument_code,
            instrument_code=instrument_code,
        )

        initial_position = self.get_volatility_scalar(instrument_code)

        return initial_position


This code actually contains some future proofing, in that it is written for path dependence in positions - which we're not actually going to use yet. 



def calculate_positions_for_day(previous_days_positions: portfolioWeights,
                                forecasts: portfolioWeights,
                                initial_positions_all_capital: portfolioWeights):

    ## Get risk budget per market
    ##
    risk_budget_per_market = proportionate_risk_budget(forecasts)
    maximum_positions = int(1.0 / risk_budget_per_market)
    idm = min(maximum_positions**.35, 2.5)
    idm_with_risk = risk_budget_per_market * idm

    initial_positions = signed_initial_position_given_risk_budget(initial_positions_all_capital,
                                                                  forecasts=forecasts,
                                                                  risk_budget=idm_with_risk)

    list_of_tradeable_instruments = tradeable_instruments(initial_positions=initial_positions,
                                                          forecasts=forecasts)

    current_instruments_with_positions = []

    ## Sort markets by abs strength of forecast
    ## Iteratively from strongest to weakest:
    list_of_instruments_strongest_forecast_first = \
        sort_list_of_instruments_by_forecast_strength(forecasts=forecasts,
                                                      instrument_list=list_of_tradeable_instruments)

    for instrument_to_add in list_of_instruments_strongest_forecast_first:
        ## If already have position, keep it on - wouldn't be in this list
        if len(current_instruments_with_positions) < maximum_positions:
            ## If haven't got a position on, and risk budget remaining, add a position
            current_instruments_with_positions.append(instrument_to_add)
            continue
        else:
            ## If no markets remain with current positions in could be removed group, halt
            break

    new_positions = fill_positions_from_initial(current_instruments_with_positions=current_instruments_with_positions,
                                                initial_positions=initial_positions)

    return new_positions


Most of that should be self explanatory. The 'initial' position (perhaps badly named) is the position the system would want to take if we put all of our trading capital into that single instrument. We then scale that by a risk budget, which is equivalent to an 'instrument weight'. Here that is just 1/N (N is the number of assets we're currently trading), with a lower limit of 6.25% (to avoid having more than 16 positions; this value can be tweaked depending on your capital). Finally we apply an IDM calculated as N^0.35 (note if all subsystems had zero correlation this would be N^0.5, so this is a reasonable approximation), with my normal limit of IDM=2.5.


def proportionate_risk_budget(forecasts: portfolioWeights):
    market_count = market_count_in_forecasts(forecasts)
    proportion = 1.0/market_count

    use_proportion = max(proportion, 1.0/16)

    return use_proportion
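To make the risk budget and IDM arithmetic concrete, here is a small stand-alone sketch in plain Python. It mirrors the calculation in the code above but takes a market count directly rather than a set of forecasts; the function name and its keyword arguments are made up for this example.

```python
def risk_budget_and_idm(market_count: int,
                        minimum_budget: float = 1.0 / 16,
                        idm_cap: float = 2.5):
    """Return (risk budget per market, maximum positions, IDM, budget multiplied by IDM)."""
    risk_budget = max(1.0 / market_count, minimum_budget)
    maximum_positions = int(1.0 / risk_budget)
    idm = min(maximum_positions ** 0.35, idm_cap)
    return risk_budget, maximum_positions, idm, risk_budget * idm


results = {market_count: risk_budget_and_idm(market_count)
           for market_count in (4, 16, 48)}

for market_count, (budget, max_positions, idm, budget_with_idm) in results.items():
    print(market_count, budget, max_positions, round(idm, 4), round(budget_with_idm, 5))

# With 4 markets the budget is 25% each and the IDM is 4^0.35, about 1.62
# With 16 markets the budget is 6.25% each and 16^0.35 (about 2.64) is capped at 2.5
# With 48 markets the floor kicks in, so the answer is the same as for 16 markets
```

So once there are 16 or more markets with forecasts, each position is sized as if it had 6.25% x 2.5 = 15.625% of the capital.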

Now a 'tradeable' instrument is one with a valid forecast (not NaN, and not zero), but also a position that is equal to a single contract or more. No point wasting risk capital on a position that isn't at least one contract.

AB "No point inviting a kid to the party who can't come. That's a waste of an invitation."


Indeed.

def tradeable_instruments(initial_positions: portfolioWeights,
                          forecasts: portfolioWeights):
    ## Non tradeable instruments:
    ## We don't open up new positions in non tradeable instruments, but we may
    ## maintain positions in existing ones

    valid_forecasts = instruments_with_valid_forecasts(forecasts)
    possible_positions = instruments_with_possible_positions(initial_positions)

    valid_instruments = list(set(possible_positions).intersection(set(valid_forecasts)))

    return valid_instruments

def instruments_with_valid_forecasts(forecasts: portfolioWeights) -> list:
    valid_keys = [instrument_code
                  for instrument_code, forecast_value in forecasts.items()
                  if _valid_forecast(forecast_value)]
    return valid_keys

def _valid_forecast(forecast_value: float):
    if np.isnan(forecast_value):
        return False
    if forecast_value==0.0:
        return False
    return True

def instruments_with_possible_positions(initial_positions: portfolioWeights) -> list:
    valid_keys = [instrument_code
                  for instrument_code, position in initial_positions.items()
                  if _possible_position(position)]
    return valid_keys

def _possible_position(position: float):
    if np.isnan(position):
        return False
    if abs(position)<1.0:
        return False
    ## At least one whole contract: this instrument can take a position
    return True

Let's have a gander at what this thing is doing:




I've zoomed in to the end of this plot, which shows positions for Eurodollar at various stages. The blue line shows what position we'd have on without position rounding, and with a fixed capital weight of 6.25% (equal weight across 16 instruments) multiplied by the IDM (2.5 here). The orange line - which is mostly on the blue line - shows the position we'd have on without rounding, once we've applied the 'You need to have one of the 16 strongest forecasts to come to the party' rule (I need a catchier name).

So for example between March and mid April this goes to zero, as the forecast weakens.

Finally the green line shows the rounded position, once I've applied my usual buffering rule. You can see that's mostly a rounded version of the orange line.




OK. It's not great, although the last 10 years is pretty good. Also the vol targeting is somewhat poor:




... coming in at an average of 12% a year. 




Horrible path dependence



Let's turn our attention first to the poor performance. Some of that is due to costs, which go up from around 10bp of SR in the large and reduced benchmarks to 26bp of SR. As I've said many times before, pre-cost performance is (to an extent) random but costs are predictable. The higher costs aren't a surprise: a forecast going from being ranked 17th best to 16th best will result in a trade, and then possibly the next day the same position will be closed.


AB "It seems unfair to kick someone out of the party, just because they've gone from being your 16th to 17th favourite friend. Maybe you should let kids stay until they are really not your friends anymore."



OK, let's try it. I propose the following rule (bear in mind that my forecasts are scaled such that a forecast of +10 is an average long):
  • If we have a position, and the absolute forecast is more than 5, then hang on to it.
  • If we don't have a position, and the absolute forecast is more than 5 then try to open a new position. Starting with the instruments with the highest forecasts:
  • If we already have the maximum number of positions open, then:
    • For instruments that have open positions, starting with the lowest forecast close the position and replace it with the new instrument.
    • Do not close a position if the absolute forecast is more than 5. 
    • Once all possible positions (absolute forecast<5) have been closed, do not open any new positions

So:
  • Absolute forecasts greater than 10:
    • Existing position: won't be closed
    • New position: probably will be opened
  • Absolute forecasts between 5 and 10:
    • Existing positions: won't be closed
    • New positions: may be opened
  • Absolute forecasts less than 5:
    • Existing positions: may be closed


def calculate_positions_for_day(previous_days_positions: portfolioWeights,
                                forecasts: portfolioWeights,
                                initial_positions_all_capital: portfolioWeights):

    risk_budget_per_market = proportionate_risk_budget(forecasts)
    maximum_positions = int(1.0 / risk_budget_per_market)
    idm = min(maximum_positions**.35, 2.5)
    idm_with_risk = risk_budget_per_market * idm

    initial_positions = signed_initial_position_given_risk_budget(initial_positions_all_capital,
                                                                  forecasts=forecasts,
                                                                  risk_budget=idm_with_risk)

    list_of_tradeable_instruments = tradeable_instruments(initial_positions=initial_positions,
                                                          forecasts=forecasts)

    current_instruments_with_positions = from_portfolio_weights_to_instrument_list(previous_days_positions)

    ## forecast less than +5 or non tradable (could be removed)
    list_of_removable_instruments = removable_instruments_with_positions_weakest_forecasts_last(current_instruments_with_positions,
                                                                                                forecasts=forecasts)
    ## ordered by weakness of forecast

    ## Sort markets by abs strength of forecast
    ## Iteratively from strongest to weakest:
    list_of_instruments_with_no_position_strongest_forecast_first = \
        instruments_with_no_position_strongest_forecast_first(
            list_of_tradeable_instruments=list_of_tradeable_instruments,
            forecasts=forecasts,
            current_instruments_with_positions=current_instruments_with_positions)

    for instrument_to_add in list_of_instruments_with_no_position_strongest_forecast_first:
        ## If already have position, keep it on - wouldn't be in this list
        if len(current_instruments_with_positions) < maximum_positions:
            ## If haven't got a position on, and risk budget remaining, add a position
            current_instruments_with_positions.append(instrument_to_add)
            continue

        elif len(list_of_removable_instruments) > 0:
            ## If haven't got a position on, and no risk budget remaining,
            ## Remove position from market with current position and weakest forecast in 'could be removed' group
            instrument_to_remove = list_of_removable_instruments.pop()
            current_instruments_with_positions.remove(instrument_to_remove)
            current_instruments_with_positions.append(instrument_to_add)
            continue
        else:
            ## If no markets remain with current positions in could be removed group, halt
            break

    new_positions = fill_positions_from_initial(current_instruments_with_positions=current_instruments_with_positions,
                                                initial_positions=initial_positions)

    return new_positions

def from_portfolio_weights_to_instrument_list(positions: portfolioWeights):
    instrument_list = [instrument_code for instrument_code, position in positions.items()
                       if _valid_position(position)]
    return instrument_list

def _valid_position(position: float):
    if np.isnan(position):
        return False
    if position==0.0:
        return False

    return True

def removable_instruments_with_positions_weakest_forecasts_last(current_instruments_with_positions: list,
                                                                forecasts: portfolioWeights):
    instrument_with_weak_forecasts = instruments_with_weak_or_non_existent_forecasts(forecasts)
    instruments_with_positions_and_weak_forecasts = list(set(current_instruments_with_positions).intersection(instrument_with_weak_forecasts))

    instruments_with_positions_and_weak_forecasts_weakest_forecast_last = \
        sort_list_of_instruments_by_forecast_strength(forecasts,
                                                      instruments_with_positions_and_weak_forecasts)

    return instruments_with_positions_and_weak_forecasts_weakest_forecast_last


def instruments_with_weak_or_non_existent_forecasts(forecasts: portfolioWeights) -> list:
    weak_forecasts = [instrument_code
                      for instrument_code, forecast_value in forecasts.items()
                      if _weak_forecast(forecast_value)]
    return weak_forecasts

def _weak_forecast(forecast_value: float):
    if np.isnan(forecast_value):
        return True
    #FIXME SHOULD COME FROM SYSTEM HARD CODING IS THE DEVILS WORK
    if abs(forecast_value)<5.0:
        return True
    return False

def sort_list_of_instruments_by_forecast_strength(forecasts: portfolioWeights,
                                                  instrument_list) -> list:

    tuples_to_sort = [(instrument_code,
                       _get_forecast_sort_key_given_value(forecasts[instrument_code]))
                      for instrument_code in instrument_list]
    sorted_tuples = sorted(tuples_to_sort, key=lambda tup: tup[1], reverse=True)
    list_of_instruments = [x[0] for x in sorted_tuples]

    return list_of_instruments

def _get_forecast_sort_key_given_value(forecast_value: float):
    if np.isnan(forecast_value):
        return 0.0
    return abs(forecast_value)

def instruments_with_no_position_strongest_forecast_first(forecasts: portfolioWeights,
                                                          current_instruments_with_positions: list,
                                                          list_of_tradeable_instruments: list):

    tradeable_instruments_setted = set(list_of_tradeable_instruments)
    tradeable_instruments_setted.difference_update(current_instruments_with_positions)
    instruments_with_no_position = list(tradeable_instruments_setted)
    list_of_instruments_with_strong_forecasts = instruments_with_strong_forecasts(forecasts)

    list_of_instruments_with_strong_forecasts_and_no_position = \
        list(set(instruments_with_no_position).intersection(set(list_of_instruments_with_strong_forecasts)))

    sorted_instruments = sort_list_of_instruments_by_forecast_strength(forecasts=forecasts,
                                                                       instrument_list=list_of_instruments_with_strong_forecasts_and_no_position)

    return sorted_instruments

def instruments_with_strong_forecasts(forecasts: portfolioWeights) -> list:
    strong_forecasts = [instrument_code
                        for instrument_code, forecast_value in forecasts.items()
                        if _strong_forecast(forecast_value)]
    return strong_forecasts

def _strong_forecast(forecast_value: float):
    if np.isnan(forecast_value):
        return False
    #FIXME SHOULD COME FROM SYSTEM
    if abs(forecast_value)<5.0:
        return False
    return True

That improves things a little; the cost comes down to 20bp of SR. But that's still a lot - about double what it is in the benchmark cases.

Let's restrict the instruments we can consider adding to those with absolute forecasts over 10, rather than over 5. Then we have:

  • Absolute forecasts greater than 10:
    • Existing position: won't be closed
    • New position: probably will be opened
  • Absolute forecasts between 5 and 10:
    • Existing positions: won't be closed
    • New positions: won't be opened
  • Absolute forecasts less than 5:
    • Existing positions: may be closed
This creates a 'no trade zone' for forecasts between 5 and 10.

.... and makes almost no difference, lowering the costs by just 1bp of SR.

Clearly I could play with these boundaries until I got a nicer result, but this reeks of implicit fitting and I feel the gap is just too large.


Some other things we could try


There are more complicated things we could do here, for example considering diversification when adding potential instrument positions, allocating the risk bucket by asset class or instrument cluster, perhaps a more sophisticated approach to costs.... but I think we'll just end up in the bad old world of complex dynamic optimisation that I narrowly escaped from in the second post.


Conclusion


I feel this particular dead horse has been flogged enough. There is no easy way to get around the problem of having insufficient capital to trade loads and loads of futures markets. Any kind of dynamic optimisation, either by simple ranking (this post), or complex formula (posts 1 and 2) just isn't very effective, and involves making the nice simple straightforward trading system very ugly indeed.

By far the simplest approach is to sensibly choose some subset of those markets, and use those as your static set of instruments as I did in post #3 of this series. This also happens to be the best performing option in a backtest. For the $500K of capital that I have the effect on performance is fairly minimal in any case.

Yes there will be FOMO if an instrument I don't own shows a seriously good trend, but I will just have to live with that.

Things are clearly tougher if you only have $100K or less, but then as my third book points out maybe you should be trading other leveraged instruments.

My personal 'to do' list now consists of tactically reweighting my portfolio towards the 28 instruments I found to be optimal for my account size here, and putting into place the technology to allow regular (annual?) reviews of my set of instruments.

Thanks for your help Auntie B.

AB "You're welcome. And I hope for your sake that the Jacuzzi is still warm."


4 comments:

  1. I'm probably lost between the blog posts and code (I'm not a python expert) but have you tried something like:
    Selecting assets with high forecast, low correlations, low size and low trading costs?
    This could be done in a complicated way (estimating portfolio expected Sharpe ratio given a few assumptions on how each variable impacts exp risk and return) or in some heuristic way.
    I think correlations may be a key variable: an instrument may have a rather large size and a rather weak signal, but if it is negatively correlated to other assets in the current portfolio including it may make sense.

    1. That's effectively what I do in the first two posts of the series.

  2. It seems like all these approaches are somewhat violating your efforts to "avoid optimization". - I've followed a VERY similar approach to you for the unconstrained "fractional contract" solution, and have found that building the "full contract" solution based on solely optimizing to reduce transaction costs, and minimize tracking error to the "fractional" solution works great. Happy to share if you are interested...

    1. Email me on:
      rob AT systematicmoney DOT org

