Wednesday, 18 May 2016

Optimising weights with costs (pysystemtrade)

In a previous post I showed you how to use my open source python backtesting package, pysystemtrade, to estimate forecast weights and instrument weights. At the time I hadn't included code to calculate costs. Now that has been remedied I thought I should also write some code to demonstrate the different ways you can optimise in the presence of costs.

You can see what else has been included in pysystemtrade since you last checked, here. If you're not a fan of pysystemtrade, don't worry, you'll probably find the discussion interesting although you'll then have to work out how to implement it yourself.

Naturally your understanding and enjoyment of this post will be enhanced if you've read chapter 12 of my book, though non-readers who have followed my blog posts on the subject of optimisation should also be fine.

This post has been modified to reflect the changes made in version 0.10.0

Costs, generally


In this first part of the post I'll discuss in general terms the different ways of dealing with costs. Then I'll show you how to do this in python, with some pretty pictures.

How to treat costs


Ignore costs - use gross returns

This is by far the simplest option. It's even okay to do it if the assets in your portfolio optimisation, whether they be trading rules or instruments, have the same costs. Of course if this isn't the case you'll end up overweighting anything that looks amazing pre-costs, but has very high costs to go with it.

Subtract costs from gross returns: use net returns

This is what most people would do without thinking; it's perhaps the "obvious" solution. However this skips over a fairly fundamental problem, which is this: We know costs with more certainty than we know expected gross returns.

I'm assuming here you've calculated your costs properly and not been overly optimistic. 

Take for example my own trading system. It backtests at something like a 20% average return before costs. As it happens I came in just over that last year. But I expect just over two thirds of my annual returns to be somewhere between -5% and +45% (this is a simple confidence interval given a standard deviation of 25%).

Costs however are a different matter. My costs for last year came in at 1.39% versus a backtested 1.5%. I am pretty confident they will be between 1% and 2% most years I trade. Costs of 1.5% don't affect optimisation much, but if your predicted costs were 15% versus a 20% gross return, then it's important you bear in mind your costs have perhaps a 70% chance of being between 14% and 16%, and your gross returns between -5% and 45%. That's a system I personally wouldn't trade.
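To make the asymmetry concrete, here's a minimal sketch in plain python, using the illustrative numbers above (nothing here is pysystemtrade code):

# Rough "two thirds" (one standard deviation) interval for annual gross
# returns, using the figures above: 20% mean return, 25% annual vol
mean_gross, stdev_gross = 0.20, 0.25
gross_interval = (mean_gross - stdev_gross, mean_gross + stdev_gross)
print(gross_interval)   # roughly (-0.05, 0.45): gross returns are very uncertain

# Costs: backtested 1.5%, realised 1.39%; a plausible range is far tighter
cost_interval = (0.01, 0.02)
print(cost_interval)    # we know costs to within roughly half a percent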

The next few alternatives involve various ways of dealing with this issue.


Ignore gross returns, just use costs


This is probably the most extreme solution - throw away any information about gross returns and optimise purely on cost data. Expensive assets will get a lower weight than cheap ones.

(Note this should be done in a way that preserves the covariance structure of gross returns, since we still want more diversifying assets to be upweighted. It's also good for optimisation to avoid negative expected returns. I apply a drag to the returns such that each asset's Sharpe Ratio is equal to a reasonable average Sharpe, adjusted for the cost difference between assets.)
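Here's a minimal sketch of one way to implement that drag, assuming weekly returns in a pandas DataFrame and annualised SR costs in a dict; the function name and the average Sharpe of 0.5 are my own illustrative choices, not the exact pysystemtrade internals:

import pandas as pd

def equalise_gross_returns(gross_returns: pd.DataFrame,
                           ann_sr_costs: dict,
                           avg_sr: float = 0.5,
                           periods_per_year: int = 52) -> pd.DataFrame:
    # Shift each asset's mean so differences in Sharpe Ratio across assets
    # reflect only differences in cost, around a reasonable average Sharpe.
    # Subtracting a constant leaves the covariance structure untouched.
    adjusted = {}
    for asset in gross_returns.columns:
        rets = gross_returns[asset]
        ann_stdev = rets.std() * (periods_per_year ** 0.5)
        current_ann_mean = rets.mean() * periods_per_year
        target_ann_mean = (avg_sr - ann_sr_costs[asset]) * ann_stdev
        drag = (current_ann_mean - target_ann_mean) / periods_per_year
        adjusted[asset] = rets - drag
    return pd.DataFrame(adjusted)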

I quite like this idea. I think it's particularly appealing in the case of allocating instrument weights in a portfolio of trading systems, each trading one instrument. There is little reason to think that one instrument will do better than another.

Nevertheless I can understand why some people might find this a little extreme, and I probably wouldn't use it for allocating forecast weights to trading rule variations. This is particularly the case if you've been following my general advice not to fit trading models in the "design" phase, which means an optimiser needs to be able to underweight a poor model.



Subtract costs from gross returns, and multiply costs by a factor


This is a compromise where we use gross returns, but multiply our costs by a factor (>1) before calculating net returns. A common factor is 2, derived from the common saying "A bird in the hand is worth two in the bush", or in the language of Bayesian financial economics: "A bird which we have in our possession with 100% certainty has the same certainty equivalent value as two birds whose ownership and/or existence has a probability of X% (solve for X according to your own personal utility function)". 


In other words 1% of extra costs is worth as much as 2% of extra gross returns. This has the advantage of being simple, although it's not obvious what the correct factor is. The correct factor will depend on the sampling distribution of the gross returns, and that's even without getting into the sticky world of uncertainty adjustment for personal utility functions. 
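In code the multiplier idea is trivial; a sketch with hypothetical names, just to pin the definition down:

import pandas as pd

def net_returns(gross_returns: pd.Series, costs: pd.Series,
                cost_multiplier: float = 2.0) -> pd.Series:
    # 0.0 optimises on gross returns; 1.0 gives ordinary net returns;
    # 2.0 and above penalises costs more than gross returns are rewarded
    return gross_returns - cost_multiplier * costs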


Calculate weights using gross returns, and adjust subsequently for cost levels


This is a slightly more sophisticated method and one that Thomas Bayes FRS would be more comfortable with.

I'm 97% sure this is Thomas Bayes. Ex-ante I was 100%, but the image is from Wikipedia after all. 

In table 12 (chapter 4, p.86 print edition) of my book I explain how you can adjust portfolio weights if you know with certainty what the Sharpe ratio difference is between the distribution of returns of the assets in our portfolio (notice this is not as strong as saying we can predict the actual returns). Since we have a pretty good idea what costs are likely to be (so I'm happy to say we can predict their distribution with something close to certainty)  it's valid to use the following approach:

  • Make a first stab at the portfolio weights using just gross returns. The optimiser (assuming you're using bootstrapping or shrinkage) will automatically incorporate the uncertainty of gross returns into its work.
  • Using the Sharpe Ratio differences implied by the costs of each asset, adjust the weights.
This trick means we can pull in two types of data about which we have differing levels of statistical confidence. It's also simpler than the strictly correct alternative of bootstrapping some weights with only cost data, and then combining the weights derived from using gross returns with those from costs.
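As a rough illustration of the second step (the real adjustment uses the multipliers from chapter 4 of my book; the crude linear stand-in below just shows the mechanics, and isn't the actual pysystemtrade code):

import numpy as np

def apply_cost_weighting(raw_weights: list, ann_sr_costs: list) -> list:
    # Positive relative_sr means cheaper than average, negative means dearer
    avg_cost = np.mean(ann_sr_costs)
    relative_sr = [avg_cost - cost for cost in ann_sr_costs]
    # Crude linear mapping from relative SR to a multiplier around 1.0
    multipliers = [max(0.0, 1.0 + 2.5 * rel) for rel in relative_sr]
    weights = [w * m for w, m in zip(raw_weights, multipliers)]
    total = sum(weights)
    return [w / total for w in weights]   # renormalise to sum to one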


Apply a maximum cost threshold


In chapter 12 of my book I recommended that you do not use any trading system which sucked up more than 0.13 SR of costs a year. Although all the methods above will help to some extent, it's probably a good idea to entirely avoid allocating to any system which has excessive costs.
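A sketch of the filter, with illustrative names (in pysystemtrade this behaviour comes from the ceiling_cost_SR parameter you'll see in the configs below):

MAX_SR_COST = 0.13   # from chapter 12: maximum annualised SR spent on costs

def apply_cost_ceiling(assets: list, ann_sr_costs: dict,
                       ceiling: float = MAX_SR_COST) -> list:
    # Anything breaching the ceiling gets no weight at all
    return [asset for asset in assets if ann_sr_costs[asset] <= ceiling]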


Different methods of pooling


I'm a big fan of pooling information across different instruments.

This section applies only to calculating forecast weights; you can't pool data across instruments to work out instrument weights.

It's rare to have enough data for any one instrument to be able to say with any statistical significance that this trading rule is better than that one. We often need decades of data to make that decision, and decades of data just aren't available except for a few select markets. But if you have 30 instruments with 10 years of data each, then that's three centuries' worth.

Once we start thinking about pooling with costs however there are a few different ways of doing it.


Full pooling: Pool both gross returns and costs


This is the simplest solution; but if our instruments have radically different costs it may prove fatal. In a field of mostly cheap instruments we'd plump for a high return, high turnover trading rule. When applied to a costly instrument that would be a guaranteed money loser.


Don't pool: Use each instrument's own gross returns and costs


This is the "throw our toys out of the pram" solution to the point I raised above.  Of course we lose all the benefits of pooling.


Half pooling: Use pooled gross returns, and an instrument's own costs


This is a nice compromise. The idea being once again that gross returns are relatively unpredictable (so let's get as much data as possible about them), whilst costs are easy to forecast on an instrument by instrument basis (so let's use them).

Notice that the calculation for cost comes in two parts - the cost "per turnover" (buy and sell) and the number of turnovers per year. So we can use some pooled information about costs (the average turnover of the trading rule), whilst using the relevant cost per turnover of an individual instrument.
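So the calculation behind half pooling is just this (a sketch with hypothetical names):

def annual_sr_cost(sr_cost_per_turnover: float,
                   turnovers_per_year: float) -> float:
    # Cost per round trip (in SR units, instrument specific) multiplied by
    # round trips per year (rule specific, can be pooled across instruments)
    return sr_cost_per_turnover * turnovers_per_year

# e.g. a fast rule doing 20 round trips a year on an instrument costing
# 0.01 SR per round trip burns 0.20 SR annually - well over a 0.13 ceiling
print(annual_sr_cost(0.01, 20))   # 0.2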

Note


I hopefully don't need to point out to the intelligent readers of my blog that using an instrument's own gross returns, but with pooled costs, is pretty silly.




Costs in pysystemtrade


You can follow along here. Notice that I'm using the standard futures example from chapter 15 of my book, but we're using all the variations of the ewmac rule from 2_8 upwards to make things more exciting.
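If you'd rather poke around interactively than just read config snippets, something like this should get you going (assuming the provided chapter 15 estimated system entry point; module paths may have moved since this was written):

from systems.provided.futures_chapter15.estimatedsystem import futures_system

# Build the chapter 15 futures system with estimated weights; the config
# includes carry plus the ewmac variations from 2_8 up to 64_256
system = futures_system()
print(system.rules.trading_rules().keys())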



Key


This is an extract from a pysystemtrade YAML configuration file:

forecast_weight_estimate:
   date_method: expanding ## other options: in_sample, rolling
   rollyears: 20

   frequency: "W" ## other options: D, M, Y


Forecast weights

Let's begin with setting forecast weights. I'll begin with the simplest possible behaviour which is:

  • applying no cost weighting adjustments
  • applying no ceiling to costs before weights are zeroed
  • a cost multiplier of 0.0, i.e. no costs used at all
  • pooling gross returns across instruments, and also costs; the same as pooling net returns
  • using gross returns without equalising them


I'll focus on the weights for Eurostoxx (as it's the cheapest market in the chapter 15 set) and V2X (the most expensive one); although of course in this first example they'll be the same as I'm pooling net returns. Although I'm doing all my optimisation on a rolling out of sample basis I'll only be showing the final set of weights.
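To reproduce each run in python you can override the estimation parameters before anything is calculated and cached; here's a sketch mirroring the YAML just below (assigning plain dicts onto the config is one way to do it, but check the pysystemtrade docs for the current interface):

from systems.provided.futures_chapter15.estimatedsystem import futures_system

system = futures_system()

# Mirror the YAML settings below; keys left out fall back to the defaults
system.config.forecast_cost_estimates = dict(use_pooled_costs=True)
system.config.forecast_weight_estimate = dict(
    method="bootstrap", apply_cost_weight=False, ceiling_cost_SR=999.0,
    cost_multiplier=0.0, pool_gross_returns=True, equalise_gross=False)

# Final row of the rolling out-of-sample forecast weights
print(system.combForecast.get_forecast_weights("EUROSTX").iloc[-1])
print(system.combForecast.get_forecast_weights("V2X").iloc[-1])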

forecast_cost_estimates:
   use_pooled_costs: True
forecast_weight_estimate:
   apply_cost_weight: False
   ceiling_cost_SR: 999.0

   cost_multiplier: 0.0
   pool_gross_returns: True
   equalise_gross: False

   # optimisation parameters remain the same 
   method: bootstrap
   equalise_vols: True
   monte_runs: 100
   bootstrap_length: 50
   equalise_SR: False
   frequency: "W"
   date_method: "expanding"
   rollyears: 20
   cleaning: True
   


These are the Eurostoxx weights; but they're the same for every market. As you can see the diversifying carry rule gets the most weight; for the variations of EWMAC it's close to equal weights.


Subtract costs from gross returns: use net returns

forecast_cost_estimates:
   use_pooled_costs: True
forecast_weight_estimate:
   apply_cost_weight: False
   ceiling_cost_SR: 999.0

   cost_multiplier: 1.0 ## note change
   pool_gross_returns: True
   equalise_gross: False

Again just the one set of weights here. There's a bit of a reallocation from expensive and fast trading rules towards slower ones, but not much.

Don't pool: Use each instrument's own gross returns and costs

forecast_cost_estimates:
   use_pooled_costs: False

forecast_weight_estimate:
   apply_cost_weight: False
   ceiling_cost_SR: 999.0

   cost_multiplier: 1.0
   pool_gross_returns: False
   equalise_gross: False

EUROSTX (Cheap market)
For a cheap market like Eurostoxx, applying its own costs means that the fast EWMAC2_8 isn't downweighted by nearly as much as with pooled costs. There's also the fact that we don't have much data, which means the weights could be a bit unusual.
V2X (Expensive market)
V2X is expensive, so there's a bit less weight on EWMAC2_8. However there's actually more than with the pooled data above, suggesting again that the small amount of data available is just giving us rather random weights.

Half pooling: Use pooled gross returns, and an instrument's own costs

forecast_cost_estimates:

   use_pooled_costs: False
   use_pooled_turnover: True ## even when pooling I recommend doing this
forecast_weight_estimate:
   apply_cost_weight: False
   ceiling_cost_SR: 999.0

   cost_multiplier: 1.0
   pool_gross_returns: True
   equalise_gross: False


EUROSTOXX (Cheap)
V2X (Expensive)

These two graphs are far more sensible. There is about 30% in carry in both; with pretty much equal weighting for the cheaper Eurostoxx market, and a tilt towards the cheaper variations for pricey V2X.

For the rest of this post I'll be using this combination.

Ignore gross returns, just use costs

forecast_cost_estimates:

   use_pooled_costs: False
   use_pooled_turnover: True ## even when not pooling costs I recommend doing this
forecast_weight_estimate:
   apply_cost_weight: False
   ceiling_cost_SR: 999.0

   cost_multiplier: 1.0
   pool_gross_returns: True
   equalise_gross: True
Eurostoxx (cheap)


V2X (expensive)



For cheap Eurostoxx costs don't seem to matter much and there's a spread to the more diversifying 'wings' driven by correlations. For V2X there is clearly a tilting away from the more expensive options, but it isn't dramatic.

Subtract costs from gross returns, after multiplying costs by a factor > 1

forecast_cost_estimates:

   use_pooled_costs: False
   use_pooled_turnover: True ## even when pooling I recommend doing this
forecast_weight_estimate:
   apply_cost_weight: False
   ceiling_cost_SR: 999.0

   cost_multiplier: 3.0
   pool_gross_returns: True
   equalise_gross: False

Eurostoxx (Cheap)
V2X (Expensive)
I've used a factor of 3 here rather than 2 to make the picture more dramatic. The results show about half the weight for V2X in the most expensive variation, compared to using a cost factor of 1. Still, I have my reservations about what the right number to use here is.

Calculate weights using gross returns, and adjust subsequently for cost levels

forecast_cost_estimates:
   use_pooled_costs: False
   use_pooled_turnover: True ## even when pooling I recommend doing this
forecast_weight_estimate:
   apply_cost_weight: True
   ceiling_cost_SR: 999.0

   cost_multiplier: 0.0
   pool_gross_returns: True
   equalise_gross: False


EUROSTOXX (Cheap)
V2X (Expensive)
This gives similar results to using a cost multiplier of 3.0 above.

Apply a maximum cost threshold

forecast_cost_estimates:
   use_pooled_costs: False
   use_pooled_turnover: True ## even when pooling I recommend doing this
forecast_weight_estimate:
   apply_cost_weight: False
   ceiling_cost_SR: 0.13

   cost_multiplier: 1.0
   pool_gross_returns: True
   equalise_gross: False


EUROSTOXX (Cheap)
V2X (Expensive)

Really, giving any allocation to EWMAC2_8 was kind of crazy, because it is a rather expensive beast to trade. The V2X weighting is even more extreme, removing all but the three slowest EWMAC variations. It's for this reason that chapter 15 of my book uses only those three rules, plus carry.



My favourite setup


forecast_cost_estimates:

   use_pooled_costs: False
   use_pooled_turnover: True ## even when pooling I recommend doing this

forecast_weight_estimate:
   apply_cost_weight: True
   ceiling_cost_SR: 0.13
   cost_multiplier: 0.0
   pool_gross_returns: True
   equalise_gross: False


As you'll know from reading my book, I like the idea of culling expensive trading rules; this leaves the optimiser with less to do. I really like the idea of applying cost weighting (versus, say, multiplying costs by some arbitrary figure), which means optimising on gross returns. Equalising the returns before costs seems a bit extreme to me; I still like the idea of really poor trading rules being downweighted, even if they are the cheap ones. Pooling gross returns but using instrument specific costs strikes me as the most logical route to take.

Eurostoxx (cheap)
V2X (expensive)

A note about other optimisation methods


The other two methods in the package are shrinkage and one shot optimisation (just a standard mean variance). I don't recommend using the latter, for reasons I've mentioned many, many times before. However if you're using shrinkage be careful: the shrinkage of net return Sharpe Ratios will cancel out the effect of costs. So again I'd suggest using a cost ceiling, setting the cost multiplier to zero, then applying a cost weighting adjustment; the same setup as for bootstrapping.

forecast_cost_estimates:
   use_pooled_costs: False
   use_pooled_turnover: True ## even when pooling I recommend doing this
forecast_weight_estimate:

   method: shrinkage
   equalise_SR: False
   ann_target_SR: 0.5
   equalise_vols: True
   shrinkage_SR: 0.90
   shrinkage_corr: 0.50



   apply_cost_weight: True
   ceiling_cost_SR: 0.13

   cost_multiplier: 0.0
   pool_gross_returns: True
   equalise_gross: False

The Eurostoxx weights are very similar to those with bootstrapping. For V2X shrinkage ends up putting about 60% in carry with the rest split between the 3 slowest ewmac variations. This is more a property of the shrinkage method, rather than the costs. Perhaps the shrinkage isn't compensating enough for the uncertainty in the correlation matrix, which the bootstrapping does.

If you don't want to pool gross returns you might want to consider a higher shrinkage factor as there is less data (one of the reasons I don't like shrinkage is the fact the optimum shrinkage varies depending on the use case).
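To see why heavy shrinkage of net Sharpe Ratios cancels out costs, here's a toy calculation in plain python (illustrative numbers, not output from the package):

def shrunk_sr(estimated_sr: float, target_sr: float = 0.5,
              shrinkage: float = 0.90) -> float:
    # With 90% shrinkage only 10% of any cost difference between rules
    # survives into the Sharpe Ratio the optimiser actually sees
    return shrinkage * target_sr + (1 - shrinkage) * estimated_sr

cheap_rule_net_sr = 0.50    # say gross SR 0.6, annual costs 0.1 SR
dear_rule_net_sr = 0.20     # same gross SR, annual costs 0.4 SR
print(shrunk_sr(cheap_rule_net_sr))   # 0.5
print(shrunk_sr(dear_rule_net_sr))    # 0.47: a 0.3 SR cost gap shrinks to 0.03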

There's a final method, equal_weight, which will give equal weights (duh). However it can be used in combination with ceiling_cost_SR and apply_cost_weight.


Instrument weights


In my book I actually suggested that ignoring costs for instrument trading subsystems is a perfectly valid thing to do. In fact I'd be pretty happy to equalise Sharpe ratios in the final optimisation and just let correlations do the talking.

Note that with instrument weights we also don't have the benefit of pooling. Also by using a ceiling on costs for trading rule variations it's unlikely that any instrument subsystem will end up breaking that ceiling.

Instrument weights
Just for fun I ran an instrument weight optimisation with the same cost parameters as I suggested for forecasts. There is a smaller allocation to V2X and Eurostoxx, but that's probably because they are fairly highly correlated. One is the cheapest and the other the most expensive market, so the lower weights are not driven by costs!


Summary


Key points from this post:


  • Pooling gross returns data across instruments - good thing
  • Using instrument specific costs - important
  • Filtering out trading rules that are too expensive - very very important
  • Using a post optimisation cost weighting - nice, but not a dealbreaker
  • Using costs when estimating instrument weights - less important

Happy optimising.

24 comments:

  1. This one requires some thought.
    If we were to just run the optimisation with our own config with NO mention of how costs are treated (e.g., no reference to 'apply_cost_weight', 'pool_costs', 'ceiling_cost', etc), what are the defaults that pysystemtrade reverts to?

    1. The file systems.provided.defaults.yaml tells you this. eg

      forecast_weight_estimate:
      func: syscore.optimisation.GenericOptimiser
      pool_instruments: True
      pool_costs: False
      equalise_gross: False
      cost_multiplier: 0.0
      apply_cost_weight: True
      ceiling_cost_SR: 0.13

  2. It's probably not a picture of Bayes but of an English churchman who lived about 100 years after Bayes. Rev. Bayes would have worn a wig in his portrait.

    1. Glad I didn't say I was 100% sure. I'm bowled over by the erudite knowledge of the readers of this blog...

  3. I am wondering about optimisation.py in pysystemtrade. There are some adjustment factors (adj_factors) for the Sharpe Ratio difference. How are these calculated?

    1. They are from chapter 4 of my book.

  4. Hi Rob...just finished reading your book - my wife couldn't believe I was reading "a text book" after dinner (as she rudely called your opus) rather than watching Netflix!! Great stuff - finished it in two days and will go for a slower read through soon.

    One first thought that has bubbled up already, in terms of costs: would I be correct in thinking that selecting from high liquidity/tight spread ETFs with a US$2 round trip cost at IB (things like: SPY,GDX,EEM,XLF,EWJ,XIV,IWM,EFA,QQQ,FXI,USO,VWO,GDXJ,HYG,SLV,XLV,TLT), with the advantages of highly granular instrument blocks, would be acceptable in terms of meeting the trade cost requirements for something like the EWMAC 8,32 (and slower)?

    I realise that for accounts above US$100K futures are always the way to go (refer https://www.peterlbrandt.com/trading-futures-forex-related-etfs-foolish-way-manage-trading-capital/ ) but for smaller accounts is my thinking on the right track?

    Cheers, Donald

    1. You are probably on the right track. Of those ETFs I have only looked at SPY using the cost model in chapter 12, but it was certainly cheap enough to trade like a future.

    2. My second thought bubbling up, is would I be way off the "get to 10" concept if I did something like this?: use the read-outs from a Percent Price Oscillator, multiply each by 2 (this is for four EWMAVs), and cap at total of 20 (refer here for a qqq example http://schrts.co/Pgqbfg ) and here for more on the PPO ( http://bit.ly/2aFOCrx )

      So, to illustrate the "nitty-gritty", for QQQ at the moment, the PPOs read 1.758, 2.657, 2.904, 2.526 which sum to 9.845. Multiply this by 2, brings up to a current QQQ signal strength of 19.69

      Regards, Donald

    3. I am not familiar with the PPO. It doesn't seem to have a unitless scale. It will change scale in different vol environments. So I wouldn't use it.

    4. Hi Rob

      Agreed.... ;-) I'm just on my second read through, and came across page 112 and page 160. These two make it very clear and easy to understand. The PPO fails in terms of not being proportional to expected risk adjusted returns...that's the whole enchilada right there, as the Americans might say!

      By the way, thanks for putting together the excellent MS Excel files on the resource page....you've gone beyond the extra mile.

      Cheers, Donald

  5. ...actually...on second thoughts, maybe not even bother with the multiplier of two. Just sum up the PPOs (and maybe ditch the fastest EWMAC as it's a bit quick for most things?).

  6. Hi Rob,

    How often do you rebalance cross-sectional strategies?

  7. Hi Rob,

    In your code there is a function named 'apply_cost_weighting'. If I debug the code using your data, I receive the following weight adjustment factors for the different rule variations:
    - carry : 0.959611 (rule with lowest SR cost)
    - ewmac16_64 : 0.990172
    - ewmac32_128 : 0.976361
    - ewmac4_16 : 1.141907 (rule with highest SR cost)
    - ewmac64_256 : 0.969992
    - ewmac8_32 : 1.036545

    This result is the opposite of what I would expect: the cheapest rule should get the highest weight adjustment factor, and the most expensive the lowest.

    Is it possible that this rule :
    relative_SR_costs = [cost - avg_cost for cost in ann_SR_costs]
    should be
    relative_SR_costs = [-1*cost - avg_cost for cost in ann_SR_costs] ?

    Kris

    1. No, it should be working on returns, not costs. Bug fixed in the latest release.

    2. With this new release the result looks fine, thnx

  8. Rob,

    I've a question on bootstrapping in combination with pooling gross returns.
    In pysystemtrade I see that you use bootstrapping based on random individual daily returns (also described in your book, p 289). But as you mention on p 290, it could be better to use 'block bootstrapping'.
    For the pooling of the returns you stack up the returns by giving them an extra second in the date-time field. After that you select random returns. But this isn't possible if we want to use block bootstrapping. I was thinking of working like this:
    - get P&L from instrument 1
    - get P&L from instrument 2
    - get P&L from instrument 3
    - stitch P&L of all this instruments :
    --> P&L instrument 1 - P&L instrument 2 - P&L instrument 3
    - create a virtual account curve from the stitched P&L's
    - do block bootstrapping

    Is this reasonable to do, or am I overlooking something?

    Kris

    1. Yes, this is exactly right. You glue together the p&l, so for example with 10 instruments of 10 years each you'd have a single 100 year long account curve. But this will make life complicated, and may also run into problems with linux date storage. Instead keep the instruments in separate buckets: say with 1 year blocks, and 10 instruments each with 10 years of data, you'd have 100 blocks (so populate a list of the blocks, rather than stitching). Then when you bootstrap, pick N blocks at random with replacement from those 100 blocks, and stitch those together to create a set of data on which you run your single optimisation. Hope that makes sense.

    2. Yes this make sense and it's an interesting idea. I will try this out.

  9. Rob,

    Just trying to get my head around the difference between
    i) Calculate weights using gross returns, and adjust subsequently for cost levels
    apply_cost_weight: True
    cost_multiplier: 0

    ii) using half-pooling
    apply_cost_weight: False
    cost_multiplier: 1.0

    I only vaguely understand the difference and would appreciate more clarity!

    Many thanks

    Chris

    1. i: optimisation is done without costs (pure pre-cost returns), and then the weights that come out are adjusted with more expensive assets seeing their weights reduced.

      ii: optimisation is done with post cost returns, but then weights aren't adjusted afterwards

  10. Rob, I bought your book and am in the process of replicating the logic to confirm my understanding. I have some questions regarding cost estimation. I can grasp the standardised cost, i.e. the SR cost of a round trip, yet I find the estimation of turnover confusing (at the forecast level, instrument level and subsystem level).

    In your book you used the position level as an example, and provided a formula: average position change / (2 * average holding) per year. Firstly, I do not understand why normalisation is required for this. If a round trip for a block value of an instrument costs 0.01 SR units, and the position changes by, say, 10 round trips, doesn't this imply 0.01 * 10 = 0.10 SR units, which should be subtracted from the expected return at the various optimisation stages?

    Furthermore, if I look at what is implemented in your python library, it seems that the factor of 2 is missing (at least at the forecast level). Could you clarify this?

    Thirdly, it is not clear how to estimate turnover at the forecast level as opposed to the position level. The forecast generated is only proportional, not equal, to the estimated SR for a rule variation, so wouldn't the cost estimation as described in the book need to be scaled by an unknown factor?

    As a separate note, on the carry strategy: for various futures strategies the front and back (2nd front) futures are sufficient to use as a proxy for carry, yet this is not true for commodities, where seasonality can play a big role. I wonder how you pick the right pair of futures systematically? Moreover, is there any point in looking at the whole term structure?
