Tuesday 10 November 2015

Random data: Evaluating "Trading the equity curve"

Everyone hates drawdowns (those periods when you're losing money whilst trading). If only there was a way to reduce their severity and length....

Quite a few people seem to think that "trading the equity curve" is the answer. The basic idea is that when you are doing badly, you reduce your exposure (or remove it completely) whilst still tracking your 'virtual' p&l (what you would have made without any interference). Once your virtual p&l has recovered you pile back into your system. The idea is that you'll avoid some of the losses whilst your system is turned off. It sounds too good to be true... so is it? The aim of this post is to try and answer that question.

This is something that I have looked at in the past, as have others, with mixed results. However all the analysis I've seen or done myself has involved looking at backtests of systems based on actual financial data. I believe that to properly evaluate this technique we need to use large amounts of random data, which won't be influenced by the fluke of how a few back tests come out. This will also allow us to find which conditions will help equity curve trading work, or not.

This is the second post in a series on using random data. The first post is here.
 

How do we trade the equity curve?


I'm going to assume you're reasonably familiar with the basic idea of equity curve trading. If not, then it's probably worth perusing this excellent article from Futures magazine.

An equity curve trading overlay will comprise the following components:

  • A way of identifying that the system is 'doing badly', and quantifying by how much.
  • A rule for degearing the trading system given how badly it is doing.
  • A second rule for regearing the system once the 'virtual' account curve is 'doing better'.

I've seen two main ways for identifying that the system is doing badly. The first is to use a straightforward drawdown figure. So, for example, if your drawdown exceeds 10% then you might take action.

(Sometimes rather than the 'absolute' drawdown, the drawdown since the high in some recent period is considered)

The second variation is to use a moving average (or some other similar filter) of the account curve. If your account curve falls below the moving average, then you take action.

(There are other variations out there; in particular I noticed that the excellent Jon Kinlay blog had a more complex variation.)

As for degearing your system, broadly speaking you can either degear it all in one go, or gradually. Usually if the account curve dips below its moving average then it is suggested that you cut your position entirely.

Whilst if you are using the current drawdown as your indicator, then you might degear gradually: say by 20% when you hit a 10% drawdown; then by another 20%, for a total of 40%, when you hit a 20% drawdown; and so on.

Note that this degearing will be in addition to the normal derisking you should always do when you lose money; if you lose 10% then you should derisk your system by 10% regardless of whether you are using an equity curve trading overlay.

Finally the 'regearing' rule is normally the reverse of the degearing rule and process. 
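To make the drawdown-based variant concrete, here is a minimal sketch of a gradual degearing rule. The thresholds (cut 20% of exposure for every 10% of virtual drawdown) are purely illustrative, and the moving average method actually tested later in this post works differently.

def overlay_multiplier(virtual_curve, step_dd=0.10, step_cut=0.20):
    ## virtual_curve: list/array of cumulative 'virtual' returns, in
    ## fraction-of-capital terms; thresholds are illustrative only
    peak = max(virtual_curve)
    drawdown = peak - virtual_curve[-1]        ## drawdown from the high water mark
    steps = int(drawdown / step_dd)            ## one step per 10% of drawdown
    return max(0.0, 1.0 - step_cut * steps)    ## degear by 20% per step

## e.g. with a 25% virtual drawdown we'd run at 1 - 2*0.2 = 60% of normal size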


Prior research


The idea of trading the equity curve is something that seems to have bypassed academic researchers (unless they are calling it something else - please write in if you know of any good research), so rather than a formal literature review I had a quick look at the first page of Google.

Positive


  • adaptrade
  • crazy ivan (whether he's a reliable source I don't know...)
  • Jon Kinlay (though not the normal system)


Negative (or at least no clear benefit)


  • Futures magazine (and also here)
  • Anthony Garnett
  • futures.io
  • r-bloggers



Why random data?


I personally find the above research interesting, but not definitive one way or another. My main issue is that it was all done on different financial instruments and different kinds of trading systems, which unsurprisingly gave different results. This might be because there is something 'special' about those instruments where equity curve trading worked, but it's more likely to be just dumb luck. Note: it is slightly more plausible that different kinds of trading rules will give different results; we'll explore this below.

I personally think that we can't properly evaluate this kind of overlay without using random data. By generating returns for different arbitrary trading strategies we can then judge whether on average equity curve trading will be better.

Another advantage of using random data to evaluate an equity curve overlay is that we avoid potential overfitting. If we run one version of the overlay on our system, and it doesn't work, then it is very tempting to try a different variation until it does work. Of course we could 'fit' the overlay 'parameters' on an out of sample basis. But this is quite a bit of work; and we don't really know if we'd have the right parameters for that strategy going forward, or if they just happened to be the best for the backtest we ran.

Finally, using random data means we can discover which characteristics of a trading system will allow trading the equity curve to work, or not.


Designing the test


Which overlay method?


There are probably an infinite variety of methods for doing an equity curve trading overlay (of which I touched on just a few above). However, to avoid making this already lengthy post the size of an encyclopedia, I am going to limit myself to testing just one method. In any case I don't believe that the results for other methods will be substantially different.

I'm going to focus on the most popular, moving average, method:

 "When the equity curve falls below it's N day moving average, turn off the system. Keep calculating the 'virtual' curve, and it's moving average during this period. When the 'virtual' curve goes back above it's moving average then turn the system back on"

That just leaves us with the question of N.  Futures magazine uses 10, 25 and 40 days and to make life simple I'll do the same. However to me at least these seem incredibly short periods of time. For these faster N trading costs may well overwhelm any advantage we get (and I'll explore this later).

Also wouldn't it be nice to avoid the 3 year drawdown in trend following that happened between 2011 and 2013? Because we're using random data (which we can make as long as we like) we can use longer moving averages which wouldn't give us meaningful results if we tested just a few account curves which were 'only' 20 years long.

So I'll use N=10, 25, 40, 64, 128, 256, 512

(In business days; 2 weeks, 5 weeks, 8 weeks, 3 months, 6 months, 1 year, 2 years)

import pandas as pd

## the post uses N=1000 to mean "don't apply a filter at all"
NO_OVERLAY=1000


def isbelow(cum_x, mav_x, idx):

    ## returns 1.0 if the account curve is at or above its moving average at idx
    ## (keep trading), 0.0 otherwise (go flat)

    if cum_x.values[idx]>=mav_x.values[idx]:
        return 1.0
    return 0.0


def apply_overlay(x, N_length):
    """
    apply an equity curve filter overlay

    x is a pd time series of returns

    N_length is the moving average length to apply

    Returns a new x with 'flat spots'

    """
    if N_length==NO_OVERLAY:
        return x

    cum_x=x.cumsum()
    mav_x=cum_x.rolling(N_length).mean()   ## pd.rolling_mean(cum_x, N_length) in older pandas
    filter_x=pd.Series([isbelow(cum_x, mav_x, idx) for idx in range(len(x))], x.index)

    ## can only apply with a lag (!)
    filtered_x=x*filter_x.shift(1)

    return filtered_x
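As a usage sketch (my own illustration, not part of the original code), applying a 64 day filter to an arbitrary daily return series:

import numpy as np

## hypothetical usage: a random daily return series, then filter it
dates = pd.bdate_range("2000-01-03", periods=2500)
raw_returns = pd.Series(np.random.default_rng(1).normal(0.0004, 0.01, len(dates)), dates)

filtered_returns = apply_overlay(raw_returns, 64)   ## flat whenever the curve is below its 64 day MA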

       

Which criteria?


A nice way of thinking about equity curve trading is that it is a bit like buying insurance, or in financial terms a put option on your system's performance. If your system does 'badly' then the insurance policy prevents you from losing too much.

One of my favourite acronyms is TINSTAAFL (there is no such thing as a free lunch). If we're buying insurance, or an option, then there ought to be a cost to it. Since we aren't paying any kind of explicit premium, the cost must come in the form of losing something in an implicit way. This could be a lower average return, or something else that is more subtle. This doesn't mean that equity curve trading is automatically a bad thing - it depends on whether you value the lower maximum drawdown* more than the implicit premium you are giving up.

* This assumes we're getting a lower maximum drawdown - as we'll see later this isn't always the case.

I can think of a number of ways of evaluating performance which try and balance risk and reward. Sadly the most common, Sharpe Ratio, isn't appropriate here. The volatility of the equity curve with the overlay on will be, to use a technical term, weird - especially for large N. Long periods without returns will be combined with periods when return standard deviation is normal. So the volatility of the curve with an overlay will always be lower; but it won't be a well defined statistic. Higher statistical moments will also suffer.

Instead I'm going to use the metric return / drawdown. To be precise I'm going to see what effect adding the overlay has on the following account curve statistics:

  • Average annual return
  • Average drawdown
  • Maximum drawdown
  • Average annual return / average drawdown
  • Average annual return / maximum drawdown
(Note that return / drawdown is a good measure of performance as it is 'scale invariant'; if you double your leverage then this measure will be unchanged.)

My plan is to generate a random account curve, and then measure all the above. Then I'll pass it through the equity overlay, and remeasure the statistics.
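As a rough sketch (my own illustration; the actual code is linked later in the post) of how these statistics might be computed from a daily return series:

def curve_statistics(returns, days_in_year=256):
    ## returns: pd.Series of daily returns in fraction-of-capital terms
    curve = returns.cumsum()
    drawdown = curve.cummax() - curve          ## drawdown from the high water mark
    ann_return = returns.mean() * days_in_year
    avg_dd = drawdown.mean()
    max_dd = drawdown.max()
    return dict(ann_return=ann_return,
                avg_drawdown=avg_dd,
                max_drawdown=max_dd,
                ret_to_avg_dd=ann_return / avg_dd if avg_dd > 0 else None,
                ret_to_max_dd=ann_return / max_dd if max_dd > 0 else None)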

Finally just to note that for most of this post I won't be considering costs. For small N, given we are closing our entire strategy down and then restarting it potentially every week, these could be enormous. Towards the end I will give you an idea of how sensitive the findings are to the costs of different trading instruments.


Which equity curve characteristics?


Broadly speaking the process for using random data is:

  • Identify the important characteristics of the real data you need to model
  • Calibrate against some real data
  • Create a process which produces random data with the necessary characteristics
  • Produce the random data, and then do whatever it is you need to do

Notice that an obvious danger of this process is making random data that is 'too good'. In an extreme case, with enough degrees of freedom, you could end up producing 'random' data which looks exactly like the data you calibrated it against! There is a balance between having random data that is realistic enough for the tests you are running, and data that is 'over calibrated'.


Identification


What characteristics of a trading system returns will affect how well an equity curve overlay will work? As in my previous post I'll be producing returns that have a particular volatility target - that won't affect the results.

They will also have a given expected Sharpe Ratio. With a negative Sharpe Ratio an equity curve overlay should be fantastic - it will turn off the bad system. With a high positive Sharpe Ratio it will probably have no effect (at least for large enough N). It's in the middle that things get more interesting. I'll test Sharpe Ratios from -2 to +2.

My intuition is that skew is important here. Negative skew strategies could see their short, sharp losses reduced. Positive skew strategies, such as trend following, which tend to see slow 'bleeds' in capital, might be improved by an overlay (and this seems to be a common opinion amongst those who like these kinds of systems). I'll test skew from -2 (average for a short volatility or arbitrage system) to +1 (typical of a fast trend following system).

Finally I think that autocorrelation of returns could be key. If we tend to get losses one after the other then equity curve trading could help turn off your system before the losses get too bad.


Calibration


First then for the calibration stage we need some actual returns of real trading systems.

The systems I am interested in are the trading rules described in my book, and in this post: a set of trend following rule variations (exponentially weighted moving average crossover, or EWMAC for short) and a carry rule.

First skew. The stylised fact is that trend following, especially fast trend following, is positive skew. However we wouldn't expect this effect to occur at frequencies much faster than the typical holding period. Unsurprisingly daily returns show no significant skew even for the very fastest rules. At a weekly frequency the very fastest variations (2,8 and 4,16) of EWMAC have a skew of around 1.0. At a monthly frequency the variations (8,32 and 16,64) join the positive skew party; with the two slowest variations having perhaps half that.

Carry doesn't see any of the negative skew you might expect from say just fx carry; although it certainly isn't a positive skew strategy.

There is plenty of research showing that trend following rules produce returns that are typically negatively autocorrelated eg http://www.valuewalk.com/2015/09/autocorrelation-of-trend-following-returns-illusion-and-reality/ and  http://www.trendfollowing.com/whitepaper/newedge.pdf. The latter paper suggests that equities have a monthly autocorrelation of around +0.2, whilst trend following autocorrelations come in around -0.3. Carry doesn't seem to have a significant autocorrelation.

My conclusion is that for realistic equity curves it isn't enough just to generate daily returns with some standard deviation and skew. We need to generate something that has certain properties at an appropriate time scale; and we also need to generate autocorrelated returns.
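For example (a sketch of my own, using pandas resampling), this is roughly how skew and first order autocorrelation can be measured at different frequencies from a daily return series:

def multi_frequency_stats(daily_returns):
    ## daily_returns: pd.Series of daily returns indexed by date; summing
    ## daily returns is only an approximation to compounded weekly/monthly returns
    weekly = daily_returns.resample("W").sum()
    monthly = daily_returns.resample("M").sum()
    return dict(daily_skew=daily_returns.skew(),
                weekly_skew=weekly.skew(),
                monthly_skew=monthly.skew(),
                monthly_autocorr=monthly.autocorr(lag=1))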

I'll show the effect of varying skew between -2 and +1, autocorrelation between -0.3 and +0.3, and Sharpe Ratio between -1 and +2.


How do we model?

In the previous post I showed how to generate skewed random data. Now we have to do something a bit fancier. This section is slightly technical and you might want to skip if you don't care where the random data comes from as long as it's got the right properties.

The classic way of modelling an autocorrelated process is to create an autoregressive AR1 model* (note I'm ignoring higher autoregression to avoid over calibrating the model).

* This assumes that the second, third, ... order autocorrelations follow the same pattern as they would in an AR1 model.

So our model is:

r_t = rho * r_(t-1) + e_t

Where rho is the desired autocorrelation and e_t is our error process: here it's skewed gaussian noise*.

* Introducing autocorrelation biases the other moments of the distribution. I've included corrections for this which work for reasonable levels of abs(rho)<0.8. You're unlikely to see anything like this level in a real life trading system.

This python code shows how the random data is produced, and checks that it has the right properties.
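As a rough sketch of the idea (my own simplification, not the linked code: it uses scipy's skewnorm as a stand-in for the skewed noise, and only corrects the volatility for the AR(1) effect; the skew correction mentioned above is messier):

import numpy as np
import pandas as pd
from scipy.stats import skewnorm

def autocorrelated_returns(n_days, rho, ann_sharpe, skew_shape, ann_vol=0.16, seed=None):
    ## daily returns following r_t = rho * r_(t-1) + e_t with skewed noise;
    ## skew_shape is skewnorm's shape parameter (mapping it to a target skew
    ## would need calibration)
    rng = np.random.default_rng(seed)
    daily_vol = ann_vol / 16.0                   ## sqrt(256) business days
    daily_mean = ann_sharpe * ann_vol / 256.0

    ## skewed noise, standardised to zero mean and unit variance
    e = skewnorm.rvs(skew_shape, size=n_days, random_state=rng)
    e = (e - skewnorm.mean(skew_shape)) / skewnorm.std(skew_shape)

    ## an AR(1) process has variance sigma_e^2/(1-rho^2), so shrink the noise
    ## to keep the target volatility
    e = e * daily_vol * np.sqrt(1.0 - rho ** 2)

    r = np.zeros(n_days)
    for t in range(1, n_days):
        r[t] = rho * r[t - 1] + e[t]

    return pd.Series(r + daily_mean, pd.bdate_range("2000-01-03", periods=n_days))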

Now, how do we deal with different behaviour at different frequencies? There are very complicated ways of dealing with this (like using a Brownian bridge), but the simplest is to generate returns at a time scale appropriate to the speed of the indicator. This implies weekly returns for carry and the fast EWMAC rules (2,8 and 4,16); and monthly returns for the slower EWMAC rules. (If you're trading a very fast trading rule then you should generate daily data, assuming you see a clear return pattern at that frequency.)

I'll use daily returns for the rest of this post, but I've checked that the results still hold true at weekly and monthly frequencies (using equity curve filter lookbacks at least 3 times longer than the frequency of returns we're generating).


Results


To recap I am going to be TESTING different lookbacks for the equity curve filter; and I am going to be GENERATING returns with skew between -2 and +1, autocorrelation between -0.4 and +0.4, and Sharpe Ratio between -1 and +2.

I'll keep standard deviation of returns constant, since that will just change the overall scale of each process and not affect the results. I'll look at the results with daily returns. The results won't be significantly different with other periods.

All the code you need is here.

Sharpe Ratio


In this section I'll be varying the Sharpe Ratio whilst keeping the skew and autocorrelation fixed (at zero, and zero, respectively).

scenario_type="VaryingSharpeRatio"
period_length=1 ### daily returns

Average annual return


All of the plots that follow have the same format. Each line shows a different level of return characteristic (Sharpe Ratio in this case). The x axis shows the equity curve filter N day count that we're using for the moving average. Note that N=1000, which is always on the right hand side, means we aren't using a filter at all. The y axis shows the average value of the statistic of interest (in this case average annual return) across all the random equity curves that we generate and filter.


The good news is that if you know your trading system is rubbish, then applying an equity curve system, preferably with large N, improves the performance. Of course if you knew your system was rubbish then, rather than using a complicated filter to turn it off, you wouldn't bother turning it on at all! However for all profitable equity curves equity curve trading reduces, rather than increases, your returns.


Average drawdown

 

Again for systems which break even or lose money the average drawdown is lower with an equity curve trading system, as you might expect; again especially for large N. However for profitable systems there is no benefit, and average drawdowns may even be slightly worse.

Maximum drawdown



For profitable systems there might perhaps be a modest reduction in maximum drawdown for small values of N. For loss making systems the biggest reduction in drawdown is for large values of N; although all filters are better than none.

Average annual return / average drawdown


Let's now try and put together the return and drawdown into a simple statistic. For unprofitable systems the overlay makes no difference. For profitable systems it reduces the drawdown adjusted return (the 'hump' at the right hand side of the SR=2.0 line is an artifact caused by the fact we can't calculate this statistic when the average drawdown is zero).

Average annual return / maximum drawdown



This statistic tells the same story. For a profitable system applying an equity curve overlay reduces the average return / max drawdown ratio; with faster overlays (small N) probably worse (and they would be much, much worse with trading costs applied). If you have a system that definitely loses money then applying an overlay, of any lookback, will mitigate your losses.


Skew

In this section I'll be varying the Skew whilst keeping the Sharpe Ratio and autocorrelation fixed (at one, and zero, respectively).

scenario_type="VaryingSkew"
period_length=1 ## daily returns



Average annual return / average drawdown

To save time let's jump ahead to the calculated statistics. Bluntly my intuition was wrong; skew makes no difference. The overlay harms returns for all skew values shown here.

Average annual return / maximum drawdown

 There is a similar story for return / max drawdown.


Autocorrelation

In this section I'll be varying the Autocorrelation whilst keeping the skew and Sharpe Ratio fixed (at zero, and one, respectively).

scenario_type="VaryingAuto"
period_length=1

Average annual return

Well hang on... we have a result. If you have negative or zero autocorrelation then adding an equity curve overlay will make your returns much worse. But if you have positive autocorrelation it will improve them, with faster overlays doing best (remember we're ignoring costs here). This makes sense. After all if something has positive autocorrelation then we'd want to trend follow it.

However as we've already discussed trend following systems seem to have negative autocorrelation. So it looks like equity curve overlays are a no-no for a trend following system.

Average drawdown

Again drawdowns are improved only if the autocorrelation is positive. They are much worse if it is negative (note that average drawdowns are smaller for negatively autocorrelated systems anyway).

Maximum drawdown

There is a similar picture for maximum drawdown.

Average annual return / average drawdown



I've added more lines to this plot (scenario_type="VaryingAutoMore") to see if we can find the 'break even' point at which you should use an equity curve overlay. It looks like the break even point is an autocorrelation of 0.2 or greater, combined with an N length of 40 days or less.

Remember that I haven't included costs in any of these calculations. For information, the annualised turnover* added to each system by the equity curve filter ranges from around 23 for an N_length of 10 to 1.2 for an N_length of 512. With the cheapest futures I trade (standardised cost of 0.001 SR units per year, for something like NASDAQ) this is not a major problem, reducing average return with an N length of 10 by around 1%.

* See chapter 12 of my book for details of how I calculate turnover and standardised costs

However let's see the results using a more expensive future, the Australian interest rate future, with a cost of around 0.03 SR units per year.

scenario_type="VaryingAutoMore"
period_length=1
annualised_costs_SR=0.03


With costs, smaller values of N look much worse. I'd suggest that at this cost level you need a positive autocorrelation of at least 0.2 before even considering trading the equity curve.

Average annual return / maximum drawdown

As before it looks like an autocorrelation of 0.1 or more will be enough to use equity curve trading; but if we apply costs as before we get this picture:

 ... and again only a higher autocorrelation will do.

Conclusion


The idea that you can easily improve a profitable equity curve by adding a simple moving average filter is, probably, wrong. This result is robust across different positive Sharpe Ratios and levels of skew. Using a shorter moving average for the filter is worse than using a longer one, even if we ignore costs.

There is one exception. If your trading strategy returns show positive autocorrelation then applying a filter with a relatively short moving average will probably improve your returns, but only if your trading costs are sufficiently low.

However if your strategy is a trend following strategy, then it probably has negative autocorrelation, and applying the filter will be an unmitigated disaster.

This is the second post in a series on using random data. The first post is here. The next post on portfolio optimisation is here.

10 comments:

  1. Another solid post, Rob. Why do you prefer SR over the return/drawdown measure that was used in this post? Throughout the book, you use the SR to standardise risk, but isn't the ret/dd ratio a more specific measure of that?

  2. I can think of a few reasons:

    Interpretability

    I have been using Sharpe Ratios a long time, and I don't really have a feel for other performance statistics. If you told me something had a calmar ratio of 0.5 I wouldn't have a clue what you were talking about, and even if you explained that meant the average annual return to max drawdown was 0.5 I still wouldn't know if that was good or not.


    Consistency

    I create forecasts as proportional to expected returns scaled for standard deviation, which is basically a Sharpe Ratio. I also calculate risk for position scaling by using the standard deviation. Consistently using standard deviation as a measure of risk has its flaws, but at least I'm using the same measure throughout my system.


    Symmetry:

    Although the downfall of SR is that it assumes symmetry, I actually think it is a good thing to be as scared of an expected rise as a fall.


    Statistical robustness:

    When we calculate the standard deviation we use the whole distribution. With average drawdown we don't, and with maximum drawdown we use only a single point. Return / max drawdown in particular is a terrible statistic to use with real data, since the value it comes out at is largely random.

  3. I've tried using a simple MA on win rates for several mean-reversion systems, and the results have always been dismal. Except for that one time I forgot to check for a future leak, and that one worked like a charm! 😃 The problem, as I saw it, is that these systems all had a very high win rate, 60-75%, and so losing trades were few and far between. There was no clustering to speak of, so the switching on and off merely removed good trades.

    -Matt

    Http://www.throwinggoodmoney.com

    1. Interesting. I guess a system where you cut after the drawdown reached X% might protect you from a drawdown larger than you ever saw in simulation; i.e. the situation when the MR relationship breaks down.

    2. Thanks Rob, I'll have to look into that. Haven't tried anything using the drawdown as a filter. I suspect that it might not work, but my intuition is usually wrong about trading. Which is why I prefer robots and computers. These sorts of shortish MR trades tend to have many smaller wins, with the occasional invigoratingly large loss. A system that shuts off on drawdown thresholds might simply reduce the amount of little wins needed to scrape back into the black. But until I've tried it, I won't know.

      Thanks for your comments and your blog. I've added it to my regular reading list.

  4. Hi Rob,
    Supposing I have a list of transactions in CSV format from running a strategy in a different programming language, i.e. strategy returns, which script in PST should I target for feeding these in? Forecasting.py?

    1. Yes, assuming you are only interested in whether the other strategy produces longs or shorts, which you would then translate into forecasts of +10 and -10

  5. Rob, sincere thanks for your superb blog and books. I had one of those embarrassing Dunning-Kruger moments when I first came across your stuff (hobby traders, amirite?). Anyway, I designed the crazy ivan system you mentioned (about 10 years ago when I was just starting to design proper systems) and it was, unsatisfyingly, one of those ideas that looks great on paper and failed miserably in the real world. Turns out that particular system (and others) performs most strongly coming immediately out of drawdown, and equity curve trading missed out on that performance, which made it a poor idea in risk adjusted and absolute terms. Of course comms were lower so it was better, but not enough to justify the complexity. I eventually rejected all these types of things as "polishing the turd"... ideas that were marginal and needed tricks to push them over the line of profitability. Suggest you delink it, I wouldn't want people getting the wrong idea and going down a dead end.

    1. Hi Scott. Remind me what the crazy ivan system is??

    2. You linked to it in your post, but essentially it was a collection of objective rules based around classical price patterns like double bottoms and retests. Unfortunately it was my first real attempt at building an automated trading system, so it was fragile; it did very well for a year and then fell apart into a heap. These days I run a mix of trend (futures), momentum (stocks) and I adapted Andreas Clenow's momentum system for crypto (which has taken full advantage of the mania there, I've never had a win quite like it). Anyway, I'm sincerely indebted to you for writing your books. Seriously, without people like you and Andreas Clenow I'd still be doing stupid things which absolutely are guaranteed to blow me up eventually. I wish you long life, happiness and wealth. You are making a significant contribution to my world.


Comments are moderated. So there will be a delay before they are published. Don't bother with spam, it wastes your time and mine.