Monday, 5 May 2025

Can I build a scalping bot? A blogpost with numerous double digit SR

I did a post recently, from which I shall now quote:

Two minute to 30 minute horizon: Mean reversion works, and is most effective at the 4-8 minute horizon from a predictive perspective; although from a Sharpe Ratio angle it's likely the benefits of speeding up to a two minute trade window would overcome the slight loss in predictability. There is no possibility that you would be able to overcome trading costs unless you were passively filled, with all that implies... Automating trading strategies at this latency - as you would inevitably want to do - requires some careful handling (although I guess consistently profitable manual scalpers do exist that aren't just roleplaying instagram content creators, I've never met one). Fast mean reversion is also of course a negatively skewed strategy so you will need deep pockets to cope with sharp drawdowns. Trade mean reversion but proceed with great care.

Am I the only person who read that and thought.... well if anyone can build an automated scalping bot... surely I can? 


To be precise, something with a horizon of somewhere around the 4-8 minute mark, but we'd happily go a bit slower (though not too much, lest we hit the dreaded region where mean reversion stops working), or even a bit quicker.

In fact I already have the technology to build one (at least in futures). There's a piece of code in my open source trading system called the (order) stack handler. This code exists to manage the execution of my trades, and implements my simple trading algo, which means it can do stuff like this: 

- get streaming prices from interactive brokers

- place, modify and cancel orders, and check their status with IB

- get fills and write them to a database so I can keep track of performance

- work out my current position and check it is synched with IB

I'll need to do a little extra coding and configuration. For example, I will probably inherit from this class to create my little scalping bot. I will also want to partition my trading universe into the scalped instruments, and the rest of my portfolio (it would be too risky to do both on a given market; and my dynamic optimisation means removing a few instruments from my main system won't cause too much of a headache).

All that leaves then is the "easy" job of creating the actual algo that this bot will run... and doing so profitably.

Some messy python (no pysystemtrade required), here, and a sketchy implementation in pysystemtrade here.


Some initial thoughts

I'm going to begin with a bit of a brain dump:

So the basic idea is that we start the day flat, and from the current price we set symmetric up and down bracket limit orders. To simplify things, we'll assume those orders are for a fixed number of contracts (to be determined); most likely one but could be more. Also to be determined is the width of those limit orders.

Unlike the (now discredited) slower mean reversion strategy in my fourth book, we won't use any kind of slow trend overlay here. We're going to be trading so darn quick that the underlying trend will be completely irrelevant. That means we're going to need some kind of stop loss. 

If I describe my scalper as a state machine then, it will have a number of states. Let's first assume we make money on a buy.

A: Start: no positions or orders

- Place initial bracket order

B: Ready: buy limit and sell limit

- Buy limit order taken out

C: Long and unprotected: (existing) sell limit order at a profit

- Place sell stop loss below current level

D: Long and protected: (existing) sell limit order at a profit (higher) and sell stop loss order (lower)

- Hit sell limit profit order

E: Flat and unbalanced: sell stop loss order

- Cancel sell stop loss order

A: No positions or orders

Or if you prefer a nice picture:


Lines are orders, black is price, then green because we are long, then black again when we are flat. Red lines are sell order, dotted line is stop loss, green line is buy order.

What happens next is that we reset, and place bracket orders around the current price, which is assumed to be the current equilibrium price. Note: if we didn't assume it was the current equilibrium we would have the opportunity to leave our stop loss orders in place. But that also assumes we're going to use explicit stops, which is a design decision to leave for now.

The other profitable path we can take is:

B: Ready: buy limit and sell limit

- Sell limit order taken out

F: Short and unprotected: (existing) buy limit order at a profit

- Place buy stop loss above current level

G: Short and protected: (existing) buy limit order at a profit (lower) and buy stop loss order (higher)

- Hit buy limit profit order

H: Flat and unbalanced: buy stop loss order

- Cancel buy stop loss order

A: No positions or orders


Now what if things go wrong?

D: Long and protected: (existing) sell limit order at a profit (higher) and sell stop loss order (lower)

- Hit sell stop loss

J: Flat and unbalanced: sell limit order

- Cancel sell limit order

A: No positions or orders


Alternatively:

G: Short and protected: (existing) buy limit order at a profit (lower) and buy stop loss order (higher)

- Hit buy stop loss

K: Flat and unbalanced: buy limit order

- Cancel buy limit order

A: No positions or orders


That's a neat ten states to consider. Of course with faster trading there is the opportunity for async events which will effectively result in some other states occurring, but we'll return to that later. Finally, we're probably going to want to have a 'soft close' time before the end of the day beyond which we wouldn't open new positions, and then a 'hard close' time when we would close our existing position irrespective of p&l.
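The ten states and the transitions between them can be written down as a lookup table. This is only a minimal sketch of the idea, not the code the bot actually runs: the event names (`place_bracket`, `buy_limit_filled` and so on) are labels I've made up for illustration, and async events, soft close and hard close are all ignored.

```python
from enum import Enum


class State(Enum):
    A = "Start: no positions or orders"
    B = "Ready: buy limit and sell limit"
    C = "Long and unprotected"
    D = "Long and protected"
    E = "Flat and unbalanced: sell stop loss order"
    F = "Short and unprotected"
    G = "Short and protected"
    H = "Flat and unbalanced: buy stop loss order"
    J = "Flat and unbalanced: sell limit order"
    K = "Flat and unbalanced: buy limit order"


# (current state, event) -> next state.
# Event names are hypothetical labels, not broker or pysystemtrade API names.
TRANSITIONS = {
    (State.A, "place_bracket"): State.B,
    # Profitable long
    (State.B, "buy_limit_filled"): State.C,
    (State.C, "place_sell_stop"): State.D,
    (State.D, "sell_limit_filled"): State.E,
    (State.E, "cancel_sell_stop"): State.A,
    # Profitable short
    (State.B, "sell_limit_filled"): State.F,
    (State.F, "place_buy_stop"): State.G,
    (State.G, "buy_limit_filled"): State.H,
    (State.H, "cancel_buy_stop"): State.A,
    # Losing long
    (State.D, "sell_stop_filled"): State.J,
    (State.J, "cancel_sell_limit"): State.A,
    # Losing short
    (State.G, "buy_stop_filled"): State.K,
    (State.K, "cancel_buy_limit"): State.A,
}


def next_state(state, event):
    """Return the state we move to, or complain if the event makes no sense here."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"Event {event!r} not expected in state {state.name}")


def run(events, state=State.A):
    """Apply a sequence of events, starting (by default) from flat with no orders."""
    for event in events:
        state = next_state(state, event)
    return state


winning_long = ["place_bracket", "buy_limit_filled", "place_sell_stop",
                "sell_limit_filled", "cancel_sell_stop"]
print(run(winning_long).name)
```

Any event that arrives in a state where it isn't expected raises an error, which is roughly where the 'other states' caused by async events would have to be handled.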

It's clear that the profitability, risk, and the average holding period of this bad boy are going to depend on what proportion of our trades are profitable, and hence on some parameters we need to set:

- position size

- initial bracket width

- distance to stop loss


... some numbers we need to know:

- contract multiplier and FX

- commissions and exchange fees

- the cost of cancelling orders 


... and also on some values we need to estimate:

- volatility (we'd probably need a short warm up period to establish what that is)

- likely autocorrelation in prices

Note: We're also going to have to consider tick size, which we need to round our limit orders, but also in the limit will result in a minimum bracket width of two ticks (assuming that would still be a profitable thing to do).

Further possible refinements to the system could be to avoid trading if volatility is very high, or is in a period of adjustment to a higher level (which depending on time scale can basically mean the same thing). We could also throw in a daily stop loss on the scalper, if it loses more than X we stop for the day as we're just 'not feeling it'.

Another implementation detail that springs to mind is how we handle futures rolls; since we hope to end the day flat this should be quite easy; we just need to make sure any rolls are done overnight and not trade if the system is in any kind of roll state (see 'tactics' chapter in AFTS and doc here).


Living in a world of OHLC

Now as you know I'm not a big fan of using OHLC bars in my trading strategies - I normally just use close prices. I associate using OHLC prices and creating candle stick charts with the sort of people who think Fibonacci is a useful trading strategy, rather than the in house restaurant at a London accountants office.

However OHLC do have something useful to gift us, which is the concept of the likely trading range over a given time period (the top and the bottom of a candle). 

Let's call that range R, and it's simply the difference between the highest and lowest prices over some time horizon H (to be determined). We can look at the most recent time bucket, or take an average over multiple periods (a stable R could be a good indication of stable volatility which is a good thing for this strategy). So for example if we're looking at '5 minute bars' (I'll be honest it takes an effort to use this kind of language and not throw up), then we could look at the height of the last bar, or take a (weighted?) average over the last N bars.
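As a rough sketch of how R might be estimated from a list of evenly spaced prices: take the high minus low of the most recent bar, or an equally weighted average over the last N bars. The function name, the equal weighting, and the choice to count bars backwards from the latest price are all my assumptions here rather than anything settled.

```python
def estimate_R(prices, ticks_per_bar, n_bars=1):
    """
    Average high-low range over the most recent n_bars complete bars.

    prices: evenly spaced prices, oldest first.
    ticks_per_bar: number of prices per bar, e.g. 30 for a 5 minute bar
        if prices arrive every 10 seconds (an assumed sampling frequency).
    """
    ranges = []
    end = len(prices)
    # Walk backwards from the latest price, one non-overlapping bar at a time
    while end - ticks_per_bar >= 0 and len(ranges) < n_bars:
        bar = prices[end - ticks_per_bar:end]
        ranges.append(max(bar) - min(bar))
        end -= ticks_per_bar
    if not ranges:
        raise ValueError("Not enough prices for a single bar")
    return sum(ranges) / len(ranges)


example_prices = [100, 101, 99, 100, 102, 103, 101, 100]
print(estimate_R(example_prices, ticks_per_bar=4))            # last bar only
print(estimate_R(example_prices, ticks_per_bar=4, n_bars=2))  # average of two bars
```

If fewer than n_bars complete bars are available the function just averages what it has, which is one way of handling a short warm up period.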

Now to set our initial bracket limit orders. We're going to set them at (R/2)*F above and below the current price. Note that if F=1, we'll set a bracket range which is equal to R. Given we're assuming that R is constant and represents the expected trading range, we're probably not going to want to set F>=1.  Smaller values of F mean we are capturing less of the spread, but we're exposed for less time. Note that setting a smaller F is equivalent to trading the same F on a smaller R, which would be a shorter horizon. So reducing F just does the same thing as reducing R. For simplicity then, we can set F to be some fixed value. I'm going to use F=0.75.

Finally, what about our stop losses? It probably makes sense to also scale them against R. Note that if we are putting on a stop loss we must already have a position on, which means the price has already moved by (R/2)*F. The remaining price "left" in the range is going to be (R/2)*(1-F); to put it another way versus the original 'starting' price we could set our stop loss at (R/2) on the appropriate side if we wanted to stop out at the extremes of the range. But maybe we want to allow ourselves some wriggle room, or set a stop closer to our original entry point. Let's set our stop at (R/2)*K from the original price, where K>F. 

Note this means our (theoretical!!!) max loss on any trade is going to be (R/2)*(K-F). For example, if F=0.75 and K=1.25 (so we place our stops a little beyond the extreme of the range), then the most we can lose if we hit our stop precisely is R/4.
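To make the geometry concrete, here is a small sketch that turns a starting price, R, F and K into the four order levels, rounded to the tick size. The function and key names are mine, and rounding to the nearest tick is an assumption (in practice you might prefer to round away from or towards the current price).

```python
def round_to_tick(price, tick_size):
    """Round a price to the nearest multiple of the tick size."""
    return round(price / tick_size) * tick_size


def bracket_levels(price, R, F=0.75, K=1.25, tick_size=0.25):
    """
    Order levels around the starting price.

    Limits sit (R/2)*F either side of the price; stops sit (R/2)*K
    either side, so K > F puts each stop beyond its entry limit.
    """
    half_range = R / 2.0
    return {
        "buy_limit": round_to_tick(price - half_range * F, tick_size),
        "sell_limit": round_to_tick(price + half_range * F, tick_size),
        # Sell stop protects a long entered at buy_limit
        "sell_stop": round_to_tick(price - half_range * K, tick_size),
        # Buy stop protects a short entered at sell_limit
        "buy_stop": round_to_tick(price + half_range * K, tick_size),
    }


levels = bracket_levels(5700.0, R=10.0)
print(levels)
# Distance from long entry to its stop: (R/2)*(K-F) = R/4 with these settings
print(levels["buy_limit"] - levels["sell_stop"])
```

With a price of 5700 and R=10 the long entry is at 5696.25 and its stop at 5693.75, a gap of 2.5 points, which is the R/4 from the example above.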


Back of the envelope p&l

We're now in a position to work out precisely what our p&l will be on winning trades, and roughly (because stop loss) on losing trades. Let's ignore currency effects and set the multiplier of the future to be M (the $ or otherwise value of a 1 point price move). Let's set commission to C and also assume that it costs C to cancel a trade (the guidance on trade cancelling cost in interactive brokers docs is vague, so best to be conservative). In fact it makes more sense to express C as a proportion of M, c.
Finally we assume we're going to use an explicit stop loss order rather than implicit which means paying a cancellation charge.

win, W = R*F*M - 3C = R*F*M - 3Mc = M(R*F - 3c)
loss, L = -(K-F)*M*R/2 - 3C = -M[(K-F)*R/2 + 3c]

I always prefer concrete examples. For the S&P 500 micro future, M=$5, C=$0.25, c=0.05 and let's assume for now that R=10. I'll also assume we stick to F=0.75 and K = 1.25. So our win and loss on a single trade are:

W = $5 * (10*0.75 - 3*0.05) = $36.75
L = -5*[(1.25 - 0.75)*10/2 +3*0.05] = -$13.25

Note that in the absence of transaction costs the ratio of W to L would be equal to 2F / (F-K), which in this case is equal to -3: a win is three times the size of a loss. Note also that the breakeven probability of a win for this to be a profitable system is L / (L-W) which would be 0.25 without costs, but actually comes out at around 0.265 with costs.
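The arithmetic above is easy to check in a few lines. This is just the W and L formulas typed into Python with the S&P micro numbers; the function names are mine.

```python
def win_pnl(R, F, M, c):
    """P&L of a winning trade: M(R*F - 3c)."""
    return M * (R * F - 3 * c)


def loss_pnl(R, F, K, M, c):
    """P&L of a trade stopped out exactly at the stop: -M[(K-F)*R/2 + 3c]."""
    return -M * ((K - F) * R / 2 + 3 * c)


def breakeven_win_probability(W, L):
    """Probability of a win at which expected p&l is zero: L / (L - W)."""
    return L / (L - W)


R, F, K, M, c = 10, 0.75, 1.25, 5, 0.05

W = win_pnl(R, F, M, c)
L = loss_pnl(R, F, K, M, c)
print(W, L, breakeven_win_probability(W, L))

# Same again with zero costs
W0 = win_pnl(R, F, M, 0.0)
L0 = loss_pnl(R, F, K, M, 0.0)
print(W0 / L0, breakeven_win_probability(W0, L0))
```

Changing K or c in the last few lines shows how quickly the breakeven probability moves.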

We can see that setting K will change the character of our trading system. The system above is likely to be quite positive skew (low breakeven probability, with wins three times the size of losses). If we were to make K larger, then we'd have more wins but each would be smaller relative to the size of a loss, and we'd get closer to negative skew.


Introducing the speed limit

Readers of my books will know that I am keen on something called the 'speed limit'; the idea that we shouldn't pay too much of our expected return on costs. I recommend an absolute maximum of 1/3 of returns paid out on costs; in practice eg for my normal trading strategy I pay about 1% a year and hope to earn at least ten times on that.

On the same logic, I'm not sure I'd be especially happy with a strategy where I have to pay up half my profits in costs even on a winning trade. Let's suppose the fraction of gross profit I am happy to pay is L (a speed limit, not to be confused with the loss L above), then:

3Mc / (R*F*M) = L

R = 3c / (L*F)

For example, if L was 0.1 (we only want to pay 10% of our gross profit in costs), then the minimum R for F=0.75 on the S&P 500 with c=0.05 would be 3*0.05 / (0.1 * 0.75) = 2
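As a one-liner, with the tick count thrown in for context. The function name and the `cost_fraction` argument (the speed limit on a winning trade) are my own labels.

```python
def minimum_R(c, F, cost_fraction):
    """Smallest R for which costs on a winning trade are cost_fraction of gross profit."""
    return 3 * c / (cost_fraction * F)


min_R = minimum_R(c=0.05, F=0.75, cost_fraction=0.1)
print(min_R)         # in price points
print(min_R / 0.25)  # in ticks, for a 0.25 tick size
```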

In practice this will put a floor on the horizon chosen to estimate R; if H is too short then R won't be big enough. The more volatile the market is, the shorter the horizon will be when R is sufficiently large. However we would want to be wary of a longer horizon, since that would push us towards a holding period where autocorrelation is likely to turn against us (getting up towards 30 minutes).

But as we will see later, this crude method significantly underestimates the impact of costs as it ignores the cost of losing trades. In practice we can't really use the speed limit idea here.


Random thought: Setting H or setting R?

We can imagine two ways of running this system:

- We set R to be some fixed value. The trading horizon would then be implicit depending on the volatility of the market. Higher vol, shorter horizon. We might want to set some limits on the horizon (this implies we'll need some sensitivity analysis on the relationship between horizon and the ratio of R and vol). For example, if we go below a two minute horizon we're probably pushing the latency possibilities of the python/ib_insync/IB execution stack, as well as going beyond the horizon analysed in the arXiv paper. If we go above a fifteen minute horizon, we're probably going to lose mean reversion again as in the arXiv paper. 
- We set the horizon H at some fixed value, and estimate R. We might want to set some limits on R - for example the speed limit mentioned above would put a lower limit on R. 

In practice we can mix these approaches; for example estimating R at a given horizon at the start of each day (perhaps using yesterday's data, or a warm up period), and then keeping it fixed throughout the day; or maybe re-estimating once or twice intraday. We can also at this point perhaps do some optimisation on the best possible value of H; the period when we see the best autocorrelation.

Simulation

Right, shall we simulate this bad boy? Notice I don't say backtest. I don't have the data to backtest this. I'm going to assume, initially, very benign normally distributed returns that (less benignly) have no autocorrelation.

Initially I'm going to use iid returns, which is conservative (because no autocorrelation) but also aggressive (no jumps or gaps in prices, or changes in volatility during the day). I'm also going to use zero costs and cancellation fees, and optimistically assume stop losses are filled at their given level. We're just going to try and get a feel for how changing the horizon, and K affect return distribution. I'm going to re-estimate R throughout the day with a fixed horizon, but given the underlying distribution is random this won't make a huge amount of difference.

The distribution is set up to look a bit like the S&P 500 micro future; I assume 20% annualised daily vol and then scale that down to an 8 hour day (so no overnight vol). That equates to 71.25 units of price of daily vol, assuming the current index level of ~5700. I will simulate one day of ten second price ticks, do this a bunch of times, and then see what happens. The other parameters are those for the S&P micro: contract multiplier $5, tick size 0.25 index points, and I also use a single contract when trading.
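A minimal sketch of generating one such day of prices, using only the standard library. I'm assuming here that the iid normal shocks are added in price points (an arithmetic random walk), with the per-tick standard deviation scaled so the whole day adds up to the daily vol; that is a simplification for illustration and not necessarily how the messy python mentioned at the top does it.

```python
import math
import random


def simulate_day(daily_vol=71.25, start_price=5700.0, hours=8,
                 tick_seconds=10, seed=None):
    """
    One day of evenly spaced prices from an iid normal random walk.

    daily_vol is in price points; each tick gets daily_vol / sqrt(n_ticks)
    so the variance over the day sums to daily_vol squared.
    """
    rng = random.Random(seed)
    n_ticks = hours * 3600 // tick_seconds
    tick_vol = daily_vol / math.sqrt(n_ticks)
    prices = [start_price]
    for _ in range(n_ticks):
        prices.append(prices[-1] + rng.gauss(0.0, tick_vol))
    return prices


prices = simulate_day(seed=42)
print(len(prices), prices[0], round(prices[-1], 2))
```

An 8 hour day of ten second ticks is 2,880 price changes, so each one has a standard deviation of about 1.3 points.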

The simulations can be used to produce pretty plots like this which show what is going on quite nicely:

The price is in grey, or red when we are short, or green if long. The horizontal dark red and green lines are the initial take profit bracket orders. You can see we hit one early on and go short. The light green line is a buy stop loss (there are no sell stop losses in this plot as we're only ever short). You can see we hit that, so our initial trade fails. We then get two short lived losing trades, followed by a profitable trade near the end of the period when the price retraces from the red bracket entry down to the green take profit line.

Here's a distribution of 5000 days of simulation with a 5 minute horizon, and K=1.25 (with F=0.75):



Well it loses money, which isn't great. The loss is 0.4 units of the daily risk on one contract (daily vol 71.25 multiplied by multiplier $5). And that's with zero costs and optimistic fills, remember. Although there is a small point to make, which is that this isn't a true order book simulator. I only simulate a single price to avoid the complexity of modelling order arrival. In reality we'd gain a little bit from having limit orders in place in a situation where the mid price gets near but doesn't touch the limit order. Basically we can 'earn' the bid-ask bounce. 

The distribution of daily returns is almost perfectly symmetrical; obviously that wouldn't be true of the trade by trade distribution.

What happens if we set a tighter stop, say K=0.85?

Now we're seeing profits (0.36 daily risk units) AND some nice positive skew on the daily returns (which have a much lower standard deviation of about 0.40 daily risk units). Costs are likely to be higher though, and we'll check that shortly. Note we have a daily SR approaching 1.... which is obviously bonkers- it annualises to nearly 15!!! Incredible bearing in mind there is no autocorrelation in the price series at all here. But of course, no costs.
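For the avoidance of doubt, this is the annualisation I'm doing. The 256 business days a year (so a multiplier of 16) is an assumption in this sketch, although it is the same convention that turns 20% annual vol into 71.25 points a day on a 5700 index.

```python
import math


def annualised_sr(daily_mean, daily_std, days_per_year=256):
    """Daily Sharpe Ratio scaled up by the square root of trading days per year."""
    return daily_mean / daily_std * math.sqrt(days_per_year)


print(annualised_sr(0.36, 0.40))
```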

What happens if we increase the estimate period for R to 600 seconds (10 minutes), again with K=0.85:

Skew improves but returns don't. Of course, costs are likely to fall with a longer horizon.


An analysis of R and horizon

Bearing in mind we know what the vol of this market is (71.25 price units per day), how does the R measured at different horizons compare to that daily vol? Note this isn't affected by the stoploss or costs.

At 60 seconds, average R is 4.6, in daily vol units: 0.065
At 120 seconds, .... R in vol units: 0.10
At 240 seconds, ...  R in vol units: 0.15
At 360 seconds, ...  R in vol units: 0.19
At 480 seconds, ...  R in vol units: 0.22
At 600 seconds, ... R in vol units: 0.25
At 900 seconds, ... R in vol units: 0.31

This doesn't quite grow at sqrt(T), i.e. T^0.5, but at something like T^0.59.
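One way of getting at that exponent is a straight line fit of log(R) against log(horizon). The sketch below does that with the rounded figures from the list above, so it won't exactly reproduce a number estimated from the unrounded simulation output; on these inputs it comes out a little under 0.6.

```python
import math

horizons = [60, 120, 240, 360, 480, 600, 900]             # seconds
r_in_vol_units = [0.065, 0.10, 0.15, 0.19, 0.22, 0.25, 0.31]


def loglog_slope(xs, ys):
    """Ordinary least squares slope of log(y) on log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    return sxy / sxx


exponent = loglog_slope(horizons, r_in_vol_units)
print(round(exponent, 2))
```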

Footnote: even the smallest value of R here is greater than the minimum calculated earlier based on our speed limit. Of course that wouldn't be true for every market. 

Now with costs

Now let's include costs in our analysis. I'm going to assume we pay $0.25 per contract, and the same to cancel, which is c=0.05 as a proportion of the $5 multiplier. In the following tables, the rows are values of H, and the columns are values of K. For each I run 5000 daily simulations and take the statistics of the distribution of daily returns.

First the average daily returns, normalised as a multiple of the daily $ risk on one contract (71.25 * M = $356.25 ). Note I haven't populated the entire table, as there is no point focusing on points which won't make any money (with the exception of those we've already looked at).

H / K    0.8     0.85    0.92       1.0    1.25   
120      0.62    0.26   -0.16     -0.53   -1.08
240      0.37    0.09   -0.21     -0.44   -0.70  
300      0.30    0.04   -0.22             -0.60
600      0.14   -0.04
900      0.07   -0.05                     -0.25                     

Note we've already seen some of these values before pre-cost; H=300, K=1.25 was -0.4 (now -0.60), H=300, K=0.85 was 0.36 (now 0.044), and H=600, K=0.85 was 0.13 (now -0.04). So we lose about 0.2, 0.32 and 0.17 in daily risk units in costs. We lose more in costs if H is smaller (as we trade more often) and with a tighter stoploss (as we have more losing trades, and thus trade more often). To put it another way, we lose 1 - (0.044/0.36) = 88% of our gross returns to costs with H=300, K=0.85 (which is a darn sight more than the speed limit would like!).

Note: even with a 5000 point bootstrap there is some variability in these numbers; for example I ran the H=120/K=0.8 twice and on the other occasion got a mean of over 0.8.

Standard deviations, normalised as a multiple of the daily $ risk on one contract:

H / K    0.8     0.85    0.92    1.0     1.25   
120      0.34    0.37    0.41    0.46    0.58
240      0.35    0.39    0.45    0.50    0.63
300      0.35    0.40    0.47            0.65
600      0.35    0.41
900      0.35    0.41                    0.67

These are mostly driven by the stop; with longer lookbacks increasing it slightly but not by much.

Annualised daily SR:

H / K    0.8     0.85    0.92     1.0   1.25   
120      28.8   11.4    -6.0    -17.7  -29.7
240      17.0    3.8    -7.7    -14.1  -17.7
300      13.9    1.77   -7.5           -14.6
600      6.3    -1.53
900      3.24   -1.98                   -5.9

These numbers are... large. Both positive and negative.

Skew:

H / K    0.8     0.85    0.92   1.0   1.25 
120      0.16    0.12    0.09   0.07  0.03
240      0.18    0.22    0.15   0.13  0.03
300      0.30    0.16    0.19         0.10
600      0.42    0.27
900      0.54    0.35                 0.08

As we'd expect, tighter stop is better skew. But we also get positive skew from longer holding periods.

With more conservative fills

So it's simple, right: we should run a short horizon (say H=120) with a tight stop (say 0.8, which is only 0.05R away from the initial entry price). It does seem that a tight stop is the key to success; the SR is very sensitive to the stop increasing in size.

But it's perhaps worth examining what happens to the SR of such a system as we change our assumptions about costs and fills. With very tight stops, and short horizons, we are going to get lots more trades (so costs have a big impact), and hit many more stops; so the level at which we assume they are filled is critical. Remember above I've assumed that stops are hit at the exact level they are set at.

ZL: Zero costs, stop loss filled at limit, take profit filled at limit 
CL: Costs, stop loss filled at limit, take profit filled at limit
CP: Costs, stop loss filled at next price, take profit filled at limit
CLS: Costs, stop loss filled at limit + slippage, take profit filled at limit
CPS: Costs, stop loss filled at next price + slippage, take profit filled at limit
CLS2: Costs, stop loss filled at limit + slippage, take profit filled at limit - slippage
CPS2: Costs, stop loss filled at next price + slippage, take profit filled at limit - slippage 

Two sided slippage (CLS2, CPS2) means we assume we earn 1/2 a tick on take profit limit orders (assumes a 1 tick spread, which is configurable), and have to pay 1/2 a tick on stop losses (of which there will be many more with such tight stops). Essentially it's the approximation I use normally when backtesting, and tries to get us closer to a real order book. There's also a more pessimistic one sided version of these (CLS, CPS) where we only apply the negative slippage. 
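Here is how I'd sketch those seven fill assumptions as code. It only covers the fill price of the exit order; commissions and cancellation fees (the difference between ZL and CL) would be charged separately. The function name, argument names and the flag layout are illustrative choices rather than what the simulation code uses.

```python
# scenario -> (stop filled at next price?, stop pays half a tick?,
#              take profit earns half a tick?)
SCENARIOS = {
    "ZL":   (False, False, False),
    "CL":   (False, False, False),
    "CP":   (True,  False, False),
    "CLS":  (False, True,  False),
    "CPS":  (True,  True,  False),
    "CLS2": (False, True,  True),
    "CPS2": (True,  True,  True),
}


def exit_fill_price(order_type, side, level, next_price, scenario,
                    tick_size=0.25):
    """
    Assumed fill price for the order that closes a trade.

    order_type: "stop" or "take_profit"
    side: "sell" when closing a long, "buy" when closing a short
    level: the price the order was placed at
    next_price: first simulated price at or beyond the stop level
    """
    at_next_price, stop_slippage, take_profit_gain = SCENARIOS[scenario]
    # A worse fill is a higher price for a buy, a lower price for a sell
    worse = 1.0 if side == "buy" else -1.0
    half_tick = 0.5 * tick_size

    if order_type == "take_profit":
        # Two sided versions earn half the spread on the passive limit order
        return level - worse * half_tick if take_profit_gain else level
    if order_type != "stop":
        raise ValueError(f"Unknown order type {order_type!r}")

    price = next_price if at_next_price else level
    if stop_slippage:
        price += worse * half_tick
    return price


# Long position, sell stop at 5693.75, next simulated price 5693.00
for name in SCENARIOS:
    print(name, exit_fill_price("stop", "sell", 5693.75, 5693.0, name))
```

With ten second ticks the gap between the stop level and the next simulated price can easily be several ticks, which is what makes the CP* rows so painful.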

Here are the Sharpe Ratios, different strategies in columns (I've just gone for a selection of apparently profitable), different cost assumptions in rows (numbers are comparable with each other as used same random seed, but not comparable with those above):

H/K: 120/0.8      300/0.8      900/0.8      120/0.85      300/0.85
ZL:    48.6          23.9          7.5           29.6        10.4
CL:    28.9          14.1          3.2           11.9         2.5
CLS2:  32.2          15.4          3.7           16.5         4.1
CLS:    7.0           2.9          -1.7          -7.2         -6.2
CP:    -120           -69          -35           -114         -59
CPS2:  -115           -66          -34           -109         -56
CPS:   -129           -75          -38           -123         -64

Well we know that zero costs (ZL) and costs with stop losses being hit exactly (CL) are probably a pipe dream, so then it's a question of which of the other five options gets closest to the messy reality of a real life order book as far as executing the stop loss orders goes. Clearly using the next price(CP*) is an absolute killer to performance with the biggest damage done as you would expect on the tightest stops and shortest horizons where we lose up to 140 units of annualised SR. We're not going to be able to do much unless we assume we can get near the limit price on stop loss orders (*L*).

If we assume, reasonably conservatively, that we receive our limits on take profit orders without benefitting from bid-ask bounce, but have to pay slippage on our stop losses - which is CLS - there are still two systems that survive with high positive SR. 

Interactive brokers provides synthetic stop losses on some markets; whether it's better to use these or simulate our own stops by placing aggressive market orders once we are past the stop level is an open question.

Note that even the most optimistic of these (CLS2) sees us losing a third or more of the pre-cost returns, again hardly what I usually like to see but I guess I could live with an annualised SR of over 7 :-)

Can't we do better

All of the above assumes prices are a pure random walk.... but the original motivation was the fact that prices in the sort of region I'm looking at appear to be mean reverting. Which would radically improve the results above. It's easy to generate autocorrelated returns at the same frequency as your data, but I'm generating returns every 10 seconds whilst the autocorrelation is happening at a slower frequency. We'd have to use something like a Brownian bridge to fill in the other returns, or also assume autocorrelation happens at the 10 second frequency, which we don't know for sure and would flatter us too much. The final problem is that I don't actually have figures to calibrate this with; the arXiv paper doesn't measure autocorrelation.

Since I've already spent quite a lot of time digging into this, I decided not to bother. Ultimately a system like this can only really be tested by trading it; as that's also the only way to find out which of the assumptions about fill prices is closest to the truth. Since the SR ought to be quite high, we should be able to work out quite quickly if this is working or not. 

System choice in tick terms

This then brings me back to which is the best option of H and K to trade.

Remember from above that the average R expressed in daily vol units varies between 0.10 (for H=120 seconds) up to 0.31 (H=900 seconds). If we use K=0.8 (with F=0.75) then we'll have a stop of 0.05R, which will be 0.05*0.10 = 0.005 times daily vol. To put that in context, if S&P 500 vol is around 1%, or call it 60 points a day, then R=6 price units, and our stop will be just 0.3 price points away.. which is just over one tick (0.25)! You can now see why the tight K, short horizon, systems were so sensitive to fill assumptions. For the slowest system in the second set of analysis above, H=300/K=.85, as a stop of 0.1R with R~0.17 we get up to four ticks between our take profit and stop loss limits.
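The same sums as a function, so other combinations of H and K can be tried. This mirrors the arithmetic in the paragraph above, which takes the stop distance as (K - F) times R with R expressed as a fraction of daily vol; the function name and the 60 point daily vol default are just for illustration.

```python
def stop_distance_ticks(K, F, R_in_vol_units, daily_vol_points=60.0,
                        tick_size=0.25):
    """Distance between entry and stop, in ticks, taken as (K - F) * R."""
    R_points = R_in_vol_units * daily_vol_points
    stop_points = (K - F) * R_points
    return stop_points / tick_size


print(stop_distance_ticks(K=0.8, F=0.75, R_in_vol_units=0.10))   # H=120
print(stop_distance_ticks(K=0.85, F=0.75, R_in_vol_units=0.17))  # H=300
```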

We're then faced with the dilemma of hoping we can actually get filled within a tick of that level (and if we can't, we will absolutely bleed money), or slowing down to something that will definitely lose money unless we get mean reversion (but won't lose as much). 

If we were to say we wanted to see at least 10 ticks between take profit and stop loss, then we're looking at something like H=900, K=0.87; or H=120, K=1; or somewhere in between.  The former has a less bad SR, better skew and lower standard deviation - basically a tighter stop on a longer horizon is better than a looser stop on a shorter one.


Next steps

My plan then is to actually trade this, on the S&P with H=900 and K=0.87. I'll come back with a follow up post when I've done this! 

Monday, 14 April 2025

Annual performance update returneth - year 11

Mad out there isn't it? Tariffs on/off/on/partially off/on... USD/SP500/Gold/US10/Bitcoin all yoyoing like crazy. Seems a good moment to be slightly reflective.

I skipped my annual performance update last year, a little sad given it was my tenth anniversary. Mainly this is because it had become a lot of work, covering my entire portfolio. The long only stuff is especially hard, as all my analysis is driven by spreadsheets rather than nicely outputted from python. If I'm going to restart the annual performance, I'm going to do it just on my systematic futures portfolio. That's probably what most people are interested in anyway. 

This then is my 2025 performance update. As in previous years I track the UK tax year, which finishes on 5th April. That precise dating is especially important this year!! 

You can track my performance in (almost) real time, along with other information here.

TLDR: On a relative basis my performance was good, 3% long only vs -0.1% benchmark; -16.3% in futures vs ~-18% benchmarks. On an absolute basis, it's a mediocre year long only and my worst ever in futures.


Headline

My total portfolio performance was -1.9% last year; it would have been +3% without futures. That's a sign that this hasn't been a great year in the systematic world of futures trading. Spoiler alert - it's my worst ever. Whilst I'm not doing the full gamut of benchmarking this year, my benchmark of Vanguard 80:20 was down 0.1% (thanks to Trump), so I would have been well ahead of that if it wasn't for futures.

Everything else in this post just relates to futures.

Headline numbers, as a % of the starting capital:

MTM:        -17.8%
Interest:     2.4%
Fees:        -0.05%
Commissions: -0.29%
Slippage:    -0.56%

Net :       -16.3%

'Interest' includes dividends on 'cash like' short bond ETFs I hold to make a slightly more efficient use of my cash. Actually I ended up paying interest as I was a bit sloppy and didn't liquidate the ETFs when I lost money so ended up with negative cash balances. Given the interest rate was probably higher on the borrowing than the yield on the money market cash like funds; and given the tax disadvantage (I pay tax on dividends but can't offset losses from interest payments), this was dumb. I've now liquidated a chunk of my ETFs.

MTM - mark to market - also includes gains or losses on FX positions held to meet margin, and on the cash like ETFs. Broken down:

Pure futures:   -14.5%

Cash like ETFs:  -0.64%

FX:              -2.7%

So perhaps a better way of doing this would be to lump together all the interest, margin and ETF related flows, and call those 'cost of margin':

Futures MTM: -14.5%
Commissions:  -0.29%
Slippage:     -0.56%
Fees:         -0.05%

SUBTOTAL - PURE FUTURES: -15.3%

Cost of margin: -1.0%

Net :         -16.3%

Time series


You can see that after the peak in late April (which was also my all time HWM), we have two legs down; the first is almost recovered from, but the drawdown from mid July to the end of September is quite vicious, and at ~17% is probably the worst I have seen. 

Just about visible at the end of the chart is 'tariff liberation day' which has liberated me from a chunk of my money. Ouch, let's make ourselves feel better with a longer term plot

Benchmarks

My two benchmarks are the SG CTA index, and an AHL fund which like me is denominated in GBP.

Here is the raw performance:



It's not super fair as I run at a much higher vol target than the other two products, but here are some numbers:

           AHL       SG        ROB
Mean       4.5%      4.0%      12.9%
Stdev     10.7%      8.9%      16.8%
SR (rf=0)  0.42      0.45      0.76

Those Sharpes would be lower, especially in the last couple of years, if a risk free rate was deducted.

My correlation is 0.66 with SG CTA, and 0.56 with AHL. Their joint correlation is 0.78. Perhaps because I'm not just trend following and carry. Both have been bad this year. More on that story later. To make things fairer then, let's do an insample vol adjustment so everything has the same vol:



I'm still winning, but it's a bit closer. Here are the annual figures:

             AHL       SG        Me
30/03/15    66.0%     51.4%     59.5%
30/03/16    -6.2%     -3.7%     28.1%
30/03/17    -3.7%    -12.6%      2.4%
30/03/18     8.7%     -1.7%      2.0%
30/03/19     5.4%     -2.7%      4.5%
30/03/20    22.0%      6.5%     33.8%
30/03/21     0.9%     11.9%     -1.7%
30/03/22   -12.0%     33.1%     25.8%
30/03/23     8.3%      0.6%     -7.6%
30/03/24    14.5%     24.3%     20.6%
30/04/25   -18.2%    -17.9%    -14.7%

Note: The reason I'm showing -14.7% here and -16.3% earlier is that these figures are to the end of the relevant month (i.e. March 31st to March 31st), rather than the UK tax year, as I can only get monthly figures for the AHL fund.

I've highlighted in green the best performer in each year, red is the worst. You can see that, at least with my vol adjustment, it was a shocking year for the industry generally, and although this was my worst year on record, I did actually outperform (the unadjusted numbers are -12% for AHL and -10% for the CTA index so I was worse than those, but obviously would have looked even better earlier in the period).

The one stat I haven't included here is my favourite, geometric return or CAGR; mine was 12.0% vs 5.9% AHL and 6.4% SG (based on the leveraged up, vol adjusted numbers in the second graph; the figures would be worse for AHL and SG without this adjustment).

Is this fair? Well no, there are fees embedded in the AHL and SG numbers. My fees aren't in these figures, and are much higher - I charge no management fee, but my performance fee is 100% :-)


Market by market

Here are the numbers by asset class:

Equity  -6.0%
OilGas  -4.2%
FX      -2.7%
Metals  -1.5%
Bond    -1.3%
Vol     -0.5%
Ags     +3.2%

When you only make money in one sector, that's a bad year! My worst markets were: Gas-pen, MIB, China A construction, Gasoline, MSCI Taiwan, Platinum, MXP; with losses between 1.7% and 1%. My best markets were Coffee, MSCI Singapore, Cotton#2 and S&P 500; with gains between 1.7% and 0.7%. So no concentration issues at least.

Because of my dynamic optimisation, the p&l explanations can be ... interesting. In Coffee, for example, I made 1.7%, but I only held Coffee for less than two weeks, from 31st January to 11th February 2025 (a period in which it went up rather sharply as it happens, but I missed out on the long rally that preceded it).

S&P 500 on the other hand, I actually traded:

That's price, position, and p&l. Nice trading. Now look at MIB, Italian equities:
For some reason we only played the long side, and boy did we get chopped up.


Trading rules

Presented initially without comment, a bunch of plots showing p&l for each trading rule group (these are backtested numbers, and they are before dynamic optimisation).

















What a rogues' gallery! It's quicker to list the rules that did make money or at least broke even:

- relative value skew
- relative carry
- fast relative momentum

This also explains, to an extent, why my performance isn't quite as bad as the industry; I would imagine that I have a little more allocation to this weird stuff than the typical CTA. The pure univariate momentum rules (breakout, momentum and normalised momentum) were especially bad. Even fast momentum has been crucified especially badly in the recent carnage, for obvious reasons.


Costs and slippage

I've already noted that my total slippage (diff between mid market price when I come to place an order, and where I get filled) was 0.56% of my starting capital; and commissions were 0.29%. That's an all in cost of 0.85% which is a little lower than what I usually pay; but as a % of my final capital it's 0.98%, so the real number is somewhere around the 89bp - 93bp mark. Two years ago, the last time I checked, I was paying 90bp of which 69bp was slippage and 21bp commissions. So slippage is down, commissions up slightly but net-net roughly similar.

Without my execution algo, if I had just traded at the market, I would have paid 1.53% in slippage; my simple algo earned 97bp and cut my slippage bill by two thirds. So that is a good thing.
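For anyone who wants to check the arithmetic, here is a short sketch using only the figures quoted above (all as a percentage of starting capital). The variable names are mine.

```python
# Figures quoted in the text, as % of starting capital
slippage_actual = 0.56      # slippage paid using the execution algo
commissions = 0.29
slippage_at_market = 1.53   # estimated slippage if every order had crossed the spread

total_cost = slippage_actual + commissions
algo_saving = slippage_at_market - slippage_actual
fraction_saved = algo_saving / slippage_at_market

print(f"All in cost: {total_cost * 100:.0f}bp")           # All in cost: 85bp
print(f"Algo saving: {algo_saving * 100:.0f}bp")          # Algo saving: 97bp
print(f"Share of slippage saved: {fraction_saved:.0%}")   # Share of slippage saved: 63%
```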


Coming up

I'm not planning to do much with my futures system this year; I'm going to be busy writing a new book so let's see how it does going forward. Certainly feels like a bad year for trend following; unless or until the US economy gets tipped into recession and we get clear risk off markets. I think it unlikely we're going to get risk on until November 2028 or January 2029 :-) At least for the time being, it's not going great for me (-4% YTD), and it's going even worse for the industry generally.

Tuesday, 4 March 2025

Very.... slow... mean reversion .... and some thoughts on trading at different speeds

 Bit of a mixed bag post today. The golden thread connecting them is the idea that markets trend and mean revert at different frequencies.

- A review of the discussion around timeframes for momentum and mean reversion in 'Advanced Futures Trading Strategies', in light of this excellent recent paper (which I also discussed on the TTU podcast, here from 1:02:12 onwards).

- A mea culpa on the mean reversion strategies in 'Advanced Futures Trading Strategies'. TLDR - there is an error in the backtest and they don't work at least in the specified form.

- A new slow.... absolute mean reversion strategy inspired by a question from Paul Calluzzo on the aforementioned podcast episode.

Note: in this article I use the terms momentum and trend following interchangeably to both mean absolute momentum - not relative.


When do markets trend and mean revert?

When do markets trend? When do they not trend... perhaps even mean reverting? This is a very important question! 

You might think it would depend on the market, but actually there seem to be some fairly common patterns across many different instruments. Here is how I summarised my thoughts in my most recent book, Advanced Futures Trading Strategies (AFTS):

Multi-year horizons: Mean reversion sort of works (although the results are not statistically significant, as the value strategy in part three attests). Note: this value strategy is a relative value strategy that looks for mean reversion within asset classes. Such strategies are common in academic equity research.

Several months to one year horizon: Trend following works, but is not at its best (consider the slightly poorer results we get for EWMAC64 versus faster trend variations).

Several weeks to several month horizon: Trend works extremely well (consider the excellent performance of EWMAC8, EWMAC16 and EWMAC32).

Several days to one week: Trend is starting to work less well (EWMAC4 and especially EWMAC2 perform somewhat worse than slower variations, even before costs are deducted).

A few days: We might expect mean reversion to work?

Less than a day: We might expect trend to work?

Less than a second: Mean reversion works well (high frequency trading - HFT - is very profitable).      

Note that 'momentum works for months or years' and 'mean reversion / value works for years' is a very well known stylised fact which has been established in the literature for many decades; see for example this seminal paper. And given the existence of profitable CTAs with holding periods in the weekly to monthly range, it's hardly surprising that momentum works for shorter holding periods. Nor is the fact that HFT firms make a ton of money a secret.

However, in the region between high frequency trading and a horizon of a week or so I wasn't sure exactly what to expect, but I speculated that there would be a region where mean reversion would start to work (more on that later!), and I also thought trend following with holding periods in the 'few hours' range would work (mainly because there has been some sell side research on that). Note that since my own data is hourly at best, I couldn't really test anything with a horizon of less than about one day.

Fortunately someone came along to fill in this gap in our understanding, with this excellent paper:

"Trends and Reversion in Financial Markets on Time Scales from Minutes to Decades" by Sara A. Safari and Christof Schmidhuber

I won't summarise the paper in much detail (for example it has some interesting results around the relationship between trend strength and reversal), but they have the following pattern of results (from figure 10 in the paper):


Horizon over two years: Mean reversion works, becoming more effective at longer horizons. They used literally centuries of data to check this result. 

One week to two year horizon: Trend following works, but its effectiveness peaks at around one year

One hour to one week horizon: Trend following works, getting gradually less effective as the time horizon shortens.

Two minute to 30 minute horizon: Mean reversion works, and is most effective at the 4-8 minute horizon


The key differences between my results and theirs only occur in my 'zone of speculation', where I was only guessing and they had actual evidence so let's go with them :-) In particular they have two 'crossing points' from when mean reversion stops working and momentum starts working (at just over two years, and somewhere between 30 minutes and one hour), giving the following broad ranges:

Horizon over two years: Mean reversion works.

One hour to two year horizon: Trend following works.

Two minute to 30 minute horizon: Mean reversion works

Whilst I had speculated that there was something more complicated going on. Even without evidence, Occam's razor would suggest you should prefer their results to mine.

Another difference is that when I looked at 'optimal points' for e.g. momentum I was concerned with Sharpe Ratio, but they are instead fitting a response function and seeing when it has the best statistical fit. Because of the Law Of Active Management, Sharpe Ratio (loosely) scales inversely with the square root of time for a given level of prediction accuracy. So if you are equally good at predicting one year trends and three month trends, the latter will have twice the Sharpe Ratio of the former (the horizon is four times shorter, and the square root of four is two). Hence there are good reasons why my optimal SR point is different from their optimal response point; all other things being equal the optimal SR is going to be at a shorter horizon.
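The square root of time scaling can be sketched in a couple of lines. This assumes, as loosely stated above, that SR is proportional to one over the square root of the horizon when prediction accuracy is held constant; the function name is just for illustration.

```python
import math

def sharpe_uplift(slow_horizon, fast_horizon):
    """Rough SR multiplier from moving to the faster horizon, holding
    prediction accuracy fixed. Both horizons must be in the same units."""
    return math.sqrt(slow_horizon / fast_horizon)

# One year versus three months, both measured in months
uplift = sharpe_uplift(slow_horizon=12, fast_horizon=3)
print(uplift)  # 2.0
```

In practice accuracy is not constant across horizons, which is exactly why the optimal SR point and the optimal response point can differ.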

Combining the two pieces of research together, and thinking about what sorts of strategies we could be trading, we get this:

Horizon over two years: Mean reversion works. The optimal SR is probably quite flat for anything between three and ten years. Equity value, relative value within asset classes, and absolute mean reversion (of which more in a moment) are all nice strategies. But given their holding period you shouldn't expect high Sharpe Ratios unless you are Warren Buffett (hi Warren!). 

One to two years: Momentum will work but will be getting steadily worse as the timescale gets longer, both from a predictability perspective and a Sharpe Ratio viewpoint. Avoid.

Three months to one year horizon: Trend following works with high predictability, but is not at its highest Sharpe ratio due to the slow turnover. However, the advantage here is that this is a playing field that even retail punters with expensive trading costs can play in. Slower momentum strategies are all good.

Three weeks to three months horizon: Trend following probably has its optimal Sharpe Ratio somewhere in this region, depending on the asset class. Any medium speed momentum strategies are good, and nearly all futures traders can play in this area if they avoid a few very expensive instruments.

Several days to three weeks: Trend is starting to work less well (because the improvement from trading faster is being overwhelmed by the deficit in response) and trading costs will start to bite except for the very cheapest futures (see calculation below), traded with exemplary execution. On the upside, trend following models at this speed will have the highest positive skew. Trade selectively.

A few hours to several days: Trend still just about works but there are probably only a small number of futures where you can overcome the bid/offer costs (although I hear costs are very low in Crypto, and there might be US traders who get zero commission able to trade highly liquid ETFs like SPY); I doubt though that it would be worth doing. As the authors note, strong trends also tend to reverse strongly in this region (see AFTS for my own confirmation of this effect). Against that there have been the sell side papers on this subject, but they seem to rely on gamma hedging effects which may not persist. Avoid.

1 hour to a few hours: The authors in the paper note that the very weak trend effect here can't overcome the tick size effect. Avoid.

Two minute to 30 minute horizon: Mean reversion works, and is most effective at the 4-8 minute horizon from a predictive perspective; although from a Sharpe Ratio angle it's likely the benefits of speeding up to a two minute trade window would overcome the slight loss in predictability. There is no possibility that you would be able to overcome trading costs unless you were passively filled, with all that implies (see below). Automating trading strategies at this latency - as you would inevitably want to do - requires some careful handling (although I guess consistently profitable manual scalpers do exist that aren't just roleplaying instagram content creators, I've never met one). Fast mean reversion is also of course a negatively skewed strategy so you will need deep pockets to cope with sharp drawdowns. Trade mean reversion but proceed with great care.

Less than a second to two minutes: Not covered in the paper, but I would speculate that mean reversion continues to work, and the pre-cost Sharpe Ratio would also continue to improve as the horizon falls. Proceed with even more care.

Less than a second: High frequency trading works, and clearly has a very high Sharpe Ratio, but this is not for the amateur.


Notes on costs: 

The very cheapest equity index future I trade has a cost of around 0.2bp assuming we execute market orders; and vol of around 20% a year, for a SR cost of  about 1bp. Median single instrument SR on the optimal trend strategy (holding period around 3 weeks) is around 0.30. Predictability, as a regression coefficient, from the linked paper is around 6.5% at 3 weeks; and around 1.8% at 2 days (a reduction of 3.6x). Time scaling would improve the SR by 2.7x so the net effect is a 25% fall in SR to around 0.22 for a two day forecast horizon. 

If we take a third of that 0.22 SR (my 'speed limit') for costs, then our annual cost budget is about 0.07 SR or 700bp. At roughly 1bp of SR per trade that is around 700 trades a year, implying we could perhaps safely trade a couple of times a day; so a two day forecast horizon (which means trading once a day) is possible.

But the median future I get data for has a cost of around twenty times that, meaning a holding period of around two weeks is required to meet the 'speed limit'.
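The cost calculation above can be reproduced with a rough sketch. The inputs are the figures quoted in the text; the assumptions of 15 business days in three weeks and 256 business days in a year are mine, and because nothing is rounded along the way the outputs come out slightly above the rounded figures in the text (0.22 SR, 0.07 SR, 700bp).

```python
import math

BUSINESS_DAYS_PER_YEAR = 256   # assumption, not from the text
DAYS_IN_THREE_WEEKS = 15       # assumption: three weeks of five business days

# Cheapest equity index future
cost_per_trade = 0.2 / 10_000                        # 0.2bp of price
annual_vol = 0.20
sr_cost_per_trade = cost_per_trade / annual_vol      # 0.0001 SR, i.e. 1bp

# Trend following at three weeks versus two days
sr_three_weeks = 0.30
predictability_three_weeks = 0.065
predictability_two_days = 0.018

predictability_ratio = predictability_two_days / predictability_three_weeks  # about 1/3.6
time_scaling = math.sqrt(DAYS_IN_THREE_WEEKS / 2)                            # about 2.7
sr_two_days = sr_three_weeks * predictability_ratio * time_scaling

# 'Speed limit': spend at most a third of the pre-cost SR on costs
cost_budget = sr_two_days / 3
trades_per_year = cost_budget / sr_cost_per_trade
trades_per_day = trades_per_year / BUSINESS_DAYS_PER_YEAR

print(f"SR at two days: {sr_two_days:.2f}")
print(f"Cost budget: {cost_budget:.3f} SR, about {trades_per_year:.0f} trades a year")
print(f"Trades per business day: {trades_per_day:.1f}")
```

Multiplying `cost_per_trade` by twenty, as for the median future, shrinks the affordable number of trades by the same factor, which is what pushes the required holding period out to weeks.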

Do we have to pay the spread? Broadly speaking, if you are trading slowly, then you can afford to be more patient in your execution, using passive fills where possible (as I do myself). But as a fast trend follower who thinks the price is going to move away from you in the near future, it's probably harder to sit on your hands and wait. 

Alternatively if we are fast mean reverting traders then we can use passive fills by setting limits around where we think the equilibrium is. That of course runs the risk of adverse selection, but without doing this we are never going to make enough money to overcome the bid/offer if we're trading dozens of times a day. You may also still be liable for commissions unless you receive exchange rebates for providing liquidity. Note since we earn the bid/offer spread from passive fills, it might be that the best instruments for this strategy are those with wider rather than narrower bid/offers.



Forgive me father, for I have sinned against the gods of backtesting...

Now in AFTS I introduce two strategies which trade mean reversion, with horizons of around a week (since I'd speculated that would work). It included a very elegant way of including limit orders to passively execute, and the second strategy introduced a very nice trend following overlay. And it looked great! But that obviously isn't consistent with the findings above.

Well gentle reader, I screwed up. As I said in my book:

"But what jumps out from this table is the Sharpe ratio. It is impressively large, and the first we have seen in this book that is over two. In my career as a quantitative trader I have always had a long standing policy: I do not trust a back tested Sharpe ratio over two. There are certainly plenty of reasons not to trust this one. 

Firstly, the historical back test period, just over ten years, is shorter than I would like. There are good reasons to suppose that the last ten years included unusual market conditions that might just have favored this strategy. Secondly, it is hard to back test a strategy deploying limit orders that effectively trades continuously using hourly data. There may well be assumptions or errors in my code that make the results look better than they really would have been."

The underlined section (not underlined in the book!) is key here; basically there was an implicit forward fill in my backtest as I calculated the equilibrium price including today's closing price (which of course I wouldn't have known in the morning). The real backtest shows basically no statistically significant return at all.
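To make that kind of error concrete, here is a toy sketch. It assumes, purely for illustration, that the equilibrium is a trailing mean of daily closes; the function names, the five day window and the prices are all made up and are not the code from the book.

```python
def trailing_mean(values, window):
    """Mean of the last `window` values (or fewer if not enough history)."""
    recent = values[-window:]
    return sum(recent) / len(recent)

def equilibrium_with_lookahead(daily_closes, day, window=5):
    """WRONG: includes the close of `day` itself, which is not known
    while we are still trading during that day."""
    return trailing_mean(daily_closes[: day + 1], window)

def equilibrium_lagged(daily_closes, day, window=5):
    """Uses only closes up to day - 1, all known before `day` opens."""
    return trailing_mean(daily_closes[:day], window)

# Toy closes; the final day has a big jump
daily_closes = [100.0, 101.0, 99.0, 102.0, 98.0, 110.0]
today = 5

biased = equilibrium_with_lookahead(daily_closes, today)
honest = equilibrium_lagged(daily_closes, today)
print(biased, honest)  # 102.0 100.0
```

The biased equilibrium is dragged towards where the price ends the day, so limit orders placed around it during the day appear to know where the market is heading.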

The good news is that the basic technology of this strategy should work well, at least pre-costs, with a much shorter time horizon; although for all the reasons above I haven't tried it myself (though I know others that have).




A new slow absolute mean reversion strategy

Since I'm taking away one strategy, let me replace it with another. In AFTS strategy twenty two is a 'value' strategy, which bets on mean reversion over five year periods in relative terms against an asset class index. It has crappy SR (basically zero), but positive alpha and improved overall SR when added to trend and carry strategies. 

But on a recent TTU podcast, here, Paul Calluzzo asked me if I'd ever tested absolute mean reversion. Certainly I haven't on this blog. So let's do that.

I'll use a three year return for my forecast, which is slow enough to avoid the two year point where we know momentum probably still works; whilst being quick enough to avoid the death by sqrt(T) that will reduce my SR. We go long if the return is negative so:

Forecast = Price_(t-3yrs) - Price_t

To avoid the turnover being excessive (this is a slow forecast!), and because we should always vol scale:

Smoothed vol scaled Forecast = EWM_64(Forecast/ EW_std_dev(returns))
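Here is a rough, standard library only sketch of those two formulas. Several details are my assumptions rather than anything specified above: 256 business days per year, a 32 day span for the volatility estimate, volatility measured on daily price changes, and a simple recursive EWMA. The function names and the synthetic prices are made up for illustration.

```python
import math

def ewma(values, span):
    """Recursive exponentially weighted mean. None inputs do not update the
    average; the output stays None until the first real value arrives."""
    alpha = 2.0 / (span + 1)
    out, prev = [], None
    for v in values:
        if v is not None:
            prev = v if prev is None else alpha * v + (1 - alpha) * prev
        out.append(prev)
    return out

def ew_std(changes, span):
    """Exponentially weighted std dev of price changes
    (assumes a zero mean change, for simplicity)."""
    ew_var = ewma([None if c is None else c * c for c in changes], span)
    return [None if v is None else math.sqrt(v) for v in ew_var]

def slow_mean_reversion_forecast(prices, lookback=3 * 256, vol_span=32, smooth_span=64):
    changes = [None] + [b - a for a, b in zip(prices, prices[1:])]
    vol = ew_std(changes, vol_span)
    scaled = []
    for t, price in enumerate(prices):
        if t < lookback or not vol[t]:
            scaled.append(None)              # not enough history, or zero vol
        else:
            raw = prices[t - lookback] - price   # long if the price has fallen
            scaled.append(raw / vol[t])
    return ewma(scaled, smooth_span)

# Synthetic prices: one series drifting up, one drifting down
rising = [100 + 0.05 * i + math.sin(i) for i in range(1000)]
falling = [200 - 0.05 * i + math.sin(i) for i in range(1000)]

rising_forecast = slow_mean_reversion_forecast(rising)
falling_forecast = slow_mean_reversion_forecast(falling)
print(rising_forecast[-1] < 0, falling_forecast[-1] > 0)  # True True
```

A market that has risen over the last three years gets a negative (short) forecast and one that has fallen gets a positive (long) forecast, which is the absolute mean reversion bet.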

Drumroll...

It's not..... great (SR -0.48), apart from perhaps a recent pickup. You could argue that as a lot of my data starts in 2013, and the first five year return occurs in 2018, that it's actually profitable for many instruments and we've just been unlucky in the instruments we've traded before. The median SR is -0.06 though which doesn't completely support that argument. 

But really it would appear that at least with this construction absolute mean reversion isn't as good as the relative mean reversion I tested in AFTS.

OK so we've dropped a strategy with an unfeasibly high backtested SR, and I've replaced it with one that has a very poor backtested SR. Unfair? Well, life isn't fair.


Summary

Good things to trade:

Horizon over two years:  Cross sectional mean reversion, but possibly not absolute mean reversion. And similar type things like equity relative value.

One to two years: Nothing*

Three months to one year horizon: Trend following of pretty much anything

Three weeks to three months horizon: Trend following; avoiding very expensive instruments.

Several days to three weeks: Trend following; only the very cheapest instruments.

One hour to several days: Nothing*

Less than a 30 minute horizon: Mean reversion - the faster the better, but only with limit orders and with great care (the faster you are, the more care needs to be taken).

* or at least not outright momentum or mean reversion