Friday 3 February 2023

Percentage or price differences when estimating standard deviation - that is the question

In a lot of my work, including my new book, I use two different ways of measuring standard deviation. The first method, which most people are familiar with, is to use some series of recent percentage returns. Given a series of prices p_t you might imagine the calculation would be something like this:

Sigma_% = f([p_t - p_t-1]/p_t-1, [p_t-1 - p_t-2]/p_t-2, ....)

NOTE: I am not concerned with the form that function f takes in this post, but for the sake of argument let's say it's a simple moving average standard deviation. So we would take the last N of these terms, subtract the rolling mean from them, square them, take the average, and then take the square root.
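A minimal sketch of that recipe in Python, purely for illustration. The function name `rolling_std`, the window length and the toy prices are assumptions made for this example and are not taken from any particular library or from the book:

```python
import numpy as np
import pandas as pd

def rolling_std(returns: pd.Series, window: int) -> pd.Series:
    """Simple moving average standard deviation over the last `window` terms."""
    def _std(chunk: np.ndarray) -> float:
        demeaned = chunk - chunk.mean()                  # subtract the rolling mean
        return float(np.sqrt((demeaned ** 2).mean()))    # square, average, square root
    # Windows with fewer than `window` valid observations come out as NaN
    return returns.rolling(window).apply(_std, raw=True)

# Hypothetical toy prices, just to have something to run
prices = pd.Series([100.0, 101.0, 99.5, 100.5, 102.0, 101.0])
returns = prices.diff() / prices.shift(1)
sigma_pct = rolling_std(returns, window=3)
```

Note that "take the average" divides by N, whereas the pandas `.std()` method divides by N-1 by default (`ddof=1`), so the two will differ slightly for short windows.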

For futures trading we have two options for p_t: the 'current' price of the contract, and the back adjusted price. These will only be the same in the days since the last roll. In fact, because the back adjusted price can go to zero or become negative, I strongly advocate using the 'current' price as the denominator in the above equation, and the change in back adjusted price as the numerator. If we used the change in current price, we'd see a pop upwards in volatility every time there was a futures roll. So if p*_t is the current price of a contract, then:

Sigma_% = f([p_t - p_t-1]/p*_t-1, [p_t-1 - p_t-2]/p*_t-2, ....)
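To see why the numerator matters, here is a small sketch with made-up numbers. It assumes a roll between day 2 and day 3 into a contract trading 5 points higher, with the back adjusted series built by shifting the old contract's prices up by that gap. The series names and values are hypothetical:

```python
import pandas as pd

# Price of the contract actually held (p*): jumps by the 5 point gap at the roll
current = pd.Series([50.0, 50.5, 50.0, 55.5, 55.0])
# Back adjusted price (p): old contract prices shifted up by 5, so no jump
back_adjusted = pd.Series([55.0, 55.5, 55.0, 55.5, 55.0])

# Change in current price over current price: contaminated by the roll
naive = current.diff() / current.shift(1)
# Change in back adjusted price over current price: the form used above
preferred = back_adjusted.diff() / current.shift(1)

roll_day = 3
print(naive.iloc[roll_day], preferred.iloc[roll_day])
```

On the roll day the naive series shows a return of about 11%, nearly all of which is the gap between contracts, while the preferred series shows about 1%, the genuine price move.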

The alternative method is to use some series of price differences:

Sigma_d = f([p_t - p_t-1], [p_t-1 - p_t-2], ....)

Here these are all differences in back adjusted prices.

If I wanted to convert this standard deviation into terms comparable with the % standard deviation, then I would divide this by the current price (*not* the back adjusted price):

Sigma_d% = Sigma_d / p*_t

Now, clearly these are not going to give exactly the same answer, except in the tedious case where there has been no volatility (and perhaps a few, other, odd corner cases). This is illustrated nicely by the following little figure-ette (figure-ine? figure-let? figure-ling?):

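import pandas as pd
# px: back adjusted price series; pxc: 'current' contract price series (both assumed already loaded)
# Sigma_%: change in back adjusted price over the previous current price
perc = (px.diff() / pxc.shift(1)).rolling(30, min_periods=3).std()
# Sigma_d%: price difference standard deviation divided by the current price
diff = px.diff().rolling(30, min_periods=3).std().ffill() / pxc.ffill()
both = pd.concat([perc, diff], axis=1)
both.columns = ['%', 'diff']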



The two series are tracking pretty closely, except in the extreme vol of late 2008, and even then they aren't that different.

Here is another one:

That's WTI crude oil during COVID, and there is quite a big difference there. Incidentally, the difference could have been far worse. I was trading the December 2020 contract at the time... the front contract in this period (May 2020) went below zero for several days.

Now most people are more familiar with % standard deviations, which is why I have used them so much, but what you may not realise is that the price difference standard deviation is far more important.

How come? Well consider the basic position sizing equation that I have used throughout my work:

N = Capital × τ ÷ (Multiplier × Price × FX rate × σ_% )

(This is the version in my latest book, but very similar versions appear in my first and third books). Ignoring most things we get:

N = X ÷ (Price × σ_%)

So the number of contracts held is proportional to one divided by the product of the price and the percentage standard deviation estimate. The price shown is, if you've been paying attention, the current price not the back adjusted one. But remember:

Sigma_d% = Sigma_d / p*_t

Hence Price × σ_% is just the standard deviation in price difference terms, and the position is actually inversely proportional to it: N = X ÷ Sigma_d. We can either estimate this directly, or as the equation suggests recover it from the standard deviation in percentage terms, which we then multiply by the current futures price.
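A quick numerical check of that equivalence, as a sketch only. The function names and all the input numbers are invented for this example, and both standard deviations are assumed to be expressed over the same time horizon as the risk target τ:

```python
def contracts_from_pct(capital, tau, multiplier, price, fx, sigma_pct):
    """Position sizing written in terms of the percentage standard deviation."""
    return capital * tau / (multiplier * price * fx * sigma_pct)

def contracts_from_diff(capital, tau, multiplier, fx, sigma_diff):
    """The same position, written in terms of the price difference standard deviation."""
    return capital * tau / (multiplier * fx * sigma_diff)

# Hypothetical inputs
capital, tau, multiplier, fx = 100_000.0, 0.2, 1_000.0, 1.0
price, sigma_pct = 80.0, 0.25
sigma_diff = sigma_pct * price   # 20.0 in price units

n_pct = contracts_from_pct(capital, tau, multiplier, price, fx, sigma_pct)
n_diff = contracts_from_diff(capital, tau, multiplier, fx, sigma_diff)
print(n_pct, n_diff)
```

The price cancels out of the second version entirely, which is why a price near zero causes no trouble when the price difference standard deviation is estimated directly.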

As the graphs above suggest, in the majority of cases it won't make much difference which of these methods you choose. But for the corner case of prices close to zero, it will be more robust to use price differences. In conclusion: I recommend using price differences to estimate the standard deviation.

Finally, there are also times when it still makes sense to use % returns. For example, when estimating risk it's more natural to do this using percentages (I do this when getting a covariance matrix for my exogenous risk overlay and dynamic optimisation). When percentage standard deviation is required I usually divide my price difference estimate by the absolute value of the current futures price. That will handle prices close to zero and negative prices, but it will result in temporarily very high % standard deviations. This is mostly unavoidable, but at least the problem is confined to a small part of the strategy, and the most likely outcome is that we won't take positions in these markets (probably not a bad thing!).
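As a rough sketch of that conversion (the function name `pct_std_from_diff` and the toy numbers are assumptions for illustration, not code from my trading system):

```python
import pandas as pd

def pct_std_from_diff(sigma_diff: pd.Series, current_price: pd.Series) -> pd.Series:
    """Convert a price difference standard deviation into percentage terms,
    dividing by the absolute value of the current futures price."""
    return sigma_diff / current_price.abs()

# Hypothetical data: constant price difference vol while the current price
# falls towards zero and then goes negative
sigma_diff = pd.Series([2.0, 2.0, 2.0, 2.0])
current_price = pd.Series([20.0, 5.0, 0.5, -10.0])

sigma_pct = pct_std_from_diff(sigma_diff, current_price)
print(sigma_pct.tolist())
```

The result stays positive when the price is negative, but blows up as the price approaches zero (400% at a price of 0.5 here), which is the temporarily very high % standard deviation described above. A price of exactly zero would still give an infinite value, so that case needs separate handling.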

Footnote: Shout out to the diff(log price) people. How did those negative prices work out for you guys?


