A few days ago I was browsing on the elitetrader.com forum site when someone posted this:
I am interested to know if anyone change their SMA/EMA/WMA/KAMA/LRMA/etc. when volatility changes? Let say ATR is rising, would you increase/decrease the MA period to make it more/less sensitive? And the bigger question would be, is there a relationship between volatility and moving average?
Interesting, I thought, and I added it to my very long list of things to think about (in fact I've researched something vaguely like this before, but I couldn't remember what the results were, and the research was done whilst at my former employer's, which means it's currently behind a firewall and a 150-page non-disclosure agreement).
Then a couple of days ago I ran a poll off the back of this post as to what my blogpost this month should be about (though mainly the post was an excuse to reminisce about the Fighting Fantasy series of books).
And lo and behold, this subject is what people wanted to know about. But even if you don't want to know about it, and were one of the 57% that voted for the other two options, this is still probably a good post to read. I'm going to be discussing principles and techniques that apply to any evaluation of this kind of system modification.
However: spoiler alert - this little piece of research took an unexpected turn. Read on to find out what happened...
Why this is topical
This is particularly topical because during the market crisis that consumed much of 2020 it was faster moving averages that outperformed slower ones. Consider these plots, which show the average Sharpe Ratio for different kinds of trading rule, averaged across instruments. The first plot is for all the history I have (back to the 1970s), the second is for the first half of 2020, and the third is for March 2020 alone:
The pattern is striking: going faster works much better than it did in the overall sample. What's more, it seems to be confined to the financial asset classes (FX, Rates and especially equities) where vol exploded the most:
Furthermore, we can see a similar effect in another notoriously turbulent year:
If we were sell-side analysts that would be our nice little research paper finished, but of course we aren't... a few anecdotes do not make up a serious piece of analysis.
Formally specifying the problem
Rewriting this in fancy-sounding language, and bearing in mind the context of my trading system, I can state the problem as:
As I pointed out in my last post, this leaves a lot of questions unanswered. How should we define the current level of volatility? How do we define 'optimality'? How do we evaluate the performance of this change to our simple unconditional trading rules?
Defining the current level of volatility
For this to be a useful thing to do, 'current' is going to have to be based on backward looking data only. It would have been very helpful to have known in early February last year (2020) that vol was about to rise sharply, and thus perhaps different forecast weights were required, but we didn't actually own the keys to a time machine so we couldn't have known with certainty what was about to happen (and if we had, then changing our forecast weights would not have been high up our to-do list!).
So we're going to be using some measure of historic volatility. The standard measure of vol I use in my trading system (exponentially weighted, equivalent to a lookback of around a month) is a good starting point, which we know does a good job of predicting vol over the next 30 days or so (although it does suffer from biases, as I discuss here). Arguably a shorter measure of vol would be more responsive, whilst a longer measure would mean that our forecast weights aren't changing as much, thus reducing costs.
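As a concrete illustration, here's a minimal sketch of that kind of estimate; the 35 business-day span is my stand-in for "around a month", and the actual estimator in pysystemtrade includes additional refinements not shown here:
import pandas as pd

def simple_ewm_vol(daily_returns: pd.Series, span: int = 35) -> pd.Series:
    # Exponentially weighted standard deviation of daily percentage returns;
    # a span of ~35 business days is roughly a one-month lookback
    return daily_returns.ewm(span=span).std()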
Now how do we define the level of volatility? In that previous post I used the current vol estimate divided by a 10 year rolling average of the vol for the relevant instrument. That seems pretty reasonable.
Here for example is the rolling % vol for SP500:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from systems.provided.futures_chapter15.basesystem import *

system = futures_system()
instrument_list = system.get_instrument_list()

# Daily percentage volatility (EWMA, roughly one-month lookback) for each instrument
all_perc_vols = [system.rawdata.get_daily_percentage_volatility(code) for code in instrument_list]
And here's the same, after dividing by 10 year vol:
ten_year_averages = [vol.rolling(2500, min_periods=10).mean() for vol in all_perc_vols]
normalised_vol_level = [vol / ten_year_vol for vol, ten_year_vol in zip(all_perc_vols, ten_year_averages)]

def stack_list_of_pd_series(x):
    # Concatenate the values of a list of pd.Series into one flat python list
    stacked_list = []
    for element in x:
        stacked_list = stacked_list + list(element.values)

    return stacked_list

stacked_vol_levels = stack_list_of_pd_series(normalised_vol_level)
stacked_vol_levels = [x for x in stacked_vol_levels if not np.isnan(x)]

plt.hist(stacked_vol_levels, bins=1000)
Update: There was a small bug in my code that didn't affect the conclusions, but had a significant effect on the scale of the normalised vol. Now fixed. Thanks to Rafael L. for pointing this out.
- Low: Normalised vol in the bottom 25% quantile [using the entire historical period so far to determine the quantile] (over the whole period, normalised vol between 0.16 and 0.7 times the ten year average)
- Medium: Between 25% and 75% (over the whole period, normalised vol 0.7 to 1.14 times the ten year average)
- High: Between 75% and 100% (over the whole period, normalised vol 1.14 to 6.6 times the ten year average)
def historic_quantile_groups(system, instrument_code, quantiles = [.25,.5,.75]):
    daily_vol = system.rawdata.get_daily_percentage_volatility(instrument_code)
    # We shift by one day to avoid forward looking information
    ten_year_vol = daily_vol.rolling(2500, min_periods=10).mean().shift(1)
    normalised_vol = daily_vol / ten_year_vol

    quantile_points = [get_historic_quantile_for_norm_vol(normalised_vol, quantile) for quantile in quantiles]
    stacked_quantiles_and_vol = pd.concat(quantile_points + [normalised_vol], axis=1)
    quantile_groups = stacked_quantiles_and_vol.apply(calculate_group_for_row, axis=1)

    return quantile_groups

def get_historic_quantile_for_norm_vol(normalised_vol, quantile_point):
    # a very long rolling window with a small min_periods is effectively an expanding window quantile
    return normalised_vol.rolling(99999, min_periods=4).quantile(quantile_point)
def calculate_group_for_row(row_data: pd.Series) -> int:
    values = list(row_data.values)
    if any(np.isnan(values)):
        return np.nan
    vol_point = values.pop(-1)
    group = 0  # lowest group
    for comparison in values[1:]:
        if vol_point <= comparison:
            return group
        group = group + 1

    # highest group will be len(quantiles)-1
    return group
quantile_groups = [historic_quantile_groups(system, code) for code in instrument_list]
stacked_quantiles = stack_list_of_pd_series(quantile_groups)
- Low vol: 53% of observations
- Medium vol: 22%
- High vol: 25%
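As a quick sanity check, these proportions can be reproduced from the stacked quantile groups computed above; a minimal sketch:
# Proportion of observations in each vol regime: 0 = low, 1 = medium, 2 = high
# (NaNs from the warm-up period are dropped first)
group_series = pd.Series(stacked_quantiles).dropna()
print(group_series.value_counts(normalize=True).sort_index())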
Unconditional performance of momentum speeds
rule_list = list(system.rules.trading_rules().keys())

perf_for_rule = {}
for rule in rule_list:
    perf_by_instrument = {}
    for code in instrument_list:
        perf_for_instrument_and_rule = system.accounts.pandl_for_instrument_forecast(code, rule)
        perf_by_instrument[code] = perf_for_instrument_and_rule

    perf_for_rule[rule] = perf_by_instrument

# stack
stacked_perf_by_rule = {}
for rule in rule_list:
    acc_curves_this_rule = perf_for_rule[rule].values()
    stacked_perf_this_rule = stack_list_of_pd_series(acc_curves_this_rule)
    stacked_perf_by_rule[rule] = stacked_perf_this_rule

def sharpe(x):
    # assumes daily data; 16 is roughly the square root of 256 business days, so this annualises
    return 16 * np.nanmean(x) / np.nanstd(x)

for rule in rule_list:
    print("%s:%.3f" % (rule, sharpe(stacked_perf_by_rule[rule])))
historic_quantiles = {}
for code in instrument_list:
    historic_quantiles[code] = historic_quantile_groups(system, code)

conditioned_perf_for_rule_by_state = []

for condition_state in [0, 1, 2]:
    print("State:%d \n\n\n" % condition_state)

    conditioned_perf_for_rule = {}
    for rule in rule_list:
        conditioned_perf_by_instrument = {}
        for code in instrument_list:
            perf_for_instrument_and_rule = perf_for_rule[rule][code]
            condition_vector = historic_quantiles[code] == condition_state
            condition_vector = condition_vector.reindex(perf_for_instrument_and_rule.index).ffill()
            conditioned_perf = perf_for_instrument_and_rule[condition_vector]
            conditioned_perf_by_instrument[code] = conditioned_perf

        conditioned_perf_for_rule[rule] = conditioned_perf_by_instrument

    conditioned_perf_for_rule_by_state.append(conditioned_perf_for_rule)

    stacked_conditioned_perf_by_rule = {}
    for rule in rule_list:
        acc_curves_this_rule = conditioned_perf_for_rule[rule].values()
        stacked_perf_this_rule = stack_list_of_pd_series(acc_curves_this_rule)
        stacked_conditioned_perf_by_rule[rule] = stacked_perf_this_rule

    print("State:%d \n\n\n" % condition_state)
    for rule in rule_list:
        print("%s:%.3f" % (rule, sharpe(stacked_conditioned_perf_by_rule[rule])))
Testing the significance of overall performance in different vol environments
from scipy import stats

for rule in rule_list:
    perf_group_0 = stack_list_of_pd_series(conditioned_perf_for_rule_by_state[0][rule].values())
    perf_group_1 = stack_list_of_pd_series(conditioned_perf_for_rule_by_state[1][rule].values())
    perf_group_2 = stack_list_of_pd_series(conditioned_perf_for_rule_by_state[2][rule].values())

    t_stat_0_1 = stats.ttest_ind(perf_group_0, perf_group_1)
    t_stat_1_2 = stats.ttest_ind(perf_group_1, perf_group_2)
    t_stat_0_2 = stats.ttest_ind(perf_group_0, perf_group_2)

    print("Rule: %s , low vs medium %.2f medium vs high %.2f low vs high %.2f" % (rule,
                                                                                  t_stat_0_1.pvalue,
                                                                                  t_stat_1_2.pvalue,
                                                                                  t_stat_0_2.pvalue))
Rule: ewmac2_8 , low vs medium 0.37 medium vs high 0.00 low vs high 0.00
Rule: ewmac4_16 , low vs medium 0.25 medium vs high 0.00 low vs high 0.00
Rule: ewmac8_32 , low vs medium 0.12 medium vs high 0.00 low vs high 0.00
Rule: ewmac16_64 , low vs medium 0.08 medium vs high 0.00 low vs high 0.00
Rule: ewmac32_128 , low vs medium 0.07 medium vs high 0.00 low vs high 0.00
Rule: ewmac64_256 , low vs medium 0.03 medium vs high 0.00 low vs high 0.00
Rule: carry , low vs medium 0.00 medium vs high 0.32 low vs high 0.00
Is this an effect we can actually capture?
A more graduated system
- Low: Normalised vol in the bottom 25% quantile
- Medium: Between 25% and 75%
- High: Between 75% and 100%
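Rather than switching discretely between those three buckets, the attenuation is continuous: the multiplier_function defined further down maps a vol quantile of 0 to a forecast multiplier of 2.0 and a quantile of 1 to a multiplier of 0.5, linearly in between. Here's a standalone illustration of that mapping (not the production code):
# Illustration of the linear mapping used by multiplier_function (defined below):
# vol quantile 0.0 (lowest vol seen to date)  -> multiplier 2.0
# vol quantile 1.0 (highest vol seen to date) -> multiplier 0.5
for q in [0.0, 0.25, 0.5, 0.75, 1.0]:
    print("vol quantile %.2f -> forecast multiplier %.2f" % (q, 2 - 1.5 * q))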
Smoothing vol forecast attenuation
Testing the attenuation, rule by rule
from systems.forecast_scale_cap import *
from statsmodels.distributions.empirical_distribution import ECDF

class volAttenForecastScaleCap(ForecastScaleCap):

    @diagnostic()
    def get_vol_quantile_points(self, instrument_code):
        ## More properly this would go in raw data perhaps
        self.log.msg("Calculating vol quantile for %s" % instrument_code)
        daily_vol = self.parent.rawdata.get_daily_percentage_volatility(instrument_code)
        ten_year_vol = daily_vol.rolling(2500, min_periods=10).mean()
        normalised_vol = daily_vol / ten_year_vol

        normalised_vol_q = quantile_of_points_in_data_series(normalised_vol)

        return normalised_vol_q

    @diagnostic()
    def get_vol_attenuation(self, instrument_code):
        normalised_vol_q = self.get_vol_quantile_points(instrument_code)
        vol_attenuation = normalised_vol_q.apply(multiplier_function)

        smoothed_vol_attenuation = vol_attenuation.ewm(span=10).mean()

        return smoothed_vol_attenuation

    @input
    def get_raw_forecast_before_attenuation(self, instrument_code, rule_variation_name):
        ## original code for get_raw_forecast
        raw_forecast = self.parent.rules.get_raw_forecast(
            instrument_code, rule_variation_name
        )

        return raw_forecast

    @diagnostic()
    def get_raw_forecast(self, instrument_code, rule_variation_name):
        ## overridden method; this will be called downstream so don't change the name
        raw_forecast_before_atten = self.get_raw_forecast_before_attenuation(instrument_code, rule_variation_name)

        vol_attenuation = self.get_vol_attenuation(instrument_code)

        attenuated_forecast = raw_forecast_before_atten * vol_attenuation

        return attenuated_forecast

def quantile_of_points_in_data_series(data_series):
    results = [quantile_of_points_in_data_series_row(data_series, irow) for irow in range(len(data_series))]
    results_series = pd.Series(results, index=data_series.index)

    return results_series

# this is a little slow so suggestions for speeding up are welcome
def quantile_of_points_in_data_series_row(data_series, irow):
    if irow < 2:
        return np.nan
    historical_data = list(data_series[:irow].values)
    current_value = data_series[irow]
    ecdf_s = ECDF(historical_data)

    return ecdf_s(current_value)

def multiplier_function(vol_quantile):
    if np.isnan(vol_quantile):
        return 1.0

    return 2 - 1.5 * vol_quantile
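As a quick standalone check (not part of the original workflow), the helper functions above can be run on a synthetic normalised vol series, to see the attenuation and its smoothing in isolation:
# Sanity check on synthetic data: map a toy normalised vol series to
# expanding-window quantiles, then to attenuation multipliers, then
# smooth with the same 10-day EWMA used in get_vol_attenuation()
np.random.seed(42)
toy_index = pd.date_range("2020-01-01", periods=250, freq="B")
toy_normalised_vol = pd.Series(np.abs(np.random.randn(250)) + 0.5, index=toy_index)

toy_quantiles = quantile_of_points_in_data_series(toy_normalised_vol)
toy_attenuation = toy_quantiles.apply(multiplier_function)
smoothed_attenuation = toy_attenuation.ewm(span=10).mean()
print(smoothed_attenuation.tail())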
And here's how to implement it in a new futures system (we just copy and paste the futures_system code and change the object passed for the forecast scaling/capping stage):
from systems.provided.futures_chapter15.basesystem import *

def futures_system_with_vol_attenuation(data=None, config=None, trading_rules=None, log_level="on"):

    if data is None:
        data = csvFuturesSimData()

    if config is None:
        config = Config(
            "systems.provided.futures_chapter15.futuresconfig.yaml")

    rules = Rules(trading_rules)

    system = System(
        [
            Account(),
            Portfolios(),
            PositionSizing(),
            FuturesRawData(),
            ForecastCombine(),
            volAttenForecastScaleCap(),
            rules,
        ],
        data,
        config,
    )

    system.set_logging_level(log_level)

    return system
And now I can set up two systems, one without attenuation and one with:
system = futures_system()
# will equally weight instruments
del(system.config.instrument_weights)

# need to do this to deal fairly with attenuation
# do it here for consistency
system.config.use_forecast_scale_estimates = True
system.config.use_forecast_div_mult_estimates = True

# will equally weight forecasts
del(system.config.forecast_weights)

# standard stuff to account for instruments coming into the sample
system.config.use_instrument_div_mult_estimates = True

system_vol_atten = futures_system_with_vol_attenuation()
del(system_vol_atten.config.forecast_weights)
del(system_vol_atten.config.instrument_weights)
system_vol_atten.config.use_forecast_scale_estimates = True
system_vol_atten.config.use_forecast_div_mult_estimates = True
system_vol_atten.config.use_instrument_div_mult_estimates = True

rule_list = list(system.rules.trading_rules().keys())

for rule in rule_list:
    sr1 = system.accounts.pandl_for_trading_rule(rule).sharpe()
    sr2 = system_vol_atten.accounts.pandl_for_trading_rule(rule).sharpe()

    print("%s before %.2f and after %.2f" % (rule, sr1, sr2))
Let's check out the results:
ewmac2_8 before 0.43 and after 0.52
ewmac4_16 before 0.78 and after 0.83
ewmac8_32 before 0.96 and after 1.00
ewmac16_64 before 1.01 and after 1.07
ewmac32_128 before 1.02 and after 1.07
ewmac64_256 before 0.96 and after 1.00
carry before 1.07 and after 1.11
Now these aren't huge improvements, but they are very consistent across every single trading rule. But are they statistically significant?
from syscore.accounting import account_test

for rule in rule_list:
    acc1 = system.accounts.pandl_for_trading_rule(rule)
    acc2 = system_vol_atten.accounts.pandl_for_trading_rule(rule)

    print("%s T-test %s" % (rule, str(account_test(acc2, acc1))))
ewmac2_8 T-test (0.005754898313025798, Ttest_relResult(statistic=4.23535684665446, pvalue=2.2974165336647636e-05))
ewmac4_16 T-test (0.0034239182014355815, Ttest_relResult(statistic=2.46790714210943, pvalue=0.013603190422737766))
ewmac8_32 T-test (0.0026717541872894254, Ttest_relResult(statistic=1.8887927423648214, pvalue=0.058941593401076096))
ewmac16_64 T-test (0.0034357601899108192, Ttest_relResult(statistic=2.3628815728522112, pvalue=0.018147935814311716))
ewmac32_128 T-test (0.003079560056791747, Ttest_relResult(statistic=2.0584403445859034, pvalue=0.03956754085349411))
ewmac64_256 T-test (0.002499427499123595, Ttest_relResult(statistic=1.7160401190191614, pvalue=0.08617825487582882))
carry T-test (0.0022278238232666947, Ttest_relResult(statistic=1.3534155676590192, pvalue=0.17594617201514515))
A mixed bag there, but with the exception of carry there does seem to be a reasonable amount of improvement; most markedly with the very fastest rules.
Again, I could do some implicit fitting here to only use the attenuation on momentum, or use less of it on slower momentum. But I'm not going to do that.
Summary
To return to the original question: yes, we should change our trading behaviour as vol changes. But not in the way you might think, especially if you had extrapolated the performance from March 2020. As vol gets higher, faster trading rules do relatively badly, but actually the bigger story is that all momentum rules suffer (as does carry, a bit). Not what I had expected to find, but very interesting. So a big thanks to the internet's hive mind for voting for this option.