Comments on Investment Idiocy: Correlations, Weights, Multipliers.... (pysystemtrade)

Patrick (2017-09-04 10:34):
Understood. And thank you again, Rob.

Rob Carver (2017-09-04 09:28):
Patrick.
Essentially the question is: what is the difference between the correlation of forecast RETURNS and the correlation of forecast VALUES? If returns are more correlated than values, the FDM will be too high (producing a combined forecast with too much volatility), and vice versa. In fact the answer isn't obvious, and depends very much on the properties of the signal and the underlying market.

At the end of the day I think it's more intuitive to state the FDM in terms of ensuring the final combined forecast VALUE has the correct scaling properties. The FDM would have to be way off for me to drop this point of view, and it is usually pretty close. In fact any effect is dominated by the problem that volatilities and correlations aren't stable enough, which means that trying to hit a volatility target is an imprecise science at best.

"Would I pool all instrument data if they didn't share the same rule variations?" Ideally yes, but it's very complicated to do this - essentially the problem is averaging a series of correlation matrices with some overlapping and some non-overlapping elements, and producing a well-defined matrix at the end. There are techniques to do this, but it strikes me as overkill to bother...

Patrick (2017-09-03 17:31):
Many thanks Rob. The more I dig into your system, the more I appreciate the serious thought which has gone into engineering it. A couple of follow-up observations. From my comparatively low vantage point, I also see that from a practical point of view certain simplifications are desirable (at very little cost to robustness). Would you say that is the case with the FDM?
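The scaling property discussed here can be sketched numerically. This is a minimal illustration, assuming the standard formula FDM = 1 / sqrt(w'Hw), with w the forecast weights and H the correlation matrix of forecast values; the weights and correlations below are made-up numbers, not output from pysystemtrade.

```python
import numpy as np

# Minimal sketch of the forecast diversification multiplier (FDM),
# assuming FDM = 1 / sqrt(w' H w), where w are the forecast weights
# and H is the correlation matrix of forecast VALUES.
# The weights and correlations are illustrative, not estimated.
def fdm(weights, corr):
    w = np.asarray(weights, dtype=float)
    # variance of the combined forecast if each forecast has unit vol
    combined_var = w @ np.asarray(corr, dtype=float) @ w
    return 1.0 / np.sqrt(combined_var)

print(fdm([0.5, 0.5], [[1.0, 0.4], [0.4, 1.0]]))  # about 1.195: imperfect correlation dilutes combined vol
print(fdm([0.5, 0.5], [[1.0, 1.0], [1.0, 1.0]]))  # 1.0: perfectly correlated forecasts need no multiplier
```

The multiplier is exactly the factor that restores the combined forecast's volatility to that of a single forecast, which is the "correct scaling" point of view above.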
Strictly, if it made a big difference, should we be adjusting for the reduced portfolio volatility of forecast returns rather than of the forecasts themselves? Also, with regard to FDMs, would you pool all instrument data whether or not they share the same rule variations after costs?

Rob Carver (2017-09-02 14:05):
You've got it. A couple of interesting points from what you've said though:

- the fact that a forecast for carry comes out as mu/sigma is a nice property, but in general we only require that raw forecasts are PROPORTIONAL to mu/sigma. So some further correcting factor may be necessary (the forecast scalar).

- in terms of multiple rules: obviously if the weights added up to 1 and the rules were 100% correlated, you'd end up with a joint forecast with exactly the same properties as the individual forecasts.
In theory the forecast diversification multiplier deals with this problem, with the caveat that it's backward-looking and based on historic correlations, so it only works in expectation.

Rob Carver (2017-09-02 13:47):
from systems.provided.futures_chapter15.basesystem import futures_system
from matplotlib.pyplot import show

system = futures_system(log_level="on")
a = system.combForecast.get_all_forecasts("EUROSTX")
b = system.combForecast.get_forecast_weights("EUROSTX")
c = a * b
c.plot()
show()

https://drive.google.com/file/d/0B2xHDlIRSeeXZXZjenU1QlRaQkk/view?usp=sharing

Matt (2017-08-31 13:08):
Hi Rob,
Do you have the breakdown of subsystem signals for the Eurostoxx? You never get short in 2015, only less long? It looks like the market heads down quite a bit. Is this because of the carry signal dwarfing the trend signal? Optically, I can't line up the forecast weights with the chart.

Thanks!

Patrick (2017-08-30 19:35):
OK, I think I get it. I really had to have a think and run some simulations, but as far as I can tell there seem to be two effects in play here. The arithmetic return from applying a rule to an instrument is the product of two random variables: instrument returns and forecasts. Assuming independence between these rvs, the variance can be shown to be a function of their first two moments. Over sufficiently long periods, these moments across different instruments are equal (they asymptotically converge). However, over shorter periods there may be divergence (i.e. different averages, different vols), which will violate the assumption of equal vols required to run the optimiser using correlations only. As far as I can tell there is also a more subtle effect, arising from the fact that forecasts and instrument returns are not independent (EWMAC 2,8 and daily returns, using random Gaussian data, have a correlation of 45%). This inconveniently introduces covariance terms into the calculation of vol. However, in the cross-section of a single rule applied across different instruments over sufficiently long periods of time, the covariance terms should have an equal effect across all instruments. Again, over short periods there may be divergence.
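The first effect mentioned above - the variance of a product of independent random variables being a function of their first two moments - is easy to check by simulation. The moments below are arbitrary illustrative values, not fitted to any market.

```python
import numpy as np

# For INDEPENDENT random variables F (forecast) and R (instrument
# return), the variance of the product F*R depends only on their
# first two moments:
#   Var(F*R) = mu_F^2 var_R + mu_R^2 var_F + var_F var_R
# All parameters below are made-up illustrative numbers.
rng = np.random.default_rng(42)
mu_f, sd_f = 10.0, 5.0       # forecast mean and vol
mu_r, sd_r = 0.0002, 0.01    # daily return mean and vol

f = rng.normal(mu_f, sd_f, size=1_000_000)
r = rng.normal(mu_r, sd_r, size=1_000_000)

analytic = mu_f**2 * sd_r**2 + mu_r**2 * sd_f**2 + sd_f**2 * sd_r**2
empirical = np.var(f * r)
print(analytic, empirical)  # the two estimates agree closely
```

If F and R are correlated (as with EWMAC and returns), extra covariance terms appear and this identity no longer holds, which is the second, more subtle effect described above.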
This divergence in small samples from the assumed distribution of the population is presumably why it is sensible to equalise vols before optimising. Am I on the right track?

BTW please feel free to delete my earlier comment.

Patrick (2017-08-28 13:08):
Hi Rob, it's possible I worded my original question poorly.

I will try to be more systematic, so please bear with me. To recap on my understanding:

1. 'Vol normalisation' is what you do when standardising forecasts. This is typically done by dividing by rolling 36-day EWMA vol * current price.
2. 'Vol standardisation' is what you do when standardising subsystems. I would again use rolling 36-day EWMA vol (times block value, etc.) for this.
3. 'Vol equalisation' is what you do prior to optimisation: scale the returns over the entire (expanding) window so that, over this window, they have the same volatility.
4. Assuming the above is correct, a subsystem position for carry and EWMAC variations is proportional to expected return/vol^2 (which coincidentally seems to be proportional to the optimal Kelly scale, although not for the breakout rule).
5. When I originally said 'Now when using pooled data for forecasts, my thinking is fuzzier: is it advisable not to equalise means or vol?', to be clearer, I was trying to ask whether it makes sense to equalise vols prior to optimising forecast weights when pooling (not whether to equalise vols when optimising subsystem weights, if we had pooled forecasts previously).
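The 'vol equalisation' step in point 3 can be sketched in a few lines. Note that `equalise_vols` below is a hypothetical stand-in for pysystemtrade's vol_equaliser (whose exact target vol may differ), and the return series are random illustrative data.

```python
import numpy as np
import pandas as pd

# Sketch of vol equalisation: rescale each return series so that,
# over the whole optimisation window, every series has the same
# standard deviation. The covariance matrix of the rescaled returns
# is then just a scaled correlation matrix, so the optimiser is
# driven by correlations (and means) only.
def equalise_vols(returns: pd.DataFrame) -> pd.DataFrame:
    vols = returns.std()
    target = vols.mean()  # scale every series to the average vol (an assumed choice of target)
    return returns * (target / vols)

rng = np.random.default_rng(1)
rets = pd.DataFrame({
    "rule_A": rng.normal(0, 0.01, 500),
    "rule_B": rng.normal(0, 0.03, 500),  # three times the vol of rule_A
})
eq = equalise_vols(rets)
print(eq.std())  # both columns now share the same vol over the window
```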
In a previous post ('A little demonstration of portfolio optimisation') you do an asset vol 'normalisation' which I believe is the same as the 'equalisation' discussed here (scale the whole window, although not done on an expanding window), but I got the impression that for forecasts the normalisation is handled as above, and that this took care of the need for further equalisation (for forecasts at least).

I must admit, I had always thought that if you want to use only correlations and means to optimise, then intuitively you should equalise vols in the window being sampled (because, to quote you, this reduces the covariance matrix to a correlation matrix). However, I had somehow accepted the fact that normalising forecasts by recent vol ended up doing something similar (also from reading some comments by you about not strictly needing to equalise vols for forecasts, etc.). But I guess a different issue arises when pooling short histories?

In summary, assuming you deciphered my original question correctly: are you saying it is still important to equalise vols of forecasts, because the 'realised' volatility of forecast returns is proportional to the level of the forecast (so a forecast of 10 would have twice the volatility of a forecast of 5), causing the optimiser to downweight elevated forecasts, which is a problem when pooling short data histories? By equalising vols over the entire window being used for optimisation, we end up removing this effect? If that is what you are saying then I promise to go away and think about this much more deeply.

Thanks again for taking the time.

Rob Carver (2017-08-27 12:46):
There is rolling historical 36-day EWMA vol used for position sizing. Realised vol is what actually happens in the future.
When doing optimisation we use a different estimated vol: the vol of returns over a given period (the entire backtest, if using an expanding window).

The vol used for vol standardisation is the estimated vol of the p&l of the instrument subsystem returns. If all forecasts were identical, and we could predict vol perfectly, then this would automatically be the same for all instruments (because when we scale positions for instrument subsystems we target the same volatility per unit forecast). The fact that estimated vol isn't a perfect forecast adds some randomness to this; a randomness that will be problematic for very short data histories. More problematic again is that instruments with short histories will have different forecasts. An instrument with an average forecast of 10 over a year, compared to another with an average of 5, will have twice the estimated volatility over that backtest period. But that is an artifact of the different forecast level - it doesn't mean we should actively downweight the higher-forecast instrument.

I agree I could have been stricter with the terminology I use about equalisation, normalisation and standardisation. There are at least three things going on, which are subtly different:

a) many forecasts involve dividing something by estimated vol to get a forecast that is proportional to Sharpe ratio (normalisation)
b) position scaling involves using estimated vol and assuming it is a good forecast of future vol
c) equalisation of standard deviation estimates when doing optimisation, for all the reasons I've discussed.

Patrick (2017-08-25 15:08):
Hi Rob thanks for this.
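The forecast-level artifact described above (average forecast of 10 vs 5 giving twice the estimated vol) can be seen in a two-line simulation. All numbers are made up for illustration.

```python
import numpy as np

# Two instruments with identical per-unit-forecast risk, but
# different average forecasts over a short backtest, show different
# ESTIMATED subsystem vols even though neither is genuinely riskier.
rng = np.random.default_rng(7)
unit_pnl = rng.normal(0, 0.01, 250)  # one year of p&l per unit of forecast, same for both

pnl_strong = 10 * unit_pnl  # instrument with an average forecast of 10
pnl_weak = 5 * unit_pnl     # instrument with an average forecast of 5

print(np.std(pnl_strong) / np.std(pnl_weak))  # 2.0: purely an artifact of the forecast level
```

Feeding these estimated vols into an optimiser would penalise the stronger-forecast instrument, which is exactly what equalisation avoids.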
Just so I can get my head around your answer a bit better, a question on terminology: are 'estimated' vol and 'realised' vol the same thing, and equal to the vol used for standardisation (i.e. rolling historic 36-day EWMA vol)? As I understand it, the two inputs into the optimisation you do are correlations and mean returns. So are you saying that if we relied merely on vol standardisation (using recent realised vol), then a period of high vol for an instrument with a short data history but high forecasts would lower the forecasts and their corresponding weights? I am failing to make the connection between high forecasts and the high price vol which is used for standardisation. I am sorry if I have completely missed the point.

On a related point, and I should have asked this earlier: on page 289 of your book you recommend that prior to optimisation we should ensure 'returns have been vol normalised'. I assume this is the same as the 'equalisation' you refer to in this post, and not the same as standardisation (btw, the term 'volatility normalised' is in bold, so perhaps your publishers might consider putting a reference in the glossary for future editions before your book becomes compulsory reading for our grandkids).

Rob Carver (2017-08-25 09:34):
You're correct that the solution relies entirely on correlations in this case (in fact it's the optimal Sharpe, maximum return and minimum variance solution).

"Now when using pooled data for forecasts, my thinking is fuzzier: is it advisable not to equalise means or vol?"

No, you should still equalise them. Basically the logic in all cases is this: vol targeting equalises expected vol, but not realised vol, which will still be different.
If realised vol goes into the optimiser then it will have an effect on the weights, which we don't want. To take an extreme example: if you have an instrument with a very short data history which happens to have a very strong forecast in that period, then its estimated vol will be unrealistically high, and it will see its weights downgraded unless we equalise vols.

Patrick (2017-08-25 08:54):
Hi Rob, I have a question with regard to setting up data prior to optimising weights using bootstrapping. If we follow your advice, forecast returns are already standardised across instruments through dividing by, say, 36-day EWMA vol. However, I understand from the above example that it makes sense also to equalise vols and means. I assume the vol_equaliser fn does this by rescaling the time series of returns so that all the forecast distributions are virtually the same over the entire series (i.e. have identical Sharpes). The weights you derive would presumably be those of a minimum variance portfolio, and therefore rely on a solution based entirely on the correlations between the returns. Is the above correct? I assume you recommend the same procedure for bootstrapping subsystem weights (i.e. equalise means and vols). Now when using pooled data for forecasts, my thinking is fuzzier: is it advisable not to equalise means or vol?
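The point that the solution then "relies entirely on correlations" can be made concrete: with equal means and equal vols, both the maximum Sharpe and the minimum variance weights are proportional to the inverse correlation matrix times a vector of ones. The correlation matrix below is illustrative.

```python
import numpy as np

# Once means and vols are equalised, the optimal (max Sharpe, and
# also minimum variance) weights depend only on the correlation
# matrix: w is proportional to C^-1 @ 1, normalised to sum to 1.
def corr_only_weights(corr):
    c_inv = np.linalg.inv(np.asarray(corr, dtype=float))
    raw = c_inv @ np.ones(len(corr))
    return raw / raw.sum()

corr = [[1.0, 0.8, 0.2],
        [0.8, 1.0, 0.2],
        [0.2, 0.2, 1.0]]
print(corr_only_weights(corr))  # the diversifying third asset gets the most weight
```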
Minkyu Kim (2017-08-16 03:18):
I have read chapter 14 again and it was helpful.

I should have checked the book before I asked.

Thanks for the reply.

Rob Carver (2017-08-14 11:37):
Solution 1 is correct. Chapter 14 actually discusses exactly this problem, so it might be worth (re)reading it.

Chad B (2017-07-30 17:56):
Thank you, Gainz! I see that you have an additional folder in 'sysdata' and the 'price' CSVs are 15-minute entries. Can you comment on the adaptation scheme in the stages of the workflow? For example, looking at the /systems/rawdata.py script, only daily prices are mentioned. Did you go that far with changing the code to recognize intraday, but leave the names as 'daily', or is the actual adaptation still needed?

Wesay GAINZ (2017-07-26 19:33):
Just briefly looking at the intra-day posts in this thread, I can say I spent a considerable amount of time last year making the program intra-day compatible (I was using TradeStation data).
It's been some time since I looked at this project, but I can refer you to where I left off (I believe it works, but I haven't had anybody review my work): https://github.com/anthonywise/pysystemtrade/tree/tscompare

Hope this helps

Minkyu Kim (2017-07-25 10:27):
This comment has been removed by a blog administrator.

Minkyu Kim (2017-07-25 10:26):
Dear Mr. Carver,
Most of all, I always appreciate you sharing detailed & practical knowledge of quantitative trading.

I have a few questions while reading your book & blog posts.

I am trying to develop a trend-following trading system with ETFs using the framework in your book. The trading system is long-only and constrained by a leverage limit (100%). Under these constraints, what is the best way to use your framework properly? Are there any changes in calculating forecast scalars, forecast weights, FDM, IDM, etc.?

My thought is...

Solution 1.
- Maintain all the procedures in your framework as if I could go long/short and had no leverage limit. (Suppose that I have a 15% target vol.)
- When I calculate positions for trading, I just assign a zero position for negative forecasts. And if the sum of long positions exceeds my capital, I scale down the positions so that the portfolio is not leveraged.

Solution 2.
- Forecast scalar: no change. I calculate forecasts and scale them (-20 to +20).
- Forecast weights, correlation: for each trading rule,
 + Calculate portfolio returns of pooled instruments according to the forecasts.
 + Replace returns for negative forecasts with zeros (zero position instead of short).
 + Scale down the returns for positive forecasts when the sum of long positions exceeds my capital.
 + Use these trading rule returns when bootstrapping or calculating correlations.
 + Optimise forecast weights using these returns.
- FDM:
 + Calculate the FDM based on forecast weights and correlations among the forecasts, as in your framework.
 + Calculate the historical participation (= sum(long positions)/my capital) using the new rescaled forecasts and forecast weights.
 + Check the median participation for the back-tested period.
 + If it exceeds 100%, scale down the FDM so that the portfolio does not take too much risk.

Frankly speaking, I don't know what the right way is. Neither approach seems proper. Maybe it is because of my lack of understanding.

Would you give any advice?

I am really looking forward to your 2nd book. Thanks for reading.

Best regards,

Michael Kim

Rob Carver (2017-07-18 15:24):
To be honest I haven't given this much thought. I can see why it will affect the calculation of volatility, but it's not obvious to me how it affects correlation.

JMW100 (2017-07-18 13:41):
Hi Rob, what is your view on using returns (weekly in your case) to compute correlations vs log returns (as per this AQR paper)? Why do you suppose some choose log returns? Do you see any significant difference between the two?
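'Solution 1' above - which Rob confirms is correct earlier in the thread - can be sketched in a few lines: run the framework unchanged, then zero out positions with negative forecasts and scale the longs down pro rata so gross exposure never exceeds capital. The function name and the position/capital figures below are hypothetical, not part of pysystemtrade.

```python
import numpy as np

# Long-only, no-leverage position adjustment applied AFTER the
# unconstrained system has produced its positions (Solution 1).
def long_only_cap(position_values, capital):
    longs = np.clip(position_values, 0.0, None)  # zero position for negative forecasts
    gross = longs.sum()
    if gross > capital:                          # enforce the 100% leverage limit
        longs = longs * (capital / gross)
    return longs

unconstrained = np.array([60_000.0, -25_000.0, 80_000.0])  # made-up position values
capped = long_only_cap(unconstrained, capital=100_000.0)
print(capped, capped.sum())  # shorts zeroed, longs scaled so the total is <= capital
```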
I found this on the topic: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1586656, but I'm still not seeing the fundamental reason to use one vs the other.
Appreciate your thoughts as always.

Rob Carver (2017-07-18 09:23):
The answer to your question(s) is that I have not tested the system with non-daily data, so there is no guarantee it will work. I am pretty sure there are several places where I assume the data is daily (and you have unearthed one of them); some data is always resampled daily whilst other data is not, so there could be some issues with mismatching and whatnot.

The legacy data with multiple snapshots per day is an oversight (my code should resample to daily before burning the legacy data - that might not have happened on the version of the data you are using) - and indeed it may be causing some slightly unpredictable results in the last few years.

In summary, I would need to do a lot of testing before I was confident the code would work with non-daily data, so I really wouldn't trust it for those purposes yet.

Chad B (2017-07-18 05:09):
P.S. For example, does the diversification multiplier need to be modified when interpreting 1-minute periods instead of sampling at end-of-day?
What about volatility scaling floors currently set with a daily period?

Chad B (2017-07-18 05:07):
Also, since many of the instruments in the legacy data have a lot of days near the final years of the records with more than one recorded value per day, it seems that using new CSVs with intraday data would be feasible in short order, but making sure to change the period of several calculations in other stages to recognize the periodicity on a minute scale instead of days, no?
Sorry in advance for my hasty monologue...

Chad B (2017-07-18 05:04):
Good morning, Rob.
When I run your ch_15 system with the default configs, trading rules, etc. unmodified, the stages run fine. If I substitute the legacy CSV files of several instruments with *intraday* 1-minute bars in the same 2-column format, both for the '_price' file and '_carrydata' (expiration months spaced from the current date as you showed in the legacy versions), spanning 5 days each, and re-run the system changing nothing except reducing the instrument_list entries, I get the error from line 530 (get_notional_position) of /systems/portfolio.py: "No rules are cheap enough for CRUDE_W with threshold of 1.300 SR units! Raise threshold (...), add rules, or drop instrument."
I raised it from the original 0.13 to 1.3, and in other tests as high as 100 (a ridiculous value of course, just testing...), with the same result. It seems I'm overlooking a simple principle of the system, but I can't figure out why, given the trading rules were left the same. Can you offer a pointer?

OTNY33 (2017-05-03 16:21):
Yes, I agree. That was also my initial intuition. I compared the weekly non-overlapping approach with the overlapping 3-day approach over the same time-frame, and for the markets that are synchronous the correlation estimates are very similar. More importantly, when I ran it on the Hang Seng, for example, the rolling 3-day approach was quite close to the weekly approach. So obviously some slight differences but, as you say, the approach doesn't seem like something crazy.

Your response is much appreciated.