Static data include:
- Futures, and specific contract details
- System parameters
- Price data
- Volume data
- Fundamental data if used, eg PE ratios for equities
- Accounting data
- Diagnostic data, storing what the system did at various points in the past
State data relates to the control of the trading system and I won't be going into details here - its just static data which is frequently modified.
Code is in the usual git repo. You will need pandas and sqllite3 (which came with my python distro automatically so check your own).
Creating the database
dbname="mydb"dbfilename=get_db_filename(dbname)
setup_blank_tables(dbfilename, ["CREATE TABLE timeseries (datetime text, code text, price float)",
"CREATE TABLE static (code text, fullname text)"])
Here we're creating a database file with two tables in it. Notice the use of database names to abstract away from where they are stored. I find the performance of sqllite3 with massive files isn't great so I tend to stick to one table per file in practice, but for this simple example we don't need to.
If you're a SQL whizz you'll see that I am not doing any relational stuff here.
Static data
st_table=staticdata(dbname)st_table.add("FTSE", "FTSE 100 index")
st_table.modify("FTSE", "FTSE all share")
print st_table.read("FTSE")
st_table.delete("FTSE")
Notice that we use staticdata so we don't need to use any SQL in these commands (in case the underlying table structure changes and to avoid having reams of repetitive nonsense), and within that the connection object ensures that the staticdata code isn't specific to sqlite3.
The sqlite3 read returns lists of tuples, which staticdata.read() resolves to a single string.
Timeseries data
dt_table=tsdata(dbname)someprices=pd.TimeSeries(range(100), pd.date_range('1/1/2014', periods=100))
dt_table.add("FTSE", someprices)
dt_table.read("FTSE")
We use pandas TimeSeries class as the input and output, which is then translated into database terms. sqlite has no native datetime format, only text or float, so we need to translate between pandas/datetime and text. I define a specific format for the text representation to be precise and ensure the database is forward compatible to any changes in pandas.
Why don't you use MongoDB instead of SQL?
ReplyDeleteI think mongodb is great for storing things like system configuration information and the saved state. However for straightforward time series sql is simpler. I haven't used mongodb in my "home" enviroment, only an enterprise enviroment, so I don't know how easy it is to setup.
DeleteHaving said that I have heard very good things about arctic
https://github.com/manahl/arctic
Hey Rob, great and concise post! Thanks for the help.
ReplyDeleteIf you allow me, I'd advise you to change the code template color, it's very hard to read pale green over white. Sorry to bug in with such simple stuff :)
Good suggestion. I will bear that in mind for future posts (can't promise I will spend time changing the past).
Delete