Backtrader Chinese Notes: CSV Data Feed Development-General (second revision)
Binary Datafeed Development
Note
The binary file used in the examples, goog.fd, belongs to VisualChart and cannot be distributed with backtrader.
VisualChart can be downloaded free of charge by those interested in using the binary files directly.
CSV Data feed development has shown how to add new CSV-based data feeds. The existing base class CSVDataBase provides the framework, taking most of the work off the subclasses, which in most cases can simply do:
def _loadline(self, linetokens):

    # parse the linetokens here and put them in self.lines.close,
    # self.lines.high, etc

    return True  # if data was parsed, else ... return False
The base class takes care of the parameters, initialization, opening of files, reading lines, splitting the lines into tokens and additional things like skipping lines which don't fit into the date range (fromdate, todate) which the end user may have defined.
Developing a non-CSV datafeed follows the same pattern, without going down to the already split line tokens.
Things to do:
- Derive from backtrader.feed.DataBase
- Add any parameters you may need
- Should initialization be needed, override __init__(self) and/or start(self)
- Should any clean-up code be needed, override stop(self)
- The work happens inside the method which MUST always be overridden: _load(self)
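The contract in this list can be illustrated without backtrader installed. The sketch below is a stand-in, not backtrader code: SketchFeed and run_feed are hypothetical names, and the driver loop only approximates how a platform might call start, _load and stop.

```python
class SketchFeed:
    """Hypothetical stand-in for a DataBase subclass (not backtrader code)."""

    def __init__(self, rows):
        # one-time setup: remember where the data comes from
        self.rows = rows
        self.calls = []      # records the order of lifecycle calls
        self.current = None  # stands in for index 0 of the lines
        self.it = None

    def start(self):
        # per-run setup: (re)open the source
        self.calls.append('start')
        self.it = iter(self.rows)

    def stop(self):
        # per-run clean-up: release the source
        self.calls.append('stop')
        self.it = None

    def _load(self):
        # deliver exactly one bar per call; False means no more data
        self.calls.append('_load')
        try:
            self.current = next(self.it)
        except StopIteration:
            return False
        return True


def run_feed(feed):
    """Hypothetical driver: start, pull bars until _load says False, stop."""
    delivered = []
    feed.start()
    while feed._load():
        delivered.append(feed.current)
    feed.stop()
    return delivered


feed = SketchFeed([10.0, 10.5, 11.0])
bars = run_feed(feed)
print(bars)
print(feed.calls)
```

Because start resets the iterator, calling run_feed a second time on the same object delivers the same bars again, which mirrors the idea that a feed may be started more than once.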
Let's look at the parameters already provided by backtrader.feed.DataBase:
from backtrader.utils.py3 import with_metaclass

...
...

class DataBase(with_metaclass(MetaDataBase, dataseries.OHLCDateTime)):

    # These parameters are also available when processing CSV data
    params = (('dataname', None),
        ('fromdate', datetime.datetime.min),
        ('todate', datetime.datetime.max),
        ('name', ''),
        ('compression', 1),
        ('timeframe', TimeFrame.Days),
        ('sessionend', None))
Having the following meanings:
- dataname is what allows the data feed to identify how to fetch the data. In the case of the CSVDataBase this parameter is meant to be a path to a file or already a file-like object.
- fromdate and todate define the date range which will be passed to strategies. Any value provided by the feed outside of this range will be ignored.
- name is cosmetic, for plotting purposes.
- timeframe indicates the temporal working reference. Potential values: Ticks, Seconds, Minutes, Days, Weeks, Months and Years.
- compression (default: 1) is the number of actual bars per bar. Informative. Only effective in Data Resampling/Replaying.
- sessionend, if passed (a datetime.time object), will be added to the datafeed datetime line, which allows identifying the end of the session.
Sample binary datafeed
backtrader already defines a CSV datafeed (VChartCSVData) for the exports of VisualChart, but it is also possible to directly read the binary data files.
Let's do it (the full data feed code can be found at the bottom).
Initialization
The binary VisualChart data files can contain either daily (.fd extension) or intraday data (.min extension). Here the parameter timeframe will be used to distinguish which type of file is being read.
def __init__(self):
    super(VChartData, self).__init__()

    # Use the informative "timeframe" parameter to understand if the
    # code passed as "dataname" refers to an intraday or daily feed
    if self.p.timeframe >= TimeFrame.Days:
        self.barsize = 28
        self.dtsize = 1
        self.barfmt = 'IffffII'
    else:
        self.dtsize = 2
        self.barsize = 32
        self.barfmt = 'IIffffII'
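The two barsize constants can be cross-checked against the format strings with struct.calcsize. A small check, using the format strings from __init__; the '<' prefix (standard sizes, no padding) is an assumption added here so that the result does not depend on the platform, while the feed itself uses the native format:

```python
import struct

# Format strings taken from __init__; each 'I' and 'f' occupies 4 bytes
# under standard sizes.
DAILY_FMT = 'IffffII'      # date, open, high, low, close, volume, openinterest
INTRADAY_FMT = 'IIffffII'  # date, time, then the same six fields

daily_size = struct.calcsize('<' + DAILY_FMT)
intraday_size = struct.calcsize('<' + INTRADAY_FMT)

print(daily_size, intraday_size)
```

Seven 4-byte fields give 28 bytes and eight give 32, matching barsize; dtsize is simply the number of leading integer fields that carry date and time.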
Start
The datafeed will be started when backtesting commences (it can actually be started several times during optimizations).
In the start method the binary file is opened unless a file-like object has been passed.
# Before starting, check whether dataname is already a file-like object;
# the file handle is kept in self.f
def start(self):
    # the feed must start ... get the file open (or see if it was open)
    self.f = None
    if hasattr(self.p.dataname, 'read'):
        # A file has been passed in (ex: from a GUI)
        self.f = self.p.dataname
    else:
        # Let an exception propagate
        self.f = open(self.p.dataname, 'rb')
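The hasattr check is plain duck typing and can be tried outside the feed. A minimal sketch, where open_source is a hypothetical helper and the opened_here flag is an addition of this sketch, not something the feed tracks:

```python
import io


def open_source(dataname):
    """Return (file_object, opened_here) using the same check as start()."""
    if hasattr(dataname, 'read'):
        # already a file-like object: use it as-is
        return dataname, False
    # otherwise treat it as a path; open() raises if the path is bad
    return open(dataname, 'rb'), True


buffer = io.BytesIO(b'\x00' * 28)   # stands in for an already-open file
f, opened_here = open_source(buffer)
print(f is buffer, opened_here)
```

The flag is there because the stop method in these notes closes self.f whether or not the feed opened it; a caller who wants to keep using a passed-in file object might prefer to track ownership this way.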
Stop
Called when backtesting is finished.
If a file was open, it will be closed.
def stop(self):
    # Close the file if any
    if self.f is not None:
        self.f.close()
        self.f = None
Actual Loading
The actual work is done in _load, which is called to load the next set of data, in this case the next: datetime, open, high, low, close, volume, openinterest. In backtrader the “actual” moment corresponds to index 0.
A number of bytes will be read from the open file (determined by the constants set up during __init__), parsed with the struct module, further processed if needed (like with divmod operations for date and time) and stored in the lines of the data feed: datetime, open, high, low, close, volume, openinterest.
If no data can be read from the file it is assumed that the End Of File (EOF) has been reached:
- False is returned to indicate the fact that no more data is available
Else if data has been loaded and parsed:
- True is returned to indicate that the loading of the data set was a success
def _load(self):
    # Without a file there is nothing to parse: return right away
    if self.f is None:
        # if no file ... no parsing
        return False

    # Read the needed amount of binary data (a fixed number of bytes)
    bardata = self.f.read(self.barsize)
    if not bardata:
        # if no data was read ... game over say "False"
        return False

    # use struct to unpack the data
    bdata = struct.unpack(self.barfmt, bardata)

    # Years are stored as if they had 500 days
    y, md = divmod(bdata[0], 500)
    # Months are stored as if they had 32 days
    m, d = divmod(md, 32)
    # put y, m, d in a datetime
    dt = datetime.datetime(y, m, d)

    if self.dtsize > 1:  # Minute Bars
        # Daily Time is stored in seconds
        hhmm, ss = divmod(bdata[1], 60)
        hh, mm = divmod(hhmm, 60)
        # add the time to the existing datetime
        dt = dt.replace(hour=hh, minute=mm, second=ss)

    self.lines.datetime[0] = date2num(dt)

    # Get the rest of the unpacked data
    o, h, l, c, v, oi = bdata[self.dtsize:]
    self.lines.open[0] = o
    self.lines.high[0] = h
    self.lines.low[0] = l
    self.lines.close[0] = c
    self.lines.volume[0] = v
    self.lines.openinterest[0] = oi

    # Say success
    return True
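The divmod arithmetic is easier to follow with one value worked through by hand. The sketch below repeats the decoding steps of _load as two stand-alone functions (decode_date and decode_time are hypothetical names) and applies them to integers encoded by hand under the same 500/32 convention:

```python
import datetime


def decode_date(value):
    # years are stored as if they had 500 days, months as if they had 32
    y, md = divmod(value, 500)
    m, d = divmod(md, 32)
    return datetime.datetime(y, m, d)


def decode_time(dt, seconds):
    # time of day is stored as seconds since midnight
    hhmm, ss = divmod(seconds, 60)
    hh, mm = divmod(hhmm, 60)
    return dt.replace(hour=hh, minute=mm, second=ss)


# 3 January 2006 encoded by hand: 2006 * 500 + 1 * 32 + 3 = 1003035
encoded_date = 2006 * 500 + 1 * 32 + 3
# 09:30:15 encoded by hand: 9 * 3600 + 30 * 60 + 15 = 34215
encoded_time = 9 * 3600 + 30 * 60 + 15

day = decode_date(encoded_date)
stamp = decode_time(day, encoded_time)
print(day.date(), stamp.time())
```

For the date, divmod(1003035, 500) gives (2006, 35) and divmod(35, 32) gives (1, 3). The scheme is unambiguous because the largest month-and-day value, 12 * 32 + 31 = 415, stays below 500.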
Other Binary Formats
The same model can be applied to any other binary source:
- Database
- Hierarchical data storage
- Online source
The steps again:
- __init__ -> Any init code for the instance, only once
- start -> start of backtesting (one or more times if optimization will be run). This would, for example, open the connection to the database or a socket to an online service
- stop -> clean-up like closing the database connection or open sockets
- _load -> query the database or online source for the next set of data and load it into the lines of the object. The standard fields being: datetime, open, high, low, close, volume, openinterest
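As an illustration of the same four steps with a database, here is a rough sketch built on the standard-library sqlite3 module. It is not a backtrader data feed: SqliteBarSource, the bars table, its column names and the two rows are all invented for the example, and a real feed would derive from DataBase and write into self.lines instead of self.bar.

```python
import datetime
import sqlite3


class SqliteBarSource:
    """Hypothetical source; table and column names are made up."""

    def __init__(self, conn_factory):
        # one-time setup: keep how to connect, do not connect yet
        self.conn_factory = conn_factory
        self.conn = None
        self.cursor = None
        self.bar = None

    def start(self):
        # open the connection and issue the query once per run
        self.conn = self.conn_factory()
        self.cursor = self.conn.execute(
            'SELECT dt, open, high, low, close, volume, openinterest '
            'FROM bars ORDER BY dt')

    def stop(self):
        # clean-up: close the connection if it was opened
        if self.conn is not None:
            self.conn.close()
            self.conn = None
            self.cursor = None

    def _load(self):
        if self.cursor is None:
            return False
        row = self.cursor.fetchone()
        if row is None:
            return False  # no more rows
        dt = datetime.datetime.strptime(row[0], '%Y-%m-%d')
        self.bar = (dt,) + tuple(row[1:])
        return True


def make_demo_db():
    # in-memory database with two invented bars, only for this sketch
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE bars (dt TEXT, open REAL, high REAL, low REAL, '
                 'close REAL, volume INTEGER, openinterest INTEGER)')
    conn.executemany('INSERT INTO bars VALUES (?, ?, ?, ?, ?, ?, ?)', [
        ('2006-01-03', 1.0, 2.0, 0.5, 1.5, 100, 0),
        ('2006-01-04', 1.5, 2.5, 1.0, 2.0, 200, 0),
    ])
    return conn


source = SqliteBarSource(make_demo_db)
source.start()
loaded = []
while source._load():
    loaded.append(source.bar)
source.stop()
print(len(loaded), loaded[0][0].date())
```

Passing a connection factory instead of an open connection keeps __init__ cheap and lets start open a fresh connection on every run, which fits the note that start may be called more than once.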
VChartData Test
The VChartData feed loads data from a local “.fd” file for Google for the year 2006.
It’s only about loading the data, so not even a subclass of Strategy is needed.
from __future__ import (absolute_import, division, print_function,
                        unicode_literals)

import datetime

import backtrader as bt
from vchart import VChartData


if __name__ == '__main__':

    # Create a cerebro entity
    cerebro = bt.Cerebro(stdstats=False)

    # Add a strategy
    cerebro.addstrategy(bt.Strategy)

    ###########################################################################
    # Note:
    # The goog.fd file belongs to VisualChart and cannot be distributed with
    # backtrader
    #
    # VisualChart can be downloaded from www.visualchart.com
    ###########################################################################
    # Create a Data Feed
    datapath = '../../datas/goog.fd'
    data = VChartData(
        dataname=datapath,
        fromdate=datetime.datetime(2006, 1, 1),
        todate=datetime.datetime(2006, 12, 31),
        timeframe=bt.TimeFrame.Days
    )

    # Add the Data Feed to Cerebro
    cerebro.adddata(data)

    # Run over everything
    cerebro.run()

    # Plot the result
    cerebro.plot(style='bar')
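Since goog.fd cannot be distributed, one way to exercise the parsing logic without it is to pack a few records in memory using the daily layout that the feed code assumes. This is only a sketch: the helper names and the prices are invented, and the buffer follows the layout read by _load, with no claim that it is a valid VisualChart file.

```python
import io
import struct

BARFMT = 'IffffII'                 # daily layout used by the feed code
BARSIZE = struct.calcsize(BARFMT)  # 28 on common platforms


def encode_date(y, m, d):
    # inverse of the divmod decoding: 500 "days" per year, 32 per month
    return y * 500 + m * 32 + d


def make_daily_buffer(bars):
    """Pack (y, m, d, o, h, l, c, v, oi) tuples into an in-memory file."""
    out = io.BytesIO()
    for y, m, d, o, h, l, c, v, oi in bars:
        out.write(struct.pack(BARFMT, encode_date(y, m, d), o, h, l, c, v, oi))
    out.seek(0)
    return out


def read_daily_buffer(f):
    """Read bars back the way _load does, one fixed-size record at a time."""
    bars = []
    while True:
        bardata = f.read(BARSIZE)
        if not bardata:
            break
        bdata = struct.unpack(BARFMT, bardata)
        y, md = divmod(bdata[0], 500)
        m, d = divmod(md, 32)
        bars.append((y, m, d) + bdata[1:])
    return bars


# invented prices, chosen to be exactly representable as 32-bit floats
sample = [(2006, 1, 3, 10.0, 10.5, 9.5, 10.25, 1000, 0),
          (2006, 1, 4, 10.25, 11.0, 10.0, 10.75, 1500, 0)]
buf = make_daily_buffer(sample)
roundtrip = read_daily_buffer(buf)
print(roundtrip == sample)
```

Because start accepts any object with a read method, a buffer like this could in principle be passed as dataname; that has not been verified against backtrader here.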
VChartData Full Code
from __future__ import (absolute_import, division, print_function,
                        unicode_literals)

import datetime
import struct

from backtrader.feed import DataBase
from backtrader import date2num
from backtrader import TimeFrame


class VChartData(DataBase):
    def __init__(self):
        super(VChartData, self).__init__()

        # Use the informative "timeframe" parameter to understand if the
        # code passed as "dataname" refers to an intraday or daily feed
        if self.p.timeframe >= TimeFrame.Days:
            self.barsize = 28
            self.dtsize = 1
            self.barfmt = 'IffffII'
        else:
            self.dtsize = 2
            self.barsize = 32
            self.barfmt = 'IIffffII'

    def start(self):
        # the feed must start ... get the file open (or see if it was open)
        self.f = None
        if hasattr(self.p.dataname, 'read'):
            # A file has been passed in (ex: from a GUI)
            self.f = self.p.dataname
        else:
            # Let an exception propagate
            self.f = open(self.p.dataname, 'rb')

    def stop(self):
        # Close the file if any
        if self.f is not None:
            self.f.close()
            self.f = None

    def _load(self):
        if self.f is None:
            # if no file ... no parsing
            return False

        # Read the needed amount of binary data
        bardata = self.f.read(self.barsize)
        if not bardata:
            # if no data was read ... game over say "False"
            return False

        # use struct to unpack the data
        bdata = struct.unpack(self.barfmt, bardata)

        # Years are stored as if they had 500 days
        y, md = divmod(bdata[0], 500)
        # Months are stored as if they had 32 days
        m, d = divmod(md, 32)
        # put y, m, d in a datetime
        dt = datetime.datetime(y, m, d)

        if self.dtsize > 1:  # Minute Bars
            # Daily Time is stored in seconds
            hhmm, ss = divmod(bdata[1], 60)
            hh, mm = divmod(hhmm, 60)
            # add the time to the existing datetime
            dt = dt.replace(hour=hh, minute=mm, second=ss)

        self.lines.datetime[0] = date2num(dt)

        # Get the rest of the unpacked data
        o, h, l, c, v, oi = bdata[self.dtsize:]
        self.lines.open[0] = o
        self.lines.high[0] = h
        self.lines.low[0] = l
        self.lines.close[0] = c
        self.lines.volume[0] = v
        self.lines.openinterest[0] = oi

        # Say success
        return True
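This article reads a binary file to handle a custom data requirement, by subclassing DataBase and overriding its methods.
The params defined in DataBase can still cover some of the up-front input requirements. _load has to deliver the data itself, one bar at a time.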