9.数据库存储(LEVELI&II)的难点:使用分区表加速查询
请关注个人小站:http://sqlhis.com/
先来看看错误的存储方法,好比行情表,拆分成了如图无数个小表,每天一个表
这严重的增加了编程复杂度,例如:
取000001.SZ(中国平安)最近一个月每天早上9:30 到10:00之间的最高价和最低价
如果在一张表里面,代码大概是这样的
SELECT BizDate,MAX(Price) AS MaxPrice,MIN(Price) AS MinPrice
FROM vw_TickData
where SecId='000001.SZ'
and BizDate between 20191201 and 20191231
and TradeTime between 093000000 and 100000000
GROUP BY BizDate
如果是每天一张表,编程复杂度就增加了不知道多少了。
方法1,建立一个视图把所有的表UNION ALL起来(建这个视图其实很麻烦,上千个表)
CREATE VIEW [dbo].[stock_tick]
AS
select * from stock_tick_2013_0101
UNION ALL
select * from stock_tick_2013_0102
.....
虽然从使用角度看,两者趋于一致了,但是使用UNION ALL将可能导致性能问题。相比分区表,如果where语句中有分区条件,只要扫描指定分区就可以了,但是使用这种UNION ALL方式将不可避免的扫描所的表。
当然这其中又有两种情况:
扫描表后很快发现大部分表没有符合要求的数据,这只略微影响性能
对其中每一个表都进行了全表扫描,性能影响极大。
方法2,构造动态SQL,指定需要扫描的表。
这里最大的难度就是编程复杂度太高,对用户非常不友好
所以:笔者认为,每天一个表这种分表方法对用户是及其不友好的
我们来探讨一下为什么会采用每天一个表的分表设计,原因可能有三种:
1.拍脑袋,拍脑袋的事就真没办法了,对数据库的理解是是而非。
2.数据库不支持分区表,这是有可能的,笔者工作过的一家公司,采购的SQL Server是标准版,是没有分区表功能的,为了达到类似的效果,被迫使用了UNION ALL模拟分区表
3.使用单表的时候,插入性能太低。然后发现每天一个表这种方式插入性能高,为了解决插入性能问题建成这样。
不管是哪种原因,只要数据库能支持分区表,就完全没有必要使用分表,分表其实就是模拟分区表,但是这样程序复杂度增加,性能变慢,不如直接使用分区表。可以认为这种行为是重复制造轮子。
下面说下分区表的大致方法
先建个分区函数
CREATE PARTITION FUNCTION pfn_BizDate(INT)
AS RANGE LEFT FOR
VALUES (20070100,20070200,20070300,20070400,20070500,20070600,20070700,20070800,20070900,20071000,20071100,20071200,
20080100,20080200,20080300,20080400,20080500,20080600,20080700,20080800,20080900,20081000,20081100,20081200,
20090100,20090200,20090300,20090400,20090500,20090600,20090700,20090800,20090900,20091000,20091100,20091200,
20100100,20100200,20100300,20100400,20100500,20100600,20100700,20100800,20100900,20101000,20101100,20101200,
20110100,20110200,20110300,20110400,20110500,20110600,20110700,20110800,20110900,20111000,20111100,20111200,
20120100,20120200,20120300,20120400,20120500,20120600,20120700,20120800,20120900,20121000,20121100,20121200,
20130100,20130200,20130300,20130400,20130500,20130600,20130700,20130800,20130900,20131000,20131100,20131200,
20140100,20140200,20140300,20140400,20140500,20140600,20140700,20140800,20140900,20141000,20141100,20141200,
20150100,20150200,20150300,20150400,20150500,20150600,20150700,20150800,20150900,20151000,20151100,20151200,
20160100,20160200,20160300,20160400,20160500,20160600,20160700,20160800,20160900,20161000,20161100,20161200,
20170100,20170200,20170300,20170400,20170500,20170600,20170700,20170800,20170900,20171000,20171100,20171200,
20180100,20180200,20180300,20180400,20180500,20180600,20180700,20180800,20180900,20181000,20181100,20181200,
20190100,20190200,20190300,20190400,20190500,20190600,20190700,20190800,20190900,20191000,20191100,20191200,
20200100,20200200,20200300,20200400,20200500,20200600,20200700,20200800,20200900,20201000,20201100,20201200,
20210100,20210200,20210300,20210400,20210500,20210600,20210700,20210800,20210900,20211000,20211100,20211200,
20220100,20220200,20220300,20220400,20220500,20220600,20220700,20220800,20220900,20221000,20221100,20221200,
20230100,20230200,20230300,20230400,20230500,20230600,20230700,20230800,20230900,20231000,20231100,20231200,
20240100,20240200,20240300,20240400,20240500,20240600,20240700,20240800,20240900,20241000,20241100,20241200,
20250100,20250200,20250300,20250400,20250500,20250600,20250700,20250800,20250900,20251000,20251100,20251200,
20260100,20260200,20260300,20260400,20260500,20260600,20260700,20260800,20260900,20261000,20261100,20261200
)
建立分区
CREATE PARTITION scheme psc_BizDate
AS PARTITION pfn_BizDate all TO ([PRIMARY])
GO
建立表
CREATE TABLE TickData(
SecId INT NOT NULL, --股票编码
BizDate INT NOT NULL, --交易日期
TradeTime INT NOT NULL, --交易时间
RN TINYINT NOT NULL, --相同时间排序
Price INT NULL, --成交价
Volume BIGINT NULL, --成交量
TurnOver BIGINT NULL, --成交额
MatchItems INT NULL, --成交笔数
Interest INT NULL, --持仓量(期货)、IOPV(基金)*10000、利息(债券)*10000
--BSFlag BIT NULL, --买卖标志 0='B' 1='S' NULL=无买卖方向[取消BSFlag由宏汇计算,TDF未提供且算法不清楚]
AccVolume BIGINT NULL, --当日累计成交量
AccTurover BIGINT NULL, --当日累计成交额
--十档买卖价位
AskPrice1 INT NULL, --叫卖价1
AskPrice2 INT NULL, --叫卖价2
AskPrice3 INT NULL, --叫卖价3
AskPrice4 INT NULL, --叫卖价4
AskPrice5 INT NULL, --叫卖价5
AskPrice6 INT NULL, --叫卖价6
AskPrice7 INT NULL, --叫卖价7
AskPrice8 INT NULL, --叫卖价8
AskPrice9 INT NULL, --叫卖价9
AskPrice10 INT NULL, --叫卖价10
AskVolume1 INT NULL, --叫卖量1
AskVolume2 INT NULL, --叫卖量2
AskVolume3 INT NULL, --叫卖量3
AskVolume4 INT NULL, --叫卖量4
AskVolume5 INT NULL, --叫卖量5
AskVolume6 INT NULL, --叫卖量6
AskVolume7 INT NULL, --叫卖量7
AskVolume8 INT NULL, --叫卖量8
AskVolume9 INT NULL, --叫卖量9
AskVolume10 INT NULL, --叫卖量10
BidPrice1 INT NULL, --叫买价1
BidPrice2 INT NULL, --叫买价2
BidPrice3 INT NULL, --叫买价3
BidPrice4 INT NULL, --叫买价4
BidPrice5 INT NULL, --叫买价5
BidPrice6 INT NULL, --叫买价6
BidPrice7 INT NULL, --叫买价7
BidPrice8 INT NULL, --叫买价8
BidPrice9 INT NULL, --叫买价9
BidPrice10 INT NULL, --叫买价10
BidVolume1 INT NULL, --叫买量1
BidVolume2 INT NULL, --叫买量2
BidVolume3 INT NULL, --叫买量3
BidVolume4 INT NULL, --叫买量4
BidVolume5 INT NULL, --叫买量5
BidVolume6 INT NULL, --叫买量6
BidVolume7 INT NULL, --叫买量7
BidVolume8 INT NULL, --叫买量8
BidVolume9 INT NULL, --叫买量9
BidVolume10 INT NULL, --叫买量10
AskAvPrice INT NULL, --加权平均叫卖价(上海L2)
BidAvPrice INT NULL, --加权平均叫买价(上海L2)
TotalAskVolume BIGINT NULL, --叫卖总量(上海L2)
TotalBidVolume BIGINT NULL, --叫买总量(上海L2)
CONSTRAINT PK_TickData PRIMARY KEY CLUSTERED
(SecId,BizDate,TradeTime,RN)
)ON psc_BizDate(BizDate)WITH(DATA_COMPRESSION = PAGE )
最后建立视图,视图是用来给用户访问的
CREATE VIEW [dbo].[vw_TickData]
AS
SELECT b.WindCode,
b.SecurityName,
b.SecuCode,
b.Market,
a.SecId,
a.BizDate,
a.TradeTime,
a.RN,
a.Price*1.0/10000 AS Price,
a.Volume,
a.TurnOver,
a.MatchItems,
a.Interest,
--a.BSFlag,
a.AccVolume,
a.AccTurover,
a.AskPrice1*1.0/10000 AS AskPrice1,
a.AskPrice2*1.0/10000 AS AskPrice2,
a.AskPrice3*1.0/10000 AS AskPrice3,
a.AskPrice4*1.0/10000 AS AskPrice4,
a.AskPrice5*1.0/10000 AS AskPrice5,
a.AskPrice6*1.0/10000 AS AskPrice6,
a.AskPrice7*1.0/10000 AS AskPrice7,
a.AskPrice8*1.0/10000 AS AskPrice8,
a.AskPrice9*1.0/10000 AS AskPrice9,
a.AskPrice10*1.0/10000 AS AskPrice10,
a.AskVolume1,
a.AskVolume2,
a.AskVolume3,
a.AskVolume4,
a.AskVolume5,
a.AskVolume6,
a.AskVolume7,
a.AskVolume8,
a.AskVolume9,
a.AskVolume10,
a.BidPrice1*1.0/10000 AS BidPrice1,
a.BidPrice2*1.0/10000 AS BidPrice2,
a.BidPrice3*1.0/10000 AS BidPrice3,
a.BidPrice4*1.0/10000 AS BidPrice4,
a.BidPrice5*1.0/10000 AS BidPrice5,
a.BidPrice6*1.0/10000 AS BidPrice6,
a.BidPrice7*1.0/10000 AS BidPrice7,
a.BidPrice8*1.0/10000 AS BidPrice8,
a.BidPrice9*1.0/10000 AS BidPrice9,
a.BidPrice10*1.0/10000 AS BidPrice10,
a.BidVolume1,
a.BidVolume2,
a.BidVolume3,
a.BidVolume4,
a.BidVolume5,
a.BidVolume6,
a.BidVolume7,
a.BidVolume8,
a.BidVolume9,
a.BidVolume10,
a.AskAvPrice*1.0/10000 AS AskAvPrice,
a.BidAvPrice*1.0/10000 AS BidAvPrice,
a.TotalAskVolume,
a.TotalBidVolume
FROM TickData a
LEFT OUTER JOIN SecurityId b
ON a.SecId=b.SecId
GO