timesacledb 测试demo数据运行
timesacledb 的安装还是使用docker,对于测试数据需要提前下载
启动timesacledb
使用支持gis 的镜像,后边需要使用
docker run -d --name timescaledb -p 5432:5432 timescale/timescaledb-postgis
预备环境
- 下载测试数据
https://timescaledata.blob.core.windows.net/datasets/nyc_data.tar.gz
- 创建数据库&&扩展加载timesacledb
CREATE DATABASE nyc_data;
\c nyc_data
CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;
加载数据
- 导入schema
psql -U postgres -d nyc_data -h localhost < nyc_data.sql
- 导入数据
psql -U postgres -d nyc_data -h localhost -c "\COPY rides FROM nyc_data_rides.csv CSV"
运行查询
- 基本查询
SELECT date_trunc('day', pickup_datetime) as day, avg(fare_amount) FROM rides WHERE passenger_count > 1 AND pickup_datetime < '2016-01-08' GROUP BY day ORDER BY day;
SELECT date_trunc('day', pickup_datetime) as day, COUNT(*) FROM rides GROUP BY day ORDER BY day LIMIT 5;
- 时序查询
SELECT time_bucket('5 minute', pickup_datetime) AS five_min, count(*) FROM rides WHERE pickup_datetime < '2016-01-01 02:00' GROUP BY five_min ORDER BY five_min;
系统数据
查询细节
- 通过explain 查看timesacledb 的工作
EXPLAIN SELECT * FROM rides;
gis 查询
- 让nyc_data 支持gis
CREATE EXTENSION postgis;
ALTER TABLE rides ADD COLUMN pickup_geom geometry(POINT,2163);
ALTER TABLE rides ADD COLUMN dropoff_geom geometry(POINT,2163);
- 生成geo 数据(有点慢,需要花点时间)
UPDATE rides SET pickup_geom = ST_Transform(ST_SetSRID(ST_MakePoint(pickup_longitude,pickup_latitude),4326),2163);
UPDATE rides SET dropoff_geom = ST_Transform(ST_SetSRID(ST_MakePoint(dropoff_longitude,dropoff_latitude),4326),2163);
- gis 查询
SELECT time_bucket('30 minutes', pickup_datetime) AS thirty_min, COUNT(*) AS near_times_sq
FROM rides
WHERE ST_Distance(pickup_geom, ST_Transform(ST_SetSRID(ST_MakePoint(-73.9851,40.7589),4326),2163)) < 400
AND pickup_datetime < '2016-01-01 14:00'
GROUP BY thirty_min ORDER BY thirty_min;
- 系统生成数据
说明
总的来说简单,同时具有时序数据库的特点,对于我们来说不需要关注数据多的时候性能的问题,还是很不错的,对于ha 以及集群功能还有待
研究
参考资料
https://docs.timescale.com/v0.9/tutorials/tutorial-hello-nyc
https://docs.timescale.com/v0.9/getting-started/installation/mac/installation-homebrew