使用ClickHouse表函数将MySQL数据导入到ClickHouse
#clickhouse-client :create database dw; :use dw; --导入数据: CREATE TABLE Orders ENGINE = MergeTree ORDER BY OrderID AS SELECT * FROM mysql('10.42.134.136:4000', 'dw', 'Orders', 'root', 'xyz'); CREATE TABLE Orders ENGINE = MergeTree ORDER BY OrderID AS SELECT * FROM mysql('10.42.134.136:4000', 'dw', 'FactSaleOrders', 'root', 'xyz') Ok. 0 rows in set. Elapsed: 1832.114 sec. Processed 85.71 million rows, 65.27 GB (46.78 thousand rows/s., 35.62 MB/s.) 解释:导入数据8571万行,耗时1832.114,总大小65.27GB 最后一次每秒4.678万行导入,每秒导入35.62MB的数据。 前提条件: MySQL的主键必须为not null 10.42.134.136:4000 为MySQL的IP地址和端口号 dw 为MySQL的库,作为数据源的 Orders 为MySQL的表,源表 root为MySQL的账号 xyz为MySQL的账号对应的密码 账号需要具备远程连接访问的权限。 注意: 1.order by 后的字段大小写要和MySQL的大小写一样 2.主键列在MySQL中必须定义为NOT NULL 报错:DB::Exception: Sorting key cannot contain nullable columns. 3.导入速度和mysql的列有关,若列较少导入的更快,和硬件的读取写入速度,网络带宽也有影响。 4.支持从TiDB和MySQL直接读取数据。
注意ClickHouse创建的表映射,经过操作发现对于MySQL的Deciaml类型的不能映射到ClickHouse中,ClickHouse映射的类型是String类型了
可以使用一下方法快速创建映射表并导入数据,还有有其他方法的大神留言
第一步创建Mysql到CLickhouse的映射表插入1条数据,这里可以随表
CREATE TABLE Orders ENGINE = MergeTree ORDER BY OrderID AS SELECT * FROM mysql('10.42.134.136:4000', 'dw', 'Orders', 'root', 'xyz') limit 1;
第二步查询创建的表结构复制下来
show create table Orders;
第三部删除表
drop table Orders;
第四部将第二步复制的表结构里的Decimal字段类型进行变更然后执行即可
第五部插入数据
insert into Orders SELECT * FROM mysql('10.42.134.136:4000', 'dw', 'Orders', 'root', 'xyz');
字段映射表
ClickHouse | MySQL | Comment |
UInt8, UInt16, UInt32, UInt64 | TINYINT UNSIGNED, SMALLINT UNSIGNED, INT UNSIGNED, BIGINT UNSIGNED | |
Int8, Int16, Int32, Int64 | TINYINT SIGNED, SMALLINT SIGNED, INT SIGNED, BIGINT SIGNED | |
Float32, Float64 | FLOAT, DOUBLE | Supports inf, -inf, nan, recommended NOT to use by ClickHouse! |
String | BLOB, TEXT, VARCHAR, VARBINARY | No encoding. Recommended to use UTF-8. In fact behaves like a BLOB. |
FixedString(n) | CHAR, BINARY | \0 padded. Less functions available than String, in fact it behaves like BINARY. |
Date | DATE | UNIX epoch date up to 2038. |
DateTime | DATETIME, TIMESTAMP | UNIX epoch timestamp up to 2038. |
Enum | ENUM | Similar to MySQL ENUM. Behaves like Int8/16. |
Array(type) | n.a. | Array of type. Closest equivalent in MySQL is JSON? Not well supported. |
Tuple() | n.a. | |
Nested() | n.a. | Closest equivalent in MySQL is JSON? |
AggregateFunction() | n.a. | |
Set | n.a. | |
Expression | n.a. |
突破昨天的自己