【BIGDATA】CSV文件导入到ClickHouse
最近遇到给ClickHouse灌注测试数据的场景。方法多种多样,此处提供一种思路。
文件准备
[root@VM-16-48-centos ~]# cat a.csv 2,liudehua 2,chenguanxi 3,bh8ank 4,bh8ank 5,bh8ank 6,bh8ank 7,bh8ank 8,bh8ank [root@VM-16-48-centos ~]#
现有csv文件,内容格式如上。
库表准备
192.168.16.16 :) create database DDD on cluster default_cluster; CREATE DATABASE DDD ON CLUSTER default_cluster ┌─host──────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐ │ 192.168.16.15 │ 9000 │ 0 │ │ 1 │ 0 │ │ 192.168.16.16 │ 9000 │ 0 │ │ 0 │ 0 │ └───────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘ 2 rows in set. Elapsed: 0.112 sec. 192.168.16.16 :) CREATE TABLE DDD.test (id int,name varchar(255)) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/test', '{replica}')ORDER BY(id); CREATE TABLE DDD.test ( `id` int, `name` varchar(255) ) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/test', '{replica}') ORDER BY id Ok. 0 rows in set. Elapsed: 0.031 sec. 192.168.16.16 :) 192.168.16.16 :)
将CSV文件插入
[root@VM-16-48-centos ~]# cat a.csv | clickhouse-client --host=172.16.0.72 --port=9000 --user=yourusername --password=yourpassword --query="INSERT INTO DDD.test FORMAT CSV" [root@VM-16-48-centos ~]#
数据检查
192.168.16.15 :) 192.168.16.15 :) 192.168.16.15 :) 192.168.16.15 :) 192.168.16.15 :) select * from DDD.test; SELECT * FROM DDD.test ┌─id─┬─name───────┐ │ 2 │ liudehua │ │ 2 │ chenguanxi │ │ 3 │ bh8ank │ │ 4 │ bh8ank │ │ 5 │ bh8ank │ │ 6 │ bh8ank │ │ 7 │ bh8ank │ │ 8 │ bh8ank │ └────┴────────────┘ 8 rows in set. Elapsed: 0.006 sec. 192.168.16.15 :)