|NO.Z.00007|——————————|BigDataEnd|——|Hadoop&OLAP_Druid.V07|——|Druid.v07|Getting Started|Loading Data from HDFS.V1|
1. Loading Data from HDFS
### --- Create the data directory in HDFS
~~~ # Inspect the sample data file prepared in HDFS
[root@hadoop02 ~]# hdfs dfs -cat /data/druidlog.dat
{"ts":"2021-10-01T00:01:35Z","srcip":"6.6.6.6", "dstip":"8.8.8.8", "srcport":6666, "dstPort":8888, "protocol": "tcp", "packets":1, "bytes":1000, "cost": 0.1}
{"ts":"2021-10-01T00:01:36Z","srcip":"6.6.6.6", "dstip":"8.8.8.8", "srcport":6666, "dstPort":8888, "protocol": "tcp", "packets":2, "bytes":2000, "cost": 0.1}
{"ts":"2021-10-01T00:01:37Z","srcip":"6.6.6.6", "dstip":"8.8.8.8", "srcport":6666, "dstPort":8888, "protocol": "tcp", "packets":3, "bytes":3000, "cost": 0.1}
{"ts":"2021-10-01T00:01:38Z","srcip":"6.6.6.6", "dstip":"8.8.8.8", "srcport":6666, "dstPort":8888, "protocol": "tcp", "packets":4, "bytes":4000, "cost": 0.1}
{"ts":"2021-10-01T00:02:08Z","srcip":"1.1.1.1", "dstip":"2.2.2.2", "srcport":6666, "dstPort":8888, "protocol": "udp", "packets":5, "bytes":5000, "cost": 0.2}
{"ts":"2021-10-01T00:02:09Z","srcip":"1.1.1.1", "dstip":"2.2.2.2", "srcport":6666, "dstPort":8888, "protocol": "udp", "packets":6, "bytes":6000, "cost": 0.2}
{"ts":"2021-10-01T00:02:10Z","srcip":"1.1.1.1", "dstip":"2.2.2.2", "srcport":6666, "dstPort":8888, "protocol": "udp", "packets":7, "bytes":7000, "cost": 0.2}
{"ts":"2021-10-01T00:02:11Z","srcip":"1.1.1.1", "dstip":"2.2.2.2", "srcport":6666, "dstPort":8888, "protocol": "udp", "packets":8, "bytes":8000, "cost": 0.2}
{"ts":"2021-10-01T00:02:12Z","srcip":"1.1.1.1", "dstip":"2.2.2.2", "srcport":6666, "dstPort":8888, "protocol": "udp", "packets":9, "bytes":9000, "cost": 0.2}
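For local testing, the nine sample records above can be regenerated with a short script before staging them into HDFS. This is a sketch, not part of the original walkthrough; the path `/data/druidlog.dat` comes from the listing above.

```python
import json

# Nine sample flow records matching the listing above:
# 4 tcp rows from 6.6.6.6 and 5 udp rows from 1.1.1.1.
records = [
    {"ts": f"2021-10-01T00:01:{35 + i}Z", "srcip": "6.6.6.6", "dstip": "8.8.8.8",
     "srcport": 6666, "dstPort": 8888, "protocol": "tcp",
     "packets": i + 1, "bytes": (i + 1) * 1000, "cost": 0.1}
    for i in range(4)
] + [
    {"ts": f"2021-10-01T00:02:{8 + i:02d}Z", "srcip": "1.1.1.1", "dstip": "2.2.2.2",
     "srcport": 6666, "dstPort": 8888, "protocol": "udp",
     "packets": i + 5, "bytes": (i + 5) * 1000, "cost": 0.2}
    for i in range(5)
]

# Druid's "json" inputFormat expects newline-delimited JSON: one object per line.
with open("druidlog.dat", "w") as f:
    for r in records:
        f.write(json.dumps(r) + "\n")

# Stage the file into HDFS (run on a cluster node):
#   hdfs dfs -mkdir -p /data
#   hdfs dfs -put druidlog.dat /data/druidlog.dat
```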
### --- Define the ingestion spec
~~~ Input: HDFS file; data format: json; timestamp column: ts
~~~ No rollup: keep all detail rows
~~~ Segment granularity: Day
~~~ DataSource name: yanqitable2
### --- Query the data
~~~ # Query the data
select * from "yanqitable2"
select protocol, count(*) as rowcount, sum(bytes) as bytes, sum(packets) as packets, max(cost) as maxcost from "yanqitable2" group by protocol
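The expected result of that GROUP BY can be checked offline against the nine sample rows. This sketch re-implements the aggregation in plain Python; the figures follow from the sample data above, not from a live Druid query.

```python
from collections import defaultdict

# The nine sample rows, reduced to the fields the query touches.
rows = (
    [{"protocol": "tcp", "packets": p, "bytes": p * 1000, "cost": 0.1} for p in range(1, 5)] +
    [{"protocol": "udp", "packets": p, "bytes": p * 1000, "cost": 0.2} for p in range(5, 10)]
)

# Emulate: select protocol, count(*), sum(bytes), sum(packets), max(cost)
#          from "yanqitable2" group by protocol
stats = defaultdict(lambda: {"rowcount": 0, "bytes": 0, "packets": 0, "maxcost": 0.0})
for r in rows:
    s = stats[r["protocol"]]
    s["rowcount"] += 1
    s["bytes"] += r["bytes"]
    s["packets"] += r["packets"]
    s["maxcost"] = max(s["maxcost"], r["cost"])

for proto, s in sorted(stats.items()):
    print(proto, s)
# tcp: 4 rows, 10000 bytes, 10 packets, maxcost 0.1
# udp: 5 rows, 35000 bytes, 35 packets, maxcost 0.2
```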
2. Ingesting Data from HDFS




3. Checking the Ingestion Task
### --- Check the ingestion task
### --- Check the datasource
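Besides the web console, the same checks can be made against Druid's HTTP API. A sketch, assuming the Router listens on the default port 8888 on hadoop02 (adjust host and port for your cluster):

```shell
# List ingestion tasks known to the Overlord (id, status, duration, datasource).
curl -s http://hadoop02:8888/druid/indexer/v1/tasks

# List the datasources the Coordinator is serving; yanqitable2 should
# appear here once its segments are loaded.
curl -s http://hadoop02:8888/druid/coordinator/v1/datasources
```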


4. Querying the Data
### --- Query the data
~~~ Query the data
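Besides the console's Query tab, Druid SQL can also be issued over HTTP against the `/druid/v2/sql` endpoint. A sketch, again assuming the Router on hadoop02:8888:

```shell
# POST a Druid SQL statement; the result comes back as a JSON array of rows.
curl -s -X POST http://hadoop02:8888/druid/v2/sql \
  -H 'Content-Type: application/json' \
  -d '{"query": "select protocol, count(*) as rowcount, sum(bytes) as bytes from \"yanqitable2\" group by protocol"}'
```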


5. The Generated JSON Spec
{
  "type": "index_parallel",
  "spec": {
    "ioConfig": {
      "type": "index_parallel",
      "inputSource": {
        "type": "hdfs",
        "paths": "/data/druidlog.dat"
      },
      "inputFormat": {
        "type": "json"
      }
    },
    "tuningConfig": {
      "type": "index_parallel",
      "partitionsSpec": {
        "type": "dynamic"
      }
    },
    "dataSchema": {
      "timestampSpec": {
        "column": "ts",
        "format": "iso"
      },
      "dimensionsSpec": {
        "dimensions": [
          {
            "type": "long",
            "name": "bytes"
          },
          {
            "type": "double",
            "name": "cost"
          },
          "dstip",
          {
            "type": "long",
            "name": "dstPort"
          },
          {
            "type": "long",
            "name": "packets"
          },
          "protocol",
          "srcip",
          {
            "type": "long",
            "name": "srcport"
          }
        ]
      },
      "granularitySpec": {
        "queryGranularity": "minute",
        "rollup": false,
        "segmentGranularity": "day"
      },
      "dataSource": "yanqitable2"
    }
  }
}
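Before submitting, the spec above can be sanity-checked programmatically. This sketch rebuilds it as a Python dict, writes it to a file, and notes the (assumed default) Router endpoint for task submission; the file name hdfs-ingest.json is illustrative.

```python
import json

# The ingestion spec from section 5, rebuilt programmatically.
spec = {
    "type": "index_parallel",
    "spec": {
        "ioConfig": {
            "type": "index_parallel",
            "inputSource": {"type": "hdfs", "paths": "/data/druidlog.dat"},
            "inputFormat": {"type": "json"},
        },
        "tuningConfig": {
            "type": "index_parallel",
            "partitionsSpec": {"type": "dynamic"},
        },
        "dataSchema": {
            "timestampSpec": {"column": "ts", "format": "iso"},
            "dimensionsSpec": {"dimensions": [
                {"type": "long", "name": "bytes"},
                {"type": "double", "name": "cost"},
                "dstip",
                {"type": "long", "name": "dstPort"},
                {"type": "long", "name": "packets"},
                "protocol",
                "srcip",
                {"type": "long", "name": "srcport"},
            ]},
            "granularitySpec": {
                "queryGranularity": "minute",
                "rollup": False,
                "segmentGranularity": "day",
            },
            "dataSource": "yanqitable2",
        },
    },
}

with open("hdfs-ingest.json", "w") as f:
    json.dump(spec, f, indent=2)

# Submit to the Overlord through the Router (default port assumed):
#   curl -X POST http://hadoop02:8888/druid/indexer/v1/task \
#        -H 'Content-Type: application/json' -d @hdfs-ingest.json
```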
Walter Savage Landor: "I strove with none, for none was worth my strife. Nature I loved and, next to Nature, Art: I warm'd both hands before the fire of life; It sinks, and I am ready to depart."
——W.S. Landor
Category: bdv024-druid