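1-4 Installing Logstash
Download:
https://www.elastic.co/cn/downloads/logstash
Huawei Cloud mirror: https://mirrors.huaweicloud.com/
Download the smallest MovieLens test dataset: https://grouplens.org/datasets/movielens/
Create the configuration file logstash.conf:
# Edit the logstash.conf file in the movielens directory
# Change path to the actual location of your movies.csv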
input {
  file {
    path => "/usr/local/logstash-7.5.1/testfile/ml-latest-small/movies.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null" # do not persist the read position, so the file is reloaded on every run
  }
}
filter {
  csv {
    separator => ","
    columns => ["id","content","genre"]
  }
  mutate {
    split => { "genre" => "|" }
    remove_field => ["path", "host","@timestamp","message"]
  }
  mutate {
    split => ["content", "("]
    add_field => { "title" => "%{[content][0]}"}
    add_field => { "year" => "%{[content][1]}"}
  }
  mutate {
    convert => {
      "year" => "integer"
    }
    strip => ["title"]
    remove_field => ["path", "host","@timestamp","message","content"]
  }
}
output {
  elasticsearch {
    hosts => "10.5.250.168:9200"
    index => "movies"
    document_id => "%{id}"
  }
  stdout {}
}
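To make the effect of the filter section easier to see, here is a rough Python sketch of what the csv and mutate steps do to a single line of movies.csv. It is only an approximation for illustration, not Logstash itself: the function name `transform` is hypothetical, the integer conversion is assumed to keep only the leading digits (so "1995)" becomes 1995), and fields that Logstash adds on its own, such as @version, are left out.

```python
import csv
import re


def transform(line: str) -> dict:
    """Approximate the csv + mutate filters for one line of movies.csv."""
    # csv filter: separator "," with columns id, content, genre.
    # csv.reader also handles quoted titles that contain commas.
    movie_id, content, genre = next(csv.reader([line]))

    # mutate 1: split genre on "|"
    genres = genre.split("|")

    # mutate 2: split content on "(" and copy the first two pieces
    parts = content.split("(")
    title = parts[0]
    # Rows without "(" are outside the scope of this sketch; year falls back to 0.
    year_raw = parts[1] if len(parts) > 1 else ""

    # mutate 3: convert year to integer (assumed: leading digits only), strip title
    match = re.match(r"\s*(\d+)", year_raw)
    year = int(match.group(1)) if match else 0

    return {"id": movie_id, "genre": genres, "title": title.strip(), "year": year}
```

Calling `transform("1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy")` gives a dictionary with id "1", title "Toy Story", year 1995 and a list of five genres, which is roughly the shape of the document that ends up in the movies index. The config uses three separate mutate blocks because, as far as I know, the operations inside one mutate block run in a fixed internal order rather than in the order they are written, so the split has to finish in an earlier block before the convert and strip are applied.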
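Start:
# Start the Elasticsearch instance first, then start Logstash with the configuration file specified to import the data
bin/logstash -f /usr/local/logstash-7.5.1/config/logstash.conf
The screenshot shows that startup succeeded.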
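Besides watching the Logstash console output, one way to confirm that the import worked is to ask Elasticsearch how many documents the movies index holds. The following is a minimal sketch using only the Python standard library. It assumes the Elasticsearch host from the config above (10.5.250.168:9200) is reachable over plain HTTP without authentication; the names `ES_URL`, `parse_count` and `count_documents` are made up for this example.

```python
import json
from urllib.request import urlopen

# Assumption: same host as in logstash.conf, no TLS and no authentication.
ES_URL = "http://10.5.250.168:9200"


def parse_count(body: str) -> int:
    """Extract the document count from a _count API response body."""
    return int(json.loads(body)["count"])


def count_documents(index: str, base_url: str = ES_URL, timeout: float = 5.0) -> int:
    """Call GET <index>/_count and return the number of documents."""
    with urlopen(f"{base_url}/{index}/_count", timeout=timeout) as resp:
        return parse_count(resp.read().decode("utf-8"))
```

Running `count_documents("movies")` after Logstash has finished reading the file should return a non-zero number. If it raises an HTTP 404 error, the index has probably not been created yet.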
# View information about the index
GET kibana_sample_data_ecommerce

# View the total number of documents in the index
GET kibana_sample_data_ecommerce/_count

# View the first 10 documents to understand the document format
POST kibana_sample_data_ecommerce/_search
{
}

# _cat indices API
# List indices
GET /_cat/indices/kibana*?v&s=index

# List indices whose health is green
GET /_cat/indices?v&health=green

# Sort by document count
GET /_cat/indices?v&s=docs.count:desc

# Show specific columns
GET /_cat/indices/kibana*?pri&v&h=health,index,pri,rep,docs.count,mt

# How much memory is used per index?
GET /_cat/indices?v&h=i,tm&s=tm:desc
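The _cat/indices requests above are meant for the Kibana console, but the same requests can be sent from a script. Below is a small sketch, again assuming an unsecured Elasticsearch at 10.5.250.168:9200 as in the Logstash config. The helpers `cat_indices_path` and `cat_indices` are hypothetical names, and the bare `v` flag is sent as `v=true`, which should be equivalent.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: same host as in logstash.conf, no TLS and no authentication.
ES_URL = "http://10.5.250.168:9200"


def cat_indices_path(pattern: str = "", **params: str) -> str:
    """Build a _cat/indices path such as /_cat/indices/kibana*?v=true&s=index."""
    path = "/_cat/indices" + (f"/{pattern}" if pattern else "")
    # Keep ":" "," and "*" readable instead of percent-encoding them.
    query = urlencode({"v": "true", **params}, safe=":,*")
    return f"{path}?{query}"


def cat_indices(pattern: str = "", base_url: str = ES_URL, **params: str) -> str:
    """Send the request and return the plain-text table that _cat produces."""
    with urlopen(base_url + cat_indices_path(pattern, **params), timeout=5.0) as resp:
        return resp.read().decode("utf-8")
```

For example, `print(cat_indices(s="docs.count:desc"))` corresponds to the "sort by document count" request, and `print(cat_indices("movies"))` shows only the index created by the Logstash import.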