Elasticsearch
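I. Introduction to Elasticsearch
Elasticsearch is a real-time, distributed search and analytics engine. It lets you explore your data at a speed and scale that were not possible before. It is used for full-text search, structured search, analytics, and any combination of the three.
Elasticsearch is an open-source search engine built on Apache Lucene. Lucene is arguably the most advanced, best-performing, and most fully featured search engine library available today, open source or proprietary. But Lucene is only a library. To use it, you must work in Java and integrate it directly into your application. Worse, Lucene is very complex: you need a solid grounding in information retrieval to understand how it works.
Elasticsearch is also written in Java and uses Lucene internally for all of its indexing and searching, but it aims to make full-text search easy by hiding Lucene's complexity behind a simple RESTful API.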
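II. Elasticsearch Use Cases
MySQL can also search, for example with a LIKE pattern, but a pattern with a leading wildcard (such as LIKE '%keyword%') cannot use an ordinary index and forces a full table scan.
Elasticsearch is well suited to full-text search.
Elasticsearch can flexibly store different types of data.
Typical use cases:
Product search for an online store
Searching the reviews of all products
Highlighting the search terms in results
Collecting and displaying logs of all kinds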
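III. Elasticsearch Data Format
Elasticsearch uses JavaScript Object Notation, or JSON, as the serialization format for documents. JSON serialization is supported by most programming languages and has become the standard format in the NoSQL world. It is simple, concise, and easy to read.
Consider this JSON document, which represents a user object: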
{
    "email": "john@smith.com",
    "first_name": "John",
    "last_name": "Smith",
    "info": {
        "bio": "Eco-warrior and defender of the weak",
        "age": 25,
        "interests": [ "dolphins", "whales" ]
    },
    "join_date": "2014/05/01"
}
IV. Deploying ES
1. Install Java
yum install -y java-1.8.0-openjdk.x86_64
If ES fails to start and reports a Java version error, uninstall Java and install it again as follows:
cd /usr/local    # the JAVA_HOME path below assumes the JDK is unpacked here
wget --no-check-certificate --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.tar.gz
tar -xf jdk-8u131-linux-x64.tar.gz
mv jdk1.8.0_131/ jdk
ln -s /usr/local/jdk/bin/java /usr/local/bin/
vim /etc/profile
# Append the following lines:
export JAVA_HOME=/usr/local/jdk
export JRE_HOME=/usr/local/jdk/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
[root@master01 ~]# source /etc/profile
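Before starting ES it can help to confirm which Java version is actually on the PATH. The sketch below is a minimal example, not part of the original procedure: `java_major_version` and `installed_java_version` are hypothetical helper names, and the check assumes that Elasticsearch 6.x needs Java 8 or newer.

```shell
# Print the major Java version for a version string such as "1.8.0_131" or "11.0.2".
java_major_version() {
    version="$1"
    first="${version%%.*}"
    if [ "$first" = "1" ]; then
        # Legacy scheme "1.<major>.<minor>_<update>", e.g. 1.8.0_131 is Java 8
        rest="${version#*.}"
        echo "${rest%%.*}"
    else
        # Newer scheme "<major>.<minor>.<patch>", e.g. 11.0.2 is Java 11
        echo "$first"
    fi
}

# Read the quoted version string from the first line of `java -version`
# (java prints it on stderr, so stderr is redirected into the pipe).
installed_java_version() {
    java -version 2>&1 | sed -n '1s/.*version "\([^"]*\)".*/\1/p'
}

# Usage on a host with Java installed:
#   major=$(java_major_version "$(installed_java_version)")
#   [ "$major" -ge 8 ] || echo "Java $major is too old for Elasticsearch 6.x"
```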
2. Download and install the software
mkdir -p /data/es_soft/
cd /data/es_soft/
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.6.0.rpm
rpm -ivh elasticsearch-6.6.0.rpm

Enable and start the service
systemctl daemon-reload
systemctl enable elasticsearch.service
systemctl start elasticsearch.service
systemctl status elasticsearch.service

Check that it started successfully
ps -ef|grep elastic
lsof -i:9200

rpm -ql elasticsearch    # list every file and directory installed by the package
rpm -qc elasticsearch    # list all configuration files of the package
/etc/elasticsearch/elasticsearch.yml    # main configuration file
/etc/elasticsearch/jvm.options    # JVM configuration file
/etc/init.d/elasticsearch    # init script
/etc/sysconfig/elasticsearch    # environment variable file
/usr/lib/sysctl.d/elasticsearch.conf    # sysctl settings file (sets vm.max_map_count)
/usr/lib/systemd/system/elasticsearch.service    # systemd unit file
/var/lib/elasticsearch    # data directory
/var/log/elasticsearch    # log directory
/var/run/elasticsearch    # PID directory

Elasticsearch ships with good defaults, especially for performance-related settings. Other databases may need tuning, but by and large Elasticsearch does not. If you run into performance problems, the fix is usually a better data layout or more nodes.

[root@master01 ~]# egrep -v "^#" /etc/elasticsearch/elasticsearch.yml
#cluster.name: Linux    # cluster name
node.name: node-1    # node name
path.data: /var/lib/elasticsearch    # data directory
path.logs: /var/log/elasticsearch    # log directory
bootstrap.memory_lock: true    # lock the process memory
network.host: 0.0.0.0    # IP address to bind to
http.port: 9200    # HTTP port
#discovery.zen.ping.unicast.hosts: ["10.192.27.100", "10.192.27.114"]    # nodes to contact for cluster discovery
#discovery.zen.minimum_master_nodes: 2    # minimum number of master-eligible nodes
[root@master01 ~]# vim /etc/elasticsearch/jvm.options
After changing the configuration files, restart the service:
[root@master01 ~]# systemctl restart elasticsearch
[root@master01 ~]# systemctl status elasticsearch
● elasticsearch.service - Elasticsearch
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/elasticsearch.service.d
           └─override.conf
   Active: active (running) since Fri 2019-12-06 15:24:17 CST; 14s ago
     Docs: http://www.elastic.co
 Main PID: 148849 (java)
   CGroup: /system.slice/elasticsearch.service
           ├─148849 /usr/local/bin/java -Xms16g -Xmx16g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Des.networkaddress.cache.ttl=60 ...
           └─149095 /usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/controller
Dec 06 15:24:17 master01 systemd[1]: Started Elasticsearch.
The service may fail to start at this point. If the log shows that memory locking failed, follow the official fix:
https://www.elastic.co/guide/en/elasticsearch/reference/6.6/setup-configuration-memory.html
https://www.elastic.co/guide/en/elasticsearch/reference/6.6/setting-system-settings.html#sysconfig
### Edit the unit file, or create a drop-in override file
Option 1: systemctl edit elasticsearch
Option 2: vim /usr/lib/systemd/system/elasticsearch.service
### Add the following setting
[Service]
LimitMEMLOCK=infinity
### Reload systemd and restart
systemctl daemon-reload
systemctl restart elasticsearch
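For unattended setups, the same drop-in file that `systemctl edit elasticsearch` creates can be written by a script. This is a minimal sketch under stated assumptions: `write_memlock_override` is a hypothetical helper name, and the target directory is the drop-in path that appears in the `systemctl status` output above.

```shell
# Write a systemd drop-in that lifts the memlock limit for the service.
# The directory is passed in so the function can be tried on a scratch path first.
write_memlock_override() {
    dropin_dir="$1"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
LimitMEMLOCK=infinity
EOF
}

# Usage on a real host (as root):
#   write_memlock_override /etc/systemd/system/elasticsearch.service.d
#   systemctl daemon-reload && systemctl restart elasticsearch
# Then check whether the lock took effect:
#   curl -s 'http://localhost:9200/_nodes?filter_path=**.mlockall&pretty'
```

Using a drop-in rather than editing /usr/lib/systemd/system/elasticsearch.service directly means the change is not overwritten when the package is upgraded.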
V. Cluster Deployment
1. master01 (10.192.27.100): update the configuration and restart
[root@master01 ~]# egrep -v "^#" /etc/elasticsearch/elasticsearch.yml
cluster.name: Linux
node.name: node-1
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["10.192.27.100", "10.192.27.114"]    # listing any one node that is already in the cluster is enough for discovery
discovery.zen.minimum_master_nodes: 2    # (total number of master-eligible nodes / 2) + 1; there are three nodes here
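The value of discovery.zen.minimum_master_nodes follows from the formula in the comment above, using integer division. A minimal sketch (the function name `minimum_master_nodes` is chosen here for illustration):

```shell
# Quorum of master-eligible nodes: (eligible / 2) + 1, with integer division.
minimum_master_nodes() {
    eligible="$1"
    echo $(( eligible / 2 + 1 ))
}

# For the three-node cluster in this walkthrough:
#   minimum_master_nodes 3
```

Note that a cluster with only two master-eligible nodes also needs a quorum of 2, so it cannot elect a master once either node is down. That is one reason to run three nodes.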
2. master02 (10.192.27.114): same installation steps as above
[root@master02 ~]# egrep -v "^#" /etc/elasticsearch/elasticsearch.yml
cluster.name: Linux
node.name: node-2
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["10.192.27.100", "10.192.27.114"]
discovery.zen.minimum_master_nodes: 2
3. master03 (10.192.27.111): same installation steps as above
[root@master03 ~]# egrep -v "^#" /etc/elasticsearch/elasticsearch.yml
cluster.name: Linux
node.name: node-3
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["10.192.27.100", "10.192.27.111"]
discovery.zen.minimum_master_nodes: 2
Check the logs. In the excerpt below, the node found only itself as a master candidate while 2 were required, so it kept pinging:
[root@master01 ~]# tailf /var/log/elasticsearch/elasticsearch.log
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
[2019-12-05T14:50:12,868][INFO ][o.e.n.Node ] [Linux] stopping ...
[2019-12-05T14:50:12,897][WARN ][o.e.d.z.ZenDiscovery ] [Linux] not enough master nodes discovered during pinging (found [[Candidate{node={Linux}{kBD_JYsbTryUFJyc3mnKyw}{zkjgB4R2Q3-ySJI_3Hxo4A}{10.192.27.100}{10.192.27.100:9300}{ml.machine_memory=33440313344, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2019-12-05T14:50:12,898][INFO ][o.e.x.w.WatcherService ] [Linux] stopping watch service, reason [shutdown initiated]
[2019-12-05T14:50:13,169][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [Linux] [controller/8102] [Main.cc@148] Ml controller exiting
[2019-12-05T14:50:13,170][INFO ][o.e.x.m.p.NativeController] [Linux] Native controller process has stopped - no new native processes can be started
[2019-12-05T14:50:13,179][INFO ][o.e.n.Node ] [Linux] stopped
[2019-12-05T14:50:13,179][INFO ][o.e.n.Node ] [Linux] closing ...
[2019-12-05T14:50:13,191][INFO ][o.e.n.Node ] [Linux] closed
4. Interacting with the cluster
Insert three documents with curl. By default a new index gets 5 primary shards and 1 replica.
curl -XPUT '10.192.27.100:9200/linux/user/1?pretty' -H 'Content-Type: application/json' -d'
{
    "first_name" : "John",
    "last_name": "Smith",
    "age" : 25,
    "about" : "I love to go rock climbing",
    "interests": [ "sports", "music" ]
}
'
curl -XPUT 'localhost:9200/linux/user/2?pretty' -H 'Content-Type: application/json' -d'
{
    "first_name": "Jane",
    "last_name" : "Smith",
    "age" : 32,
    "about" : "I like to collect rock albums",
    "interests": [ "music" ]
}'
curl -XPUT 'localhost:9200/linux/user/3?pretty' -H 'Content-Type: application/json' -d'
{
    "first_name": "Douglas",
    "last_name" : "Fir",
    "age" : 35,
    "about": "I like to build cabinets",
    "interests": [ "forestry" ]
}'
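Once the documents are indexed, a full-text query can be sent to the same index. The sketch below is illustrative only: `build_match_query` and `search_users` are hypothetical helper names, and the body builder assumes the field name and search text contain no double quotes or backslashes, since it does no JSON escaping.

```shell
# Build the JSON body of a match query for one field.
# Assumption: neither argument contains characters that need JSON escaping.
build_match_query() {
    printf '{ "query": { "match": { "%s": "%s" } } }\n' "$1" "$2"
}

# Send the query to the linux/user documents inserted above.
# Assumes a node is listening on localhost:9200.
search_users() {
    curl -s -XGET 'http://localhost:9200/linux/user/_search?pretty' \
        -H 'Content-Type: application/json' \
        -d "$(build_match_query "$1" "$2")"
}

# Usage:
#   search_users about "rock climbing"
```

With the three sample documents, a match query on about for "rock climbing" would be expected to return documents 1 and 2, because a match query returns documents containing any of the analyzed terms and ranks the closer match higher.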
Here the es-head plugin is installed offline in Google Chrome.
Connect it to any node in the cluster to view the data.
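5. Monitoring cluster status
Cluster status colors:
Green: all conditions are met; the data is complete and all replicas are allocated
Yellow: the data is complete, but not all replicas are allocated
Red: the data in at least one index is incomplete
Purple: some shards are still being synchronized (this is not a status returned by the cluster health API, which reports only green, yellow, or red)
What to monitor: raise an alert when (1) the cluster health status is not green, or (2) the number of nodes in the cluster is not 3.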
Query a single node
[root@master01 ~]# curl -s -XGET http://10.192.27.100:9200
{
  "name" : "node-1",
  "cluster_name" : "Linux",
  "cluster_uuid" : "XRGGo2PORO2Rh3X149ZaGw",
  "version" : {
    "number" : "6.6.0",
    "build_flavor" : "default",
    "build_type" : "rpm",
    "build_hash" : "a9861f4",
    "build_date" : "2019-01-24T11:27:09.439740Z",
    "build_snapshot" : false,
    "lucene_version" : "7.6.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
[root@master01 ~]# echo $?
0
Query cluster health. Official reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-health.html
[root@master01 ~]# curl -XGET 'http://localhost:9200/_cluster/health?pretty'
{
  "cluster_name" : "Linux",    # cluster name
  "status" : "green",    # status
  "timed_out" : false,
  "number_of_nodes" : 3,    # number of nodes
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 3,    # number of primary shards
  "active_shards" : 6,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
# The status field is the one we care about most.
# green: all primary and replica shards are active.
# yellow: all primary shards are active, but not all replica shards are.
# red: at least one primary shard is not active.
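The two monitoring rules (status not green, node count not 3) can be checked against the health response with a small script. This is a minimal sketch, not a full monitoring setup: the function names are hypothetical, and the sed patterns assume the field layout shown in the health output above rather than parsing JSON properly.

```shell
# Extract the "status" value from cluster health JSON on stdin.
health_status() {
    sed -n 's/.*"status" *: *"\([a-z]*\)".*/\1/p'
}

# Extract the "number_of_nodes" value from cluster health JSON on stdin.
health_node_count() {
    sed -n 's/.*"number_of_nodes" *: *\([0-9][0-9]*\).*/\1/p'
}

# Print OK or an ALERT line; return non-zero when an alert is raised.
check_cluster() {
    health_json="$1"
    expected_nodes="$2"
    status=$(printf '%s\n' "$health_json" | health_status)
    nodes=$(printf '%s\n' "$health_json" | health_node_count)
    if [ "$status" != "green" ]; then
        echo "ALERT: cluster status is ${status:-unknown}"
        return 1
    fi
    if [ "$nodes" != "$expected_nodes" ]; then
        echo "ALERT: expected $expected_nodes nodes, found ${nodes:-unknown}"
        return 1
    fi
    echo "OK: status green, $nodes nodes"
}

# Usage against a live node:
#   check_cluster "$(curl -s 'http://localhost:9200/_cluster/health?pretty')" 3
```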
View cluster statistics
Official reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-stats.html
The Cluster Stats API retrieves statistics from a cluster-wide perspective.
Command:
[root@master01 ~]# curl -XGET 'http://localhost:9200/_cluster/stats?human&pretty'

View the cluster settings
Official reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-get-settings.html
Command:
curl -XGET 'http://localhost:9200/_cluster/settings?include_defaults=true&human&pretty'

Query node information
Official reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-info.html
Commands:
curl -XGET 'http://localhost:9200/_nodes/process?human&pretty'
curl -XGET 'http://localhost:9200/_nodes/_all/info/jvm,process?human&pretty'

[root@master02 ~]# curl -XGET 'http://localhost:9200/_cat/nodes?human&pretty'
10.192.27.111 17 67 0 0.00 0.01 0.05 mdi - node-3
10.192.27.114 14 74 0 0.02 0.03 0.05 mdi * node-2
10.192.27.100 16 99 0 0.13 0.16 0.10 mdi - node-1
[root@master02 ~]# curl -XGET 'http://localhost:9200/_cat/nodes?human&pretty' | wc -l
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   150  100   150    0     0   7943      0 --:--:-- --:--:-- --:--:--  8333
3
When the output is piped, pass -s so that curl does not print its progress meter:
[root@master02 ~]# curl -s -XGET 'http://localhost:9200/_cat/nodes?human&pretty' | wc -l
3
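The _cat/nodes output can also feed the node-count check and show which node is the elected master. The sketch below assumes the default column layout seen in the output above, where the master marker (`*`) is the 9th column and the node name the 10th. `count_nodes` and `master_node` are hypothetical helper names.

```shell
# Count the nodes listed on stdin (one non-empty line per node).
count_nodes() {
    grep -c .
}

# Print the name of the node marked with "*" in the master column.
# Assumes the default _cat/nodes columns: the marker is field 9, the name field 10.
master_node() {
    awk '$9 == "*" { print $10 }'
}

# Usage against a live node:
#   curl -s 'http://localhost:9200/_cat/nodes' | count_nodes
#   curl -s 'http://localhost:9200/_cat/nodes' | master_node
```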