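ES Project in Practice
Prerequisites
ES: Java
Stack: Spark/Flink + Spring Boot + ES
Languages: Scala/Java + Java/Scala + Java
==> Learn how to use ES through its APIs (the Java API and its use from Spring Boot)
ES interfaces: API, RESTful
Elasticsearch + Kibana: storage + visualization/analysis
ES plugins: Head, SQL, Kibana (three plugins; Kibana is counted as a plugin here)
Ultimately the data should be queryable through SQL (for ease of use)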
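Installing ES
Download: https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.6.2.tar.gz
- Extract: tar -zxvf elasticsearch-6.6.2.tar.gz -C ~/app/
- Add to the system environment variables:
- Clean up: cd elasticsearch-6.6.2, then delete the scripts in the bin directory that end in .bat (Windows-only, not needed here)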
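$ES_HOME (important files)
bin
elasticsearch: starts ES
elasticsearch: starts in the foreground
elasticsearch -d: starts in the background (open the log files to see the output)
elasticsearch-plugin: manages ES plugins
elasticsearch-sql-cli: SQL client
config
elasticsearch.yml: ES configuration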
#cluster.name: my-application (cluster name)
#node.name: node-1 (node name)
#path.data: /path/to/data
# Path to log files:
path.logs: /path/to/logs
#network.host: 0.0.0.0 (binds to all network interfaces)
# Set a custom port for HTTP:
http.port: 9200
jvm.options: JVM-related configuration for ES
## the heap to 4 GB, set:
##
## -Xms4g
## -Xmx4g
# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space
-Xms1g
-Xmx1g
## GC configuration
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
# explicitly set the stack size
-Xss1m
# specify an alternative path for heap dumps; ensure the directory exists and
# has sufficient space
-XX:HeapDumpPath=data
# specify an alternative path for JVM fatal error logs
-XX:ErrorFile=logs/hs_err_pid%p.log
# ----------------------------------- Memory -----------------------------------
elasticsearch-migrate x-pack-env
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:logs/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m
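The listing above sets -Xms and -Xmx to the same value (1g), matching the 4 GB example in the file's own comments. As a minimal sketch, the check below parses two such flags and compares them. The class name `HeapFlags` and both method names are hypothetical, and the parser only handles the plain k/m/g suffixes, not every form the JVM accepts.

```java
final class HeapFlags {

    /** Converts a flag such as "-Xms1g" or "-Xmx512m" into a byte count. */
    static long parseHeapBytes(String flag) {
        boolean known = flag.startsWith("-Xms") || flag.startsWith("-Xmx");
        if (!known || flag.length() <= 4) {
            throw new IllegalArgumentException("not a heap flag: " + flag);
        }
        String value = flag.substring(4); // drop the "-Xms" / "-Xmx" prefix
        char last = Character.toLowerCase(value.charAt(value.length() - 1));
        long multiplier;
        switch (last) {
            case 'k': multiplier = 1024L; break;
            case 'm': multiplier = 1024L * 1024; break;
            case 'g': multiplier = 1024L * 1024 * 1024; break;
            default:  multiplier = 1L; // no suffix: the value is in bytes
        }
        String digits = multiplier == 1L ? value : value.substring(0, value.length() - 1);
        return Long.parseLong(digits) * multiplier;
    }

    /** True when the initial and maximum heap sizes are identical. */
    static boolean sameHeapSize(String xms, String xmx) {
        return parseHeapBytes(xms) == parseHeapBytes(xmx);
    }

    public static void main(String[] args) {
        // Values taken from the jvm.options listing in these notes.
        System.out.println("Xms bytes: " + parseHeapBytes("-Xms1g"));
        System.out.println("same size: " + sameHeapSize("-Xms1g", "-Xmx1g"));
    }
}
```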
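***Note: ES is very demanding on disk (typically SSDs, plus plenty of memory)***
Start ES in the background: elasticsearch -d
Check in a browser: hadoop000:9200 (HTTP/REST port; the response is JSON describing the node)
hadoop000:9300 is the transport port (node-to-node communication and the Java TransportClient)
lucene_version: "7.6.0" (the Lucene version number differs from the ES version number)
Tip: use a Chrome extension that pretty-prints JSON (JSON Formatter)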
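The browser check can also be scripted. The following is a minimal sketch, assuming ES answers over plain HTTP on port 9200 as described in these notes. The class name `EsInfoCheck`, its helper methods, and the abbreviated sample response used when no host is given are illustrative assumptions, and the regex lookup is a stand-in for a real JSON parser.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EsInfoCheck {

    /** Builds the root URL of the HTTP port, e.g. http://hadoop000:9200/ */
    static String rootUrl(String host, int port) {
        return "http://" + host + ":" + port + "/";
    }

    /** Sends a GET request and returns the response body as UTF-8 text. */
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(3000);
        conn.setReadTimeout(3000);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    /** Returns the first string value stored under the given key, or null. */
    static String extractField(String json, String field) {
        Pattern p = Pattern.compile("\"" + Pattern.quote(field) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher m = p.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) throws IOException {
        // With a host argument (e.g. hadoop000) the live node is queried;
        // otherwise an abbreviated, illustrative sample body is used.
        String sample = "{\"version\":{\"number\":\"6.6.2\",\"lucene_version\":\"7.6.0\"}}";
        String body = args.length > 0 ? fetch(rootUrl(args[0], 9200)) : sample;
        System.out.println("number = " + extractField(body, "number"));
        System.out.println("lucene_version = " + extractField(body, "lucene_version"));
    }
}
```

Running it with `hadoop000` as the only argument queries the node from these notes; running it without arguments only exercises the parsing.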
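ES Core Concepts
Cluster
Node
Index: comparable to a Database
Type: comparable to a Table
Document: comparable to a Row
Field: comparable to a Column // the REST APIs for these four (Index, Type, Document, Field) are the ones used most at work
shard (a partition of an index's data)
replica (a copy of a shard)
Correspondence with relational concepts: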
Index -> Type -> Document -> Field
Database -> Table -> Row -> Column
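In the ES 6.x REST API this hierarchy shows up directly in the URL: a single document is addressed as /{index}/{type}/{id}. The sketch below builds such a path. The class name `EsPaths` and the sample names `blog`, `article`, and `1` are hypothetical, and URL encoding of the segments is left out for brevity.

```java
final class EsPaths {

    /** Rejects empty segments and segments that would break the path. */
    static String requireSegment(String segment) {
        if (segment == null || segment.isEmpty() || segment.contains("/")) {
            throw new IllegalArgumentException("invalid path segment: " + segment);
        }
        return segment;
    }

    /** Builds /{index}/{type}/{id}, the REST path of one document. */
    static String documentPath(String index, String type, String id) {
        return "/" + requireSegment(index)
             + "/" + requireSegment(type)
             + "/" + requireSegment(id);
    }

    public static void main(String[] args) {
        // Relational analogy: database "blog", table "article", row with key "1".
        System.out.println(documentPath("blog", "article", "1"));
    }
}
```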
Code Development
Requirements: IDEA + Maven + Java
pom.xml
<!-- Elasticsearch dependency -->
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>transport</artifactId>
    <version>6.6.2</version>
</dependency>
<!-- JUnit dependency (already present by default) -->
<dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.11</version>
    <!-- The <scope>test</scope> element was removed here; re-import the Maven dependencies afterwards -->
</dependency>
PreBuiltTransportClient.java
/**
 * Creates a new transport client with pre-installed plugins.
 *
 * @param settings the settings passed to this transport client
 * @param plugins an optional array of additional plugins to run with this client
 */
@SafeVarargs
public PreBuiltTransportClient(Settings settings, Class<? extends Plugin>... plugins) {
    this(settings, Arrays.asList(plugins));
}
// Class<? extends Plugin>...
// The ... marks a varargs parameter: it accepts zero or more arguments, so it can be left out entirely
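The varargs behavior can be tried out without ES on the classpath. The sketch below uses a hypothetical `VarargsDemo.collect` method whose parameter has the same shape as the constructor's `plugins` parameter, with `Number` standing in for `Plugin`; it is not part of the ES API.

```java
import java.util.Arrays;
import java.util.List;

final class VarargsDemo {

    /**
     * Accepts zero or more class tokens. @SafeVarargs suppresses the
     * unchecked warning caused by the generic element type of the array.
     */
    @SafeVarargs
    static List<Class<? extends Number>> collect(Class<? extends Number>... types) {
        // Inside the method, the varargs parameter is an ordinary array.
        return Arrays.asList(types);
    }

    public static void main(String[] args) {
        System.out.println(collect().size());                         // no arguments
        System.out.println(collect(Integer.class).size());            // one argument
        System.out.println(collect(Integer.class, Long.class).size()); // two arguments
    }
}
```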
TransportClient.java
/**
 * Adds a transport address that will be used to connect to.
 * <p>
 * The Node this transport address represents will be used if its possible to connect to it.
 * If it is unavailable, it will be automatically connected to once it is up.
 * <p>
 * In order to get the list of all the current connected nodes, please see {@link #connectedNodes()}.
 */
public TransportClient addTransportAddress(TransportAddress transportAddress) {
    nodesService.addTransportAddresses(transportAddress);
    return this;
}
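Because `addTransportAddress` returns `this`, calls to it can be chained to register several nodes in one expression. The sketch below mirrors that shape with JDK classes only, so it runs without the ES dependency. `AddressRegistry` and its methods are hypothetical stand-ins rather than ES classes, and `hadoop001` is an assumed second host that does not appear in these notes.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class AddressRegistry {

    private final List<InetSocketAddress> addresses = new ArrayList<>();

    /** Records one host:port pair and returns this registry for chaining. */
    AddressRegistry add(String host, int port) {
        // createUnresolved avoids a DNS lookup, so the sketch works offline.
        addresses.add(InetSocketAddress.createUnresolved(host, port));
        return this;
    }

    /** Read-only view of the addresses registered so far. */
    List<InetSocketAddress> addresses() {
        return Collections.unmodifiableList(addresses);
    }

    public static void main(String[] args) {
        // 9300 is the transport port noted earlier; hadoop001 is assumed.
        AddressRegistry registry = new AddressRegistry()
                .add("hadoop000", 9300)
                .add("hadoop001", 9300);
        System.out.println(registry.addresses().size() + " addresses registered");
    }
}
```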