Must-Know Issues When Installing and Using ElasticSearch

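There are plenty of introductory elasticSearch tutorials online. Here I only cover the problems you are bound to hit on Windows, in the hope that newcomers to es can save some time.

I am writing this up because two of these problems stump most people who are just starting out with elasticSearch. Linux users report similar problems online, so this may serve as a reference for them too.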

Item 1: Read before installing

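Before installing, note that JDK 9 does not get along with elasticSearch-rtf (it is too new), so use JDK 8. I used JDK 8u152 (jdk-8u152-windows-x64.exe). Running elasticSearch-rtf (v5.1.1) on JDK 9 produces the error shown below, so take particular care.
With elasticSearch 6.0, on the other hand, the msi downloaded from the official site only installed successfully for me with JDK 9. I have not yet looked into why.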
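
Since the JDK version decides whether startup works, it can help to check it before launching. The sketch below is only an illustration: the function names are made up here, and the rule "5.1.1 wants JDK 8" is simply the observation from this post, not an official compatibility matrix.

```python
import re


def java_major(version):
    """Map a java.version string such as '1.8.0_152' or '9.0.1' to its major number."""
    parts = version.split(".")
    if parts[0] == "1" and len(parts) > 1:
        # Legacy scheme used up to Java 8: 1.8.0_152 means Java 8.
        return int(parts[1])
    # Scheme used from Java 9 on: 9.0.1 means Java 9 (also tolerates '9-ea').
    return int(re.match(r"\d+", parts[0]).group())


def ok_for_es_rtf_511(version):
    """Assumption taken from this post: elasticSearch-rtf 5.1.1 starts on JDK 8, not on JDK 9."""
    return java_major(version) == 8


print(ok_for_es_rtf_511("1.8.0_152"))
print(ok_for_es_rtf_511("9.0.1"))
```

The version string itself is what `java -version` reports; how you obtain it in a script is left out here.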

The problem that appears when running elasticSearch-rtf on JDK 9:

$> elasticsearch

Java HotSpot(TM) 64-Bit Server VM warning: Option UseParNewGC was deprecated in version 9.0 and will likely be removed in a future release.
Java HotSpot(TM) 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.elasticsearch.bootstrap.Bootstrap.main(Bootstrap.java:190)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:32)
Caused by: java.lang.UnsupportedOperationException: Boot class path mechanism is not supported
at java.management/sun.management.RuntimeImpl.getBootClassPath(RuntimeImpl.java:99)
at org.elasticsearch.monitor.jvm.JvmInfo.<clinit>(JvmInfo.java:77)
...

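The two GC lines are only deprecation warnings. Startup actually fails on the UnsupportedOperationException: JvmInfo calls getBootClassPath, which JDK 9 no longer supports. JDK 9 did also change how GC options are handled, as the following passage describes:

Deprecated GC options have been removed (JEP 214). Some GC options and option combinations were already deprecated in JDK 8 (JEP 173). They are no longer recognized and cause the JVM to abort at startup. The options to watch for are: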
-XX:-UseParNewGC -XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-Xincgc
-XX:+CMSIncrementalMode -XX:+UseConcMarkSweepGC
-XX:+CMSIncrementalMode -XX:+UseConcMarkSweepGC -XX:-UseParNewGC
-XX:+UseCMSCompactAtFullCollection
-XX:+CMSFullGCsBeforeCompaction
-XX:+UseCMSCollectionPassing
In JDK 9 the incremental mode of concurrent-mark-sweep (iCMS) has been removed, and the plan at that time was to remove CMS entirely in JDK 10...

Item 2: Read before using

Second caveat: countless short tutorials online use plain commands such as this one,

$ curl -XPUT http://localhost:9200/test?pretty  -d '{  "settings": {  "number_of_shards" : 2, "number_of_replicas" : 0  } }'

On Windows, however, this always fails with errors like the following,

{
  "error" : "ElasticsearchParseException[failed to parse source for create index]; nested: JsonParseException[Unexpected end-of-input: expected close marker for OBJECT (from [Source: [B@401789d1; line: 1, column: 0])\n at [Source: [B@401789d1; line: 1, column: 3]]; ",
  "status" : 400
}
curl: (6) Could not resolve host: settings
curl: (3) [globbing] unmatched brace in column 1
curl: (6) Could not resolve host: number_of_shards
curl: (3) Bad URL, colon is first character
curl: (6) Could not resolve host: 2,
curl: (6) Could not resolve host: number_of_replicas
curl: (3) Bad URL, colon is first character
curl: (6) Could not resolve host: 0
curl: (3) [globbing] unmatched close brace/bracket in column 1
curl: (3) [globbing] unmatched close brace/bracket in column 1
...

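As shown above, the cause is how Windows handles quotes. cmd does not treat single quotes as quoting characters at all, so the body is split at every space, and the double quotes inside the body are consumed as argument delimiters instead of being passed through. The correct form replaces the outer single quotes with double quotes and each inner double quote with an escaped one (\"):

$ curl -XPUT http://localhost:9200/test?pretty  -d "{  \"settings\": {  \"number_of_shards\" : 2, \"number_of_replicas\" : 0  } }"

If you don't mind the clutter, writing three consecutive double quotes (""") in place of each inner double quote also works. Some readers say they hit similar problems on Linux, where the parameters inside the JSON are sometimes not recognized and need escaping as well; I have not tried that myself.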
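
Escaping by hand gets tedious for longer bodies, so here is a minimal sketch that produces the escaped -d argument from a Python dict. The function name cmd_quote_json is made up for this example. It only handles double quotes and the backslashes directly in front of them; cmd metacharacters such as % or & inside the JSON are not covered.

```python
import json
import re


def cmd_quote_json(payload):
    """Return payload as one double-quoted argument for the Windows command line."""
    text = json.dumps(payload)
    # Backslashes directly before a quote are doubled, then the quote itself is escaped.
    escaped = re.sub(r'(\\*)"', lambda m: m.group(1) * 2 + '\\"', text)
    return '"' + escaped + '"'


settings = {"settings": {"number_of_shards": 2, "number_of_replicas": 0}}
print("curl -XPUT http://localhost:9200/test?pretty -d " + cmd_quote_json(settings))
```

The printed line can be pasted into cmd as is. Putting the body in a file and passing it with -d @file avoids the escaping altogether.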

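Item 3: Concepts

To understand the basics you need to know what the terms index, type, document and field actually mean. See the table below.

Traditional relational databases (e.g. MySQL) compared with Elasticsearch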

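Relational DB    Elasticsearch    Meaning
Databases        Indices          An index (noun) corresponds to a database
Tables           Types            A type corresponds to a table name
Rows             Documents        A document corresponds to a row of data
Columns          Fields           A field corresponds to a column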
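
To make the mapping concrete: a document URL path such as music/songs/1 names an index, a type and a document id, in that order. The small sketch below splits such a path; DocAddress and parse_doc_path are names invented for this illustration.

```python
from collections import namedtuple

# index ~ database, type ~ table, id ~ primary key of one row (document)
DocAddress = namedtuple("DocAddress", ["index", "type", "id"])


def parse_doc_path(path):
    """Split '<index>/<type>/<id>' into its three parts."""
    parts = path.strip("/").split("/")
    if len(parts) != 3:
        raise ValueError("expected <index>/<type>/<id>, got %r" % path)
    return DocAddress(*parts)


address = parse_doc_path("/music/songs/1")
print(address.index, address.type, address.id)
```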

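Item 4: The main text

ElasticSearch getting-started tutorial

ElasticSearch exposes its RESTful API over HTTP on TCP port 9200, and any handy client tool can talk to it, curl being the most popular. The general form of a curl request to ElasticSearch is shown below.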

curl -X<VERB> '<PROTOCOL>://<HOST>:<PORT>/<PATH>?<QUERY_STRING>' -d '<BODY>'

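The parameters are as follows,

VERB: the HTTP request method; the common ones are GET, POST, PUT, HEAD and DELETE;
PROTOCOL: the protocol, http or https;
HOST: the hostname of any node in the ES cluster;
PORT: the port the ES service listens on, 9200 by default;
PATH: the API path, for example an index, type and document id such as music/songs/1;
QUERY_STRING: query parameters, for example ?pretty prints the JSON output in a readable layout;
BODY: the request body in JSON format;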

For example, to check whether ElasticSearch is working properly, use the command below

curl 'http://localhost:9200/?pretty'
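
The same request format can be assembled in a script, which sidesteps shell quoting completely. This is a sketch using only the Python standard library; build_request is a helper name made up here, and its parameters simply mirror the placeholders of the curl format.

```python
from urllib.parse import urlunsplit
from urllib.request import Request


def build_request(verb, protocol, host, port, path, query_string="", body=None):
    """Assemble <VERB> <PROTOCOL>://<HOST>:<PORT>/<PATH>?<QUERY_STRING> with an optional body."""
    netloc = "%s:%d" % (host, port)
    url = urlunsplit((protocol, netloc, "/" + path.lstrip("/"), query_string, ""))
    data = body.encode("utf-8") if body is not None else None
    return Request(url, data=data, method=verb)


# Equivalent of the health check: GET on the root path with ?pretty
req = build_request("GET", "http", "localhost", 9200, "", "pretty")
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would actually send it; that step is left out so the sketch runs without a live ES node.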

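I consulted many online resources along the way and cannot credit them all. The address below is the source of the curl examples used here and is worth a look. Keep in mind, though, that the commands given there cannot be run directly on Windows, and some of my results differ considerably from those in the original post.

(refer to https://www.cnblogs.com/austinspark-jessylu/p/6797060.html)

Creating an index: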

curl -XPUT "http://localhost:9200/music?pretty"

Nothing unusual about this one. It returns

{
  "acknowledged" : true
}

Next, add some data.

Note that the following form fails, for the reason given in Item 2.

curl -XPUT "http://localhost:9200/music/songs/1" -d '
{ "name": "Deck the Halls", "year": 1885, "lyrics": "Fa la la la la" }'

It returns:

{"error":"MapperParsingException[failed to parse]; nested: ElasticsearchParseException[Failed to derive xcontent from (offset=0, length=1): [39]]; ","status":400}curl: (3) [globbing] unmatched brace in column 1
curl: (6) Could not resolve host: name
curl: (6) Could not resolve host: Deck the Halls,
curl: (6) Could not resolve host: year
curl: (6) Could not resolve host: 1885,
curl: (6) Could not resolve host: lyrics
curl: (6) Could not resolve host: Fa la la la la
curl: (3) [globbing] unmatched close brace/bracket in column 1

The form that works is this one

$ curl -XPUT "http://localhost:9200/music/songs/1" -d "{ \"name\": \"Deck the Halls\", \"year\": 1885, \"lyrics\": \"Fa la la la la\" }"
which returns,
{"_index":"music","_type":"songs","_id":"1","_version":1,"created":true}
$ curl -XGET "http://localhost:9200/music/songs/1"
{"_index":"music","_type":"songs","_id":"1","_version":1,"found":true, "_source" : { "name": "Deck the Halls", "year": 1885, "lyrics": "Fa la la la la" }}

Doesn't it look nicer this way?

$ curl -XGET "http://localhost:9200/music/songs/1?pretty"
which returns,
{
  "_index" : "music",
  "_type" : "songs",
  "_id" : "1",
  "_version" : 1,
  "found" : true, "_source" : { "name": "Deck the Halls", "year": 1885, "lyrics": "Fa la la la la" }
}

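You may have noticed that a URL such as http://localhost:9200/music/songs/1?pretty is recognized correctly with or without the surrounding double quotes. That holds as long as the URL contains no characters that cmd treats specially, such as &, < or >.

Viewing a document

To view the document, a simple GET command is enough. The working form is,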

$ curl -XGET "http://localhost:9200/music/songs/1"
which returns
{"_index":"music","_type":"songs","_id":"1","_version":1,"found":true, "_source" : { "name": "Deck the Halls", "year": 1885, "lyrics": "Fa la la la la" }}

Again, here is what ?pretty does to the formatting,

$ curl -XGET "http://localhost:9200/music/songs/1?pretty"
which returns
{
  "_index" : "music",
  "_type" : "songs",
  "_id" : "1",
  "_version" : 1,
  "found" : true, "_source" : { "name": "Deck the Halls", "year": 1885, "lyrics": "Fa la la la la" }
}

Updating a document

Command:

curl -XPUT "http://localhost:9200/music/lyrics/1" -d '{ "name": "Deck the Halls", "year": 1886, "lyrics": "Fa la la la la" }'

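In actual use the double quotes must of course be escaped, otherwise Windows is guaranteed to report an error. Note also that this command writes to the type lyrics rather than songs, so the first run creates a new document music/lyrics/1 instead of updating music/songs/1.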

curl -XPUT http://localhost:9200/music/lyrics/1 -d "{ \"name\": \"Deck the Halls\", \"year\": 1886, \"lyrics\": \"Fa la la la la\" }"

Then change the year from 1886 to 1887

curl -XPUT http://localhost:9200/music/lyrics/1 -d "{ \"name\": \"Deck the Halls\", \"year\": 1887, \"lyrics\": \"Fa la la la la\" }"

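Deleting a document (but don't delete it just yet)

Try the command below. Remember that a request body must always be escaped as described earlier; for simple cases like this one I won't repeat the point from here on.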

curl -XDELETE "http://localhost:9200/music/lyrics/1"

Inserting a document from a file

Command:

curl -XPUT http://localhost:9200/music/lyrics/2 -d @caseyjones.json

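This adds a document for the traditional song "Ballad of Casey Jones". Copy Listing 1 into a file named caseyjones.json and put the file wherever is convenient for running the cURL command against it. I run from D:\cmder, so the file is at D:\cmder\caseyjones.json.

Listing 1. JSON document for "Ballad of Casey Jones" (caseyjones.json)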
{
  "artist": "Wallace Saunders",
  "year": 1909,
  "styles": ["traditional"],
  "album": "Unknown",
  "name": "Ballad of Casey Jones",
  "lyrics": "Come all you rounders if you want to hear
The story of a brave engineer
Casey Jones was the rounder's name....
Come all you rounders if you want to hear
The story of a brave engineer
Casey Jones was the rounder's name
On the six-eight wheeler, boys, he won his fame
The caller called Casey at half past four
He kissed his wife at the station door
He mounted to the cabin with the orders in his hand
And he took his farewell trip to that promis'd land

Chorus:
Casey Jones--mounted to his cabin
Casey Jones--with his orders in his hand
Casey Jones--mounted to his cabin
And he took his... land"
}
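
One thing to be aware of with a listing like this: the lyrics value contains raw line breaks, which a strict JSON parser rejects. The sketch below shows the difference using Python's json module as a stand-in parser; ES's own parser may behave differently, and the shortened two-line lyrics are just sample data. Encoding the document with json.dumps writes each line break as the two characters \n, which is valid JSON.

```python
import json

# A raw line break inside a JSON string, as in the listing (shortened sample data).
raw = '{"lyrics": "Walkin\' boss\nI don\'t belong to you"}'
try:
    json.loads(raw)
    strict_ok = True
except json.JSONDecodeError:
    strict_ok = False

# Encoding from a dict escapes the line break instead of emitting it literally.
doc = {"name": "Walking Boss", "lyrics": "Walkin' boss\nI don't belong to you"}
encoded = json.dumps(doc)

print(strict_ok)
print(encoded)
```

A file produced this way keeps its line breaks in the stored document, whereas raw line breaks depend on the client and server being lenient.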

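The run and its result are shown below. The lyrics string in the file spans several lines; when curl reads a body with -d @file it strips the line breaks, so the request arrives as a single line of JSON.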

d:\cmder
λ> curl -XPUT http://localhost:9200/music/lyrics/2 -d @caseyjones.json
{"_index":"music","_type":"lyrics","_id":"2","_version":1,"created":true}
d:\cmder

Listing 2. JSON for "Walking Boss" (walking.json)
{
  "artist": "Clarence Ashley",
  "year": 1920,
  "name": "Walking Boss",
  "styles": ["folk","protest"],
  "album": "Traditional",
  "lyrics": "Walkin' boss
Walkin' boss
Walkin' boss
I don't belong to you

I belong
I belong
I belong
To that steel driving crew

Well you work one day
Work one day
Work one day
Then go lay around the shanty two"
}

Push this document into the index:

curl -XPUT "http://localhost:9200/music/lyrics/3" -d @walking.json

The run and its result are as follows

d:\cmder
λ curl -XPUT http://localhost:9200/music/lyrics/3 -d @walking.json
{"_index":"music","_type":"lyrics","_id":"3","_version":1,"created":true}
d:\cmder
λ

After running all of this, let's take a screenshot to see where things stand,

[screenshot]

Searching with the REST API

The URL has a built-in _search endpoint for this purpose. To find all songs whose lyrics contain the word you:

curl -XGET "http://localhost:9200/music/lyrics/_search?q=lyrics:'you'"

The q parameter denotes a query.

The run and its result are as follows,

d:\cmder
λ curl -XGET "http://localhost:9200/music/lyrics/_search?q=lyrics:'you'"
{"took":21,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}
d:\cmder
λ

Using other comparison operators

For example, to find all songs written before 1900:

curl -XGET http://localhost:9200/music/lyrics/_search?q=year:"<1900"

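In my index this query returns only the "Deck the Halls" document (year 1887); "Casey Jones" (1909) and "Walking Boss" (1920) fall outside the range.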

d:\cmder
λ curl -XGET http://localhost:9200/music/lyrics/_search?q=year:"<1900"
{"took":20,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"music","_type":"lyrics","_id":"1","_score":1.0, "_source" : { "name": "Deck the Halls", "year": 1887, "lyrics": "Fa la la la la" }}]}}
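
Characters such as < and > are awkward on any command line. If the search is issued from a script instead, the query string can be percent-encoded so that no special characters remain in the URL. A minimal sketch with the standard library follows; search_url is a made-up helper and the host and port are the ones used throughout this post.

```python
from urllib.parse import urlencode


def search_url(index, doc_type, **params):
    """Build a _search URL with a percent-encoded query string."""
    base = "http://localhost:9200/%s/%s/_search" % (index, doc_type)
    return base + "?" + urlencode(params)


print(search_url("music", "lyrics", q="year:<1900"))
print(search_url("music", "lyrics", q="year:>1900", fields="year"))
```

This is meant for HTTP clients in scripts. Pasting a percent-encoded URL into cmd brings its own trouble, because cmd gives % a special meaning.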

Limiting fields

To limit the fields you see in the results, add the fields parameter to your query:

curl -XGET "http://localhost:9200/music/lyrics/_search?q=year:>1900&fields=year"

The result is as follows

d:\cmder
λ curl -XGET "http://localhost:9200/music/lyrics/_search?q=year:>1900&fields=year"
{"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":2,"max_score":1.0,"hits":[{"_index":"music","_type":"lyrics","_id":"2","_score":1.0,"fields":{"year":[1909]}},{"_index":"music","_type":"lyrics","_id":"3","_score":1.0,"fields":{"year":[1920]}}]}}
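
Single-line responses like this are hard to read by eye. As a sketch, the few lines below parse the exact response shown above and pull out the year of each hit; years_by_id is a name invented for this example, and the field layout is taken from this output only.

```python
import json

# The response line from the run above, copied verbatim.
response = (
    '{"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},'
    '"hits":{"total":2,"max_score":1.0,"hits":['
    '{"_index":"music","_type":"lyrics","_id":"2","_score":1.0,"fields":{"year":[1909]}},'
    '{"_index":"music","_type":"lyrics","_id":"3","_score":1.0,"fields":{"year":[1920]}}]}}'
)


def years_by_id(raw):
    """Map each hit's _id to the first value of its 'year' field."""
    hits = json.loads(raw)["hits"]["hits"]
    return {hit["_id"]: hit["fields"]["year"][0] for hit in hits}


print(years_by_id(response))
```

Note that each requested field comes back as a list, which is why the sketch takes element 0.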

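Did you notice that I deliberately used different formats in the two queries? As mentioned earlier, the http part is recognized with or without double quotes. The exception is characters that cmd treats specially: < and > (redirection) and & (command separator) must sit inside double quotes, which is why "<1900" is quoted in the first query and the whole URL in the second.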

Queries with the higher-level basic DSL

DSL stands for Domain-Specific Language. Create a file, query.json,

{
    "query" : {
        "match" : {
            "album" : "Traditional"
        }
    }
}

Run it with the command,

curl -XGET "http://localhost:9200/music/lyrics/_search" -d @query.json
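
Query files like query.json can also be generated rather than typed, which rules out stray brackets or quotes. The sketch below writes the same match query with Python's json module; write_match_query is a hypothetical helper, and the file goes to the system temp directory purely for the sake of the example.

```python
import json
import os
import tempfile


def write_match_query(path, field, text):
    """Write a one-field match query in the DSL format to path and return it."""
    query = {"query": {"match": {field: text}}}
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(query, handle, indent=4)
    return query


target = os.path.join(tempfile.gettempdir(), "query.json")
write_match_query(target, "album", "Traditional")
print(target)
```

The resulting file can be passed to curl with -d @ followed by its path, exactly like the hand-written one.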

The result is as follows

d:\cmder
λ curl -XGET http://localhost:9200/music/lyrics/_search -d @query.json
{"took":17,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":0.30685282,"hits":[{"_index":"music","_type":"lyrics","_id":"3","_score":0.30685282, "_source" : {  "artist": "Clarence Ashley",  "year": 1920,  "name": "Walking Boss",  "styles": ["folk","protest"],  "album": "Traditional",  "lyrics": "Walkin' bossWalkin' bossWalkin' bossI don't belong to youI belongI belongI belongTo that steel driving crewWell you work one dayWork one dayWork one dayThen go lay around the shanty two"}}]}}

Let's take a break here; my fingers are a bit tired from all the typing.

posted @ 2017-12-03 21:35  SpaceVision