Elasticsearch 1.7: Server Setup and Getting Started

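Elasticsearch is a search server built on Lucene. It provides a distributed, multi-tenant full-text search engine behind a RESTful web interface. Elasticsearch is written in Java and released as open source under the Apache License, and it is a popular enterprise search engine. It is designed for cloud environments and offers near-real-time search that is stable, reliable, and fast, and it is easy to install and use.

At the time of writing, the latest Elasticsearch version is 2.0. Supporting frameworks such as spring-data-elasticsearch have not kept up with its release pace, so I chose a version with more mature tooling around it: 1.7.3.

(In the original post the full commands were highlighted in red; here, the command is everything after the # in each shell prompt.)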

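I. Installation

Elasticsearch needs essentially no installation; download it and it is ready to run.

Download elasticsearch-1.7.3.zip from the official site, https://www.elastic.co/downloads/elasticsearch, into /usr/local, then carry out the following steps: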

1. root@api-test:/usr/local# unzip elasticsearch-1.7.3.zip

2. root@api-test:/usr/local# cd elasticsearch-1.7.3/bin/

3. root@api-test:/usr/local/elasticsearch-1.7.3/bin# ./elasticsearch &
    [2016-03-04 17:29:30,042][INFO ][node                     ] [G-Force] version[1.7.3], pid[23165], build[05d4530/2015-10-15T09:14:17Z]
    [2016-03-04 17:29:30,046][INFO ][node                     ] [G-Force] initializing ...
    [2016-03-04 17:29:30,377][INFO ][plugins                  ] [G-Force] loaded [], sites []
    [2016-03-04 17:29:30,518][INFO ][env                      ] [G-Force] using [1] data paths, mounts [[/ (/dev/mapper/ubuntu--api--test--vg-root)]], net usable_space [90.7gb], net total_space [120.2gb], types [ext4]
    [2016-03-04 17:29:37,122][INFO ][node                     ] [G-Force] initialized
    [2016-03-04 17:29:37,122][INFO ][node                     ] [G-Force] starting ...
    [2016-03-04 17:29:37,752][INFO ][transport                ] [G-Force] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.12.206:9300]}
....

[2016-03-04 17:30:07,859][INFO ][http ] [G-Force] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.12.206:9200]}
[2016-03-04 17:30:07,860][INFO ][node ] [G-Force] started

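If you see the output above, Elasticsearch has started successfully, with the TCP transport port on 9300 and the HTTP port on 9200.

You can verify this through Elasticsearch's own RESTful interface: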

    root@api-test:/usr/local/elasticsearch-1.7.3/bin# curl localhost:9200
    {
      "status" : 200,
      "name" : "Bird-Man",
      "cluster_name" : "elasticsearch",
      "version" : {
        "number" : "1.7.3",
        "build_hash" : "05d4530971ef0ea46d0f4fa6ee64dbc8df659682",
        "build_timestamp" : "2015-10-15T09:14:17Z",
        "build_snapshot" : false,
        "lucene_version" : "4.10.4"
      },
      "tagline" : "You Know, for Search"
    }
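The same check can be scripted. The following is a minimal Python sketch, not part of the original walkthrough: it parses a root-endpoint body and pulls out the status, version number, and cluster name. The sample body is an abridged copy of the curl output above; the helper names (summarize_root, fetch_root) and the base URL default are illustrative assumptions.

```python
import json
import urllib.request

# Abridged copy of the body returned by "curl localhost:9200" above.
SAMPLE_ROOT_RESPONSE = """
{
  "status" : 200,
  "name" : "Bird-Man",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "1.7.3",
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}
"""


def summarize_root(body):
    """Return (is_ok, version_number, cluster_name) from a root-endpoint body."""
    info = json.loads(body)
    return info.get("status") == 200, info["version"]["number"], info["cluster_name"]


def fetch_root(base="http://localhost:9200"):
    """Fetch the root endpoint from a running node (the base URL is an assumption)."""
    with urllib.request.urlopen(base, timeout=5) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Uses the sample so the script runs without a live node;
    # pass fetch_root() instead to query a real one.
    print(summarize_root(SAMPLE_ROOT_RESPONSE))
```

Checking the version number this way is handy before installing plugins, since plugin releases are tied to specific Elasticsearch versions.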

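II. Basic Operations

For a full-text search tool, indexing is a key process: only through indexing is the data analyzed and stored and the inverted index built, which is what lets users find relevant information.

Elasticsearch has three key terms: index, type, and ID. If you compare Elasticsearch to a relational database, an index corresponds to a database, a type to a table, and an ID to the unique key of a row.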
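To make the index/type/ID triple concrete, here is a small Python sketch showing how the three parts combine into the REST path of a single document. The function name document_url and the default base URL are assumptions made for illustration.

```python
from urllib.parse import quote


def document_url(index, doc_type, doc_id, base="http://localhost:9200"):
    """Build the REST URL of one document: <base>/<index>/<type>/<id>."""
    # Percent-encode each segment so unusual IDs cannot break the path.
    parts = [quote(str(part), safe="") for part in (index, doc_type, doc_id)]
    return base.rstrip("/") + "/" + "/".join(parts)


if __name__ == "__main__":
    # "Database" testdb, "table" testtable, "row key" 1.
    print(document_url("testdb", "testtable", 1))
```

In the database analogy, this URL plays the role of "SELECT ... FROM testdb.testtable WHERE id = 1": every document is addressable by exactly these three values.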

Creating an index is simple and is done through the RESTful API.

1. Create an index named testdb with a type named testtable

root@api-test:/home/clonen.cheng# curl -XPUT http://localhost:9200/testdb -d '{
   "mappings" : {
      "testtable" : {
         "properties" : {
            "name" : {
               "type" : "string"
            },
            "sex" : {
               "type" : "integer"
            }
         }
      }
   }
}'
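For readers who prefer to issue the request from code, the sketch below builds the same PUT request with Python's standard library. It only constructs the request object; sending it is left to the caller. The function name create_index_request and the base URL default are illustrative assumptions.

```python
import json
import urllib.request


def create_index_request(index, doc_type, properties, base="http://localhost:9200"):
    """Build (but do not send) a PUT request that creates an index with one mapped type."""
    body = {"mappings": {doc_type: {"properties": properties}}}
    return urllib.request.Request(
        url=f"{base}/{index}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
    )


# Same mapping as the curl command above: a string field and an integer field.
req = create_index_request(
    "testdb",
    "testtable",
    {"name": {"type": "string"}, "sex": {"type": "integer"}},
)

if __name__ == "__main__":
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it to a running node: urllib.request.urlopen(req)
```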

You can check that the index was created with the following command:

root@api-test:/home/clonen.cheng# curl localhost:9200/testdb?pretty
{
  "testdb" : {
    "aliases" : { },
    "mappings" : {
      "testtable" : {
        "properties" : {
          "name" : {
            "type" : "string"
          },
          "sex" : {
            "type" : "integer"
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1457085335345",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "version" : {
          "created" : "1070399"
        },
        "uuid" : "73Z4UoXhTzCRJJVeW3aKMA"
      }
    },
    "warmers" : { }
  }
}

2. Add data to the index

root@api-test:/home/clonen.cheng# curl -XPUT localhost:9200/testdb/testtable/1 -d '{"name":"zhangsan","sex":"1"}'
{"_index":"testdb","_type":"testtable","_id":"1","_version":1,"created":true}

This adds one document, with name zhangsan and sex 1, to the index created above. You can view it with the following command:
root@api-test:/home/clonen.cheng# curl -XGET localhost:9200/testdb/testtable/1?pretty
{
  "_index" : "testdb",
  "_type" : "testtable",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source":{"name":"zhangsan","sex":"1"}
}
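A client usually wants only the stored document from this response. The Python sketch below extracts _source when found is true and returns None otherwise; the sample body is the response shown above in compact form, and the helper name source_or_none is an assumption.

```python
import json

# The GET response shown above, in compact form.
SAMPLE_GET_RESPONSE = (
    '{"_index":"testdb","_type":"testtable","_id":"1","_version":1,'
    '"found":true,"_source":{"name":"zhangsan","sex":"1"}}'
)


def source_or_none(body):
    """Return the stored document from a GET-by-ID response body, or None if not found."""
    doc = json.loads(body)
    return doc["_source"] if doc.get("found") else None


if __name__ == "__main__":
    print(source_or_none(SAMPLE_GET_RESPONSE))
```

Note that sex comes back as the string "1" even though the field is mapped as integer: _source holds the JSON exactly as it was submitted, and the mapping only governs how the value is indexed.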

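3. Create an alias for the index

Aliases are a very useful Elasticsearch feature because they allow hot switching. In production in particular, when an index has to be rebuilt, features that are already in use are not affected.

For that reason I recommend that external callers use the alias rather than the index name itself; I have learned this from experience.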

root@api-test:/home/clonen.cheng# curl -XPOST localhost:9200/_aliases -d '
{
    "actions": [
        { "add": {
            "alias": "testdb_index",
            "index": "testdb"
        }}
    ]
}'

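This creates an alias, testdb_index, for the index above.

4. Switch the index that the alias points to

As mentioned above, changing requirements may force us to change the structure of the original index. Rather than rebuilding the original index in place, we can point the alias at a new index and so switch over without interruption.

First repeat step 1 to create another index, testdb1 (omitted here), then perform the switch: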

root@api-test:/home/clonen.cheng# curl -XPOST localhost:9200/_aliases -d '
{
    "actions": [
        { "remove": {
            "alias": "testdb_index",
            "index": "testdb"
        }},
        { "add": {
            "alias": "testdb_index",
            "index": "testdb1"
        }}
    ]
}'

With that, testdb_index switches seamlessly from the testdb index to the testdb1 index.
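The switch works because both actions travel in a single _aliases request, which Elasticsearch applies atomically, so callers never see the alias missing. A minimal Python sketch of building that request body follows; the function name alias_swap_body is an assumption.

```python
import json


def alias_swap_body(alias, old_index, new_index):
    """Build the _aliases body that moves an alias from old_index to new_index."""
    return {
        "actions": [
            # Order mirrors the curl example: detach from the old index, attach to the new one.
            {"remove": {"alias": alias, "index": old_index}},
            {"add": {"alias": alias, "index": new_index}},
        ]
    }


if __name__ == "__main__":
    # POST this JSON to http://localhost:9200/_aliases
    print(json.dumps(alias_swap_body("testdb_index", "testdb", "testdb1"), indent=4))
```

Keeping the remove and add in one body is the important design point: sending them as two separate requests would leave a short window in which the alias resolves to nothing.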

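III. Using Analyzers

For indexing, the thing people probably care about most is word segmentation. For Chinese text the option usually mentioned for ES is smartcn (an official plugin; the built-in default is the standard analyzer), but its results are not very good. Fortunately medcl, one of the earliest people in China to work with ES, has written two Chinese analysis plugins, one based on IK and one on mmseg. The two are much alike; the steps below cover installing IK from the command line:

1. From https://github.com/medcl/elasticsearch-analysis-ik, find the IK version that matches your Elasticsearch version as closely as possible; here I chose elasticsearch-analysis-ik-1.4.1.zip.

2. root@api-test:/usr/local# unzip elasticsearch-analysis-ik-1.4.1.zip

3. Go into the extracted directory, find the config directory, and copy its ik directory and elasticsearch.yml into Elasticsearch's config directory.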

root@api-test:/usr/local# cd elasticsearch-analysis-ik-1.4.1/config

The directory contains the ik directory and elasticsearch.yml (the directory listing from the original post is not reproduced here). Copy both into Elasticsearch's config directory:

root@api-test:/usr/local/elasticsearch-analysis-ik-1.4.1/config# cp -r ik elasticsearch.yml /usr/local/elasticsearch-1.7.3/config

4. Edit elasticsearch.yml and confirm that IK is configured correctly; the relevant configuration is shown below.

root@api-test:/usr/local/elasticsearch-analysis-ik-1.4.1/config# cd /usr/local/elasticsearch-1.7.3/config

root@api-test:/usr/local/elasticsearch-1.7.3/config# vi elasticsearch.yml

################################## Security ################################

   # Uncomment if you want to enable JSONP as a valid return transport on the
   # http server. With this enabled, it may pose a security risk, so disabling
   # it unless you need it is recommended (it is disabled by default).
   #
   #http.jsonp.enable: true

index:
  analysis:                   
    analyzer:      
      ik:
          alias: [ik_analyzer]
          type: org.elasticsearch.index.analysis.IkAnalyzerProvider
      ik_max_word:
          type: ik
          use_smart: false
      ik_smart:
          type: ik
          use_smart: true

5. Restart Elasticsearch
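After the restart, one way to confirm the analyzer is active is to run sample text through the _analyze endpoint. The Python sketch below only builds the URL. It assumes the Elasticsearch 1.x form of the API, which accepts analyzer and text as query parameters, and it assumes the request is made against an existing index (testdb here) so that the analyzers configured above can be resolved. The function name analyze_url is hypothetical.

```python
from urllib.parse import urlencode


def analyze_url(text, analyzer="ik", index="testdb", base="http://localhost:9200"):
    """Build a 1.x-style _analyze URL that runs text through the named analyzer."""
    # urlencode percent-encodes the text as UTF-8, which matters for Chinese input.
    query = urlencode({"analyzer": analyzer, "text": text, "pretty": "true"})
    return f"{base}/{index}/_analyze?{query}"


if __name__ == "__main__":
    # Compare the fine-grained and the coarse-grained IK analyzers on the same text.
    for name in ("ik_max_word", "ik_smart"):
        print(analyze_url("中华人民共和国", analyzer=name))
```

Opening each printed URL with curl against a running node should list the tokens produced; comparing ik_max_word with ik_smart on the same text shows the difference between the two configurations.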

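IV. Syncing Database Data into the Index

Elasticsearch has plugins for syncing data from a database; here we use elasticsearch-jdbc.

Open the project page, https://github.com/jprante/elasticsearch-jdbc, and pick a suitable version; here I use elasticsearch-jdbc-1.7.3.0-dist.zip. It needs essentially no extra installation; just follow the steps in the project's documentation.

1. Unpack it in a suitable directory, such as /opt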

root@api-test:/opt# unzip elasticsearch-jdbc-1.7.3.0-dist.zip

root@api-test:/opt# cd elasticsearch-jdbc-1.7.3.0/bin

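2. The bin directory contains many example scripts, such as the simple mysql-state-example.sh, as well as scheduled scripts, geo scripts, and so on. You can go through them one by one to learn the syntax; here we copy a scheduled script for the demonstration.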

root@api-test:/opt/elasticsearch-jdbc-1.7.3.0/bin# cp mysql-schedule.sh mysql-test-schedule.sh

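Then adapt the template slightly and it can serve as a script for real use. I will not go through every detail; here is the adapted script (pay particular attention to the settings described in the notes at the top):

    #!/bin/bash
    # bash rather than sh, because BASH_SOURCE below is bash-specific.
    #
    # Notes on the settings (kept outside the payload so it stays strict JSON):
    #   schedule       cron-style schedule; this one runs once a minute
    #   url            JDBC connection string
    #   sql            one or more SQL statements; views are used here, and view
    #                  columns whose names match index field names are imported
    #                  automatically (very practical)
    #   elasticsearch  address of the search cluster (transport port)
    #   index, type    target index and type names

    DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
    bin=${DIR}/../bin
    lib=${DIR}/../lib

    echo '
    {
        "type" : "jdbc",
        "jdbc" : {
            "schedule" : "0 0-59 0-23 ? * *",
            "url" : "jdbc:mysql://xxxx:3306/testdb?useUnicode=true&characterEncoding=utf-8&zeroDateTimeBehavior=convertToNull",
            "user" : "xxxx",
            "password" : "xxxx",
            "sql" : ["select * from view_search_meal", "select * from view_search_event"],
            "elasticsearch" : {
                "cluster" : "elasticsearch",
                "host" : "localhost",
                "port" : 9300
            },
            "max_bulk_actions" : 20000,
            "max_concurrent_bulk_requests" : 10,
            "index" : "fullbiz_index",
            "type" : "testdb_index",
            "statefile" : "statefile.json",
            "metrics" : {
                "enabled" : true,
                "interval" : "1m",
                "logger" : {
                    "plain" : false,
                    "json" : true
                }
            }
        }
    }
    ' | java \
        -cp "${lib}/*" \
        -Dlog4j.configurationFile=${bin}/log4j2.xml \
        org.xbib.tools.Runner \
        org.xbib.tools.JDBCImporter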
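Hand-editing a JSON payload inside a shell script is error-prone. JSON itself has no comment syntax, so whether inline annotations are tolerated depends on the parser, and a stray comma is easy to miss. As a hedged alternative, the Python sketch below generates the core of the payload with json.dumps so that it is always well-formed. The function name jdbc_importer_config is hypothetical; the keys and values mirror the script above, and the optional bulk, statefile, and metrics settings are left out for brevity.

```python
import json


def jdbc_importer_config(url, user, password, statements, index, doc_type,
                         schedule="0 0-59 0-23 ? * *"):
    """Build the core of the importer definition as a plain dict (keys as in the script above)."""
    return {
        "type": "jdbc",
        "jdbc": {
            "schedule": schedule,        # default: once a minute
            "url": url,                  # JDBC connection string
            "user": user,
            "password": password,
            "sql": list(statements),     # one or more SQL statements
            "elasticsearch": {
                "cluster": "elasticsearch",
                "host": "localhost",
                "port": 9300,
            },
            "index": index,              # target index name
            "type": doc_type,            # target type name
        },
    }


if __name__ == "__main__":
    config = jdbc_importer_config(
        url="jdbc:mysql://xxxx:3306/testdb",
        user="xxxx",
        password="xxxx",
        statements=["select * from view_search_meal", "select * from view_search_event"],
        index="fullbiz_index",
        doc_type="testdb_index",
    )
    # Pipe this output into the java command in place of the echo block.
    print(json.dumps(config, indent=4))
```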

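3. Start the script. First make sure Elasticsearch is already running; about a minute after the script starts, you should see data being imported into the index.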

root@api-test:/opt/elasticsearch-jdbc-1.7.3.0/bin# ./mysql-test-schedule.sh &
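The roughly one-minute wait comes from the schedule value in the script, "0 0-59 0-23 ? * *". Assuming it follows the Quartz cron layout (seconds first, then minutes, hours, day of month, month, day of week, and an optional year), it fires at second 0 of every minute. The Python sketch below simply labels the fields so the expression is easier to read; the function name describe_schedule is hypothetical.

```python
# Field order assumed from the Quartz cron format; a seventh "year" field is optional.
QUARTZ_FIELDS = ("second", "minute", "hour", "day_of_month", "month", "day_of_week")


def describe_schedule(expression):
    """Split a Quartz-style cron expression into named fields."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError(f"expected 6 or 7 fields, got {len(parts)}")
    named = dict(zip(QUARTZ_FIELDS, parts))
    if len(parts) == 7:
        named["year"] = parts[6]
    return named


if __name__ == "__main__":
    for field, value in describe_schedule("0 0-59 0-23 ? * *").items():
        print(f"{field:>12}: {value}")
```

Read this way, minute 0-59 and hour 0-23 cover every minute of every hour, and second 0 pins each run to the start of the minute.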

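4. You can use the query commands shown earlier to check whether the database data has been imported. If you have Kibana installed, checking is even more convenient.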

Posted on 2016-03-07 14:42 by 虾米&老黄牛
