Elasticsearch Queries and an Introduction to Logstash

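Query DSL:

Queries are sent in the request body.

They fall into two categories:

query DSL: used for full-text searches; each match is ranked by its relevance score.

Query execution is comparatively expensive, and the results are not cached.

filter DSL: used for exact-value searches; each document is judged simply as "yes" (it matches) or "no" (it does not).

Filters are fast, and their results are cached.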

Structure of a query clause:

{
  QUERY_NAME: {
    ARGUMENT: VALUE,
    ARGUMENT: VALUE,...
  }
}

Or, when the clause applies to a specific field:

{
  QUERY_NAME: {
    FIELD_NAME: {
      ARGUMENT: VALUE,...
    }
  }
}
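As a rough illustration of the two shapes above, the sketch below builds each one as a Python dictionary and serializes it to a JSON request body. The helper names `simple_clause` and `field_clause` are made up for this example; they are not part of any Elasticsearch client.

```python
import json


def simple_clause(query_name, **arguments):
    """Build {QUERY_NAME: {ARGUMENT: VALUE, ...}}."""
    return {query_name: dict(arguments)}


def field_clause(query_name, field_name, **arguments):
    """Build {QUERY_NAME: {FIELD_NAME: {ARGUMENT: VALUE, ...}}}."""
    return {query_name: {field_name: dict(arguments)}}


# First shape: the clause takes its arguments directly (here, none at all).
match_all_body = {"query": simple_clause("match_all")}

# Second shape: the arguments are nested under a field name.
range_body = {"query": field_clause("range", "age", gte=20, lt=30)}

# Either string could be passed to curl with -d.
print(json.dumps(match_all_body))
print(json.dumps(range_body))
```

Building the body as a dictionary and serializing it once avoids the quoting and comma mistakes that are easy to make when the JSON is typed by hand.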
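filter DSL:

term filter: exactly matches documents that contain the specified term.

The term query performs no analysis on the input text, so it looks for exactly the value that is supplied.

{ "term": {"name": "Guo"} }

curl -XGET 'localhost:9200/students/_search?pretty' -d '{
  "query": { "term": { "first_name": "jing" } }
}'

Note: the "j" in this query must not be capitalized, or nothing is found. Assuming first_name is an analyzed string field (the default), its values were lowercased at index time, while the term value is compared as-is.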

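The case-sensitivity behaviour of the term lookup can be illustrated with a toy inverted index. This is only a simplified model: `analyze` here just splits on non-alphanumeric characters and lowercases, which stands in for (but is not) the real Elasticsearch analyzer, and the two sample documents are invented.

```python
import re


def analyze(text):
    """Rough stand-in for an analyzer: split into tokens and lowercase them."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def build_index(docs, field):
    """Map each indexed term to the set of document ids that contain it."""
    index = {}
    for doc_id, doc in docs.items():
        for term in analyze(doc[field]):
            index.setdefault(term, set()).add(doc_id)
    return index


def term_lookup(index, value):
    """Like a term query: the value is looked up as-is, with no analysis."""
    return index.get(value, set())


def match_lookup(index, text):
    """Like a match query: the input is analyzed first, then each term is looked up."""
    hits = set()
    for term in analyze(text):
        hits |= index.get(term, set())
    return hits


# Hypothetical documents; the stored terms end up as "jing" and "rong".
docs = {1: {"first_name": "Jing"}, 2: {"first_name": "Rong"}}
index = build_index(docs, "first_name")

lower_hits = term_lookup(index, "jing")       # finds document 1
upper_hits = term_lookup(index, "Jing")       # finds nothing
match_hits = match_lookup(index, "Rong,Jing")  # analyzed, so finds both
```

Because the index only ever contains lowercased terms, an unanalyzed lookup for "Jing" has nothing to match against, while an analyzed lookup succeeds regardless of case.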
terms filter: exact matching against multiple values.

{ "terms": { "name": ["Guo", "Rong"] }}

curl -XGET 'localhost:9200/students/_search?pretty' -d '{
  "query": {
    "terms": {
      "age": [25,24,23]
    }
  }
}'

range filter: finds numeric or date values within a specified range.

curl -XGET 'localhost:9200/students/_search?pretty' -d '{
  "query": {
    "range": {
      "age": {
        "gte": 20,
        "lt": 30
      }
    }
  }
}'
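The meaning of the bounds can be sketched in a few lines of Python. The function `in_range` is a hypothetical helper for illustration only; it covers the four bound names `gte`, `gt`, `lte` and `lt` and nothing else.

```python
import operator

# Each bound name maps to the comparison it stands for.
RANGE_OPS = {
    "gte": operator.ge,  # greater than or equal to
    "gt": operator.gt,   # strictly greater than
    "lte": operator.le,  # less than or equal to
    "lt": operator.lt,   # strictly less than
}


def in_range(value, bounds):
    """Return True when value satisfies every bound in the dictionary."""
    return all(RANGE_OPS[name](value, limit) for name, limit in bounds.items())


bounds = {"gte": 20, "lt": 30}  # the same bounds as the curl example
ages = [19, 20, 25, 29, 30]
matching = [age for age in ages if in_range(age, bounds)]
```

With these bounds, 20 is included because `gte` is inclusive, and 30 is excluded because `lt` is strict.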

bool filter:

Combines multiple filter clauses using boolean logic.

must: every clause inside it must match, i.e. AND.

must: [
  { "term": { "age": 25 } },
  { "term": { "gender": "Female" } }
]

must_not: none of its clauses may match, i.e. NOT.

must_not: {
  "term": { "age": 25 }
}

should: at least one of its clauses must match, i.e. OR.

should: [
  { "term": { "age": 25 } },
  { "term": { "gender": "Female" } }
]
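The AND / NOT / OR behaviour described above can be modelled on in-memory documents. This sketch follows the simplified description in this post; in Elasticsearch itself the handling of should also depends on the minimum_should_match setting. The function names and the three sample students are invented for the example.

```python
def term_matches(doc, clause):
    """Evaluate one {"term": {field: value}} clause against a document."""
    field, value = next(iter(clause["term"].items()))
    return doc.get(field) == value


def bool_matches(doc, must=(), must_not=(), should=()):
    """Combine term clauses with must (AND), must_not (NOT) and should (OR)."""
    if not all(term_matches(doc, clause) for clause in must):
        return False
    if any(term_matches(doc, clause) for clause in must_not):
        return False
    # An empty should list places no constraint in this sketch.
    if should and not any(term_matches(doc, clause) for clause in should):
        return False
    return True


# Hypothetical sample data.
students = [
    {"name": "a", "age": 25, "gender": "Female"},
    {"name": "b", "age": 25, "gender": "Male"},
    {"name": "c", "age": 30, "gender": "Female"},
]
age_25 = {"term": {"age": 25}}
female = {"term": {"gender": "Female"}}

both = [s["name"] for s in students if bool_matches(s, must=[age_25, female])]
either = [s["name"] for s in students if bool_matches(s, should=[age_25, female])]
not_25 = [s["name"] for s in students if bool_matches(s, must_not=[age_25])]
```

Note that each clause list holds separate term objects. Putting two "term" keys into a single JSON object would not work, because the second key overwrites the first.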

query DSL:

match_all query:

Matches all documents. If no query is specified, match_all is used by default.

curl -XGET 'localhost:9200/students/_search?pretty' -d '{
  "query": { "match_all": {} }
}'

match query:

Runs a full-text or exact-value query on almost any field.

For a full-text query, the query string is analyzed first:

{ "match": {"students": "Guo" }}

For an exact-value query, it searches for the exact value; in that case a filter is recommended instead of a query.

curl -XGET 'localhost:9200/students/_search?pretty' -d '{
  "query": { "match": { "first_name": "Rong,Jing" } }
}'

multi_match query:

Runs the same query against multiple fields.

curl -XGET 'localhost:9200/students/_search?pretty' -d '{
  "query": {
    "multi_match": {
      "query": "Rong,Jing",
      "fields": [ "last_name","first_name" ]
    }
  }
}'

Combining a filter and a query:

{
  "filtered": {
    "query": { "match": {"gender": "Female"} },
    "filter": { "term": {"age": 25} }
  }
}
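The filtered query belongs to the Elasticsearch 1.x generation that this post is based on; from Elasticsearch 2.0 it was deprecated in favour of a bool query with a filter clause. The sketch below builds both request bodies from the same query and filter so the two forms can be compared. The helper names `filtered_body` and `bool_body` are made up for this example.

```python
import json


def filtered_body(query, flt):
    """Elasticsearch 1.x style: wrap a query and a filter in "filtered"."""
    return {"query": {"filtered": {"query": query, "filter": flt}}}


def bool_body(query, flt):
    """Elasticsearch 2.0+ style: the same pair expressed as a bool query."""
    return {"query": {"bool": {"must": query, "filter": flt}}}


query = {"match": {"gender": "Female"}}  # scored, full-text part
flt = {"term": {"age": 25}}              # yes/no, cacheable part

old_style = filtered_body(query, flt)
new_style = bool_body(query, flt)

print(json.dumps(old_style, indent=2))
print(json.dumps(new_style, indent=2))
```

In both forms the filter narrows the candidate documents without affecting the score, while the match clause decides how the remaining documents are ranked.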

The other two components of the ELK stack:

L: Logstash

K: Kibana

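Logstash:

Supports many ways of collecting data: over TCP/UDP, from files, syslog, Windows Event Logs, STDIN, and so on. Once the data has been collected, Logstash can filter and modify it.

It is written in JRuby, so it must run on a JVM, and it follows an agent/server model.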

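Installing Logstash: first download the package logstash-1.5.6-1.noarch.rpm from the official site, which is the same site as Elasticsearch's.

yum -y install logstash-1.5.6-1.noarch.rpm

By default the Logstash executables are installed under /opt/logstash/bin/, so add that directory to PATH: vim /etc/profile.d/logstash.sh

export PATH=/opt/logstash/bin:$PATH

Then start a new shell to pick up the change:

exec bash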

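By default, configuration is read from every file ending in .conf under /etc/logstash/conf.d/; the directory is empty after installation.

Create a simple configuration file in that directory: vim sample.conf

input {
  stdin {}
}

output {
  stdout {
    codec => rubydebug
  }
}

Here codec => rubydebug selects the rubydebug codec, which pretty-prints each event. Taken as a whole, the configuration reads data from standard input and writes it to standard output.

Check that the configuration syntax is correct:

logstash -f /etc/logstash/conf.d/sample.conf --configtest

Run Logstash, then type hello; the event is printed in a structured, JSON-like form:

logstash -f /etc/logstash/conf.d/sample.conf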

{
  "message" => "hello",
  "@version" => "1",
  "@timestamp" => "2016-07-20T07:57:19.458Z",
  "host" => "centos7"
}
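To make the four fields of that event concrete, here is a small Python sketch that assembles a dictionary with the same keys. It is not Logstash code and only mimics the shape of the output; `make_event` is a hypothetical helper, and the timestamp format is inferred from the sample above (UTC with millisecond precision).

```python
import socket
from datetime import datetime, timezone


def make_event(line, host=None):
    """Build a dictionary with the same four fields as the event shown above."""
    now = datetime.now(timezone.utc)
    # UTC timestamp with milliseconds, shaped like 2016-07-20T07:57:19.458Z
    timestamp = now.strftime("%Y-%m-%dT%H:%M:%S") + ".%03dZ" % (now.microsecond // 1000)
    return {
        "message": line.rstrip("\n"),           # the raw line that was typed
        "@version": "1",                        # event schema version
        "@timestamp": timestamp,                # when the event was created
        "host": host or socket.gethostname(),   # machine that produced it
    }


event = make_event("hello\n", host="centos7")
```

The message field carries the input itself, while the other three fields are metadata that is attached to every event.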

Basic structure of a Logstash configuration:

input {
  ...
}

filter {
  ...
}

output {
  ...
}

Logstash workflow: input | filter | output. If the data needs no extra processing, the filter stage can be omitted.
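The input | filter | output flow can be sketched as a chain of Python generators. This is only a conceptual model of the three stages, not how Logstash is implemented; every function name here is invented, and `uppercase_filter` is an arbitrary stand-in for a real filter plugin.

```python
def stdin_input(lines):
    """Input stage: wrap each raw line in an event dictionary."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def uppercase_filter(events):
    """Filter stage (hypothetical): modify each event as it passes through."""
    for event in events:
        event["message"] = event["message"].upper()
        yield event


def list_output(events, sink):
    """Output stage: deliver events to a destination, here a plain list."""
    for event in events:
        sink.append(event)


def run_pipeline(lines, sink, filters=()):
    """Chain input -> optional filters -> output."""
    events = stdin_input(lines)
    for stage in filters:
        events = stage(events)
    list_output(events, sink)


unfiltered = []
run_pipeline(["hello\n"], unfiltered)  # filter stage omitted

filtered = []
run_pipeline(["hello\n"], filtered, filters=[uppercase_filter])
```

Leaving `filters` empty corresponds to a configuration with no filter block: events pass from input to output unchanged.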

There are four types of plugins:

input, filter, codec, output

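Data types:

Array: [item1, item2,...]

Boolean: true, false

Bytes: a string that gives a size with a unit

Codec: an encoder/decoder

Hash: key => value

Number: a numeric value

Password: a string whose value is not logged or printed

Path: a file system path

String: a string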

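Field references: a field is referenced by wrapping its name in square brackets, e.g. [field]; a nested field is written [field][subfield].

Conditionals support the following operators:
Comparison: ==, !=, <, <=, >, >=
Regular expression match: =~, !~
Inclusion: in, not in
Boolean: and, or
Grouping: ()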
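How a field reference resolves, and what the conditional operators test, can be sketched in Python. The helper `get_field` and the sample event are hypothetical, and the Python expressions only mirror the meaning of the Logstash operators; they are not Logstash syntax.

```python
import re


def get_field(event, reference):
    """Resolve a reference such as "[response][status]" in a nested event."""
    keys = re.findall(r"\[([^\[\]]+)\]", reference)
    if not keys:
        # A bare name such as "message" refers to a top-level field.
        keys = [reference]
    value = event
    for key in keys:
        if not isinstance(value, dict) or key not in value:
            return None  # the field does not exist
        value = value[key]
    return value


# Hypothetical event with one nested field.
event = {
    "message": "GET /index.html",
    "type": "apache",
    "response": {"status": 404},
}

status = get_field(event, "[response][status]")
missing = get_field(event, "[response][size]")

# Python equivalents of a few conditional operators:
is_error = status >= 400                                            # >=
is_get = re.search(r"^GET", get_field(event, "message")) is not None  # =~
known_type = get_field(event, "[type]") in ["apache", "nginx"]      # in
is_get_error = is_get and (is_error or status == 0)                 # and, or, ()
```

Each pair of square brackets descends one level into the event, which is why a nested field needs one bracketed name per level.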

Posted on 2016-07-20 16:18 by Howareyou?
