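Elasticsearch Study Notes
On macOS, install the Java JDK first.
Then run brew install elasticsearch. After installing, remember to set the Elasticsearch service to start automatically at login and to start it right away (for example, brew services start elasticsearch should do both).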
Following the elasticsearch-rails gem's instructions, add two gems to the project's Gemfile:

    gem 'elasticsearch-model', git: 'git://github.com/elasticsearch/elasticsearch-rails.git'
    gem 'elasticsearch-rails', git: 'git://github.com/elasticsearch/elasticsearch-rails.git'
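Product requirement: search for stadiums by address and name.
For text that is entirely Chinese, the default settings are good enough, so the search function is wrapped like this:

    module Searchable
      extend ActiveSupport::Concern

      included do
        include Elasticsearch::Model
        include Elasticsearch::Model::Callbacks

        # Empty mapping: rely on the default analysis settings
        mapping do
        end

        # Search the name and address fields for the given text
        def self.search(query)
          __elasticsearch__.search(
            {
              query: {
                multi_match: {
                  query: query,
                  fields: [ "name", "address" ]
                }
              }
            }
          )
        end
      end
    end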
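To see exactly what the search method above sends to Elasticsearch, the query body can be built and printed on its own. This is only a minimal sketch: build_search_body is a hypothetical helper name that does not exist in the gem, and it simply mirrors the hash used in the concern.

```ruby
require 'json'

# Hypothetical helper (not part of elasticsearch-model): builds the same
# multi_match body that Searchable.search passes to __elasticsearch__.search.
def build_search_body(query)
  {
    query: {
      multi_match: {
        query: query,
        fields: %w[name address]
      }
    }
  }
end

# Print the JSON that would be sent as the request body
puts JSON.pretty_generate(build_search_body("Chaoyang"))
```

In the Rails app itself, the call would presumably look like Stadium.search("Chaoyang").records, which loads the matching ActiveRecord objects.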
The model is set up as follows:

    require 'elasticsearch/model'

    class Stadium < ActiveRecord::Base
      include Searchable

      has_many :fields
      belongs_to :city
      belongs_to :sport
      validates_presence_of :status_gap, :available
    end

    # Delete the previous index every time and recreate it
    Stadium.__elasticsearch__.client.indices.delete index: Stadium.index_name rescue nil
    Stadium.__elasticsearch__.client.indices.create \
      index: Stadium.index_name,
      body: { settings: Stadium.settings.to_hash, mappings: Stadium.mappings.to_hash }

    # Index all existing records
    Stadium.import
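However, this does not help with partial matching. For values such as "15936525874" or "tom",
typing "to", "om", or "936" should also return results, but those searches find nothing.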
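The failure makes sense once you consider that matching happens on whole tokens. The sketch below is a deliberately rough stand-in for the default analyzer (lowercase, split on non-alphanumeric characters), not Elasticsearch's real implementation; simple_tokens and term_match? are hypothetical names used only for illustration.

```ruby
# Rough approximation of default analysis for ASCII text (an assumption,
# not the real analyzer): lowercase and split on non-alphanumerics.
def simple_tokens(text)
  text.downcase.scan(/[[:alnum:]]+/)
end

# A document matches only if some query token equals an indexed token exactly.
def term_match?(doc_text, query_text)
  doc_tokens = simple_tokens(doc_text)
  simple_tokens(query_text).any? { |token| doc_tokens.include?(token) }
end

puts term_match?("tom", "tom")          # whole token: matches
puts term_match?("tom", "to")           # prefix only: no match
puts term_match?("15936525874", "936")  # substring only: no match
```

Because "tom" and "15936525874" are each indexed as a single token, a fragment such as "to" or "936" never equals an indexed token.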
Following the official tutorial, I rewrote the search function using wildcards:

    module Searchable
      extend ActiveSupport::Concern

      included do
        include Elasticsearch::Model
        include Elasticsearch::Model::Callbacks

        # When using wildcards, it is best to index the fields as not_analyzed
        # to reduce overhead
        mapping do
          indexes :name, index: "not_analyzed"
          indexes :address, index: "not_analyzed"
          indexes :contact_phone, index: "not_analyzed"
        end

        # Wrap the input in wildcards so it matches anywhere in the field
        def self.search(query)
          __elasticsearch__.search(
            {
              query: {
                query_string: {
                  query: "*#{query}*",
                  fields: [ "name", "address", "contact_phone" ]
                }
              }
            }
          )
        end
      end
    end
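Patterns that begin with a wildcard are very expensive, though, and should be avoided. For now the priority is getting the feature working, so I am leaving it like this.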
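One practical caveat with interpolating raw user input into a query_string query: characters such as +, : or ( are operators in that syntax and can cause parse errors. Below is a minimal sketch of escaping the input first. escape_query_string and wildcard_query are hypothetical helpers, and the reserved-character list reflects my understanding of the query_string documentation (where < and > cannot be escaped, so they are dropped); check it against the Elasticsearch version you run.

```ruby
# Characters treated as reserved in query_string syntax (assumed list)
RESERVED = '\\+-=!(){}[]^"~*?:/&|'.chars
RESERVED_PATTERN = Regexp.union(RESERVED)

# Hypothetical helper: drop < and >, backslash-escape the other reserved characters
def escape_query_string(input)
  input.delete('<>').gsub(RESERVED_PATTERN) { |char| "\\#{char}" }
end

# Hypothetical helper: build the wildcard pattern used in the search method
def wildcard_query(input)
  "*#{escape_query_string(input)}*"
end

puts wildcard_query("tom")
puts wildcard_query("a+b")
```

The result of wildcard_query(query) could then be used in place of "*#{query}*" in the search method.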
Fuzzy query
The basic format of a fuzzy query is:

    "fuzzy" : {
      "price" : {
        "value" : 12,
        "fuzziness" : 2
      }
    }
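When value is a number or a date, the query turns into a range query around that value.
For the example above, it becomes a search for the range 10 <= price <= 14, that is, value minus fuzziness up to value plus fuzziness.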
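The arithmetic behind that range is simple enough to sketch. This assumes the numeric behavior described above (value plus or minus fuzziness, bounds included); fuzzy_numeric_range is a hypothetical helper, not an Elasticsearch API.

```ruby
# Hypothetical helper: the inclusive range a numeric fuzzy query would cover,
# assuming it is value - fuzziness .. value + fuzziness.
def fuzzy_numeric_range(value, fuzziness)
  (value - fuzziness)..(value + fuzziness)
end

range = fuzzy_numeric_range(12, 2)
puts range.min            # 10
puts range.max            # 14
puts range.cover?(13)     # true
puts range.cover?(15)     # false
```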
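When value is a string, a concept called "edit distance" comes into play; see this article for details.
For example, to find a user whose name is "tom", I used the following:

    "fuzzy" : {
      "name" : {
        "value" : "to",
        "fuzziness" : 2
      }
    }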
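Turning "to" into "tom" takes only one edit (inserting "m"), so it matches even with fuzziness set to 1.
For phone numbers or longer user names (for example "tommy"), however, it does not match: going from "to" to "tommy" takes three edits, which is more than the maximum fuzziness of 2 allows.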
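The edit distances involved can be checked with a small Levenshtein implementation. This is a plain textbook sketch for illustration; Elasticsearch's own fuzzy matching may additionally count a transposition of adjacent characters as a single edit, which does not affect these examples.

```ruby
# Classic Levenshtein distance: the minimum number of single-character
# insertions, deletions, and substitutions needed to turn a into b.
def edit_distance(a, b)
  prev = (0..b.length).to_a            # distances from "" to each prefix of b
  a.each_char.with_index(1) do |char_a, i|
    curr = [i]                         # distance from a[0, i] to ""
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      curr << [
        prev[j] + 1,                   # delete char_a
        curr[j - 1] + 1,               # insert char_b
        prev[j - 1] + cost             # substitute (or keep if equal)
      ].min
    end
    prev = curr
  end
  prev.last
end

puts edit_distance("to", "tom")            # 1
puts edit_distance("to", "tommy")          # 3
puts edit_distance("936", "15936525874")   # 8
```

With a fuzziness of at most 2, only the first pair can match, which is why fuzzy queries are a poor fit for substring search on phone numbers.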