Getting Started with Elasticsearch in .NET

I have worked on many applications during my career.  Many, if not all, had some search capabilities.  The more complex the search got, the harder it was to control its performance and its impact on database transactions.  If you also want to support full-text search, the problem gets bigger.  We could use the full-text search capabilities in Oracle or SQL Server, but we would need to set up a separate instance to limit the impact on transactions.  Or we can pick another solution that is better suited to the problem at hand.  This is where Elasticsearch comes in.

It is a web services layer set up on top of Lucene, a search engine written in Java.  As a result, we need the Java Runtime installed on every machine where we are going to install Elasticsearch, including developer machines.  I would recommend making sure you install the 64-bit version.  You can download it for free from the Oracle web site: http://www.oracle.com/technetwork/java/javase/downloads/jre8-downloads-2133155.html

 

You can download Elasticsearch at https://www.elastic.co/downloads/elasticsearch.  Let’s start by doing so.  Once you have the zip file, we will install the Windows service that hosts Elasticsearch.  This is the easiest way to run Elasticsearch on Windows.  Unzip the download, open a command window, and switch to the bin folder.  Then run service.bat to install the Windows service.  Just type the following:

service install

If you get the message “JAVA_HOME environment variable must be set”, we need to set up this environment variable.  Go to the properties of “My Computer”, select advanced system settings, then environment variables.  Add a new system variable called JAVA_HOME.  Its value should be something similar to “C:\Program Files\Java\jre1.8.0_74”.  Then run service install again.  You should see something like the following.  If so, you are done with step 1.

 

image

The next step is to start a new .NET project.  I would pick a class library to house our search code.  We will test it using a tests project.

So, start a new class library project.  I’ll call mine Search.Library.  When we integrate Elasticsearch with .NET, we should use a .NET client library.  I use NEST, which is the best one in my opinion.  To install it, use NuGet.  I would switch to the NuGet Package Manager Console window and type

install-package NEST

or you can use the package manager window.  NEST will also install its dependencies, the Elasticsearch.Net package and JSON.NET.  Technically speaking we could then start testing, but we need to take a few more steps first.

I have been using Elasticsearch for a while.  In theory, you can use a schema-less approach with it.  In practice, however, this does not work really well.  A schema has many advantages.  We can be very precise, especially for nullable data, where Elasticsearch does not really know how to index the data and so would apply default full-text search approaches to all of it.  This may or may not be what we want.

As an example, let’s define a class that we will use for samples.  Let’s call it Location, corresponding to a city.

namespace Search.Library
{
    public class Location
    {
        public int CityId { get; set; }
        public string City { get; set; }
        public string Zip { get; set; }
        public string Type { get; set; }
        public string State { get; set; }
        public string County { get; set; }
        public string AreaCodes { get; set; }
        public double Latitude { get; set; }
        public double Longitude { get; set; }
        public string WorldRegion { get; set; }
        public string Country { get; set; }
        public int EstimatedPopulation { get; set; }
        public Coordinates Coordinates { get; set; }
    }
}

 

Because we need to support geographic location, we defined a separate type, Coordinates.

namespace Search.Library
{
    public class Coordinates
    {
        public double Lat { get; set; }
        public double Lon { get; set; }
    }
}

 

We need to configure the mappings in Elasticsearch.  We may want to use free-form search on all fields; Elasticsearch refers to such a field as an analyzed field.  Analyzed fields are broken into words, then indexed for speedy word-based search.  In the case of zip codes, though, say we want to search similarly to LIKE ‘%%’ in SQL Server.  Any such fields need to be flagged as “not analyzed”.  In addition, we may want to use a custom analyzer to deal with the case sensitivity of searches on not-analyzed fields.  We also want to use the same approach for all the fields that we want to use exact match on.  We should also think about primary keys.  In this case we want to flag CityId as the id field in Elasticsearch.  I feel that thinking about your mappings and queries up front will save you some headaches down the road.
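
To make the analyzed versus not-analyzed distinction concrete, here is a toy model in plain C#.  It is only a rough sketch of the idea and does not use Elasticsearch or NEST.  The AnalysisToy class and its method names are made up for this illustration, and real analyzers do considerably more than split and lower-case.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of NEST or Elasticsearch.
public static class AnalysisToy
{
    // Rough model of an analyzed field: the value is lower-cased and
    // broken into individual words, and each word becomes a search term.
    public static string[] Analyzed(string value)
    {
        return Regex.Split(value.ToLowerInvariant(), @"\W+")
                    .Where(token => token.Length > 0)
                    .ToArray();
    }

    // Rough model of a not-analyzed field: the whole value is kept
    // unchanged as a single search term.
    public static string[] NotAnalyzed(string value)
    {
        return new[] { value };
    }
}
```

In this model, Analyzed("Stone Mountain") produces the two terms stone and mountain, so a search for either word can find the document.  NotAnalyzed("Stone Mountain") keeps one term with its original casing, which is the form that exact-match and wildcard searches work against.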

private void CreateMappings()
{
    _client.Map<Location>(descriptor =>
    {
        descriptor.Index(DefaultIndexName);
        descriptor.Properties(propertiesDescriptor =>
        {
            propertiesDescriptor.Number(loc => loc.Name(location => location.CityId));

            // Analyzed (default) string fields for word-based search.
            propertiesDescriptor.String(loc => loc.Name(location => location.City));
            propertiesDescriptor.String(loc => loc.Name(location => location.Country));
            propertiesDescriptor.String(loc => loc.Name(location => location.State));
            propertiesDescriptor.String(loc => loc.Name(location => location.Type));

            // Zip is not analyzed so it can be matched exactly or with wildcards.
            propertiesDescriptor.String(loc => loc
                .Name(location => location.Zip)
                .NotAnalyzed()
                .Analyzer(LowerCaseAnalyzerName));

            propertiesDescriptor.Number(loc => loc.Name(location => location.Latitude));
            propertiesDescriptor.Number(loc => loc.Name(location => location.Longitude));
            propertiesDescriptor.Number(loc => loc.Name(location => location.EstimatedPopulation));

            // Coordinates is mapped as a geo point for spatial search.
            propertiesDescriptor.GeoPoint(loc =>
            {
                loc.Name(location => location.Coordinates);
                loc.LatLon();
                return loc;
            });
            return propertiesDescriptor;
        });
        return descriptor;
    });
}

 

 

In the mapping creation above, _client is an instance of ElasticClient.  We run through the properties of our type, Location, and set up each one.  Zip is set up for wildcard search.  The rest of the string properties are set up for standard word-based indexed search.  Finally, I set up the Coordinates property as type GeoPoint for spatial search.  We are going to run the code in unit tests to make sure our mappings work OK.
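
This post does not show the ElasticConfiguration class that owns _client, DefaultIndexName and LowerCaseAnalyzerName, so the following is only a guess at its shape, assuming NEST 2.x.  The index name "default" and the analyzer name "customLowerCase" are taken from the test code and mapping output elsewhere in this post.  The node address, the keyword-plus-lowercase analyzer definition, and the order of the steps are my assumptions.

```csharp
using System;
using Nest;

namespace Search.Library
{
    // Hypothetical sketch; the real class is not shown in the post.
    public class ElasticConfiguration
    {
        // Values taken from the test code and mapping output in this post.
        public const string DefaultIndexName = "default";
        public const string LowerCaseAnalyzerName = "customLowerCase";

        // Assumption: a single local node on the default port.
        private static readonly Uri NodeUri = new Uri("http://localhost:9200");

        private ElasticClient _client;

        // The method name, including its spelling, follows the tests in this post.
        public ElasticClient CreatElasticClient()
        {
            var settings = new ConnectionSettings(NodeUri)
                .DefaultIndex(DefaultIndexName);
            return new ElasticClient(settings);
        }

        public void SetupMappings()
        {
            _client = CreatElasticClient();

            // Create the index with the custom analyzer before mapping,
            // because the Zip mapping refers to the analyzer by name.
            if (!_client.IndexExists(DefaultIndexName).Exists)
            {
                _client.CreateIndex(DefaultIndexName, index => index
                    .Settings(s => s
                        .Analysis(a => a
                            .Analyzers(an => an
                                // Assumed definition: keep the value as one
                                // token and lower-case it.
                                .Custom(LowerCaseAnalyzerName, ca => ca
                                    .Tokenizer("keyword")
                                    .Filters("lowercase"))))));
            }

            CreateMappings();
        }

        private void CreateMappings()
        {
            // Body as shown earlier in this post.
        }
    }
}
```

Creating the client does not contact the server; the first request does.  That is why the sketch checks for the index inside SetupMappings and not in the constructor.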

using Microsoft.VisualStudio.TestTools.UnitTesting;
using Nest;

namespace Search.Library.Tests
{
    [TestClass]
    public class SearchTests
    {
        private ElasticClient _client;

        [TestInitialize]
        public void OnInit()
        {
            _client = new ElasticConfiguration().CreatElasticClient();

            // just for testing. Should use custom index name.
            var indexExists = _client.IndexExists(
                new IndexExistsRequest(Indices.Parse(ElasticConfiguration.DefaultIndexName)));
            if (indexExists.Exists)
            {
                _client.DeleteIndex(
                    new DeleteIndexDescriptor(Indices.Parse(ElasticConfiguration.DefaultIndexName)));
            }
            _client.Refresh(new RefreshRequest(Indices.All));
            new ElasticConfiguration().SetupMappings();
        }

        [TestMethod]
        public void Should_Create_Mappings()
        {
            var config = new ElasticConfiguration();
            config.SetupMappings();
        }

        [TestMethod]
        public void Should_Add_Data()
        {
            _client = new ElasticConfiguration().CreatElasticClient();
            var loc = new Location
            {
                Type = "STANDARD",
                Coordinates = new Coordinates { Lat = 30, Lon = 40 },
                Latitude = 30,
                CityId = 1,
                EstimatedPopulation = 23,
                State = "GA",
                City = "Atlanta",
                Zip = "30000",
                Country = "USA",
                AreaCodes = "33333 44444",
                County = "Gwinnett",
                Longitude = 40,
                WorldRegion = "North America"
            };
            _client.Index(loc, descriptor =>
            {
                descriptor.Index("default");
                return descriptor;
            });
        }
    }
}

 

 

If we want to look at our mappings, we can easily do this in Chrome.  Go to extensions, search for “Sense”, and install it.  This is an Elasticsearch extension for Chrome.  You can click on the extension after that, and you will see something similar to the following.

image

To look at the mappings, just type GET _mapping and hit the green arrow.

image

Our mappings look as follows.

{
  "default": {
    "mappings": {
      "locations": {
        "properties": {
          "areaCodes": { "type": "string" },
          "city": { "type": "string" },
          "cityId": { "type": "double" },
          "coordinates": { "type": "geo_point", "lat_lon": true },
          "country": { "type": "string" },
          "county": { "type": "string" },
          "estimatedPopulation": { "type": "double" },
          "latitude": { "type": "double" },
          "longitude": { "type": "double" },
          "state": { "type": "string" },
          "type": { "type": "string" },
          "worldRegion": { "type": "string" },
          "zip": {
            "type": "string",
            "index": "not_analyzed",
            "analyzer": "customLowerCase"
          }
        }
      }
    }
  }
}

 

We will discuss queries in subsequent posts.  You can download the current project here.

posted @ 2017-06-30 15:45  a-du