1. Add the jar files:
tika-core-0.10.jar
tika-parsers-0.10.jar
.....
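2. Edit solrconfig.xml, then restart the Solr instance once the changes are saved:
<!-- Replace solr-path with the directory where Solr is installed -->
<lib dir="solr-path/dist/" regex="apache-solr-cell-\d.*\.jar" />
<lib dir="solr-path/contrib/extraction/lib" regex=".*\.jar" />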
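<requestHandler name="/update/extract" class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <!-- Map the Tika "Last-Modified" metadata to the last_modified field -->
    <str name="fmap.Last-Modified">last_modified</str>
    <!-- Prefix every metadata field not defined in the schema with metadata_ -->
    <str name="uprefix">metadata_</str>
  </lst>
</requestHandler>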
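The handler defaults above only work if the target fields exist in schema.xml; otherwise Solr rejects the extracted document with an unknown-field error. The following is a minimal sketch of matching schema entries, not the author's actual schema. The field types `date` and `string` are assumptions taken from the example schema that ships with Solr, so substitute whatever types your schema defines. Any literal fields sent from the client (such as `name1` and `name2` in step 3) need to be declared in the same way.

```xml
<!-- Assumed schema.xml entries; place them alongside the other field definitions -->

<!-- Receives the mapped Last-Modified metadata value -->
<field name="last_modified" type="date" indexed="true" stored="true" />

<!-- Catches every unknown metadata field renamed by uprefix=metadata_ -->
<dynamicField name="metadata_*" type="string" indexed="true" stored="true" multiValued="true" />
```

Declaring the dynamic field as multiValued is a defensive choice here, since some document formats report the same metadata key more than once.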
3. C# calling code:
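using System;
using System.IO;
using Microsoft.Practices.ServiceLocation;
using SolrNet;

// Resolve the SolrNet operations instance (SolrNet must already be initialized).
var solr = ServiceLocator.Current.GetInstance<ISolrOperations<IndexDocument>>();

// Sends a file to /update/extract so that Tika extracts its text and Solr indexes it.
private void AddFile(ISolrOperations<IndexDocument> solr, string id, byte[] content, string resourceName)
{
    using (MemoryStream stream = new MemoryStream(content))
    {
        var response = solr.Extract(new ExtractParameters(stream, id, resourceName)
        {
            ExtractFormat = ExtractFormat.Text,
            // false = index the document; true = only return the extracted text
            ExtractOnly = false,
            // Extra literal fields stored alongside the extracted content
            Fields = new[]
            {
                new ExtractField("name1", "value1"),
                new ExtractField("name2", "value2")
            }
        });

        // Content is usually populated only when ExtractOnly is true.
        Console.WriteLine(response.Content);
    }
}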
Author: 协思
Source: http://zeeman.cnblogs.com/
QQ discussion group: 32972862