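LCZ Java Course Project: A Search Engine for the College Website
1. Team name, team members, task assignment, and links to the members' course project blogs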
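Name | About | Tasks | Course project blog link |
---|---|---|---|
李睿 (team leader) | Hair thinning by the day | Elasticsearch back-end implementation, Web front-end design, and front-end/back-end integration. | Elasticsearch back-end implementation; Web front-end design |
陈曦 | Helps serve tea and water | Crawler implementation | https://www.cnblogs.com/xiudian7/p/17040344.html |
郑博文 | In charge of picking up takeout | GUI version of the search engine | |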
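2. Project overview and technologies used
A search engine for the college website. It crawls the site, builds an index, and supports searching, summary display, and searching by time range.
Reference project: "Java Team Course Project: A College-Based Search Engine" (the class-of-2018 seniors did a really impressive job; we did our best)
Technologies used: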
- Jsoup
- HTML + CSS, JavaScript
- jQuery&jQuery-UI
- Bootstrap5
- Elasticsearch
- IK analyzer
- Servlet
- JSP
- Maven, Git
- Windows
3. Git repository of the project
https://github.com/lrui1/LCZ-SearchEngine
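4. Screenshot of the project's Git commit history
5. Preliminary research
5.1 Search home page
5.2 Search results page
6. Flowchart of the main functions
7. Object-oriented design class diagrams
Crawler module
Elasticsearch module
GUI module
8. Screenshots of the running project
9. Key code, described by module
Crawler module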
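Collect the class attribute of every div on the School of Computer Engineering website
Set<String> classSet = new HashSet<String>();
Elements div = document.getElementsByTag("div");
for (Element element : div) {
    // attr() returns an empty string when the attribute is missing
    String aClass = element.attr("class");
    // Compare by content: != on strings compares references, not text
    if (!aClass.isEmpty()) { classSet.add(aClass); }
}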
Parse the site and fill in the contents of each ResultEntry
Set<ResultEntry> backSet = new HashSet<>();
Elements select = connection.select(selection);
for (Element element : select) {
    Elements a = element.getElementsByTag("a");
    ResultEntry e = new ResultEntry();
    String href = a.attr("href");
    e.setUrl("http://cec.jmu.edu.cn/" + href);
    e.setTitle(a.attr("title"));
    try {
        // Fetch the linked page once and reuse it for both the text and the publish time
        Document page = Jsoup.connect(e.getUrl()).get();
        e.setText(page.text());
        e.setDeclareTime(getDeclearTime(page));
        backSet.add(e);
    } catch (Exception ex) {
        // Skip pages that fail to download or parse
        continue;
    }
}
return new ArrayList<>(backSet);
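The crawler builds each absolute URL by concatenating the site root with the href. As a small sketch of an alternative, assuming hrefs on the site may be relative, root-relative, or already absolute, `java.net.URI.resolve` handles all three cases. The class name `LinkResolver` and the sample paths are made up for illustration and are not part of the project.

```java
import java.net.URI;

// Hypothetical helper, not taken from the project.
class LinkResolver {
    // Site root used by the crawler in this write-up
    private static final URI BASE = URI.create("http://cec.jmu.edu.cn/");

    /** Resolves an href found on the site against the site root. */
    static String resolve(String href) {
        return BASE.resolve(href.trim()).toString();
    }

    public static void main(String[] args) {
        // The paths below are invented sample values.
        System.out.println(resolve("info/1041/1234.htm"));   // http://cec.jmu.edu.cn/info/1041/1234.htm
        System.out.println(resolve("/info/1041/1234.htm"));  // http://cec.jmu.edu.cn/info/1041/1234.htm
        System.out.println(resolve("http://www.jmu.edu.cn/index.htm")); // unchanged
    }
}
```

Note that `URI.create` throws `IllegalArgumentException` for hrefs containing illegal characters such as spaces, so a real crawler would probably want to catch that and skip the link.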
Extract the publish time
String text = connection.select("div.er_right_xnew_date").text();
// "时间:" ("time:") is the label that precedes the date on the page
String marker = "时间:";
int indexOf = text.indexOf(marker);
if (indexOf != -1) {return text.substring(indexOf + marker.length());}
else {return null;}
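The publish-time step boils down to "find a label, return what follows it". Here is a minimal standalone sketch of that logic so it can be tried without Jsoup. The class name `PublishTimeParser` and the sample strings are hypothetical, and the extra `trim()` is our own assumption about stray whitespace on the page.

```java
// Hypothetical helper, not taken from the project.
class PublishTimeParser {
    /**
     * Returns the text that follows the marker, trimmed,
     * or null when the marker does not occur in the text.
     */
    static String extract(String text, String marker) {
        int index = text.indexOf(marker);
        if (index == -1) {
            return null;
        }
        // Use the marker's own length instead of a hard-coded offset
        return text.substring(index + marker.length()).trim();
    }

    public static void main(String[] args) {
        // Invented sample input in the shape the crawler expects
        System.out.println(extract("发布时间:2022-12-30", "时间:")); // 2022-12-30
        System.out.println(extract("no label here", "时间:"));       // null
    }
}
```

Because the index mapping declares `declareTime` as a `date` field, the extracted string still has to be in a format Elasticsearch accepts; this sketch does not check that.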
Merge the list returned by a crawl function into the output list
printList.addAll(addList);
return printList;
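Merging crawl results by simple appending can produce duplicates when two listing pages link to the same article. A minimal sketch of merging by URL instead, assuming the URL identifies a page uniquely. `CrawlMerge` and its nested `Entry` class are hypothetical stand-ins for the project's `ResultEntry`, reduced to the two fields used here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not taken from the project.
class CrawlMerge {
    /** Minimal stand-in for ResultEntry with only the fields this sketch needs. */
    static final class Entry {
        final String url;
        final String title;

        Entry(String url, String title) {
            this.url = url;
            this.title = title;
        }
    }

    /** Merges both lists, keeping the first entry seen for each URL and preserving order. */
    static List<Entry> mergeByUrl(List<Entry> target, List<Entry> additions) {
        Map<String, Entry> byUrl = new LinkedHashMap<>();
        for (Entry e : target) {
            byUrl.putIfAbsent(e.url, e);
        }
        for (Entry e : additions) {
            byUrl.putIfAbsent(e.url, e);
        }
        return new ArrayList<>(byUrl.values());
    }
}
```

The same effect could be had by giving `ResultEntry` an `equals`/`hashCode` pair based on the URL; which is preferable depends on how the class is used elsewhere.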
Elasticsearch module
Connecting with the Elasticsearch Java API Client
public static ElasticsearchClient getConnect() {
    // Create the credentials provider
    final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
    credentialsProvider.setCredentials(
            AuthScope.ANY, new UsernamePasswordCredentials(USERNAME, PASSWORD));
    // Attach the credentials to the HTTP client
    RestClientBuilder builder = RestClient.builder(new HttpHost(URL, PORT))
            .setHttpClientConfigCallback(httpAsyncClientBuilder -> httpAsyncClientBuilder
                    .setDefaultCredentialsProvider(credentialsProvider));
    // Build the low-level client and wrap it in a transport
    restClient = builder.build();
    transport = new RestClientTransport(
            restClient, new JacksonJsonpMapper());
    return new ElasticsearchClient(transport);
}
Create the index and write the mapping
public boolean newIndex(Reader reader) {
CreateIndexRequest createIndexRequest = new CreateIndexRequest.Builder()
.withJson(reader)
.index(EsUtil.index)
.build();
CreateIndexResponse response = null;
try {
response = client.indices().create(createIndexRequest);
} catch (Exception e) {
// Index creation failed (for example, the index already exists); response stays null
e.printStackTrace();
}
if(response != null) {
return Objects.requireNonNullElse(response.acknowledged(), false);
} else {
return false;
}
}
public static void main(String[] args) {
Reader reader = new StringReader("{\n" +
" \"mappings\": {\n" +
" \"properties\": {\n" +
" \"url\":{\"type\": \"keyword\"},\n" +
" \"title\":{\n" +
" \"type\": \"text\",\n" +
" \"analyzer\": \"ik_max_word\", \n" +
" \"fields\": {\n" +
" \"suggest\": {\n" +
" \"type\": \"completion\",\n" +
" \"analyzer\": \"ik_max_word\"\n" +
" }\n" +
" }\n" +
" },\n" +
" \"text\":{\n" +
" \"type\": \"text\",\n" +
" \"analyzer\": \"ik_max_word\"\n" +
" },\n" +
" \"declareTime\": {\n" +
" \"type\": \"date\"\n" +
" }\n" +
" }\n" +
" }\n" +
"}");
boolean newBool = search.newIndex(reader);
}
Add a document
public ResultEntry add(ResultEntry entry) {
try {
client.index(i -> i
.index(EsUtil.index).document(entry));
} catch (IOException e) {
return null;
}
return entry;
}
Full-text search (first page by default)
public List<ResultEntry> search(String searchText, int page) {
// Pages are numbered from 1; convert to the 0-based offset of the first hit
int value = (page - 1) * 10;
SearchResponse<ResultEntry> search = null;
try {
search = client.search(s -> s
.index(EsUtil.index)
.query(q -> q
.multiMatch(m -> m
.query(searchText)
.fields("title", "text")
.analyzer("ik_smart")))
.highlight(h -> h
.preTags("<span class=\"hit-result\">")
.postTags("</span>")
.fields("title", builder -> builder)
.fields("text", builder -> builder))
.from(value)
.size(10)
, ResultEntry.class);
} catch (IOException e) {
e.printStackTrace();
}
return dealSearchResponse(search);
}
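The search method turns a 1-based page number into the `from` offset that Elasticsearch expects. A minimal sketch of that conversion as a standalone helper, with a guard for page numbers below 1 (for example, a hand-edited query string). `PageWindow` and `offsetFor` are hypothetical names, and clamping to page 1 is our assumption about the desired behavior.

```java
// Hypothetical helper, not taken from the project.
class PageWindow {
    /** Number of hits per page, matching the size(10) used in the search request. */
    static final int PAGE_SIZE = 10;

    /** Converts a 1-based page number into the 0-based offset of its first hit. */
    static int offsetFor(int page) {
        int safePage = Math.max(1, page); // treat 0 or negative pages as page 1
        return (safePage - 1) * PAGE_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(offsetFor(1)); // 0
        System.out.println(offsetFor(3)); // 20
        System.out.println(offsetFor(0)); // 0
    }
}
```

By default Elasticsearch also rejects requests where `from + size` exceeds 10,000 (the `index.max_result_window` setting), so very deep pages would need an upper bound as well.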
Search filtered by time
public List<ResultEntry> search(String searchText, int page, String beginDate) {
// Pages are numbered from 1; convert to the 0-based offset of the first hit
int value = (page - 1) * 10;
JsonData jsonBeginDate = JsonData.of(beginDate);
SearchResponse<ResultEntry> search = null;
try {
search = client.search(s -> s
.index(EsUtil.index)
.query(q -> q
.bool(b -> b
.must(b1 -> b1
.multiMatch(b2 -> b2
.query(searchText)
.fields("title", "text")
.analyzer("ik_smart")))
.filter(b3 -> b3
.range(b4 -> b4
.field("declareTime")
.gte(jsonBeginDate)))))
.highlight(h -> h
.preTags("<span class=\"hit-result\">")
.postTags("</span>")
.fields("title", builder -> builder)
.fields("text", builder -> builder))
.from(value)
.size(10)
, ResultEntry.class);
} catch (IOException e) {
e.printStackTrace();
}
return dealSearchResponse(search);
}
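The time-filtered search passes `beginDate` straight into the range filter, so a malformed value only fails once it reaches Elasticsearch. A minimal sketch of validating the parameter first, assuming the front end sends dates as `yyyy-MM-dd`. `BeginDateParam` is a hypothetical name and is not part of the project.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical helper, not taken from the project.
class BeginDateParam {
    /**
     * Parses a yyyy-MM-dd request parameter.
     * Returns an empty Optional for null, blank, or invalid input.
     */
    static Optional<LocalDate> parse(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return Optional.empty();
        }
        try {
            // LocalDate.parse uses the strict ISO-8601 local date format
            return Optional.of(LocalDate.parse(raw.trim()));
        } catch (DateTimeParseException ex) {
            return Optional.empty();
        }
    }
}
```

A caller could then fall back to the unfiltered search when the result is empty, for example `BeginDateParam.parse(param).map(d -> search(text, page, d.toString())).orElseGet(() -> search(text, page))`.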
Web front end
Search suggestions: as the user types, the page requests SearchSuggest asynchronously and renders the returned data with jQuery UI autocomplete
$(function () {
    $(".search-input").autocomplete({
        source: function (request, response) {
            $.ajax({
                type: "get",
                url: "SearchSuggest",
                // Read the body as text so it can be URI-decoded before JSON parsing
                dataType: "text",
                data: {"input": request.term},
                error: function () {
                    console.error("Load recommend data failed!");
                    // Always answer the widget, even when the request fails
                    response([]);
                },
                success: function (data) {
                    response(JSON.parse(decodeURI(data)));
                }
            });
        }
    })
})
Pagination uses jqPaginator: when the user picks a page, a GET request navigates to search.jsp to show the results
$("#my-pagination").jqPaginator({
totalPages: <%=Math.max(1, (searchCount + 9) / 10)%>,
visiblePages: 10,
currentPage: <%=currentPage%>,
first: '<li class="first page-item"><a class="page-link" href="javascript:;">First</a></li>',
prev: '<li class="prev page-item"><a class="page-link" href="javascript:;">Previous</a></li>',
next: '<li class="next page-item"><a class="page-link" href="javascript:;">Next</a></li>',
last: '<li class="last page-item"><a class="page-link" href="javascript:;">Last</a></li>',
page: '<li class="page page-item"><a class="page-link" href="javascript:;">{{page}}</a></li>',
onPageChange: function (num, type) {
$('#my-pagination-text').html('Page ' + num);
if("change" == type) { // triggered by a page change
let inputText = getQueryVariable("inputText");
let beginDate = getQueryVariable("beginDate");
// console.log("search.jsp?inputText="+inputText+"&page="+num);
if(beginDate != "") {
window.location.href = "search.jsp?inputText="+inputText+"&page="+num+"&beginDate="+beginDate;
} else {
window.location.href = "search.jsp?inputText="+inputText+"&page="+num;
}
}
}
});
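The paginator needs the total number of pages, which is the hit count divided by the page size, rounded up. Computing it as `count / 10 + 1` yields one page too many whenever the count is an exact multiple of 10 (20 hits would give 3 pages). A minimal sketch of the ceiling division follows. `PageCount` is a hypothetical name, and returning at least 1 is our assumption so that the widget always has a page to render.

```java
// Hypothetical helper, not taken from the project.
class PageCount {
    /** Returns the number of pages needed for the given hit count, never less than 1. */
    static int totalPages(long hits, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        if (hits <= 0) {
            return 1; // keep one (empty) page so the paginator has something to show
        }
        // Integer ceiling division: add pageSize - 1 before dividing
        return (int) ((hits + pageSize - 1) / pageSize);
    }

    public static void main(String[] args) {
        System.out.println(totalPages(20, 10)); // 2
        System.out.println(totalPages(21, 10)); // 3
        System.out.println(totalPages(0, 10));  // 1
    }
}
```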
Time range selection: each option in the drop-down menu has a click handler that requests search.jsp with a different GET request
$(function () {
    // Reload search.jsp with the current query and an optional start date
    function searchFrom(startDate) {
        let url = "search.jsp?inputText=" + getQueryVariable("inputText");
        if (startDate) {
            url += "&beginDate=" + myGetDate(startDate);
        }
        window.location.href = url;
    }
    $("#range-all").click(function () {
        searchFrom(null);
    });
    $("#range-week").click(function () {
        let myDate = new Date();
        myDate.setDate(myDate.getDate() - 7); // seven days ago
        searchFrom(myDate);
    });
    $("#range-month").click(function () {
        let myDate = new Date();
        myDate.setMonth(myDate.getMonth() - 1); // one month ago
        searchFrom(myDate);
    });
    $("#range-year").click(function () {
        let myDate = new Date();
        myDate.setFullYear(myDate.getFullYear() - 1); // one year ago
        searchFrom(myDate);
    });
});
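The start date for each range could also be computed on the server, which keeps the date arithmetic in one place. A minimal sketch using `java.time`, assuming the range arrives as one of the strings `week`, `month`, or `year`. Those range names, the class `DateRange`, and the method `beginDate` are all hypothetical.

```java
import java.time.LocalDate;

// Hypothetical helper, not taken from the project.
class DateRange {
    /**
     * Returns the ISO start date (yyyy-MM-dd) for the given range,
     * or null when the range means "no lower bound".
     */
    static String beginDate(String range, LocalDate today) {
        switch (range) {
            case "week":
                return today.minusWeeks(1).toString();
            case "month":
                return today.minusMonths(1).toString();
            case "year":
                return today.minusYears(1).toString();
            default:
                return null; // "all" or anything unrecognized: search without a date filter
        }
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2023, 3, 31);
        System.out.println(beginDate("week", today));  // 2023-03-24
        System.out.println(beginDate("month", today)); // 2023-02-28
        System.out.println(beginDate("year", today));  // 2022-03-31
    }
}
```

`minusMonths` clamps to the last valid day of the month, whereas JavaScript's `setMonth` rolls an invalid day over into the following month, so the two approaches can differ by a few days at month ends.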
Output the search results: find the positions of the highlight tags in each result and use them to choose the summary to display
<%
for(ResultEntry resultEntry : searchResult) {
    String text = resultEntry.getText();
    out.println("<div class=\"col col-lg-7 mt-3\">");
    out.println("<a class=\"address\" href=\""+resultEntry.getUrl()+"\" target=\"_blank\">"+resultEntry.getTitle()+"</a>");
    // Range of the text to show: from 10 characters before the first highlight
    // to 25 characters past the start of the last "span>", clamped to the text bounds
    int front = Math.max(0, text.indexOf("<span") - 10);
    int tail = Math.min(text.length(), text.lastIndexOf("span>") + 25);
    String content = text.substring(front, tail);
    out.println("<div class=\"content\">"+content+"</div>");
    out.println("<a href=\""+resultEntry.getUrl()+"\" style=\"font-size: small; color: gray\">"+resultEntry.getUrl()+"</a>");
    out.println("</div>");
}
%>
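The summary logic can be pulled out of the JSP into a plain method, which makes it easier to test. A minimal sketch, assuming highlights are wrapped in `<span ...>` and `</span>` as in the search request. `Snippet`, `around`, and the context sizes are hypothetical, and falling back to the start of the text when nothing is highlighted is our own choice.

```java
// Hypothetical helper, not taken from the project.
class Snippet {
    private static final String OPEN = "<span";
    private static final String CLOSE = "</span>";

    /**
     * Returns the part of the text that spans all highlights,
     * plus up to `before` characters of leading and `after` characters of trailing context.
     */
    static String around(String text, int before, int after) {
        int first = text.indexOf(OPEN);
        int last = text.lastIndexOf(CLOSE);
        if (first < 0 || last < 0) {
            // No highlight: show the beginning of the text instead
            return text.substring(0, Math.min(text.length(), before + after));
        }
        int start = Math.max(0, first - before);
        int end = Math.min(text.length(), last + CLOSE.length() + after);
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        String text = "0123456789<span class=\"hit-result\">foo</span>abcdefghij";
        System.out.println(around(text, 3, 4)); // 789<span class="hit-result">foo</span>abcd
    }
}
```

Measuring the trailing context from the end of the closing tag means the snippet never cuts a highlight tag in half, regardless of the context sizes chosen.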
GUI module
Initialize the page, run the search, and open the results page
private void searchActionPerformed(java.awt.event.ActionEvent evt) {
// Reset to the first page before running a new search
Page.page=1;
String text = input.getText();
Input.read(text);
Result result = new Result();
result.setVisible(true);
this.setVisible(false);
this.dispose();
}
Read the input and fetch the results of the first search
private static List<ResultEntry> results;
private static String input;
public static void read(String text){
Search search = new EsSearch();
results = search.search(text);
search.close();
input = text;
}
public static String getText(){
return input;
}
public static List<ResultEntry> getResults() {
return results;
}
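Search again from the results page and display the results
private void searchActionPerformed(java.awt.event.ActionEvent evt) {
// Note: the code below assumes at least five hits; one to four hits would throw IndexOutOfBoundsException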
String text = input.getText();
Input.read(text);
input.setText(Input.getText());
List<ResultEntry> results = Input.getResults();
Page.page=1;
nowPage.setText(Page.page+"");
if(results.size()!=0){
content1.setText(null);
content2.setText(null);
content3.setText(null);
content4.setText(null);
content5.setText(null);
content1.setText(results.get(0).getText());
content2.setText(results.get(1).getText());
content3.setText(results.get(2).getText());
content4.setText(results.get(3).getText());
content5.setText(results.get(4).getText());
}
}
Paging through the results: previous page and next page
private void prePageActionPerformed(java.awt.event.ActionEvent evt) {
// The search returns 10 hits per page and the GUI shows 5, so GUI page p maps to search page (p+1)/2
if(Page.page!=1){
List<ResultEntry> results;
Search search = new EsSearch();
Page.page--;
results = search.search(Input.getText(),(Page.page+1)/2);
for(ResultEntry x:results) {
System.out.println(x);
}
if(results.isEmpty()){
results = search.search(Input.getText(),(Page.page+2)/2);
Page.page++;
}
nowPage.setText(Page.page+"");
if(results.size()!=0){
content1.setText(null);
content2.setText(null);
content3.setText(null);
content4.setText(null);
content5.setText(null);
if(Page.page%2==1){
content1.setText(results.get(0).getText());
content2.setText(results.get(1).getText());
content3.setText(results.get(2).getText());
content4.setText(results.get(3).getText());
content5.setText(results.get(4).getText());
}else {
content1.setText(results.get(5).getText());
content2.setText(results.get(6).getText());
content3.setText(results.get(7).getText());
content4.setText(results.get(8).getText());
content5.setText(results.get(9).getText());
}
}
search.close();
}
}
private void nextPageActionPerformed(java.awt.event.ActionEvent evt) {
// Move forward one GUI page; if it has no hits, step back again
Page.page++;
List<ResultEntry> results;
Search search = new EsSearch();
results = search.search(Input.getText(),(Page.page+1)/2);
if(results.size()==0){
results = search.search(Input.getText(),(Page.page)/2);
Page.page--;
}
nowPage.setText(Page.page+"");
if(results.size()!=0){
content1.setText(null);
content2.setText(null);
content3.setText(null);
content4.setText(null);
content5.setText(null);
if(Page.page%2==1){
content1.setText(results.get(0).getText());
content2.setText(results.get(1).getText());
content3.setText(results.get(2).getText());
content4.setText(results.get(3).getText());
content5.setText(results.get(4).getText());
}else {
content1.setText(results.get(5).getText());
content2.setText(results.get(6).getText());
content3.setText(results.get(7).getText());
content4.setText(results.get(8).getText());
content5.setText(results.get(9).getText());
}
}
nowPage.setText(Page.page+"");
search.close();
}
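Both paging handlers rely on the same mapping: the search returns 10 hits per page, the GUI shows 5, so each search page covers two GUI pages. A minimal sketch of that mapping as two small functions, with bounds checks so that a short last page does not throw. `GuiPaging`, `backendPage`, and `slice` are hypothetical names, not part of the project.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not taken from the project.
class GuiPaging {
    /** Number of hits shown on one GUI page. */
    static final int GUI_PAGE_SIZE = 5;

    /** Maps a 1-based GUI page to the 1-based search page that contains it. */
    static int backendPage(int guiPage) {
        return (guiPage + 1) / 2;
    }

    /** Returns the hits of one search page that belong on the given GUI page. */
    static <T> List<T> slice(List<T> backendResults, int guiPage) {
        // Odd GUI pages show the first half of the search page, even pages the second half
        int start = (guiPage % 2 == 1) ? 0 : GUI_PAGE_SIZE;
        if (start >= backendResults.size()) {
            return Collections.emptyList();
        }
        int end = Math.min(backendResults.size(), start + GUI_PAGE_SIZE);
        return backendResults.subList(start, end);
    }
}
```

With such a helper, a handler could loop over the returned slice and fill only as many content fields as there are hits, clearing the rest.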
Hyperlink navigation: the implementation for url1 is shown; url2 to url5 work the same way
private void url1ActionPerformed(java.awt.event.ActionEvent evt) {
    Search search = new EsSearch();
    List<ResultEntry> results = search.search(Input.getText(),(Page.page+1)/2);
    search.close();
    // Odd GUI pages show hits 0-4 of the search page, even pages show hits 5-9
    int index = (Page.page % 2 == 1) ? 0 : 5;
    if (index >= results.size()) {
        return; // nothing is displayed in this slot, so there is no link to open
    }
    try {
        // Open the link in the default browser
        Desktop.getDesktop().browse(new URI(results.get(index).getUrl()));
    } catch (URISyntaxException | IOException e) {
        throw new RuntimeException(e);
    }
}
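10. Code scan results and fixes
Scan results
Fixes
if statements without braces
Missing author information in comments
Missing method descriptions
11. Project summary (shortcomings, outlook, and tasks we would like to complete next)
Through this Java team course project, the team learned Java application techniques including the Java EE specifications, writing a simple crawler, large-scale data processing, and Web development. Unfortunately, plans could not keep up with changes: the whole team fell ill, which stalled the project for a while. We also picked some software versions that were too new instead of stable releases, so there were few examples online and bugs were hard to resolve. The project is still far less complete than the one built by the class-of-2018 seniors.
At present, the crawler's fetching strategy can still be improved, the Elasticsearch back-end search does not make use of relevance scoring, the search suggestions are not very usable, the Web front end has only basic styling, and the GUI is missing some features.
In the future, the project could separate the front end from the back end, validate user input, add searching within a selectable date interval, and polish the Web interface so that it adapts better to mobile devices. We hope that junior students who continue this project can do it better.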