wget http://www.coreseek.cn/uploads/csft/3.1/Source/csft-3.1.tar.gz

wget http://www.coreseek.cn/uploads/csft/3.1/Source/mmseg-3.1.tar.gz 

Install mmseg

$ ./configure --prefix=/usr/local/mmseg

$ make

$ make install

Install libiconv

wget http://ftp.gnu.org/gnu/libiconv/libiconv-1.14.tar.gz

$ ./configure

$ make

$ make install

Install coreseek

$ ./configure --prefix=/usr/local/csft --with-mmseg=/usr/local/mmseg/bin/mmseg --with-mmseg-includes=/usr/local/mmseg/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg/lib/

$ make

$ make install
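
The coreseek configure line points at three locations under the mmseg install prefix, so it can help to confirm they exist before building. The following is a minimal sketch, not part of the original notes: `check_mmseg_prefix` is a hypothetical helper name, and the paths it checks are simply the ones handed to `--with-mmseg`, `--with-mmseg-includes` and `--with-mmseg-libs`.

```shell
# Hypothetical helper: check that an mmseg install prefix contains the
# paths handed to coreseek's ./configure (binary, headers, libraries).
check_mmseg_prefix() {
    prefix="$1"
    status=0
    if [ ! -x "$prefix/bin/mmseg" ]; then
        echo "missing executable: $prefix/bin/mmseg" >&2
        status=1
    fi
    if [ ! -d "$prefix/include/mmseg" ]; then
        echo "missing directory: $prefix/include/mmseg" >&2
        status=1
    fi
    if [ ! -d "$prefix/lib" ]; then
        echo "missing directory: $prefix/lib" >&2
        status=1
    fi
    return "$status"
}
```

Call it with whatever prefix was passed to mmseg's `./configure --prefix=...`; a non-zero exit status means at least one of the three paths is missing, which usually points to a mistyped prefix.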

If the build fails, see http://blog.csdn.net/hhq163/article/details/5864470 and make the following change so that libiconv is linked in.

In ./src/Makefile, line 182, change
LIBS = -lm -lexpat -L/usr/local/lib
to
LIBS = -lm -lexpat -liconv -L/usr/local/lib
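
The same edit can be scripted. This is only a sketch: `add_iconv_lib` is a made-up helper name, it assumes the `LIBS` line looks like the one shown above, and it matches the line by content rather than by number, since the line number may differ between releases.

```shell
# Hypothetical helper: add -liconv to the LIBS line of a Makefile.
# A backup of the original file is kept next to it with a .bak suffix.
add_iconv_lib() {
    makefile="$1"
    # Only touch lines starting with "LIBS = " that do not already
    # contain -liconv, and insert the flag right after -lexpat.
    sed -i.bak '/^LIBS = /{/-liconv/!s/-lexpat/-lexpat -liconv/;}' "$makefile"
}
```

Running `add_iconv_lib src/Makefile` a second time leaves the file unchanged, because the line already contains `-liconv`.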

MySQL:

/etc/mysql/my.cnf

[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock
default-character-set=utf8

[mysqld]
#
# * Basic Settings
#

#
# * IMPORTANT
# If you make changes to these settings and your system uses apparmor, you may
# also need to also adjust /etc/apparmor.d/usr.sbin.mysqld.
#

user = mysql
socket = /var/run/mysqld/mysqld.sock
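# Note: MySQL 5.5 and later no longer accept default-character-set under [mysqld];
# use character-set-server=utf8 in this section on those versions.
default-character-set=utf8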


mysql> grant all privileges on test.* to 'test'@'%' identified by 'test';
Query OK, 0 rows affected (0.00 sec)

mysql> grant all privileges on test.* to 'test'@'localhost' identified by 'test';
Query OK, 0 rows affected (0.00 sec)


vi /usr/local/coreseek/data/dict/mmseg.ini

[mmseg]
merge_number_and_ascii=1;
number_and_ascii_joint=-;
compress_space=0;
seperate_number_ascii=1;

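Explanation:

/*
merge_number_and_ascii: whether a run of consecutive letters and digits is kept together instead of being split
number_and_ascii_joint: characters that may join digits and letters, such as - or .
compress_space: seems to have no effect
seperate_number_ascii: whether to split numbers into single digits, e.g. 1988 -> 1/x 9/x 8/x 8/x

*/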

Chinese support:

vi etc/csft.conf

# charset_type = sbcs
charset_type = utf-8
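
A commented-out line is easy to overlook, so it can be worth confirming which `charset_type` is actually active after editing. The check below is a rough sketch: `active_charset_type` is a hypothetical helper name, and it is a plain text match that does not understand the structure of csft.conf (if several index blocks set the option, each value is printed).

```shell
# Hypothetical helper: print the value of every active (uncommented)
# charset_type line in a Sphinx/coreseek config file.
active_charset_type() {
    # Lines starting with '#' do not match, so commented settings are skipped.
    sed -n 's/^[[:space:]]*charset_type[[:space:]]*=[[:space:]]*//p' "$1"
}
```

For the two lines shown above, `active_charset_type etc/csft.conf` would print only the value from the uncommented line.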

 

Indexing works, but queries from the client return garbled text. See: http://jjw.in/server/226

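So far Chinese search seems to work, but when I run search directly from PuTTY, the Chinese text in the results comes back garbled. I have not found the cause yet; calling it from PHP may behave better. I will add notes if anything turns up.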
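
The notes leave the cause open. One common cause of this symptom, offered here as an assumption and not a diagnosis of this particular setup, is that the terminal session is not using UTF-8 while the index returns UTF-8 text. A quick check of the shell side might look like this (`is_utf8_charmap` is a hypothetical helper name):

```shell
# Hypothetical helper: succeed if the given charmap name means UTF-8.
is_utf8_charmap() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# Report what the current shell session is using.
if is_utf8_charmap "$(locale charmap 2>/dev/null)"; then
    echo "locale is UTF-8"
else
    echo "locale is not UTF-8; UTF-8 search results may display as garbage"
fi
```

PuTTY has its own setting as well (Window > Translation > Remote character set), which would also need to be UTF-8 for the output to display correctly.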
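Tomorrow I hope to merge the dictionaries and finish the experiment of querying from PHP. Getting incremental indexing working too would be even better ^ ^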
That’s all for today.

-------- 2010/4/12 --------
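Today at the office I tried out the PHP API. It turned out that yesterday's MySQL encoding problem was not fully solved: csft.conf still needed
sql_query_pre = SET NAMES utf8
before Chinese text was read correctly. So I looked up some material and checked the company's MySQL settings, and found that MySQL has supported a parameter called init_connect since 4.1.2. It is a statement the server runs for each new connection, so setting
[mysqld]
init_connect='SET NAMES utf8'
is enough.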
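
One caveat worth knowing: MySQL does not execute init_connect for accounts that have the SUPER privilege, so an indexer that connects as root would still need `sql_query_pre`. To double-check what a config file sets, a small helper along these lines could be used. It is a sketch under assumptions: `mysqld_init_connect` is a hypothetical name, and it only recognizes the underscore spelling of the option inside the `[mysqld]` section.

```shell
# Hypothetical helper: print the init_connect value found in the
# [mysqld] section of a my.cnf-style file (other sections are ignored).
mysqld_init_connect() {
    awk '
        # Track whether the current section header is [mysqld].
        /^\[/ { in_mysqld = ($0 == "[mysqld]") }
        # Inside [mysqld], strip everything up to "=" and print the value.
        in_mysqld && /^[ \t]*init_connect[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")
            print
        }
    ' "$1"
}
```

For example, `mysqld_init_connect /etc/mysql/my.cnf` prints the configured statement, or nothing if the option is absent. On a running server, `SHOW VARIABLES LIKE 'init_connect';` shows the value actually in effect.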

posted on 2013-05-24 13:30  @且听风吟@