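System Configuration Guide for the Ontology Learning Program (ontoEnrich), 32-bit
1. System environment
32-bit Ubuntu
- The source tree already includes the .o files compiled on a 32-bit system. After setting up the dependencies (step 2), follow step 3 to re-link them.
Once linking completes without errors, the program can be run.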
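2. Dependencies
2.1 boost_1_44_0
2.1.1 Ubuntu 64-bit
See the boost_1_44_0 installation instructions.
2.1.2 Ubuntu 32-bit
- The source tree already includes boost_1_44_0 as compiled on a 32-bit system. Extract boost_1_44_0.tar and place the extracted files under /usr/local.
- Build Boost.Regex: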
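If {$BOOST_PATH}/libs/regex/build/ does not contain a folder named gcc, run the following commands in {$BOOST_PATH}/libs/regex/build/ to build the shared library:
    make -f gcc-shared.mak
    make -f gcc.mak
    sudo ln -s "$PWD/gcc/libboost_regex-gcc-1_42.so" /usr/local/lib
    sudo ln -s "$PWD/gcc/libboost_regex-gcc-1_42.so" /usr/lib
ln -s stores the target path exactly as given, so the target must be an absolute path. The build output is expected in the gcc subfolder, which is also where the makefile in step 3 looks for it.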
If the gcc folder exists but contains only *.o files and no *.so file, run the following commands in {$BOOST_PATH}/libs/regex/build/gcc to build the shared library:
    g++ *.o -fPIC -shared -o libboost_regex-gcc-1_42.so
    sudo ln -s "$PWD/libboost_regex-gcc-1_42.so" /usr/local/lib
    sudo ln -s "$PWD/libboost_regex-gcc-1_42.so" /usr/lib
Configure the path to libboost_regex-gcc-1_42.so with the following steps:
1. cd ~
2. gedit .bashrc
   Add the following two lines to .bashrc, then save and exit:
       export BOOST_PATH="/usr/local/boost_1_44_0"
       export LD_LIBRARY_PATH=$BOOST_PATH/libs/regex/build/gcc
3. source .bashrc
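As a quick check of the path configuration, the sketch below looks for a file on a colon-separated search path such as LD_LIBRARY_PATH. The function name find_on_path is hypothetical and not part of ontoEnrich. It only tests that the file is present; it does not confirm that the dynamic loader will accept the library.

```shell
#!/bin/sh
# Print the first directory on a colon-separated search path that contains
# the given file; return non-zero if no directory does.
# find_on_path is a hypothetical helper name.
find_on_path() {
    lib_name=$1
    search_path=$2
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        if [ -n "$dir" ] && [ -e "$dir/$lib_name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Example: report where (if anywhere) the Boost.Regex library is visible.
find_on_path libboost_regex-gcc-1_42.so "${LD_LIBRARY_PATH:-}" \
    || echo "libboost_regex-gcc-1_42.so not found on LD_LIBRARY_PATH"
```

Run it in a new terminal (or after source .bashrc) so that the exported variables are in effect.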
2.2 Install MySQL
Install MySQL, including mysql-server, mysql-client, and libmysqlclient-dev:
    sudo apt-get install mysql-server
    sudo apt-get install mysql-client
    sudo apt-get install libmysqlclient-dev
The program uses a database named wikipedia, with the user 'root'@'localhost' and the password 'root'. If the root user's password is not root, be sure to change it.
Create the wikipedia database:
    > create database wikipedia;
    > use wikipedia;
Restore the data: load wikipedia_mysql_backup (which contains the create table and insert statements) into the wikipedia database:
    > source wikipedia_mysql_backup;
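To make the database setup repeatable, the statements from this section can be generated as one script and piped into the mysql client. This is only a sketch: wikipedia_restore_sql is a hypothetical helper name, IF NOT EXISTS is added so that a second run does not fail on the existing database, and the backup path is assumed to be valid relative to the directory where mysql is started.

```shell
#!/bin/sh
# Emit the statements from this section as a single script on stdout.
# wikipedia_restore_sql is a hypothetical helper name, not part of ontoEnrich.
wikipedia_restore_sql() {
    backup_file=$1
    printf 'CREATE DATABASE IF NOT EXISTS wikipedia;\n'
    printf 'USE wikipedia;\n'
    # "source" is a mysql client command that executes a script file.
    printf 'source %s;\n' "$backup_file"
}

# Usage (assumes the mysql client is installed; -p prompts for the password):
#   wikipedia_restore_sql wikipedia_mysql_backup | mysql -u root -p
```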
2.3 {program path}/ontoEnrich/ltp-service
Edit ltp-service/__ltpService/LTPOption.h and make the following changes:
    //#define LINUX_OS  ->  #define LINUX_OS
    #define WIN_OS      ->  //#define WIN_OS
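The same two changes can be scripted with sed. The sketch below is an assumption about the file layout rather than a description of the real header: it expects each #define on its own line, optionally indented, and switch_ltp_to_linux is a hypothetical helper name. It writes a .bak copy before editing.

```shell
#!/bin/sh
# Enable LINUX_OS and disable WIN_OS in an LTPOption.h-style header.
# switch_ltp_to_linux is a hypothetical helper; it assumes each #define
# sits on its own line, optionally preceded by whitespace.
switch_ltp_to_linux() {
    header=$1
    # Keep a copy of the file as it was before this run.
    cp "$header" "$header.bak" || return 1
    # First expression: uncomment LINUX_OS. Second: comment out WIN_OS.
    sed -e 's|^\([[:space:]]*\)//[[:space:]]*#define LINUX_OS|\1#define LINUX_OS|' \
        -e 's|^\([[:space:]]*\)#define WIN_OS|\1//#define WIN_OS|' \
        "$header.bak" > "$header"
}

# Usage, from {program path}/ontoEnrich:
#   switch_ltp_to_linux ltp-service/__ltpService/LTPOption.h
```

Lines that are already in the desired state are left unchanged, so the edit can be repeated safely; note that a second run overwrites the .bak copy.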
Inside the ltp-service folder, run the following three commands in order:
    ./configure
    make
    sudo make install
make requires g++. If g++ is not installed (check with g++ -v), install it with sudo apt-get install g++.
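The g++ check can be made part of the build command itself. The following is a minimal sketch; require_tool is a hypothetical helper name, and the apt-get hint it prints assumes the package is named after the tool unless a second argument is given.

```shell
#!/bin/sh
# Return 0 if the named tool is on PATH; otherwise print an install hint
# to stderr and return 1. require_tool is a hypothetical helper name.
require_tool() {
    tool=$1
    package=${2:-$1}
    if command -v "$tool" >/dev/null 2>&1; then
        return 0
    fi
    echo "$tool not found; try: sudo apt-get install $package" >&2
    return 1
}

# Usage, inside the ltp-service folder:
#   require_tool g++ && require_tool make && ./configure && make && sudo make install
```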
The main libraries the program needs at run time (libutil, libxml4nlp, and libservice) are in /usr/local/lib.
3. Build the source files
3.1 Replace {program path}/ontoEnrich/system/makefile with the following makefile:
# If a file is reported as not found, add the required INCLUDE and LINK paths.
# When making changes, compare against the original makefile and do not edit the original (back it up first).
INCLUDE = ./ -I ../segment -I ../conceptRecognise -I ../utility \
	-I ../regexMatch -I ../include -I ../suffixTree -I ../wikiProject \
	-I ../relationExtractor -I ../simWord -I ../clustering -I ../pattern \
	-I /usr/local/include -I /usr/local/boost_1_44_0
VPATH =:../segment:../conceptRecognise:../utility:../suffixTree \
	:../regexMatch:../wikiProject:../utility:../relationExtractor \
	:../simWord:../clustering:../pattern
BOOSTLIBS = -L /usr/local/boost_1_44_0/libs/regex/build/gcc -lboost_regex-gcc-1_42

object1=text.o corpus.o simpleConceptExtractor.o
object2=getRulePattern.o regexMatch.o
object3=wikiCategory.o zh2sim.o connectMysql.o regexMatch.o
object4=CWikiNetworkTrainer.o fire.o CWikiNetwork.o StrFun.o
object5=suffixTree.o charConverter.o
object6=relationPopulation.o mark.o kmeans.o patternUtility.o pattern.o synForest.o
object7=clustering.o wordVector.o distance.o tree.o
object8=getPattern.o editDistanceCal.o patternGenerator.o
object=$(object1) $(object2) $(object3) $(object4) $(object5) $(object6) $(object7) $(object8) \
	wikiInfoExtractor.o wikiInfoExtractor.o compoundConceptExtractor.o \
	addElement.o ontoLearner.o ontologyEnrichment.o sentParser.o

ontologyEnrichment:$(object)
	g++ -o ontologyEnrichment -g $^ -I$(INCLUDE) -lmysqlclient ${BOOSTLIBS} \
	-L ../segment -lsegment -L /usr/local/lib -lutil -lxml4nlp -lservice

myUtility.o:myUtility.cpp
	g++ -g -c $^ -I$(INCLUDE) -L ../segment -lsegment

#simpleConceptLearner.o:$(object1)
#	g++ -g -o simpleConceptLearner.o $^ -L../segment -lsegment
simpleConceptExtractor.o:simpleConceptExtractor.cpp
	g++ -c -g $^ -I$(INCLUDE)
text.o:text.cpp
	g++ -c -g $^ -I$(INCLUDE) -L ../segment -lsegment
corpus.o:corpus.cpp
	g++ -c -g $^ -I$(INCLUDE) -L ../segment -lsegment

# getRuleFile.o:$(object2)
#	g++ -g -o getRuleFile.o $^ -I$(INCLUDE) -L../ -lsegment -lboost_regex-gcc-1_42
getRulePattern.o:getRulePattern.cpp
	g++ -g -c $^ -I$(INCLUDE)
regexMatch.o:regexMatch.cpp
	g++ -c -g $^ -I$(INCLUDE) ${BOOSTLIBS}

wikiInfoExtractor.o:wikiInfoExtractor.cpp
	g++ -g -c $^ -I$(INCLUDE) ${BOOSTLIBS} -L ../segment -lsegment
compoundConceptExtractor.o:compoundConceptExtractor.cpp
	g++ -g -c $^ -I$(INCLUDE) -L ../segment -lsegment

# cateRel.o:$(object3)
#	g++ -g -o cateRel.o $^ -I$(INCLUDE) -lboost_regex-gcc-1_42 -lmysqlclient -L ../segment -lsegment
wikiCategory.o:wikiCategory.cpp
	g++ -c -g $^ -I$(INCLUDE) ${BOOSTLIBS} -L ../segment -lsegment
zh2sim.o:zh2sim.cpp
	g++ -g -c $^ -I$(INCLUDE)
connectMysql.o:connectMysql.cpp
	g++ -g -c $^ -I$(INCLUDE) -lmysqlclient

addElement.o:addElement.cpp
	g++ -g -c $^ -I$(INCLUDE) -lmysqlclient
ontoLearner.o:ontoLearner.cpp
	g++ -g -c $^
ontologyEnrichment.o:ontologyEnrichment.cpp
	g++ -g -c $^

CWikiNetworkTrainer.o: CWikiNetworkTrainer.cpp
	g++ -g CWikiNetworkTrainer.cpp -c -o CWikiNetworkTrainer.o -I$(INCLUDE)
fire.o :fire.cpp
	g++ -g $^ -c -o fire.o -I$(INCLUDE)
CWikiNetwork.o : CWikiNetwork.cpp
	g++ -g CWikiNetwork.cpp -c -o CWikiNetwork.o -I$(INCLUDE)
StrFun.o: StrFun.cpp
	g++ -g StrFun.cpp -c -I$(INCLUDE)

suffixTree.o:suffixTree.cpp
	g++ -c -g $^ -I$(INCLUDE) -L ../segment -lsegment
charConverter.o:charConverter.cpp
	g++ -c -g $^

relationPopulation.o:relationPopulation.cpp
	g++ -c $^ -I$(INCLUDE)
mark.o:mark.cpp
	g++ -c $^ -I$(INCLUDE)
kmeans.o:kmeans.cpp
	g++ -c $^ -I$(INCLUDE)
patternUtility.o:patternUtility.cpp
	g++ -c $^ -I$(INCLUDE)
pattern.o:pattern.cpp
	g++ -c $^ -I$(INCLUDE)
synForest.o:synForest.cpp
	g++ -g -c $^ -I$(INCLUDE) -L ../segment -lsegment

clustering.o:clustering.cpp
	g++ -c -g $^ -I$(INCLUDE)
wordVector.o:wordVector.cpp
	g++ -c -g $^ -I$(INCLUDE)
distance.o:distance.cpp
	g++ -c -g $^
tree.o:tree.cpp
	g++ -c -g $^

getPattern.o:getPattern.cpp
	g++ -g -c $^ -I$(INCLUDE) ${BOOSTLIBS}
patternGenerator.o:patternGenerator.cpp
	g++ -g -c $^ -I$(INCLUDE) ${BOOSTLIBS}
editDistanceCal.o:editDistanceCal.cpp
	g++ -g -c $^ -I$(INCLUDE)

sentParser.o:sentParser.cpp
	g++ -g -c $^ -I$(INCLUDE) -L /usr/local/lib -lutil -lxml4nlp -lservice

clean:
	rm $(object) ontologyEnrichment
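3.2 Build the program:
    cd {program path}/ontoEnrich/system
    make
If make reports that "xx" is up to date, the target is already newer than all of its prerequisites, so nothing is rebuilt. To force a rebuild, run make clean first or use make -B.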
3.3 Run the program:
    cd {program path}/ontoEnrich/system
    ./ontologyEnrichment
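If the program stops at startup with a shared-library error, ldd can show which libraries the dynamic loader cannot resolve. The sketch below filters ldd output down to the unresolved names; list_missing_libs is a hypothetical helper name, and it relies on ldd marking such entries with "=> not found".

```shell
#!/bin/sh
# Read ldd output on stdin and print the name of every library that
# ldd reports as "not found". list_missing_libs is a hypothetical name.
list_missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage, from {program path}/ontoEnrich/system:
#   ldd ./ontologyEnrichment | list_missing_libs
```

An empty result suggests that every library was resolved. A name such as libboost_regex-gcc-1_42.so in the output points back to the LD_LIBRARY_PATH setting from step 2.1.2.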
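4. Program code overview
Main program: ontoEnrich/system/ontologyEnrichment.cpp
[Concept learning]
1. Simple concept learning: simpleConceptLearner(): ontoEnrich/conceptRecognise/simpleConceptExtractor.cpp
2. Compound concept learning: compoundConceptLearner(): ontoEnrich/conceptRecognise/compoundConceptExtractor.cpp
[Relation learning]
1. Relation learning from Wikipedia infoboxes: infoboxExtractor(): ontoEnrich/wikiProject/wikiInfoExtractor.cpp
2. Relation learning from Wikipedia category names: categoryExtractor(): ontoEnrich/wikiProject/wikiCategory.cpp
3. Relation learning from Wikipedia links: linkExtractor(): ontoEnrich/wikiProject/fire.cpp
4. Identifying taxonomic relations between concepts with a generalized suffix tree: suffixTreeLearner()
5. Learning taxonomic relations between concepts with hierarchical clustering: hierClusteringLearner()
6. Learning relations between specific concepts by pattern matching (method 1): patternRelationExtractor()
7. Learning relations between specific concepts by pattern matching (method 2): patternLearner()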