Deployed Heritrix today
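Today I tried deploying the Heritrix crawler. I had never worked with Java before, so setting up the environment took quite a while. The first problem was a 32-bit versus 64-bit mismatch: on Win2008 I had installed a 32-bit JVM but downloaded the 64-bit Eclipse, which then failed with: failed to load the JNI shared library "d:\java\bin\client\jvm, meaning it could not load jvm.dll. Only later did I realize that a 64-bit Eclipse cannot load a 32-bit jvm.dll. After downloading a 64-bit JVM everything was OK.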
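To avoid guessing which JVM is actually installed, a small check like the one below can help. It is only a sketch: the class name JvmBitness is made up for illustration, and the sun.arch.data.model property is specific to HotSpot-style JVMs, so the code falls back to "unknown" when it is missing. The os.arch property typically describes the JVM build (for example x86 for a 32-bit JVM), not necessarily the operating system.

```java
public class JvmBitness {
    // Builds a one-line summary of the running JVM's architecture.
    static String describe() {
        // Standard property; on Windows this is usually "x86" or "amd64".
        String arch = System.getProperty("os.arch");
        // HotSpot-specific property ("32" or "64"); may be absent on other JVMs.
        String model = System.getProperty("sun.arch.data.model", "unknown");
        return "os.arch=" + arch + ", data model=" + model;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it with the same java.exe that Eclipse is configured to use shows whether that JVM matches the Eclipse build.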
I followed this document for the installation:
http://www.ibm.com/developerworks/cn/opensource/os-cn-heritrix/index.html
Problems I ran into during installation:
Exception in thread "main" org.mortbay.util.MultiException[java.net.BindException: Address already in use: JVM_Bind]
    at org.mortbay.http.HttpServer.start(HttpServer.java:640)
    at org.archive.crawler.SimpleHttpServer.startServer(SimpleHttpServer.java:279)
    at org.archive.crawler.Heritrix.startEmbeddedWebserver(Heritrix.java:1236)
    at org.archive.crawler.Heritrix.doCmdLineArgs(Heritrix.java:715)
    at org.archive.crawler.Heritrix.main(Heritrix.java:556)
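The cause is that the port Heritrix's embedded web server tries to bind is already in use. Either change the port or terminate the program that is occupying it.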
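Before starting Heritrix, it is possible to test whether a port is free by trying to bind it. The following is a minimal sketch, not part of Heritrix: the class name PortCheck is hypothetical, and the default of 8080 is an assumption, so pass the port your own setup uses as the first argument.

```java
import java.io.IOException;
import java.net.ServerSocket;

public class PortCheck {
    // Returns true if a server socket can be bound to the port right now.
    static boolean isPortFree(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return true;  // bind succeeded; the socket is closed on exit
        } catch (IOException e) {
            return false; // typically a BindException: address already in use
        }
    }

    public static void main(String[] args) {
        // 8080 is an assumed default; override it on the command line.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 8080;
        System.out.println("Port " + port
                + (isPortFree(port) ? " is free" : " is already in use"));
    }
}
```

If the port is reported as in use, running netstat -ano on Windows lists the PID of the process that holds it.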
The second problem was this error in Eclipse: The selection does not contain a main type
The fix: in the Libraries tab of the Build Path, remove JRE System Library, then click Add Library on the right to add it back. Odd, but it worked.