Tools for Natural Language Processing (repost)
- http://www.nltk.org/
- Open source Python modules, linguistic data and documentation for research and development in natural language processing, supporting dozens of NLP tasks, with distributions for Windows, Mac OS X and Linux.
- LingPipe
- LingPipe is a suite of Java libraries for the linguistic analysis of human language.
- Text simplification - Wikipedia, the free encyclopedia
- Text simplification is an operation used in natural language processing to modify, enhance, classify or otherwise process an existing corpus of human-readable text in such a way that the grammar and structure of the prose are greatly simplified, while the underlying meaning and information remain the same. Text simplification is an important area of research, because natural human languages ordinarily contain complex compound constructions that are not easily processed through automation.
- CoPT, Corpus Processing Tools
- CoPT, Corpus Processing Tools, is a set of Java classes intended to assist field linguists, NLP researchers and developers, students and software developers in all corpus-related processing.
- Jazzy - Java spell checker API
- Jazzy is a Java spell checker based on the algorithms used by aspell.
- JLinkGrammarParser
- JLinkGrammarParser is a Java port of the CMU link grammar parser, a syntactic parser for English.
- jSpellCorrect
- It’s a simple statistical spelling corrector.
- jTokeniser
- jTokeniser is a Java library for tokenising strings into a list of tokens. A variety of tokenisers are available, including a very basic whitespace tokeniser, a more flexible StringTokeniser, a couple of regular expression tokenisers, and a tokeniser that utilises Java’s BreakIterator, which provides more complex, locale-dependent tokenisation. More recently, a tokeniser that breaks text into its constituent sentences has been added. All are very simple to use.
- Linguistic Tree Constructor
- LTC is a free program for building linguistic syntax trees from text.
It lets the user build the tree in a point-and-click fashion.
The program does no analysis on its own — the user is completely free to draw the tree however he or she wishes. However, the program makes sure that the tree is a tree and not some other kind of graph.
- MII Medical NLP Toolkit
- This is a toolkit for medical natural language processing (NLP). The core engine is general enough to be used in a variety of text processing domains, though the toolkit includes specific support for medical reports and patient de-identification.
- nlpFarm
- The nlpFarm is a Natural Language Processing (NLP) resource where early research prototypes (Java) can evolve into robust and useful open source software. Our farmstead collaborates under the OpenNLP initiative, in order to make NLP software publicly available.
- OpenNLP
- OpenNLP provides the organizational structure for coordinating several different projects which approach some aspect of Natural Language Processing. OpenNLP also defines a set of Java interfaces and implements some basic infrastructure for NLP components.
- Open source natural language tools
- Toolkit for implementing question answering systems and machine translation in both controlled languages and natural languages. Includes first order logic inference, parsing and semantic analysis, and APIs and standalone server software. Currently some t
- The OpenNLP Grok Library
- Grok is a library of natural language processing components, including support for parsing with categorial grammars and various preprocessing tasks such as part-of-speech tagging, sentence detection, and tokenization.
- The OpenNLP Leo Project
- Leo is a project to provide an architecture for defining XML specifications of grammars for different natural language parsing systems and tools for using that architecture to permit sharing of grammar resources across different systems.
- The OpenNLP Maximum Entropy Package
- Maximum entropy is a powerful method for constructing statistical models of classification tasks, such as part of speech tagging in Natural Language Processing. Several example applications using maxent can be found in the OpenNLP Grok Library.
- Visuwords™ online graphical dictionary - download source code
- Download the source code for Visuwords.
- Balie
- Extraction from Text with Machine Learning and Natural Language Techniques
- FerFT: Spectral Analyzer
- This software is a multi-purpose power spectral analyzer based on the successive Fourier transformation method. (® UTD) It was developed in Java (ver.1.5) and runs on any OS with Java ver.1.5 or later installed.
- Julius Speech Recognition Engine
- Julius Speech recognition engine
- Modular Audio Recognition Framework
- MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.
- VoxForge 0.0.1
- Speech recognition support
- OpenCCG: The OpenNLP CCG Library
- OpenCCG, the OpenNLP CCG Library, is an open source natural language processing library written in Java, which provides parsing and realization services based on Mark Steedman’s Combinatory Categorial Grammar (CCG) formalism.
- Joone
- Joone (Java Object Oriented Neural Engine) is an artificial neural network Java framework. It is used to build and train neural networks with a powerful visual environment. It has a modular design and can be easily extended by writing new modules to implement new learning algorithms or architectures.
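
As a small illustration of the tokenisation strategies mentioned in the jTokeniser entry above (whitespace splitting, BreakIterator-based word tokenisation, and sentence splitting), here is a minimal sketch that uses only the JDK standard library. It does not use jTokeniser's own API; the class name `TokeniserSketch` and its method names are hypothetical and chosen only for this example.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class for illustration; not part of jTokeniser.
class TokeniserSketch {

    // Very basic whitespace tokeniser: split on runs of whitespace.
    static List<String> whitespaceTokens(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Walk the boundaries reported by a BreakIterator and keep
    // every segment that is not pure whitespace.
    static List<String> segments(BreakIterator iterator, String text) {
        List<String> out = new ArrayList<>();
        iterator.setText(text);
        int start = iterator.first();
        for (int end = iterator.next(); end != BreakIterator.DONE;
                start = end, end = iterator.next()) {
            String piece = text.substring(start, end).trim();
            if (!piece.isEmpty()) {
                out.add(piece);
            }
        }
        return out;
    }

    // Locale-dependent word tokenisation (punctuation becomes its own token).
    static List<String> wordTokens(String text, Locale locale) {
        return segments(BreakIterator.getWordInstance(locale), text);
    }

    // Locale-dependent sentence splitting.
    static List<String> sentences(String text, Locale locale) {
        return segments(BreakIterator.getSentenceInstance(locale), text);
    }

    public static void main(String[] args) {
        String text = "Tokenisers split text. Sentences can be split too.";
        System.out.println(whitespaceTokens(text));
        System.out.println(wordTokens(text, Locale.ENGLISH));
        System.out.println(sentences(text, Locale.ENGLISH));
    }
}
```

The whitespace variant leaves punctuation attached to the neighbouring word, while the BreakIterator variants follow the locale's boundary rules, which is presumably why a library would offer both.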