



Material Type: Project; Class: LITERARY THEORY; Subject: German; University: University of California - Los Angeles; Term: Spring 2004;




Title openNLP Interface
Version 0.0-
Date 2009-06-
Author Ingo Feinerer, Kurt Hornik
Maintainer Kurt Hornik
Imports methods, rJava (>= 0.6-3), tm
Enhances tm
Suggests openNLPmodels.en, openNLPmodels.es
SystemRequirements Java (>= 5.0)
Description An interface to openNLP (http://opennlp.sourceforge.net/), a collection of natural language processing tools including a sentence detector, tokenizer, pos-tagger, shallow and full syntactic parser, and named-entity detector, using the Maxent Java package for training and using maximum entropy models.
License LGPL-2.
Encoding UTF-
Repository CRAN
Date/Publication 2009-06-27 09:04:
R topics documented:
sentDetect
tagPOS
tmSentDetect-methods
tmTagPOS-methods
tmTokenize-methods
tokenize
Index
sentDetect Detect sentences
Description
Detect sentences.
Usage
sentDetect(s, language = "en", model = NULL)
Arguments
s         A character vector with texts from which sentences should be detected.
language  A character string giving the language of s. This argument is only used if model is NULL for selecting a default model. At the moment, languages ‘en’ (English), ‘es’ (Spanish), ‘de’ (German) and ‘th’ (Thai) are supported, provided that the corresponding openNLP model language packages (openNLPmodels.en, ...) are available.
model     A model.
Details
If model is NULL then a default model for sentence detection is loaded from the corresponding openNLP models language package.
Value
A character vector resulting from sentence detection in s.
Author(s)
Ingo Feinerer
References
OpenNLP http://opennlp.sourceforge.net/
Examples
s <- "This is a sentence. This another---but with dash-like structures, and some commas. Maybe another with question marks? Sure!"
sentDetect(s, language = "en")
s <- "¿Como se llama usted? El castellano es la lengua española oficial del Estado."
sentDetect(s, language = "es")
tmSentDetect-methods Methods for Function tmSentDetect in Package ‘openNLP’
Description
Methods for function tmSentDetect in package openNLP.
Methods
object = "PlainTextDocument" Detect sentences in object and return the object.
Examples
if(!require("tm")) stop("could not load tm package")
data("crude")
crude[[1L]]
tmSentDetect(crude[[1L]])
tmTagPOS-methods Methods for Function tmTagPOS in Package ‘openNLP’
Description
Methods for function tmTagPOS in package openNLP.
Methods
object = "PlainTextDocument" Tag part-of-speech in object and return the object.
Examples
if(!require("tm")) stop("could not load tm package")
data("crude")
crude[[1L]]
tmTagPOS(crude[[1L]])
tmTokenize-methods Methods for Function tmTokenize in Package ‘openNLP’
Description
Methods for function tmTokenize in package openNLP.
Methods
object = "PlainTextDocument" Tokenize object and return the object.
Examples
if(!require("tm")) stop("could not load tm package")
data("crude")
crude[[1L]]
tmTokenize(crude[[1L]])
tokenize Tokenizer
Description
Tokenizes the input.
Usage
tokenize(s, language = "en", model = NULL)
Arguments
s         A character vector of texts to be tokenized.
language  A character string giving the language of s. This argument is only used if model is NULL for selecting a default model. At the moment, languages ‘en’ (English), ‘es’ (Spanish), ‘de’ (German) and ‘th’ (Thai) are supported, provided that the corresponding openNLP model language packages (openNLPmodels.en, ...) are available.
model     A model.
Details
If model is NULL then a default model for tokenization is loaded from the corresponding openNLP models language package.
Index
sentDetect
tagPOS
tmSentDetect (tmSentDetect-methods)
tmSentDetect,PlainTextDocument-method (tmSentDetect-methods)
tmSentDetect-methods
tmTagPOS (tmTagPOS-methods)
tmTagPOS,PlainTextDocument-method (tmTagPOS-methods)
tmTagPOS-methods
tmTokenize (tmTokenize-methods)
tmTokenize,PlainTextDocument-method (tmTokenize-methods)
tmTokenize-methods
tokenize