Package ‘openNLP’
June 27, 2009
Title openNLP Interface
Version 0.0-7
Date 2009-06-27
Author Ingo Feinerer, Kurt Hornik
Maintainer Kurt Hornik
Imports methods, rJava (>= 0.6-3), tm
Enhances tm
Suggests openNLPmodels.en, openNLPmodels.es
SystemRequirements Java (>= 5.0)
Description An interface to openNLP (http://opennlp.sourceforge.net/), a collection of natural
language processing tools including a sentence detector, tokenizer, pos-tagger, shallow and full
syntactic parser, and named-entity detector, using the Maxent Java package for training and using
maximum entropy models.
License LGPL-2.1
Encoding UTF-8
Repository CRAN
Date/Publication 2009-06-27 09:04:36
R topics documented:
sentDetect
tagPOS
tmSentDetect-methods
tmTagPOS-methods
tmTokenize-methods
tokenize
Index

sentDetect Detect sentences

Description

Detect sentences.

Usage

sentDetect(s, language = "en", model = NULL)

Arguments

s          A character vector with texts from which sentences should be detected.

language   A character string giving the language of s. This argument is only used if model is NULL for selecting a default model. At the moment, languages ‘en’ (English), ‘es’ (Spanish), ‘de’ (German) and ‘th’ (Thai) are supported, provided that the corresponding openNLP model language packages (openNLPmodels.en, ...) are available.

model      A model.

Details

If model is NULL then a default model for sentence detection is loaded from the corresponding openNLP models language package.
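Because the default model comes from a separate language package, a call can fail when that package is missing. The following is a minimal sketch of a guard, assuming the openNLP version documented here; detect_if_available is a hypothetical helper and not part of the package, and the package name is built from the openNLPmodels.<language> pattern mentioned above.

```r
# Hypothetical helper (not part of openNLP): run sentDetect() only when both
# openNLP and the model package for the requested language can be loaded.
detect_if_available <- function(s, language = "en") {
  # Assumed naming pattern, e.g. "openNLPmodels.en" or "openNLPmodels.es".
  model_pkg <- paste("openNLPmodels", language, sep = ".")
  ok <- suppressWarnings(
    require("openNLP", character.only = TRUE, quietly = TRUE) &&
      require(model_pkg, character.only = TRUE, quietly = TRUE)
  )
  if (!ok) {
    message("openNLP or ", model_pkg, " is not available; skipping")
    return(NULL)
  }
  sentDetect(s, language = language)
}
```

A call such as detect_if_available(s, language = "es") then returns NULL with a message instead of stopping with an error when the Spanish models are not installed.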

Value

A character vector resulting from sentence detection in s.

Author(s)

Ingo Feinerer

References

OpenNLP http://opennlp.sourceforge.net/

Examples

s <- "This is a sentence. This another---but with dash-like structures, and some commas. Maybe another with question marks? Sure!"
sentDetect(s, language = "en")
s <- "¿Como se llama usted? El castellano es la lengua española oficial del Estado."
sentDetect(s, language = "es")

tmSentDetect-methods Methods for Function tmSentDetect in Package ‘openNLP’

Description

Methods for function tmSentDetect in package openNLP.

Methods

object = "PlainTextDocument" Detect sentences in object and return the object.

Examples

if(!require("tm")) stop("could not load tm package")
data("crude")
crude[[1L]]
tmSentDetect(crude[[1L]])

tmTagPOS-methods Methods for Function tmTagPOS in Package ‘openNLP’

Description

Methods for function tmTagPOS in package openNLP.

Methods

object = "PlainTextDocument" Tag part-of-speech in object and return the object.

Examples

if(!require("tm")) stop("could not load tm package")
data("crude")
crude[[1L]]
tmTagPOS(crude[[1L]])

tmTokenize-methods Methods for Function tmTokenize in Package ‘openNLP’

Description

Methods for function tmTokenize in package openNLP.

Methods

object = "PlainTextDocument" Tokenize object and return the object.

Examples

if(!require("tm")) stop("could not load tm package")
data("crude")
crude[[1L]]
tmTokenize(crude[[1L]])

tokenize Tokenizer

Description

Tokenizes the input.

Usage

tokenize(s, language = "en", model = NULL)

Arguments

s          A character vector of texts to be tokenized.

language   A character string giving the language of s. This argument is only used if model is NULL for selecting a default model. At the moment, languages ‘en’ (English), ‘es’ (Spanish), ‘de’ (German) and ‘th’ (Thai) are supported, provided that the corresponding openNLP model language packages (openNLPmodels.en, ...) are available.

model      A model.

Details

If model is NULL then a default model for tokenization is loaded from the corresponding openNLP models language package.
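As an illustration of how tokenize can be combined with sentDetect, the sketch below splits a text into sentences and then tokenizes each sentence on its own. It assumes the openNLP version documented here together with openNLPmodels.en; tokenize_by_sentence is a hypothetical helper, not a function of the package, and the result is only inspected with str() rather than assumed to have a particular layout.

```r
# Hypothetical helper (not part of openNLP): detect sentences in s, then
# tokenize each detected sentence separately. Returns a list with one
# element per sentence.
tokenize_by_sentence <- function(s, language = "en") {
  sentences <- sentDetect(s, language = language)
  lapply(sentences, tokenize, language = language)
}

# Usage, guarded so that it only runs when the assumed packages and the
# functions documented in this manual are actually available.
if (suppressWarnings(require("openNLP", quietly = TRUE) &&
                     require("openNLPmodels.en", quietly = TRUE)) &&
    exists("sentDetect", mode = "function") &&
    exists("tokenize", mode = "function")) {
  tokens <- tokenize_by_sentence("This is a sentence. Maybe another? Sure!")
  str(tokens)  # inspect the structure of the per-sentence results
}
```

Tokenizing per sentence keeps sentence boundaries available for later steps; calling tokenize(s, language = "en") directly on the whole text is the simpler alternative when those boundaries are not needed.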

Index

∗Topic file

sentDetect
tagPOS
tokenize

∗Topic methods

tmSentDetect-methods
tmTagPOS-methods
tmTokenize-methods

sentDetect
tagPOS
tmSentDetect (tmSentDetect-methods)
tmSentDetect,PlainTextDocument-method (tmSentDetect-methods)
tmSentDetect-methods
tmTagPOS (tmTagPOS-methods)
tmTagPOS,PlainTextDocument-method (tmTagPOS-methods)
tmTagPOS-methods
tmTokenize (tmTokenize-methods)
tmTokenize,PlainTextDocument-method (tmTokenize-methods)
tmTokenize-methods
tokenize