CRAN - Package boilerpipeR

boilerpipeR: Interface to the Boilerpipe Java Library

Generic extraction of the main text content from HTML files; removes ads, sidebars, and headers using the boilerpipe (http://code.google.com/p/boilerpipe/) Java library. The boilerpipe extraction heuristics perform robustly across a wide range of website templates.
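
As a rough illustration of the extraction described above, the following R sketch passes a small HTML string to one of the package's extractor functions. It assumes a working Java runtime and rJava installation. The HTML snippet and variable names are made up for illustration, and the exact text returned depends on boilerpipe's heuristics, so no output is shown.

```r
library(boilerpipeR)

# Hypothetical HTML page: navigation and footer boilerplate around an article body.
article_body <- paste(
  rep("This is a sentence of the main article text.", 20),
  collapse = " "
)
html <- paste0(
  "<html><body>",
  "<div id='nav'><a href='/'>Home</a> | <a href='/about'>About</a></div>",
  "<div id='article'><h1>Example headline</h1><p>", article_body, "</p></div>",
  "<div id='footer'>Copyright notice</div>",
  "</body></html>"
)

# ArticleExtractor() takes the HTML as a character string and
# returns the text that boilerpipe classifies as main content.
main_text <- ArticleExtractor(html)
cat(main_text)
```

For live pages, the HTML could first be fetched with RCurl::getURL() (RCurl is listed under Suggests), and DefaultExtractor() may be a better fit for pages that are not news-style articles.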

Version:	1.3
Imports:	rJava
Suggests:	RCurl
Published:	2015-05-11
Author:	See AUTHORS file.
Maintainer:	Mario Annau <mario.annau at gmail.com>
BugReports:	https://github.com/mannau/boilerpipeR/issues
License:	Apache License (== 2.0)
URL:	https://github.com/mannau/boilerpipeR
NeedsCompilation:	no
Materials:	NEWS
In views:	NaturalLanguageProcessing, WebTechnologies
CRAN checks:	boilerpipeR results

Downloads:

Reference manual:	boilerpipeR.pdf
Vignettes:	Introduction to the tm.plugin.webmining Package
Package source:	boilerpipeR_1.3.tar.gz
Windows binaries:	r-devel: boilerpipeR_1.3.zip, r-release: boilerpipeR_1.3.zip, r-oldrel: boilerpipeR_1.3.zip
macOS binaries:	r-release: boilerpipeR_1.3.tgz, r-oldrel: boilerpipeR_1.3.tgz
Old sources:	boilerpipeR archive

Reverse dependencies:

Reverse imports:

tm.plugin.webmining

Linking:

Please use the canonical form https://CRAN.R-project.org/package=boilerpipeR to link to this page.