Get full text research articles
Check out the package docs and the fulltext manual to get started.
rOpenSci has a number of R packages to get either full text, metadata, or both from various publishers. The goal of fulltext is to integrate these packages to create a single interface to many data sources.
fulltext makes it easy to do text-mining by supporting the following steps:
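ft_search: search for articles
ft_get: fetch the full text of articles
ft_links: get links to full text articles
ft_extract: extract text from articles
ft_table: collect all texts into a data.frame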
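As a rough sketch of how these steps might chain together, the snippet below searches one source, fetches the matching articles, and gathers the cached text into a data.frame. The query term, the choice of PLOS as the source, and the limit are illustrative assumptions rather than recommendations, and the calls need network access.

```r
library(fulltext)

# Search PLOS for a term (query, source, and limit are illustrative choices)
res <- ft_search(query = "ecology", from = "plos", limit = 5)

# Look up links to the full text of the matching articles
links <- ft_links(res)

# Fetch full text for the search results; files are cached on disk
articles <- ft_get(res)

# Gather the cached texts into a single data.frame
texts <- ft_table()
```

Passing the search result straight to ft_get avoids handling DOIs by hand; a character vector of DOIs should work as well.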
Previously supported use cases, extracted out to other packages:
It’s easy to go from the outputs of ft_get to text-mining packages such as tm and quanteda.
Data sources in fulltext include:
rcrossref package
rplos package
aRxiv package
biorxivr package
rentrez package
Authentication: A number of publishers require authentication via an API key, and some use even more draconian authentication processes that involve checking IP addresses. We are working on supporting the various authentication schemes used by different publishers, but all of the OA content is already easily available. See the Authentication section in ?fulltext-package after loading the package.
We’d love your feedback. Let us know what you think in the issue tracker (https://github.com/ropensci/fulltext/issues).
Article full text formats by publisher: https://docs.ropensci.org/fulltext/articles/formats
Stable version from CRAN
Development version from GitHub
Load library
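The three steps above would typically look something like the following. This is a sketch that assumes the package is currently available on CRAN and that the remotes package is installed for the GitHub route; the repository name is taken from the issue tracker URL.

```r
# Stable release (assumes the package is currently available on CRAN)
install.packages("fulltext")

# Development version (assumes the remotes package is already installed)
remotes::install_github("ropensci/fulltext")

# Load the package
library("fulltext")
```

Only one of the two install commands is needed.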
Note: this example is not included in the vignettes because that would require adding the two packages used below to Suggests. For more examples and documentation, see the package docs and the fulltext manual.
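library(fulltext)
# Cache downloaded files in a local directory named 'foobar'
cache_options_set(path = (td <- 'foobar'))
# Fetch the PDFs of two eLife articles by DOI
res <- ft_get(c('10.7554/eLife.03032', '10.7554/eLife.32763'), type = "pdf")
library(readtext)
# Read every cached PDF into a data.frame of document ids and text
x <- readtext::readtext(file.path(cache_options_get()$path, "*.pdf"))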
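From a readtext result, one plausible next step is to build a quanteda corpus and a document-feature matrix. The sketch below uses a small placeholder data.frame with the same doc_id and text columns that readtext produces, so it runs without downloading anything. The document names and texts are hypothetical stand-ins, not output from the articles above.

```r
library(quanteda)

# Hypothetical stand-in for the readtext output: one row per document,
# with the doc_id and text columns that readtext returns
x <- data.frame(
  doc_id = c("doc1.pdf", "doc2.pdf"),
  text = c("Placeholder text standing in for the first article.",
           "Placeholder text standing in for the second article."),
  stringsAsFactors = FALSE
)

# Build a corpus, tokenize it, and count features per document
corp <- quanteda::corpus(x)
toks <- quanteda::tokens(corp, remove_punct = TRUE)
dfmat <- quanteda::dfm(toks)

# Show the most frequent features across the documents
quanteda::topfeatures(dfmat, 5)
```

With real data, the x created by readtext::readtext can be passed to quanteda::corpus in the same way.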
Citation information for fulltext: citation(package = 'fulltext')