Europe PMC is a repository of life science literature. Europe PMC ingests all PubMed content and extends its index with other literature and patent sources.
For more background on Europe PMC, see:
Levchenko, M., Gou, Y., Graef, F., Hamelers, A., Huang, Z., Ide-Smith, M., … McEntyre, J. (2017). Europe PMC in 2017. Nucleic Acids Research, 46(D1), D1254–D1260. https://doi.org/10.1093/nar/gkx1005
This client supports the Europe PMC search syntax. If you are unfamiliar with searching Europe PMC, check out the Europe PMC query builder, a very nice tool that helps you to build queries. To make use of Europe PMC queries in R, copy & paste the search string to the search functions of this package.
In the following, some examples demonstrate how to search Europe PMC with R.
epmc_search()
is the main function to query Europe PMC. It searches both metadata and full texts.
library(europepmc)
europepmc::epmc_search('malaria')
#> # A tibble: 100 x 29
#> id source pmid pmcid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 3204… MED 3204… PMC7… 10.1… Bala… Drewry LL, … Virulence 1
#> 2 3206… MED 3206… PMC7… 10.1… Pred… Patel H, Du… Virulence 1
#> 3 3204… MED 3204… <NA> 10.1… Mode… Olaniyi S, … J Biol Dyn 1
#> 4 3246… MED 3246… <NA> 10.1… Back… Xing Y, Guo… J Biol Dyn 1
#> 5 3190… MED 3190… PMC6… 10.1… Sett… Bucşan AN, … Virulence 1
#> 6 3236… MED 3236… PMC7… 10.1… Mate… Charlier C,… Virulence 1
#> 7 3185… MED 3185… PMC6… 10.1… Inhi… Alissa SA, … J Enzyme In… 1
#> 8 3207… MED 3207… <NA> 10.2… 2-Am… Serban G. Acta Pharm 3
#> 9 3220… MED 3220… PMC7… 10.1… Esta… Acharya KP,… Emerg Micro… 1
#> 10 3186… MED 3186… PMC6… 10.1… Iden… Zhou Y, Wen… Pharm Biol 1
#> # … with 90 more rows, and 20 more variables: journalVolume <chr>,
#> # pubYear <chr>, journalIssn <chr>, pageInfo <chr>, pubType <chr>,
#> # isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>, hasBook <chr>,
#> # hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>, versionNumber <int>
It is worth noting that Europe PMC expands queries with MeSH synonyms by default, a behavior which can be turned off with the synonym
parameter.
europepmc::epmc_search('malaria', synonym = FALSE)
#> # A tibble: 100 x 29
#> id source pmid pmcid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 3204… MED 3204… PMC7… 10.1… "Bal… Drewry LL, … Virulence 1
#> 2 3204… MED 3204… <NA> 10.1… "Mod… Olaniyi S, … J Biol Dyn 1
#> 3 3246… MED 3246… <NA> 10.1… "Bac… Xing Y, Guo… J Biol Dyn 1
#> 4 3206… MED 3206… PMC7… 10.1… "Pre… Patel H, Du… Virulence 1
#> 5 3190… MED 3190… PMC6… 10.1… "Set… Bucşan AN, … Virulence 1
#> 6 PPR1… PPR <NA> <NA> 10.2… "Urb… Hassen J, D… <NA> <NA>
#> 7 PPR1… PPR <NA> <NA> 10.2… "Mal… Dufera M, K… <NA> <NA>
#> 8 PPR1… PPR <NA> <NA> 10.2… "Mal… Mundagowa P… <NA> <NA>
#> 9 3247… MED 3247… <NA> 10.1… "Cos… Sarker AR, … PLoS One 5
#> 10 PPR1… PPR <NA> <NA> 10.2… "Pat… Monroe A, M… <NA> <NA>
#> # … with 90 more rows, and 20 more variables: journalVolume <chr>,
#> # pubYear <chr>, journalIssn <chr>, pageInfo <chr>, pubType <chr>,
#> # isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>, hasBook <chr>,
#> # hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>, versionNumber <int>
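To see how much synonym expansion widens a search, the hit counts with and without it can be compared. The sketch below relies on epmc_hits(), which returns the number of records matching a query. The helper compare_synonym_hits() and its hits_fun argument are hypothetical names introduced here for illustration only.

```r
# Hypothetical helper: compare hit counts with and without synonym expansion.
# hits_fun defaults to europepmc::epmc_hits(); it is exposed as an argument
# only so the counting step can be swapped out, e.g. when working offline.
compare_synonym_hits <- function(query, hits_fun = europepmc::epmc_hits) {
  with_synonyms <- hits_fun(query, synonym = TRUE)
  without_synonyms <- hits_fun(query, synonym = FALSE)
  c(
    with_synonyms = with_synonyms,
    without_synonyms = without_synonyms,
    added_by_synonyms = with_synonyms - without_synonyms
  )
}
```

Calling compare_synonym_hits('malaria') queries Europe PMC twice and returns the three counts as a named vector.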
To get an exact match, use quotes as in the following example:
europepmc::epmc_search('"Human malaria parasites"')
#> # A tibble: 100 x 28
#> id source pmid doi title authorString journalTitle pubYear journalIssn
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 3247… MED 3247… 10.1… "C-t… Kimata-Arig… J Biochem 2020 "0021-924x…
#> 2 PPR1… PPR <NA> 10.1… "A d… Cobb DW, Ku… <NA> 2020 <NA>
#> 3 3192… MED 3192… 10.1… "Fal… Rosenthal P… Biochim Bio… 2020 "1570-9639…
#> 4 PPR9… PPR <NA> 10.1… "Mal… Kwon H, Rey… <NA> 2019 <NA>
#> 5 PPR9… PPR <NA> 10.1… "Dis… Subudhi AK,… <NA> 2019 <NA>
#> 6 PPR1… PPR <NA> 10.2… "A r… Jivapetthai… <NA> 2019 <NA>
#> 7 PPR6… PPR <NA> 10.1… "Gen… McLean KJ, … <NA> 2018 <NA>
#> 8 PPR8… PPR <NA> 10.1… "Qua… Hopp CS, Ka… <NA> 2019 <NA>
#> 9 PPR5… PPR <NA> 10.1… "A m… Tang Y, Mei… <NA> 2018 <NA>
#> 10 3149… MED 3149… 10.1… "Par… Greischar M… Evolution 2019 "0014-3820…
#> # … with 90 more rows, and 19 more variables: pubType <chr>,
#> # isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>, hasBook <chr>,
#> # hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>, issue <chr>, journalVolume <chr>,
#> # pageInfo <chr>, pmcid <chr>
By default, 100 records are returned, but the number of results can be expanded or limited with the limit
parameter.
europepmc::epmc_search('"Human malaria parasites"', limit = 10)
#> # A tibble: 10 x 27
#> id source pmid doi title authorString journalTitle pubYear journalIssn
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 3247… MED 3247… 10.1… "C-t… Kimata-Arig… J Biochem 2020 "0021-924x…
#> 2 PPR1… PPR <NA> 10.1… "A d… Cobb DW, Ku… <NA> 2020 <NA>
#> 3 3192… MED 3192… 10.1… "Fal… Rosenthal P… Biochim Bio… 2020 "1570-9639…
#> 4 PPR9… PPR <NA> 10.1… "Mal… Kwon H, Rey… <NA> 2019 <NA>
#> 5 PPR9… PPR <NA> 10.1… "Dis… Subudhi AK,… <NA> 2019 <NA>
#> 6 PPR1… PPR <NA> 10.2… "A r… Jivapetthai… <NA> 2019 <NA>
#> 7 PPR6… PPR <NA> 10.1… "Gen… McLean KJ, … <NA> 2018 <NA>
#> 8 PPR8… PPR <NA> 10.1… "Qua… Hopp CS, Ka… <NA> 2019 <NA>
#> 9 PPR5… PPR <NA> 10.1… "A m… Tang Y, Mei… <NA> 2018 <NA>
#> 10 3149… MED 3149… 10.1… "Par… Greischar M… Evolution 2019 "0014-3820…
#> # … with 18 more variables: pubType <chr>, isOpenAccess <chr>, inEPMC <chr>,
#> # inPMC <chr>, hasPDF <chr>, hasBook <chr>, hasSuppl <chr>,
#> # citedByCount <int>, hasReferences <chr>, hasTextMinedTerms <chr>,
#> # hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>, issue <chr>, journalVolume <chr>,
#> # pageInfo <chr>
Results are sorted by relevance. Other options via the sort
parameter are:
sort = 'cited'
by the number of citations, descending from the most cited publication
sort = 'date'
by date published, starting with the most recent publication
Sometimes you would like to check whether articles are indexed in Europe PMC using DOI names, a widely used identifier for scholarly articles. Use epmc_search_by_doi()
for this purpose.
my_dois <- c(
"10.1159/000479962",
"10.1002/sctm.17-0081",
"10.1161/strokeaha.117.018077",
"10.1007/s12017-017-8447-9"
)
europepmc::epmc_search_by_doi(doi = my_dois)
#> # A tibble: 4 x 28
#> id source pmid doi title authorString journalTitle issue journalVolume
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 2895… MED 2895… 10.1… Clin… Schnieder M… Eur Neurol 5-6 78
#> 2 2894… MED 2894… 10.1… Conc… Doeppner TR… Stem Cells … 11 6
#> 3 2901… MED 2901… 10.1… One-… Psychogios … Stroke 11 48
#> 4 2862… MED 2862… 10.1… Defe… Carboni E, … Neuromolecu… 2-3 19
#> # … with 19 more variables: pubYear <chr>, journalIssn <chr>, pageInfo <chr>,
#> # pubType <chr>, isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>,
#> # hasBook <chr>, hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>, pmcid <chr>
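When checking a batch of DOIs, it is often more useful to know which ones were not found. A minimal sketch, assuming the result carries a doi column as in the output above; missing_dois() is a hypothetical helper, not part of the package. DOIs are compared case-insensitively.

```r
# Hypothetical helper: return the requested DOIs that are absent from a result.
# `requested` and `found` are character vectors of DOI names.
missing_dois <- function(requested, found) {
  found <- found[!is.na(found)]
  requested[!(tolower(requested) %in% tolower(found))]
}

# Usage (needs network access):
# res <- europepmc::epmc_search_by_doi(doi = my_dois)
# missing_dois(my_dois, res$doi)
```

An empty character vector means every requested DOI was matched to a record.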
By default, a non-nested data frame printed as tibble is returned. Other formats are output = "id_list"
returning a list of IDs and sources, and output = "raw" for getting full metadata as list. Please be aware that these lists can become very large.
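As an illustration of the id_list output, the sketch below turns a search result into identifiers of the form "SOURCE:ID", the form that epmc_annotations_by_id() takes. It assumes that the id_list result has columns named id and source; search_ids() and its search_fun argument are hypothetical names used only for this example.

```r
# Hypothetical helper: run a search and return identifiers as "SOURCE:ID".
# Assumes the id_list output has `id` and `source` columns.
# search_fun defaults to europepmc::epmc_search().
search_ids <- function(query, limit = 10, search_fun = europepmc::epmc_search) {
  res <- search_fun(query, output = "id_list", limit = limit)
  paste(res$source, res$id, sep = ":")
}
```

For example, search_ids('"Human malaria parasites"', limit = 5) would return up to five identifiers.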
Europe PMC provides text-mined annotations contained in abstracts and open access full-text articles.
These automatically identified concepts and terms can be retrieved at the article-level:
europepmc::epmc_annotations_by_id(c("MED:28585529", "PMC:PMC1664601"))
#> # A tibble: 774 x 13
#> source ext_id pmcid prefix exact postfix name uri id type section
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 MED 28585… PMC5… "tive… Beta… " allo… Beta… http… http… Orga… Title …
#> 2 MED 28585… PMC5… "at, … suga… " (Bet… suga… http… http… Orga… Abstra…
#> 3 MED 28585… PMC5… "d a … beet ". " beet http… http… Orga… Abstra…
#> 4 MED 28585… PMC5… "lati… beets " (B. … beets http… http… Orga… Abstra…
#> 5 MED 28585… PMC5… "of <… B. v… " ssp.… B. v… http… http… Orga… Abstra…
#> 6 MED 28585… PMC5… " bee… ssp ". mar… ssp http… http… Gene… Abstra…
#> 7 MED 28585… PMC5… "ify … Beta… " ssp.… Beta… http… http… Orga… Abstra…
#> 8 MED 28585… PMC5… "beet… ssp ". vul… ssp http… http… Gene… Abstra…
#> 9 MED 28585… PMC5… "ed v… MBS "). " MBS http… http… Gene… Abstra…
#> 10 MED 28585… PMC5… "2 wa… MBS " and … MBS http… http… Gene… Abstra…
#> # … with 764 more rows, and 2 more variables: provider <chr>, subType <chr>
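With several hundred rows, a summary is usually the first thing to look at. The following sketch cross-tabulates annotations by article and annotation type using base R, relying only on the ext_id and type columns printed above. annotation_counts() is a hypothetical helper.

```r
# Hypothetical helper: cross-tabulate annotations by article and type.
# `annotations` is a data frame with `ext_id` and `type` columns.
annotation_counts <- function(annotations) {
  table(article = annotations$ext_id, type = annotations$type)
}

# Usage (needs network access):
# annotations <- europepmc::epmc_annotations_by_id(c("MED:28585529", "PMC:PMC1664601"))
# annotation_counts(annotations)
```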
To obtain a list of articles where Europe PMC has text-mined annotations, either subset the resulting data.frame
tt <- epmc_search("malaria")
tt[tt$hasTextMinedTerms == "Y" | tt$hasTMAccessionNumbers == "Y",]
#> # A tibble: 97 x 29
#> id source pmid pmcid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 3204… MED 3204… PMC7… 10.1… Bala… Drewry LL, … Virulence 1
#> 2 3206… MED 3206… PMC7… 10.1… Pred… Patel H, Du… Virulence 1
#> 3 3204… MED 3204… <NA> 10.1… Mode… Olaniyi S, … J Biol Dyn 1
#> 4 3246… MED 3246… <NA> 10.1… Back… Xing Y, Guo… J Biol Dyn 1
#> 5 3190… MED 3190… PMC6… 10.1… Sett… Bucşan AN, … Virulence 1
#> 6 3236… MED 3236… PMC7… 10.1… Mate… Charlier C,… Virulence 1
#> 7 3185… MED 3185… PMC6… 10.1… Inhi… Alissa SA, … J Enzyme In… 1
#> 8 3207… MED 3207… <NA> 10.2… 2-Am… Serban G. Acta Pharm 3
#> 9 3220… MED 3220… PMC7… 10.1… Esta… Acharya KP,… Emerg Micro… 1
#> 10 3186… MED 3186… PMC6… 10.1… Iden… Zhou Y, Wen… Pharm Biol 1
#> # … with 87 more rows, and 20 more variables: journalVolume <chr>,
#> # pubYear <chr>, journalIssn <chr>, pageInfo <chr>, pubType <chr>,
#> # isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>, hasBook <chr>,
#> # hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>, versionNumber <int>
or expand the query choosing an annotation type or provider from the Europe PMC Advanced Search query builder.
epmc_search('malaria AND (ANNOTATION_TYPE:"Cell") AND (ANNOTATION_PROVIDER:"Europe PMC")')
#> # A tibble: 100 x 28
#> id source pmid pmcid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 3130… MED 3130… PMC7… 10.1… Blac… Opoka RO, W… Clin Infect… 11
#> 2 3169… MED 3169… PMC7… 10.1… Redu… Kingston HW… J Infect Dis 9
#> 3 3150… MED 3150… <NA> 10.1… Acut… Oshomah-Bel… J Trop Pedi… 2
#> 4 3182… MED 3182… <NA> 10.1… CD8+… Riggle BA, … J Clin Inve… 3
#> 5 3167… MED 3167… <NA> 10.1… A Sy… Thiengsusuk… Eur J Drug … 2
#> 6 3104… MED 3104… <NA> 10.1… Elev… Datta D, Co… Clin Infect… 6
#> 7 3168… MED 3168… <NA> 10.1… Eval… Ferdinand D… Trans R Soc… 3
#> 8 3085… MED 3085… <NA> 10.1… An E… Woodford J,… J Infect Dis 6
#> 9 3153… MED 3153… <NA> 10.1… Asso… Peitzmeier … AIDS Behav 3
#> 10 3184… MED 3184… PMC6… 10.1… Arte… Pull L, Lup… Malar J 1
#> # … with 90 more rows, and 19 more variables: journalVolume <chr>,
#> # pubYear <chr>, journalIssn <chr>, pageInfo <chr>, pubType <chr>,
#> # isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>, hasBook <chr>,
#> # hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>
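Fielded queries like the one above can also be assembled programmatically, which helps when the filters vary. This is a minimal sketch in base R; build_epmc_query() is a hypothetical helper, and it only covers the simple case of joining quoted field filters with AND.

```r
# Hypothetical helper: combine a search term with FIELD:"value" filters.
# Filters are passed as named arguments, e.g. ANNOTATION_TYPE = "Cell".
build_epmc_query <- function(term, ...) {
  filters <- c(...)
  clauses <- sprintf('(%s:"%s")', names(filters), filters)
  paste(c(term, clauses), collapse = " AND ")
}

query <- build_epmc_query(
  "malaria",
  ANNOTATION_TYPE = "Cell",
  ANNOTATION_PROVIDER = "Europe PMC"
)
query
#> [1] "malaria AND (ANNOTATION_TYPE:\"Cell\") AND (ANNOTATION_PROVIDER:\"Europe PMC\")"
```

The resulting string can be passed to epmc_search() unchanged.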
Another nice feature of Europe PMC is the ability to search for cross-references between Europe PMC and other databases. For instance, to get publications cited by entries in the Protein Data Bank in Europe that were published in 2016:
europepmc::epmc_search('(HAS_PDB:y) AND FIRST_PDATE:2016')
#> # A tibble: 100 x 28
#> id source pmid pmcid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 2803… MED 2803… PMC5… 10.1… Stru… Su HP, Rick… Proc Natl A… 3
#> 2 2803… MED 2803… PMC5… 10.1… Stru… Kovaľ T, Øs… PLoS One 12
#> 3 2797… MED 2797… <NA> 10.1… Comp… De Deurwaer… ACS Chem Ne… 5
#> 4 2814… MED 2814… PMC5… 10.3… Bioc… Ulrich V, B… Beilstein J… <NA>
#> 5 2802… MED 2802… <NA> 10.1… Stru… Zhou Z, Liu… Appl Microb… 7
#> 6 2795… MED 2795… <NA> 10.1… Glyc… Hamark C, B… J Am Chem S… 1
#> 7 2795… MED 2795… PMC6… 10.1… Stru… Reed AJ, Vy… J Am Chem S… 1
#> 8 2803… MED 2803… PMC5… 10.1… Stru… Sevrioukova… Proc Natl A… 3
#> 9 2808… MED 2808… PMC5… 10.3… Conf… Paoletti F,… Front Mol B… <NA>
#> 10 2802… MED 2802… <NA> 10.1… Solu… Bibow S, Po… Nat Struct … 2
#> # … with 90 more rows, and 19 more variables: journalVolume <chr>,
#> # pubYear <chr>, journalIssn <chr>, pageInfo <chr>, pubType <chr>,
#> # isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>, hasBook <chr>,
#> # hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>
The following sources are supported
To retrieve metadata about these external database links, use europepmc::epmc_db().
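A call to epmc_db() takes an article identifier and the database of interest. The wrapper below is a sketch under assumptions: fetch_db_links() and its db_fun argument are hypothetical, and the accepted database names should be checked against the package documentation.

```r
# Hypothetical helper: list the cross-references of one article in one database.
# ext_id: a single article identifier (a PubMed ID by default);
# db: short name of the external database.
# db_fun defaults to europepmc::epmc_db().
fetch_db_links <- function(ext_id, db, limit = 25,
                           db_fun = europepmc::epmc_db) {
  stopifnot(is.character(ext_id), length(ext_id) == 1L, nzchar(db))
  db_fun(ext_id = ext_id, db = db, limit = limit)
}
```

A call would look like fetch_db_links("12368864", db = "uniprot"), where both the identifier and the lowercase database name are illustrative.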
Europe PMC also lets us obtain citation metadata and reference sections. To retrieve citation metadata per article, use
europepmc::epmc_citations("9338777", limit = 500)
#> # A tibble: 232 x 11
#> id source citationType title authorString journalAbbrevia… pubYear volume
#> <chr> <chr> <chr> <chr> <chr> <chr> <int> <chr>
#> 1 3156… MED research-ar… Regu… Chung HC, N… J Vet Sci 2019 20
#> 2 3023… MED research su… Bioe… Legallais C… Adv Healthc Mat… 2018 7
#> 3 3026… MED research su… Porc… Fiebig U, F… Xenotransplanta… 2018 25
#> 4 2975… MED historical … Infe… Weiss RA. Xenotransplanta… 2018 25
#> 5 2964… MED research su… Trac… Kawasaki J,… Viruses 2018 10
#> 6 2876… MED research su… Pres… Kawasaki J,… J Virol 2017 91
#> 7 2843… MED research su… Thre… Colon-Moran… Virology 2017 507
#> 8 2805… MED research su… Anti… Inoue Y, Yo… Ann Biomed Eng 2017 45
#> 9 2783… MED research-ar… Tran… Kim N, Choi… PLoS One 2016 11
#> 10 2746… MED research su… Exis… Kuse K, Ito… J Virol 2016 90
#> # … with 222 more rows, and 3 more variables: issue <chr>, pageInfo <chr>,
#> # citedByCount <int>
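The citation table lends itself to simple bibliometric summaries. As a sketch, the helper below counts citing articles per publication year using the pubYear column shown above; citations_per_year() is a hypothetical name.

```r
# Hypothetical helper: number of citing articles per publication year.
# `citations` is a data frame with an integer `pubYear` column.
citations_per_year <- function(citations) {
  counts <- table(citations$pubYear)
  data.frame(
    year = as.integer(names(counts)),
    n_citing = as.integer(counts),
    stringsAsFactors = FALSE
  )
}

# Usage (needs network access):
# cites <- europepmc::epmc_citations("9338777", limit = 500)
# citations_per_year(cites)
```

Rows are ordered by year, ascending. The counts only cover the records actually retrieved, so limit should be at least as large as the number of citing articles.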
To retrieve the reference section of an article:
europepmc::epmc_refs("28632490", limit = 200)
#> # A tibble: 169 x 19
#> id source citationType title authorString journalAbbrevia… issue pubYear
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <int>
#> 1 1200… MED JOURNAL ART… Tric… Adolfsson-E… Chemosphere 9-10 2002
#> 2 1879… MED JOURNAL ART… In v… Ahn KC, Zha… Environ. Health… 9 2008
#> 3 1855… MED JOURNAL ART… Effe… Aiello AE, … Am J Public Hea… 8 2008
#> 4 1768… MED JOURNAL ART… Cons… Aiello AE, … Clin. Infect. D… <NA> 2007
#> 5 1527… MED JOURNAL ART… Rela… Aiello AE, … Antimicrob. Age… 8 2004
#> 6 1820… MED JOURNAL ART… The … Allmyr M, H… Sci. Total Envi… 1 2008
#> 7 1700… MED JOURNAL ART… Tric… Allmyr M, A… Sci. Total Envi… 1 2006
#> 8 2694… MED JOURNAL ART… Pres… Alvarez-Riv… J Chromatogr A <NA> 2016
#> 9 2319… MED JOURNAL ART… Expo… Anderson SE… Toxicol. Sci. 1 2012
#> 10 2583… MED JOURNAL ART… Obse… Vladar EK, … Methods Cell Bi… <NA> 2015
#> # … with 159 more rows, and 11 more variables: volume <chr>, pageInfo <chr>,
#> # citedOrder <int>, match <chr>, issn <chr>, essn <chr>,
#> # publicationTitle <chr>, publisherLoc <chr>, publisherName <chr>,
#> # externalLink <chr>, doi <chr>
Europe PMC gives access not only to metadata, but also to full texts. Adding AND (OPEN_ACCESS:y)
to your search query returns only those articles for which Europe PMC also has the full text.
The full text can be accessed as an XML document via the PMID or the PubMed Central ID (PMCID):
europepmc::epmc_ftxt("PMC3257301")
#> {xml_document}
#> <article article-type="research-article" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML">
#> [1] <front>\n <journal-meta>\n <journal-id journal-id-type="nlm-ta">PLoS ...
#> [2] <body>\n <sec id="s1">\n <title>Introduction</title>\n <p>Atmosphe ...
#> [3] <back>\n <ack>\n <p>We would like to thank Dr. C. Gourlay and Dr. T. ...
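Because the result is an xml_document, it can be processed with the xml2 package. The sketch below extracts section headings from the article body; section_titles() is a hypothetical helper, and the XPath expression assumes the sec and title elements visible in the output above.

```r
library(xml2)

# Hypothetical helper: return the section headings found in the article body.
# `doc` is an xml_document such as the one returned by epmc_ftxt().
section_titles <- function(doc) {
  nodes <- xml_find_all(doc, "//body//sec/title")
  xml_text(nodes, trim = TRUE)
}

# Usage (needs network access):
# doc <- europepmc::epmc_ftxt("PMC3257301")
# section_titles(doc)
```

The double slash before sec also matches nested subsections, so the result mixes top-level and lower-level headings.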