Europe PMC is a repository of life science literature. Europe PMC ingests all PubMed content and extends its index with other literature and patent sources.
For more background on Europe PMC, see:
Levchenko, M., Gou, Y., Graef, F., Hamelers, A., Huang, Z., Ide-Smith, M., … McEntyre, J. (2017). Europe PMC in 2017. Nucleic Acids Research, 46(D1), D1254–D1260.
This client supports the Europe PMC search syntax. If you are unfamiliar with searching Europe PMC, check out the Europe PMC query builder, a tool that helps you build queries. To use Europe PMC queries in R, copy and paste the search string into the search functions of this package.
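When a query is assembled in R rather than pasted from the query builder, its clauses can be joined before the string is handed to a search function. The following is a minimal sketch: `combine_query()` is a hypothetical helper, not part of the package, and the field names are the ones used in the examples of this document.

```r
# Hypothetical helper (not part of europepmc): wrap each clause in
# parentheses and join the clauses with a boolean operator.
combine_query <- function(..., operator = "AND") {
  clauses <- c(...)
  stopifnot(is.character(clauses), length(clauses) > 0)
  paste0("(", clauses, ")", collapse = paste0(" ", operator, " "))
}

query <- combine_query('"Human malaria parasites"', "OPEN_ACCESS:y", "FIRST_PDATE:2016")
query
#> [1] "(\"Human malaria parasites\") AND (OPEN_ACCESS:y) AND (FIRST_PDATE:2016)"

# The resulting string can then be passed on, e.g.
# europepmc::epmc_search(query)
```

Wrapping every clause in parentheses keeps the operator precedence explicit when clauses themselves contain boolean operators.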
The following examples demonstrate how to search Europe PMC with R.
epmc_search() is the main function to query Europe PMC. It searches both metadata and full texts.
europepmc::epmc_search('malaria')
#> # A tibble: 100 x 29
#> id source pmid pmcid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 3204… MED 3204… PMC7… 10.1… Bala… Drewry LL, … Virulence 1
#> 2 3206… MED 3206… PMC7… 10.1… Pred… Patel H, Du… Virulence 1
#> 3 3204… MED 3204… <NA> 10.1… Mode… Olaniyi S, … J Biol Dyn 1
#> 4 3246… MED 3246… <NA> 10.1… Back… Xing Y, Guo… J Biol Dyn 1
#> 5 3190… MED 3190… PMC6… 10.1… Sett… Bucşan AN, … Virulence 1
#> 6 3236… MED 3236… PMC7… 10.1… Mate… Charlier C,… Virulence 1
#> 7 3185… MED 3185… PMC6… 10.1… Inhi… Alissa SA, … J Enzyme In… 1
#> 8 3207… MED 3207… <NA> 10.2… 2-Am… Serban G. Acta Pharm 3
#> 9 3220… MED 3220… PMC7… 10.1… Esta… Acharya KP,… Emerg Micro… 1
#> 10 3186… MED 3186… PMC6… 10.1… Iden… Zhou Y, Wen… Pharm Biol 1
#> # … with 90 more rows, and 20 more variables: journalVolume <chr>,
#> # pubYear <chr>, journalIssn <chr>, pageInfo <chr>, pubType <chr>,
#> # isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>, hasBook <chr>,
#> # hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>, versionNumber <int>
Note that Europe PMC expands queries with MeSH synonyms by default, a behavior that can be turned off with the synonym parameter:
europepmc::epmc_search('malaria', synonym = FALSE)
#> # A tibble: 100 x 29
#> id source pmid pmcid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 3204… MED 3204… PMC7… 10.1… "Bal… Drewry LL, … Virulence 1
#> 2 3204… MED 3204… <NA> 10.1… "Mod… Olaniyi S, … J Biol Dyn 1
#> 3 3246… MED 3246… <NA> 10.1… "Bac… Xing Y, Guo… J Biol Dyn 1
#> 4 3206… MED 3206… PMC7… 10.1… "Pre… Patel H, Du… Virulence 1
#> 5 3190… MED 3190… PMC6… 10.1… "Set… Bucşan AN, … Virulence 1
#> 6 PPR1… PPR <NA> <NA> 10.2… "Urb… Hassen J, D… <NA> <NA>
#> 7 PPR1… PPR <NA> <NA> 10.2… "Mal… Dufera M, K… <NA> <NA>
#> 8 PPR1… PPR <NA> <NA> 10.2… "Mal… Mundagowa P… <NA> <NA>
#> 9 3247… MED 3247… <NA> 10.1… "Cos… Sarker AR, … PLoS One 5
#> 10 PPR1… PPR <NA> <NA> 10.2… "Pat… Monroe A, M… <NA> <NA>
#> # … with 90 more rows, and 20 more variables: journalVolume <chr>,
#> # pubYear <chr>, journalIssn <chr>, pageInfo <chr>, pubType <chr>,
#> # isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>, hasBook <chr>,
#> # hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>, versionNumber <int>
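To gauge how much synonym expansion widens a search, the hit counts of both variants can be compared without downloading any records. This sketch assumes that `epmc_hits()` takes the same `query` and `synonym` arguments as `epmc_search()`, plus a `verbose` flag; `synonym_effect()` is a hypothetical wrapper, not part of the package.

```r
# Hypothetical wrapper: count hits with and without synonym expansion.
# `hits_fun` is assumed to behave like
# europepmc::epmc_hits(query, synonym, verbose) and to return one number.
synonym_effect <- function(query, hits_fun = europepmc::epmc_hits) {
  with_syn    <- hits_fun(query, synonym = TRUE,  verbose = FALSE)
  without_syn <- hits_fun(query, synonym = FALSE, verbose = FALSE)
  c(with_synonyms     = with_syn,
    without_synonyms  = without_syn,
    added_by_synonyms = with_syn - without_syn)
}

# Requires network access:
# synonym_effect("malaria")
```

Passing the counting function as an argument makes the wrapper easy to try with a stub before running it against the live service.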
To get an exact match, use quotes as in the following example:
europepmc::epmc_search('"Human malaria parasites"')
#> # A tibble: 100 x 28
#> id source pmid doi title authorString journalTitle pubYear journalIssn
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 3247… MED 3247… 10.1… "C-t… Kimata-Arig… J Biochem 2020 "0021-924x…
#> 2 PPR1… PPR <NA> 10.1… "A d… Cobb DW, Ku… <NA> 2020 <NA>
#> 3 3192… MED 3192… 10.1… "Fal… Rosenthal P… Biochim Bio… 2020 "1570-9639…
#> 4 PPR9… PPR <NA> 10.1… "Mal… Kwon H, Rey… <NA> 2019 <NA>
#> 5 PPR9… PPR <NA> 10.1… "Dis… Subudhi AK,… <NA> 2019 <NA>
#> 6 PPR1… PPR <NA> 10.2… "A r… Jivapetthai… <NA> 2019 <NA>
#> 7 PPR6… PPR <NA> 10.1… "Gen… McLean KJ, … <NA> 2018 <NA>
#> 8 PPR8… PPR <NA> 10.1… "Qua… Hopp CS, Ka… <NA> 2019 <NA>
#> 9 PPR5… PPR <NA> 10.1… "A m… Tang Y, Mei… <NA> 2018 <NA>
#> 10 3149… MED 3149… 10.1… "Par… Greischar M… Evolution 2019 "0014-3820…
#> # … with 90 more rows, and 19 more variables: pubType <chr>,
#> # isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>, hasBook <chr>,
#> # hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>, issue <chr>, journalVolume <chr>,
#> # pageInfo <chr>, pmcid <chr>
By default, 100 records are returned, but the number of results can be expanded or limited with the limit parameter:
europepmc::epmc_search('"Human malaria parasites"', limit = 10)
#> # A tibble: 10 x 27
#> id source pmid doi title authorString journalTitle pubYear journalIssn
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 3247… MED 3247… 10.1… "C-t… Kimata-Arig… J Biochem 2020 "0021-924x…
#> 2 PPR1… PPR <NA> 10.1… "A d… Cobb DW, Ku… <NA> 2020 <NA>
#> 3 3192… MED 3192… 10.1… "Fal… Rosenthal P… Biochim Bio… 2020 "1570-9639…
#> 4 PPR9… PPR <NA> 10.1… "Mal… Kwon H, Rey… <NA> 2019 <NA>
#> 5 PPR9… PPR <NA> 10.1… "Dis… Subudhi AK,… <NA> 2019 <NA>
#> 6 PPR1… PPR <NA> 10.2… "A r… Jivapetthai… <NA> 2019 <NA>
#> 7 PPR6… PPR <NA> 10.1… "Gen… McLean KJ, … <NA> 2018 <NA>
#> 8 PPR8… PPR <NA> 10.1… "Qua… Hopp CS, Ka… <NA> 2019 <NA>
#> 9 PPR5… PPR <NA> 10.1… "A m… Tang Y, Mei… <NA> 2018 <NA>
#> 10 3149… MED 3149… 10.1… "Par… Greischar M… Evolution 2019 "0014-3820…
#> # … with 18 more variables: pubType <chr>, isOpenAccess <chr>, inEPMC <chr>,
#> # inPMC <chr>, hasPDF <chr>, hasBook <chr>, hasSuppl <chr>,
#> # citedByCount <int>, hasReferences <chr>, hasTextMinedTerms <chr>,
#> # hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>, issue <chr>, journalVolume <chr>,
#> # pageInfo <chr>
Results are sorted by relevance. Other options via the sort parameter are:
sort = 'cited': by the number of citations, descending from the most cited publication
sort = 'date': by date published, starting with the most recent publication
Sometimes you would like to check whether articles are indexed in Europe PMC using DOI names, a widely used identifier for scholarly articles. Use epmc_search_by_doi() for this purpose.
my_dois <- c(
europepmc::epmc_search_by_doi(doi = my_dois)
#> # A tibble: 4 x 28
#> id source pmid doi title authorString journalTitle issue journalVolume
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 2895… MED 2895… 10.1… Clin… Schnieder M… Eur Neurol 5-6 78
#> 2 2894… MED 2894… 10.1… Conc… Doeppner TR… Stem Cells … 11 6
#> 3 2901… MED 2901… 10.1… One-… Psychogios … Stroke 11 48
#> 4 2862… MED 2862… 10.1… Defe… Carboni E, … Neuromolecu… 2-3 19
#> # … with 19 more variables: pubYear <chr>, journalIssn <chr>, pageInfo <chr>,
#> # pubType <chr>, isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>,
#> # hasBook <chr>, hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>, pmcid <chr>
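A common follow-up is to find out which of the requested DOIs were not matched. The sketch below assumes that unmatched DOIs are simply absent from the returned table; `missing_dois()` is a hypothetical helper, and the DOIs in the toy data are placeholders, not real articles. The comparison ignores case because DOI names are case-insensitive.

```r
# Hypothetical helper: return the requested DOIs that have no match in a
# result table. `found` is expected to have a `doi` column, like the
# tibble printed above.
missing_dois <- function(requested, found) {
  requested[!tolower(requested) %in% tolower(found$doi)]
}

# Toy data with placeholder DOIs (not real articles)
requested <- c("10.1000/example.1", "10.1000/EXAMPLE.2", "10.1000/example.3")
found <- data.frame(
  doi = c("10.1000/example.1", "10.1000/example.2"),
  stringsAsFactors = FALSE
)

missing_dois(requested, found)
#> [1] "10.1000/example.3"
```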
By default, a non-nested data frame printed as a tibble is returned. Other formats are output = "id_list", which returns a list of IDs and sources, and output = "raw", which returns the full metadata as a list. Please be aware that these lists can become very large.
Europe PMC provides text-mined annotations contained in abstracts and open access full-text articles.
These automatically identified concepts and terms can be retrieved at the article level:
europepmc::epmc_annotations_by_id(c("MED:28585529", "PMC:PMC1664601"))
#> # A tibble: 774 x 13
#> source ext_id pmcid prefix exact postfix name uri id type section
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 MED 28585… PMC5… "tive… Beta… " allo… Beta… http… http… Orga… Title …
#> 2 MED 28585… PMC5… "at, … suga… " (Bet… suga… http… http… Orga… Abstra…
#> 3 MED 28585… PMC5… "d a … beet ". " beet http… http… Orga… Abstra…
#> 4 MED 28585… PMC5… "lati… beets " (B. … beets http… http… Orga… Abstra…
#> 5 MED 28585… PMC5… "of <… B. v… " ssp.… B. v… http… http… Orga… Abstra…
#> 6 MED 28585… PMC5… " bee… ssp ". mar… ssp http… http… Gene… Abstra…
#> 7 MED 28585… PMC5… "ify … Beta… " ssp.… Beta… http… http… Orga… Abstra…
#> 8 MED 28585… PMC5… "beet… ssp ". vul… ssp http… http… Gene… Abstra…
#> 9 MED 28585… PMC5… "ed v… MBS "). " MBS http… http… Gene… Abstra…
#> 10 MED 28585… PMC5… "2 wa… MBS " and … MBS http… http… Gene… Abstra…
#> # … with 764 more rows, and 2 more variables: provider <chr>, subType <chr>
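Because the result has one row per annotation, a per-article summary is often easier to read than the raw table. The following is a minimal sketch in base R: `count_annotation_types()` is a hypothetical helper that relies only on the `ext_id` and `type` columns shown above, and the type labels in the toy data are illustrative.

```r
# Hypothetical helper: count annotations per article and annotation type.
# `annotations` is expected to have `ext_id` and `type` columns.
count_annotation_types <- function(annotations) {
  counts <- as.data.frame(
    table(ext_id = annotations$ext_id, type = annotations$type),
    responseName = "n",
    stringsAsFactors = FALSE
  )
  counts <- counts[counts$n > 0, ]           # drop empty combinations
  counts[order(counts$ext_id, -counts$n), ]  # most frequent type first
}

# Toy data standing in for the output of epmc_annotations_by_id()
toy <- data.frame(
  ext_id = c("1", "1", "1", "2"),
  type   = c("Organisms", "Organisms", "Gene_Proteins", "Organisms"),
  stringsAsFactors = FALSE
)
count_annotation_types(toy)
```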
To obtain a list of articles for which Europe PMC has text-mined annotations, either subset the resulting data.frame:
tt <- epmc_search("malaria")
tt[tt$hasTextMinedTerms == "Y" | tt$hasTMAccessionNumbers == "Y",]
#> # A tibble: 97 x 29
#> id source pmid pmcid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 3204… MED 3204… PMC7… 10.1… Bala… Drewry LL, … Virulence 1
#> 2 3206… MED 3206… PMC7… 10.1… Pred… Patel H, Du… Virulence 1
#> 3 3204… MED 3204… <NA> 10.1… Mode… Olaniyi S, … J Biol Dyn 1
#> 4 3246… MED 3246… <NA> 10.1… Back… Xing Y, Guo… J Biol Dyn 1
#> 5 3190… MED 3190… PMC6… 10.1… Sett… Bucşan AN, … Virulence 1
#> 6 3236… MED 3236… PMC7… 10.1… Mate… Charlier C,… Virulence 1
#> 7 3185… MED 3185… PMC6… 10.1… Inhi… Alissa SA, … J Enzyme In… 1
#> 8 3207… MED 3207… <NA> 10.2… 2-Am… Serban G. Acta Pharm 3
#> 9 3220… MED 3220… PMC7… 10.1… Esta… Acharya KP,… Emerg Micro… 1
#> 10 3186… MED 3186… PMC6… 10.1… Iden… Zhou Y, Wen… Pharm Biol 1
#> # … with 87 more rows, and 20 more variables: journalVolume <chr>,
#> # pubYear <chr>, journalIssn <chr>, pageInfo <chr>, pubType <chr>,
#> # isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>, hasBook <chr>,
#> # hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>, versionNumber <int>
or expand the query by choosing an annotation type or provider from the Europe PMC Advanced Search query builder:
epmc_search('malaria AND (ANNOTATION_TYPE:"Cell") AND (ANNOTATION_PROVIDER:"Europe PMC")')
#> # A tibble: 100 x 28
#> id source pmid pmcid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 3130… MED 3130… PMC7… 10.1… Blac… Opoka RO, W… Clin Infect… 11
#> 2 3169… MED 3169… PMC7… 10.1… Redu… Kingston HW… J Infect Dis 9
#> 3 3150… MED 3150… <NA> 10.1… Acut… Oshomah-Bel… J Trop Pedi… 2
#> 4 3182… MED 3182… <NA> 10.1… CD8+… Riggle BA, … J Clin Inve… 3
#> 5 3167… MED 3167… <NA> 10.1… A Sy… Thiengsusuk… Eur J Drug … 2
#> 6 3104… MED 3104… <NA> 10.1… Elev… Datta D, Co… Clin Infect… 6
#> 7 3168… MED 3168… <NA> 10.1… Eval… Ferdinand D… Trans R Soc… 3
#> 8 3085… MED 3085… <NA> 10.1… An E… Woodford J,… J Infect Dis 6
#> 9 3153… MED 3153… <NA> 10.1… Asso… Peitzmeier … AIDS Behav 3
#> 10 3184… MED 3184… PMC6… 10.1… Arte… Pull L, Lup… Malar J 1
#> # … with 90 more rows, and 19 more variables: journalVolume <chr>,
#> # pubYear <chr>, journalIssn <chr>, pageInfo <chr>, pubType <chr>,
#> # isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>, hasBook <chr>,
#> # hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>
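One caveat applies to the subsetting approach with `==`: if a flag column contains a missing value, the comparison yields `NA` and the row comes back as a row of `NA`s. A minimal sketch of an NA-safe variant using `%in%`, shown on toy data that mimics the two flag columns:

```r
# Toy data mimicking the two flag columns of a search result
tt <- data.frame(
  id = c("a", "b", "c"),
  hasTextMinedTerms = c("Y", NA, "N"),
  hasTMAccessionNumbers = c("N", "N", "Y"),
  stringsAsFactors = FALSE
)

# `==` yields NA for row "b", which would be returned as a row of NAs;
# `%in%` treats a missing flag as "not Y" instead.
has_annotations <- tt$hasTextMinedTerms %in% "Y" |
  tt$hasTMAccessionNumbers %in% "Y"

tt[has_annotations, "id"]
#> [1] "a" "c"
```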
Another useful feature of Europe PMC is the ability to search for cross-references between Europe PMC and other databases. For instance, to get publications that are cited by entries in the Protein Data Bank in Europe and were published in 2016:
europepmc::epmc_search('(HAS_PDB:y) AND FIRST_PDATE:2016')
#> # A tibble: 100 x 28
#> id source pmid pmcid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 2803… MED 2803… PMC5… 10.1… Stru… Su HP, Rick… Proc Natl A… 3
#> 2 2803… MED 2803… PMC5… 10.1… Stru… Kovaľ T, Øs… PLoS One 12
#> 3 2797… MED 2797… <NA> 10.1… Comp… De Deurwaer… ACS Chem Ne… 5
#> 4 2814… MED 2814… PMC5… 10.3… Bioc… Ulrich V, B… Beilstein J… <NA>
#> 5 2802… MED 2802… <NA> 10.1… Stru… Zhou Z, Liu… Appl Microb… 7
#> 6 2795… MED 2795… <NA> 10.1… Glyc… Hamark C, B… J Am Chem S… 1
#> 7 2795… MED 2795… PMC6… 10.1… Stru… Reed AJ, Vy… J Am Chem S… 1
#> 8 2803… MED 2803… PMC5… 10.1… Stru… Sevrioukova… Proc Natl A… 3
#> 9 2808… MED 2808… PMC5… 10.3… Conf… Paoletti F,… Front Mol B… <NA>
#> 10 2802… MED 2802… <NA> 10.1… Solu… Bibow S, Po… Nat Struct … 2
#> # … with 90 more rows, and 19 more variables: journalVolume <chr>,
#> # pubYear <chr>, journalIssn <chr>, pageInfo <chr>, pubType <chr>,
#> # isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>, hasBook <chr>,
#> # hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>
The following sources are supported
To retrieve metadata about these external database links, use europepmc::epmc_db().
Europe PMC also lets us obtain citation metadata and reference sections. To retrieve citation metadata per article, use:
europepmc::epmc_citations("9338777", limit = 500)
#> # A tibble: 232 x 11
#> id source citationType title authorString journalAbbrevia… pubYear volume
#> <chr> <chr> <chr> <chr> <chr> <chr> <int> <chr>
#> 1 3156… MED research-ar… Regu… Chung HC, N… J Vet Sci 2019 20
#> 2 3023… MED research su… Bioe… Legallais C… Adv Healthc Mat… 2018 7
#> 3 3026… MED research su… Porc… Fiebig U, F… Xenotransplanta… 2018 25
#> 4 2975… MED historical … Infe… Weiss RA. Xenotransplanta… 2018 25
#> 5 2964… MED research su… Trac… Kawasaki J,… Viruses 2018 10
#> 6 2876… MED research su… Pres… Kawasaki J,… J Virol 2017 91
#> 7 2843… MED research su… Thre… Colon-Moran… Virology 2017 507
#> 8 2805… MED research su… Anti… Inoue Y, Yo… Ann Biomed Eng 2017 45
#> 9 2783… MED research-ar… Tran… Kim N, Choi… PLoS One 2016 11
#> 10 2746… MED research su… Exis… Kuse K, Ito… J Virol 2016 90
#> # … with 222 more rows, and 3 more variables: issue <chr>, pageInfo <chr>,
#> # citedByCount <int>
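The citation table lends itself to simple summaries, for example the number of citing articles per year. The following is a sketch in base R: `citations_per_year()` is a hypothetical helper that relies only on the `pubYear` column shown above, and the toy data stands in for a real result.

```r
# Hypothetical helper: number of citing articles per publication year.
# `citations` is expected to have a `pubYear` column.
citations_per_year <- function(citations) {
  counts <- table(citations$pubYear)
  data.frame(
    pubYear = as.integer(names(counts)),
    n = as.integer(counts)
  )
}

# Toy data standing in for the output of epmc_citations()
toy <- data.frame(pubYear = c(2019L, 2018L, 2018L, 2017L))
citations_per_year(toy)
#>   pubYear n
#> 1    2017 1
#> 2    2018 2
#> 3    2019 1
```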
To retrieve the reference section of an article:
europepmc::epmc_refs("28632490", limit = 200)
#> # A tibble: 169 x 19
#> id source citationType title authorString journalAbbrevia… issue pubYear
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <int>
#> 1 1200… MED JOURNAL ART… Tric… Adolfsson-E… Chemosphere 9-10 2002
#> 2 1879… MED JOURNAL ART… In v… Ahn KC, Zha… Environ. Health… 9 2008
#> 3 1855… MED JOURNAL ART… Effe… Aiello AE, … Am J Public Hea… 8 2008
#> 4 1768… MED JOURNAL ART… Cons… Aiello AE, … Clin. Infect. D… <NA> 2007
#> 5 1527… MED JOURNAL ART… Rela… Aiello AE, … Antimicrob. Age… 8 2004
#> 6 1820… MED JOURNAL ART… The … Allmyr M, H… Sci. Total Envi… 1 2008
#> 7 1700… MED JOURNAL ART… Tric… Allmyr M, A… Sci. Total Envi… 1 2006
#> 8 2694… MED JOURNAL ART… Pres… Alvarez-Riv… J Chromatogr A <NA> 2016
#> 9 2319… MED JOURNAL ART… Expo… Anderson SE… Toxicol. Sci. 1 2012
#> 10 2583… MED JOURNAL ART… Obse… Vladar EK, … Methods Cell Bi… <NA> 2015
#> # … with 159 more rows, and 11 more variables: volume <chr>, pageInfo <chr>,
#> # citedOrder <int>, match <chr>, issn <chr>, essn <chr>,
#> # publicationTitle <chr>, publisherLoc <chr>, publisherName <chr>,
#> # externalLink <chr>, doi <chr>
Europe PMC gives access not only to metadata but also to full texts. Adding AND (OPEN_ACCESS:y) to your search query returns only those articles for which Europe PMC also has the full text.
The full text can be accessed as an XML document via the PMID or the PubMed Central ID (PMCID):
#> {xml_document}
#> <article article-type="research-article" xmlns:xlink="" xmlns:mml="">
#> [1] <front>\n <journal-meta>\n <journal-id journal-id-type="nlm-ta">PLoS ...
#> [2] <body>\n <sec id="s1">\n <title>Introduction</title>\n <p>Atmosphe ...
#> [3] <back>\n <ack>\n <p>We would like to thank Dr. C. Gourlay and Dr. T. ...
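Since the full text is returned as an `xml_document`, it can be queried with the xml2 package. The sketch below extracts the titles of the top-level body sections; `section_titles()` is a hypothetical helper, the XPath assumes the front/body/back layout printed above with no default namespace, and the toy document stands in for a downloaded article.

```r
library(xml2)

# Hypothetical helper: titles of the top-level sections in the article body.
section_titles <- function(doc) {
  xml_text(xml_find_all(doc, "/article/body/sec/title"))
}

# Toy document mimicking the structure shown above
doc <- read_xml(
  "<article><front/><body>
     <sec id='s1'><title>Introduction</title><p>Text</p></sec>
     <sec id='s2'><title>Results</title><p>Text</p></sec>
   </body><back/></article>"
)

section_titles(doc)
#> [1] "Introduction" "Results"
```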