The ‘Epidemiological Report’ Package

European Centre for Disease Prevention and Control (ECDC)

Description

The EpiReport package allows the user to draft an epidemiological report similar to the ECDC Annual Epidemiological Report (AER) (see https://ecdc.europa.eu/en/annual-epidemiological-reports) in Microsoft Word format for a given disease.

Through standalone functions, the package is specifically designed to generate each disease-specific output presented in these reports, using ECDC Atlas export data.

Package details below:

Package Description
Version 0.1.1
Published 2020-05-11
Authors Lore Merdrignac (author of the package and original code), Tommi Karki, Esther Kissling, Joana Gomes Dias (project manager)
Maintainer Lore Merdrignac
License EUPL
Link to the ECDC AER reports https://ecdc.europa.eu/en/annual-epidemiological-reports

Background

ECDC’s annual epidemiological report is available as a series of individual epidemiological disease reports. Reports are published on the ECDC website https://ecdc.europa.eu/en/annual-epidemiological-reports as they become available.

The year given in the title of the report (e.g. ‘Annual epidemiological report for 2016’) refers to the year the data were collected. Reports are usually available for publication one year after data collection is complete.

All reports are based on data collected through The European Surveillance System (TESSy) [1] and exported from the ECDC Atlas. Countries participating in disease surveillance submit their data electronically.

The communicable diseases and related health issues covered by the reports are under European Union and European Economic Area disease surveillance [2-5].

ECDC’s annual surveillance reports provide a wealth of epidemiological data to support decision-making at the national level. They are mainly intended for public health professionals and policymakers involved in disease prevention and control programmes.

1. Datasets to be used in the Epidemiological Report package

1.1. Disease dataset specification

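Two types of datasets can be used: the default Salmonellosis 2012-2016 dataset included in the package, or an external dataset with the same structure (see section 2.1).

The variables required in the disease dataset (naming and format) are illustrated in the example below: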

Tab.1 Example of Salmonellosis data 2012-2016
HealthTopicCode MeasurePopulation MeasureCode TimeUnit TimeCode GeoCode XLabel YLabel ZValue YValue N
SALM Confirmed cases CONFIRMED.AGE.COUNT M 2014-04 CY 45-64 NA NA 0.000000 5
SALM Confirmed cases CONFIRMED.AGE.RATE M 2015-05 PL 25-44 NA NA NA 543
SALM Confirmed cases CONFIRMED.AGE_GENDER.PROPORTION Y 2016 EE 65+ Female 3.133903 2.000000 351
SALM Confirmed cases CONFIRMED.AGE.PROPORTION Y 2016 EU28 45-64 NA NA 16.256519 84704
SALM Confirmed cases CONFIRMED.AGE.COUNT M 2016-07 NL 5-14 NA NA 15.000000 104
SALM Confirmed cases CONFIRMED.GENDER.PROPORTION Y 2013 NO Male NA NA 49.155033 1361
SALM Confirmed cases CONFIRMED.AGE_GENDER.COUNT Y 2015 SK 45-64 Female 364.000000 2.000000 4841
SALM Confirmed cases CONFIRMED.AGE.PROPORTION M 2013-04 EL 25-44 NA NA 3.448276 29
SALM Confirmed cases CONFIRMED.AGE.PROPORTION Y 2014 SI 45-64 NA NA 15.075377 597
SALM Confirmed cases CONFIRMED.AGE.RATE M 2016-01 CY 0-4 NA NA 2.110952 2
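
As a minimal sketch (not part of the package), the structure shown in Tab.1 can be checked before an external dataset is passed to the EpiReport functions. The column names are those of Tab.1; the helper name check_disease_dataset and the toy rows are illustrative assumptions.

```r
# Column names required in the disease dataset, as listed in Tab.1
required_cols <- c("HealthTopicCode", "MeasurePopulation", "MeasureCode",
                   "TimeUnit", "TimeCode", "GeoCode",
                   "XLabel", "YLabel", "ZValue", "YValue", "N")

# Hypothetical helper: returns the names of any missing columns
check_disease_dataset <- function(x) {
  setdiff(required_cols, names(x))
}

# Toy dataset reproducing two rows of Tab.1
toy <- data.frame(
  HealthTopicCode   = c("SALM", "SALM"),
  MeasurePopulation = c("Confirmed cases", "Confirmed cases"),
  MeasureCode       = c("CONFIRMED.AGE.COUNT", "CONFIRMED.GENDER.PROPORTION"),
  TimeUnit          = c("M", "Y"),
  TimeCode          = c("2016-07", "2013"),
  GeoCode           = c("NL", "NO"),
  XLabel            = c("5-14", "Male"),
  YLabel            = c(NA_character_, NA_character_),
  ZValue            = c(NA_real_, NA_real_),
  YValue            = c(15, 49.155033),
  N                 = c(104, 1361),
  stringsAsFactors  = FALSE
)

missing_cols <- check_disease_dataset(toy)
missing_cols                       # character(0) when all columns are present
check_disease_dataset(toy[, -11])  # "N"
```

An empty result means every column of Tab.1 is present; it does not validate the content or format of each variable.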

1.2. Report parameters dataset specification

The internal dataset EpiReport::AERparams describes the parameters to be used for each output of each disease report.

If the user wishes to set different parameters for one of the 53 covered health topics, or if the user wishes to analyse an additional disease not covered by the default parameter table, it is possible to use an external dataset as long as it is specified as described below and in the help page ?EpiReport::AERparams. All functions of the EpiReport package can then be fed with this specific parameter table.

List of the main parameters included:

Tab.2 Example of the main columns of the parameter dataset
HealthTopic MeasurePopulation TableUse AgeGenderUse TSTrendGraphUse TSSeasonalityGraphUse MapNumbersUse MapRatesUse MapASRUse
LEPT CONFIRMED ASR AG-RATE Y Y N Y N
HEPA CONFIRMED ASR AG-RATE Y Y N Y N
FILO CONFIRMED NO NO N N N N N
CHIK ALL ASR AG-RATE Y Y Y N N
RIFT ALL NO NO N N N N N
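
The sketch below shows one way to look up the parameters of a single health topic in such a table. It uses a toy data frame copying a few rows and columns of Tab.2 rather than the package dataset; the helper name get_params is an assumption for illustration. The same subsetting should apply to EpiReport::AERparams or to an external parameter table with the same columns.

```r
# Toy parameter table reproducing three rows and four columns of Tab.2
params <- data.frame(
  HealthTopic       = c("LEPT", "FILO", "CHIK"),
  MeasurePopulation = c("CONFIRMED", "CONFIRMED", "ALL"),
  TableUse          = c("ASR", "NO", "ASR"),
  MapNumbersUse     = c("N", "N", "Y"),
  stringsAsFactors  = FALSE
)

# Hypothetical helper: parameters of one health topic
get_params <- function(params, disease) {
  params[params$HealthTopic == disease, , drop = FALSE]
}

chik <- get_params(params, "CHIK")
chik$MapNumbersUse == "Y"   # TRUE for CHIK in Tab.2
```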

1.3. Member States correspondence table dataset

The internal dataset EpiReport::MSCode provides the correspondence table of the geographical code GeoCode used in the disease dataset, and the geographical label Country to use throughout the report. Additional information on the EU/EEA affiliation is also available in column EUEEA.

Tab.3 Example of geographical codes and associated labels
Country GeoCode EUEEA
Belgium BE EU
Austria AT EU
France FR EU
EU-EEA EU_EEA31 NA
Estonia EE EU
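
To illustrate how the correspondence table is meant to be used, the base R sketch below attaches country labels to geographical codes. The toy table copies the rows of Tab.3 and the counts are the 2016 salmonellosis figures shown later in this document; the object names are assumptions, and this is not necessarily how the package performs the join internally.

```r
# Toy correspondence table reproducing the rows of Tab.3
ms_code <- data.frame(
  Country = c("Belgium", "Austria", "France", "EU-EEA", "Estonia"),
  GeoCode = c("BE", "AT", "FR", "EU_EEA31", "EE"),
  EUEEA   = c("EU", "EU", "EU", NA, "EU"),
  stringsAsFactors = FALSE
)

# Toy case counts keyed by GeoCode
cases <- data.frame(GeoCode = c("EE", "AT", "BE"),
                    N = c(351, 1415, 2698),
                    stringsAsFactors = FALSE)

# Attach the country label to each geographical code
labelled <- merge(cases, ms_code[, c("GeoCode", "Country")],
                  by = "GeoCode", all.x = TRUE)
labelled <- labelled[order(labelled$Country), ]
labelled$Country   # "Austria" "Belgium" "Estonia"
```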

2. How to generate the Epidemiological Report in Microsoft Word format

To generate a report similar to the Annual Epidemiological Report, we can use the default dataset included in the EpiReport package, which presents Salmonellosis data for 2012-2016.

Calling the function getAER() without arguments generates the Salmonellosis 2016 report and stores it in your working directory (see getwd()) by default.

getAER()

Please specify the full path to the output folder if necessary:

output <- "C:/EpiReport/doc/"
getAER(outputPath = output)
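
It may be safer to make sure the output folder exists before generating the report. The sketch below assumes, without confirmation from the package documentation, that getAER() does not create the folder itself; the path is a temporary directory chosen only so that the example runs anywhere.

```r
# Hypothetical output folder inside the session's temporary directory
output <- file.path(tempdir(), "EpiReport", "doc")

# Create the folder (and its parents) if it does not exist yet
if (!dir.exists(output)) {
  dir.create(output, recursive = TRUE)
}

# The folder can then be passed to getAER(), as in the call above:
# getAER(outputPath = output)
```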

2.1. External disease dataset

To generate the report using an external dataset, please use the syntax below.

In the following example, Pertussis 2016 TESSy data (in csv format, in the /data folder) is used to produce the corresponding report.

Pertussis PNG maps have previously been created and stored in a specific folder /maps.

# --- Importing the dataset
PERT2016 <- read.table("data/PERT2016.csv", 
                       sep = ",", 
                       header = TRUE, 
                       stringsAsFactors = FALSE)

# --- Specifying the folder containing pertussis maps
pathMap <- file.path(getwd(), "maps")


# --- (optional) Setting the local language in English for month label
Sys.setlocale("LC_TIME", "C")
#> [1] "C"

# --- Producing the report
EpiReport::getAER(disease = "PERT", 
       year = 2016, 
       x = PERT2016, 
       pathPNG = pathMap)

Please note that the Tahoma font is used for plot axes and legends. It is advisable to import this font using the extrafont package and its functions font_import() and loadfonts().

This step is optional. If you prefer to keep the default Arial font in plots, warnings will appear in the console for each plot.
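
A possible way to check whether Tahoma is already registered is sketched below. The helper name tahoma_available is an assumption; it relies on extrafont::fonts(), which lists the font families registered with extrafont, and it returns FALSE when the extrafont package is not installed.

```r
# Hypothetical helper: TRUE if Tahoma is registered with extrafont
tahoma_available <- function() {
  if (!requireNamespace("extrafont", quietly = TRUE)) {
    return(FALSE)
  }
  "Tahoma" %in% extrafont::fonts()
}

if (!tahoma_available()) {
  message("Tahoma not registered; to import it, run once:\n",
          "  extrafont::font_import()   # scans system fonts, can take several minutes\n",
          "  extrafont::loadfonts()     # registers the imported fonts")
}
```

Depending on the graphics device, loadfonts() may need a device argument (for example device = "win" on Windows); see the extrafont documentation.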

2.2. Word template

By default, an empty ECDC template (Microsoft Word) is used to produce the report. In order to modify this template, please first download the default template using the function getTemplate().

You can store this Microsoft Word template in a specific folder /template.

getTemplate(output_path = "C:/EpiReport/template")

Then, apply the modifications required, save it and use it as a new Microsoft Word template when producing the epidemiological report as described below.

getAER(template = "C:/EpiReport/template/New_AER_Template.docx",
       outputPath = "C:/EpiReport/doc/")

Please make sure that the Microsoft Word bookmarks are preserved when modifying the template. The bookmarks specify where each output is inserted.
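
One way to verify that no bookmark was lost is to list the bookmarks of the original and the modified template and compare them. The sketch below assumes the officer package is installed (it is not mentioned elsewhere in this document) and uses its read_docx() and docx_bookmarks() functions; the helper name list_template_bookmarks is hypothetical.

```r
# Hypothetical helper: names of the bookmarks found in a Word document.
# With path = NULL, officer's built-in blank template is read instead.
list_template_bookmarks <- function(path = NULL) {
  if (!requireNamespace("officer", quietly = TRUE)) {
    return(character(0))
  }
  doc <- officer::read_docx(path = path)
  officer::docx_bookmarks(doc)
}

# Runs anywhere: bookmarks of officer's blank template
blank_bookmarks <- list_template_bookmarks()

# With real files, bookmarks lost during editing could be listed with:
# setdiff(list_template_bookmarks("C:/EpiReport/template/AER_template.docx"),
#         list_template_bookmarks("C:/EpiReport/template/New_AER_Template.docx"))
```

The file name AER_template.docx in the commented lines is a placeholder for whatever file getTemplate() writes to the folder.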

3. How to generate each epidemiological output independently

The EpiReport package allows the user to generate each epidemiological output independently of the Microsoft Word report.

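The ECDC Annual Epidemiological Report includes five types of outputs: a table of cases by Member State, a seasonality plot, a trend plot, maps, and an age and gender bar graph. Each is described in the sections below.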

3.1. Table: distribution of cases by Member State

The function getTableByMS() generates a flextable object (see package flextable) presenting the number of cases by Member State over the last five years.

By default, the function will use the internal Salmonellosis 2012-2016 data and present the number of confirmed cases and the corresponding rate for each year, with a focus on 2016 and age-standardised rates.

EpiReport::getTableByMS()

Country         2012           2013           2014           2015           2016
                Number  Rate   Number  Rate   Number  Rate   Number  Rate   Number  Rate   ASR
Austria         1773    21.1   1404    16.6   1654    19.4   1544    18.0   1415    16.3   17.2
Belgium         3101    28.0   2528    22.7   2698    24.1   3050    27.1   2698    23.9   22.9
Bulgaria        839     11.5   766     10.5   730     10.1   1076    14.9   718     10.0   10.8
Croatia         0       0.0    0       0.0    1494    35.2   1593    37.7   1240    29.6   30.3
Cyprus          90      10.4   79      9.1    88      10.3   65      7.7    77      9.1    8.3
Czech Republic  10056   95.7   9790    93.1   13255   126.1  12408   117.7  11610   110.0  113.5
Denmark         1207    21.6   1137    20.3   1124    20.0   925     16.3   1081    18.9   18.8
Estonia         249     18.8   183     13.9   92      7.0    112     8.5    351     26.7   26.5
Finland         2210    40.9   1984    36.6   1622    29.8   1650    30.2   1512    27.6   28.7
France          8705    27.8   8927    28.4   8880    28.1   10305   32.3   8876    27.7   26.8
Germany         20493   25.5   18696   23.2   16000   19.8   13667   16.8   12858   15.6   16.8
Greece          404     3.6    414     3.8    349     3.2    466     4.3    735     6.8    7.0
Hungary         5462    55.0   4953    50.0   5249    53.1   4894    49.7   4722    48.0   50.2
Iceland         38      11.9   48      14.9   40      12.3   44      13.4   39      11.7   12.4
Ireland         309     6.7    326     7.1    259     5.6    270     5.8    299     6.3    6.2
Italy           4829    8.1    5048    8.5    4467    7.3    3825    6.3    4134    6.8    7.0
Latvia          547     26.8   385     19.0   278     13.9   380     19.1   454     23.1   23.8
Liechtenstein   .       .      .       .      .       .      .       .      .       .      .
Lithuania       1762    58.7   1199    40.3   1145    38.9   1082    37.0   1076    37.3   37.4
Luxembourg      136     25.9   120     22.3   110     20.0   106     18.8   108     18.7   19.3
Malta           88      21.1   84      19.9   132     31.0   126     29.3   158     36.4   37.6
Netherlands     2199    20.5   979     9.1    970     9.0    974     9.0    1150    10.6   10.7
Norway          1371    27.5   1361    26.9   1118    21.9   928     18.0   865     16.6   16.9
Poland          7959    20.9   7315    19.2   8042    21.2   8245    21.7   9718    25.6   -
Portugal        185     1.8    167     1.6    244     2.3    325     3.1    376     3.6    3.9
Romania         698     3.5    1302    6.5    1512    7.6    1330    6.7    1479    7.5    7.6
Slovakia        4627    85.6   3807    70.4   4078    75.3   4841    89.3   5299    97.7   99.9
Slovenia        392     19.1   316     15.3   597     29.0   401     19.4   311     15.1   15.7
Spain           4224    -      4537    -      6633    -      9015    -      9818    -      -
Sweden          2922    30.8   2842    29.7   2211    22.9   2312    23.7   2247    22.8   23.3
United Kingdom  8812    13.9   8465    13.2   8099    12.6   9490    14.6   9902    15.1   15.0
EU-EEA          95687   22.1   89162   20.5   93170   20.8   95449   21.0   95326   20.4   20.3

Table. Distribution of confirmed salmonellosis cases, EU/EEA, 2012-2016

This table can also be drafted from external data by specifying the disease dataset, the disease code and the year to use as reference in the report.

In the example below, we use Zika virus data. According to the report parameters, the table for this disease should present the number of reported cases over the last five years and by Member State.

ZIKV2016 <- read.table("data/ZIKV2016.csv", 
                       sep = ",", 
                       header = TRUE, 
                       stringsAsFactors = FALSE)
EpiReport::getTableByMS(x = ZIKV2016, 
             disease = "ZIKV", 
             year = 2016)

Country         2012    2013    2014    2015    2016
                Number  Number  Number  Number  Number
Austria         -       -       -       1       41
Belgium         -       -       -       1       120
Bulgaria        .       .       .       .       .
Croatia         .       .       .       .       .
Cyprus          .       .       .       .       .
Czech Republic  -       -       -       -       13
Denmark         -       -       -       -       8
Estonia         -       -       -       -       0
Finland         -       -       -       1       6
France          -       -       -       -       1141
Germany         .       .       .       .       .
Greece          -       -       -       -       4
Hungary         -       -       -       -       2
Iceland         .       .       .       .       .
Ireland         -       -       -       1       15
Italy           -       -       -       -       101
Latvia          0       0       0       0       0
Liechtenstein   .       .       .       .       .
Lithuania       .       .       .       .       .
Luxembourg      -       -       -       -       2
Malta           -       -       -       -       2
Netherlands     -       -       -       11      98
Norway          -       -       -       -       8
Poland          .       .       .       .       .
Portugal        -       -       -       -       18
Romania         -       -       -       -       3
Slovakia        -       -       -       -       3
Slovenia        -       -       -       -       7
Spain           -       -       -       10      301
Sweden          -       -       -       1       34
United Kingdom  -       -       -       3       194
EU-EEA          0       0       0       29      2121

Table. Distribution of Zika virus infection cases, EU/EEA, 2012-2016
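
To illustrate the layout of these tables, the base R sketch below pivots yearly counts from long format (one row per country and year, as in the disease dataset) to one column per year. It is not the package's implementation; the toy values are taken from the Zika table above and the object names are assumptions.

```r
# Toy yearly counts in long format (values from the Zika table)
long <- data.frame(
  GeoCode  = c("AT", "AT", "NL", "NL"),
  TimeCode = c("2015", "2016", "2015", "2016"),
  N        = c(1, 41, 11, 98),
  stringsAsFactors = FALSE
)

# One row per country, one column per year
wide <- reshape(long, idvar = "GeoCode", timevar = "TimeCode",
                direction = "wide")
names(wide)   # "GeoCode" "N.2015" "N.2016"
```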

3.2. Seasonality plot: distribution of cases by month

The function getSeason() generates a ggplot (see package ggplot2) presenting the distribution of cases at EU/EEA level, by month, over the past five years.

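The plot shows the number of cases by month in the reference year, compared with the mean, minimum and maximum number of cases by month over the four previous years.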

By default, the function will use the internal Salmonellosis 2012-2016 data.

# --- Salmonellosis 2016 plot
EpiReport::getSeason()

Figure. Distribution of confirmed salmonellosis cases by month, EU/EEA, 2016 and 2012-2015

The plot can also be drafted from external data by specifying the disease dataset, the disease code and the year to use as reference in the report.

In the example below, we use Pertussis 2012-2016 data.

# --- Pertussis 2016 plot
EpiReport::getSeason(x = PERT2016,
                     disease = "PERT",
                     year = 2016)

Figure. Distribution of pertussis cases by month, EU/EEA, 2016 and 2012-2015
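
The figure captions ("2016 and 2012-2015") suggest that the reference year is compared with the four previous years. Under that assumption, the summary behind such a plot could be computed as in the base R sketch below. This is not the code of getSeason(); the column names and the toy counts are illustrative.

```r
# Toy monthly counts for two months across five years (illustrative values)
monthly <- data.frame(
  year  = rep(2012:2016, times = 2),
  month = rep(1:2, each = 5),
  N     = c(10, 14, 12, 8, 20,
            30, 34, 32, 28, 40)
)

ref_year <- 2016
previous <- monthly[monthly$year %in% (ref_year - 4):(ref_year - 1), ]

# Minimum, mean and maximum of the four previous years, by month
season <- data.frame(
  month      = sort(unique(previous$month)),
  min4years  = as.vector(tapply(previous$N, previous$month, min)),
  mean4years = as.vector(tapply(previous$N, previous$month, mean)),
  max4years  = as.vector(tapply(previous$N, previous$month, max))
)
season$mean4years   # 11 31
```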

3.3. Trend plot: trend and number of cases by month

The function getTrend() generates a ggplot (see package ggplot2) presenting the trend and the number of cases at EU/EEA level, by month, over the past five years.

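The plot shows the number of cases by month over the five-year period, together with a moving average that indicates the trend.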

By default, the function will use the internal Salmonellosis 2012-2016 data.

# --- Salmonellosis 2016 plot
EpiReport::getTrend()

Figure. Trend and number of confirmed salmonellosis cases, EU/EEA by month, 2012-2016

The plot can also be drafted from external data by specifying the disease dataset, the disease code and the year to use as reference in the report.

In the example below, we again use Pertussis 2012-2016 data.

# --- Pertussis 2016 plot
EpiReport::getTrend(x = PERT2016,
                    disease = "PERT",
                    year = 2016)

Figure. Trend and number of pertussis cases, EU/EEA by month, 2012-2016
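
A trend line of this kind is commonly obtained with a moving average. The sketch below shows a centred moving average in base R using stats::filter(); the helper name, the window length and the toy counts are assumptions, and the smoothing actually applied by getTrend() may differ.

```r
# Hypothetical helper: centred moving average over k consecutive months
moving_average <- function(x, k) {
  as.numeric(stats::filter(x, rep(1 / k, k), sides = 2))
}

# Toy monthly counts (illustrative values)
n_cases <- c(1, 2, 3, 4, 5)
trend <- moving_average(n_cases, k = 3)
trend   # NA 2 3 4 NA: the ends are NA because the window is incomplete
```

For monthly surveillance data, a window of k = 12 would smooth out the yearly seasonal pattern.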

3.4. Maps: distribution of cases by Member State

The function getMap() provides a preview of the PNG map associated with the disease.

By default, the function will use the internal Salmonellosis 2016 PNG maps. According to the report parameters, the corresponding map should present the notification rate of confirmed salmonellosis cases.

# --- Salmonellosis 2016 map
EpiReport::getMap()

Figure. Distribution of confirmed salmonellosis cases per 100 000 population by country, EU/EEA, 2016

The map can also be included using external PNG files, and specifying the disease code and the year to use as reference in the report. The corresponding syntax is described below (pertussis map not available).

# --- Pertussis 2016 map
EpiReport::getMap(disease = "PERT", 
                  year = 2016, 
                  pathPNG = "C:/EpiReport/maps/")
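
Before calling getMap() with an external folder, it can help to check which PNG files the folder contains. The base R sketch below does this with a temporary folder and an empty placeholder file; the helper name list_map_files and the file name example_map.png are assumptions, and the file naming expected by getMap() is not specified here.

```r
# Hypothetical helper: PNG files found in a maps folder
list_map_files <- function(pathPNG) {
  list.files(pathPNG, pattern = "\\.png$", ignore.case = TRUE)
}

# Demonstration with a temporary folder and an empty placeholder file
maps_dir <- file.path(tempdir(), "maps")
dir.create(maps_dir, showWarnings = FALSE)
file.create(file.path(maps_dir, "example_map.png"))

map_files <- list_map_files(maps_dir)
length(map_files) > 0   # TRUE when at least one PNG file is available
```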

3.5. Age and gender bar graph

The function getAgeGender() generates a ggplot (see package ggplot2) presenting in a bar graph the distribution of cases at EU/EEA level by age and gender.

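The bar graph uses the measure set in the report parameters for the disease, for example the rate of cases per 100 000 population or the proportion of cases (%).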

By default, the function will use the internal Salmonellosis 2012-2016 data with the rate of confirmed cases per 100 000 population.

# --- Salmonellosis 2016 bar graph
EpiReport::getAgeGender()

Figure. Distribution of confirmed salmonellosis cases per 100 000 population, by age and gender, EU/EEA, 2016

The bar graph can also be drafted from external data by specifying the disease dataset, the disease code and the year to use as reference in the report.

In the example below, we use Zika 2012-2016 data.

# --- Zika 2016 bar graph
EpiReport::getAgeGender(x = ZIKV2016, 
                        disease = "ZIKV", 
                        year = 2016)

Figure. Distribution of Zika virus infection proportion (%), by age and gender, EU/EEA, 2016


  1. The European Surveillance System (TESSy) is a system for the collection, analysis and dissemination of data on communicable diseases. EU Member States and EEA countries contribute to the system by uploading their infectious disease surveillance data at regular intervals.

  2. 2000/96/EC: Commission Decision of 22 December 1999 on the communicable diseases to be progressively covered by the Community network under Decision No 2119/98/EC of the European Parliament and of the Council. Official Journal, OJ L 28, 03.02.2000, p. 50-53.

  3. 2003/534/EC: Commission Decision of 17 July 2003 amending Decision No 2119/98/EC of the European Parliament and of the Council and Decision 2000/96/EC as regards communicable diseases listed in those decisions and amending Decision 2002/253/EC as regards the case definitions for communicable diseases. Official Journal, OJ L 184, 23.07.2003, p. 35-39.

  4. 2007/875/EC: Commission Decision of 18 December 2007 amending Decision No 2119/98/EC of the European Parliament and of the Council and Decision 2000/96/EC as regards communicable diseases listed in those decisions. Official Journal, OJ L 344, 28.12.2007, p. 48-49.

  5. Commission Decision 2119/98/EC of the Parliament and of the Council of 24 September 1998 setting up a network for the epidemiological surveillance and control of communicable diseases in the Community. Official Journal, OJ L 268, 03/10/1998 p. 1-7.