The main goal of this package is to make the statistical analysis of empirical comparisons of algorithms easy and fast. For that reason, and with the aim of being a complete solution, the package includes functions to load and manipulate data, as well as to format the results for further use in publications. This vignette shows the use of these functions.
The data matrices required by the package functions should have one row per problem and a number of columns. The columns can be divided into two subsets: descriptors of the problem and results obtained by the algorithms applied to that problem. The package comes with three example data sets:
> library(scmamp)
> data(data_gh_2008)
> head(data.gh.2008)
## C4.5 k-NN(k=1) NaiveBayes Kernel CN2
## Abalone* 0.219 0.202 0.249 0.165 0.261
## Adult* 0.803 0.750 0.813 0.692 0.798
## Australian 0.859 0.814 0.845 0.542 0.816
## Autos 0.809 0.774 0.673 0.275 0.785
## Balance 0.768 0.790 0.727 0.872 0.706
## Breast 0.759 0.654 0.734 0.703 0.714
> data(data_gh_2010)
> head(data.gh.2010)
## PDFC NNEP IS_CHC_1NN FH_GBML
## Adult 0.752 0.773 0.785 0.795
## Breast 0.727 0.748 0.724 0.713
## Bupa 0.736 0.716 0.585 0.638
## Car 0.994 0.861 0.880 0.791
## Cleveland 0.508 0.553 0.575 0.515
## Contraceptive 0.535 0.536 0.513 0.471
> data(data_blum_2015)
> head(data.blum.2015)
## Size Radius FruitFly Shukla Ikeda Turau Rand1 Rand2 FrogCOL FrogMIS
## 1 1000 0.049 223 213 214 214 214 212 246 226
## 2 1000 0.049 224 207 209 216 205 211 241 219
## 3 1000 0.049 219 206 215 214 209 213 243 221
## 4 1000 0.049 227 208 218 218 215 219 251 230
## 5 1000 0.049 231 218 210 212 211 217 243 239
## 6 1000 0.049 230 214 214 208 211 206 246 229
The first two correspond to the example datasets used in García and Herrera (2008) and García and Herrera (2010), respectively. They do not have any descriptor column; the descriptor is in the row names, which indicate the data set used in each comparison. The third set corresponds to the results presented in Blum et al. (2015). In particular, these are the results obtained by 8 decentralized algorithms in a set of random geometric graphs. In this case, the first two columns (Size and Radius) represent the descriptors of the problem (number of nodes in the graph and maximum distance to consider two nodes as connected).
This type of matrix can be easily loaded from a csv file with the same structure; we will call this structure comparison format. However, in some cases the results are not in this format. As an alternative to processing the results externally to build such a file, the package includes some functions to do this task in some typical cases. If you are able to construct a matrix like this in R, then you can skip this section.
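As a minimal sketch of what a table in comparison format looks like, the following base R code builds a small data frame by hand, stores it as a csv file and reads it back. The column names AlgA and AlgB and all the values are made up for illustration; they are not part of the package.

```r
# Hypothetical toy results: two descriptors followed by two algorithm columns.
results <- data.frame(
  Size   = c(100, 100, 1000, 1000),        # problem descriptor
  Radius = c(0.140, 0.143, 0.049, 0.058),  # problem descriptor
  AlgA   = c(31, 30, 247, 190),            # results of a made-up algorithm
  AlgB   = c(29, 28, 227, 177)             # results of another made-up algorithm
)

# Store the table as a csv file with a header line and load it again.
tmp <- tempfile(fileext = ".csv")
write.csv(results, tmp, row.names = FALSE)
reloaded <- read.csv(tmp)
print(reloaded)
```

Each row is one problem (or one repetition on a problem), and every algorithm has its own column.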
The simplest function is readComparisonFile, which processes a single file in comparison format. This is the format of the tables shown above, so this function essentially reads files of this kind. The only additional processing is the possibility of supplying the column names in case the file does not contain a header, and a reorganization of the columns so that all the descriptors are at the beginning and the algorithms at the end. The function has three parameters:
file - The path of the file to load.
alg.cols - A vector with either column names or column indices to indicate which columns in the file contain results. The rest are assumed to be descriptors of the problem.
col.names - An optional parameter to indicate the names of the columns. If not provided, the names are taken from the header of the file (the first line).
The function accepts additional parameters for the read.csv function, such as sep for the character used to separate columns or skip to skip the first n lines of the file. The only parameter not accepted is header, as it is fixed depending on whether the col.names parameter is used or not.
For example, if we want to load a file named results.dat where the first 5 lines are comments, the elements are separated by a semicolon and the actual results are in three columns named Alg_1, Alg_2 and Alg_3, the call would be:
> data.raw <- readComparisonFile(file="results.dat", alg.cols=c('Alg_1', 'Alg_2', 'Alg_3'),
+ skip=5, sep=";")
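To make the effect of these arguments more concrete, the following sketch reproduces the same steps with base R only. It is an approximation of the behaviour described above, not the package's implementation; the file content and the Problem column are invented for the example.

```r
# Create a small semicolon-separated file with 5 comment lines on top.
# The content and the 'Problem' descriptor are made up for this illustration.
tmp <- tempfile(fileext = ".dat")
writeLines(c(paste("# comment", 1:5),
             "Alg_1;Problem;Alg_2;Alg_3",
             "0.81;p1;0.79;0.85",
             "0.64;p2;0.70;0.66"), tmp)

alg.cols <- c("Alg_1", "Alg_2", "Alg_3")

# Skip the comments, use ';' as separator and take the names from the header.
raw <- read.csv(tmp, skip = 5, sep = ";", header = TRUE)

# Put the descriptors first and the algorithm results last.
descriptors <- setdiff(names(raw), alg.cols)
data.sketch <- raw[, c(descriptors, alg.cols)]
print(data.sketch)
```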
As an example of real use of this function, the package includes a file containing all the results in data.blum.2015. This file can be loaded as follows:
> data.dir <- system.file("loading_tests",package="scmamp")
> file.path <- paste(data.dir, "rgg_complete_comparison.out", sep="/")
> data.raw <- readComparisonFile(file=file.path, alg.cols=3:10, col.names=NULL)
> head(data.raw)
## Size Radius FruitFly Shukla Ikeda Turau Rand1 Rand2 FrogCOL FrogMIS
## 1 1000 0.049 223 213 214 214 214 212 246 226
## 2 1000 0.049 224 207 209 216 205 211 241 219
## 3 1000 0.049 219 206 215 214 209 213 243 221
## 4 1000 0.049 227 208 218 218 215 219 251 230
## 5 1000 0.049 231 218 210 212 211 217 243 239
## 6 1000 0.049 230 214 214 208 211 206 246 229
Quite often the results of an experiment are split into different files (e.g., when the experiment has been parallelized and run in a cluster). In such cases, part of the information we need to load may be encoded in the file name itself; scmamp includes functions to cope with this situation. If each of the result files is in comparison format (i.e., they have a structure similar to the examples above), the function readComparisonDir can be used to load all the files in a given directory. Note that the function will try to load all the files, so the directory must contain only result files.
Instead of passing the path to a file, in this case we need to provide the path of the directory that contains the files. As in the previous function, we have the parameters alg.cols and col.names, which have the same meaning as in readComparisonFile. The function has two more parameters, names and fname.pattern, which define how the file names have to be processed.
The fname.pattern argument is used to specify, using regular expressions, the pattern of the file names. In this pattern there should be one or more groups, which are enclosed in parentheses. These groups are the parts of the information that will be extracted from the name; the names argument is a vector that assigns a name to each of the extracted elements.
Although the patterns can be far more complex, quite frequently the file name will be an alternation of fixed and variable parts. The package includes an example directory with this kind of files.
> dir <- paste(system.file("loading_tests",package="scmamp"),
+ "comparison_files", sep="/")
> list.files(dir)
## [1] "rgg_size_1000_r_0.049.out" "rgg_size_1000_r_0.058.out"
## [3] "rgg_size_1000_r_0.067.out" "rgg_size_1000_r_0.076.out"
## [5] "rgg_size_1000_r_0.085.out" "rgg_size_1000_r_0.094.out"
## [7] "rgg_size_1000_r_0.103.out" "rgg_size_1000_r_0.112.out"
## [9] "rgg_size_1000_r_0.121.out" "rgg_size_1000_r_0.134.out"
## [11] "rgg_size_100_r_0.140.out" "rgg_size_100_r_0.143.out"
## [13] "rgg_size_100_r_0.146.out" "rgg_size_100_r_0.149.out"
## [15] "rgg_size_100_r_0.152.out" "rgg_size_100_r_0.155.out"
## [17] "rgg_size_100_r_0.158.out" "rgg_size_100_r_0.161.out"
## [19] "rgg_size_100_r_0.164.out" "rgg_size_100_r_0.169.out"
## [21] "rgg_size_5000_r_0.024.out" "rgg_size_5000_r_0.036.out"
## [23] "rgg_size_5000_r_0.048.out" "rgg_size_5000_r_0.060.out"
## [25] "rgg_size_5000_r_0.072.out" "rgg_size_5000_r_0.084.out"
## [27] "rgg_size_5000_r_0.096.out" "rgg_size_5000_r_0.108.out"
## [29] "rgg_size_5000_r_0.120.out" "rgg_size_5000_r_0.134.out"
The structure of the names in this example is as follows. All the names start with a fixed string, rgg_size_. Then there is an integer value, corresponding to the size of the graph. Then, after another fixed string (_r_), we have a real number, the radius used to create the graph. Finally, the name ends with the extension .out. The pattern for these files can be constructed as follows:
> fname.pattern <- "rgg_size_([0-9]*)_r_([0-9]*.[0-9]*)\\.out"
The pattern includes the fixed strings and, between parentheses, the patterns of the variable parts. For instance, if we have an integer of variable size, we can define its pattern as [0-9]*, with [0-9] representing any digit and * the previous pattern repeated any number of times. It is important to enclose these patterns in parentheses, as only these parts will be extracted. In most cases we can define between square brackets the characters we may find and then add an * after it, as for example:
[0-9]*.[0-9]* for non-integer numbers.
[a-z]* for lower case strings.
[A-Z]* for upper case strings.
[A-Z][a-z]* for lower case strings starting with an upper case letter.
Note that an unescaped period matches any character; to match only a literal period it has to be escaped, as done before the out extension. Note also that, given that all the radii used start with 0., we can simplify the pattern by replacing the [0-9]* before the period with just a 0. In the pattern above there are two groups defined and, thus, we need to assign two names to them:
> var.names <- c("Size", "Radius")
The files have a header indicating the names of the columns which, in this case, correspond to the results obtained by the eight algorithms. Therefore, we do not need to specify the column names but, as in the case of a single file, we have to indicate which columns contain the results. This can be done using the indices of the columns (as in the previous example) or their names:
> alg.names <- c("FruitFly", "Shukla", "Ikeda", "Turau", "Rand1", "Rand2", "FrogCOL", "FrogMIS")
Finally, we can load the data:
> rm("data.raw")
> data.raw <- readComparisonDir (directory=dir, alg.cols=alg.names, col.names=NULL,
+ names=var.names, fname.pattern=fname.pattern)
> head(data.raw)
## Size Radius FruitFly Shukla Ikeda Turau Rand1 Rand2 FrogCOL FrogMIS
## 1 1000 0.049 223 213 214 214 214 212 246 226
## 2 1000 0.049 224 207 209 216 205 211 241 219
## 3 1000 0.049 219 206 215 214 209 213 243 221
## 4 1000 0.049 227 208 218 218 215 219 251 230
## 5 1000 0.049 231 218 210 212 211 217 243 239
## 6 1000 0.049 230 214 214 208 211 206 246 229
As we can see above, besides the content of the files (the last eight columns), the resulting matrix includes the information extracted from the names of the files, named according to the names argument. Note that, when alg.cols contains column indices, these refer to the columns inside the file. In other words, we do not expect to have the name of the algorithm in the file name.
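The mechanism described in this section (extracting the groups of a pattern from each file name and attaching them to the content of the file) can be sketched in base R as follows. This is an illustration under stated assumptions, not the code used by readComparisonDir: the two files, the columns AlgA and AlgB and their values are made up, and the period in the radius is escaped here so that it only matches a literal dot.

```r
# Build a temporary directory with two made-up result files.
dir <- tempfile("comparison_sketch")
dir.create(dir)
write.csv(data.frame(AlgA = c(223, 224), AlgB = c(213, 207)),
          file.path(dir, "rgg_size_1000_r_0.049.out"), row.names = FALSE)
write.csv(data.frame(AlgA = c(31, 30), AlgB = c(29, 28)),
          file.path(dir, "rgg_size_100_r_0.140.out"), row.names = FALSE)

fname.pattern <- "rgg_size_([0-9]*)_r_([0-9]*\\.[0-9]*)\\.out"
var.names <- c("Size", "Radius")

pieces <- lapply(list.files(dir), function(fname) {
  # regexec returns the full match followed by one entry per group;
  # dropping the first element keeps only the extracted groups.
  groups <- regmatches(fname, regexec(fname.pattern, fname))[[1]][-1]
  content <- read.csv(file.path(dir, fname))
  # The two extracted values are repeated for every row of the file.
  out <- data.frame(as.numeric(groups[1]), as.numeric(groups[2]), content)
  names(out)[1:2] <- var.names
  out
})
data.sketch <- do.call(rbind, pieces)
print(data.sketch)
```

The result has the descriptors taken from the file names in the first two columns and the content of the files in the remaining ones.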
However, in some situations, the results for each algorithm may be in a different file. Files of this kind contain the result of only one algorithm per line. We will call this structure experiment format, to distinguish it from the previous one. There are two functions to handle this kind of files, readExperimentFile and readExperimentDir. These functions are similar to the previous two, but have some differences that have to do with the format of the files.
Each row of these files contains the result of applying one algorithm to one problem. Therefore, the experiment is characterized using descriptors for the problem, a column indicating the algorithm used and a column containing the result to be compared. The package includes one example of this kind of file that contains all the results in data.blum.2015:
> dir <- system.file("loading_tests", package="scmamp")
> file <- paste(dir, "rgg_complete_experiment.out", sep="/")
> content <- read.csv(file)
> content[c(1,901,181),]
## Size Radius Algorithm Evaluation
## 1 1000 0.049 FruitFly 223
## 901 1000 0.049 Shukla 213
## 181 1000 0.103 FruitFly 74
As can be seen above, the first two columns are the same descriptors as in previous examples, but now we have only two more columns: Algorithm, which indicates the algorithm used, and Evaluation, which contains the value to be used. This kind of file can be read using the function readExperimentFile in order to produce the table we need for the analysis.
> rm("data.raw")
> data.raw <- readExperimentFile (file=file, alg.col="Algorithm", value.col="Evaluation")
> head(data.raw)
## Size Radius FruitFly Shukla Ikeda Turau Rand1 Rand2 FrogCOL FrogMIS
## 1 1000 0.049 223 213 214 214 214 212 246 226
## 2 1000 0.049 224 207 209 216 205 211 241 219
## 3 1000 0.049 219 206 215 214 209 213 243 221
## 4 1000 0.049 227 208 218 218 215 219 251 230
## 5 1000 0.049 231 218 210 212 211 217 243 239
## 6 1000 0.049 230 214 214 208 211 206 246 229
Note that, in this case, the file has to be processed to build a matrix in comparison format and, thus, loading the data from this type of structure is computationally more expensive. Therefore, it is highly recommended to use the comparison format to store the results.
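The kind of processing involved can be illustrated with a base R sketch that turns a small table in experiment format into comparison format. The data are made up, and the sketch assumes that the repetitions of different algorithms on the same problem are paired by their order of appearance; how the package pairs them is not stated here.

```r
# Made-up experiment-format data: one row per problem, repetition and algorithm.
experiment <- data.frame(
  Size       = rep(1000, 4),
  Radius     = rep(0.049, 4),
  Algorithm  = c("AlgA", "AlgA", "AlgB", "AlgB"),
  Evaluation = c(223, 224, 213, 207)
)

# Number the repetitions within each problem/algorithm combination so that
# every row of the wide table is uniquely identified (assumption: order pairs them).
experiment$Rep <- ave(seq_len(nrow(experiment)),
                      experiment$Size, experiment$Radius, experiment$Algorithm,
                      FUN = seq_along)

# Spread the Evaluation values into one column per algorithm.
wide <- reshape(experiment, direction = "wide",
                idvar = c("Size", "Radius", "Rep"),
                timevar = "Algorithm", v.names = "Evaluation")
names(wide) <- sub("^Evaluation\\.", "", names(wide))
wide$Rep <- NULL
print(wide)
```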
Now, instead of an argument to determine which columns include the results, we have two arguments, alg.col and value.col, which have to be either the name or the index of the columns that contain the algorithm used and the value obtained, respectively. Additionally, as in the previous functions, there is an argument to indicate the names of the columns, in case the file does not have a header.
As in the case of the comparison format, the package includes a function to load all the files in a directory: readExperimentDir. Unlike in the previous function, the information about the algorithm can be either inside the file or in its name. For that reason, instead of the alg.col argument, which can be a name or an index, we now have an argument, alg.var.name, that can only be a string. This string should be a column name or the name assigned to one of the variables extracted from the file name.
As in the function readComparisonDir, we have two parameters, names and fname.pattern, to define how the names of the files will be processed. An example of the use of this function is the following.
> rm("data.raw")
> dir <- paste(system.file("loading_tests", package="scmamp"),
+ "experiment_files", sep="/")
> list.files(dir)[1:10]
## [1] "rgg_size_1000_r_0.049_FrogCOL.out"
## [2] "rgg_size_1000_r_0.049_FrogMIS.out"
## [3] "rgg_size_1000_r_0.049_FruitFly.out"
## [4] "rgg_size_1000_r_0.049_Ikeda.out"
## [5] "rgg_size_1000_r_0.049_Rand1.out"
## [6] "rgg_size_1000_r_0.049_Rand2.out"
## [7] "rgg_size_1000_r_0.049_Shukla.out"
## [8] "rgg_size_1000_r_0.049_Turau.out"
## [9] "rgg_size_1000_r_0.058_FrogCOL.out"
## [10] "rgg_size_1000_r_0.058_FrogMIS.out"
> pattern <- "rgg_size_([0-9]*)_r_(0.[0-9]*)_([a-z, A-Z, 1, 2]*).out"
> var.names <- c("Size", "Radius", "Algorithm")
> data.raw <- readExperimentDir (directory=dir, names=var.names, fname.pattern=pattern,
+ alg.var.name='Algorithm', value.col=1, col.names="Evaluation")
> head(data.raw)
## Size Radius FrogCOL FrogMIS FruitFly Ikeda Rand1 Rand2 Shukla Turau
## 1 1000 0.049 246 226 223 214 214 212 213 214
## 2 1000 0.049 241 219 224 209 205 211 207 216
## 3 1000 0.049 243 221 219 215 209 213 206 214
## 4 1000 0.049 251 230 227 218 215 219 208 218
## 5 1000 0.049 243 239 231 210 211 217 218 212
## 6 1000 0.049 246 229 230 214 211 206 214 208
In this case, the format of the file names is similar, but it also includes the name of the algorithm, so the information about the algorithm used is in the file name itself. Actually, each file contains a single column with the results of 30 repetitions.
The package includes functions that can be used to perform two basic operations with data matrices, summarizing and filtering.
The summarization can be achieved easily with the function summarizeData. For example, we can get the median value obtained by each algorithm for each graph size:
## groups FrogCOL FrogMIS FruitFly Ikeda Rand1 Rand2 Shukla Turau
## 1 1000 93.5 87.0 93 79.5 81.5 79.5 81.0 79.5
## 2 100 28.0 26.0 25 23.0 23.0 23.0 23.0 23.0
## 3 5000 135.5 122.5 1 117.5 115.0 115.5 117.5 115.5
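For readers who want to see the idea behind this kind of summary, the following sketch computes group-wise medians with base R's aggregate function. It is only an illustration of the operation on made-up data (columns AlgA and AlgB are hypothetical), not a description of how summarizeData is implemented.

```r
# Made-up results for two graph sizes and two hypothetical algorithms.
toy <- data.frame(
  Size = c(100, 100, 100, 1000, 1000, 1000),
  AlgA = c(25, 28, 31, 80, 90, 100),
  AlgB = c(23, 23, 26, 79, 81, 95)
)

# One row per value of Size, with the median of each algorithm column.
medians <- aggregate(cbind(AlgA, AlgB) ~ Size, data = toy, FUN = median)
print(medians)
```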
The function filterData can be used to remove rows and columns in a simple way. For example, to reduce the data matrix to the results where the size was 100 and Rand2 has a value greater than or equal to that of Rand1, retaining all the columns except the size, we can run:
data.filtered <- filterData(data=data.raw,
condition="Size == 100 & Rand1 <= Rand2",
remove.cols="Size")
dim(data.filtered)
## [1] 191 9
dim(data.raw)
## [1] 900 10
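The same kind of filtering can be sketched with base R's subset function, which may help to see what the condition and the removal of columns amount to. The data below are made up and the code is an analogy, not the implementation of filterData.

```r
# Made-up data with the columns used in the condition above.
toy <- data.frame(
  Size  = c(100, 100, 1000, 100),
  Rand1 = c(23, 25, 80, 22),
  Rand2 = c(24, 23, 82, 22)
)

# Keep the rows that satisfy the condition and drop the Size column.
filtered <- subset(toy, Size == 100 & Rand1 <= Rand2, select = -Size)
print(filtered)
```

Only the first and last rows satisfy the condition; the tie in the last row is kept because the comparison is not strict.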
This can be combined with the previous function to get, for instance, the average error for each radius.
summarizeData(data.filtered, group.by=c("Radius"))
## groups FrogCOL FrogMIS FruitFly Ikeda Rand1 Rand2 Shukla
## 1 0.140 30.82353 28.88235 27.76471 25.52941 25.29412 26.76471 25.47059
## 2 0.143 30.55000 28.55000 26.65000 25.85000 24.90000 26.60000 25.35000
## 3 0.146 29.83333 28.05556 26.22222 25.22222 24.38889 25.88889 24.77778
## 4 0.149 29.35000 28.00000 26.25000 24.30000 23.85000 25.35000 24.75000
## 5 0.152 28.42105 25.78947 25.10526 23.47368 23.21053 24.36842 23.68421
## 6 0.155 27.50000 25.59091 24.04545 22.81818 22.31818 23.36364 23.36364
## 7 0.158 27.35000 24.30000 24.10000 22.55000 21.60000 23.30000 22.35000
## 8 0.161 26.11765 23.58824 22.94118 21.23529 21.41176 22.23529 22.11765
## 9 0.164 25.85000 23.20000 22.50000 20.80000 20.65000 22.20000 21.05000
## 10 0.169 24.66667 22.27778 22.00000 20.16667 19.61111 21.00000 21.00000
## Turau
## 1 26.05882
## 2 25.60000
## 3 25.66667
## 4 24.75000
## 5 23.26316
## 6 23.09091
## 7 22.55000
## 8 22.05882
## 9 20.80000
## 10 20.11111
The package includes a number of functions to generate plots and tables of results. The plots (shown in other vignettes) can be directly used as material for publication, but the tables require some formatting. The package includes a simple function to print tables in LaTeX format, called writeTabular.
Suppose we want to compare, for each problem in the example presented above, all the algorithms with the best one. This can be done using the postHocTest function.
group.by <- c("Size","Radius")
alg.cols <- 3:10
result <- postHocTest(data=data.raw, algorithms=alg.cols, group.by=group.by,
test=test, control="max", correct="holland")
The result includes the summarized values and the p-values associated with each comparison. In a typical table we would like to have the summarized values, highlighting the control value and those with no significant differences. We can create such a table with the following call:
summ <- result$summary
pval <- result$corrected.pval
bold <- is.na(pval)
mark <- pval > 0.05
## Warning in Ops.factor(left, right): '>' not meaningful for factors
## Warning in Ops.factor(left, right): '>' not meaningful for factors
mark[, (1:2)] <- FALSE
mark[is.na(mark)] <- FALSE
digits <- c(0, 3, rep(2, 8))
writeTabular(table=summ, format="f", bold=bold, mark=mark, mark.char="+",
hrule=c(0, 10, 20, 30), vrule = c(2, 4), digits=digits,
print.row.names=FALSE)
## \begin{tabular}{|ll|ll|llllll|}
## \hline
## Size & Radius & FrogCOL & FrogMIS & FruitFly & Ikeda & Rand1 & Rand2 & Shukla & Turau \\
## \hline
## 1000 & 0.049 & {\bf 247.67} & 227.60 & 226.27 & 213.20 & 212.70 & 214.53 & 212.40 & 211.90 \\
## 1000 & 0.058 & {\bf 189.90} & 176.93 & 174.90 & 162.00 & 161.47 & 163.47 & 162.87 & 162.70 \\
## 1000 & 0.067 & {\bf 151.93} & 140.57 & 142.23 & 130.37 & 129.83 & 129.60 & 130.83 & 129.87 \\
## 1000 & 0.076 & {\bf 122.87} & 114.13 & 117.80 & 104.60 & 105.30 & 104.90 & 105.30 & 105.30 \\
## 1000 & 0.085 & {\bf 102.37} & 94.33 & 99.43 & 85.87 & 86.73 & 87.07 & 87.43 & 86.70 \\
## 1000 & 0.094 & 85.47$^+$ & 79.33 & {\bf 85.80} & 72.87 & 72.63 & 72.30 & 74.20 & 72.90 \\
## 1000 & 0.103 & 74.20 & 68.13 & {\bf 75.57} & 62.77 & 63.23 & 62.30 & 63.17 & 62.13 \\
## 1000 & 0.112 & 64.37 & 58.20 & {\bf 66.90} & 54.07 & 54.17 & 54.37 & 54.50 & 54.83 \\
## 1000 & 0.121 & 56.50 & 51.30 & {\bf 59.53} & 47.43 & 47.03 & 47.60 & 47.97 & 47.17 \\
## 1000 & 0.134 & {\bf 47.53} & 43.10 & 24.27 & 40.07 & 39.93 & 40.10 & 40.77 & 40.10 \\
## \hline
## 100 & 0.140 & {\bf 31.07} & 29.10 & 27.80 & 25.70 & 26.17 & 26.10 & 25.63 & 25.90 \\
## 100 & 0.143 & {\bf 30.53} & 28.57 & 26.80 & 25.60 & 25.17 & 25.73 & 25.43 & 25.53 \\
## 100 & 0.146 & {\bf 29.93} & 28.03 & 26.57 & 25.37 & 25.07 & 25.30 & 24.90 & 25.80 \\
## 100 & 0.149 & {\bf 29.10} & 27.33 & 25.80 & 23.97 & 23.87 & 24.40 & 24.47 & 24.40 \\
## 100 & 0.152 & {\bf 28.47} & 25.97 & 25.03 & 23.57 & 23.83 & 23.87 & 23.77 & 23.57 \\
## 100 & 0.155 & {\bf 27.30} & 25.47 & 24.13 & 22.87 & 22.63 & 22.90 & 23.27 & 23.00 \\
## 100 & 0.158 & {\bf 27.33} & 24.57 & 24.03 & 22.63 & 22.40 & 22.90 & 22.47 & 22.50 \\
## 100 & 0.161 & {\bf 25.83} & 23.40 & 22.80 & 21.03 & 21.57 & 21.40 & 21.70 & 21.93 \\
## 100 & 0.164 & {\bf 25.57} & 22.93 & 22.40 & 20.67 & 20.93 & 21.43 & 20.80 & 20.90 \\
## 100 & 0.169 & {\bf 24.50} & 22.20 & 21.73 & 20.10 & 20.13 & 20.17 & 20.57 & 20.03 \\
## \hline
## 5000 & 0.024 & {\bf 975.90} & 972.40$^+$ & 964.53 & 899.60 & 897.83 & 905.30 & 898.23 & 902.50 \\
## 5000 & 0.036 & 502.00 & 495.30 & {\bf 515.83} & 453.87 & 455.57 & 456.60 & 459.30 & 452.97 \\
## 5000 & 0.048 & 309.17 & 294.10 & {\bf 332.13} & 273.67 & 275.23 & 273.83 & 276.40 & 273.00 \\
## 5000 & 0.060 & {\bf 210.90} & 196.93 & 42.03 & 182.30 & 184.73 & 183.00 & 185.60 & 183.43 \\
## 5000 & 0.072 & {\bf 153.90} & 139.87 & 2.80 & 131.20 & 132.50 & 131.90 & 134.97 & 131.47 \\
## 5000 & 0.084 & {\bf 117.90} & 105.73 & 0.27 & 99.87 & 99.37 & 99.83 & 102.03 & 99.60 \\
## 5000 & 0.096 & {\bf 93.30} & 82.57 & 0.00 & 78.77 & 78.73 & 78.20 & 79.97 & 78.60 \\
## 5000 & 0.108 & {\bf 75.83} & 67.43 & 0.00 & 63.73 & 63.33 & 63.10 & 64.50 & 63.40 \\
## 5000 & 0.120 & {\bf 63.27} & 54.57 & 0.00 & 52.10 & 52.63 & 52.60 & 54.07 & 51.90 \\
## 5000 & 0.134 & {\bf 52.20} & 45.47 & 0.00 & 43.43 & 43.10 & 43.07 & 44.27 & 43.47 \\
## \hline
## \end{tabular}
The way the function works is quite simple. It takes as input up to four matrices of the same size:
table - This is mandatory and it has to contain the information to be printed.
bold - An optional logical matrix indicating which cells have to be highlighted in bold font.
italic - An optional logical matrix indicating which cells have to be highlighted in italic.
mark - An optional logical matrix indicating which cells have to be highlighted with a mark. This mark can be changed with the mark.char argument. Note that the mark is generated in a mathematical environment using the superscript modifier. Therefore, any code compatible with this can be used. For example, mark.char = '{H_0}' would be a valid way of marking cells.
The function also has an argument file. If provided, the result is written to that file, rather than printed to the standard output.
Regarding the formatting of the numbers, the function uses R's formatC function, so for possible values of the format parameter check the documentation of that function. This parameter is related to the digits parameter, which can be either a single value or a vector of values indicating the number of significant digits to be used in each column.
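As a quick reminder of how formatC interprets these two arguments, the sketch below formats a few numbers. Note that, according to the formatC documentation, digits is the number of digits after the decimal point for format "f" and the number of significant digits for format "g".

```r
x <- c(0.049, 247.6667, 85.8)

# Fixed notation: 'digits' is the number of decimals.
fixed <- formatC(x, digits = 2, format = "f")
print(fixed)

# General notation: 'digits' is the number of significant digits.
general <- formatC(x, digits = 3, format = "g")
print(general)
```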
Regarding the alignment of columns, the align argument can be set to the typical values 'l', 'r' or 'c'.
Optionally, the column and row names can be printed. This is the default behaviour; if they should not be printed, the arguments print.row.names and/or print.col.names should be set to FALSE.
Finally, the function allows the definition of the horizontal and vertical lines in the table through the parameters bty, hrule and vrule.
The first is a vector of strings that indicates which borders have to be printed. Valid elements for this parameter are 't' for the top border, 'b' for the bottom border, 'l' for the left border and 'r' for the right border. Any subset of these values can be used.
Regarding the hrule and vrule arguments, they can be a list of numbers ranging between 0 and the number of rows/columns - 1, and they indicate after which row/column a line has to be drawn. The 0 value is used to indicate that there has to be a line after the row/column name. Note that the lines after the last row/column are set using the bty argument.
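The relation between vrule, bty and the column specification of the tabular can be sketched with a small helper. The function column.spec below is hypothetical (it is not part of the package) and ignores the row names column, which matches the example above where print.row.names=FALSE.

```r
# Hypothetical helper: builds the LaTeX column specification of a tabular
# from the number of columns, the inner vertical rules and the borders.
column.spec <- function(n.cols, vrule = integer(0), bty = c("l", "r"), align = "l") {
  spec <- rep(align, n.cols)
  # Append a bar to every column after which a vertical line is requested;
  # the line after the last column is controlled by 'bty' instead.
  inner <- vrule[vrule >= 1 & vrule < n.cols]
  spec[inner] <- paste0(spec[inner], "|")
  paste0(if ("l" %in% bty) "|",
         paste(spec, collapse = ""),
         if ("r" %in% bty) "|")
}

print(column.spec(10, vrule = c(2, 4)))
```

With ten columns and lines after the second and fourth, the helper yields the same specification as the first line of the output shown earlier.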
Additionally, the writeTabular function allows us to include the tabular in a LaTeX table environment. writeTabular has an extra set of parameters to control this option: wrap.as.table, table.position, caption, caption.position, centering and label. If wrap.as.table is TRUE (the default value is FALSE), the resulting tabular is embedded in a table environment. table.position controls the position of the table in the LaTeX document using the typical LaTeX values (h, t or b). caption and label control the caption and the label of the table, and caption.position allows the caption to be written over the table (caption.position="t") or under the table (caption.position="b"). Finally, centering allows the use of the \centering LaTeX command within the table environment in order to center the table on the page. Using this writeTabular facility, we can directly generate the corresponding LaTeX code from Sweave or knitr (using the option results='asis' in the corresponding R code chunk) and thus include the table in the resulting pdf document.
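As an illustration of what embedding a tabular in a table environment produces, the following sketch wraps a few lines of LaTeX by hand. The helper wrap.in.table is hypothetical and only mimics the options described above with the caption on top; the exact code generated by writeTabular may differ.

```r
# Hypothetical helper: wraps the lines of a tabular in a table environment,
# with centering and with the caption placed above the tabular.
wrap.in.table <- function(tabular.lines, caption, label, position = "h") {
  c(paste0("\\begin{table}[", position, "]"),
    "\\centering",
    paste0("\\caption{", caption, "}"),
    paste0("\\label{", label, "}"),
    tabular.lines,
    "\\end{table}")
}

tabular <- c("\\begin{tabular}{|ll|}", "\\hline",
             "a & b \\\\", "\\hline", "\\end{tabular}")
out <- wrap.in.table(tabular, caption = "Toy table", label = "tab:toy")
cat(out, sep = "\n")
```

In a knitr chunk with results='asis', printing these lines with cat places the table directly in the generated document.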