Introduction

The package implements a special data structure, the tracks object, to allow rapid computation of different analysis metrics on cell tracks. This tutorial shows how to load tracking data, how to work with tracks objects, how to filter and subset data, and how to convert between tracks objects and other data structures.

1 Reading in data

First load the package:

library( celltrackR )
library( ggplot2 )

1.1 Input data format

Tracking data is usually stored as a table, with columns indicating the cell id, time, and coordinates of each measured point. Here we have an example in the file t-cells.txt, which we can read in as a normal dataframe:

d <- read.table( system.file("extdata", "t-cells.txt", package="celltrackR" ) )
str(d)
## 'data.frame':    381 obs. of  6 variables:
##  $ V1: int  1 2 3 4 5 6 7 8 9 10 ...
##  $ V2: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ V3: num  0 27.8 55.5 83.3 111.1 ...
##  $ V4: num  133 134 132 133 132 ...
##  $ V5: num  119 119 118 118 118 ...
##  $ V6: num  8.75 8.75 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 ...
head(d)
##   V1 V2      V3      V4      V5   V6
## 1  1  0   0.000 132.521 118.692 8.75
## 2  2  0  27.781 133.909 118.700 8.75
## 3  3  0  55.484 131.763 118.129 6.25
## 4  4  0  83.296 133.161 117.903 6.25
## 5  5  0 111.093 131.530 117.894 6.25
## 6  6  0 138.906 132.229 117.665 6.25

The result is a normal dataframe with row numbers in the first column, the cell id in the second column, time in the third column, and the \((x,y,z)\) coordinates in columns 4 to 6.
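
Because the file has no header, the columns get the default names V1 to V6. As a minimal sketch in base R, the layout described above can be made explicit by naming the columns. The data frame `d.toy` and its values are made up for illustration, and the names `row`, `id`, `t`, `x`, `y`, `z` are our own choice rather than something the package requires:

```r
# Toy stand-in for the table read from the file: same column layout, made-up values
d.toy <- data.frame(
  V1 = 1:4,                            # row number
  V2 = c(0, 0, 1, 1),                  # cell id
  V3 = c(0, 27.8, 0, 27.8),            # time
  V4 = c(132.5, 133.9, 113.7, 114.1),  # x
  V5 = c(118.7, 118.7, 125.0, 125.0),  # y
  V6 = c(8.75, 8.75, 41.25, 41.25)     # z
)

# Descriptive column names (an illustrative choice)
colnames(d.toy) <- c("row", "id", "t", "x", "y", "z")

# The row-number column carries no information, so drop it
d.toy <- d.toy[, c("id", "t", "x", "y", "z")]

# Number of measured points per cell
points.per.cell <- table(d.toy$id)
points.per.cell
```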

1.2 Directly reading in data as a tracks object

While we can read tracks as a dataframe using R's basic function read.table(), the function read.tracks.csv() lets us read the data in directly as a tracks object, a special data structure designed for efficient handling of tracking data.

Applying this to the same file as before:

t <- read.tracks.csv( system.file("extdata", "t-cells.txt", package="celltrackR" ),
                      header = FALSE,
                      id.column = 2, time.column = 3, pos.columns = 4:6 )
plot(t)

Here we have to specify header = FALSE because the file t-cells.txt does not contain any column headers. Note that read.tracks.csv() also works with non-csv text files, as long as the data is organised with separate columns for track id, time index, and coordinates. See the documentation at ?read.tracks.csv for details.

2 The tracks object

2.1 The tracks object data structure

The tracks object is a special data structure that allows efficient handling of track datasets. As an example, we will use the tracks loaded in the previous section.

A tracks object has the form of a list, where each element of the list is a track of a single cell:

# Structure of the tracks object
str( t, list.len = 3 )
## List of 22
##  $ 0 : num [1:11, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  $ 1 : num [1:11, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  $ 2 : num [1:39, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##   [list output truncated]
##  - attr(*, "class")= chr "tracks"
# This object is both a list and a "tracks" object
is.list( t )
## [1] TRUE
is.tracks( t )
## [1] TRUE
# The first element is the track of the first cell in the data:
head( t[[1]] )
##            t       x       y    z
## [1,]   0.000 132.521 118.692 8.75
## [2,]  27.781 133.909 118.700 8.75
## [3,]  55.484 131.763 118.129 6.25
## [4,]  83.296 133.161 117.903 6.25
## [5,] 111.093 131.530 117.894 6.25
## [6,] 138.906 132.229 117.665 6.25

Each track in the tracks object is a matrix with coordinates at different timepoints for each cell. The cell id is no longer a column in this matrix, as tracks belonging to different cells are stored in different elements of the tracks object list.
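
To make the relation between the table and the list form concrete, here is a minimal base-R sketch that splits a small made-up table into one matrix per cell. The names `d.toy`, `pieces`, and `track.list` are hypothetical, and the result is a plain list: it only mimics the layout of a tracks object and does not carry the tracks class.

```r
# Toy table with one row per measured point (made-up values)
d.toy <- data.frame(
  id = c(0, 0, 0, 1, 1),
  t  = c(0, 27.8, 55.5, 0, 27.8),
  x  = c(132.5, 133.9, 131.8, 113.7, 114.1),
  y  = c(118.7, 118.7, 118.1, 125.0, 125.0)
)

# Split the rows by cell id, keeping only the time and coordinate columns,
# then turn each piece into a numeric matrix
pieces <- split(d.toy[, c("t", "x", "y")], d.toy$id)
track.list <- lapply(pieces, as.matrix)

names(track.list)        # the cell ids, now stored as list names
dim(track.list[["0"]])   # timepoints x columns for the first cell
```

The cell id disappears from the matrices and survives only as the name of each list element, which is the same arrangement as in the str() output above.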

2.2 Subsetting data

We can extract the matrix of an individual track using double square brackets:

# Get the first track
t1 <- t[[1]]
str(t1)
##  num [1:11, 1:4] 0 27.8 55.5 83.3 111.1 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : NULL
##   ..$ : chr [1:4] "t" "x" "y" "z"
# This is no longer a tracks object, but a matrix
is.tracks( t1 )
## [1] FALSE
is.matrix( t1 )
## [1] TRUE

If we now want to plot this track, the plotting method for tracks will not work because the matrix is not recognized as a tracks object. We can use the function wrapTrack() to “pack” this matrix back into a tracks object:

par( mfrow=c(1,2) )
plot( t1, main = "Plotting matrix directly" )
plot( wrapTrack( t1 ), main = "After using wrapTrack()" )

Note that we can also achieve this by subsetting with single instead of double brackets:

# Get the first track
t1b <- t[1]
str(t1b)
## List of 1
##  $ 0: num [1:11, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  - attr(*, "class")= chr "tracks"
# This remains a tracks object
is.tracks( t1b )
## [1] TRUE

In the same way, we can also subset multiple tracks at once:

# Get the first and the third track
t13 <- t[c(1,3)]
str(t13)
## List of 2
##  $ 0: num [1:11, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  $ 2: num [1:39, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  - attr(*, "class")= chr "tracks"

Note that the track ids start from 0, so getting the first and third track actually yields the tracks with ids 0 and 2. If we want the ones with ids 1 and 3, we can subset using the track name as a character string:

# Get tracks with ids 1 and 3
t13b <- t[c("1","3")]
str(t13b)
## List of 2
##  $ 1: num [1:11, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  $ 3: num [1:39, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  - attr(*, "class")= chr "tracks"
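
The difference between single and double brackets, and between subsetting by position and by name, is ordinary R list behaviour. A minimal base-R sketch on a hypothetical plain list `toy` of made-up matrices (no tracks class involved) shows the same pattern:

```r
# A plain list of toy track matrices, named by id starting from 0 (made-up values)
toy <- list(
  "0" = cbind(t = c(0, 10, 20), x = c(0, 1, 2)),
  "1" = cbind(t = c(0, 10),     x = c(5, 6)),
  "2" = cbind(t = c(0, 10),     x = c(8, 9))
)

# Double brackets extract the element itself: a matrix
first.matrix <- toy[[1]]
is.matrix(first.matrix)   # TRUE

# Single brackets return a shorter list that still wraps the matrix
first.list <- toy[1]
is.list(first.list)       # TRUE
names(first.list)         # "0"

# Position 2 and name "2" refer to different elements
names(toy[2])             # "1"
names(toy["2"])           # "2"
```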

2.3 Using tracks objects in combination with R's lapply and sapply

Because tracks objects are lists, we can make use of R's lapply() and sapply() functions to compute metrics or manipulate tracks efficiently.

For example, if we want to compute the speed of each track, we simply use:

speeds <- sapply( t, speed )
speeds
##          0          1          2          3          4          5          6 
## 0.04111888 0.16407805 0.14649521 0.12828778 0.08589151 0.26304168 0.18699222 
##          7          8          9         10         11         12         13 
## 0.34978184 0.12357669 0.24119949 0.22368373 0.20173171 0.18841842 0.11007267 
##         14         15         16         17         18         19         20 
## 0.08669212 0.17835775 0.25081840 0.15948551 0.24686951 0.18119184 0.16442000 
##         21 
## 0.22800033

Note that sapply() applies the speed() function to each matrix in the track list (analogous to subsetting with double brackets). Thus, the speed() function sees an individual track matrix, not a tracks object.
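
Any function that accepts a single track matrix can be used in this way. The following base-R sketch defines a hypothetical metric `pathSpeed()` (total path length divided by track duration) and applies it to a made-up list of matrices. It is meant only to illustrate the calling pattern and is not claimed to match the package's speed() definition.

```r
# Two toy track matrices with columns t, x, y (made-up values)
toy <- list(
  "0" = cbind(t = c(0, 10, 20), x = c(0, 3, 3), y = c(0, 4, 4)),
  "1" = cbind(t = c(0, 10),     x = c(0, 6),    y = c(0, 8))
)

# Hypothetical metric: total path length divided by track duration.
# m is a single track matrix, which is what sapply() passes in.
pathSpeed <- function(m) {
  pos <- m[, -1, drop = FALSE]                 # coordinates without the time column
  steps <- diff(pos)                           # displacement of each step
  path.length <- sum(sqrt(rowSums(steps^2)))   # summed step lengths
  duration <- diff(range(m[, "t"]))            # last minus first timepoint
  path.length / duration
}

toy.speeds <- sapply(toy, pathSpeed)
toy.speeds
```

The first toy track covers a path of length 5 in 20 time units and the second a path of length 10 in 10 time units, so the result is 0.25 and 1, named by track id.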

We can also use lapply() to manipulate each track in the dataset with a custom function, keeping separate tracks as separate list elements. For example, suppose we wish to remove all data after a given timepoint:

# Function to remove all data after given timepoint
# x must be a single track matrix, which is what this function will
# receive from lapply
removeAfterT <- function( x, time.cutoff ){

  # Filter out later timepoints
  # (drop = FALSE keeps the result a matrix even if only one row is left)
  x2 <- x[ x[,"t"] <= time.cutoff, , drop = FALSE ]

  # Return the new matrix, or NULL if no timepoints remain at or before the cutoff
  if( nrow(x2) == 0 ){
    return(NULL)
  } else {
    return(x2)
  }
}

# Call function on each track using lapply
filtered.t <- lapply( t, function(x) removeAfterT( x, 200 ) )

# Remove any tracks where NULL was returned
filtered.t <- filtered.t[ !sapply( filtered.t, is.null )]
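
One detail matters for row filters like the one above: when a single row is selected, R simplifies the result to a plain vector unless `drop = FALSE` is given, and nrow() of a vector is NULL. A short base-R sketch with a made-up matrix `m` illustrates the difference:

```r
# Toy track matrix with three timepoints (made-up values)
m <- cbind(t = c(0, 100, 300), x = c(1, 2, 3), y = c(4, 5, 6))

# Only one row satisfies the condition
one.row.dropped <- m[m[, "t"] <= 50, ]                 # simplified to a vector
one.row.kept    <- m[m[, "t"] <= 50, , drop = FALSE]   # still a 1 x 3 matrix

is.matrix(one.row.dropped)   # FALSE
nrow(one.row.dropped)        # NULL, so a test like nrow(...) == 0 would fail
is.matrix(one.row.kept)      # TRUE
nrow(one.row.kept)           # 1

# No row satisfies the condition: the result is a matrix with zero rows
no.rows <- m[m[, "t"] < 0, , drop = FALSE]
nrow(no.rows)                # 0
```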

Note that lapply() returns a list but not a tracks object:

str(filtered.t, list.len = 3 )
## List of 12
##  $ 0 : num [1:8, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  $ 1 : num [1:8, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  $ 2 : num [1:8, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##   [list output truncated]
is.list( filtered.t )
## [1] TRUE
is.tracks( filtered.t )
## [1] FALSE

Fix this by calling as.tracks() and plot the result:

filtered.t <- as.tracks( filtered.t )
is.tracks( filtered.t )
## [1] TRUE
par(mfrow=c(1,2))
plot( t, main = "Unfiltered data")
plot( filtered.t, main = "Filtered on timepoints <= 200" )

2.4 Built-in filtering/subsetting functions

The package contains several built-in functions to filter and subset tracks.

The function filterTracks() can be used to select tracks with a certain property. For example, to select all tracks with at least 15 steps (16 datapoints):

# The filtering function must return TRUE or FALSE for each track given to it
my.filter <- function(x){
  return( nrow(x) > 15 )
}

# Filter with this function using filterTracks
long.tracks <- filterTracks( my.filter, t )

# Plot the result
par(mfrow=c(1,2))
plot( t, main = "All tracks")
plot( long.tracks, main = "Long tracks only" )
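
What such a predicate-based filter does can be pictured with base R's Filter(). This is only an analogy on a made-up plain list, not the package's implementation, and the helper names `makeToyTrack` and `hasTwoSteps` are hypothetical:

```r
# Toy list of track matrices with 3, 2, and 5 positions (made-up values)
makeToyTrack <- function(n) cbind(t = seq_len(n), x = seq_len(n), y = 0)
toy <- list("0" = makeToyTrack(3), "1" = makeToyTrack(2), "2" = makeToyTrack(5))

# Predicate: TRUE for tracks with at least 2 steps (3 positions)
hasTwoSteps <- function(m) nrow(m) > 2

# Keep only the tracks for which the predicate returns TRUE
kept <- Filter(hasTwoSteps, toy)
names(kept)   # "0" "2"
```

Unlike the package function, Filter() returns a plain list, so the result would still need as.tracks() before it can be used as a tracks object.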

The function selectTracks() selects tracks based on upper and lower bounds of a certain measure. For example, we can get the fastest half of the T cells:

# Select the tracks whose speed is at least the median, using selectTracks
median.speed <- median( sapply( t, speed ) )
fast.tracks <- selectTracks( t, speed, median.speed, Inf )

# Plot the result
par(mfrow=c(1,2))
plot( t, main = "All tracks")
plot( fast.tracks, main = "Fastest half" )
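
The idea of selecting by lower and upper bounds can be sketched in base R on a named vector of per-track values. The numbers below are made up, and treating both bounds as inclusive is an assumption of this sketch rather than a statement about the package:

```r
# Hypothetical per-track measure values, named by track id (made-up numbers)
toy.speeds <- c("0" = 0.04, "1" = 0.16, "2" = 0.15, "3" = 0.13)

# Keep the ids whose value lies between the bounds (assumed inclusive here)
lower <- median(toy.speeds)
upper <- Inf
selected <- names(toy.speeds)[toy.speeds >= lower & toy.speeds <= upper]
selected
```

With these toy values the median lies between 0.13 and 0.15, so the ids "1" and "2" are selected.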

Using the function subsample(), we can adjust the time resolution of the data by keeping e.g. only every \(k^{th}\) timepoint:

# Lower resolution
lower.res <- subsample( t, k = 2 )

# Plot the result
par(mfrow=c(1,2))
plot( t, main = "Original data")
plot( lower.res, main = "Lower resolution" )
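
On a single track matrix, keeping every \(k^{th}\) timepoint amounts to selecting a regular sequence of rows. A minimal base-R sketch with a made-up matrix `m`, assuming that counting starts at the first position:

```r
# Toy track with seven positions (made-up values)
m <- cbind(t = seq(0, 60, by = 10), x = 0:6, y = 0)

# Keep every k-th position, starting from the first
k <- 2
keep <- seq(1, nrow(m), by = k)
m.low <- m[keep, , drop = FALSE]

m.low[, "t"]   # 0 20 40 60
```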

2.5 Extracting subtracks

The package also contains functions to extract parts of tracks. For example, use subtracks() to extract subtracks of a given length:

subtrack.nsteps <- 2
t.2steps <- subtracks( t, subtrack.nsteps )
str( t.2steps, list.len = 3 )
## List of 337
##  $ 0.1  : num [1:3, 1:4] 0 27.8 55.5 132.5 133.9 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  $ 0.2  : num [1:3, 1:4] 27.8 55.5 83.3 133.9 131.8 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  $ 0.3  : num [1:3, 1:4] 55.5 83.3 111.1 131.8 133.2 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##   [list output truncated]
##  - attr(*, "class")= chr "tracks"

Note that these subtracks overlap:

# Last step of the first subtrack and first step of the second are equal
t.2steps[c(1,2)]
## $`0.1`
##           t       x       y    z
## [1,]  0.000 132.521 118.692 8.75
## [2,] 27.781 133.909 118.700 8.75
## [3,] 55.484 131.763 118.129 6.25
## 
## $`0.2`
##           t       x       y    z
## [1,] 27.781 133.909 118.700 8.75
## [2,] 55.484 131.763 118.129 6.25
## [3,] 83.296 133.161 117.903 6.25
## 
## attr(,"class")
## [1] "tracks"

We can prevent this by adjusting the overlap argument to 0, or even to negative values so that space is left between the subtracks:

t.2steps.b <- subtracks( t, subtrack.nsteps, overlap = 0 )

# No longer any overlapping steps: consecutive subtracks share only an endpoint
t.2steps.b[c(1,2)]
## $`0.1`
##           t       x       y    z
## [1,]  0.000 132.521 118.692 8.75
## [2,] 27.781 133.909 118.700 8.75
## [3,] 55.484 131.763 118.129 6.25
## 
## $`0.3`
##            t       x       y    z
## [1,]  55.484 131.763 118.129 6.25
## [2,]  83.296 133.161 117.903 6.25
## [3,] 111.093 131.530 117.894 6.25
## 
## attr(,"class")
## [1] "tracks"
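
The two outputs above differ only in how far the window moves between consecutive subtracks. The following base-R sketch reproduces that pattern on one made-up track matrix. The helper `slidingWindows()` and its `shift` parameter are hypothetical and are not the package's `overlap` argument; the sketch also assumes the track has at least `n.steps + 1` positions.

```r
# Toy track with six positions, i.e. five steps (made-up values)
m <- cbind(t = seq(0, 50, by = 10), x = 0:5, y = 0)

# Hypothetical helper: cut a track matrix into windows of n.steps steps,
# moving the window start forward by `shift` positions each time.
# Assumes nrow(m) >= n.steps + 1.
slidingWindows <- function(m, n.steps, shift) {
  starts <- seq(1, nrow(m) - n.steps, by = shift)
  lapply(starts, function(s) m[s:(s + n.steps), , drop = FALSE])
}

# Shift of 1: consecutive windows share a step
w1 <- slidingWindows(m, n.steps = 2, shift = 1)
# Shift of 2: consecutive windows share only one position, no step
w2 <- slidingWindows(m, n.steps = 2, shift = 2)

length(w1)   # 4 windows, starting at positions 1, 2, 3, 4
length(w2)   # 2 windows, starting at positions 1 and 3
```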

An alternative to subtracks() is prefixes(), which returns only the first subtrack of a given length from each track:

t.prefixes <- prefixes( t, subtrack.nsteps )

# these subtracks come from different cells
t.prefixes[c(1,2)]
## $`0`
##           t       x       y    z
## [1,]  0.000 132.521 118.692 8.75
## [2,] 27.781 133.909 118.700 8.75
## [3,] 55.484 131.763 118.129 6.25
## 
## $`1`
##           t       x       y     z
## [1,]  0.000 113.708 124.957 41.25
## [2,] 27.781 114.127 124.959 41.25
## [3,] 55.484 120.741 123.121 43.75

If we want to extract subtracks starting at a specific timepoint, use subtracksByTime():

# Check which timepoints occur in the dataset
tp <- timePoints(t)
tp
##  [1]    0.000   27.781   55.484   83.296  111.093  138.906  166.656  194.406
##  [9]  222.078  249.890  277.703  305.531  333.343  361.250  388.921  416.640
## [17]  444.484  472.250  500.093  528.109  555.828  583.546  611.343  639.078
## [25]  666.843  694.562  722.421  750.171  778.156  805.843  833.687  861.484
## [33]  889.296  917.046  944.781  972.625 1000.440 1028.270 1056.110
# Extract all subtracks starting from the third timepoint
t.sbytime <- subtracksByTime( t, tp[3], subtrack.nsteps )

t.sbytime[c(1,2)]
## $`0`
##            t       x       y    z
## [1,]  55.484 131.763 118.129 6.25
## [2,]  83.296 133.161 117.903 6.25
## [3,] 111.093 131.530 117.894 6.25
## 
## $`1`
##            t       x       y     z
## [1,]  55.484 120.741 123.121 43.75
## [2,]  83.296 122.817 124.800 43.75
## [3,] 111.093 128.066 124.209 46.25
## 
## attr(,"class")
## [1] "tracks"

3 Converting between tracks objects and other data structures

We can convert between tracks, regular R lists, and dataframes using as.tracks(), as.list(), or as.data.frame():

# Original tracks object
str( t, list.len = 3 )
## List of 22
##  $ 0 : num [1:11, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  $ 1 : num [1:11, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  $ 2 : num [1:39, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##   [list output truncated]
##  - attr(*, "class")= chr "tracks"
# Converted to dataframe
t.df <- as.data.frame(t)
str( t.df )
## 'data.frame':    381 obs. of  5 variables:
##  $ id: Factor w/ 22 levels "0","1","10","11",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ t : num  0 27.8 55.5 83.3 111.1 ...
##  $ x : num  133 134 132 133 132 ...
##  $ y : num  119 119 118 118 118 ...
##  $ z : num  8.75 8.75 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 ...
# Converted to list (note class at the bottom)
t.list <- as.list(t)
str( t.list, list.len = 3 )
## List of 22
##  $ 0 : num [1:11, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  $ 1 : num [1:11, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  $ 2 : num [1:39, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##   [list output truncated]
##  - attr(*, "class")= chr "list"
# Convert list back to tracks
str( as.tracks( t.list ), list.len = 3 )
## List of 22
##  $ 0 : num [1:11, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  $ 1 : num [1:11, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  $ 2 : num [1:39, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##   [list output truncated]
##  - attr(*, "class")= chr "tracks"
# Convert dataframe to tracks
str( as.tracks( t.df ), list.len = 3 )
## List of 22
##  $ 0 : num [1:11, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  $ 1 : num [1:11, 1:4] 0 27.8 55.5 83.3 111.1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##  $ 10: num [1:12, 1:4] 55.5 83.3 111.1 138.9 166.7 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "t" "x" "y" "z"
##   [list output truncated]
##  - attr(*, "class")= chr "tracks"

Note that the method as.tracks.data.frame() takes the arguments id.column, time.column, and pos.columns to specify where the information is stored, just like read.tracks.csv().
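
In the round trip through a dataframe above, the third element is track 10 rather than track 2, because the ids are stored as a factor whose levels are sorted as strings. A base-R sketch on a made-up plain list `toy` (not a tracks object) shows the same reordering; the conversion code here is an illustration and not the package's implementation:

```r
# Toy list of track matrices; ids chosen so that string and numeric order differ
toy <- list(
  "0"  = cbind(t = c(0, 10), x = c(1, 2), y = c(0, 0)),
  "2"  = cbind(t = c(0, 10), x = c(3, 4), y = c(0, 0)),
  "10" = cbind(t = c(0, 10), x = c(5, 6), y = c(0, 0))
)

# List -> data frame: stack the matrices and add an id column
toy.df <- do.call(rbind, lapply(names(toy), function(id) {
  data.frame(id = id, toy[[id]], stringsAsFactors = FALSE)
}))

# Data frame -> list: split by id and convert each piece back to a matrix
back <- lapply(split(toy.df[, c("t", "x", "y")], toy.df$id), as.matrix)

names(toy)    # "0" "2" "10"
names(back)   # "0" "10" "2": split() orders the ids as sorted strings
```

Subsetting by name, as in t[c("1","3")], is unaffected by this reordering, whereas subsetting by position is not.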

For help, see ?as.list.tracks, ?as.data.frame.tracks, ?as.tracks.data.frame, or ?as.tracks.list.