Getting Started with rENA

Cody L Marquart

2019-12-17

Getting Started

To get started, we load the rENA library. If it is not already installed, you will need to run install.packages('rENA') before loading the library. These examples all use the RS.data data set that is packaged with rENA, which we load first.

library(rENA)
data(RS.data)

Identify Columns To Accumulate

Before running the ena.accumulate.data function, we first need to identify which columns of the data to use for our units, conversation, and codes. There is also an optional metadata parameter for unit-specific data that we wish to carry through the accumulation process and keep associated with the identified units. The accumulation process requires an individual data frame for each of these parameters, so we will subset RS.data by the identified columns and preview each result.


units = RS.data[,c("Condition","UserName")]
head(units)
#>   Condition    UserName
#> 1 FirstGame    steven z
#> 2 FirstGame     akash v
#> 3 FirstGame alexander b
#> 4 FirstGame   brandon l
#> 5 FirstGame   brandon l
#> 6 FirstGame christian x

conversation = RS.data[,c("Condition","GroupName")]
head(conversation)
#>   Condition GroupName
#> 1 FirstGame  Electric
#> 2 FirstGame  Electric
#> 3 FirstGame  Electric
#> 4 FirstGame  Electric
#> 5 FirstGame  Electric
#> 6 FirstGame  Electric

codeCols = c(
  'Data','Technical.Constraints','Performance.Parameters',
  'Client.and.Consultant.Requests','Design.Reasoning','Collaboration'
)
codes = RS.data[,codeCols]
head(codes)
#>   Data Technical.Constraints Performance.Parameters
#> 1    0                     0                      0
#> 2    0                     0                      0
#> 3    0                     0                      0
#> 4    0                     0                      0
#> 5    0                     0                      0
#> 6    0                     0                      0
#>   Client.and.Consultant.Requests Design.Reasoning Collaboration
#> 1                              0                0             0
#> 2                              0                0             0
#> 3                              0                0             0
#> 4                              0                0             0
#> 5                              0                0             0
#> 6                              0                0             0

# optional
meta = RS.data[,c("CONFIDENCE.Change",
                  "CONFIDENCE.Pre","CONFIDENCE.Post","C.Change")]
head(meta)
#>   CONFIDENCE.Change CONFIDENCE.Pre CONFIDENCE.Post   C.Change
#> 1                 1              7               8 Pos.Change
#> 2                 2              6               8 Pos.Change
#> 3                 1              5               7 Pos.Change
#> 4                 1              5               6 Pos.Change
#> 5                 1              5               6 Pos.Change
#> 6                 0              4               4 Neg.Change
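
Before subsetting, it can help to confirm that every column name we plan to use actually exists in the data. The sketch below uses a small made-up data frame (toy.data) and a hypothetical helper (check.columns); neither is part of rENA, and the same check could be applied to RS.data.

```r
# Toy stand-in for a coded data set; not part of rENA.
toy.data = data.frame(
  Condition = c("FirstGame", "FirstGame", "SecondGame"),
  UserName  = c("a", "b", "c"),
  Data      = c(0, 1, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns the requested column names that df lacks.
check.columns = function(df, cols) {
  setdiff(cols, colnames(df))
}

missing.cols = check.columns(toy.data, c("Condition", "UserName", "Collaboration"))
if (length(missing.cols) > 0) {
  message("Missing columns: ", paste(missing.cols, collapse = ", "))
}
```

An empty result means all requested columns are present and the subsetting calls will not fail on an undefined column.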

Run the ENA Accumulation

With the data identified and subset, the accumulation and set generation are both fairly straightforward. We pass along our subset data frames and set the size of the stanza window with window.size.back. In this case we will use a window size of 4, meaning we look for co-occurrences of our codes within the referent line and the preceding 3 lines.

accum = ena.accumulate.data(
  units = units,
  conversation = conversation,
  codes = codes,
  metadata = meta,
  window.size.back = 4
)

### adjacency.vectors: Each Unit's Co-Occurrence Accumulation
#head(accum$adjacency.vectors)

### adjacency.matrix: Columns representing co-occurred 
### codes in the adjacency.vector
#head(accum$adjacency.matrix)
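
To make the window idea concrete, here is a simplified base-R illustration of a moving window of size 4 over toy data. It is not rENA's implementation: the toy.codes matrix, the cooccur helper, and the counting rule (a pair counts when one code is in the referent line and the other is anywhere in the window) are assumptions made for this sketch.

```r
# Toy coded lines from a single conversation (rows are lines, columns are codes).
toy.codes = matrix(
  c(1, 0, 0,
    0, 1, 0,
    0, 0, 0,
    0, 0, 1,
    1, 0, 0),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("A", "B", "C"))
)

window.size = 4  # referent line plus the 3 preceding lines

# For each referent line, flag code pairs where one code is in the referent
# line and the other appears anywhere in the window.
cooccur = function(codes, window.size) {
  pairs = combn(colnames(codes), 2)
  out = matrix(0, nrow = nrow(codes), ncol = ncol(pairs),
               dimnames = list(NULL, paste(pairs[1, ], pairs[2, ], sep = " & ")))
  for (i in seq_len(nrow(codes))) {
    start = max(1, i - window.size + 1)
    in.window = colSums(codes[start:i, , drop = FALSE]) > 0
    in.referent = codes[i, ] > 0
    for (p in seq_len(ncol(pairs))) {
      a = pairs[1, p]
      b = pairs[2, p]
      out[i, p] = as.numeric((in.referent[a] && in.window[b]) ||
                             (in.referent[b] && in.window[a]))
    }
  }
  out
}

toy.cooccur = cooccur(toy.codes, window.size)
toy.cooccur
colSums(toy.cooccur)
```

In this toy data no single line carries two codes, so a window of size 1 would find no co-occurrences at all; every connection found here comes from the preceding lines in the window.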

Generate the ENA Set

The most basic form of an ENAset can be generated by passing the result of the ena.accumulate.data function to ena.make.set.

set = ena.make.set(
  enadata = accum
)

### The location in space for each unit.  Units are rows, columns are each 
### dimension in the high-dimensional space.
#head(set$points.rotated)

### The position of each code in the high-dimensional space
#head(set$node.positions)

### The weight of each connection. Units are rows, columns are the co-occurrences
#head(set$line.weights)

Let’s Plot

Now that we have generated an ENAset, we can think about plotting. To do so, we first need to decide what we want to plot or compare. In this example, we will look at two specific groups, defined in RS.data by the Condition column.

Plot Units In Each Group

We will start by plotting the units for each condition in a different color. Using the rotated points stored in the set (set$points in the code below), we will subset the rows that belong to each condition and plot each group of units individually, on the same plot, in its own color.

### Subset rotated points for the first condition
first.game.points = as.matrix(set$points$Condition$FirstGame)

### Subset rotated points for the second condition
second.game.points = as.matrix(set$points$Condition$SecondGame)

plot = ena.plot(set, scale.to = "network", title = "Groups of Units")
plot = ena.plot.points(plot, points = first.game.points, confidence.interval = "box", colors = c("red"))
plot = ena.plot.points(plot, points = second.game.points, confidence.interval = "box", colors = c("blue"))
plot$plot

Plotting Means

The means of a group of units can be plotted with the ena.plot.group function. When we pass it the same set of points used above, the function calculates the mean to plot. In this case we will set the color to match the units that we already plotted and add a corresponding confidence interval.

### Start a new plot with fixed axis limits, then plot the means 
### alongside their corresponding units.
plot = ena.plot(set, scale.to = list(x=-1:1, y=-1:1), title = "Groups and Means")
plot = ena.plot.points(plot, points = first.game.points, 
                       confidence.interval = "box", colors = c("red"))
plot = ena.plot.points(plot, points = second.game.points, 
                       confidence.interval = "box", colors = c("blue"))
plot = ena.plot.group(plot, point = first.game.points, 
                      colors = c("red"), confidence.interval = "box")
plot = ena.plot.group(plot, point = second.game.points, 
                      colors = c("blue"), confidence.interval = "box")
plot$plot
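
For intuition about what a group mean with an interval summarizes, the base-R sketch below computes the mean of some made-up points and a t-based 95% confidence interval for each dimension. The toy.points values are invented, and the interval method is an assumption for illustration; rENA's own interval computation may differ.

```r
# Made-up unit positions on two dimensions; not taken from RS.data.
toy.points = cbind(
  x = c(0.10, 0.25, 0.18, 0.30, 0.22),
  y = c(-0.05, -0.12, -0.20, -0.08, -0.15)
)

# The group mean is the column-wise mean of the unit positions.
group.mean = colMeans(toy.points)

# A 95% t-based confidence interval for each dimension.
ci = apply(toy.points, 2, function(v) t.test(v)$conf.int)
rownames(ci) = c("lower", "upper")

group.mean
ci
```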

Plotting a Network

To plot a network, we will use the ena.plot.network function. In addition to the standard first plot parameter, this function requires one parameter, network, which is a numeric vector of line weights.

The line weights will come from set$line.weights, which we will subset by our two groups as we did while plotting the unit points.

### Subset line weights for FirstGame
first.game.lineweights = as.matrix(set$line.weights$Condition$FirstGame)

### Subset line weights for SecondGame
second.game.lineweights = as.matrix(set$line.weights$Condition$SecondGame)

### Calculate the column means of each group's line weights
first.game.mean = as.vector(colMeans(first.game.lineweights))
second.game.mean = as.vector(colMeans(second.game.lineweights))

### Subtract the two sets of means, resulting in a vector with negative values
### indicating a stronger connection in SecondGame, and positive values
### indicating a stronger connection in FirstGame
subtracted.mean = first.game.mean - second.game.mean

# View the first 5 elements to see the subtraction
head(first.game.mean, 5)
#> [1] 0.40328303 0.27885635 0.34790907 0.08888453 0.10970647
head(second.game.mean, 5)
#> [1] 0.32637678 0.31983831 0.32743531 0.08801366 0.07239016
head(subtracted.mean, 5)
#> [1]  0.0769062510 -0.0409819566  0.0204737523  0.0008708646  0.0373163091

# Plot each group's mean network, then the subtracted network
plot.first = ena.plot(set, title = "FirstGame")
plot.first = ena.plot.network(plot.first, network = first.game.mean)
plot.first$plot

plot.second = ena.plot(set, title = "SecondGame")
plot.second = ena.plot.network(plot.second, network = second.game.mean, colors = c("blue"))
plot.second$plot

plot.sub = ena.plot(set, title = "Subtracted")
plot.sub = ena.plot.network(plot.sub, network = subtracted.mean)
plot.sub$plot
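
The sign convention of the subtraction can be checked on a few made-up values. In the sketch below, first.toy and second.toy are invented mean line weights for three hypothetical connections, not output from rENA.

```r
# Invented mean line weights for three hypothetical connections.
first.toy  = c("A & B" = 0.40, "A & C" = 0.28, "B & C" = 0.35)
second.toy = c("A & B" = 0.33, "A & C" = 0.32, "B & C" = 0.33)

# Same order of subtraction as above: first minus second.
diff.toy = first.toy - second.toy

# Positive values mean the connection is stronger in the first group,
# negative values mean it is stronger in the second group.
stronger.in = ifelse(diff.toy > 0, "first",
                     ifelse(diff.toy < 0, "second", "equal"))

data.frame(difference = diff.toy, stronger.in = stronger.in)
```

Reversing the order of subtraction flips every sign, so it is worth keeping the order consistent with how the plot is described.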

Plot Everything Together

Now we can combine everything, and the magrittr library makes this a bit easier. Its forward-pipe operator %>% passes the result of one function as the first parameter to the next function in the chain. This removes the need to supply the plot parameter to each call.
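
As a small illustration of how the pipe replaces nested calls, the sketch below uses two hypothetical stand-in functions, new.plot and add.layer. They are not rENA functions; they only mimic the pattern of taking a plot as the first parameter and returning it with something added.

```r
library(magrittr)

# Hypothetical stand-ins for plotting steps: each takes the "plot" as its
# first parameter and returns it with one more layer recorded.
new.plot = function(title) list(title = title, layers = character(0))
add.layer = function(plot, name) {
  plot$layers = c(plot$layers, name)
  plot
}

# Nested calls: read from the inside out.
nested = add.layer(add.layer(new.plot("demo"), "points"), "network")

# The same steps with the forward pipe: read from top to bottom.
piped = new.plot("demo") %>%
  add.layer("points") %>%
  add.layer("network")

identical(nested, piped)
```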

library(magrittr)
library(scales)

# Scale the points to match the range of the network nodes, for better viewing
point.max = max(first.game.points, second.game.points)
first.game.scaled = scales::rescale(first.game.points,
                                    c(0,max(as.matrix(set$rotation$nodes))), c(0,point.max))
second.game.scaled = scales::rescale(second.game.points,
                                     c(0,max(as.matrix(set$rotation$nodes))), c(0,point.max))

plot = ena.plot(set, title = "Plot with Units and Network", font.family = "Times") %>%
          ena.plot.points(points = first.game.scaled, colors = c("red")) %>%
          ena.plot.points(points = second.game.scaled, colors = c("blue")) %>%
          ena.plot.group(point = first.game.scaled, colors = c("red"),
                         confidence.interval = "box") %>%
          ena.plot.group(point = second.game.scaled, colors = c("blue"),
                         confidence.interval = "box") %>%
          ena.plot.network(network = subtracted.mean)

plot$plot
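
The rescaling step above can be hard to read at a glance. The base-R sketch below assumes that scales::rescale applies the usual linear mapping from the from range to the to range for numeric input; linear.rescale is a hypothetical helper written for this illustration, and toy.values is made up.

```r
# Hypothetical helper: linear mapping of x from the range `from` to the range `to`.
linear.rescale = function(x, to, from) {
  (x - from[1]) / diff(from) * diff(to) + to[1]
}

toy.values = c(-2, 0, 1, 4)

# Both ranges start at 0, as in the rescale calls above.
rescaled = linear.rescale(toy.values, to = c(0, 1), from = c(0, 4))
rescaled

# With both ranges starting at 0, the mapping is a plain multiplication.
all.equal(rescaled, toy.values * (1 / 4))
```

Under that assumption, the rescaling above multiplies every point by a single constant: the largest node position divided by the largest point value. Negative coordinates stay negative, and the relative layout of the units is preserved.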