ERGM equations

George G Vega Yon

2020-02-12

The likelihood of an Exponential Random Graph Model (ERGM) is defined as follows:

\[ \text{P}\left(\mathbf{Y} = \mathbf{y} \mid X = x\right) = \frac{\text{exp}\left\{{\theta}^{\text{t}}g\left(\mathbf{y}, x\right)\right\}}{\sum_{\mathbf{y}'\in\mathcal{Y}} \text{exp}\left\{{\theta}^{\text{t}}g\left(\mathbf{y}', x\right)\right\}},\quad \forall \mathbf{y}\in\mathcal{Y} \]

Where \(\mathbf{y}\in\mathcal{Y}\) is a random graph, \(X\) is a vector of attributes, \(\theta\) is a column-vector of length \(k\) (model parameters), and \(g\left(\cdot\right)\) is a function that returns a column-vector of sufficient statistics, also of length \(k\).
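To make the definition concrete, here is a minimal Python sketch (not part of ergmito, which is an R package) that computes these probabilities by brute force on 3-node directed graphs, assuming a hypothetical model with two sufficient statistics: edge count and number of mutual dyads. The parameter values are purely illustrative.

```python
# Brute-force ERGM probabilities on 3-node directed graphs.
# Hypothetical model: g(y) = (edge count, mutual-dyad count).
import itertools
import numpy as np

def stats(y):
    """Sufficient statistics g(y)."""
    edges = y.sum()
    mutual = sum(y[i, j] * y[j, i] for i in range(3) for j in range(i + 1, 3))
    return np.array([edges, mutual])

# Enumerate the support Y: all 2^6 = 64 arrangements of the off-diagonal cells.
cells = [(i, j) for i in range(3) for j in range(3) if i != j]
support = []
for bits in itertools.product([0, 1], repeat=len(cells)):
    y = np.zeros((3, 3), dtype=int)
    for (i, j), b in zip(cells, bits):
        y[i, j] = b
    support.append(y)

theta = np.array([-0.5, 1.0])  # illustrative parameter values

# P(Y = y) = exp(theta' g(y)) / sum_{y' in Y} exp(theta' g(y'))
num = np.exp([theta @ stats(y) for y in support])
probs = num / num.sum()

print(len(support), round(probs.sum(), 6))  # 64 1.0
```

Because the support is fully enumerated, the normalizing constant is exact; this is precisely the regime (small networks) that ergmito targets.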

In the case of ergmito, we usually look at a pooled model with \(n\) networks, i.e.

\[ \prod_{i \in N}\text{P}\left(\mathbf{Y} = \mathbf{y}_i \mid X = x_i\right) = \prod_{i \in N}\frac{\text{exp}\left\{{\theta}^{\text{t}}g\left(\mathbf{y}_i, x_i\right)\right\}}{\sum_{\mathbf{y}_i'\in\mathcal{Y}} \text{exp}\left\{{\theta}^{\text{t}}g\left(\mathbf{y}_i', x_i\right)\right\}},\quad \forall \mathbf{y}_i\in\mathcal{Y} \]

Where \(N\equiv\{1,\dots, n\}\) is a vector of indices.

Log-likelihood

In the case of a single network, the model’s log-likelihood is given by

\[ \text{log}\left(\text{P}\left(\cdot\right)\right) = {\theta}^{\text{t}}g\left(\mathbf{y}, x\right) - \text{log}\left(\sum_{\mathbf{y}'\in\mathcal{Y}}\text{exp}\left\{{\theta}^{\text{t}}g\left(\mathbf{y}', x\right)\right\}\right) \]

In general, we can reduce the computational complexity of this calculation by looking at the isomorphic sufficient statistics, that is, by grouping graphs according to the set of unique vectors of sufficient statistics:

\[ {\theta}^{\text{t}}g\left(\mathbf{y}, x\right) - \text{log}\left(\sum_{\mathbf{s}\in\mathcal{S}}\mathbf{w}_\mathbf{s}\text{exp}\left\{{\theta}^{\text{t}}\mathbf{s}\right\}\right) \]

Where \(\mathcal{S}\) is the support of the sufficient statistics under \(\mathcal{Y}\), \(\mathbf{s}\in\mathcal{S}\) is one of its realizations, and \(\mathbf{w}_\mathbf{s}\equiv|\left\{\mathbf{y}\in\mathcal{Y}: g\left(\mathbf{y},x\right) = \mathbf{s}\right\}|\) is the number of networks in \(\mathcal{Y}\) whose sufficient statistics equal \(\mathbf{s}\). Furthermore, we can write this in matrix form:

\[ {\theta}^{\text{t}}g\left(\mathbf{y}, x\right) - \text{log}\left(\mathbf{W}\times \text{exp}\left\{\mathbf{S}\times \theta\right\}\right) \]

Where \(\mathbf{W}\equiv\{\mathbf{w}_\mathbf{s}\}_{\mathbf{s}\in\mathcal{S}}\) is a row vector of length \(w \equiv |\mathcal{S}|\), and \(\mathbf{S}\) is a matrix of size \(w\times k\) whose rows are the unique vectors of sufficient statistics. The log-likelihood of a pooled model is simply the sum of the individual ones.
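The matrix form can be sketched as follows, continuing the hypothetical 3-node example with statistics (edge count, mutual dyads); the names `S`, `W`, and `g_obs` mirror \(\mathbf{S}\), \(\mathbf{W}\), and \(g(\mathbf{y}, x)\):

```python
# Matrix form of the log-likelihood: log P(y) = theta' g(y) - log(W exp(S theta)).
# Hypothetical 3-node model with g(y) = (edge count, mutual-dyad count).
import itertools
from collections import Counter
import numpy as np

def stats(y):
    edges = int(y.sum())
    mutual = sum(int(y[i, j] * y[j, i]) for i in range(3) for j in range(i + 1, 3))
    return (edges, mutual)

# Count how many of the 64 graphs share each statistic vector (the weights w_s).
cells = [(i, j) for i in range(3) for j in range(3) if i != j]
counts = Counter()
for bits in itertools.product([0, 1], repeat=len(cells)):
    y = np.zeros((3, 3), dtype=int)
    for (i, j), b in zip(cells, bits):
        y[i, j] = b
    counts[stats(y)] += 1

S = np.array(sorted(counts), dtype=float)              # unique vectors, w x k
W = np.array([counts[tuple(int(v) for v in s)] for s in S], dtype=float)

theta = np.array([-0.5, 1.0])
g_obs = np.array([3.0, 1.0])   # observed statistics of some graph y

loglik = theta @ g_obs - np.log(W @ np.exp(S @ theta))
print(int(W.sum()))  # 64: the weights account for every graph in Y
```

The sum inside the logarithm now runs over the unique statistic vectors rather than over all 64 graphs, which is where the computational savings come from.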

Gradient

The partial derivative of the log-likelihood with respect to the \(j\)-th parameter equals:

\[ \frac{\partial}{\partial\theta_j}\text{log}\left(\text{P}\left(\cdot\right)\right) = g\left(\mathbf{y}, x\right)_j - \frac{\sum_{\mathbf{s}\in\mathcal{S}}\mathbf{w}_\mathbf{s}\mathbf{s}_j\text{exp}\left\{{\theta}^{\text{t}}\mathbf{s}\right\}}{\sum_{\mathbf{s}\in\mathcal{S}}\mathbf{w}_\mathbf{s}\text{exp}\left\{{\theta}^{\text{t}}\mathbf{s}\right\}}, \quad\forall j \]

We can also write this using matrix notation as follows:

\[ \nabla\text{log}\left(\text{P}\left(\cdot\right)\right) = g\left(\mathbf{y}, x\right) - {\mathbf{S}}^{\text{t}}\times \left[ \mathbf{W}\circ \text{exp}\left\{\mathbf{S}\times \theta\right\} \right]/\lambda({\theta}) \]

Where \(\circ\) is the element-wise product and \(\lambda({\theta}) = \mathbf{W}\times \text{exp}\left\{\mathbf{S}\times \theta\right\}\).
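As a sanity check, the matrix form of the gradient can be coded and compared against a numerical derivative. This sketch reuses the hypothetical 3-node model (edge count and mutual dyads) described earlier:

```python
# Gradient of the ERGM log-likelihood in matrix form:
#   grad = g(y) - S' [W o exp(S theta)] / lambda(theta).
import itertools
from collections import Counter
import numpy as np

def stats(y):
    edges = int(y.sum())
    mutual = sum(int(y[i, j] * y[j, i]) for i in range(3) for j in range(i + 1, 3))
    return (edges, mutual)

cells = [(i, j) for i in range(3) for j in range(3) if i != j]
counts = Counter()
for bits in itertools.product([0, 1], repeat=len(cells)):
    y = np.zeros((3, 3), dtype=int)
    for (i, j), b in zip(cells, bits):
        y[i, j] = b
    counts[stats(y)] += 1

S = np.array(sorted(counts), dtype=float)
W = np.array([counts[tuple(int(v) for v in s)] for s in S], dtype=float)
g_obs = np.array([3.0, 1.0])  # observed statistics of some graph y

def loglik(theta):
    return theta @ g_obs - np.log(W @ np.exp(S @ theta))

def grad(theta):
    ew = W * np.exp(S @ theta)          # w_s exp(theta' s), length w
    return g_obs - S.T @ ew / ew.sum()  # ew.sum() is lambda(theta)

theta = np.array([-0.5, 1.0])
eps = 1e-6
numeric = np.array([(loglik(theta + eps * e) - loglik(theta - eps * e)) / (2 * eps)
                    for e in np.eye(2)])
print(np.allclose(grad(theta), numeric, atol=1e-5))  # True
```

The analytic and central-difference gradients agree to within finite-difference error, as expected.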

Hessian

In the case of the Hessian, each \((j, l)\) element, \(\frac{\partial^2}{\partial\theta_j\partial\theta_l}\text{log}\left(\text{P}\left(\cdot\right)\right)\), can be computed as:

\[ - \frac{\left(\sum_{\mathbf{s}\in\mathcal{S}}\mathbf{s}_j\mathbf{s}_l\mathbf{w}_\mathbf{s}\text{exp}\left\{{\theta}^{\text{t}} \mathbf{s}\right\}\right) \lambda(\theta) - \left(\sum_{\mathbf{s}\in\mathcal{S}}\mathbf{s}_j\mathbf{w}_\mathbf{s}\text{exp}\left\{{\theta}^{\text{t}} \mathbf{s}\right\}\right) \left(\sum_{\mathbf{s}\in\mathcal{S}}\mathbf{s}_l\mathbf{w}_\mathbf{s}\text{exp}\left\{{\theta}^{\text{t}} \mathbf{s}\right\}\right)}{\lambda(\theta)^2} \]

Where \(\mathbf{s}_j\) is the \(j\)-th element of the vector \(\mathbf{s}\). Once again, we can simplify this using matrix notation:

\[ -\frac{\mathbf{W}\times\left[\mathbf{S}_j\circ\mathbf{S}_l \circ \text{exp}\left\{\mathbf{S}\times\theta\right\}\right] \lambda(\theta) - \left(\mathbf{W}\times\left[\mathbf{S}_j \circ \text{exp}\left\{\mathbf{S}\times\theta\right\}\right]\right) \left(\mathbf{W}\times\left[\mathbf{S}_l \circ \text{exp}\left\{\mathbf{S}\times\theta\right\}\right]\right)}{\lambda(\theta)^2} \]

Where \(\mathbf{S}_j\) is the \(j\)-th column of the matrix \(\mathbf{S}\).
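The Hessian in matrix form can likewise be sketched and checked: for an exponential family it equals minus the covariance matrix of the sufficient statistics, so its eigenvalues should be non-positive. The same hypothetical 3-node model is assumed:

```python
# Hessian of the ERGM log-likelihood, element (j, l):
#   -(W[S_j o S_l o e] lambda - (W[S_j o e])(W[S_l o e])) / lambda^2,  e = exp(S theta).
import itertools
from collections import Counter
import numpy as np

def stats(y):
    edges = int(y.sum())
    mutual = sum(int(y[i, j] * y[j, i]) for i in range(3) for j in range(i + 1, 3))
    return (edges, mutual)

cells = [(i, j) for i in range(3) for j in range(3) if i != j]
counts = Counter()
for bits in itertools.product([0, 1], repeat=len(cells)):
    y = np.zeros((3, 3), dtype=int)
    for (i, j), b in zip(cells, bits):
        y[i, j] = b
    counts[stats(y)] += 1

S = np.array(sorted(counts), dtype=float)
W = np.array([counts[tuple(int(v) for v in s)] for s in S], dtype=float)

def hessian(theta):
    ew = W * np.exp(S @ theta)       # w_s exp(theta' s), length w
    lam = ew.sum()                   # lambda(theta)
    m1 = S.T @ ew                    # k-vector:  sum_s s_j w_s exp(theta' s)
    m2 = S.T @ (S * ew[:, None])     # k x k:     sum_s s_j s_l w_s exp(theta' s)
    return -(m2 * lam - np.outer(m1, m1)) / lam**2

H = hessian(np.array([-0.5, 1.0]))
# Minus a covariance matrix, so all eigenvalues should be <= 0.
print(np.all(np.linalg.eigvalsh(H) <= 1e-10))  # True
```

Computing all three quantities (log-likelihood, gradient, Hessian) from the same `W` and `S` is what makes exact Newton-type optimization feasible for small networks.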

References

Vega Yon, George, Andrew Slaughter, and Kayla de la Haye. 2019. “Exponential Random Graph Models for Little Networks.” arXiv preprint arXiv:1904.10406. https://arxiv.org/pdf/1904.10406.pdf.