By A. Goldenberg, A. X. Zheng, S. E. Fienberg and E. M. Airoldi

Presented by Wutao Wei

- Birth and death of nodes and edges

- Assumption: no edge is removed once it has been added
- Start with \(N\) nodes at time 0
- Add each possible edge to the network with probability \(p=E/{N \choose 2}\), where \(E\) is the expected number of edges
- Degree distribution: binomial, approximately Poisson when \(N\) is large
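
The edge-generation rule above can be sketched in a few lines. This is only an illustration: the function names `gnp_graph` and `degrees` and the values of `n` and `expected_edges` are assumptions made for the example, not part of the slides.

```python
import random
from itertools import combinations

def gnp_graph(n, p, seed=None):
    """Return the edge list of a random graph on nodes 0..n-1.

    Each of the C(n, 2) possible edges is included independently
    with probability p.
    """
    rng = random.Random(seed)
    return [(i, j) for i, j in combinations(range(n), 2) if rng.random() < p]

def degrees(n, edges):
    """Count how many edges touch each node."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg

n, expected_edges = 200, 600              # illustrative values
p = expected_edges / (n * (n - 1) / 2)    # p = E / C(n, 2)
edges = gnp_graph(n, p, seed=0)
deg = degrees(n, edges)
mean_degree = sum(deg) / n                # should be close to (n - 1) * p
```

Each node's degree is a sum of `n - 1` independent coin flips, which is where the binomial degree distribution comes from.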

- One major criticism of the ERGM: it is not scale-free and does not follow a power law
- PA model steps:
- Time 0: there are \(N_0\) unconnected nodes.
- Each subsequent time step: a new node is added with \(m\leq N_0\) edges
- The new node connects to \(m\) existing nodes, choosing node \(i\) with probability \(p_i = \frac{\theta_i}{\sum_{j}\theta_j}\)
- Result: *hubs*, *rich-get-richer*
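
A minimal sketch of the PA steps follows. Because the initial nodes are unconnected, the code assumes the attachment weight is degree plus one, so that every node can be chosen; this choice, the function name, and the parameter values are assumptions for illustration only.

```python
import random

def preferential_attachment(n0, m, steps, seed=None):
    """Grow a network by preferential attachment.

    Starts from n0 unconnected nodes; at each step a new node links to
    m distinct existing nodes, chosen with weight (degree + 1).
    The +1 is an assumption so that zero-degree nodes can be picked.
    """
    rng = random.Random(seed)
    degree = [0] * n0
    edges = []
    for _ in range(steps):
        new = len(degree)
        weights = [k + 1 for k in degree]
        targets = set()
        while len(targets) < m:            # redraw until m distinct targets
            targets.add(rng.choices(range(new), weights=weights)[0])
        degree.append(0)
        for t in targets:
            edges.append((new, t))
            degree[t] += 1
            degree[new] += 1
    return degree, edges

degree, edges = preferential_attachment(n0=3, m=2, steps=300, seed=1)
```

Sorting `degree` in decreasing order typically shows a few nodes with far more links than the rest, which is the hub effect listed above.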

- Power Law: \(P(k)\sim k^{-\gamma}\)
- \(P(k)\) is the fraction of nodes with \(k\) links: \(\frac{\text{nodes with } k \text{ links}}{\text{total nodes}}\)
- ERGM: Poisson
- Dorogovtsev and Mendes: an additional aging factor \((t-t_i)^{\nu}\), where \(t-t_i\) is the age of node \(i\) (with degree \(k_i\))

- Watts-Strogatz: begin with a ring lattice of \(N\) nodes and \(k\) edges per node
- Randomly rewire each edge with probability \(p\)
- As \(p\) goes from 0 to 1, the construction moves toward an ERGM
- The model is not dynamic
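
The construction can be sketched as below. The details are assumptions for the example: `k` is taken to be even and smaller than `n`, only the far endpoint of an edge is rewired, and self-loops and duplicate edges are avoided.

```python
import random

def watts_strogatz(n, k, p, seed=None):
    """Ring lattice on n nodes (k nearest neighbours each), then rewiring.

    Edges are stored as (u, v) tuples in a set. Each lattice edge is
    rewired with probability p by replacing v with a random node that
    is not u and not already linked to u.
    """
    rng = random.Random(seed)
    edges = set()
    for i in range(n):
        for step in range(1, k // 2 + 1):
            edges.add((i, (i + step) % n))
    for u, v in list(edges):               # iterate over the original lattice
        if rng.random() < p:
            candidates = [w for w in range(n)
                          if w != u and (u, w) not in edges and (w, u) not in edges]
            if candidates:
                edges.remove((u, v))
                edges.add((u, rng.choice(candidates)))
    return edges

lattice = watts_strogatz(20, 4, 0.0)          # p = 0: unchanged ring lattice
rewired = watts_strogatz(20, 4, 1.0, seed=2)  # p = 1: every edge rewired
```

The number of edges stays at `n * k / 2` for every `p`; only their placement changes.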

- random edges are added to a fixed grid
- probability of connection depends on the distance in the grid
- \(P(\text{x,y are connected})\sim d(x,y)^{-\alpha}\)
- Extension: Clauset and Moore limit the number of rewiring steps
- converges to a power law, where \(\alpha=\alpha_{rewired}\)
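
A small sketch of the distance-dependent rule \(d(x,y)^{-\alpha}\). It assumes a square grid with Manhattan distance and draws one long-range contact per call; the function name and these choices are illustrative, not taken from the slides.

```python
import random

def long_range_contact(x, side, alpha, rng):
    """Pick a long-range contact for grid node x = (row, col).

    Every other node y on the side x side grid is chosen with
    probability proportional to d(x, y) ** -alpha, where d is the
    Manhattan distance (an assumption made for this sketch).
    """
    nodes = [(r, c) for r in range(side) for c in range(side) if (r, c) != x]
    weights = [(abs(r - x[0]) + abs(c - x[1])) ** -alpha for r, c in nodes]
    return rng.choices(nodes, weights=weights)[0]

rng = random.Random(0)
near = long_range_contact((2, 2), 5, 50.0, rng)  # large alpha: almost surely a neighbour
anywhere = long_range_contact((2, 2), 5, 0.0, rng)  # alpha = 0: uniform over the grid
```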

- A different rewiring scheme
- Start with a ring of \(N\) nodes
- At each time step \(j=1,2,3,\dots\), choose a random start node \(x\) and a target node \(y\)
- Perform greedy routing from \(x\) to \(y\); denote the route by \(\{x,z_0,z_1,\dots,z_n,y\}\)
- Independently and with (small) probability \(p\), update the long-range link of each node \(z_i\) on the resulting path to point to \(y\)
- At stationarity, the distribution of distances spanned by long-range links is the theoretical optimum for search, and the expected search length is polylogarithmic
- Example: P2P networks
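
The rewiring loop can be sketched as follows, assuming each node has its two ring neighbours plus exactly one long-range link, and that greedy routing moves to whichever of these three is closest to the target in ring distance. The helper names and the initial random links are assumptions for the example.

```python
import random

def ring_dist(a, b, n):
    """Shortest distance between a and b around a ring of n nodes."""
    d = abs(a - b)
    return min(d, n - d)

def greedy_route(x, y, long_link, n):
    """Route from x to y, always stepping to the neighbour closest to y."""
    path = [x]
    cur = x
    while cur != y:
        neighbours = [(cur - 1) % n, (cur + 1) % n, long_link[cur]]
        cur = min(neighbours, key=lambda v: ring_dist(v, y, n))
        path.append(cur)
    return path

def rewire_step(long_link, n, p, rng):
    """One time step: route between random x and y, then rewire the path."""
    x, y = rng.sample(range(n), 2)
    path = greedy_route(x, y, long_link, n)
    for z in path[1:-1]:                   # intermediate nodes z_0..z_n only
        if rng.random() < p:
            long_link[z] = y
    return path

rng = random.Random(3)
n = 50
long_link = [rng.choice([v for v in range(n) if v != u]) for u in range(n)]
for _ in range(500):
    rewire_step(long_link, n, 0.1, rng)
```

Routing always terminates because one of the two ring neighbours is strictly closer to the target at every step.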

- Fully symmetric, and the expected degree is the same for all nodes
- Fix the degree-distribution parameters, or fix a distribution on some statistics
- Relationship with the \(p_1\) model:
- When \(\rho = 0\), look at the distribution of the minimal sufficient statistics
- Calculate \(\{\alpha_i\}\) and \(\{\beta_j\}\) using in-degrees and out-degrees

- Motivated by web hyperlinks
- Denote the graph at time t as \(G_t=(\mathcal{N}_t,\mathcal{E}_t)\)
- At time \(t+1\), add a node \(N\) to \(G_t\); it is connected to a prototype node \(m\), chosen uniformly from \(\mathcal{N}_t\)
- Then \(d\) out-links are added to node \(N\).
- For the \(i\)th out-link: with probability \(\alpha\), it connects to a node chosen uniformly from \(\mathcal{N}_t\)
- With probability \(1-\alpha\), it connects back to \(m\)
- extension: relate to distance with \(p^{-d(v,w)/2}\); mixture models

- Continuous-time Markov chain process (CMCP) + ERGM
- Markov condition: for any possible outcome \(\tilde{y}\in\mathcal{Y}\) and any pair of time points \(\{t_a < t_b \mid t_a,t_b\in \mathcal{T}\}\)
- \(Pr\{Y(t_b)=\tilde{y} \mid Y(t)=y(t), \forall t: t\leq t_a\}=Pr\{Y(t_b)=\tilde{y} \mid Y(t_a)=y(t_a)\}\)
- CMCP: \(Pr(t)=e^{tQ}\), where \(Q\) is the *intensity matrix* with elements \(q(y,\tilde{y})\)

- Independent arc model: \(q_{ij}(\mathbf y)=\lambda_{y_{ij}}\)
- The rate from 0 to 1 is \(\lambda_0\), and from 1 to 0 it is \(\lambda_1\); neither depends on other edges
- Reciprocity model: \(q_{ij}(\mathbf y)=\lambda_{y_{ij}} + \mu_{y_{ij}} y_{ji}\)
- Popularity model: \(q_{ij}(\mathbf y)=\lambda_{y_{ij}} + \pi_{y_{ij}} y_{+j}\)
- Expansiveness model: \(q_{ij}(\mathbf y)=\lambda_{y_{ij}} + \pi_{y_{ij}} y_{i+}\)
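
For the independent arc model, a single arc is a two-state chain, so the transition matrix \(e^{tQ}\) can be checked against a closed form. The sketch below approximates the matrix exponential with a truncated Taylor series plus scaling and squaring; the rates, the time `t`, and the helper names are illustrative assumptions.

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(q, t, terms=30, squarings=10):
    """Approximate exp(t * q): Taylor series on t*q / 2**squarings, then square."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

lam0, lam1 = 0.3, 0.7                      # rates 0 -> 1 and 1 -> 0
Q = [[-lam0, lam0], [lam1, -lam1]]         # intensity matrix of one arc
P = mat_exp(Q, t=2.0)
# Closed form for the probability that an absent arc is present at time t.
closed_form_01 = lam0 / (lam0 + lam1) * (1 - math.exp(-(lam0 + lam1) * 2.0))
```

`P[0][1]` and `closed_form_01` should agree to numerical precision, and each row of `P` should sum to one.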

- Two components: opportunity for change and propensity to change
- That is, one controls when a change happens and the other controls the probability that an edge is generated
- General form: \(q_{ij}(\mathbf y)=\rho p_{ij}(\mathbf y)\)
- \(p_{ij}(\mathbf y)=\frac{\exp(f(y(i,j,1-y_{ij})))}{\exp(f(y(i,j,0)))+\exp(f(y(i,j,1)))}\), where \(y(i,j,v)\) denotes \(\mathbf y\) with arc \((i,j)\) set to \(v\)
- \(f(\mathbf y)=\sum_k \beta_k s_k(\mathbf y)\)
- degeneracy
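
A worked sketch of the toggle probability \(p_{ij}(\mathbf y)\). The two statistics used for \(s_k\) (arc count and number of mutual pairs) and the coefficient values are assumptions chosen for the example; the slides do not fix them.

```python
import math

def stats(y):
    """Illustrative statistics s_k(y): number of arcs and of mutual pairs."""
    n = len(y)
    arcs = sum(y[i][j] for i in range(n) for j in range(n) if i != j)
    mutual = sum(y[i][j] * y[j][i] for i in range(n) for j in range(i + 1, n))
    return [arcs, mutual]

def f(y, beta):
    """Linear objective f(y) = sum_k beta_k * s_k(y)."""
    return sum(b * s for b, s in zip(beta, stats(y)))

def with_arc(y, i, j, value):
    """Copy of y with arc (i, j) set to value."""
    z = [row[:] for row in y]
    z[i][j] = value
    return z

def toggle_prob(y, i, j, beta):
    """Probability of flipping arc (i, j), given that it gets the opportunity."""
    f0 = f(with_arc(y, i, j, 0), beta)
    f1 = f(with_arc(y, i, j, 1), beta)
    f_new = f1 if y[i][j] == 0 else f0
    return math.exp(f_new) / (math.exp(f0) + math.exp(f1))

y = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]      # single arc 0 -> 1
beta = [-1.0, 2.0]                         # arcs are costly, mutual pairs rewarded
p_reciprocate = toggle_prob(y, 1, 0, beta)
```

Here adding the arc 1 -> 0 gives \(f=0\) and leaving it out gives \(f=-1\), so `p_reciprocate` equals \(1/(1+e^{-1})\), roughly 0.73.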

- General form: \(q_{ij}(\mathbf y)=\rho p_{ij}(\mathbf y)\)
- \(p_{ij}(\mathbf y)=\frac{\exp(f(y(i,j,1-y_{ij})))}{\sum_{h\neq i}\exp(f(y(i,h,1-y_{ih})))}\)
- \(f(\mathbf y)=\sum_k \beta_k s_k(\mathbf y)\)
- Edge-node mixed dynamics can also be composed: \(q_{ij}(\mathbf y)=\rho \frac{\exp(f(y(i,j,1-y_{ij})))}{\sum_{h\neq i}\exp(f(y(i,h,1-y_{ih})))}\)
- Remark: estimation in the CMCP uses the method of moments via MCMC

- Satisfies the Markov property
- \(Pr(Y^1,Y^2,\dots,Y^T)=Pr(Y^T \mid Y^{T-1})Pr(Y^{T-1} \mid Y^{T-2})\cdots Pr(Y^2 \mid Y^{1})\)
- \(\{Y^1,Y^2,\dots,Y^T\}\) is a sequence of snapshots of the network

- \(Pr(\mathbf y^t \mid \mathbf y^{t-1})=\frac{1}{Z} \exp\{\sum_k \beta_k s_k(\mathbf y^t, \mathbf y^{t-1})\}\)
- Density of edges: \(s_1(\mathbf y^t, \mathbf y^{t-1})=\frac{1}{n-1}\sum_{ij} y_{ij}^t\)
- Stability: \(s_2(\mathbf y^t, \mathbf y^{t-1})=\frac{1}{n-1}\sum_{ij} [y_{ij}^t y_{ij}^{t-1} + (1-y_{ij}^t)(1-y_{ij}^{t-1})]\)
- Reciprocity: \(s_3(\mathbf y^t, \mathbf y^{t-1})= n \dfrac{\sum_{ij} y_{ji}^t y_{ij}^{t-1}}{\sum_{ij} y_{ij}^{t-1}}\)
- Transitivity: \(s_4(\mathbf y^t, \mathbf y^{t-1})= n \dfrac{\sum_{ijk} y_{ik}^t y_{ij}^{t-1} y_{jk}^{t-1}}{\sum_{ijk} y_{ij}^{t-1} y_{jk}^{t-1}}\)
- Markov property of order \(K\): \(Pr(Y^{K+1},Y^{K+2},\dots, Y^T \mid Y^1,\dots,Y^K)=\prod_{t=K+1}^T Pr(Y^t \mid Y^{t-K},\dots,Y^{t-1})\)
- where \(Pr(Y^t \mid Y^{t-K},\dots,Y^{t-1})=\frac{1}{Z}\exp\{\sum_k \beta_k s_k (Y^t,\dots,Y^{t-K})\}\)
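
The four statistics can be computed directly from two adjacency matrices. The sketch assumes directed networks with a zero diagonal, sums over ordered pairs with \(i \neq j\), and assumes the previous snapshot has at least one arc and one two-path so the denominators are non-zero. The function name is hypothetical.

```python
def tergm_stats(y_t, y_prev):
    """Return (density, stability, reciprocity, transitivity) for one transition."""
    n = len(y_t)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    density = sum(y_t[i][j] for i, j in pairs) / (n - 1)
    stability = sum(y_t[i][j] * y_prev[i][j]
                    + (1 - y_t[i][j]) * (1 - y_prev[i][j])
                    for i, j in pairs) / (n - 1)
    prev_arcs = sum(y_prev[i][j] for i, j in pairs)
    reciprocity = n * sum(y_t[j][i] * y_prev[i][j] for i, j in pairs) / prev_arcs
    # Two-paths i -> j -> k in the previous snapshot.
    two_paths = [(i, j, k) for i in range(n) for j in range(n) for k in range(n)
                 if y_prev[i][j] and y_prev[j][k]]
    transitivity = n * sum(y_t[i][k] for i, j, k in two_paths) / len(two_paths)
    return density, stability, reciprocity, transitivity

y_prev = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]   # arcs 0->1, 1->2
y_t = [[0, 1, 1], [1, 0, 1], [0, 0, 0]]      # adds 0->2 and 1->0
s1, s2, s3, s4 = tergm_stats(y_t, y_prev)
```

In this toy transition the only two-path 0 -> 1 -> 2 is closed by the new arc 0 -> 2, and one of the two previous arcs is reciprocated.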

- likelihood: \(Pr(\mathbf y \mid \mathbf \theta)=\frac{\prod_{c\in\mathcal C} \phi(\mathbf y_c \mid \mathbf \theta_c)}{z}\)
- Write it as exponential family: \(Pr(\mathbf y \mid \mathbf \theta)=\exp\{\mathbf \theta^T u(\mathbf y)-\log z\}\)

- allow latent positions to change over time in Gaussian-distributed random steps: \(Z_t\mid Z_{t-1} \sim \mathcal N(Z_{t-1},\sigma^2I)\)
- \(p_{ij}^L:=p^L(y_{ij}=1)=\frac{1}{1+\exp(d_{ij}-r_{ij})}\)
- where \(d_{ij}\) is the Euclidean distance between \(i\) and \(j\), and \(r_{ij}\) is a radius of influence defined as \(c\times (\max(\delta_i,\delta_j)+1)\)
- Weight the link probability with a kernel function \(K(d_{ij})\) that is continuous and differentiable at \(r_{ij}\): \(K(d_{ij})=(1-(d_{ij}/r_{ij})^2)^2\) when \(d_{ij}<r_{ij}\), and 0 otherwise.
- we can model link probability \(p_{ij}=p_{ij}^L K(d_{ij})+(1-K(d_{ij}))\rho\), \(\rho\) is a noise probability
- Find MLE based on \(Pr(Y^t \mid Z^t)=\prod_{i\sim j}p_{ij}\prod_{i\nsim j}(1-p_{ij})\)
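
The link probability and the likelihood above can be written out as follows. The code takes \(\delta_i\) as a given per-node number and the default values of `c` and `rho` are placeholders; both function names are assumptions for this sketch, and an undirected network is assumed.

```python
import math

def link_prob(zi, zj, delta_i, delta_j, c=1.0, rho=0.01):
    """p_ij = p^L_ij * K(d_ij) + (1 - K(d_ij)) * rho."""
    d = math.dist(zi, zj)                          # Euclidean distance d_ij
    r = c * (max(delta_i, delta_j) + 1)            # radius of influence r_ij
    p_latent = 1.0 / (1.0 + math.exp(d - r))
    kernel = (1 - (d / r) ** 2) ** 2 if d < r else 0.0
    return p_latent * kernel + (1 - kernel) * rho

def log_likelihood(y, z, delta, c=1.0, rho=0.01):
    """log Pr(Y | Z): linked pairs contribute log p, unlinked pairs log(1 - p)."""
    n = len(y)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            p = link_prob(z[i], z[j], delta[i], delta[j], c, rho)
            total += math.log(p) if y[i][j] else math.log(1 - p)
    return total

z = [(0.0, 0.0), (0.5, 0.0), (6.0, 0.0)]
y = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
delta = [1, 1, 0]
ll = log_likelihood(y, z, delta)
```

Beyond the radius \(r_{ij}\) the kernel vanishes, so far-apart pairs are linked only with the noise probability \(\rho\).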

- Model: \(\log \frac{P(Y(i,j))}{1-P(Y(i,j))} = \alpha + \beta' X_{ij} - \lvert Z_i - Z_j \rvert \equiv \eta_{ij}\)
- Step 1: Find the MLE of \(Z\), denoted \(\hat Z\)
- Step 2 a: Set \(Z_0=\hat Z\) and choose a symmetric proposal distribution \(J(Z\mid Z_k)\)
- b: Sample a \(Z^*\) from \(J(Z\mid Z_k)\)
- c: Accept \(Z^*\) as \(Z_{k+1}\) with probability \(\min\left(1, \frac{p(Y \mid Z^*, \alpha_k,\beta_k,X)}{p(Y \mid Z_k, \alpha_k,\beta_k,X)} \frac{\pi(Z^*)}{\pi(Z_k)}\right)\); otherwise set \(Z_{k+1}=Z_k\)
- d: Store \(\tilde Z_{k+1}=\arg\min_{TZ_{k+1}} \operatorname{tr} (\hat Z-TZ_{k+1})'(\hat Z-TZ_{k+1})\)
- Step 3: Update \(\alpha\) and \(\beta\) with a Metropolis-Hastings algorithm
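
Steps 2b and 2c can be sketched as a single Metropolis update of the latent positions. This is a simplified illustration under several assumptions: there are no covariates (so \(\eta_{ij}=\alpha-\lvert Z_i-Z_j\rvert\)), the prior \(\pi\) is an independent zero-mean Gaussian on each coordinate, the proposal is a Gaussian random walk, and the Procrustes step (d) and the updates of \(\alpha\) and \(\beta\) are omitted.

```python
import math
import random

def log_lik(y, z, alpha):
    """Log-likelihood of the latent distance model without covariates."""
    total = 0.0
    n = len(y)
    for i in range(n):
        for j in range(n):
            if i != j:
                eta = alpha - math.dist(z[i], z[j])
                total += eta * y[i][j] - math.log1p(math.exp(eta))
    return total

def log_prior(z, sd=2.0):
    """Assumed prior: independent N(0, sd^2) on every coordinate (up to a constant)."""
    return -sum(c * c for pos in z for c in pos) / (2 * sd * sd)

def metropolis_step(y, z, alpha, rng, step=0.2):
    """Propose Z* by a symmetric random walk and accept or reject it."""
    proposal = [[c + rng.gauss(0, step) for c in pos] for pos in z]
    log_ratio = (log_lik(y, proposal, alpha) + log_prior(proposal)
                 - log_lik(y, z, alpha) - log_prior(z))
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return proposal, True
    return z, False

rng = random.Random(0)
y = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
z = [[0.0, 0.0], [0.1, 0.0], [2.0, 2.0]]
accepts = 0
for _ in range(200):
    z, ok = metropolis_step(y, z, 1.0, rng)
    accepts += ok
```

Working with the log of the acceptance ratio avoids underflow when the likelihood is a product over many pairs.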

- NIPS paper and physics community co-authorship
- US Supreme Court citation networks in different opinion eras

- When the context changes, the relationship changes
- weighted network
- Generative process:
- Step 1: For each node i, sample context \(C_i\sim mult(\theta_i)\), \(\theta_i\) denotes the context distribution parameters
- Step 2: For each pair of nodes i and j in the same context, sample meeting variable \(M_{ij}\sim Bern(\nu_i\nu_j)\), where \(\nu_i\) and \(\nu_j\) represent the "friendliness" of nodes i and j;
- Step 3: \[ W_{ij}^t=\begin{cases} \mathrm{Poi}(\lambda_h (W_{ij}^{t-1}+1)) & \text{if}\ M_{ij}= 1 \\ \mathrm{Poi}(\lambda_l W_{ij}^{t-1}) & \text{otherwise} \end{cases} \] where \(\lambda_h\) and \(\lambda_l\) are hyperparameters indicating the rates of growth and decay.
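
One time step of the generative process can be sketched as below. The Poisson sampler uses Knuth's multiplication method, which is adequate for small rates; the function names and all parameter values are assumptions for illustration, and the weights are taken to be symmetric.

```python
import math
import random

def poisson(lam, rng):
    """Draw from Poisson(lam) by Knuth's method; returns 0 when lam <= 0."""
    if lam <= 0:
        return 0
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def evolve_weights(w_prev, theta, nu, lam_h, lam_l, rng):
    """Steps 1-3: sample contexts, meetings, then new link weights."""
    n = len(w_prev)
    # Step 1: each node picks a context from its own distribution theta[i].
    contexts = [rng.choices(range(len(theta[i])), weights=theta[i])[0]
                for i in range(n)]
    w = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Step 2: nodes can meet only if they share a context.
            met = contexts[i] == contexts[j] and rng.random() < nu[i] * nu[j]
            # Step 3: growth rate if they met, decay rate otherwise.
            rate = lam_h * (w_prev[i][j] + 1) if met else lam_l * w_prev[i][j]
            w[i][j] = w[j][i] = poisson(rate, rng)
    return w

rng = random.Random(0)
w0 = [[0] * 3 for _ in range(3)]
theta = [[0.5, 0.5]] * 3                   # two equally likely contexts
nu = [0.9, 0.8, 0.7]                       # "friendliness" of each node
w1 = evolve_weights(w0, theta, nu, lam_h=2.0, lam_l=0.5, rng=rng)
```

A pair with weight zero that does not meet stays at zero, since the decay branch has rate \(\lambda_l \cdot 0\).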

- Network Visualization
- Computability
- Asymptotics and Assessing Goodness of Fit
- Sampling
- Missing Data

- Prediction
- Embeddability
- Identifiability
- Combining links with their attributes