当前位置:网站首页>graph generation model

graph generation model

2020-11-07 20:15:17 thinkingo

Generative Graph Models


The previous parts of this book introduced a wide variety of methods for learning representations of graphs. In this final part of the book, we will discuss a distinct but closely related task: the problem of > graph generation.
The goal of graph generation is to build models that can generate realistic(逼真的) graph structures. In some ways, we can view this graph generation problem as the mirror image of the graph embedding problem. Instead of assuming that we are given a graph structure G = (V,E) as input to our model, in graph generation we want the output of our model to be a graph G. Of course, simply generating an arbitrary graph is not necessarily that challenging. For instance, it is trivial to generate a fully connected graph or a graph with no edges. The key challenge in graph generation, however, is generating graphs that have certain desirable properties. As we will see in the following chapters, the way in which we define “desirable properties”—and how we perform graph generation—varies drastically between different approaches.
In this chapter, we begin with a discussion of traditional approaches to graph generation. These tradiational approaches pre-date most research on graph representation learning—and even machine learning research in general. The methods we will discuss in this chapter thus provide the backdrop to motivate the deep learning-based approaches that we will introduce in Chapter 9.

8.1 Overview of Traditional Approaches 传统方法概述
Traditional approaches to graph generation generally involve specifying some kind of generative process, which defines how the edges in a graph are created. In most cases we can frame this generative process as a way of specifying the probability or likelihood P(A[u, v] = 1) of an edge existing between two nodes u and v. The challenge is crafting some sort of generative process that is both tractable and also able to generate graphs with non-trivial properties or acteristics. Tractability is essential because we want to be able to sample or analyze the graphs that are generated. However, we also want these graphs to have some properties that make them good models for the kinds of graphs we see in the real world.
The three approaches we review in this subsection represent a small but representative subset of the traditional graph generation approaches that exist in the literature. For a more thorough survey and discussion, we recommend Newman [2018] as a useful resource.

8.2 ER模型
Perhaps the simplest and most well-known generative model of graphs is the ER model . In this model we define the likelihood of an edge occurring between any pair of nodes as

where r ∈ [0,1] is parameter controlling the density of the graph. In other words, the ER model simply assumes that the probability of an edge occurring between any pairs of nodes is equal to r.
The ER model is attractive due to its simplicity. To generate a random ER graph, we simply choose (or sample) how many nodes we want, set the density parameter r, and then use Equation (8.1) to generate the adjacency matrix. Since the edge probabilities are all independent, the time complexity to generate a graph is O(|V|2), i.e., linear in the size of the adjacency matrix.
The downside of the ER model, however, is that it does not generate very realistic graphs. In particular, the only property that we can control in the ER model is the density of the graph, since the parameter r is equal (in expectation) to the average degree in the graph. Other graph properties—such as the degree distribution, existence of community structures, node clustering coefficients, and the occurrence of structural motifs—are not captured by the ER model. It is well known that graphs generated by the ER model fail to reflect the distribution of these more complex graph properties, which are known to be important in the structure and function of real-world graphs.

8.3 Stochastic Block Models 随机块图
Many traditional graph generation approaches seek to improve the ER model by better capturing additional properties of real-world graphs, which the ER model ignores. One prominent example is the class of stochastic block models (SBMs), which seek to generate graphs with community structure.
In a basic SBM model, we specify a number γ of different blocks: C1, ...,Cγ. Every node u ∈ V then has a probability piof belonging to block i, i.e. pi= P(u ∈ Ci),∀u ∈ V, i = 1, ..., γ wherePγ i=1pi= 1. Edge probabilities are then specified by a block-to-block probability matrix P ∈ [0,1]γ×γ, where C[i, j] gives the probability of an edge occuring between a node in block Ciand a node in block Cj. The generative process for the basic SBM model is as follows: