dgl.to_homogeneous

dgl.to_homogeneous(G, ndata=None, edata=None, store_type=True, return_count=False)

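Convert a heterogeneous graph to a homogeneous graph and return it.

By default, the function stores the node and edge types of the input graph as the dgl.NTYPE and dgl.ETYPE features of the returned graph. Each feature holds one integer per node or edge: the type ID, as determined by the DGLGraph.get_ntype_id() and DGLGraph.get_etype_id() methods. Specify store_type=False to skip this step.

The resulting graph assigns nodes and edges of the same type to a contiguous ID range (for example, nodes of the first type have IDs 0 through G.num_nodes(G.ntypes[0]) - 1; nodes of the second type follow immediately, and so on). A more memory-efficient format for the type information is therefore a list of integers whose i-th element is the number of nodes/edges of the i-th type. Specify return_count=True to get this format.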
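The relationship between the per-type counts and the contiguous ID ranges can be sketched in plain Python. This is only an illustration of the layout described above, not DGL's implementation; the helper names `type_ranges` and `expand_type_ids` and the example counts are hypothetical.

```python
from itertools import accumulate


def type_ranges(counts):
    """Return the half-open ID range [start, stop) assigned to each type."""
    offsets = [0] + list(accumulate(counts))
    return [(offsets[i], offsets[i + 1]) for i in range(len(counts))]


def expand_type_ids(counts):
    """Expand per-type counts into one type ID per node or edge."""
    return [type_id for type_id, n in enumerate(counts) for _ in range(n)]


# Hypothetical counts: 2 items of type 0, 2 of type 1, 3 of type 2.
counts = [2, 2, 3]
ranges = type_ranges(counts)
type_ids = expand_type_ids(counts)
```

The count list has one entry per type, while the expanded form has one entry per node or edge, which is why the count format takes less memory.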

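Parameters:

G (DGLGraph) – The input heterogeneous graph.
ndata (list[str], optional) – The node features to combine across all node types. For each feature feat in ndata, the function concatenates G.nodes[T].data[feat] over all node types T. As a result, feat must have the same shape and data type for every node type. By default, the returned graph has no node features.
edata (list[str], optional) – The edge features to combine across all edge types. For each feature feat in edata, the function concatenates G.edges[T].data[feat] over all edge types T. As a result, feat must have the same shape and data type for every edge type. By default, the returned graph has no edge features.
store_type (bool, optional) – If True, store the type information as the dgl.NTYPE and dgl.ETYPE features of the returned graph.
return_count (bool, optional) – If True, return the type information as integer lists whose i-th element is the number of nodes/edges of the i-th type.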
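What `ndata` and `edata` request can be pictured as a per-type concatenation in type order. The sketch below uses plain Python lists in place of tensors; `merge_feature`, the type names, and the sample data are hypothetical and not part of DGL.

```python
def merge_feature(per_type_features, type_order):
    """Concatenate one feature across types, following the given type order.

    per_type_features maps a type name to a list of per-node rows.
    All rows must have the same length, mirroring the requirement that
    the feature has the same shape for every type.
    """
    merged = []
    for type_name in type_order:
        merged.extend(per_type_features[type_name])
    row_lengths = {len(row) for row in merged}
    if len(row_lengths) > 1:
        raise ValueError("feature rows differ in shape across types")
    return merged


# Hypothetical feature 'h' for two node types, listed in type-ID order.
h = {
    "author": [[0.0], [0.0]],
    "paper": [[1.0], [1.0], [1.0]],
}
merged_h = merge_feature(h, ["author", "paper"])
```

Row i of the merged feature then belongs to the node with ID i in the homogeneous graph, because both follow the same type order.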

Returns:

DGLGraph – The homogeneous graph.
ntype_count (list[int], optional) – The number of nodes of each type. Returned when return_count is True.
etype_count (list[int], optional) – The number of edges of each type. Returned when return_count is True.
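When only the counts are returned, the type and the per-type ID of a node can still be recovered from its ID in the homogeneous graph. A minimal sketch, assuming the contiguous ID layout described earlier; `locate` is a hypothetical helper, not a DGL function.

```python
from bisect import bisect_right
from itertools import accumulate


def locate(global_id, counts):
    """Map an ID in the homogeneous graph to (type_id, id_within_type)."""
    ends = list(accumulate(counts))  # exclusive end of each type's ID range
    total = ends[-1] if ends else 0
    if not 0 <= global_id < total:
        raise IndexError("ID out of range")
    # The first range whose exclusive end is greater than global_id owns it.
    type_id = bisect_right(ends, global_id)
    start = ends[type_id - 1] if type_id > 0 else 0
    return type_id, global_id - start
```

For instance, with counts `[2, 2, 3]`, `locate(4, [2, 2, 3])` returns `(2, 0)`: the first node of the third type. The same lookup applies to edges with `etype_count`.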

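Notes

Computing the type information may introduce noticeable overhead. If the type information is not needed, setting both store_type and return_count to False avoids it. Otherwise, DGL recommends store_type=False and return_count=True because of its memory efficiency.
The ntype_count and etype_count lists can help speed up some operations. See RelGraphConv for an example.
Calling to_homogeneous() and then calling to_heterogeneous() on the result yields the same graph as the original.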

Examples

The following example uses the PyTorch backend.

>>> import dgl
>>> import torch

>>> hg = dgl.heterograph({
...     ('user', 'follows', 'user'): ([0, 1], [1, 2]),
...     ('developer', 'develops', 'game'): ([0, 1], [0, 1])
...     })
>>> hg.nodes['user'].data['h'] = torch.ones(3, 1)
>>> hg.nodes['developer'].data['h'] = torch.zeros(2, 1)
>>> hg.nodes['game'].data['h'] = torch.ones(2, 1)
>>> g = dgl.to_homogeneous(hg)
>>> # Node types are ordered by name, so the first two nodes are for 'developer',
>>> # the next two are for 'game', and the last three are for 'user'
>>> g.ndata
{'_TYPE': tensor([0, 0, 1, 1, 2, 2, 2]), '_ID': tensor([0, 1, 0, 1, 0, 1, 2])}
>>> # The first two edges are for 'develops', and the next two are for 'follows' edges.
>>> g.edata
{'_TYPE': tensor([0, 0, 1, 1]), '_ID': tensor([0, 1, 0, 1])}

Combine the feature 'h' of all node types during the conversion.

>>> g = dgl.to_homogeneous(hg, ndata=['h'])
>>> g.ndata['h']
tensor([[0.], [0.], [1.], [1.], [1.], [1.], [1.]])

See also

to_heterogeneous