OnDiskDataset for Heterogeneous Graph
This tutorial shows how to create an OnDiskDataset
for a heterogeneous graph that can be used in the GraphBolt framework. The major difference from creating a dataset for a homogeneous graph is that we need to specify node/edge types for edges, feature data, and training/validation/test sets.
By the end of this tutorial, you will be able to:
- Organize graph structure data.
- Organize feature data.
- Organize training/validation/test sets for a specific task.
To create an OnDiskDataset
object, you need to organize all the data, including graph structure, feature data, and tasks, into a directory. The directory should contain a metadata.yaml
file that describes the metadata of the dataset.
Now let's generate the various data step by step, organize them together, and finally instantiate OnDiskDataset
.
Install the DGL package
[1]:
# Install required packages.
import os
import torch
import numpy as np
os.environ['TORCH'] = torch.__version__
os.environ['DGLBACKEND'] = "pytorch"
# Install the CPU version.
device = torch.device("cpu")
!pip install --pre dgl -f https://data.dgl.ai/wheels-test/repo.html
try:
    import dgl
    import dgl.graphbolt as gb
    installed = True
except ImportError as error:
    installed = False
    print(error)
print("DGL installed!" if installed else "DGL not found!")
Looking in links: https://data.dgl.ai/wheels-test/repo.html
Requirement already satisfied: dgl in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (2.2a240410)
Requirement already satisfied: numpy>=1.14.0 in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from dgl) (1.26.4)
Requirement already satisfied: scipy>=1.1.0 in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from dgl) (1.14.1)
Requirement already satisfied: networkx>=2.1 in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from dgl) (3.4.2)
Requirement already satisfied: requests>=2.19.0 in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from dgl) (2.32.3)
Requirement already satisfied: tqdm in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from dgl) (4.66.6)
Requirement already satisfied: psutil>=5.8.0 in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from dgl) (6.1.0)
Requirement already satisfied: torchdata>=0.5.0 in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from dgl) (0.9.0)
Requirement already satisfied: pandas in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from dgl) (2.2.3)
Requirement already satisfied: charset-normalizer<4,>=2 in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from requests>=2.19.0->dgl) (3.4.0)
Requirement already satisfied: idna<4,>=2.5 in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from requests>=2.19.0->dgl) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from requests>=2.19.0->dgl) (2.2.3)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from requests>=2.19.0->dgl) (2024.8.30)
Requirement already satisfied: torch>=2 in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from torchdata>=0.5.0->dgl) (2.1.0+cpu)
Requirement already satisfied: python-dateutil>=2.8.2 in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from pandas->dgl) (2.9.0.post0)
Requirement already satisfied: pytz>=2020.1 in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from pandas->dgl) (2024.2)
Requirement already satisfied: tzdata>=2022.7 in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from pandas->dgl) (2024.2)
Requirement already satisfied: six>=1.5 in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from python-dateutil>=2.8.2->pandas->dgl) (1.16.0)
Requirement already satisfied: filelock in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from torch>=2->torchdata>=0.5.0->dgl) (3.16.1)
Requirement already satisfied: typing-extensions in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from torch>=2->torchdata>=0.5.0->dgl) (4.12.2)
Requirement already satisfied: sympy in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from torch>=2->torchdata>=0.5.0->dgl) (1.13.3)
Requirement already satisfied: jinja2 in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from torch>=2->torchdata>=0.5.0->dgl) (3.1.4)
Requirement already satisfied: fsspec in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from torch>=2->torchdata>=0.5.0->dgl) (2024.10.0)
Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from jinja2->torch>=2->torchdata>=0.5.0->dgl) (3.0.2)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/dgl-dev-cpu/lib/python3.10/site-packages (from sympy->torch>=2->torchdata>=0.5.0->dgl) (1.3.0)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
DGL installed!
Data preparation
In order to demonstrate how to organize various data, let's create a base directory first.
[2]:
base_dir = './ondisk_dataset_heterograph'
os.makedirs(base_dir, exist_ok=True)
print(f"Created base directory: {base_dir}")
Created base directory: ./ondisk_dataset_heterograph
Generate graph structure data
For a heterogeneous graph, we need to save the edges (namely seeds) of each edge type into separate Numpy or CSV files.
Note:
- When saving to Numpy, the array requires a shape of (2, N). This format is recommended, as constructing the graph from it is much faster than from CSV files.
- When saving to CSV files, do not save the index and header.
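For reference, here is a minimal sketch of the recommended Numpy route (the cell below uses CSV instead; the file name edges.npy is illustrative):
import numpy as np
edges = np.random.randint(0, 1000, size=(10000, 2))
# OnDiskDataset expects Numpy edge files with shape (2, N), so transpose first.
np.save("edges.npy", edges.T)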
[3]:
import numpy as np
import pandas as pd
# For simplicity, we create a heterogeneous graph with
# 2 node types: `user`, `item`
# 2 edge types: `user:like:item`, `user:follow:user`
# And each node/edge type has the same number of nodes/edges.
num_nodes = 1000
num_edges = 10 * num_nodes
# Edge type: "user:like:item"
like_edges_path = os.path.join(base_dir, "like-edges.csv")
like_edges = np.random.randint(0, num_nodes, size=(num_edges, 2))
print(f"Part of [user:like:item] edges: {like_edges[:5, :]}\n")
df = pd.DataFrame(like_edges)
df.to_csv(like_edges_path, index=False, header=False)
print(f"[user:like:item] edges are saved into {like_edges_path}\n")
# Edge type: "user:follow:user"
follow_edges_path = os.path.join(base_dir, "follow-edges.csv")
follow_edges = np.random.randint(0, num_nodes, size=(num_edges, 2))
print(f"Part of [user:follow:user] edges: {follow_edges[:5, :]}\n")
df = pd.DataFrame(follow_edges)
df.to_csv(follow_edges_path, index=False, header=False)
print(f"[user:follow:user] edges are saved into {follow_edges_path}\n")
Part of [user:like:item] edges: [[380 336]
[487 223]
[977 689]
[808 14]
[726 198]]
[user:like:item] edges are saved into ./ondisk_dataset_heterograph/like-edges.csv
Part of [user:follow:user] edges: [[153 111]
[ 33 32]
[611 91]
[167 269]
[540 95]]
[user:follow:user] edges are saved into ./ondisk_dataset_heterograph/follow-edges.csv
Generate feature data for the graph
For feature data, numpy arrays and torch tensors are supported for now. Let's generate feature data for each node/edge type.
[4]:
# Generate node[user] feature in numpy array.
node_user_feat_0_path = os.path.join(base_dir, "node-user-feat-0.npy")
node_user_feat_0 = np.random.rand(num_nodes, 5)
print(f"Part of node[user] feature [feat_0]: {node_user_feat_0[:3, :]}")
np.save(node_user_feat_0_path, node_user_feat_0)
print(f"Node[user] feature [feat_0] is saved to {node_user_feat_0_path}\n")
# Generate another node[user] feature in torch tensor
node_user_feat_1_path = os.path.join(base_dir, "node-user-feat-1.pt")
node_user_feat_1 = torch.rand(num_nodes, 5)
print(f"Part of node[user] feature [feat_1]: {node_user_feat_1[:3, :]}")
torch.save(node_user_feat_1, node_user_feat_1_path)
print(f"Node[user] feature [feat_1] is saved to {node_user_feat_1_path}\n")
# Generate node[item] feature in numpy array.
node_item_feat_0_path = os.path.join(base_dir, "node-item-feat-0.npy")
node_item_feat_0 = np.random.rand(num_nodes, 5)
print(f"Part of node[item] feature [feat_0]: {node_item_feat_0[:3, :]}")
np.save(node_item_feat_0_path, node_item_feat_0)
print(f"Node[item] feature [feat_0] is saved to {node_item_feat_0_path}\n")
# Generate another node[item] feature in torch tensor
node_item_feat_1_path = os.path.join(base_dir, "node-item-feat-1.pt")
node_item_feat_1 = torch.rand(num_nodes, 5)
print(f"Part of node[item] feature [feat_1]: {node_item_feat_1[:3, :]}")
torch.save(node_item_feat_1, node_item_feat_1_path)
print(f"Node[item] feature [feat_1] is saved to {node_item_feat_1_path}\n")
# Generate edge[user:like:item] feature in numpy array.
edge_like_feat_0_path = os.path.join(base_dir, "edge-like-feat-0.npy")
edge_like_feat_0 = np.random.rand(num_edges, 5)
print(f"Part of edge[user:like:item] feature [feat_0]: {edge_like_feat_0[:3, :]}")
np.save(edge_like_feat_0_path, edge_like_feat_0)
print(f"Edge[user:like:item] feature [feat_0] is saved to {edge_like_feat_0_path}\n")
# Generate another edge[user:like:item] feature in torch tensor
edge_like_feat_1_path = os.path.join(base_dir, "edge-like-feat-1.pt")
edge_like_feat_1 = torch.rand(num_edges, 5)
print(f"Part of edge[user:like:item] feature [feat_1]: {edge_like_feat_1[:3, :]}")
torch.save(edge_like_feat_1, edge_like_feat_1_path)
print(f"Edge[user:like:item] feature [feat_1] is saved to {edge_like_feat_1_path}\n")
# Generate edge[user:follow:user] feature in numpy array.
edge_follow_feat_0_path = os.path.join(base_dir, "edge-follow-feat-0.npy")
edge_follow_feat_0 = np.random.rand(num_edges, 5)
print(f"Part of edge[user:follow:user] feature [feat_0]: {edge_follow_feat_0[:3, :]}")
np.save(edge_follow_feat_0_path, edge_follow_feat_0)
print(f"Edge[user:follow:user] feature [feat_0] is saved to {edge_follow_feat_0_path}\n")
# Generate another edge[user:follow:user] feature in torch tensor
edge_follow_feat_1_path = os.path.join(base_dir, "edge-follow-feat-1.pt")
edge_follow_feat_1 = torch.rand(num_edges, 5)
print(f"Part of edge[user:follow:user] feature [feat_1]: {edge_follow_feat_1[:3, :]}")
torch.save(edge_follow_feat_1, edge_follow_feat_1_path)
print(f"Edge[user:follow:user] feature [feat_1] is saved to {edge_follow_feat_1_path}\n")
Part of node[user] feature [feat_0]: [[0.05815504 0.62263662 0.59561617 0.66398507 0.79119118]
[0.23600282 0.2229376 0.82513779 0.02374633 0.13690369]
[0.21409339 0.28683459 0.79769497 0.19185927 0.14454077]]
Node[user] feature [feat_0] is saved to ./ondisk_dataset_heterograph/node-user-feat-0.npy
Part of node[user] feature [feat_1]: tensor([[0.6937, 0.3329, 0.7192, 0.3123, 0.1724],
[0.7422, 0.8544, 0.9974, 0.8008, 0.8708],
[0.1762, 0.5480, 0.9746, 0.8190, 0.9903]])
Node[user] feature [feat_1] is saved to ./ondisk_dataset_heterograph/node-user-feat-1.pt
Part of node[item] feature [feat_0]: [[0.2033349 0.01100595 0.80832844 0.43027092 0.37818883]
[0.33674085 0.52218617 0.8544344 0.52959796 0.2149139 ]
[0.70668641 0.26724273 0.24429898 0.57041181 0.6830655 ]]
Node[item] feature [feat_0] is saved to ./ondisk_dataset_heterograph/node-item-feat-0.npy
Part of node[item] feature [feat_1]: tensor([[0.8859, 0.8434, 0.3209, 0.6922, 0.2568],
[0.4988, 0.8797, 0.9821, 0.8718, 0.3224],
[0.2982, 0.1218, 0.1621, 0.0178, 0.0617]])
Node[item] feature [feat_1] is saved to ./ondisk_dataset_heterograph/node-item-feat-1.pt
Part of edge[user:like:item] feature [feat_0]: [[0.6675822 0.51602577 0.80937203 0.59834655 0.16118514]
[0.54314096 0.34548349 0.26585401 0.57061101 0.322611 ]
[0.03970957 0.28232127 0.04387671 0.78814138 0.8168992 ]]
Edge[user:like:item] feature [feat_0] is saved to ./ondisk_dataset_heterograph/edge-like-feat-0.npy
Part of edge[user:like:item] feature [feat_1]: tensor([[0.9647, 0.9774, 0.6275, 0.2024, 0.2268],
[0.1538, 0.3849, 0.9253, 0.7636, 0.1180],
[0.2832, 0.2294, 0.1581, 0.7326, 0.6709]])
Edge[user:like:item] feature [feat_1] is saved to ./ondisk_dataset_heterograph/edge-like-feat-1.pt
Part of edge[user:follow:user] feature [feat_0]: [[0.53172039 0.76260774 0.6752395 0.09390495 0.75917655]
[0.12432898 0.06612804 0.26096148 0.73147272 0.78818301]
[0.07754046 0.66636421 0.95190899 0.51193136 0.40437589]]
Edge[user:follow:user] feature [feat_0] is saved to ./ondisk_dataset_heterograph/edge-follow-feat-0.npy
Part of edge[user:follow:user] feature [feat_1]: tensor([[0.9387, 0.8550, 0.4421, 0.3698, 0.6186],
[0.2243, 0.5516, 0.6031, 0.1381, 0.9809],
[0.2079, 0.1675, 0.6611, 0.8580, 0.5825]])
Edge[user:follow:user] feature [feat_1] is saved to ./ondisk_dataset_heterograph/edge-follow-feat-1.pt
Generate tasks
OnDiskDataset
supports multiple tasks. For each task, we need to prepare training/validation/test sets respectively. Such sets usually vary among different tasks. In this tutorial, let's create a Node Classification task and a Link Prediction task.
Node Classification Task
For the node classification task, we need to prepare node IDs and corresponding labels for each training/validation/test set. Like feature data, numpy arrays and torch tensors are supported for these sets.
[5]:
# For illustration, let's generate item sets for each node type.
num_trains = int(num_nodes * 0.6)
num_vals = int(num_nodes * 0.2)
num_tests = num_nodes - num_trains - num_vals
user_ids = np.arange(num_nodes)
np.random.shuffle(user_ids)
item_ids = np.arange(num_nodes)
np.random.shuffle(item_ids)
# Train IDs for user.
nc_train_user_ids_path = os.path.join(base_dir, "nc-train-user-ids.npy")
nc_train_user_ids = user_ids[:num_trains]
print(f"Part of train ids[user] for node classification: {nc_train_user_ids[:3]}")
np.save(nc_train_user_ids_path, nc_train_user_ids)
print(f"NC train ids[user] are saved to {nc_train_user_ids_path}\n")
# Train labels for user.
nc_train_user_labels_path = os.path.join(base_dir, "nc-train-user-labels.pt")
nc_train_user_labels = torch.randint(0, 10, (num_trains,))
print(f"Part of train labels[user] for node classification: {nc_train_user_labels[:3]}")
torch.save(nc_train_user_labels, nc_train_user_labels_path)
print(f"NC train labels[user] are saved to {nc_train_user_labels_path}\n")
# Train IDs for item.
nc_train_item_ids_path = os.path.join(base_dir, "nc-train-item-ids.npy")
nc_train_item_ids = item_ids[:num_trains]
print(f"Part of train ids[item] for node classification: {nc_train_item_ids[:3]}")
np.save(nc_train_item_ids_path, nc_train_item_ids)
print(f"NC train ids[item] are saved to {nc_train_item_ids_path}\n")
# Train labels for item.
nc_train_item_labels_path = os.path.join(base_dir, "nc-train-item-labels.pt")
nc_train_item_labels = torch.randint(0, 10, (num_trains,))
print(f"Part of train labels[item] for node classification: {nc_train_item_labels[:3]}")
torch.save(nc_train_item_labels, nc_train_item_labels_path)
print(f"NC train labels[item] are saved to {nc_train_item_labels_path}\n")
# Val IDs for user.
nc_val_user_ids_path = os.path.join(base_dir, "nc-val-user-ids.npy")
nc_val_user_ids = user_ids[num_trains:num_trains+num_vals]
print(f"Part of val ids[user] for node classification: {nc_val_user_ids[:3]}")
np.save(nc_val_user_ids_path, nc_val_user_ids)
print(f"NC val ids[user] are saved to {nc_val_user_ids_path}\n")
# Val labels for user.
nc_val_user_labels_path = os.path.join(base_dir, "nc-val-user-labels.pt")
nc_val_user_labels = torch.randint(0, 10, (num_vals,))
print(f"Part of val labels[user] for node classification: {nc_val_user_labels[:3]}")
torch.save(nc_val_user_labels, nc_val_user_labels_path)
print(f"NC val labels[user] are saved to {nc_val_user_labels_path}\n")
# Val IDs for item.
nc_val_item_ids_path = os.path.join(base_dir, "nc-val-item-ids.npy")
nc_val_item_ids = item_ids[num_trains:num_trains+num_vals]
print(f"Part of val ids[item] for node classification: {nc_val_item_ids[:3]}")
np.save(nc_val_item_ids_path, nc_val_item_ids)
print(f"NC val ids[item] are saved to {nc_val_item_ids_path}\n")
# Val labels for item.
nc_val_item_labels_path = os.path.join(base_dir, "nc-val-item-labels.pt")
nc_val_item_labels = torch.randint(0, 10, (num_vals,))
print(f"Part of val labels[item] for node classification: {nc_val_item_labels[:3]}")
torch.save(nc_val_item_labels, nc_val_item_labels_path)
print(f"NC val labels[item] are saved to {nc_val_item_labels_path}\n")
# Test IDs for user.
nc_test_user_ids_path = os.path.join(base_dir, "nc-test-user-ids.npy")
nc_test_user_ids = user_ids[-num_tests:]
print(f"Part of test ids[user] for node classification: {nc_test_user_ids[:3]}")
np.save(nc_test_user_ids_path, nc_test_user_ids)
print(f"NC test ids[user] are saved to {nc_test_user_ids_path}\n")
# Test labels for user.
nc_test_user_labels_path = os.path.join(base_dir, "nc-test-user-labels.pt")
nc_test_user_labels = torch.randint(0, 10, (num_tests,))
print(f"Part of test labels[user] for node classification: {nc_test_user_labels[:3]}")
torch.save(nc_test_user_labels, nc_test_user_labels_path)
print(f"NC test labels[user] are saved to {nc_test_user_labels_path}\n")
# Test IDs for item.
nc_test_item_ids_path = os.path.join(base_dir, "nc-test-item-ids.npy")
nc_test_item_ids = item_ids[-num_tests:]
print(f"Part of test ids[item] for node classification: {nc_test_item_ids[:3]}")
np.save(nc_test_item_ids_path, nc_test_item_ids)
print(f"NC test ids[item] are saved to {nc_test_item_ids_path}\n")
# Test labels for item.
nc_test_item_labels_path = os.path.join(base_dir, "nc-test-item-labels.pt")
nc_test_item_labels = torch.randint(0, 10, (num_tests,))
print(f"Part of test labels[item] for node classification: {nc_test_item_labels[:3]}")
torch.save(nc_test_item_labels, nc_test_item_labels_path)
print(f"NC test labels[item] are saved to {nc_test_item_labels_path}\n")
Part of train ids[user] for node classification: [645 75 874]
NC train ids[user] are saved to ./ondisk_dataset_heterograph/nc-train-user-ids.npy
Part of train labels[user] for node classification: tensor([2, 8, 9])
NC train labels[user] are saved to ./ondisk_dataset_heterograph/nc-train-user-labels.pt
Part of train ids[item] for node classification: [ 37 469 825]
NC train ids[item] are saved to ./ondisk_dataset_heterograph/nc-train-item-ids.npy
Part of train labels[item] for node classification: tensor([7, 7, 1])
NC train labels[item] are saved to ./ondisk_dataset_heterograph/nc-train-item-labels.pt
Part of val ids[user] for node classification: [384 877 969]
NC val ids[user] are saved to ./ondisk_dataset_heterograph/nc-val-user-ids.npy
Part of val labels[user] for node classification: tensor([3, 4, 3])
NC val labels[user] are saved to ./ondisk_dataset_heterograph/nc-val-user-labels.pt
Part of val ids[item] for node classification: [147 696 586]
NC val ids[item] are saved to ./ondisk_dataset_heterograph/nc-val-item-ids.npy
Part of val labels[item] for node classification: tensor([4, 4, 5])
NC val labels[item] are saved to ./ondisk_dataset_heterograph/nc-val-item-labels.pt
Part of test ids[user] for node classification: [946 247 458]
NC test ids[user] are saved to ./ondisk_dataset_heterograph/nc-test-user-ids.npy
Part of test labels[user] for node classification: tensor([3, 0, 4])
NC test labels[user] are saved to ./ondisk_dataset_heterograph/nc-test-user-labels.pt
Part of test ids[item] for node classification: [ 28 485 116]
NC test ids[item] are saved to ./ondisk_dataset_heterograph/nc-test-item-ids.npy
Part of test labels[item] for node classification: tensor([5, 2, 1])
NC test labels[item] are saved to ./ondisk_dataset_heterograph/nc-test-item-labels.pt
Link Prediction Task
For the link prediction task, we need to prepare seeds for each training/validation/test set, or seeds together with corresponding labels and indexes, which represent the pos/neg property and the grouping of the seeds. Like feature data, numpy arrays and torch tensors are supported for these sets.
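To make the labels/indexes layout concrete, here is a tiny made-up illustration (not part of the dataset generated below): for two positive seeds with two negatives each, labels mark the pos/neg property and indexes group each negative with the positive seed it was drawn for.
import numpy as np
seeds = np.array([[0, 1], [2, 3],    # positive edges
                  [0, 7], [0, 8],    # negatives drawn for the first seed
                  [2, 5], [2, 6]])   # negatives drawn for the second seed
labels = np.array([1, 1, 0, 0, 0, 0])
indexes = np.array([0, 1, 0, 0, 1, 1])  # group ID of the originating positive seed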
[6]:
# For illustration, let's generate item sets for each edge type.
num_trains = int(num_edges * 0.6)
num_vals = int(num_edges * 0.2)
num_tests = num_edges - num_trains - num_vals
# Train seeds for user:like:item.
lp_train_like_seeds_path = os.path.join(base_dir, "lp-train-like-seeds.npy")
lp_train_like_seeds = like_edges[:num_trains, :]
print(f"Part of train seeds[user:like:item] for link prediction: {lp_train_like_seeds[:3]}")
np.save(lp_train_like_seeds_path, lp_train_like_seeds)
print(f"LP train seeds[user:like:item] are saved to {lp_train_like_seeds_path}\n")
# Train seeds for user:follow:user.
lp_train_follow_seeds_path = os.path.join(base_dir, "lp-train-follow-seeds.npy")
lp_train_follow_seeds = follow_edges[:num_trains, :]
print(f"Part of train seeds[user:follow:user] for link prediction: {lp_train_follow_seeds[:3]}")
np.save(lp_train_follow_seeds_path, lp_train_follow_seeds)
print(f"LP train seeds[user:follow:user] are saved to {lp_train_follow_seeds_path}\n")
# Val seeds for user:like:item.
lp_val_like_seeds_path = os.path.join(base_dir, "lp-val-like-seeds.npy")
lp_val_like_seeds = like_edges[num_trains:num_trains+num_vals, :]
lp_val_like_neg_dsts = np.random.randint(0, num_nodes, (num_vals, 10)).reshape(-1)
lp_val_like_neg_srcs = np.repeat(lp_val_like_seeds[:,0], 10)
lp_val_like_neg_seeds = np.concatenate((lp_val_like_neg_srcs, lp_val_like_neg_dsts)).reshape(2,-1).T
lp_val_like_seeds = np.concatenate((lp_val_like_seeds, lp_val_like_neg_seeds))
print(f"Part of val seeds[user:like:item] for link prediction: {lp_val_like_seeds[:3]}")
np.save(lp_val_like_seeds_path, lp_val_like_seeds)
print(f"LP val seeds[user:like:item] are saved to {lp_val_like_seeds_path}\n")
# Val labels for user:like:item.
lp_val_like_labels_path = os.path.join(base_dir, "lp-val-like-labels.npy")
lp_val_like_labels = np.empty(num_vals * (10 + 1))
lp_val_like_labels[:num_vals] = 1
lp_val_like_labels[num_vals:] = 0
print(f"Part of val labels[user:like:item] for link prediction: {lp_val_like_labels[:3]}")
np.save(lp_val_like_labels_path, lp_val_like_labels)
print(f"LP val labels[user:like:item] are saved to {lp_val_like_labels_path}\n")
# Val indexes for user:like:item.
lp_val_like_indexes_path = os.path.join(base_dir, "lp-val-like-indexes.npy")
lp_val_like_indexes = np.arange(0, num_vals)
lp_val_like_neg_indexes = np.repeat(lp_val_like_indexes, 10)
lp_val_like_indexes = np.concatenate([lp_val_like_indexes, lp_val_like_neg_indexes])
print(f"Part of val indexes[user:like:item] for link prediction: {lp_val_like_indexes[:3]}")
np.save(lp_val_like_indexes_path, lp_val_like_indexes)
print(f"LP val indexes[user:like:item] are saved to {lp_val_like_indexes_path}\n")
# Val seeds for user:follow:user.
lp_val_follow_seeds_path = os.path.join(base_dir, "lp-val-follow-seeds.npy")
lp_val_follow_seeds = follow_edges[num_trains:num_trains+num_vals, :]
lp_val_follow_neg_dsts = np.random.randint(0, num_nodes, (num_vals, 10)).reshape(-1)
lp_val_follow_neg_srcs = np.repeat(lp_val_follow_seeds[:,0], 10)
lp_val_follow_neg_seeds = np.concatenate((lp_val_follow_neg_srcs, lp_val_follow_neg_dsts)).reshape(2,-1).T
lp_val_follow_seeds = np.concatenate((lp_val_follow_seeds, lp_val_follow_neg_seeds))
print(f"Part of val seeds[user:follow:user] for link prediction: {lp_val_follow_seeds[:3]}")
np.save(lp_val_follow_seeds_path, lp_val_follow_seeds)
print(f"LP val seeds[user:follow:user] are saved to {lp_val_follow_seeds_path}\n")
# Val labels for user:follow:user.
lp_val_follow_labels_path = os.path.join(base_dir, "lp-val-follow-labels.npy")
lp_val_follow_labels = np.empty(num_vals * (10 + 1))
lp_val_follow_labels[:num_vals] = 1
lp_val_follow_labels[num_vals:] = 0
print(f"Part of val labels[user:follow:user] for link prediction: {lp_val_follow_labels[:3]}")
np.save(lp_val_follow_labels_path, lp_val_follow_labels)
print(f"LP val labels[user:follow:user] are saved to {lp_val_follow_labels_path}\n")
# Val indexes for user:follow:user.
lp_val_follow_indexes_path = os.path.join(base_dir, "lp-val-follow-indexes.npy")
lp_val_follow_indexes = np.arange(0, num_vals)
lp_val_follow_neg_indexes = np.repeat(lp_val_follow_indexes, 10)
lp_val_follow_indexes = np.concatenate([lp_val_follow_indexes, lp_val_follow_neg_indexes])
print(f"Part of val indexes[user:follow:user] for link prediction: {lp_val_follow_indexes[:3]}")
np.save(lp_val_follow_indexes_path, lp_val_follow_indexes)
print(f"LP val indexes[user:follow:user] are saved to {lp_val_follow_indexes_path}\n")
# Test seeds for user:like:item.
lp_test_like_seeds_path = os.path.join(base_dir, "lp-test-like-seeds.npy")
lp_test_like_seeds = like_edges[-num_tests:, :]
lp_test_like_neg_dsts = np.random.randint(0, num_nodes, (num_tests, 10)).reshape(-1)
lp_test_like_neg_srcs = np.repeat(lp_test_like_seeds[:,0], 10)
lp_test_like_neg_seeds = np.concatenate((lp_test_like_neg_srcs, lp_test_like_neg_dsts)).reshape(2,-1).T
lp_test_like_seeds = np.concatenate((lp_test_like_seeds, lp_test_like_neg_seeds))
print(f"Part of test seeds[user:like:item] for link prediction: {lp_test_like_seeds[:3]}")
np.save(lp_test_like_seeds_path, lp_test_like_seeds)
print(f"LP test seeds[user:like:item] are saved to {lp_test_like_seeds_path}\n")
# Test labels for user:like:item.
lp_test_like_labels_path = os.path.join(base_dir, "lp-test-like-labels.npy")
lp_test_like_labels = np.empty(num_tests * (10 + 1))
lp_test_like_labels[:num_tests] = 1
lp_test_like_labels[num_tests:] = 0
print(f"Part of test labels[user:like:item] for link prediction: {lp_test_like_labels[:3]}")
np.save(lp_test_like_labels_path, lp_test_like_labels)
print(f"LP test labels[user:like:item] are saved to {lp_test_like_labels_path}\n")
# Test indexes for user:like:item.
lp_test_like_indexes_path = os.path.join(base_dir, "lp-test-like-indexes.npy")
lp_test_like_indexes = np.arange(0, num_tests)
lp_test_like_neg_indexes = np.repeat(lp_test_like_indexes, 10)
lp_test_like_indexes = np.concatenate([lp_test_like_indexes, lp_test_like_neg_indexes])
print(f"Part of test indexes[user:like:item] for link prediction: {lp_test_like_indexes[:3]}")
np.save(lp_test_like_indexes_path, lp_test_like_indexes)
print(f"LP test indexes[user:like:item] are saved to {lp_test_like_indexes_path}\n")
# Test seeds for user:follow:user.
lp_test_follow_seeds_path = os.path.join(base_dir, "lp-test-follow-seeds.npy")
lp_test_follow_seeds = follow_edges[-num_tests:, :]
lp_test_follow_neg_dsts = np.random.randint(0, num_nodes, (num_tests, 10)).reshape(-1)
lp_test_follow_neg_srcs = np.repeat(lp_test_follow_seeds[:,0], 10)
lp_test_follow_neg_seeds = np.concatenate((lp_test_follow_neg_srcs, lp_test_follow_neg_dsts)).reshape(2,-1).T
lp_test_follow_seeds = np.concatenate((lp_test_follow_seeds, lp_test_follow_neg_seeds))
print(f"Part of test seeds[user:follow:user] for link prediction: {lp_test_follow_seeds[:3]}")
np.save(lp_test_follow_seeds_path, lp_test_follow_seeds)
print(f"LP test seeds[user:follow:user] are saved to {lp_test_follow_seeds_path}\n")
# Test labels for user:follow:user.
lp_test_follow_labels_path = os.path.join(base_dir, "lp-test-follow-labels.npy")
lp_test_follow_labels = np.empty(num_tests * (10 + 1))
lp_test_follow_labels[:num_tests] = 1
lp_test_follow_labels[num_tests:] = 0
print(f"Part of test labels[user:follow:user] for link prediction: {lp_test_follow_labels[:3]}")
np.save(lp_test_follow_labels_path, lp_test_follow_labels)
print(f"LP test labels[user:follow:user] are saved to {lp_test_follow_labels_path}\n")
# Test indexes for user:follow:user.
lp_test_follow_indexes_path = os.path.join(base_dir, "lp-test-follow-indexes.npy")
lp_test_follow_indexes = np.arange(0, num_tests)
lp_test_follow_neg_indexes = np.repeat(lp_test_follow_indexes, 10)
lp_test_follow_indexes = np.concatenate([lp_test_follow_indexes, lp_test_follow_neg_indexes])
print(f"Part of test indexes[user:follow:user] for link prediction: {lp_test_follow_indexes[:3]}")
np.save(lp_test_follow_indexes_path, lp_test_follow_indexes)
print(f"LP test indexes[user:follow:user] are saved to {lp_test_follow_indexes_path}\n")
Part of train seeds[user:like:item] for link prediction: [[380 336]
[487 223]
[977 689]]
LP train seeds[user:like:item] are saved to ./ondisk_dataset_heterograph/lp-train-like-seeds.npy
Part of train seeds[user:follow:user] for link prediction: [[153 111]
[ 33 32]
[611 91]]
LP train seeds[user:follow:user] are saved to ./ondisk_dataset_heterograph/lp-train-follow-seeds.npy
Part of val seeds[user:like:item] for link prediction: [[315 755]
[251 979]
[282 863]]
LP val seeds[user:like:item] are saved to ./ondisk_dataset_heterograph/lp-val-like-seeds.npy
Part of val labels[user:like:item] for link prediction: [1. 1. 1.]
LP val labels[user:like:item] are saved to ./ondisk_dataset_heterograph/lp-val-like-labels.npy
Part of val indexes[user:like:item] for link prediction: [0 1 2]
LP val indexes[user:like:item] are saved to ./ondisk_dataset_heterograph/lp-val-like-indexes.npy
Part of val seeds[user:follow:user] for link prediction: [[144 382]
[926 365]
[581 712]]
LP val seeds[user:follow:user] are saved to ./ondisk_dataset_heterograph/lp-val-follow-seeds.npy
Part of val labels[user:follow:user] for link prediction: [1. 1. 1.]
LP val labels[user:follow:user] are saved to ./ondisk_dataset_heterograph/lp-val-follow-labels.npy
Part of val indexes[user:follow:user] for link prediction: [0 1 2]
LP val indexes[user:follow:user] are saved to ./ondisk_dataset_heterograph/lp-val-follow-indexes.npy
Part of test seeds[user:like:item] for link prediction: [[599 385]
[463 702]
[697 260]]
LP test seeds[user:like:item] are saved to ./ondisk_dataset_heterograph/lp-test-like-seeds.npy
Part of test labels[user:like:item] for link prediction: [1. 1. 1.]
LP test labels[user:like:item] are saved to ./ondisk_dataset_heterograph/lp-test-like-labels.npy
Part of test indexes[user:like:item] for link prediction: [0 1 2]
LP test indexes[user:like:item] are saved to ./ondisk_dataset_heterograph/lp-test-like-indexes.npy
Part of test seeds[user:follow:user] for link prediction: [[710 533]
[577 575]
[ 81 104]]
LP test seeds[user:follow:user] are saved to ./ondisk_dataset_heterograph/lp-test-follow-seeds.npy
Part of test labels[user:follow:user] for link prediction: [1. 1. 1.]
LP test labels[user:follow:user] are saved to ./ondisk_dataset_heterograph/lp-test-follow-labels.npy
Part of test indexes[user:follow:user] for link prediction: [0 1 2]
LP test indexes[user:follow:user] are saved to ./ondisk_dataset_heterograph/lp-test-follow-indexes.npy
Organize data into a YAML file
Now we need to create a metadata.yaml
file which contains the paths and data types of the graph structure, feature data, and training/validation/test sets. Please note that all paths should be relative to metadata.yaml
.
For a heterogeneous graph, we need to specify the node/edge type in the type fields. For an edge type, the canonical etype is required, which is a string of the source node type, etype, and destination node type joined by :
.
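For example, the two canonical etypes used in this tutorial are the strings "user:like:item" and "user:follow:user"; in code, such a string is just the three parts joined by a colon:
canonical_etype = ":".join(["user", "follow", "user"])  # -> "user:follow:user"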
Note:
- All paths should be relative to metadata.yaml.
- The following fields are optional and are not specified in the example below:
  - in_memory: indicates whether to load the data into memory or mmap it. Default is True.
Please refer to the YAML specification for more details.
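As a hedged sketch of the optional field (values match the feature entry in the example below), a feature entry that opts into mmap instead of in-memory loading would look like this:
- domain: node
  type: user
  name: feat_0
  format: numpy
  in_memory: false
  path: node-user-feat-0.npy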
[7]:
yaml_content = f"""
dataset_name: heterogeneous_graph_nc_lp
graph:
  nodes:
    - type: user
      num: {num_nodes}
    - type: item
      num: {num_nodes}
  edges:
    - type: "user:like:item"
      format: csv
      path: {os.path.basename(like_edges_path)}
    - type: "user:follow:user"
      format: csv
      path: {os.path.basename(follow_edges_path)}
feature_data:
  - domain: node
    type: user
    name: feat_0
    format: numpy
    path: {os.path.basename(node_user_feat_0_path)}
  - domain: node
    type: user
    name: feat_1
    format: torch
    path: {os.path.basename(node_user_feat_1_path)}
  - domain: node
    type: item
    name: feat_0
    format: numpy
    path: {os.path.basename(node_item_feat_0_path)}
  - domain: node
    type: item
    name: feat_1
    format: torch
    path: {os.path.basename(node_item_feat_1_path)}
  - domain: edge
    type: "user:like:item"
    name: feat_0
    format: numpy
    path: {os.path.basename(edge_like_feat_0_path)}
  - domain: edge
    type: "user:like:item"
    name: feat_1
    format: torch
    path: {os.path.basename(edge_like_feat_1_path)}
  - domain: edge
    type: "user:follow:user"
    name: feat_0
    format: numpy
    path: {os.path.basename(edge_follow_feat_0_path)}
  - domain: edge
    type: "user:follow:user"
    name: feat_1
    format: torch
    path: {os.path.basename(edge_follow_feat_1_path)}
tasks:
  - name: node_classification
    num_classes: 10
    train_set:
      - type: user
        data:
          - name: seeds
            format: numpy
            path: {os.path.basename(nc_train_user_ids_path)}
          - name: labels
            format: torch
            path: {os.path.basename(nc_train_user_labels_path)}
      - type: item
        data:
          - name: seeds
            format: numpy
            path: {os.path.basename(nc_train_item_ids_path)}
          - name: labels
            format: torch
            path: {os.path.basename(nc_train_item_labels_path)}
    validation_set:
      - type: user
        data:
          - name: seeds
            format: numpy
            path: {os.path.basename(nc_val_user_ids_path)}
          - name: labels
            format: torch
            path: {os.path.basename(nc_val_user_labels_path)}
      - type: item
        data:
          - name: seeds
            format: numpy
            path: {os.path.basename(nc_val_item_ids_path)}
          - name: labels
            format: torch
            path: {os.path.basename(nc_val_item_labels_path)}
    test_set:
      - type: user
        data:
          - name: seeds
            format: numpy
            path: {os.path.basename(nc_test_user_ids_path)}
          - name: labels
            format: torch
            path: {os.path.basename(nc_test_user_labels_path)}
      - type: item
        data:
          - name: seeds
            format: numpy
            path: {os.path.basename(nc_test_item_ids_path)}
          - name: labels
            format: torch
            path: {os.path.basename(nc_test_item_labels_path)}
  - name: link_prediction
    num_classes: 10
    train_set:
      - type: "user:like:item"
        data:
          - name: seeds
            format: numpy
            path: {os.path.basename(lp_train_like_seeds_path)}
      - type: "user:follow:user"
        data:
          - name: seeds
            format: numpy
            path: {os.path.basename(lp_train_follow_seeds_path)}
    validation_set:
      - type: "user:like:item"
        data:
          - name: seeds
            format: numpy
            path: {os.path.basename(lp_val_like_seeds_path)}
          - name: labels
            format: numpy
            path: {os.path.basename(lp_val_like_labels_path)}
          - name: indexes
            format: numpy
            path: {os.path.basename(lp_val_like_indexes_path)}
      - type: "user:follow:user"
        data:
          - name: seeds
            format: numpy
            path: {os.path.basename(lp_val_follow_seeds_path)}
          - name: labels
            format: numpy
            path: {os.path.basename(lp_val_follow_labels_path)}
          - name: indexes
            format: numpy
            path: {os.path.basename(lp_val_follow_indexes_path)}
    test_set:
      - type: "user:like:item"
        data:
          - name: seeds
            format: numpy
            path: {os.path.basename(lp_test_like_seeds_path)}
          - name: labels
            format: numpy
            path: {os.path.basename(lp_test_like_labels_path)}
          - name: indexes
            format: numpy
            path: {os.path.basename(lp_test_like_indexes_path)}
      - type: "user:follow:user"
        data:
          - name: seeds
            format: numpy
            path: {os.path.basename(lp_test_follow_seeds_path)}
          - name: labels
            format: numpy
            path: {os.path.basename(lp_test_follow_labels_path)}
          - name: indexes
            format: numpy
            path: {os.path.basename(lp_test_follow_indexes_path)}
"""
metadata_path = os.path.join(base_dir, "metadata.yaml")
with open(metadata_path, "w") as f:
    f.write(yaml_content)
Instantiate OnDiskDataset
Now we're ready to load the dataset via dgl.graphbolt.OnDiskDataset
. When instantiating, we just pass in the base directory where the metadata.yaml
file lies.
During the first instantiation, GraphBolt preprocesses the raw data, for example constructing a FusedCSCSamplingGraph
from the edges. After preprocessing, all data, including the graph, feature data, and training/validation/test sets, are put into the preprocessed
directory. Any subsequent dataset loading will skip the preprocess stage.
After preprocessing, load()
is required to be called explicitly in order to load the graph, feature data, and tasks.
[8]:
dataset = gb.OnDiskDataset(base_dir).load()
graph = dataset.graph
print(f"Loaded graph: {graph}\n")
feature = dataset.feature
print(f"Loaded feature store: {feature}\n")
tasks = dataset.tasks
nc_task = tasks[0]
print(f"Loaded node classification task: {nc_task}\n")
lp_task = tasks[1]
print(f"Loaded link prediction task: {lp_task}\n")
Start to preprocess the on-disk dataset.
Finish preprocessing the on-disk dataset.
Loaded graph: FusedCSCSamplingGraph(csc_indptr=tensor([ 0, 9, 20, ..., 19983, 19990, 20000], dtype=torch.int32),
indices=tensor([1765, 1933, 1528, ..., 1752, 1534, 1272], dtype=torch.int32),
total_num_nodes=2000, num_edges={'user:follow:user': 10000, 'user:like:item': 10000},
node_type_offset=tensor([ 0, 1000, 2000], dtype=torch.int32),
type_per_edge=tensor([1, 1, 1, ..., 0, 0, 0], dtype=torch.uint8),
node_type_to_id={'item': 0, 'user': 1},
edge_type_to_id={'user:follow:user': 0, 'user:like:item': 1},)
Loaded feature store: TorchBasedFeatureStore(
{(<OnDiskFeatureDataDomain.NODE: 'node'>, 'user', 'feat_0'): TorchBasedFeature(
feature=tensor([[0.0582, 0.6226, 0.5956, 0.6640, 0.7912],
[0.2360, 0.2229, 0.8251, 0.0237, 0.1369],
[0.2141, 0.2868, 0.7977, 0.1919, 0.1445],
...,
[0.6758, 0.6131, 0.3490, 0.6229, 0.3385],
[0.2861, 0.9366, 0.8665, 0.7879, 0.8078],
[0.0634, 0.4893, 0.7595, 0.9323, 0.6172]], dtype=torch.float64),
metadata={},
), (<OnDiskFeatureDataDomain.NODE: 'node'>, 'user', 'feat_1'): TorchBasedFeature(
feature=tensor([[0.6937, 0.3329, 0.7192, 0.3123, 0.1724],
[0.7422, 0.8544, 0.9974, 0.8008, 0.8708],
[0.1762, 0.5480, 0.9746, 0.8190, 0.9903],
...,
[0.3797, 0.8574, 0.5656, 0.9678, 0.1397],
[0.2517, 0.6104, 0.2175, 0.3430, 0.6954],
[0.8357, 0.9246, 0.1623, 0.3658, 0.9542]]),
metadata={},
), (<OnDiskFeatureDataDomain.NODE: 'node'>, 'item', 'feat_0'): TorchBasedFeature(
feature=tensor([[0.2033, 0.0110, 0.8083, 0.4303, 0.3782],
[0.3367, 0.5222, 0.8544, 0.5296, 0.2149],
[0.7067, 0.2672, 0.2443, 0.5704, 0.6831],
...,
[0.3594, 0.4731, 0.0955, 0.8706, 0.0313],
[0.2760, 0.6426, 0.6040, 0.0195, 0.4584],
[0.6533, 0.8748, 0.1271, 0.6261, 0.5419]], dtype=torch.float64),
metadata={},
), (<OnDiskFeatureDataDomain.NODE: 'node'>, 'item', 'feat_1'): TorchBasedFeature(
feature=tensor([[0.8859, 0.8434, 0.3209, 0.6922, 0.2568],
[0.4988, 0.8797, 0.9821, 0.8718, 0.3224],
[0.2982, 0.1218, 0.1621, 0.0178, 0.0617],
...,
[0.6471, 0.7150, 0.1411, 0.3613, 0.9244],
[0.9040, 0.2340, 0.0029, 0.4672, 0.8632],
[0.4902, 0.0648, 0.6301, 0.8208, 0.3444]]),
metadata={},
), (<OnDiskFeatureDataDomain.EDGE: 'edge'>, 'user:like:item', 'feat_0'): TorchBasedFeature(
feature=tensor([[0.6676, 0.5160, 0.8094, 0.5983, 0.1612],
[0.5431, 0.3455, 0.2659, 0.5706, 0.3226],
[0.0397, 0.2823, 0.0439, 0.7881, 0.8169],
...,
[0.5587, 0.5632, 0.1673, 0.3457, 0.8432],
[0.1734, 0.0078, 0.8432, 0.6124, 0.0663],
[0.1232, 0.5811, 0.9371, 0.6533, 0.8329]], dtype=torch.float64),
metadata={},
), (<OnDiskFeatureDataDomain.EDGE: 'edge'>, 'user:like:item', 'feat_1'): TorchBasedFeature(
feature=tensor([[0.9647, 0.9774, 0.6275, 0.2024, 0.2268],
[0.1538, 0.3849, 0.9253, 0.7636, 0.1180],
[0.2832, 0.2294, 0.1581, 0.7326, 0.6709],
...,
[0.4754, 0.7786, 0.3426, 0.7748, 0.7076],
[0.4046, 0.1207, 0.5583, 0.4437, 0.1142],
[0.2138, 0.9615, 0.5857, 0.9232, 0.4814]]),
metadata={},
), (<OnDiskFeatureDataDomain.EDGE: 'edge'>, 'user:follow:user', 'feat_0'): TorchBasedFeature(
feature=tensor([[0.5317, 0.7626, 0.6752, 0.0939, 0.7592],
[0.1243, 0.0661, 0.2610, 0.7315, 0.7882],
[0.0775, 0.6664, 0.9519, 0.5119, 0.4044],
...,
[0.0181, 0.9069, 0.0193, 0.2000, 0.6538],
[0.9659, 0.9456, 0.1061, 0.4416, 0.3406],
[0.8894, 0.1650, 0.8613, 0.8644, 0.5971]], dtype=torch.float64),
metadata={},
), (<OnDiskFeatureDataDomain.EDGE: 'edge'>, 'user:follow:user', 'feat_1'): TorchBasedFeature(
feature=tensor([[0.9387, 0.8550, 0.4421, 0.3698, 0.6186],
[0.2243, 0.5516, 0.6031, 0.1381, 0.9809],
[0.2079, 0.1675, 0.6611, 0.8580, 0.5825],
...,
[0.5964, 0.5642, 0.0805, 0.6991, 0.6833],
[0.8988, 0.8473, 0.3402, 0.8153, 0.0780],
[0.8287, 0.8431, 0.0942, 0.5180, 0.3003]]),
metadata={},
)}
)
Loaded node classification task: OnDiskTask(validation_set=HeteroItemSet(
itemsets={'user': ItemSet(
items=(tensor([384, 877, 969, 196, 655, 148, 141, 556, 501, 879, 851, 374, 158, 607,
906, 461, 923, 707, 650, 641, 818, 813, 658, 664, 976, 263, 339, 833,
479, 307, 821, 872, 285, 712, 37, 670, 602, 502, 336, 510, 706, 159,
154, 958, 240, 848, 386, 535, 347, 343, 69, 237, 962, 195, 22, 523,
554, 191, 845, 578, 637, 957, 51, 144, 825, 226, 165, 565, 715, 678,
850, 9, 589, 101, 238, 696, 381, 442, 106, 197, 797, 873, 358, 772,
830, 371, 219, 387, 686, 36, 915, 724, 378, 433, 465, 618, 759, 254,
611, 741, 760, 792, 73, 794, 559, 950, 332, 351, 138, 636, 778, 300,
365, 457, 844, 785, 977, 151, 258, 288, 532, 644, 211, 177, 21, 290,
462, 689, 207, 382, 606, 620, 725, 815, 338, 922, 218, 937, 791, 248,
941, 233, 599, 312, 586, 528, 530, 11, 997, 41, 294, 85, 67, 610,
180, 372, 949, 3, 129, 454, 287, 838, 472, 986, 20, 432, 369, 952,
679, 164, 718, 503, 162, 483, 213, 289, 42, 750, 605, 697, 303, 876,
493, 561, 681, 455, 742, 812, 286, 172, 451, 548, 560, 900, 979, 203,
18, 13, 704, 187], dtype=torch.int32), tensor([3, 4, 3, 4, 7, 7, 6, 5, 5, 5, 3, 0, 2, 3, 5, 4, 3, 3, 7, 7, 1, 7, 2, 2,
3, 2, 9, 2, 8, 5, 8, 0, 2, 9, 2, 4, 2, 2, 1, 4, 4, 6, 4, 1, 7, 9, 4, 4,
3, 2, 5, 8, 2, 8, 8, 8, 7, 6, 6, 8, 2, 5, 4, 7, 8, 3, 6, 5, 4, 1, 3, 3,
0, 0, 1, 6, 2, 9, 1, 6, 4, 7, 6, 5, 4, 3, 0, 8, 8, 9, 5, 7, 2, 7, 1, 4,
3, 3, 1, 7, 0, 4, 7, 1, 4, 5, 4, 6, 1, 9, 9, 7, 0, 0, 5, 5, 3, 3, 7, 6,
5, 3, 9, 7, 4, 4, 6, 7, 1, 9, 7, 9, 6, 9, 0, 0, 3, 8, 2, 1, 3, 3, 8, 7,
6, 2, 6, 4, 9, 6, 6, 2, 4, 1, 1, 2, 6, 6, 7, 8, 1, 0, 9, 5, 9, 6, 5, 4,
1, 4, 4, 7, 4, 4, 3, 4, 9, 0, 4, 4, 1, 4, 1, 6, 6, 7, 5, 7, 5, 8, 7, 6,
7, 4, 3, 4, 2, 4, 6, 1])),
names=('seeds', 'labels'),
), 'item': ItemSet(
items=(tensor([147, 696, 586, 275, 839, 249, 736, 340, 556, 862, 857, 344, 851, 426,
367, 189, 2, 338, 933, 542, 403, 440, 75, 507, 698, 643, 296, 65,
211, 271, 73, 628, 912, 551, 553, 967, 112, 831, 754, 214, 512, 414,
148, 936, 476, 687, 899, 191, 234, 445, 172, 609, 213, 423, 255, 846,
261, 111, 886, 378, 143, 795, 894, 8, 280, 633, 855, 900, 652, 988,
774, 584, 538, 291, 842, 583, 319, 506, 327, 677, 135, 67, 123, 0,
141, 814, 853, 50, 769, 911, 188, 536, 598, 313, 514, 342, 850, 719,
90, 260, 685, 946, 868, 202, 745, 404, 860, 337, 872, 784, 216, 435,
812, 101, 465, 144, 375, 321, 323, 737, 591, 486, 823, 9, 15, 154,
494, 456, 74, 914, 23, 676, 364, 380, 882, 949, 383, 373, 395, 145,
804, 813, 45, 258, 863, 193, 170, 891, 475, 990, 982, 758, 286, 925,
495, 164, 892, 822, 49, 300, 212, 322, 888, 668, 631, 869, 581, 127,
550, 59, 637, 366, 224, 448, 627, 83, 233, 452, 458, 305, 841, 56,
950, 738, 430, 302, 602, 25, 521, 717, 528, 441, 792, 959, 753, 651,
931, 830, 715, 671], dtype=torch.int32), tensor([4, 4, 5, 6, 0, 8, 5, 8, 8, 1, 0, 7, 1, 2, 3, 3, 7, 5, 8, 2, 4, 1, 5, 2,
4, 4, 3, 1, 8, 0, 6, 6, 2, 0, 6, 0, 4, 4, 7, 1, 7, 7, 2, 3, 2, 0, 1, 3,
6, 1, 4, 0, 5, 6, 1, 7, 3, 9, 8, 9, 9, 7, 6, 3, 4, 7, 9, 5, 3, 8, 5, 9,
3, 5, 9, 5, 6, 3, 9, 4, 6, 6, 8, 2, 8, 6, 6, 5, 0, 0, 1, 9, 6, 7, 6, 4,
5, 1, 6, 5, 2, 4, 4, 1, 1, 4, 4, 4, 0, 5, 3, 0, 0, 9, 1, 6, 8, 2, 5, 7,
8, 1, 4, 3, 1, 3, 6, 6, 1, 1, 1, 1, 0, 1, 9, 1, 1, 1, 0, 2, 1, 1, 2, 0,
1, 1, 4, 2, 8, 2, 3, 7, 8, 2, 2, 5, 2, 7, 1, 5, 6, 6, 2, 5, 9, 5, 9, 1,
4, 9, 3, 0, 1, 0, 2, 1, 9, 3, 8, 2, 3, 7, 3, 9, 1, 7, 0, 2, 5, 4, 7, 3,
9, 0, 1, 0, 4, 7, 7, 4])),
names=('seeds', 'labels'),
)},
names=('seeds', 'labels'),
),
train_set=HeteroItemSet(
itemsets={'user': ItemSet(
items=(tensor([645, 75, 874, 337, 334, 438, 474, 490, 903, 710, 777, 814, 295, 870,
811, 885, 767, 35, 890, 70, 486, 496, 630, 841, 526, 604, 424, 883,
246, 79, 8, 788, 423, 185, 784, 257, 446, 633, 735, 824, 907, 379,
967, 557, 998, 115, 29, 682, 994, 427, 651, 685, 415, 250, 313, 987,
439, 506, 770, 616, 945, 659, 632, 849, 323, 104, 518, 751, 499, 48,
853, 223, 7, 866, 437, 643, 817, 964, 839, 239, 72, 98, 112, 534,
947, 640, 673, 619, 224, 2, 887, 951, 779, 871, 46, 449, 631, 426,
410, 328, 136, 293, 119, 826, 266, 281, 568, 127, 395, 216, 137, 822,
588, 43, 836, 68, 771, 134, 229, 621, 842, 153, 843, 400, 646, 363,
15, 155, 856, 309, 205, 634, 244, 761, 317, 799, 763, 52, 471, 401,
749, 459, 105, 135, 513, 444, 529, 299, 975, 54, 469, 942, 468, 321,
690, 595, 980, 965, 325, 925, 601, 380, 892, 888, 491, 492, 727, 368,
87, 520, 930, 97, 863, 156, 417, 570, 674, 688, 963, 26, 531, 464,
178, 412, 356, 614, 441, 880, 416, 204, 500, 182, 911, 928, 512, 726,
898, 988, 253, 24, 896, 708, 217, 249, 272, 470, 939, 190, 653, 875,
192, 241, 891, 279, 649, 597, 575, 19, 55, 598, 865, 926, 440, 452,
45, 720, 283, 999, 497, 562, 480, 613, 919, 33, 38, 758, 284, 803,
617, 910, 878, 769, 220, 699, 683, 663, 488, 16, 525, 762, 716, 188,
389, 453, 576, 745, 774, 886, 905, 583, 515, 667, 160, 212, 647, 862,
786, 864, 329, 996, 721, 808, 627, 206, 314, 245, 754, 39, 403, 419,
584, 62, 425, 383, 902, 982, 566, 306, 227, 31, 917, 431, 775, 753,
255, 485, 855, 74, 929, 121, 298, 86, 435, 921, 703, 748, 404, 345,
717, 349, 790, 139, 671, 175, 693, 624, 56, 82, 10, 698, 126, 396,
270, 143, 734, 27, 722, 398, 146, 326, 585, 481, 820, 352, 943, 456,
429, 268, 376, 128, 543, 277, 361, 262, 322, 215, 669, 1, 635, 95,
411, 893, 827, 399, 938, 914, 832, 972, 569, 522, 145, 320, 743, 654,
30, 847, 736, 692, 157, 916, 495, 259, 527, 348, 44, 932, 421, 894,
476, 301, 773, 800, 273, 450, 305, 407, 168, 746, 829, 346, 125, 533,
402, 202, 375, 494, 319, 505, 60, 966, 260, 668, 83, 701, 793, 308,
593, 524, 377, 304, 869, 91, 516, 970, 728, 364, 80, 675, 695, 225,
504, 590, 912, 840, 713, 766, 805, 738, 536, 992, 113, 507, 665, 132,
626, 467, 795, 99, 264, 991, 214, 324, 959, 166, 198, 662, 955, 342,
615, 108, 61, 909, 64, 555, 920, 443, 291, 353, 809, 755, 394, 370,
549, 508, 77, 672, 236, 282, 828, 709, 179, 732, 776, 694, 648, 509,
642, 737, 434, 133, 882, 473, 366, 445, 810, 122, 235, 230, 711, 804,
184, 484, 49, 731, 622, 629, 574, 802, 546, 657, 537, 661, 931, 163,
373, 448, 318, 582, 927, 269, 895, 209, 638, 100, 330, 116, 176, 581,
609, 261, 57, 149, 256, 538, 477, 333, 327, 78, 392, 140, 58, 23,
660, 109, 391, 360, 6, 782, 393, 798, 194, 131, 837, 409, 174, 702,
76, 265, 908, 691, 796, 719, 59, 628, 981, 111, 580, 357, 547, 170,
482, 519, 63, 367, 274, 447, 428, 167, 594, 801, 96, 460, 271, 53,
124, 32, 199, 28, 783, 89, 88, 228, 924, 953, 92, 123, 517, 267,
189, 231, 677, 297, 341, 118, 934, 81, 511, 406, 147, 868],
dtype=torch.int32), tensor([2, 8, 9, 6, 9, 6, 7, 7, 3, 9, 2, 8, 1, 9, 1, 2, 8, 2, 6, 7, 9, 4, 5, 8,
3, 3, 4, 1, 2, 4, 3, 4, 1, 2, 5, 4, 7, 7, 2, 2, 6, 6, 0, 5, 4, 1, 8, 5,
5, 6, 2, 6, 1, 0, 6, 9, 7, 3, 2, 9, 6, 2, 4, 8, 3, 7, 3, 3, 0, 4, 0, 9,
5, 7, 7, 1, 1, 5, 9, 5, 8, 9, 9, 5, 7, 9, 9, 6, 6, 6, 3, 0, 7, 5, 6, 0,
8, 2, 3, 2, 6, 4, 8, 5, 9, 3, 4, 4, 4, 9, 5, 2, 8, 5, 0, 8, 2, 1, 3, 7,
5, 8, 7, 3, 3, 9, 0, 9, 7, 4, 2, 3, 0, 4, 9, 6, 3, 8, 7, 4, 4, 8, 5, 3,
7, 0, 0, 4, 5, 0, 6, 3, 5, 4, 5, 3, 8, 7, 7, 7, 8, 3, 2, 7, 8, 9, 9, 5,
7, 8, 9, 9, 6, 1, 9, 3, 1, 7, 7, 1, 2, 5, 7, 2, 2, 1, 6, 3, 6, 9, 5, 4,
1, 0, 3, 1, 7, 1, 4, 7, 8, 5, 5, 6, 1, 9, 2, 7, 8, 1, 8, 1, 1, 5, 8, 0,
5, 7, 6, 3, 1, 0, 9, 6, 3, 4, 7, 0, 0, 1, 4, 2, 6, 9, 2, 0, 9, 8, 3, 6,
1, 9, 9, 0, 4, 6, 5, 0, 6, 4, 7, 3, 1, 9, 3, 2, 6, 0, 3, 6, 4, 8, 3, 7,
8, 4, 7, 6, 7, 0, 3, 5, 9, 6, 6, 0, 4, 1, 9, 4, 5, 1, 4, 6, 0, 0, 2, 7,
8, 6, 8, 7, 7, 5, 6, 4, 5, 6, 3, 6, 1, 9, 3, 4, 4, 3, 3, 9, 4, 5, 9, 7,
9, 1, 7, 2, 3, 7, 1, 6, 2, 4, 9, 1, 0, 2, 0, 4, 8, 0, 2, 4, 7, 7, 9, 4,
8, 1, 4, 3, 9, 5, 8, 3, 0, 5, 7, 3, 4, 5, 5, 5, 9, 6, 8, 1, 2, 7, 6, 3,
4, 2, 4, 0, 1, 3, 8, 4, 4, 7, 3, 8, 8, 3, 0, 4, 5, 2, 8, 8, 9, 6, 6, 9,
2, 4, 7, 1, 5, 0, 5, 1, 8, 4, 8, 5, 5, 4, 8, 3, 9, 2, 3, 9, 6, 9, 8, 0,
7, 6, 1, 1, 8, 8, 3, 6, 9, 4, 2, 3, 0, 3, 4, 0, 0, 0, 2, 9, 1, 6, 7, 1,
9, 3, 9, 8, 6, 5, 4, 0, 6, 6, 4, 3, 7, 5, 8, 5, 3, 9, 0, 2, 2, 6, 3, 0,
7, 4, 9, 1, 2, 0, 2, 7, 8, 5, 9, 3, 0, 4, 9, 4, 8, 5, 9, 6, 8, 2, 4, 0,
2, 5, 0, 4, 6, 6, 6, 5, 1, 5, 9, 8, 0, 1, 4, 4, 9, 0, 3, 5, 6, 0, 6, 0,
3, 0, 0, 8, 2, 6, 9, 7, 6, 7, 1, 4, 8, 4, 1, 0, 0, 9, 6, 1, 8, 7, 6, 6,
2, 1, 9, 0, 5, 2, 2, 6, 4, 8, 5, 3, 5, 4, 7, 1, 0, 6, 8, 9, 6, 8, 5, 1,
4, 1, 5, 6, 6, 7, 1, 4, 4, 3, 0, 2, 9, 0, 0, 0, 0, 1, 2, 3, 7, 6, 9, 5,
2, 0, 3, 5, 5, 6, 5, 0, 6, 1, 5, 6, 8, 1, 7, 2, 8, 9, 9, 7, 1, 4, 6, 4])),
names=('seeds', 'labels'),
), 'item': ItemSet(
items=(tensor([ 37, 469, 825, 314, 493, 555, 961, 64, 359, 284, 508, 735, 624, 725,
468, 208, 529, 605, 788, 589, 79, 608, 455, 547, 559, 136, 281, 365,
361, 641, 142, 565, 994, 979, 898, 230, 92, 805, 619, 603, 908, 885,
432, 437, 450, 198, 215, 980, 491, 663, 168, 827, 320, 14, 13, 114,
4, 166, 752, 683, 119, 22, 675, 352, 303, 250, 498, 579, 330, 236,
69, 526, 357, 716, 759, 907, 947, 118, 727, 206, 653, 995, 548, 921,
622, 336, 968, 615, 434, 543, 130, 304, 732, 726, 545, 733, 499, 33,
128, 155, 590, 692, 71, 488, 405, 221, 718, 58, 138, 24, 347, 913,
693, 688, 640, 939, 523, 294, 226, 816, 707, 220, 552, 756, 776, 887,
107, 462, 96, 117, 477, 971, 570, 870, 150, 832, 204, 572, 527, 315,
358, 750, 702, 761, 613, 740, 428, 377, 760, 699, 684, 283, 177, 376,
483, 184, 866, 951, 392, 948, 149, 617, 237, 156, 461, 385, 257, 171,
182, 513, 12, 11, 858, 187, 539, 903, 109, 94, 742, 639, 656, 973,
632, 878, 363, 689, 519, 585, 798, 614, 755, 582, 904, 120, 47, 290,
46, 577, 642, 937, 779, 895, 381, 174, 875, 106, 778, 232, 781, 673,
629, 153, 749, 248, 766, 341, 861, 43, 844, 501, 115, 929, 943, 183,
709, 223, 102, 333, 372, 26, 489, 429, 840, 871, 517, 681, 185, 963,
60, 399, 408, 91, 879, 540, 70, 773, 906, 384, 470, 833, 282, 978,
68, 927, 301, 800, 262, 356, 328, 576, 670, 944, 93, 163, 647, 658,
966, 920, 730, 442, 480, 826, 525, 10, 484, 932, 767, 421, 616, 848,
648, 532, 952, 703, 744, 487, 623, 161, 345, 87, 51, 502, 316, 578,
287, 695, 278, 562, 701, 873, 902, 210, 277, 505, 524, 593, 558, 680,
317, 731, 285, 251, 985, 723, 567, 332, 201, 238, 771, 179, 407, 247,
922, 787, 229, 690, 386, 661, 369, 140, 203, 53, 152, 393, 896, 95,
669, 992, 18, 409, 253, 263, 573, 134, 446, 346, 360, 293, 490, 246,
544, 422, 454, 227, 307, 836, 955, 965, 799, 657, 350, 722, 397, 901,
564, 86, 794, 396, 427, 708, 883, 665, 331, 244, 884, 522, 809, 273,
266, 739, 984, 801, 268, 431, 926, 618, 370, 124, 509, 235, 259, 746,
569, 52, 355, 122, 705, 169, 34, 515, 225, 534, 137, 786, 500, 3,
81, 401, 103, 587, 976, 63, 686, 433, 88, 835, 607, 780, 389, 697,
554, 610, 560, 620, 964, 824, 353, 36, 748, 374, 400, 864, 447, 391,
724, 121, 981, 272, 77, 764, 537, 411, 713, 31, 729, 596, 634, 35,
39, 126, 390, 938, 254, 785, 649, 777, 763, 84, 474, 20, 765, 173,
503, 710, 412, 960, 209, 19, 890, 132, 349, 38, 845, 131, 796, 466,
318, 416, 76, 57, 636, 270, 626, 256, 289, 808, 549, 492, 867, 594,
473, 267, 190, 496, 970, 533, 811, 625, 772, 566, 595, 85, 751, 986,
834, 242, 646, 394, 402, 644, 910, 482, 98, 6, 464, 158, 897, 348,
592, 197, 874, 838, 89, 54, 996, 650, 546, 694, 797, 588, 217, 451,
7, 312, 269, 406, 654, 175, 734, 817, 61, 438, 444, 425, 66, 240,
820, 930, 264, 16, 481, 571, 604, 958, 398, 424, 843, 292, 165, 265,
942, 743, 326, 923, 815, 600, 969, 655, 176, 889, 467, 711, 339, 105,
821, 601, 520, 541, 78, 810, 274, 151, 239, 924, 599, 700, 597, 30,
662, 674, 129, 194, 535, 791, 721, 915, 5, 611, 207, 561],
dtype=torch.int32), tensor([7, 7, 1, 6, 8, 3, 0, 0, 5, 9, 5, 1, 6, 7, 2, 9, 4, 9, 5, 8, 1, 5, 4, 5,
0, 5, 2, 9, 0, 9, 4, 0, 2, 2, 5, 5, 7, 1, 1, 0, 1, 6, 6, 9, 3, 6, 4, 3,
6, 4, 7, 2, 0, 0, 2, 7, 0, 4, 6, 7, 3, 3, 5, 1, 9, 5, 9, 1, 3, 3, 2, 9,
4, 0, 3, 4, 2, 2, 9, 1, 7, 0, 0, 4, 7, 3, 6, 1, 3, 6, 5, 4, 9, 3, 4, 3,
6, 3, 1, 5, 0, 5, 7, 6, 1, 5, 5, 3, 9, 6, 7, 4, 6, 9, 7, 5, 9, 0, 0, 1,
8, 2, 5, 6, 6, 3, 7, 3, 9, 3, 0, 5, 4, 4, 6, 7, 2, 7, 1, 1, 3, 8, 4, 9,
9, 4, 4, 3, 4, 9, 5, 4, 7, 1, 9, 2, 0, 8, 0, 0, 4, 3, 3, 3, 4, 3, 5, 2,
2, 5, 2, 4, 3, 8, 6, 4, 3, 8, 2, 9, 5, 6, 9, 5, 0, 0, 3, 4, 0, 3, 1, 3,
3, 2, 2, 4, 0, 0, 4, 8, 4, 6, 4, 8, 9, 6, 2, 8, 4, 9, 7, 9, 0, 3, 2, 5,
4, 9, 6, 0, 3, 9, 9, 5, 8, 6, 6, 3, 1, 6, 4, 9, 1, 6, 9, 9, 5, 7, 5, 0,
5, 0, 7, 9, 0, 2, 9, 7, 5, 3, 6, 0, 3, 1, 1, 6, 7, 2, 2, 9, 9, 4, 7, 6,
4, 9, 9, 4, 1, 8, 1, 4, 5, 4, 9, 9, 6, 6, 4, 1, 6, 1, 0, 7, 6, 6, 3, 2,
2, 5, 5, 4, 3, 1, 1, 0, 9, 1, 2, 2, 9, 4, 1, 8, 4, 5, 0, 9, 7, 7, 1, 1,
9, 4, 7, 9, 8, 1, 9, 0, 6, 2, 3, 7, 2, 3, 6, 0, 0, 6, 3, 2, 1, 7, 4, 6,
7, 9, 3, 1, 3, 0, 9, 8, 4, 0, 0, 3, 7, 6, 8, 3, 7, 6, 7, 2, 5, 0, 0, 9,
9, 5, 9, 6, 7, 3, 1, 7, 1, 5, 7, 5, 2, 4, 6, 0, 0, 0, 4, 2, 5, 8, 5, 0,
7, 2, 3, 5, 1, 1, 1, 6, 2, 1, 7, 1, 8, 7, 7, 5, 4, 8, 5, 7, 7, 6, 7, 7,
4, 6, 2, 0, 8, 6, 0, 9, 8, 2, 2, 0, 3, 2, 4, 8, 3, 5, 7, 9, 0, 5, 8, 1,
9, 1, 9, 1, 0, 8, 0, 3, 9, 4, 2, 4, 7, 5, 2, 5, 9, 4, 2, 5, 3, 0, 6, 2,
8, 2, 3, 4, 1, 5, 7, 0, 1, 8, 1, 2, 4, 4, 2, 0, 8, 0, 1, 4, 5, 5, 3, 4,
6, 3, 2, 0, 3, 2, 8, 5, 6, 1, 1, 4, 0, 2, 8, 6, 5, 9, 7, 6, 7, 2, 6, 8,
8, 1, 6, 5, 9, 7, 9, 0, 9, 7, 6, 5, 5, 1, 4, 0, 3, 7, 4, 0, 0, 6, 3, 7,
1, 7, 3, 0, 7, 6, 0, 5, 4, 1, 5, 1, 8, 2, 7, 1, 3, 9, 6, 6, 7, 6, 0, 0,
4, 9, 3, 7, 5, 8, 3, 0, 9, 4, 8, 0, 8, 6, 7, 8, 6, 7, 5, 6, 6, 1, 0, 3,
7, 9, 7, 3, 6, 4, 8, 4, 6, 2, 9, 0, 6, 4, 0, 0, 3, 3, 5, 8, 8, 3, 1, 0])),
names=('seeds', 'labels'),
)},
names=('seeds', 'labels'),
),
test_set=HeteroItemSet(
itemsets={'user': ItemSet(
items=(tensor([946, 247, 458, 4, 744, 889, 93, 780, 278, 984, 359, 995, 974, 181,
579, 414, 550, 700, 193, 787, 222, 90, 114, 514, 816, 913, 971, 747,
807, 676, 221, 117, 47, 344, 5, 201, 612, 553, 84, 846, 884, 551,
563, 596, 897, 17, 296, 904, 110, 310, 103, 34, 302, 985, 933, 936,
418, 954, 200, 12, 183, 789, 858, 656, 232, 0, 478, 591, 25, 544,
739, 918, 489, 408, 234, 14, 388, 152, 854, 540, 280, 150, 390, 723,
436, 545, 857, 171, 835, 765, 539, 107, 687, 899, 292, 542, 71, 558,
960, 385, 340, 275, 623, 989, 592, 652, 781, 94, 993, 487, 968, 173,
729, 130, 354, 243, 973, 823, 756, 831, 639, 608, 430, 242, 65, 466,
990, 956, 463, 315, 420, 768, 577, 66, 730, 311, 142, 752, 169, 355,
860, 852, 208, 944, 521, 666, 498, 940, 764, 350, 961, 40, 335, 405,
102, 978, 625, 680, 186, 806, 600, 413, 861, 475, 161, 120, 714, 552,
587, 210, 564, 983, 567, 422, 331, 252, 859, 573, 251, 834, 705, 901,
881, 733, 740, 362, 316, 50, 541, 397, 867, 935, 571, 819, 276, 684,
572, 948, 757, 603], dtype=torch.int32), tensor([3, 0, 4, 4, 1, 4, 2, 8, 7, 2, 5, 5, 9, 0, 7, 9, 6, 7, 9, 7, 4, 5, 6, 3,
2, 2, 8, 4, 9, 7, 1, 6, 7, 6, 0, 1, 6, 4, 4, 7, 3, 4, 4, 5, 1, 8, 1, 9,
7, 2, 1, 7, 8, 4, 7, 9, 6, 9, 0, 5, 4, 7, 6, 4, 7, 1, 1, 8, 3, 1, 3, 4,
7, 3, 0, 6, 0, 2, 9, 9, 8, 1, 6, 4, 3, 7, 9, 5, 0, 0, 4, 9, 2, 3, 4, 4,
5, 1, 2, 6, 2, 9, 6, 2, 8, 4, 8, 6, 4, 4, 9, 6, 2, 3, 4, 8, 6, 0, 0, 7,
1, 3, 5, 8, 2, 2, 0, 0, 2, 8, 0, 3, 2, 8, 6, 7, 2, 6, 7, 4, 0, 4, 6, 6,
4, 4, 8, 1, 5, 6, 6, 5, 8, 4, 2, 1, 5, 0, 5, 5, 0, 5, 8, 0, 2, 6, 6, 0,
6, 0, 3, 2, 5, 1, 6, 9, 6, 7, 7, 2, 4, 3, 6, 6, 3, 3, 2, 7, 2, 5, 1, 3,
0, 1, 7, 6, 3, 7, 0, 1])),
names=('seeds', 'labels'),
), 'item': ItemSet(
items=(tensor([ 28, 485, 116, 219, 351, 621, 876, 295, 934, 354, 818, 918, 568, 956,
1, 40, 783, 881, 157, 471, 222, 382, 195, 324, 691, 666, 782, 419,
987, 139, 420, 457, 80, 977, 417, 741, 672, 935, 993, 72, 865, 21,
770, 997, 919, 388, 664, 410, 113, 916, 880, 945, 472, 849, 893, 231,
335, 557, 531, 460, 205, 511, 218, 228, 310, 387, 909, 178, 55, 847,
989, 837, 371, 504, 308, 276, 706, 940, 972, 104, 747, 957, 679, 928,
379, 659, 953, 998, 638, 297, 667, 580, 606, 877, 32, 983, 807, 510,
362, 828, 299, 241, 100, 125, 436, 806, 974, 439, 793, 678, 854, 612,
243, 311, 27, 162, 775, 288, 714, 413, 905, 110, 768, 790, 279, 334,
159, 196, 829, 368, 516, 343, 252, 133, 859, 941, 803, 530, 728, 720,
789, 62, 42, 962, 630, 108, 181, 518, 479, 819, 575, 306, 199, 443,
200, 418, 463, 999, 712, 852, 186, 635, 453, 99, 682, 991, 449, 17,
167, 954, 917, 325, 660, 309, 478, 563, 415, 459, 97, 802, 192, 856,
41, 180, 762, 298, 44, 245, 645, 757, 82, 975, 497, 704, 160, 329,
29, 574, 48, 146], dtype=torch.int32), tensor([5, 2, 1, 2, 4, 1, 9, 9, 9, 4, 8, 6, 1, 2, 4, 3, 5, 0, 3, 8, 2, 1, 3, 1,
9, 7, 0, 6, 1, 5, 5, 4, 3, 5, 2, 3, 3, 3, 1, 2, 1, 7, 8, 4, 1, 2, 3, 8,
9, 6, 8, 9, 1, 9, 1, 4, 9, 3, 3, 7, 8, 7, 8, 9, 2, 1, 1, 2, 2, 8, 0, 4,
2, 8, 0, 8, 5, 8, 1, 9, 5, 7, 6, 3, 4, 6, 3, 3, 5, 6, 2, 7, 5, 7, 9, 9,
3, 6, 7, 5, 0, 2, 7, 2, 6, 2, 6, 1, 1, 1, 7, 9, 1, 9, 3, 7, 7, 4, 2, 2,
4, 5, 6, 3, 3, 6, 4, 1, 3, 5, 6, 9, 6, 7, 4, 8, 7, 7, 3, 7, 3, 4, 0, 9,
5, 6, 1, 1, 7, 0, 6, 5, 1, 9, 0, 5, 8, 1, 0, 1, 4, 6, 8, 8, 5, 8, 9, 0,
5, 3, 1, 8, 9, 6, 1, 9, 2, 3, 0, 6, 8, 5, 7, 0, 3, 0, 7, 3, 4, 6, 5, 0,
2, 1, 5, 3, 0, 1, 4, 2])),
names=('seeds', 'labels'),
)},
names=('seeds', 'labels'),
),
metadata={'name': 'node_classification', 'num_classes': 10},)
Loaded link prediction task: OnDiskTask(validation_set=HeteroItemSet(
itemsets={'user:like:item': ItemSet(
items=(tensor([[315, 755],
[251, 979],
[282, 863],
...,
[886, 514],
[886, 178],
[886, 736]], dtype=torch.int32), tensor([1., 1., 1., ..., 0., 0., 0.], dtype=torch.float64), tensor([ 0, 1, 2, ..., 1999, 1999, 1999])),
names=('seeds', 'labels', 'indexes'),
), 'user:follow:user': ItemSet(
items=(tensor([[144, 382],
[926, 365],
[581, 712],
...,
[622, 746],
[622, 249],
[622, 482]], dtype=torch.int32), tensor([1., 1., 1., ..., 0., 0., 0.], dtype=torch.float64), tensor([ 0, 1, 2, ..., 1999, 1999, 1999])),
names=('seeds', 'labels', 'indexes'),
)},
names=('seeds', 'labels', 'indexes'),
),
train_set=HeteroItemSet(
itemsets={'user:like:item': ItemSet(
items=(tensor([[380, 336],
[487, 223],
[977, 689],
...,
[466, 873],
[337, 551],
[775, 881]], dtype=torch.int32),),
names=('seeds',),
), 'user:follow:user': ItemSet(
items=(tensor([[153, 111],
[ 33, 32],
[611, 91],
...,
[475, 53],
[114, 508],
[468, 876]], dtype=torch.int32),),
names=('seeds',),
)},
names=('seeds',),
),
test_set=HeteroItemSet(
itemsets={'user:like:item': ItemSet(
items=(tensor([[599, 385],
[463, 702],
[697, 260],
...,
[770, 820],
[770, 58],
[770, 704]], dtype=torch.int32), tensor([1., 1., 1., ..., 0., 0., 0.], dtype=torch.float64), tensor([ 0, 1, 2, ..., 1999, 1999, 1999])),
names=('seeds', 'labels', 'indexes'),
), 'user:follow:user': ItemSet(
items=(tensor([[710, 533],
[577, 575],
[ 81, 104],
...,
[373, 992],
[373, 272],
[373, 627]], dtype=torch.int32), tensor([1., 1., 1., ..., 0., 0., 0.], dtype=torch.float64), tensor([ 0, 1, 2, ..., 1999, 1999, 1999])),
names=('seeds', 'labels', 'indexes'),
)},
names=('seeds', 'labels', 'indexes'),
),
metadata={'name': 'link_prediction', 'num_classes': 10},)
/dgl/python/dgl/graphbolt/impl/ondisk_dataset.py:463: GBWarning: Edge feature is stored, but edge IDs are not saved.
gb_warning("Edge feature is stored, but edge IDs are not saved.")
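With the dataset loaded, the sets can be consumed directly. Here is a minimal sketch (assuming the objects created above; the batch size is arbitrary) that iterates over the node classification training set with GraphBolt's ItemSampler:
datapipe = gb.ItemSampler(nc_task.train_set, batch_size=16, shuffle=True)
for minibatch in datapipe:
    # Each minibatch carries per-type seeds and labels, as named in metadata.yaml.
    print(minibatch)
    break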