Tianjin University: A New Heterogeneous Graph Neural Network Architecture Based on Attribute Completion

2021-10-14 04:58:18 Doctor of artificial intelligence

Reposted from: Almost Human

On April 23, WWW 2021 (The Web Conference 2021: International World Wide Web Conference), the top conference in the World Wide Web field, announced the winners of its Best Paper Award (Winner and Runner-Up). The paper "Heterogeneous Graph Neural Network via Attribute Completion", from the team of Associate Professor Jin Di at Tianjin University, won the Best Paper Award (Runner-Up).

WWW (now renamed TheWebConf) is the top conference in the World Wide Web field. It was founded by Turing Award winner Tim Berners-Lee, is rated a CCF-A conference by the China Computer Federation, and is held once a year. This year's WWW received 1,736 submissions and accepted 357 papers, an acceptance rate of 20.6%. The Best Paper Award has one winner and one runner-up.

The WWW 2021 Best Paper Award (Runner-Up) went to the team of Associate Professor Jin Di of Tianjin University. The work formulates the attribute completion problem for heterogeneous information networks and proposes an efficient solution to it. The scheme is orthogonal to existing heterogeneous graph neural network frameworks and achieves excellent results on multiple real-world heterogeneous datasets.

  • Paper link: https://dl.acm.org/doi/10.1145/3442381.3449914

  • Code link: https://github.com/jindi-tju/HGNN-AC

1. Abstract

Heterogeneous information networks (HINs), also known as heterogeneous graphs, are complex networks composed of multiple types of nodes and edges, and they carry comprehensive information and rich semantics. Graph neural networks (GNNs), a powerful tool for processing graph-structured data, perform very well on network analysis tasks, and many GNN-based heterogeneous graph models have recently been proposed with great success. GNNs learn graph representations by propagating and aggregating node attributes, so complete node attributes are a prerequisite for these algorithms to run. However, most real-world scenarios suffer from incomplete information. In heterogeneous information networks this shows up as the attributes of entire node types being missing. For example, in the ACM citation network, which contains three types of nodes, only paper nodes have original attributes, while author and subject nodes have none. Unlike homogeneous networks, where attributes are missing only for some nodes or only in some dimensions, attribute missingness in heterogeneous networks is more severe and more complex.

Existing heterogeneous network representation learning methods mainly improve performance by improving the model, and fill in missing attributes with simple hand-crafted imputation (for example, average imputation or one-hot vectors). These methods separate attribute completion from graph representation learning and overlook how much accurate attributes matter for downstream tasks, so simply imputed attributes can hardly guarantee model performance. In fact, accurate input is the basis for improving any model, and it matters even more given the more complex attribute missingness of heterogeneous networks. Therefore, alongside designing new models, principled and accurate completion of missing attributes should become another important research direction for heterogeneous network analysis, and attribute completion and model design can reinforce each other. Based on this, the paper proposes a learnable way to complete missing attributes, and builds a general heterogeneous graph neural network framework for attribute-missing heterogeneous networks (HGNN-AC) in which attribute completion and the graph neural network model enhance each other.

HGNN-AC has four key designs: topology-based pre-learning of prior knowledge, attention-based attribute completion, a weakly supervised reconstruction loss, and an end-to-end model. Extensive experiments on three real-world heterogeneous networks show that the proposed framework outperforms state-of-the-art baselines.

2. Method

The proposed framework consists of four parts (as shown in the figure below). First, a classical heterogeneous network representation learning method uses the topology to obtain topological representations of the nodes; this captures the high-order topological relationships between nodes as prior knowledge for attribute completion. Second, based on the topological representations, the relationship between each attribute-less node and its directly connected attributed nodes is computed, and the attributes of those attributed nodes are aggregated with the resulting weights to complete the attributes of the attribute-less node. Third, the attributes of some attributed nodes are randomly dropped, and the proposed completion method reconstructs them, which yields a weakly supervised loss. Finally, attribute completion is combined with a GNN-based heterogeneous model so that the whole system is end-to-end and the completion is task-oriented.

[Figure: overall framework of HGNN-AC]

1) Pre-learning of node topological representations

Because the semantic information carried by topology and by attributes in a network is often strongly similar, the paper argues that the high-order heterogeneous relationships in the network topology help attribute completion. It therefore uses a classical heterogeneous network representation learning method (for example, metapath2vec), which captures the relationships between nodes from the topology, to learn node representations H, and uses H as prior knowledge to guide attribute completion.
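To make the pre-learning step more concrete, here is a minimal sketch of a meta-path-guided random walk, the sampling step that metapath2vec relies on before it trains a skip-gram model on the walks. Only the walk generation is shown. The function name, the dictionary-based graph layout, and the cyclic type pattern (for example ["A", "P"] for author-paper-author-...) are illustrative assumptions, not the authors' implementation.

```python
import random

def metapath_walk(neighbors, node_type, start, metapath, walk_length, rng):
    """Random walk that only steps to neighbors whose type follows the meta-path.

    neighbors: dict mapping a node to the list of its adjacent nodes
    node_type: dict mapping a node to its type label, e.g. "A" or "P"
    metapath:  cyclic type pattern (assumed convention), e.g. ["A", "P"];
               metapath[0] is expected to be the type of `start`
    """
    walk = [start]
    pos = 0
    while len(walk) < walk_length:
        pos = (pos + 1) % len(metapath)  # type required for the next node
        candidates = [n for n in neighbors[walk[-1]]
                      if node_type[n] == metapath[pos]]
        if not candidates:  # dead end: stop the walk early
            break
        walk.append(rng.choice(candidates))
    return walk

# Toy graph: two authors connected through one paper.
neighbors = {"a1": ["p1"], "a2": ["p1"], "p1": ["a1", "a2"]}
node_type = {"a1": "A", "a2": "A", "p1": "P"}
walk = metapath_walk(neighbors, node_type, "a1", ["A", "P"], 5, random.Random(0))
```

The resulting walks would then be fed to a skip-gram style embedding model to obtain the representations H used as prior knowledge.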

2) Attribute completion based on attention mechanism

Let $V^+$ denote the set of nodes with attributes and $V^-$ the set of nodes whose attributes are missing. Using the prior knowledge H obtained above, an attention mechanism computes the importance of each first-order neighbor of a target node with missing attributes. The attributes of the first-order neighbors that have attributes (nodes in $V^+$) are then aggregated according to these importance coefficients to complete the attributes of the target node (a node in $V^-$).

Specifically, given a node pair (v, u) and the corresponding node representations h_v and h_u (where v is a target node without attributes and u belongs to $N_v^+$, the set of first-order neighbors of v that have attributes), the importance coefficient of node u for node v is computed as:

[Equation: importance coefficient of node u for node v]

The coefficient is then normalized:

[Equation: normalized importance coefficient]

The original attributes of the nodes in $N_v^+$ are aggregated according to the normalized coefficients to complete the attributes of the target node v:

[Equation: completed attribute of node v as a weighted sum of neighbor attributes]
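The three formulas in this step were embedded as images. One form consistent with the description is sketched below. The bilinear score with a learnable matrix $W$, the activation $\sigma$, and the symbol names $e_{vu}$, $a_{vu}$, $x_u$, $x_v^{C}$ and $N_v^+$ (the attributed first-order neighbors of v) are assumptions, not a transcription of the paper's notation.

```latex
\begin{align}
e_{vu} &= \sigma\!\left(h_v^{\top} W h_u\right), \\
a_{vu} &= \frac{\exp(e_{vu})}{\sum_{s \in N_v^+} \exp(e_{vs})}, \\
x_v^{C} &= \sum_{u \in N_v^+} a_{vu}\, x_u .
\end{align}
```

Under this reading, the coefficients $a_{vu}$ sum to one over $N_v^+$, so the completed attribute $x_v^{C}$ is a convex combination of the neighbors' original attributes.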

To stabilize the learning process and reduce variance, the paper finally uses a multi-head attention mechanism to complete the attributes:

[Equation: multi-head attention-based attribute completion]
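The following NumPy sketch shows what attention-based completion for a single target node could look like when averaged over several heads. It assumes a bilinear score h_v^T W h_u passed through tanh, one weight matrix per head, and averaging as the way heads are combined. These choices and all function and parameter names are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def complete_attributes(h_v, H_nbrs, X_nbrs, W_heads):
    """Complete the attribute vector of one attribute-less node v.

    h_v:     (d,)      topological embedding of the target node
    H_nbrs:  (n, d)    embeddings of its attributed first-order neighbors
    X_nbrs:  (n, f)    original attributes of those neighbors
    W_heads: (K, d, d) one assumed bilinear weight matrix per attention head
    """
    per_head = []
    for W in W_heads:
        # score for every neighbor u: tanh(h_v^T W h_u)
        scores = np.tanh(H_nbrs @ (W.T @ h_v))
        weights = softmax(scores)          # normalize over the neighbors
        per_head.append(weights @ X_nbrs)  # weighted sum of neighbor attributes
    return np.mean(per_head, axis=0)       # average the K heads

h_v = np.array([1.0, 0.0])
H_nbrs = np.array([[1.0, 0.0], [0.0, 1.0]])
X_nbrs = np.array([[2.0, 0.0], [0.0, 4.0]])
uniform = complete_attributes(h_v, H_nbrs, X_nbrs, np.zeros((3, 2, 2)))
focused = complete_attributes(h_v, H_nbrs, X_nbrs, 50.0 * np.eye(2)[None])
```

With all-zero weights every neighbor gets the same score, so the result reduces to plain average imputation. Learned weights shift the mass toward neighbors whose embeddings are more related to the target node.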

3) Dropping original attributes to build a weakly supervised loss

To make the attribute completion process learnable while keeping the completed attributes accurate, the nodes with original attributes are randomly divided into $V^+_{drop}$ and $V^+_{keep}$. The attributes of the nodes in $V^+_{drop}$ are deleted, and the completion mechanism from the previous step is used to reconstruct them:

[Equation: reconstruction of the dropped attributes]

The weakly supervised loss for attribute completion is the Euclidean distance between the original attributes and the reconstructed attributes:

[Equation: attribute completion loss]
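A small sketch of the drop-and-reconstruct supervision, under stated assumptions: the split uses a hypothetical `drop_ratio` parameter, and the loss is taken as the mean Euclidean distance over the dropped nodes (the text only says "Euclidean distance", so averaging rather than summing is an assumption). Function names are illustrative.

```python
import numpy as np

def split_attributed_nodes(node_ids, drop_ratio, rng):
    """Randomly split attributed nodes into a dropped part and a kept part."""
    ids = rng.permutation(node_ids)
    n_drop = max(1, int(round(drop_ratio * len(ids))))
    return ids[:n_drop], ids[n_drop:]

def completion_loss(X_true, X_rec):
    """Mean Euclidean distance between original and reconstructed rows."""
    return float(np.mean(np.linalg.norm(X_true - X_rec, axis=1)))

rng = np.random.default_rng(0)
drop_ids, keep_ids = split_attributed_nodes(np.arange(10), 0.3, rng)

X_true = np.array([[0.0, 0.0], [3.0, 4.0]])
X_rec = np.zeros((2, 2))
loss = completion_loss(X_true, X_rec)  # row distances are 0 and 5
```

In a full pipeline, `X_rec` would come from running the attention-based completion on the dropped nodes while using only the kept nodes as attribute sources.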

4) Combining with a heterogeneous graph neural network model to build an end-to-end system

With the proposed completion mechanism, the existing attributes and the completed attributes are combined into a complete attribute matrix:

[Equation: complete attribute matrix]

The complete attribute matrix and the topology are fed into the graph neural network model to obtain the label prediction loss:

[Equation: label prediction loss]

To achieve task-oriented attribute completion, the label prediction loss and the attribute completion loss are combined, and the two are jointly optimized in an end-to-end system:

[Equation: joint loss]
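The formulas in this step were also embedded as images. A form consistent with the description is sketched below. The symbols are assumptions: $X^{new}$ is the complete attribute matrix, $x_i^{C}$ a completed attribute, $A$ the network topology, $\Phi$ an arbitrary heterogeneous GNN, $f$ a classification loss such as cross-entropy, and $\lambda$ a weighting coefficient between the two losses (the text does not say how they are weighted).

```latex
\begin{align}
X^{new} &= \{\, x_i^{C},\; x_j \mid i \in V^-,\; j \in V^+ \,\}, \\
\tilde{Y} &= \Phi\!\left(A, X^{new}\right), \qquad
\mathcal{L}_{prediction} = f\!\left(\tilde{Y}, Y\right), \\
\mathcal{L} &= \lambda\, \mathcal{L}_{completion} + \mathcal{L}_{prediction}.
\end{align}
```

Because $\Phi$ is left abstract, the same objective applies to any backbone that consumes a full attribute matrix, which matches the claim that the framework is orthogonal to existing models.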

3. Experiments

Experiments are carried out on three real-world heterogeneous network datasets. The dataset statistics are as follows:

[Table: dataset statistics]

1) Node classification results: the proposed framework is combined with two state-of-the-art heterogeneous graph neural network models (MAGNN and GTN) to evaluate HGNN-AC:

[Tables: node classification results]

2) Case study: different attribute completion methods are compared experimentally. In the ACM dataset, paper nodes have attributes, while author and subject nodes have no original attributes. From left to right, the completion methods in the following table are: (1) the attribute vectors of author and subject nodes are the average of the attribute vectors of their directly connected paper nodes; (2) the attribute vectors of author nodes are one-hot vectors, and the attribute vectors of subject nodes are the average of the attribute vectors of their directly connected paper nodes; (3) the attributes of both author and subject nodes are one-hot vectors; (4) the attributes of author nodes are completed by the proposed method, and the attributes of subject nodes are one-hot vectors; (5) the attributes of both author and subject nodes are completed by the proposed method.

[Table: case study results for the five attribute completion strategies]
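For reference, the two hand-crafted baselines in this comparison (neighbor averaging and one-hot vectors) can be sketched in a few lines of NumPy. The 0/1 link-matrix layout and the function names are assumptions made for illustration.

```python
import numpy as np

def mean_neighbor_attributes(adj, X_paper):
    """Average the attributes of the papers each node is directly linked to.

    adj:     (n_nodes, n_papers) 0/1 matrix linking attribute-less nodes to papers
    X_paper: (n_papers, f) original paper attributes
    Nodes without any linked paper receive an all-zero vector.
    """
    deg = adj.sum(axis=1, keepdims=True)
    return (adj @ X_paper) / np.maximum(deg, 1)

def one_hot_attributes(n_nodes):
    """Give every node a unique one-hot vector (an identity matrix)."""
    return np.eye(n_nodes)

adj = np.array([[1, 1, 0],
                [0, 0, 1]])
X_paper = np.array([[1.0, 0.0],
                    [3.0, 2.0],
                    [5.0, 5.0]])
X_author_mean = mean_neighbor_attributes(adj, X_paper)
X_author_onehot = one_hot_attributes(2)
```

Neither baseline has learnable parameters, which is the contrast the case study draws with the attention-based completion.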

4. Summary

The paper finds that, given the complex attribute missingness in heterogeneous networks, attribute completion is particularly important compared with the traditional focus on designing new models, and can be a new and more effective way to improve performance. It presents the first principled completion of missing attributes in heterogeneous networks and proposes a general framework to address the missing-attribute problem in heterogeneous graph neural network models.

Specifically, the framework first mines the relationships between nodes from meta-path-oriented high-order topological information and uses them as prior knowledge of the semantic relationships between nodes. It then provides an effective attention mechanism, guided by this prior information, for completing node attributes, and defines a weakly supervised loss by randomly dropping attributes, so that attribute completion becomes a sound, learnable process guided by prior knowledge. Finally, the attribute completion process and the target task are defined within the same graph neural network framework to build a task-oriented end-to-end framework in which the two enhance each other. The framework is orthogonal to most heterogeneous graph neural network models and brings stable performance improvements to them. The authors also hope this view offers a new and effective direction for existing GNN-based research on heterogeneous networks.

Copyright notice
This article was created by [Doctor of artificial intelligence]. Please include the original link when reposting. Thank you.
https://chowdera.com/2021/10/20211002145441569a.html