当前位置：网站首页>Graph node classification and message passing
Graph node classification and message passing
20201109 07:30:29 【osc_gnljjuu】
Message passing and node classification
This article mainly solves the problem ：
Given a network , Some of these nodes have label, How to assign other nodes to corresponding nodes label Well ？ （ There are many examples of this in life , For example, through interactive behavior to determine whether the user is a fleece party ） Intuitively, we will consider the connection between network nodes , The model to be learned in this article is collective classification：
There are three specific methods involved ：
 Relational classification;
 Iterative classification;
 Belief propagation;
Relational classification
Correlations Exists In Network
The behavior of nodes in the network is corrlated, There are three kinds of ：
 Homophily： Nodes with the same properties may have more network connections , If you have the same music hobby, it's easier to be in the same forum ;
 Influence： The characteristics of one node may be influenced by other nodes , If I recommend music albums that I am interested in to my friends , Some of my friends are likely to become as interested in these albums as I am ;
 Confounding： The environment can affect the characteristics of nodes and the connections between nodes ;
Classification with network data
Motivation
Similar nodes are usually connected to each other in the network , Or the connection is tight ; and ”Guiltbyassociation“： Association inference is to consider this part of information to infer the type of nodes in the network ; Therefore, the network node type judgment mainly through the following three sources of characteristics ：
 The characteristics of the node itself ;
 Nodes are connected to neighbors label;
 The characteristics of nodes' neighbors ;
Guiltbyassociation
"Guiltbyassociation" It's better to define , Here's the picture , Given Graph And a small amount of label The node of , How to label the remaining nodes as corresponding label, requirement ： The network has homogeneity ：
Mathematical modeling of the problem is carried out ： Given the adjacency matrix W, The size is n*n, as well as Y Corresponding to node label,1 by positive node ,1 position negative node , 0 position unlabeled node , forecast unlabeled In the node of postive Probability . Here's the Markov hypothesis ： Node label type , Depending only on the connected nodes label type ：
Collective classification It usually involves three steps ：
 local classifier： Only the node feature information is considered for classification ;
 Relational Classifier： Consider the characteristics and category information of node neighbors ;
 collective inference： take Relational Classifier Apply to every node , Until the whole network ”inconsistency“ To minimize the ;
Probabilistic Relational Classifier
Probabilistic Relational Classifier The basic idea is simple , The probability of each node class is the weighted average of its neighboring nodes , The specific process is as follows ：
 For tagged data , Label the class table of nodes as label;
 For data without labels , Initialize its class randomly ;
 Then the adjacent nodes are weighted in random order , Until the whole network converges or reaches the maximum number of iterations ：
among It's a node i And nodes j The connection weight of .
There are two problems with this model ：
 There is no guarantee that the model will converge ;
 The model does not use the feature information of nodes ;
Pictured above , For the initialized node , Press label Adjacent nodes are used to calculate the weight , Get the probability of node type , Until it converges ： after 5 After the iteration ：
Iterative Classifier
Iterative Classifier That is, the iterative process after adding node features , There are two main processes ：
 Bootstrap Phase： The nodes i Express by its characteristics , Use local classifier To calculate the... Of each node label probability ;
 Iteration Phase： Repeat for each node in the network , Update node characteristics and press local classifier Classify the node , until label Stable ;
In the video tutorial , One example is the use of Iterative Classifier By considering network structure information fake review detection Of , I suggest that you can read the paper in detail ： REV2_Fraudulent_User_Prediction_in_Rating_Platforms
Collective Classification
Belief Propagation It's a dynamic planning process , Through nodes passing message The means of , It is mainly used to solve the problem of conditional probability in graphs .
Message Passing
Take an example to explain message passing The process of ：
Given the task ： Count the number of nodes in the graph , Each node can only interact with its neighbors
Pictured above , Each node listens for information from its neighbors , And then update the information , And pass it forward , Here's the picture , Nodes marked with a red circle can only receive incoming messages：
 2 nodes before you;
 3 nodes behind you：
In a slightly more complex tree Type graph To show ：
Loopy Belief Propagation
To make it clear , Here's a basic definition ：
 LabelLabel potential matrix ： Representation node i It's a category Under the condition of , Its adjacent nodes j For the category Probability ;
 prior belief ： Representation node i For the category The prior probability of ;
 : node i Predict its adjacent nodes j For state ：
It's easy to understand ： The formula considers nodes i by The prior probability of ψ, According to the state transition matrix, the nodes are obtained j by , At the same time, all the information transmitted by neighbor nodes is considered , Random order iterations , Until the final state is stable , Get the node i For the category Y_{i}Yi Probability ：
Consider nodes , Doing it message passing when , In fact, there is no way to judge whether there is cycle, therefore LBP In case of cycles It's just a matter of some questions , Here's the picture , There's a little bit of confusion here , If you have any understanding, please discuss in the comments section ：
LBP The advantages and disadvantages are as follows ：
 advantage ：
 Implement a simple , It's easy to parallelize ;
 It's universal , Apply to all graph models ;
 Challenge ：
 There is no guarantee of convergence , Especially with a lot of closed connections ;
 Potential function ：
 It needs magic training ;
 Training learning is based on gradient optimization , There may be a bracelet problem during training ;
Summary
This article is based on cs224w Learning to organize , In this paper, we mainly introduce in node classification There are three classic ways to deal with it ：Relational classification、Iterative Classifier、Collective Classification, Introduces the classic solution , Such as Probabilistic Relational Classifier、Message Passing, There are several cases in the article , Because the time is not sorted out , Learn the tutorial in this chapter , And related paper, node classification It's classic graph The problem of , It's used in many anti fraud cases .
版权声明
本文为[osc_gnljjuu]所创，转载请带上原文链接，感谢
边栏推荐
 C++ 数字、string和char*的转换
 C++学习——centos7上部署C++开发环境
 C++学习——一步步学会写Makefile
 C++学习——临时对象的产生与优化
 C++学习——对象的引用的用法
 C++编程经验（6）：使用C++风格的类型转换
 Won the CKA + CKS certificate with the highest gold content in kubernetes in 31 days!
 C + + number, string and char * conversion
 C + + Learning  capacity() and resize() in C + +
 C + + Learning  about code performance optimization
猜你喜欢

C + + programming experience (6): using C + + style type conversion

Latest party and government work report ppt  Park ppt

在线身份证号码提取生日工具

Online ID number extraction birthday tool

️野指针？悬空指针？️ 一文带你搞懂！

Field pointer? Dangling pointer? This article will help you understand!

HCNA Routing＆Switching之GVRP

GVRP of hcna Routing & Switching

Seq2Seq实现闲聊机器人

【闲聊机器人】seq2seq模型的原理
随机推荐
 LeetCode 91. 解码方法
 Seq2seq implements chat robot
 [chat robot] principle of seq2seq model
 Leetcode 91. Decoding method
 HCNA Routing＆Switching之GVRP
 GVRP of hcna Routing & Switching
 HDU7016 Random Walk 2
 [Code+＃1]Yazid 的新生舞会
 CF1548C The Three Little Pigs
 HDU7033 Typing Contest
 HDU7016 Random Walk 2
 [code + 1] Yazid's freshman ball
 CF1548C The Three Little Pigs
 HDU7033 Typing Contest
 Qt Creator 自动补齐变慢的解决
 HALCON 20.11：如何处理标定助手品质问题
 HALCON 20.11：标定助手使用注意事项
 Solution of QT creator's automatic replenishment slowing down
 Halcon 20.11: how to deal with the quality problem of calibration assistant
 Halcon 20.11: precautions for use of calibration assistant
 “十大科学技术问题”揭晓！青年科学家50²论坛
 "Top ten scientific and technological issues" announced Young scientists 50 ² forum
 求反转链表
 Reverse linked list
 js的数据类型
 JS data type
 记一次文件读写遇到的bug
 Remember the bug encountered in reading and writing a file
 单例模式
 Singleton mode
 在这个 N 多编程语言争霸的世界，C++ 究竟还有没有未来？
 In this world of N programming languages, is there a future for C + +?
 es6模板字符
 js Promise
 js 数组方法 回顾
 ES6 template characters
 js Promise
 JS array method review
 【Golang】️走进 Go 语言️ 第一课 Hello World
 [golang] go into go language lesson 1 Hello World