
Graph Neural Networks (GNNs) – 2024 Comprehensive Guide


Graph Neural Networks (GNNs) are a type of neural network designed to directly operate on graphs, a data structure consisting of nodes (vertices) and edges connecting them. GNNs have revolutionized how we analyze and utilize data that are structured in the form of a graph. Whenever you hear about groundbreaking discoveries in fields like drug development, social media analysis, or fraud detection, there’s a good chance that GNNs are playing a part behind the scenes.

In this article, we’ll start with a gentle introduction to Graph Neural Networks and follow with a comprehensive technical deep dive.

 

About us: Viso Suite is the end-to-end computer vision platform. With Viso Suite, it becomes possible for enterprises to start using machine learning without a single line of code. Book a demo with us to learn more.

Satellite image analysis in a no-code computer vision pipeline with Viso Suite

 

Prediction Tasks Performed by GNNs

The primary goal of GNNs is to learn a representation (embedding) of the graph structure, in which the GNN captures both the properties of the nodes (what the node contains) and the topology of the graph (how these nodes are connected).

These representations can then be used for various tasks such as node classification (determining the label of a node), link prediction (predicting the existence of an edge between two nodes), and graph classification (classifying entire graphs).

For example, GNNs are used extensively in social network apps such as Facebook, where they can predict user behavior based not just on a user’s profile but also on their friends’ activities and social circles. Here is what a GNN can do:

  1. Node-level prediction: Predicting the category of a node (e.g., whether a person likes burgers, inferred from the kinds of friends the person has or the activities they take part in).
  2. Edge-level prediction: Predicting the likelihood of a connection between two nodes (e.g., suggesting new friends in a social network, or which video Netflix should play next).
  3. Graph-level prediction: Classifying an entire graph based on its structure and node properties (e.g., deciding whether a new molecule is a suitable drug, or predicting the molecule’s aroma).

 

Community detection with GNNs: identifying communities of users based on their interactions, interests, or affiliations.

 

Understanding the Basics of Graphs

To understand the Graph Neural Network (GNN), we first need to learn its most basic element: the graph. A graph data structure organizes not just the data points themselves but also the relationships between them.

Vertices and Edges

A graph is a collection of points (nodes) connected by lines (edges). Vertices represent entities, objects, or concepts, while edges represent relationships or connections between them.

 

Vertices and edges of GNNs

 

In the context of a social network, the nodes can be people, and the edges the kind of relationship between them (who follows whom, etc.).

Directed vs Undirected Graphs

In a directed graph, edges have a direction, indicating the flow of the relationship. In the context of a social network, the direction can show who follows whom (you might follow person A, but person A doesn’t follow you back).

 

Directed graph in GNN

 

In an undirected graph, edges do not have a direction; they simply indicate a connection between two vertices. E.g., on Facebook, if you accept a friend request, you can see the other person’s posts and they can see yours. The relationship here is mutual.

 

Undirected graph in GNN

 

Weighted Graph

In a weighted graph, each edge has a weight associated with it, quantifying the relationship it represents (for example, the distance between two locations or the strength of a connection).

 

Weighted graph in GNN

 

What is Graph Representation?

Graph representation is a way to encode graph structure and features for processing by neural networks. A graph embeds both the data at each node and the relationships between the data points, so the encoding must capture these connections as well.

Here are some of the most used graph representations for Deep Learning.

Adjacency Matrix

This is a matrix listing, for every vertex, all the vertices connected to it. E.g., here node A has nodes B and C connected to it, so the corresponding entries in the matrix are 1. Where a connection doesn’t exist, the entry is 0.

 

Adjacency matrix in GNN

 

Incidence Matrix

A matrix of size N×M, where N is the number of nodes and M is the number of edges of the graph. In simple terms, it records which edges touch which nodes: the entry is 1 if the node is an endpoint of a particular edge, and 0 if it is not. E.g., node A has edge E1 connected to it, so the corresponding entry in the incidence matrix is 1, whereas node A doesn’t have edge E4 connected to it, so that entry is 0.

 

Incidence matrix in GNN

 

Degree Matrix

A diagonal matrix whose i-th diagonal entry is the degree of node i, i.e., the number of edges attached to it.

 

Degree matrix in GNN
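To make these three representations concrete, here is a minimal sketch in Python using NumPy. The four-node graph and its edges are an assumed toy example, not the exact graphs shown in the figures above:

```python
import numpy as np

# Assumed toy undirected graph: nodes A, B, C, D and edges
# E1 = (A, B), E2 = (A, C), E3 = (B, D).
nodes = ["A", "B", "C", "D"]
edges = [(0, 1), (0, 2), (1, 3)]
N, M = len(nodes), len(edges)

# Adjacency matrix: entry (i, j) is 1 if nodes i and j are connected.
A = np.zeros((N, N), dtype=int)
for i, j in edges:
    A[i, j] = A[j, i] = 1  # symmetric because the graph is undirected

# Incidence matrix: entry (i, k) is 1 if node i touches edge k.
I = np.zeros((N, M), dtype=int)
for k, (i, j) in enumerate(edges):
    I[i, k] = I[j, k] = 1

# Degree matrix: diagonal entry (i, i) is the number of edges at node i.
D = np.diag(A.sum(axis=1))  # here: diag(2, 2, 1, 1)
```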

 

Unique Characteristics of Graph Data

Graph data structures model real-world data efficiently thanks to a set of special characteristics. These characteristics are also what set graphs apart from the grid-like matrices used in Convolutional Neural Networks (CNNs).

GNNs were developed precisely to work on graph data structures. Before diving into GNNs, let’s look at those unique characteristics of graphs:

  • Connections: Unlike tabular data where entities exist in isolation, graph data revolves around connections between entities (nodes). These connections joining the nodes, represented by edges, hold crucial information that makes the data valuable and is central to understanding the system.
  • No Structure: Traditional data resides in ordered grids or tables. Graphs, however, have no inherent order, and their structure is dynamic: nodes and edges can be added, removed, or rearranged over time. This is what makes graphs useful for modeling dynamic real-world data.
  • Heterogeneity: In a graph, points (nodes) and connections (edges) can stand for different things, like people, proteins, or transactions, each with its own characteristics.
  • Scalability: Graphs can become very large, including complex networks with millions of connections and points, which makes them difficult to store and analyze.

 

Graph Neural Networks vs Neural Networks

To better understand Graph Neural Networks (GNNs), it’s essential to first know how they differ from traditional Neural Networks (NNs). Starting with NNs as a foundation, we can then explore how GNNs build on this to handle graph-structured data, making the concept clearer and more approachable.

Here are some of the differences between NNs and GNNs.

Data Structure
  • Neural Networks (NNs): Traditional neural networks, including their variants like Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), are primarily designed for grid-like data structures. This includes images (2D grids of pixels) for CNNs and sequential data (time series or text) for RNNs.
  • Graph Neural Networks (GNNs): GNNs are specifically designed to handle graph data. Graphs not only store data points but also the complex relationships and interconnections between data points.
Data Representation
  • NNs: Data input into traditional neural networks needs to be structured, such as vectors for fully connected layers, multi-dimensional arrays for CNNs (e.g., images), or sequences for RNNs.
  • GNNs: Input data for GNNs is in the form of graphs, where each node can have its features (attribute vectors), and edges can also have features representing the relationship between nodes.
Operation
  • NNs: Operations in neural networks involve matrix multiplications, convolution operations, and element-wise operations, applied in a structured manner across layers.
  • GNNs: GNNs operate by aggregating features from a node’s neighbors through a process called message passing. This involves aggregating the current node’s features with the neighbor node’s features (read more on this operation below).

 

City transportation systems, including roads, railways, and flight routes, can be modeled using graphs – source.

 

Learning Task
  • NNs: Traditional neural networks are well-suited for tasks like image classification, object detection (CNNs), and sequence prediction or language modeling (RNNs).
  • GNNs: GNNs excel at tasks that require understanding the relationships and interdependencies between data points, such as node classification, link prediction, and graph classification.
Interpretability

Interpretability is the ability to understand a model’s inner workings and how it makes its decisions or predictions.

  • Neural Networks (NNs): While NNs can learn complex patterns in data, interpreting these patterns and how they relate to the structure of the data can be challenging.
  • Graph Neural Networks (GNNs): GNNs, by operating directly on graphs, offer insights into how relationships and structure in the data contribute to the learning task. This is extremely valuable in domains such as drug discovery or social network analysis.

 

Types of GNNs

Graph Neural Networks (GNNs) have evolved into various architectures to address different challenges and applications. Here are a few of them.

Graph Convolutional Network (GCN)

GCNs are the most popular and also a foundational type of GNN. The key idea behind GCNs is to update the representation of a node by aggregating and transforming the features of its neighboring nodes and itself (inspired by CNNs). This aggregation mechanism enables GCNs to capture the local graph structure around each node. GCNs are widely used for node classification, graph classification, and other tasks where understanding the local structure is crucial.
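To illustrate the idea, here is a minimal sketch of a single GCN layer in PyTorch, using the symmetrically normalized adjacency matrix popularized by Kipf and Welling. The class name and dimensions are our own choices, and a production implementation would use sparse operations:

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One GCN layer: normalize the adjacency, aggregate, then transform."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, X, A):
        # Add self-loops so each node also keeps its own features.
        A_hat = A + torch.eye(A.size(0))
        # Symmetric normalization: D^(-1/2) * A_hat * D^(-1/2).
        d_inv_sqrt = torch.diag(A_hat.sum(dim=1).pow(-0.5))
        A_norm = d_inv_sqrt @ A_hat @ d_inv_sqrt
        # Aggregate neighbor features, then apply the learned transform.
        return torch.relu(self.linear(A_norm @ X))
```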

Deep Graph Convolutional Neural Network II (DGCNNII)

This architecture uses a deep graph convolutional design for graph classification and improves upon GCN. It is based on a non-local message-passing framework and a spatial graph convolution layer, and it has been shown to outperform many other graph neural network methods on graph classification tasks.

 

Deep Graph Convolutional Network (DGCNNII) – source.

 

Graph Attention Networks (GATs)

Introduce attention mechanisms to the aggregation step in GNNs, learning a weight for each neighbor’s contribution to the aggregation.

 

Graph Attention Network (GAT) – source.

 

In GATs, the importance of each neighbor’s features is dynamically weighted (the different colors of arrows in the image) during the aggregation process, allowing the model to focus more on relevant neighbors for each node. This approach is beneficial in graphs where not all connections are equally important, or where some nodes would otherwise overpower the rest (e.g., an influencer with millions of followers), and it can lead to more expressive node representations. GATs are particularly useful for tasks that benefit from distinguishing the significance of different edges, such as recommendation systems or social network analysis.
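A minimal single-head version of this attention mechanism might look as follows in PyTorch. This is a dense sketch for readability; real GAT implementations operate on sparse edge lists and use multiple attention heads:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    """Single-head graph attention layer (illustrative sketch)."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.W = nn.Linear(in_features, out_features, bias=False)
        self.a = nn.Linear(2 * out_features, 1, bias=False)  # attention scorer

    def forward(self, X, A):
        H = self.W(X)  # transformed features, shape (N, F_out)
        N = H.size(0)
        # Score every pair (i, j) from the concatenation [h_i || h_j].
        h_i = H.unsqueeze(1).expand(N, N, -1)
        h_j = H.unsqueeze(0).expand(N, N, -1)
        scores = F.leaky_relu(self.a(torch.cat([h_i, h_j], dim=-1))).squeeze(-1)
        # Mask non-edges so attention only flows along existing connections.
        scores = scores.masked_fill(A + torch.eye(N) == 0, float("-inf"))
        alpha = torch.softmax(scores, dim=1)  # learned neighbor weights
        return alpha @ H                      # attention-weighted aggregation
```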

Graph Recurrent Networks (GRNs)

Combine the principles of recurrent neural networks (RNNs) with graph neural networks. In GRNs, node features are updated through recurrent mechanisms, allowing the model to capture dynamic changes in graph-structured data over time. Examples of applications include traffic prediction in road networks and analyzing time-evolving social networks.
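As an illustrative sketch (in the spirit of gated graph networks rather than any single GRN paper), the recurrent update can be written with a GRU cell whose input is the aggregated neighbor message:

```python
import torch
import torch.nn as nn

class GRNStep(nn.Module):
    """One recurrent update: aggregated neighbor messages drive a GRU cell."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.gru = nn.GRUCell(hidden_dim, hidden_dim)

    def forward(self, H, A):
        messages = A @ H              # sum of each node's neighbor states
        return self.gru(messages, H)  # gated update of every node's state
```

Applying this step at successive time stamps lets the node states track a graph that changes over time.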

 

GRNs are well-suited for dynamic graphs and temporal graph data (graphs that evolve over time) – source.

 

Graph Autoencoders (GAEs)

Designed for unsupervised learning tasks on graphs. A GAE learns to encode the graph (or subgraphs/nodes) into a lower-dimensional space (embedding) and then reconstructs the graph structure from these embeddings. The objective is to learn representations that capture the essential structural and feature information of the graph. GAEs are particularly useful for tasks such as link prediction, clustering, and anomaly detection in graphs, where explicit labels are not available.
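The core of this idea is compact enough to sketch directly: encode nodes with any GNN (for instance the GCNLayer sketched earlier) and decode edges with an inner product, as in Kipf and Welling's graph autoencoder:

```python
import torch

def reconstruct_edges(Z):
    # Z: (N, d) node embeddings from the encoder. Entry (i, j) of the result
    # is the predicted probability that an edge exists between nodes i and j.
    return torch.sigmoid(Z @ Z.T)

# Training minimizes a reconstruction loss, e.g. binary cross-entropy
# between reconstruct_edges(Z) and the true adjacency matrix A.
```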

 

Graph Autoencoder (GAE) structure – source.

 

Graph Generative Networks

Aim to generate new graph structures or augment existing ones. These models learn the distribution of the observed graph data and can generate new graphs that resemble the training data. This capability is valuable in drug discovery, where generating novel molecular structures is of interest, and in social network analysis, where simulating realistic network structures can help in understanding network dynamics.

 

How Do GNNs Work?

GNNs work by leveraging the structure of graph data, which consists of nodes (vertices) and edges (connections between nodes), to learn representations for nodes, edges, or entire graphs. Each node’s features are combined with the features of its neighboring nodes into a new feature vector, which becomes the node’s updated state.

When this process is repeated across all nodes in the graph for several rounds, each node gradually learns about increasingly distant parts of the graph.

GNN Learning Steps

 

GNNs: 2D convolution vs. graph convolution – source.

 

  1. Message Passing: In each layer, information is passed between nodes in the graph. A node’s current state is updated based on information from its neighbors: the neighbors’ messages are aggregated and then combined with the node’s current state. This is also called graph convolution (inspired by CNNs): just as a CNN integrates information from neighboring pixels into a central pixel, a GNN updates the central node with aggregated information from its neighbors.
  2. Aggregate: There are different ways to aggregate the information from a node’s neighbors. Common methods include averaging, taking the maximum, or using a more complex function such as a recurrent neural network.
  3. Update: After the information from the neighbors has been aggregated, it is combined with the node’s current state to form a new state that contains information about the neighborhood. Aggregate and update are the most important steps in GNNs. Formally, the state of node v at layer k is h_v^(k) = UPDATE(h_v^(k-1), AGGREGATE({h_u^(k-1) : u ∈ N(v)})), where N(v) denotes the neighbors of v (a minimal code sketch follows this list).
  4. Output: The final node, edge, or graph representations are used for tasks like classification, regression, or link prediction.
  5. Multiple Layers: The message-passing process can be repeated for multiple layers, allowing nodes to learn information about their neighbors’ neighbors, and so on. The number of layers (formally called hops) is a hyperparameter.
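Here is the minimal sketch of steps 1–3 referenced above, using mean aggregation; the weight matrices W_self and W_neigh and all shapes are illustrative assumptions:

```python
import torch

def message_passing_step(H, A, W_self, W_neigh):
    # H: (N, F) current node states; A: (N, N) adjacency matrix.
    deg = A.sum(dim=1, keepdim=True).clamp(min=1)  # avoid division by zero
    neighbor_mean = (A @ H) / deg                  # AGGREGATE: mean of neighbors
    # UPDATE: combine the node's own state with the aggregated message.
    return torch.relu(H @ W_self + neighbor_mean @ W_neigh)
```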
What is a Hop?

In GNNs, a hop refers to the number of steps a message can travel from one node to another. For example, in the figure below, the GNN has 2 layers, and each node incorporates information from its direct neighbors and its neighbors’ neighbors.

 

Neighborhood aggregation methods for encoding – source.
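To make the hop count concrete in code, here is an illustrative stack of two of the GCN layers sketched earlier (GCNLayer is the class defined above; all dimensions are arbitrary choices for the example):

```python
import torch

layer1 = GCNLayer(in_features=8, out_features=16)   # hop 1
layer2 = GCNLayer(in_features=16, out_features=4)   # hop 2

X = torch.randn(5, 8)                    # 5 nodes, 8 features each
A = (torch.rand(5, 5) > 0.5).float()
A = ((A + A.T) > 0).float()              # symmetrize the toy adjacency

H = layer2(layer1(X, A), A)  # (5, 4): each row now reflects 2-hop context
```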

 

However, one needs to be careful when choosing the number of layers (hops), as too many layers can cause all nodes to converge to roughly the same values (oversmoothing).

 

What is Oversmoothing?

Oversmoothing is a challenge that arises when too many layers of message passing and aggregation are used in a GNN.

As the depth of the network increases, the representations of nodes across different parts of the graph may become indistinguishable. In extreme cases, all node features converge to a similar state, losing their distinctive information.

This happens because, after many iterations of smoothing, the node features have aggregated so much information from their neighborhoods that the unique characteristics of individual nodes are washed out.

Smoothing in GNNs

Smoothing in GNNs refers to the process through which node features become more similar to each other. This results from the repeated application of message passing and aggregation steps across the layers of the network.

As this process is applied through multiple layers, information from a node’s wider and wider neighborhood is integrated into its representation.

Addressing Oversmoothing
  • Skip Connections (Residual Connections): Similar to their use in CNNs, skip connections in GNNs can help preserve initial node features by bypassing some layers, effectively allowing the model to learn from both local and global graph structures without oversmoothing (see the sketch after this list).
  • Attention Mechanisms: By weighting the importance of neighbors’ features dynamically, attention mechanisms can prevent the uniform blending of features that leads to oversmoothing.
  • Depth Control: Carefully design the network depth according to the graph’s structural properties.
  • Normalization Techniques: Techniques like Batch Normalization or Layer Normalization can help maintain the distribution of features across GNN layers.
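As a sketch of the first remedy above, a residual (skip) connection simply adds a layer’s output back to its input, so the original node features are never fully overwritten. Here, gnn_layer stands for any layer whose output dimension matches its input, such as the GCNLayer sketched earlier:

```python
def residual_gnn_step(H, A, gnn_layer):
    # Adding the input back preserves each node's original features,
    # so deep stacks are less prone to oversmoothing.
    return H + gnn_layer(H, A)
```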

 

Applications of GNNs

GNNs have lived up to their theoretical potential and are now actively impacting various real-world domains. So, how powerful are graph neural networks? Here’s a glimpse of their applications:

Social Networks
  • Community Detection: Identifying communities of users based on their interactions, interests, or affiliations. This can be used for targeted advertising, content recommendation, or anomaly detection.
  • Link Prediction: Predicting new connections between users based on existing relationships and network structure. This can be valuable for friend recommendations or identifying potential influencers.
  • Sentiment Analysis: Analyzing the sentiment of user posts or conversations by considering the context and relationships within the network. This can help understand public opinion or brand perception.

 

Social networks employ GNNs for who-to-follow suggestions and friend recommendations – source.

 

Recommendation Systems
  • Collaborative Filtering with GNNs: Going beyond traditional user-item interactions, GNNs leverage user relationships to personalize recommendations by incorporating social influence and content similarity among users.
  • Modeling User-Item Interactions: Capturing the dynamic nature of user preferences by considering the evolution of user relationships and item popularity within a network.
  • Explainable Recommendations: By analyzing how information propagates through the network, GNNs can offer insights into why certain items are recommended, increasing transparency and trust in the system.

 

GNN recommendation system – source.

 

Bioinformatics
  • Protein-Protein Interaction Networks: Predicting protein-protein interactions based on their structural and functional similarities, aiding in drug discovery and understanding disease mechanisms.
  • Gene Regulatory Networks: Analyzing gene expression data and network structure to identify genes involved in specific biological processes or disease development.
  • Drug Discovery: Leveraging GNNs to predict the efficacy of drug candidates by considering their interactions with target proteins and pathways within a network.

 

Protein-protein interaction networks with GNNs – source.

 

Fraud Detection

Graphs can represent financial networks, where nodes might be entities (such as individuals, companies, or banks), and edges could represent financial transactions or lending relationships. This is useful for fraud detection, risk management, and market analysis.

 

Identification of fraudulent transactions or activities by analyzing financial transaction networks and user behavior patterns.

 

Traffic Flow Prediction

Predicting traffic congestion and optimizing traffic light timings for smart cities by modeling the road network and vehicle flow dynamics.

 

Traffic flow around the city of Los Angeles, mapped with GNNs – source.

 

Cybersecurity

Detecting malicious activities and identifying vulnerabilities in computer networks by analyzing network traffic and attack patterns.

 

Challenges of GNNs

Despite the impressive progress and potential of GNNs, several research and implementation challenges and limitations remain. Here are some key areas.

Scalability

Many GNN models struggle to efficiently process large graphs due to the computational and memory requirements of aggregating features from a node’s neighbors. This becomes particularly challenging for graphs with millions of nodes and edges, common in real-world applications like social networks or large knowledge graphs.

Oversmoothing

As previously mentioned, as the depth of a GNN increases, the features of nodes in different parts of the graph can become indistinguishable. This over-smoothing problem makes it difficult for the model to preserve local graph structures, leading to a loss of performance in node classification and other tasks.

Dynamic graphs

Many real-world graphs are dynamic, with nodes and edges being added or removed over time. Most GNN architectures are designed for static graphs and struggle to model these temporal dynamics effectively.

Heterogeneous graphs

Graphs often contain different types of nodes and edges (heterogeneous graphs), each with its own features and patterns of interaction. Designing GNN architectures that can effectively learn from heterogeneous graph data is a challenging task, as it requires capturing the complex relationships between different types of entities.

Generalization Across Graphs

Many GNN models are trained and tested on the same graph or graphs with similar structures. However, models often struggle to generalize to entirely new graphs with different structures, limiting their applicability.

 

The structure of a crystal, mapped with GNNs – source.

 

Future Work in Graph Neural Networks (GNNs)

Interpretability and Explainability

Making GNN models more understandable is a key area of research because it is crucial for sectors like healthcare and finance. Only by understanding how the models make predictions can we create trust and facilitate their adoption.

Dynamic and Temporal Graph Modeling

Enhancing the ability of GNNs to model and predict changes in dynamic graphs over time is a key area of research. This includes not only capturing the evolution of graph structures but also predicting future states of the graph.

Heterogeneous Graph Learning

Advancing techniques for learning from heterogeneous graphs, which contain multiple types of nodes and relationships, is a key area of ongoing research. This involves creating models that can effectively leverage the rich semantic information in these complex networks.
