

A Survey of Adversarial Learning on Graph

LIANG CHEN, Sun Yat-sen University of China
JINTANG LI, Sun Yat-sen University of China
JIAYING PENG, Sun Yat-sen University of China
TAO XIE, Sun Yat-sen University of China
ZENGXU CAO, Hangzhou Dianzi University of China
KUN XU, Sun Yat-sen University of China
XIANGNAN HE, University of Science and Technology of China
ZIBIN ZHENG*, Sun Yat-sen University of China (*Contact Author)

Deep learning models on graphs have achieved remarkable performance in various graph analysis tasks, e.g., node classification, link prediction and graph clustering. However, they expose uncertainty and unreliability against well-designed inputs, i.e., adversarial examples. Accordingly, a line of studies on both attack and defense has emerged across different graph analysis tasks, leading to an arms race in graph adversarial learning.

Despite the booming works, a unified problem definition and a comprehensive review are still lacking. To bridge this gap, we investigate and summarize the existing works on graph adversarial learning tasks systematically. Specifically, we survey and unify the existing works w.r.t. attack and defense in graph analysis tasks, and give appropriate definitions and taxonomies at the same time. Besides, we emphasize the importance of related evaluation metrics, and investigate and summarize them comprehensively. Hopefully, our work can provide a comprehensive overview and offer insights for the relevant researchers. More details of our work are available at https://github.com/gitgiter/Graph-Adversarial-Learning.

CCS Concepts: • Computing methodologies → Semi-supervised learning settings; Neural networks; • Security and privacy → Software and application security; • Networks → Network reliability;

Additional Key Words and Phrases: adversarial learning, graph neural networks, adversarial attack and defense, adversarial examples, network robustness

ACM Reference format:
Liang Chen, Jintang Li, Jiaying Peng, Tao Xie, Zengxu Cao, Kun Xu, Xiangnan He, and Zibin Zheng. 2018. A Survey of Adversarial Learning on Graph. J. ACM 37, 4, Article 111 (August 2018), 28 pages.
DOI: 10.1145/1122445.1122456

1 INTRODUCTION

Over the past decade, deep learning has enjoyed the status of crown jewels in artificial intelligence, showing impressive performance in various applications, including speech and language processing [19, 74], face recognition [47] and object detection [35]. However, the frequently used deep learning models have recently been proved unstable and unreliable due to their vulnerability



arXiv:2003.05730v2 [cs.LG] 19 May 2020


Fig. 1. A misclassification of the target caused by small perturbations of the graph structure and node features. (In the figure, I marks influencer nodes, T the target node, and [:] the node features; after the perturbation, the target's prediction shifts from "Class 2" to "Class 3".)

against perturbations. For example, slight changes on several pixels of a picture may appear imperceptible to human eyes but strongly affect the outputs of deep learning models [46]. As stated by Szegedy et al. [61], deep learning models that are well-defined and learned by backpropagation have intrinsic blind spots and non-intuitive characteristics, although they should have generalized to the data distribution in an obvious way.

On the other hand, deep learning on graphs has received significant research interest recently. As a powerful representation, the graph plays an important role and has been widely applied in the real world [27]. Naturally, deep learning research on graphs is also a hot topic and brings many refreshing implementations in different fields, such as social networks [48], e-commerce networks [66] and recommendation systems [15, 73]. Unfortunately, the graph analysis domain, a crucial field of machine learning, has also exposed the vulnerability of deep learning models against well-designed attacks [85, 87]. For example, consider the task of node classification: attackers usually have control over several fake nodes and aim to fool the target classifiers into misclassification by adding or removing edges between fake nodes and other benign ones. As shown in Figure 1, performing small perturbations (two added links and several changed node features) on a clean graph can lead to the misclassification of deep graph learning models.

With rising concern over the security of graph models, there has been a surge of research on graph adversarial learning, i.e., a field studying the security and vulnerability of graph models. On one hand, from the perspective of attacking a graph learning model, Zugner et al. [85] first study adversarial attacks on graph data, showing that with little perturbation of node features and graph structure, the target classifiers are easily fooled and misclassify specified nodes. On the other hand, Wang et al. [67] propose a modified Graph Convolutional Networks (GCNs) model with an adversarial defense framework to improve its robustness. Moreover, Sun et al. [57] study the existing works on adversarial attack and defense strategies on graph data and discuss their corresponding contributions and limitations. However, they mainly focus on the aspect of adversarial attack, leaving works on defense unexplored.

Challenges. Despite the surge of works on graph adversarial learning, several problems remain to be solved. i) Unified and specified formulation. Current studies consider the problem definition and assumptions in graph adversarial learning with their own mathematical formulations and mostly lack detailed explanations, which effectively hinders the progress of follow-up studies. ii) Related evaluation metrics. For various tasks, the evaluation metrics on corresponding


performance are rather different, and even have diverse standardizations. Besides, special metrics for the graph adversarial learning scenario are necessary and timely to explore, e.g., evaluation of attack impacts.

For the problem of inconsistent formulations and definitions, we survey the existing attack and defense works, give unified definitions and categorize them from diverse perspectives. Although there have been some efforts [20, 39, 85] to generalize the definitions, most formulations are still customized for their own models. So far, only one article [57] outlines these concepts from a review perspective, which is not sufficient to summarize the existing works comprehensively. Based on the previous literature, we summarize the different types of graphs and introduce the three main tasks according to their level, and subsequently give the unified formulations of attack and defense in Section 3.1 and 4.1, respectively.

Different models adopt various metrics due to their different emphases. To provide guidance for researchers and better evaluate their adversarial models, we give a more detailed summary and discussion of metrics in Section 5. In particular, we first introduce some common metrics for both attack and defense, and then present some special metrics proposed in their respective works from three categories: effectiveness, efficiency, and imperceptibility. For instance, the Attack Success Rate (ASR) [10] and the Average Defense Rate (ADR) [11] are proposed to measure the effectiveness of attack and defense, respectively.

In summary, our contributions can be listed as below:

• We thoroughly investigate related works in this area, and subsequently give unified problem formulations and clear definitions for the currently inconsistent concepts of both attack and defense.
• We give a clear overview of existing works and systematically classify them from different perspectives based on reasonable criteria.
• We emphasize the importance of evaluation metrics and investigate them to make a comprehensive summary.
• For such an emerging research area, we point out the limitations of current research and provide some open questions to solve.

The survey is organized as follows. In Section 2, we give some basic notations of typical graphs. In Section 3 and Section 4, we separately introduce the definitions and taxonomies of adversarial attack and defense on graph data, and further give a clear overview. We then summarize the related metrics in Section 5 and discuss some open research questions in Section 6. Finally, we draw our conclusion in Section 7.

2 PRELIMINARY

Focusing on graph-structured data, we first give the notations of typical graphs for simplicity and further introduce the mainstream tasks in the graph analysis field. The most frequently used symbols are summarized in Table 1.

2.1 Notations

Generally, a graph is represented as G = (V, E), where V = {v1, v2, ..., vN} denotes the set of N nodes and E = {e1, e2, ..., eM} is the set of M existing edges in the graph; naturally E ⊆ V × V. The connections of the graph can be represented as an adjacency matrix A ∈ R^{N×N}, where A_{i,j} ≠ 0 if there is an edge from node v_i to node v_j and A_{i,j} = 0 otherwise.
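To make the notation concrete, the following minimal sketch (our own illustration, not from the surveyed works) builds such an adjacency matrix from an edge list with NumPy; nodes are indexed from 0 for convenience:

```python
import numpy as np

def adjacency_matrix(num_nodes, edges, directed=False):
    """Build A in R^{N x N} from an edge list, with A[i, j] != 0
    iff there is an edge from node v_i to node v_j."""
    A = np.zeros((num_nodes, num_nodes))
    for i, j in edges:
        A[i, j] = 1.0
        if not directed:
            A[j, i] = 1.0  # undirected graphs have a symmetric A
    return A

# A toy graph with N = 4 nodes and M = 3 undirected edges.
A = adjacency_matrix(4, [(0, 1), (1, 2), (2, 3)])
```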


Table 1. Summary of notations.

| Symbol | Description |
|---|---|
| N | number of nodes |
| M | number of edges |
| n | number of graphs |
| F(·) | number of features w.r.t. nodes F(node) or edges F(edge) |
| G̃ | a graph instance, denoting a clean graph G or a perturbed graph Ĝ |
| C | set of pre-defined class labels for nodes, edges or graphs |
| V | set of nodes, V = {v1, v2, ..., vN} |
| E | set of edges, E = {e1, e2, ..., eM} |
| 𝒢 | set of graphs, 𝒢 = {G1, G2, ..., Gn} |
| D | the whole dataset, D = (𝒢, X, C) |
| S | set of instances, could be 𝒢, V, or E |
| S_L | set of labeled instances, could be 𝒢_L, V_L, or E_L; S_L ⊂ S |
| T | set of unlabeled target instances, where T ⊂ S − S_L or T = S − S_L |
| K | attackers' knowledge of the dataset |
| O | operation set w.r.t. attackers' manipulations |
| A | adjacency matrix, A ∈ R^{N×N} |
| A′ | modified adjacency matrix |
| X | feature matrix, X ∈ {0, 1}^{N×F(·)} or X ∈ R^{N×F(·)} |
| X′ | modified feature matrix |
| f | deep learning model w.r.t. inductive learning f^(ind) or transductive learning f^(tra) |
| f′ | surrogate model |
| f̂ | well-designed model for defense |
| θ | set of parameters w.r.t. a specific model |
| Z | output of the model |
| t | target instance; can be a node, an edge or a graph |
| Δ | attack budget |
| y | ground-truth label w.r.t. an instance, y ∈ C |
| Ψ | perturbation space of attacks |
| L | loss function of the deep learning model |
| Q | similarity function of graphs |

2.2 Taxonomies of Graphs

Different scenarios correspond to various types of graphs, hence we introduce them further in the following parts based on the basic graph definition in Section 2.1.

Directed and Undirected Graph. A directed graph, also called a digraph or a directed network, is a graph where all the edges are directed from one node to another, but not backwards [7]. On the contrary, a graph where the edges are bidirectional is called an undirected graph. For undirected graphs, the orientation convention for the adjacency matrix doesn't matter, as all edges are bidirectional. Generally, A_{i,j} ≠ A_{j,i} may hold for a directed graph, while A_{i,j} = A_{j,i} for an undirected graph.

Weighted and Unweighted Graph. Typically, a weighted graph refers to an edge-weighted graph where each edge is associated with a real value [7]. An unweighted graph may be used if a relationship in terms of magnitude doesn't exist, i.e., all connections between nodes are treated the same.

Attributed Graph. An attributed graph refers to a graph where both nodes and edges may have their own attributes/features [23]. Specifically, the attributes of nodes and edges can be denoted as X_node ∈ R^{N×F_node} and X_edge ∈ R^{M×F_edge}, respectively. Since in most cases a graph has attributes associated with nodes only, we use X to denote the node attributes/features for brevity.

Homogeneous and Heterogeneous Information Graph. Besides attributes, the type of nodes or edges is another important property. A graph G is called a heterogeneous information graph if there are two or more types of objects/nodes or relations/edges in it [30, 37]; otherwise, it is called a homogeneous information graph.


Dynamic and Static Graph. Intuitively, nodes, edges and attributes may change over time in a dynamic graph [41], which can be represented as G(t) at time t. A static graph, in contrast, is simply defined as G, in which nodes, edges and attributes remain the same as time changes.

Each type of graph has different strengths and weaknesses, and it is better to pick the appropriate kind of graph to model the problem. Existing works mainly focus on a simple graph, i.e., one that is undirected and unweighted; besides, they assume that the graph is static and homogeneous for simplicity. By default, the mentioned "graph" refers to "simple graph" in this paper.

2.3 Graph Analysis Tasks

In this section, we introduce the major tasks of graph analysis to which deep learning models are commonly applied. From the perspective of nodes, edges and graphs, we divide these tasks into three categories: node-, link- and graph-level tasks.

Node-level Task. Node classification is one of the most common node-level tasks, for instance, identifying a person in a social network. Given a graph G with partially labeled nodes V_L ⊆ V and other unlabeled ones, the goal of classifiers is to learn a mapping function ϕ: V → C, where C is a set of pre-defined class labels. The learned mapping function is applied to effectively identify the class labels of the unlabeled nodes [34]. To this end, inductive and transductive learning settings are specified based on the characteristics of the training and testing procedures.

• Inductive Setting. In the inductive learning setting, a classifier is trained on a set of nodes and tested on others never seen during training.
• Transductive Setting. Different from the inductive setting, test samples (i.e., the unlabeled nodes) can be seen (but not their labels!) during the training procedure in the transductive learning setting.

To conclude, the objective function optimized by the classifier can be formulated as follows:

$$\mathcal{L} = \frac{1}{|\mathcal{V}_L|} \sum_{v_i \in \mathcal{V}_L} L\big(f_\theta(G, X),\, y_i\big) \tag{1}$$

where y_i is the class label of node v_i, L can be the loss of either inductive or transductive learning, and the classifier f with parameters θ is defined analogously for the two settings.
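As an illustrative sketch of the transductive setting (assuming PyTorch and a toy two-layer GCN-style classifier of our own naming, not a specific model from the surveyed works), Eq. (1) becomes a loss averaged over the labeled nodes only, while the whole graph is visible to the model:

```python
import torch
import torch.nn.functional as F

def normalize_adj(A):
    # Symmetrically normalize A + I, the propagation matrix of GCN-style models.
    A_hat = A + torch.eye(A.size(0))
    d_inv_sqrt = A_hat.sum(dim=1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * A_hat * d_inv_sqrt.unsqueeze(0)

class SimpleGCN(torch.nn.Module):
    """A minimal two-layer graph convolutional classifier f_theta(G, X)."""
    def __init__(self, in_dim, hid_dim, num_classes):
        super().__init__()
        self.w1 = torch.nn.Linear(in_dim, hid_dim)
        self.w2 = torch.nn.Linear(hid_dim, num_classes)

    def forward(self, A_norm, X):
        h = F.relu(A_norm @ self.w1(X))
        return self.w2(A_norm @ h)  # logits Z for every node

def train_step(model, optimizer, A_norm, X, y, labeled_mask):
    # Eq. (1): average the loss over the labeled nodes V_L only.
    optimizer.zero_grad()
    loss = F.cross_entropy(model(A_norm, X)[labeled_mask], y[labeled_mask])
    loss.backward()
    optimizer.step()
    return loss.item()
```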

Link-level Task. Link-level tasks relate to edge classification and link prediction. Among them, link prediction is the more challenging task and is widely used in the real world; it aims to predict the connection strength of an edge, e.g., predicting the potential relationship between two specified persons in a social network, or even new or dissolving relationships in the future. Link prediction tasks take the same input as node classification tasks, while the output is binary, indicating whether an edge will exist or not. Therefore, the objective function of link prediction tasks can be defined similarly to Eq. (1) by changing y_i ∈ {0, 1} and replacing V_L with a set of labeled edges E_L.
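A common way to instantiate this (a sketch under our own assumptions, with Z standing for node embeddings produced by any encoder) is to score a candidate edge (u, v) by the inner product of its endpoint embeddings and train with a binary loss:

```python
import torch
import torch.nn.functional as F

def link_logits(Z, edge_pairs):
    """Score each candidate edge (u, v) by the inner product of the
    node embeddings Z; y in {0, 1} labels existing vs. absent edges."""
    u, v = edge_pairs[:, 0], edge_pairs[:, 1]
    return (Z[u] * Z[v]).sum(dim=1)

def link_loss(Z, labeled_edges, y):
    # The analogue of Eq. (1) with V_L replaced by a labeled edge set E_L.
    return F.binary_cross_entropy_with_logits(link_logits(Z, labeled_edges), y.float())
```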

Graph-level Task. When treating a graph as a special form of node, graph-level tasks are similar to node-level tasks. Take the most frequent application, graph classification, as an example: the graph set 𝒢 can be represented as 𝒢 = {G1, G2, ..., Gn}, where G_i = (V_i, E_i) is a subgraph of the entire graph. The core idea of solving the graph classification problem is to learn a mapping function 𝒢 → C, where C represents the set of graph categories, so as to predict the class of an unseen graph more accurately. The objective function of graph classification tasks is similar to Eq. (1) as well, in which the labeled training set is 𝒢_L instead of V_L, and y_i is the category of graph G_i.


In the real world, graph classification plays an important role in many crucial applications, such as social and biological graph classification.

3 ADVERSARIAL ATTACK

In this section, we first introduce the definition of adversarial attacks against deep learning methods on graphs. Then, we categorize these attack models from different perspectives. Finally, we give a clear overview of existing works.

3.1 Definition

According to existing works on graph adversarial attacks, we summarize them and give a unified formulation.

Attack on Graph Data. Consider f, a deep learning function designed to tackle related downstream tasks. Given a set of target instances T ⊆ S − S_L, where S can be V, E or 𝒢 for different levels of tasks and S_L denotes the instances with labels,[1] the attacker aims to maximize the loss of the target instances on f as much as possible, resulting in a degradation of prediction performance. Generally, we can define the attack against deep learning models on graphs as:

$$\begin{aligned} \max_{\hat{G} \in \Psi(G)} \quad & \sum_{t_i \in \mathcal{T}} L\big(f_{\theta^*}(\hat{G}_{t_i}, X, t_i),\, y_i\big) \\ \text{s.t.} \quad & \theta^* = \arg\min_{\theta} \sum_{v_j \in \mathcal{S}_L} L\big(f_{\theta}(\tilde{G}_{v_j}, X, v_j),\, y_j\big) \end{aligned} \tag{2}$$

We denote a graph G ∈ 𝒢 with instance t_i in it as G_{t_i}. Ĝ denotes the perturbed graph and Ψ(G) indicates the perturbation space on G, while G̃ stands for either the original graph G or a perturbed one Ĝ, depending on the attack stage (see Section 3.2.3).

To make the attack as imperceptible as possible, we set a metric for comparing the graphs before and after the attack, such that:

$$Q(\hat{G}_{t_i}, G_{t_i}) < \epsilon \quad \text{s.t.} \quad \hat{G}_{t_i} \in \Psi(G) \tag{3}$$

where Q denotes a similarity function and ϵ is an allowed threshold of changes. More metrics will be detailed in Section 5.
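One simple instantiation of Q (our own minimal example; the survey discusses richer choices in Section 5) counts the flipped edges between the clean and perturbed adjacency matrices:

```python
import numpy as np

def q_edge_changes(A, A_prime):
    """The number of modified (unordered) edges, i.e. flipped entries
    in the upper triangle of the adjacency matrix."""
    return int(np.triu(np.abs(A - A_prime), k=1).sum())

def is_unnoticeable(A, A_prime, epsilon):
    # Eq. (3): the perturbed graph is acceptable only if Q(G_hat, G) < epsilon.
    return q_edge_changes(A, A_prime) < epsilon
```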

3.2 Taxonomies of Attacks

As we focus on a simple graph, different types of attacks can be conducted on the target systems. In this section, we provide taxonomies for the attack types, mainly following Sun et al. [57] and subsequently extending them according to various criteria. For the different scenarios, we give relevant instructions and summarize them in the following.

3.2.1 Attacker's Knowledge. To conduct attacks on the target system, an attacker usually possesses certain knowledge about the target models and the dataset, which helps them achieve the adversarial goal. Based on the knowledge of the target models [57], we characterize different threat levels of attacks.

[1] Note that attackers only focus on attacking the target instances in the test set.


White-box Attack. This is the simplest kind of attack, where attackers possess complete information about the target models, including the model architecture, parameters and gradient information, i.e., the target models are fully exposed to the attackers. By utilizing such rich information, attackers can easily affect the target models and cause destructive effects. However, it is impracticable in the real world since it is costly to obtain such complete knowledge of target models. Therefore, the white-box attack is less dangerous but often used to approximate the worst-case performance of a system under attack.

Gray-box Attack. In this case, attackers are restricted from possessing excessive knowledge about the target models. This reflects real-world scenarios much better, since it is more likely that attackers have limited access to information, e.g., they are only familiar with the architecture of the target models. Therefore, it is harder to conduct than the white-box attack but more dangerous for the target models.

Black-box Attack. In contrast to the white-box attack, the black-box attack assumes that the attacker knows nothing about the targeted systems. Under this setting, attackers are at most allowed to perform black-box queries on limited samples. However, it is the most dangerous attack once it works, since attackers can attack any model with limited (or no) information.

There is also the term "no-box attack" [55], which refers to an attack on a surrogate model built from the attackers' (limited) understanding of the target model.[2] As attackers have complete knowledge of the surrogate model, the adversarial examples are generated under the white-box setting. A no-box attack becomes a white-box attack once the attackers build a surrogate model successfully and the adversarial examples transfer well enough to fool other target models. Theoretically, this kind of attack can hardly affect systems in most cases given these constraints; however, if the surrogate model has strong transferability, the no-box attack is also destructive.

Just as attackers are restricted in their knowledge of the target models, they may also have limited access to the information of the dataset D. Based on the different levels of knowledge of the targeted system [17], we have:

Perfect Knowledge. In this case, the dataset D, including the entire graph structure, node features, and even the ground-truth labels of objects, is completely exposed to attackers, i.e., the attacker is assumed to know everything about the dataset. This is impractical, but it is the most common setting in previous studies. Still, considering the damage that could be done by an attacker with perfect knowledge is critical, since it may expose the potential weaknesses of the target system on a graph.

Moderate Knowledge. The moderate knowledge case represents an attacker with less information about the dataset. This makes attackers focus more on the performance of their attacks, or even search for more information through legitimate or illegitimate means.

Minimal Knowledge. This is the hardest case, as attackers can only conduct attacks with minimal knowledge of the dataset, such as partial connections of the graph or the feature information of certain nodes. This is essential information for an attacker; otherwise the attack will fail even under the white-box setting. The minimal knowledge case represents the least sophisticated attacker and naturally carries a lower threat level.

To conclude, we use K to denote the attackers' knowledge in an abstract knowledge space. K consists of three main parts: the dataset knowledge, the training algorithm and the learned parameters of the target models. For instance, K = (D, f, θ) denotes a white-box attack with perfect knowledge, and K = (D_train, f′, θ′) denotes a gray- or black-box attack with moderate knowledge, where f′ and θ′ come from the surrogate model and D_train ⊂ D denotes the training dataset.

[2] We must realize that the no-box attack is a kind of gray- or black-box attack, so we don't have a separate category for it.


Fig. 2. An example of the evasion attack and the poisoning attack. (Image Credit: Zugner et al. [85])


3.2.2 Attacker's Goal. The goal generally distinguishes three different aspects of attacks, i.e., security violation, attack specificity and error specificity [1]. They are in fact not mutually exclusive, and more details are discussed below.

Security Violation. Security violations can be categorized into availability attacks, integrity attacks and others. In an availability attack, the attacker attempts to destroy the functionality of the system, thereby impairing its normal operation. The damage is global, that is, it degrades the overall performance of the whole system. In an integrity attack, the attackers' purpose is to bypass or fool the detection of the system, which differs from the availability attack in that it does not destroy the normal operation of the system. There are other goals as well, such as reverse-engineering model information to gain private knowledge.

Error Specificity. Taking the node classification task as an example, an error-specific attack aims to have predictions misclassified as specific labels, while an unspecific attack does not care what the prediction is; the attack is considered successful as long as the prediction is wrong.

Attack Specificity. This perspective focuses on the range of the attack, which divides attacks into targeted attacks and non-targeted (general) attacks. A targeted attack focuses on a specific subset of nodes (usually a single target node), while a non-targeted attack is undifferentiated and global. With reference to Eq. (2), the difference lies in the domain: T ⊂ S − S_L or T = S − S_L.

It is worth noting that in some other fields (e.g., computer vision), targeted attack refers to the error-specific attack, and non-targeted attack to the error-unspecific attack. In the field of graph adversarial learning, we suggest distinguishing targeted attacks based on the attack range, and error-specific attacks based on the consequence.

3.2.3 Attacker's Capability. Attacks can be divided into poisoning attacks and evasion attacks according to the adversaries' capabilities; they occur at different stages of the learning pipeline.

Poisoning Attack. Poisoning attacks (a.k.a. training-time attacks) try to affect the performance of the target models by modifying the dataset in the training stage, i.e., the target models are trained on the poisoned datasets. Since transductive learning is widely used in most graph analysis tasks,


the test samples (but not their labels) participate in the training stage, which contributes to the popularity of poisoning attacks. Under this scenario, the parameters of the target models are retrained after the training data is modified; thus, following Eq. (2), we can define poisoning attacks as:

$$\begin{aligned} \max_{\hat{G} \in \Psi(G)} \quad & \sum_{t_i \in \mathcal{T}} L\big(f_{\theta^*}(\hat{G}_{t_i}, X, t_i),\, y_i\big) \\ \text{s.t.} \quad & \theta^* = \arg\min_{\theta} \sum_{v_j \in \mathcal{S}_L} L\big(f_{\theta}(\hat{G}_{v_j}, X, v_j),\, y_j\big) \end{aligned} \tag{4}$$

Evasion Attack. While poisoning attacks focus on the training phase, evasion attacks (a.k.a. test-time attacks) tend to inject adversarial examples at test time. Evasion attacks occur after the target model has been well trained on a clean graph, i.e., the learned parameters are fixed during evasion attacks. Therefore, we can define evasion attacks by slightly changing part of Eq. (4):

$$\theta^* = \arg\min_{\theta} \sum_{v_j \in \mathcal{S}_L} L\big(f_{\theta}(G_{v_j}, X, v_j),\, y_j\big) \tag{5}$$

In Figure 2, we show a toy example of the poisoning attack and the evasion attack. In most cases the poisoning attack does not work as well, because the model is retrained and thereby alleviates the adversarial impacts.

3.2.4 Attack Strategy. To attack a target model on graph data, attackers may adopt a range of strategies to achieve their adversarial goals. In most instances, they focus on the graph structure or the node/edge features. Based on the strategy applied to a graph, we have:

Topology Attack. Attackers mainly focus on the topology of the graph, a.k.a. a structure attack. For example, they are allowed to legally add or remove edges between nodes in the graph to fool the target system. To specify this, we define an attack budget Δ ∈ N; the definition of the topology attack is then:

$$\begin{aligned} \max_{\hat{G} \in \Psi(G)} \quad & \sum_{t_i \in \mathcal{T}} L\big(f_{\theta^*}(\hat{G}_{t_i}, X, t_i),\, y_i\big) \\ \text{s.t.} \quad & \sum_{u < v} |A_{u,v} - A'_{u,v}| \le \Delta \end{aligned} \tag{6}$$

where A′ is the adjacency matrix of the perturbed graph, and θ* is as discussed in Section 3.2.3.

Feature Attack. Although the topology attack is more common, attackers can conduct feature attacks as well. Under this setting, the features of specified objects are changed. However, unlike the graph structure, the node/edge features can be either binary or continuous, i.e., X ∈ {0, 1}^{N×F(·)} or X ∈ R^{N×F(·)}. Attackers can flip binary features like edges, while for continuous features they can add small values. Thus the definition of feature attacks is:

$$\begin{aligned} \max_{\hat{G} \in \Psi(G)} \quad & \sum_{t_i \in \mathcal{T}} L\big(f_{\theta^*}(\hat{G}_{t_i}, X, t_i),\, y_i\big) \\ \text{s.t.} \quad & \sum_{u} \sum_{j} |X_{u,j} - X'_{u,j}| \le \Delta \end{aligned} \tag{7}$$

where X′ is similarly defined, and here Δ ∈ N for binary features and Δ ∈ R for continuous features.


Hybrid. Usually, attackers conduct both attack strategies at the same time to exert a more powerful impact. Besides, they may even add several fake nodes (with fake labels), which have their own features and relationships with other benign instances. For example, fake users can be injected into a recommendation system to affect its results [16, 29]. Therefore, we conclude with a unified formulation as follows:

$$\begin{aligned} \max_{\hat{G} \in \Psi(G)} \quad & \sum_{t_i \in \mathcal{T}} L\big(f_{\theta^*}(\hat{G}_{t_i}, X, t_i),\, y_i\big) \\ \text{s.t.} \quad & \sum_{u < v} |A_{u,v} - A'_{u,v}| + \sum_{u} \sum_{j} |X_{u,j} - X'_{u,j}| \le \Delta \end{aligned} \tag{8}$$

Note that A′ and X′ may not have the same dimensions as A and X, respectively, if fake nodes are added to the graph.

3.2.5 Attacker's Manipulation. Although attackers may have sufficient knowledge about the target system, they may not always have access to manipulate all of the dataset. Besides, different manipulations may have different budget costs. For example, in an e-commerce system [14], attackers can only purchase more items (add edges) but cannot delete the purchase records (remove edges). Based on the different manipulations, we have:

Add. In a graph, attackers can add edges between different nodes, or add features to specified nodes/edges. This is the simplest manipulation and naturally has the lowest budget in most scenarios.

Remove. In a graph, attackers can remove edges between different nodes, or remove features from specified nodes/edges. This kind of manipulation is harder than "add", since attackers may not have sufficient access to execute the "remove" manipulation.

Rewiring. The manipulations mentioned above may become more noticeable to the target system as Δ grows. To address this problem, the rewiring operation has been proposed as a less noticeable alternative to simple adding/removing manipulations [39, 70]. For instance, attackers can add an edge connected to a target node while removing another edge connected to it at the same time. Each rewiring manipulation consists of two single operations, which can preserve some basic graph properties (e.g., degree) and is naturally more unnoticeable.

To conclude, we define an operation set O = {O_Add, O_Rem, O_Rew}, representing the above three manipulations of attackers, respectively.
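A sketch of the rewiring operation O_Rew on an undirected adjacency matrix (function and argument names are our own illustrative choices):

```python
import numpy as np

def rewire(A, target, drop, add):
    """One rewiring operation: remove edge (target, drop) and add edge
    (target, add) in a single step, preserving the degree of `target`."""
    A = A.copy()
    assert A[target, drop] == 1 and A[target, add] == 0 and target != add
    A[target, drop] = A[drop, target] = 0
    A[target, add] = A[add, target] = 1
    return A
```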

3.2.6 Attack Algorithm. Generally speaking, current methods for generating adversarial examples are mainly based on gradient information, either from the target model (white-box attack) or from a surrogate model (black- or gray-box attack). Beyond that, several methods generate adversarial examples based on other algorithms. From the perspective of the attack algorithm, we have:

Gradient-based Algorithm. Intuitively, the gradient-based algorithm is simple but effective. The core idea is to fix the parameters of a trained model and regard the input as a hyperparameter to optimize. Similar to the training process, attackers can use the partial derivative of the loss L with respect to edges (topology attack) or features (feature attack) to decide how to manipulate the dataset. However, gradients cannot be applied directly to the input due to the discreteness of graph data; instead, attackers often choose the entry with the greatest absolute gradient and manipulate it with a proper value. While most deep learning models are optimized by gradients, attackers can, conversely, destroy them by gradients as well.
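The following is a hypothetical sketch of such a greedy gradient-based topology attack (assuming PyTorch and a differentiable model mapping (A, X) to node logits; it is a generic illustration, not a specific published method). Each step flips the single entry whose gradient promises the largest first-order increase of the target loss:

```python
import torch
import torch.nn.functional as F

def greedy_gradient_flip(model, A, X, target, label, budget):
    """Greedily flip `budget` edges to maximize the loss on `target`."""
    A = A.clone()
    for _ in range(budget):
        A_var = A.clone().requires_grad_(True)
        loss = F.cross_entropy(model(A_var, X)[target].unsqueeze(0),
                               torch.tensor([label]))
        grad = torch.autograd.grad(loss, A_var)[0]
        # First-order gain of flipping an entry: +grad where A = 0 (add),
        # -grad where A = 1 (remove).
        score = grad * (1 - 2 * A)
        score.fill_diagonal_(float("-inf"))  # never touch self-loops
        idx = int(torch.argmax(score))
        i, j = idx // A.size(0), idx % A.size(0)
        A[i, j] = A[j, i] = 1 - A[i, j]  # symmetric flip for a simple graph
    return A
```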


Non-gradient-based Algorithm. In addition to gradient information, attackers can generate adversarial examples in other ways. For example, using a genetic algorithm, attackers can select the population (adversarial examples) with the highest fitness score (e.g., erroneous outputs of the target/surrogate model) generation by generation. Besides, reinforcement learning algorithms are also commonly used: reinforcement-learning-based attack methods [20, 39] learn generalizable adversarial examples within an action space. Moreover, adversarial examples can even be generated by a well-designed generative model.

3.2.7 Target Task. As discussed in Section 2.3, there exist three major tasks in the graph domain. According to the different levels of tasks, we can divide existing attack methods into the following three categories.

Node-relevant Task. Currently, there are several attack models against node-relevant tasks [2, 4, 10, 12, 20, 58, 65, 68, 70, 72, 75, 77, 82, 85, 86]. Typically, Bojchevski et al. [2] and Hou et al. [29] utilize random walks [49] as a surrogate model to attack node embeddings, and Dai et al. [20] use a reinforcement learning based framework to disturb node classification tasks. In general, most of the existing works address the node classification task due to its ubiquity in the real world.

Link-relevant Task. Many relationships in the real world can be represented by graphs, and some of these graphs, such as social graphs, are always dynamic in reality. Link prediction on graphs thus arises, in order to predict changes of edges. Link prediction is also the most common application among link-relevant tasks, and many attack-related studies have been proposed [10, 13, 16, 18, 58, 70, 82].

Graph-relevant Task. Graph-relevant tasks treat the graph as a basic unit. Compared to node- or link-relevant tasks, they are macroscopic and large-scale. Applications of graph-relevant methods lean toward the research communities of biology, chemistry, environment, materials, etc. In these tasks, the model tries to extract features of nodes and spatial structures to represent a graph, so as to perform downstream operations such as graph classification or clustering. Similar to Eq. (2), the target of the attack should be a graph, while y is determined by the specific task. For graph-relevant tasks, some attack studies have also appeared [8, 17, 39].

3.3 Summary: Attack on Graph

In this part, we discuss the main contributions of current works, point out the limitations to be overcome, and propose several open questions in this area. Specifically, we cover the released papers and their characteristics in Table 2.

3.3.1 Major Contributions. So far, most of the existing works in the attack scenario are mainly based on gradients, either of the adjacency matrix or of the feature matrix, leading to topology attacks and feature attacks, respectively. However, the gradient information of the target model is hard to acquire; instead, attackers train a surrogate model to extract gradients. In addition to gradient-based algorithms, several heuristic methods have been proposed to achieve the attack goal, such as genetic algorithm [71] and reinforcement learning [60] based algorithms. According to the different tasks in graph analysis, we summarize the major contributions of current works in the following.

Node-relevant Task. Most research focuses on the node classification task. The work [85] is the first to study adversarial attacks on graph data, using an efficient greedy search method to perturb node features and graph structure, attacking traditional graph learning models. From the perspective of gradients, several works [12, 20, 68, 72, 75, 86] focus on


Table 2. Related works on attack in detail.

| Reference | Model | Algorithm | Target Task | Target Model | Baseline | Metric | Dataset |
|---|---|---|---|---|---|---|---|
| [6] | GF-Attack | Graph signal processing | Node classification | GCN, SGC, DeepWalk, LINE | Random, Degree, RL-S2V, A_class | Accuracy | Cora, Citeseer, Pubmed |
| [72] | IG-FGSM, IG-JSMA | Gradient-based GCN | Node classification | GCN | FGSM, JSMA, Nettack | Classification margin, Accuracy | Cora, Citeseer, PolBlogs |
| [9] | EPA | Genetic algorithm | Community detection | GRE, INF, LOU | A_Q, A_S, A_B, A_D, D_S, D_W | NMI, ARI | Synthetic networks, Football, Email, Polblogs |
| [86] | Meta-Self, Meta-Train | Gradient-based GCN | Node classification | GCN, CLN, Deepwalk | DICE, Nettack, First-order | Misclassification rate, Accuracy | Cora-ML, Citeseer, PolBlogs, PubMed |
| [2] | A_DW2, A_DW3 | Gradient-based random walk | Node classification, Link prediction | Deepwalk | B_rnd, B_eig, B_deg | F1 score, Classification margin | Cora, Citeseer, PolBlogs |
| [13] | TGA-Tra, TGA-Gre | Gradient-based DDNE | Link prediction | DDNE, ctRBM, GTRBM, dynAERNN | Random, DGA, CNA | ASR, AML | RADOSLAW, LKML, FB-WOSN |
| [39] | ReWatt | Reinforcement learning based on GCN | Graph classification | GCN | RL-S2V, Random | ASR | REDDIT-MULTI-12K, REDDIT-MULTI-5K, IMDB-MULTI |
| [75] | PGD, Min-Max | Gradient-based GCN | Node classification | GCN | DICE, Meta-Self, Greedy | Misclassification rate | Cora, Citeseer |
| [77] | EDA | Genetic algorithm based on Deepwalk | Node classification, Community detection | HOPE, LPA, EM, Deepwalk | Random, DICE, RLS, DBA | NMI, Micro-F1, Macro-F1 | Karate, Game, Dolphin |
| [4] | DAGAER | Generative model based on VGAE | Node classification | GCN | Nettack | ASR | Cora, Citeseer |
| [79] | - | Knowledge embedding | Fact plausibility prediction | TransE, TransR, RESCAL | Random | MRR, HR@K | FB15k, WN18 |
| [65] | - | Based on LinLBP | Node classification, Evasion | LinLBP, JW, GCN, LBP, RW, LINE, DeepWalk, Node2vec | Random, Nettack | FNR, FPR | Facebook, Enron, Epinions, Twitter, Google+ |
| [8] | Q-Attack | Genetic algorithm | Community detection | FN, Lou, SOA, LPA, INF, Node2vec+KM | Random, CDA, DBA | NMI, Modularity Q | Karate, Dolphins, Football, Polbooks |
| [29] | HG-attack | Label propagation algorithm, Nodes injection | Malware detection | Orig-HGC | AN-Attack | TP, TN, FP, FN, F1, Precision, Recall, Accuracy | Tencent Security Lab Dataset |
| [16] | UNAttack | Gradient-based similarity method, Nodes injection | Recommendation | Memory-based CF, BPRMF, NCF | Random, Average, Popular, Co-visitation | Hit@K | Filmtrust, Movielens, Amazon |
| [18] | - | Gradient-based GAN and MF, Nodes injection | Recommendation | MF | - | Attack difference, TVD, JS, Est., Rank loss @K, Adversarial loss | MovieLens 100k, MovieLens 1M |
| [68] | Greedy, GAN | Gradient-based GCN and GAN | Node classification | GCN | Random | Accuracy, F1 score, ASR | Cora, Citeseer |
| [70] | CTR, OTC | Neighbor score based on graph structure | Link prediction | Traditional link prediction algorithms | - | AUC, AP | WTC 9/11, ScaleFree, Facebook, Random network |
| [10] | IGA | Gradient-based GAE | Link prediction | GAE, LRW, DeepWalk, Node2vec, CN, Random, Katz | RAN, DICE, GA | ASR, AML | NS, Yeast, FaceBook |
| [20] | RL-S2V | Reinforcement learning | Node/Graph classification | GCN, GNN | Random sampling | Accuracy | Citeseer, Cora, Finance, Pubmed |
| [85] | Nettack | Greedy search & gradient based on GCN | Node classification | GCN, CLN, Deepwalk | Rnd, FGSM | Classification margin, Accuracy | Cora-ML, Citeseer, PolBlogs |
| [12] | FGA | Gradient-based GCN | Node classification, Community detection | GCN, GraRep, DeepWalk, Node2vec, LINE, GraphGAN | Random, DICE, Nettack | ASR, AML | Cora, Citeseer, PolBlogs |
| [58] | Opt-attack | Gradient-based Deepwalk | Link prediction | DeepWalk, LINE, SC, Node2vec, GAE | Random, PageRank, Degree sum, Shortest path | AP, Similarity score | Facebook, Cora, Citeseer |
| [82] | Approx-Local | Similarity methods | Link prediction | Local & global similarity metrics | RandomDel, GreedyBase | Katz similarity, ACT distance, Similarity score | Random network, Facebook |
| [17] | Targeted noise injection, Small community attack | Noise injection | Graph clustering, Community detection | SVD, Node2vec, Community detection algorithms | - | ASR, FPR | Reverse Engineered DGA Domains, NXDOMAIN |


the topology attack, adding/removing edges between nodes based on the gradient information from various surrogate models. In particular, Xu et al. [75] present a novel optimization-based attack method that uses the gradients of a surrogate model and eases the difficulty of tackling discrete graph data; Zugner et al. [86] use meta-gradients to solve the bilevel problem of poisoning a graph; Wang et al. [68] propose a greedy method based on Generative Adversarial Networks (GANs) [25] to generate the adjacency and feature matrices of fake nodes, which are injected into a graph to mislead the target models; Chen et al. [12] and Dai et al. [20] both use GCN as a surrogate model to extract gradient information and thereby generate an adversarial graph; Wu et al. [72] argue that integrated gradients better reflect the effect of perturbing certain features or edges. Moreover, Dai et al. [20] consider evasion attacks on the tasks of node classification and graph classification, and propose two effective attack methods based on reinforcement learning and genetic algorithms, respectively. Taking Deepwalk [49] as a base method, Bojchevski et al. [2] and Xuan et al. [77] propose an eigendecomposition-based and a genetic-algorithm-based strategy, respectively, to attack network embeddings. Also, Bose et al. [4] design a unified encoder-decoder framework from the generative perspective, which can be employed to attack diverse domains (images, text and graphs), and Wang et al. [65] propose a threat model that manipulates the graph structure to evade detection by efficiently solving a graph-based optimization problem. Considering real-world application scenarios, Hou et al. [29] propose an algorithm that allows malware to evade detection by injecting nodes (apps).

Link-relevant Task. Link prediction is another fundamental research problem in network analysis. In this scenario, Waniek et al. [70] study the link connections and propose heuristic algorithms to evade detection via rewiring operations. Furthermore, Chen et al. [10] put forward a novel iterative gradient attack method based on a graph auto-encoder framework. Similarly, Chen et al. [13] also exploit the gradient information of a surrogate model, and are the first to study adversarial attacks on dynamic network link prediction (DNLP). Besides, Sun et al. [58] focus on poisoning attacks and propose a unified optimization framework based on projected gradient descent. In the recommendation scenario, where the interactions between users and items are considered as a graph and treated as a link prediction task, Christakopoulou et al. [18] and Chen et al. [16] propose methods of injecting fake users to degrade the recommendation performance of the system.

Graph-relevant Task. Few works study adversarial attacks in this scenario. Ma et al. [39] propose a rewiring-operation-based algorithm, which uses reinforcement learning to learn the attack strategy for the graph classification task. Besides, Chen et al. [8] introduce the problem of community detection and propose a genetic-algorithm-based method. Chen et al. [17] focus on graph clustering and community detection scenarios, devise generic attack algorithms based on noise injection, and demonstrate their effectiveness against a real-world system.

3.3.2 Current Limitations. Despite the remarkable achievements in attacking graph learning models, several limitations remain to be overcome:

• Unnoticeability. Most works take no care to preserve adversarial attacks from noticeable impacts; they simply consider lower attack budgets, which is far from enough.
• Scalability. Existing works mainly focus on relatively small-scale graphs; however, million-scale or even larger graphs are commonly seen in real life, and efficient attacks on larger graphs remain unexplored.[3]

[3] There have been some efforts on large-scale graph computation that would be useful for graph learning methods [78, 84].


• Knowledge. It is common to assume that attackers have perfect knowledge of the dataset, but this is unreasonable and impractical due to attackers' limited access. Nevertheless, very few works conduct attacks with moderate or even minimal knowledge.
• Physical Attack. Most of the existing works conduct attacks on ideal datasets. In the real world, however, attacks need to consider more factors. For instance, in a physical attack the adversarial examples given as input may be accidentally distorted, and the attack often fails to achieve the desired results. Unlike conducting attacks on ideal datasets, this brings more challenges to attackers.

4 DEFENSE

The proposed attack methods have made researchers realize the importance of the robustness of deep learning models, and correspondingly, defense methods have also been proposed. In this section, we give general definitions of defense models against adversarial attacks on graph data and related concepts. In addition, we systematically classify existing defense methods and detail some typical defense algorithms.

4.1 Definition

Simply put, the purpose of defense is to keep the model's performance stable on data that has been maliciously perturbed. Although some defense models have been proposed, there is no clear and unified definition of the defense problem. To facilitate the following discussion, we propose a unified formulation of the defense problem.

Defense on Graph Data. Most symbols are the same as in Section 3, and we define f̂ as a deep learning function with a loss function L̂ designed for defense; it receives a graph that is either perturbed or not. The defense problem can then be defined as:

$$\begin{aligned} \min_{\tilde{G} \in \Psi(G) \cup \{G\}} \quad & \sum_{t_i \in \mathcal{T}} \hat{L}\big(\hat{f}_{\theta^*}(\tilde{G}_{t_i}, X, t_i),\, y_i\big) \\ \text{s.t.} \quad & \theta^* = \arg\min_{\theta} \sum_{v_j \in \mathcal{S}_L} \hat{L}\big(\hat{f}_{\theta}(\tilde{G}_{v_j}, X, v_j),\, y_j\big) \end{aligned} \tag{9}$$

where G̃ ∈ Ψ(G) ∪ {G} can be a well-designed graph Ĝ for the purpose of defense, or a clean graph G, depending on whether the model has been attacked.

4.2 Taxonomies of Defenses

In this section, we divide the existing defense methods into several categories according to the algorithms used and describe them with examples. To the best of our knowledge, this is the first time these defense methods have been clearly classified and their types clearly defined. All taxonomies are listed below:

Preprocessing-based Defense. As the most intuitive way, directly manipulating the raw data has a great impact on model performance. Additionally, preprocessing raw data is independent of model structures and training methods, which provides considerable scalability and transferability. Some existing works [72] improve model robustness by conducting certain preprocessing steps before training, and we define this type of defense as Preprocessing-based Defense. For instance, Wu et al. [72] drop edges that connect nodes with a low similarity score, which reduces the risk of those edges being attacked and does almost no harm to model performance.
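A minimal sketch of this preprocessing idea (our own implementation of edge dropping with the Jaccard similarity on binary features, in the spirit of Wu et al. [72]; the threshold is a tunable assumption):

```python
import numpy as np

def jaccard(x_u, x_v):
    """Jaccard similarity between two binary feature vectors."""
    union = np.logical_or(x_u, x_v).sum()
    return np.logical_and(x_u, x_v).sum() / union if union > 0 else 0.0

def drop_dissimilar_edges(A, X, threshold=0.0):
    """Remove edges whose endpoints have feature similarity <= threshold
    before training; adversarially added edges tend to connect dissimilar nodes."""
    A_clean = A.copy()
    rows, cols = np.nonzero(np.triu(A, k=1))
    for u, v in zip(rows, cols):
        if jaccard(X[u], X[v]) <= threshold:
            A_clean[u, v] = A_clean[v, u] = 0
    return A_clean
```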


Structure-based Defense. In addition to the raw data, the model structure is also crucial to model performance. Some existing works modify the model structure, such as GCN, to gain more robustness, and we define this type of defense as Structure-based Defense. Instead of GCN's graph convolutional layers, Zhu et al. [83] use Gaussian-based graph convolutional layers, which learn node embeddings from Gaussian distributions and assign attention weights according to their variances. Wang et al. [67] propose dual-stage aggregation as a Graph Neural Network (GNN) encoder layer to learn the information of neighborhoods, and adopt GAN [25] to conduct contrastive learning. Tang et al. [62] initialize the model by meta-optimization and penalize the aggregation process of the GNN. Ioannidis et al. [32] propose a novel GCN architecture, named Adaptive Graph Convolutional Network (AGCN), to conduct robust semi-supervised learning.

Adversarial-based Defense. Adversarial training has been widely used in deep learning due to its excellent performance. Some researchers have successfully adopted adversarial training from other fields to the graph domain to improve model robustness, and we define defense methods that use adversarial training as Adversarial-based Defense. There are two types of adversarial training: i) Training with adversarial goals. Some adversarial training methods gradually optimize the model in a continuous min-max manner, under the guidance of two opposite (minimizing and maximizing) objective functions [24, 56, 75]. ii) Training with adversarial examples. Other adversarial-based models are fed with adversarial samples during training, which helps the model learn to adjust to adversarial samples and thus reduces the negative impact of potential attack samples [11, 22, 28, 69]. A minimal sketch of the min-max idea follows.
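The sketch below is a generic illustration of feature-space adversarial training in PyTorch (our own construction, not any specific published defense; xi bounds the FGSM-style inner perturbation):

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, A, X, y, labeled_mask, xi=1e-2):
    """One step of feature-space adversarial training.

    Inner step (maximize): build an FGSM-style worst-case perturbation of X.
    Outer step (minimize): fit the model on both clean and perturbed inputs.
    """
    X_adv = X.clone().requires_grad_(True)
    inner = F.cross_entropy(model(A, X_adv)[labeled_mask], y[labeled_mask])
    grad = torch.autograd.grad(inner, X_adv)[0]
    X_pert = (X + xi * grad.sign()).detach()  # l_inf ball of radius xi

    optimizer.zero_grad()
    loss = (F.cross_entropy(model(A, X)[labeled_mask], y[labeled_mask]) +
            F.cross_entropy(model(A, X_pert)[labeled_mask], y[labeled_mask]))
    loss.backward()
    optimizer.step()
    return loss.item()
```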

Objective-based Defense. As a simple and effective method, modifying the objective function plays an important role in improving model robustness. Many existing works attempt to train a robust model against adversarial attacks by optimizing the objective function, and we define this type of defense as Objective-based Defense. There is, however, a slight overlap between the definitions of Objective-based Defense and the Adversarial-based Defense mentioned above, because min-max adversarial training (the first type of Adversarial-based Defense) also falls within the scope of objective optimization. Therefore, to accurately draw the boundary between Adversarial-based Defense and Objective-based Defense, we only consider methods that are objective-based but not adversarial-based as instances of Objective-based Defense. Zugner et al. [87] and Bojchevski et al. [3] combine hinge loss and cross-entropy loss to perform robust optimization. Jin et al. [33] and Chen et al. [11] regularize the training process by studying the characteristics of graph powering and by smoothing the cross-entropy loss function, respectively.

Detection-based Defense. Some existing works focus on the detection of adversarial attacks, or the certification of model/node robustness, which we define as Detection-based Defense. Although these detection-based methods are unable to improve model robustness directly, they can serve as supervisors that keep monitoring the model's security and raise an alarm when an attack is detected. Zugner et al. [87] and Bojchevski et al. [3] propose novel methods to certify whether a node is robust; a robust node will not be affected even under the worst-case/strongest attack. Pezeshkpour et al. [50] study the impact of adding/removing edges by performing adversarial modifications on the graph. Xu et al. [76] adopt link prediction and its variants to detect potential risks; for example, a link with a low score could be a maliciously added edge. Zhang et al. [80] utilize perturbations to explore the model structure and propose a method to detect adversarial attacks. Hou et al. [29] detect the target node by uncovering poisoning nodes injected into the heterogeneous graph [59]. Ioannidis et al. [31] effectively detect anomalous nodes in large-scale graphs by applying graph-based random sampling and consensus strategies.


Hybrid Defense. As the name suggests, we define Hybrid Defense as a defense method that combines two or more of the different types of defense algorithms mentioned above. Many studies flexibly combine several types of defense algorithms to achieve better performance, thus alleviating the limitations of using a single method. As mentioned above, Zugner et al. [87] and Bojchevski et al. [3] address the certification problem of node robustness and conduct robust training with objective-based optimization. Wang et al. [67] focus on improving the model structure and adversarially train the model with a GAN. Chen et al. [11] adopt adversarial training together with other regularization mechanisms (e.g., gradient smoothing, loss smoothing). Xu et al. [76] propose a novel graph generation method together with other detection mechanisms (e.g., link prediction, subgraph sampling, outlier detection) as preprocessing to detect potential malicious edges. Miller et al. [43] consider a combination of features derived from the graph structure and the original node attributes, and use well-designed training data selection methods for the classification task. These are all instances of hybrid defense.

Others. Currently, the amount of research on defense is far less than that on attack in the graph domain. To the best of our knowledge, most existing works on defense only focus on node classification tasks, and there are many opportunities to study defense methods for other tasks on graphs, which would enrich the defense types. For example, Pezeshkpour et al. [50] evaluate the robustness of several link prediction models, and Zhou et al. [81] adopt a heuristic approach to reduce the damage of attacks.

4.3 Summary: Defense on Graph

In this section, we introduce the contributions and limitations of current works on defense, and then discuss the potential research opportunities in this area. The details and comparisons of the various methods are given in Table 3.

4.3.1 Major Contributions. Currently, most methods for improving the robustness of GNNs start from two aspects: a robust training method or a robust model structure. Among them, the training methods are mostly adversarial training, and many of the model structure improvements are made using the attention mechanism. In addition, some studies do not directly improve the robustness of GNNs but instead try to verify robustness or to detect perturbed data. In this part, considering the different ways in which existing methods contribute to the adversarial defense of graphs, we summarize the contributions of graph defense from the following three perspectives: adversarial learning, model improvement, and others.

Adversarial Learning. As a method that has been shown to be effective in defending against adversarial attacks on image and text data, adversarial learning [26] augments the training data with adversarial examples during the training stage. Some researchers have tried to apply this idea to graph defense methods. Wang et al. [67] argue that the vulnerabilities of graph neural networks are related to the aggregation layer and the perceptron layer. To address these two weaknesses, they propose an adversarial training framework with a modified GNN model to improve the robustness of GNNs. Chen et al. [11] propose different defense strategies based on adversarial training for targeted and global adversarial attacks, with smoothing distillation and a smoothed cross-entropy loss function. Feng et al. [24] propose an adversarial training method for attacks on node features with a graph adversarial regularizer that encourages the model to generate similar predictions on a perturbed target node and its connected nodes. Sun et al. [56] successfully transfer the efficiency of Virtual Adversarial Training (VAT) to the semi-supervised node classification task on graphs and apply several regularization mechanisms to the original GCN to refine its generalization performance.


Table 3. Related works on defense in detail. The type is divided according to the algorithm.

| Reference | Model | Algorithm | Type | Target Task | Target Model | Baseline | Metric | Dataset |
|---|---|---|---|---|---|---|---|---|
| [87] | GNN (trained with RH-U) | Robustness certification, Objective based | Hybrid | Node classification | GNN, GCN | GNN (trained with CE, RCE, RH) | Accuracy, Averaged Worst-case Margin | Citeseer, Cora-ML, Pubmed |
| [75] | - | Adversarial training | Adversarial training | Node classification | GCN | GCN | Accuracy, Misclassification rate | Citeseer, Cora |
| [33] | r-GCN, VPN | Graph powering | Objective based | Node classification | GCN | ManiReg, ICA, Vanilla GCN, … | Accuracy, Robustness merit, Attack deterioration | Citeseer, Cora, Pubmed |
| [72] | - | Drop edges | Preprocessing | Node classification | GCN | GCN | Accuracy, Classification margin | Citeseer, Cora-ML, Polblogs |
| [67] | DefNet | GAN, GER, ACL | Hybrid | Node classification | GCN, GraphSAGE | GCN, GraphSAGE | Classification margin | Citeseer, Cora, Polblogs |
| [50] | CRIAGE | Adversarial modification | Robustness evaluation | Link prediction | Knowledge graph embeddings | - | Hits@K, MRR | Nations, Kinship, WN18, YAGO3-10 |
| [83] | RGCN | Gaussian-based GCN | Structure based | Node classification | GCN | GCN, GAT | Accuracy | Citeseer, Cora, Pubmed |
| [11] | Global-AT, Target-AT, SD, SCEL | Adversarial training, Smooth defense | Hybrid | Node classification | GNN | AT | ADR, ACD | Citeseer, Cora, PolBlogs |
| [56] | SVAT, DVAT | VAT | Adversarial training | Node classification | GCN | GCN | Accuracy | Citeseer, Cora, Pubmed |
| [80] | - | KL divergence | Detection based | Node classification | GCN, GAT | - | Classification margin, Accuracy, ROC, AUC | Citeseer, Cora, PolBlogs |
| [24] | GCN-GATV | GAT, VAT | Adversarial training | Node classification | GCN | DeepWalk, GCN, GraphSGAN, … | Accuracy | Citeseer, Cora, NELL |
| [22] | S-BVAT, O-BVAT | BVAT | Adversarial training | Node classification | GCN | ManiReg, GAT, GPNN, GCN, VAT, … | Accuracy | Citeseer, Cora, Pubmed, NELL |
| [62] | PA-GNN | Penalized aggregation, Meta learning | Structure based | Node classification | GNN | GCN, GAT, PreProcess, RGCN, VPN | Accuracy | Pubmed, Reddit, Yelp-Small, Yelp-Large |
| [69] | GraphDefense | Adversarial training | Adversarial training | Node/Graph classification | GCN | Drop edges, Discrete AT | Accuracy | Cora, Citeseer, Reddit |
| [29] | HG-HGC | HG-Defense | Detection based | Malware detection | Malware detection system | Other malware detection systems | Detection rate | Tencent Security Lab Dataset |
| [32] | AGCN | Adaptive GCN with edge dithering | Structure based | Node classification | GCN | GCN | Accuracy | Citeseer, PolBlogs, Cora, Pubmed |
| [31] | GraphSAC | Random sampling, Consensus | Detection based | Anomaly detection | Anomaly model | GAE, Degree, Cut ratio, … | AUC | Citeseer, PolBlogs, Cora, Pubmed |
| [3] | GNN (trained with L_RCE, L_CEM) | Robustness certification, Objective based | Hybrid | Node classification | GNN | GNN | Accuracy, Worst-case Margin | Cora-ML, Citeseer, Pubmed |
| [81] | IDOpt, IDRank | Integer Program, Edge Ranking | Heuristic algorithm | Link prediction | Similarity-based link prediction models | PPN | DPR | PA, PLD, TVShow, Gov |
| [43] | SVM with a radial basis function kernel | Augmented feature, Edge selecting | Hybrid | Node classification | SVM | GCN | Classification margin | Cora, Citeseer |
| [28] | APR, AMF | MF-BPR based AT | Adversarial training | Recommendation | MF-BPR | ItemPop, MF-BPR, CDAE, NeuMF, IRGAN | HR, NDCG | Yelp, Pinterest, Gowalla |
| [76] | SL, OD, P+GGD, ENS, GGD | Link prediction, Subsampling, Neighbour analysis | Hybrid | Node classification | GNN, GCN | LP | AUC | Citeseer, Cora |


Wang et al. [69] point out that the values of perturbations in adversarial training can be continuous or even negative, and they propose an adversarial training method to improve the robustness of GCN. He et al. [28] adopt adversarial training for Bayesian Personalized Ranking (BPR) [52] in recommendation by adding adversarial perturbations to the embedding vectors of users and items.

Model Improvement. In addition to improvements in training methods, many studies have focused on improving the model itself. In recent years, the attention mechanism [63] has shown extraordinary performance in the field of natural language processing. Some studies of graph defense have borrowed its ability to automatically weight different features in order to reduce the influence of perturbed edges. Zhu et al. [83] use Gaussian distributions to absorb the effects of adversarial attacks and introduce a variance-based attention mechanism to prevent the propagation of adversarial attacks. Tang et al. [62] propose a novel framework based on a penalized aggregation mechanism that restricts the negative impact of adversarial edges. Hou et al. [29] enhance the robustness of an Android malware detection system with a well-designed defense mechanism that uncovers possible injected poisoning nodes. Jin et al. [33] point out the basic flaws of the Laplacian operator in the original GCN and propose a variable power operator to alleviate these issues. Miller et al. [43] propose to use unsupervised methods to extract features of the graph structure and use support vector machines to complete the node classification task. In addition, they also propose two new training data partitioning strategies.

Others. In addition to improving the robustness of the model, some studies have made other contributions around robustness, such as detecting perturbed edges, certifying node robustness, and analyzing current attack methods. Zugner et al. [87] propose the first work on certifying the robustness of GNNs. Their method gives a robustness certificate stating whether a node is robust under a certain space of perturbations. Zhang et al. [80] study the performance of defense methods under random attacks and Nettack, concluding that graph defense models which use structure exploration are more robust in general. Moreover, this paper proposes a method to detect perturbed edges by calculating the mean of the KL divergences [36] between the softmax probabilities of a node and its neighbors (a small sketch of this signal follows). Pezeshkpour et al. [50] introduce a novel method to automatically detect errors in knowledge graphs by conducting adversarial modifications on them. In addition, the method can study the interpretability of knowledge graph representations by summarizing the most influential facts for each relation. Wu et al. [72] argue that the robustness issue is rooted in the local aggregation of GCN by analyzing attacks on GCN, and propose an effective defense method based on preprocessing. Bojchevski et al. [3] propose a robustness certification method for GNNs regarding perturbations of the graph structure and the label propagation; in addition, they propose a new robust training method guided by their own certificate. Zhou et al. [81] model the problem of robust link prediction as a Bayesian Stackelberg game [64] and propose two defense strategies for choosing which links should be protected. Ioannidis et al. [32] propose a novel edge-dithering (ED) approach that reconstructs the original neighborhood structure with high probability as the number of sampled graphs increases. Ioannidis et al. [31] introduce a graph-based random sampling and consensus approach to effectively detect anomalous nodes in large-scale graphs.
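As a rough, self-contained sketch of that neighborhood-KL detection signal (our own rendering of the idea; thresholding and model details are omitted):

```python
import numpy as np

def mean_neighbor_kl(probs, neighbors, node, eps=1e-12):
    """Mean KL divergence between a node's softmax output and those of its
    neighbors; `probs` is an (n_nodes, n_classes) array and `neighbors`
    maps a node id to its neighbor ids (assumed non-empty). Large values
    may flag potentially perturbed nodes."""
    p = probs[node] + eps
    kls = [float(np.sum(p * np.log(p / (probs[nb] + eps)))) for nb in neighbors[node]]
    return float(np.mean(kls))
```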

4.3.2 Current Limitations. As a newly rising branch that has not yet been sufficiently studied, current defense methods have several major limitations:

• Diversity. At present, most defense works focus only on node classification tasks. From the perspective of defenders, model robustness needs to be improved across different tasks.


• Scalability. Whether a defense model can be widely used in practice largely depends on the cost of model training. Most existing works lack a consideration of training costs.
• Theoretical Proof. Most current methods only illustrate their effectiveness by showing experimental results and textual descriptions. It would be valuable if the effectiveness of a new robust method could be proved theoretically.

5 METRICS

In this section, we first introduce the metrics that are common in graph analysis tasks, which are used in attack and defense scenarios as well. Next, we introduce some new metrics proposed in attack and defense works from three perspectives: effectiveness, efficiency, and imperceptibility.

5.1 Common Metrics

• FNR and FPR. In classification or clustering tasks, there is a category of metrics based on False Positives (FP), False Negatives (FN), True Positives (TP), and True Negatives (TN). In attack and defense scenarios, the False Negative Rate (FNR) and False Positive Rate (FPR) are commonly used. Specifically, FNR equals FN over (TP + FN), and FPR equals FP over (FP + TN): FNR is the fraction of actual positives that are missed (the miss rate), while FPR is the fraction of actual negatives that are falsely reported as positive (the false-alarm rate). (A small helper computing these metrics follows this list.)
• Accuracy (Acc). Accuracy is one of the most commonly used evaluation metrics; it measures the quality of results as the percentage of correct predictions over all instances.
• F1-score. The F1 score [2] can be regarded as the harmonic mean of the model's Precision and Recall scores [51], with a maximum value of 1 and a minimum value of 0.
• Area Under Curve (AUC). The AUC is the area under the Receiver Operating Characteristic (ROC) curve [80]. Different models correspond to different ROC curves, and it is difficult to compare them when the curves cross, so a comparison based on the AUC score is more reasonable. In general, the AUC indicates the probability that a predicted positive example ranks ahead of a negative one.
• Average Precision (AP). As the area under the Precision-Recall curve, AP is described as one of the most robust metrics [5]. The Precision-Recall curve depicts the relationship between Precision and Recall. A good model should improve Recall while keeping Precision relatively high, whereas weaker models may lose more Precision in order to improve Recall. Compared with the Precision-Recall curve itself, AP shows the performance of a model more directly.
• Mean Reciprocal Rank (MRR). The MRR is a commonly used metric for ranking models. For a target query, if the first correct item is ranked nth, then the score for that query is 1/n, and if there is no match, the score is 0. The MRR of the model is the average of these scores over all queries.
• Hits@K. By calculating the rank (e.g., via MRR) of all the ground-truth triples, Hits@K is the proportion of correct entities ranked in the top K.
• Modularity. Modularity is an important measure based on assortative mixing [44], usually used to assess the quality of different divisions of a particular network, especially when the community structure is unknown [45].
• Normalized Mutual Information (NMI). The NMI is another commonly used measure of the quality of clustering results, used to analyze the network community structure [21]. NMI indicates the similarity between two partitions in terms of mutual information and information entropy, where a larger value means a higher similarity.
• Adjusted Rand Index (ARI). The ARI is a measure of the similarity between two data clusterings: one given by the clustering process and the other defined by external criteria [53]. The correction establishes a baseline by using the expected similarity of all pairwise comparisons between clusterings specified by a random model. ARI measures the relation between pairs of dataset elements without label information, and it can be combined with conventional performance measures to evaluate classification algorithms. In addition, ARI can also use label information for feature selection [53].
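As a small, self-contained illustration of the confusion-matrix based metrics above (our own helper, not from any surveyed paper):

```python
def binary_classification_metrics(y_true, y_pred):
    """Compute FNR, FPR, Accuracy, and F1 from 0/1 labels and predictions."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    fnr = fn / (tp + fn) if tp + fn else 0.0   # miss rate on actual positives
    fpr = fp / (fp + tn) if fp + tn else 0.0   # false-alarm rate on actual negatives
    acc = (tp + tn) / len(y_true)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return {"FNR": fnr, "FPR": fpr, "Acc": acc, "F1": f1}
```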

5.2 Specific Metrics for Attack and Defense

In order to measure performance, a number of specific metrics for attack and defense appear in the literature. Next, we organize these metrics from three perspectives: effectiveness, efficiency, and imperceptibility.

5.2.1 Effectiveness-relevant Metrics. Both attack and defense require metrics to measure the performance of the target model before and after an attack. Therefore, we summarize some metrics for measuring effectiveness and list them below; a compact reference implementation of Eqs. (10)–(17) follows the list.

• Attack Success Rate (ASR). ASR is the ratio of targets that are successfully attacked within a given fixed budget [10]. Accordingly, ASR is formulated as:

\mathrm{ASR} = \frac{\text{Number of successful attacks}}{\text{Number of attacks}} \qquad (10)

• Classification Margin (CM). CM is designed only for the classification task. Under this scenario, attackers aim to perturb the graph so that the target node is misclassified with maximal "distance" (in terms of log-probabilities/logits) from the correct class [85]. Based on this, CM is formulated as:

\mathrm{CM} = \max_{y'_t \neq y_t} Z_{t,y'_t} - Z_{t,y_t} \qquad (11)

where Z_t is the output of the target model with respect to node t, y_t \in C is the correct class label of v_t, and y'_t is a wrong one.

Classi�cation Margin (CM) [87], and the average of WM is dynamically calculated over amini-batch of nodes during training. �e Bs in the following equation denotes batch size.

AWM = 1Bs

i=Bs∑i=1

CMworsti (12)

• Robustness Merit (RM). RM evaluates the robustness of the model as the difference between the post-attack accuracy and the pre-attack accuracy on GNNs [33], where V_attacked denotes the set of nodes theoretically affected by the attack:

\mathrm{RM} = \mathrm{Acc}^{\text{post-attack}}_{V_{\text{attacked}}} - \mathrm{Acc}^{\text{pre-attack}}_{V_{\text{attacked}}} \qquad (13)

• Attack Deterioration (AD). AD evaluates the effect of an attack on the model prediction. Note that any added/dropped edges can only affect the nodes within the spatial scope of the original network [33], due to the spatial scope limitations of GNNs.

\mathrm{AD} = 1 - \frac{\mathrm{Acc}^{\text{post-attack}}_{V_{\text{attacked}}}}{\mathrm{Acc}^{\text{pre-attack}}_{V_{\text{attacked}}}} \qquad (14)

• Average Defense Rate (ADR). ADR is the ratio of the difference between the ASR of attacks on GNNs with and without defense, versus the ASR of attacks on GNNs without defense [11]. The higher the ADR, the better the defense performance.

\mathrm{ADR} = \frac{\mathrm{ASR}^{\text{with-defense}}_{V_{\text{attacked}}}}{\mathrm{ASR}^{\text{without-defense}}_{V_{\text{attacked}}}} - 1 \qquad (15)

• Average Confidence Difference (ACD). ACD is the average confidence difference of the nodes in N_s before and after the attack [11], where N_s is the set of nodes in the test set that are classified correctly before the attack. Note that the Confidence Difference (CD) is equivalent to the Classification Margin (CM).

\mathrm{ACD} = \frac{1}{|N_s|} \sum_{v_i \in N_s} \mathrm{CD}_i(A_{v_i}) - \mathrm{CD}_i(A) \qquad (16)

• Damage Prevention Ratio (DPR). Damage prevention measures the amount of damage that can be prevented by a defense [81]. Let L_0 be the defender's loss without attack, L_A be the defender's loss under a certain attack strategy A, and L_D be the defender's loss with a certain defense strategy D. A better defense strategy leads to a larger DPR.

\mathrm{DPR}^{D}_{A} = \frac{L_A - L_D}{L_A - L_0} \qquad (17)

5.2.2 Efficiency-relevant Metrics. Here we introduce efficiency metrics that measure the cost of attack and defense, for example, quantifying how many perturbations are required to achieve the same effect (a small helper follows this list).

• Average Modified Links (AML). AML is designed for topology attacks and indicates the average perturbation size leading to a successful attack [10]. Assuming attackers have a limited budget for attacking a target model, the modified links (added or removed) are accumulated until the attackers achieve their goal or run out of budget. Accordingly:

\mathrm{AML} = \frac{\text{Number of modified links}}{\text{Number of attacks}} \qquad (18)
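Under the same illustrative conventions as above (our own sketch, not a cited implementation):

```python
def average_modified_links(modified_links_per_attack):
    """Eq. (18): average number of modified (added or removed) links per
    attack, accumulated until success or budget exhaustion."""
    return sum(modified_links_per_attack) / len(modified_links_per_attack)
```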

5.2.3 Imperceptibility-relevant Metrics. Several metrics have been proposed to measure the scale of the manipulations caused by attack and defense, summarized as follows (a sketch of the degree-distribution test follows this list):

• Similarity Score. Generally speaking, a Similarity Score is a measure of the likelihood that a link exists, usually applied in link-level tasks [38, 82]. Specifically, suppose we want to figure out whether a particular link (u, v) exists in a network; the Similarity Score quantifies the topological properties of nodes u and v (e.g., common neighbors, degrees), and a higher score indicates a greater likelihood of a connection between the pair. Usually, the Similarity Score can be measured by the cosine similarity matrix [58].
• Test Statistic Λ. Λ takes advantage of the distribution of node degrees to measure the similarity between graphs. Specifically, Zugner et al. [85] state that the multiset of node degrees d_G in a graph G follows a power-law-like distribution, and they provide a method to approximate its main parameter α_G. Through α_G and d_G, we can calculate G's log-likelihood score l(d_G, α_G). For the pre-attack graph G_pr and the post-attack graph G_po, we obtain (d_{G_pr}, α_{G_pr}) and (d_{G_po}, α_{G_po}), respectively. Similarly, we can define d_{G_q} = d_{G_pr} ∩ d_{G_po} and estimate α_{G_q}. The final test statistic of the two graphs is formulated as:

\Lambda(G_{pr}, G_{po}) = 2 \cdot \big( -l(d_{G_q}, \alpha_{G_q}) + l(d_{G_{po}}, \alpha_{G_{po}}) + l(d_{G_{pr}}, \alpha_{G_{pr}}) \big) \qquad (19)

Finally, when the statistic Λ satisfies a specified constraint, the model considers the perturbations unnoticeable.
• Attack Budget ∆. To ensure unnoticeable perturbations in an attack, previous work [85] measures the perturbations in terms of the budget ∆ and the test statistic Λ w.r.t. the log-likelihood score. More precisely, the changes to the node features and the adjacency matrix are accumulated and limited to a budget ∆ to constrain the perturbations. However, this is not suitable for dealing with complicated situations [85].
• Attack Effect. Generally, the attack effect evaluates the impact of an attack in the community detection task. Given a confusion matrix J, where each element J_{i,j} represents the number of nodes shared between an original community and a perturbed one, the attack effect is defined as the accumulation of normalized entropy [9]. If all the communities stay exactly the same after the attack, the attack effect equals 0.
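As a rough sketch of the degree-distribution test above, the following estimates the power-law exponent via a continuous maximum-likelihood approximation and evaluates Eq. (19). This is our own simplification of the procedure described in [85]: the default d_min = 1 and the treatment of the joint sample d_{G_q} (here, simply the concatenation of both degree lists) are assumptions.

```python
import math

def alpha_mle(degrees, d_min=1.0):
    """Continuous-approximation MLE of the power-law exponent, using only
    degrees >= d_min (assumes at least one degree strictly above d_min)."""
    sample = [d for d in degrees if d >= d_min]
    return 1.0 + len(sample) / sum(math.log(d / d_min) for d in sample)

def log_likelihood(degrees, alpha, d_min=1.0):
    """Log-likelihood of degrees >= d_min under a continuous power law."""
    sample = [d for d in degrees if d >= d_min]
    n = len(sample)
    return (n * math.log(alpha - 1.0) + n * (alpha - 1.0) * math.log(d_min)
            - alpha * sum(math.log(d) for d in sample))

def degree_test_statistic(deg_pre, deg_post, d_min=1.0):
    """Eq. (19); the joint sample is approximated by concatenation."""
    comb = list(deg_pre) + list(deg_post)
    return 2.0 * (-log_likelihood(comb, alpha_mle(comb, d_min), d_min)
                  + log_likelihood(deg_post, alpha_mle(deg_post, d_min), d_min)
                  + log_likelihood(deg_pre, alpha_mle(deg_pre, d_min), d_min))
```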

6 OPEN PROBLEMS

Observing the existing research, graph adversarial learning still has many problems worth studying. In this section, we introduce several major research issues and discuss potential solutions. Generally, we discuss the open problems in three parts: attack, defense, and evaluation metrics.

6.1 Attack

We mainly approach the problems from three perspectives: attacks' side effects, performance, and prerequisites. First, we hope the inevitable disturbances introduced by an attack can be stealthy and unnoticeable. Second, now that data is huge, attacks need to be efficient and scalable. Finally, the premise of an attack should not be too ideal; that is, it should be practical and feasible. Based on the limitations of previous works detailed in Section 3.3.2 and the discussion above, we raise several open problems in the following:

• Unnoticeable Attack. Adversaries want to keep their attacks unnoticeable to avoid detection. To this end, Zugner et al. [86] argue that the main properties of the graph should be only marginally changed after an attack, e.g., attackers are restricted to maintaining the degree distribution while attacking a specified graph, which is evaluated via a test statistic. Nevertheless, most existing works ignore such constraints despite achieving outstanding performance. To be more concealed, we believe that attackers should focus more on the impact of their perturbations on a target graph, with more constraints placed on the attacks.
• Efficient and Effective Algorithms. As seen from Table 2, the most frequently used benchmark datasets are rather small-scale graphs, e.g., Cora [42] and Citeseer [54]. Currently, the majority of proposed methods fail to attack a large-scale graph, because they need to store the entire information of the graph to compute the surrogate gradients or loss even for small changes, leading to expensive computation and memory consumption. To address this problem, Zugner et al. [86] derive an incremental update for the whole candidate set (potential perturbation edges) with a linear variant of GCN, which avoids redundant computation while maintaining effective attack performance. Unfortunately, the proposed method is not yet suitable for larger-scale graphs.


Given this inability to conduct attacks on larger-scale graphs, more efficient and effective attack methods should be studied to address this practical problem. For instance, given a target node, message propagation only exists within its k-hop neighborhood.4 In other words, we believe that adversaries can explore heuristic methods to attack an entire (large) graph via its representative (small) subgraph, naturally reducing the complexity in time and space (see the k-hop extraction sketch after this list).
• Constraints on Attackers' Knowledge. To conduct a realistic and practical black-box attack, attackers should not only be blind to the model knowledge but also be restricted to minimal data knowledge. This is a challenging setting from the perspective of attackers. Chen et al. [17] explore this setting by randomly sampling subgraphs of different sizes, showing that the attacks fail with a small network size and that performance increases as it gets larger. However, few works are aware of the constraints on attackers' knowledge; instead, they assume that perfect knowledge of the input data is available and naturally perform well in attacking a fully "exposed" graph. Therefore, several directions could be explored under such strict constraints, e.g., attackers could train a substitute model on a certain part of the graph, learn general patterns for conducting attacks on graph data, and thus transfer to other models and the entire graph with less prior knowledge.
• Real-world Attack. Attacks cannot always rest on ideal assumptions, i.e., studying attacks on simple and static graphs. Such ideal, undistorted input is unrealistic in real cases. In other words, how can we improve existing models so that they can work in complex real-world environments?
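To illustrate the locality argument above, the following hypothetical helper extracts the k-hop neighborhood that a subgraph-based attack would operate on; the adjacency-list input format is an assumption.

```python
from collections import deque

def k_hop_subgraph(adj_list, target, k=2):
    """Collect all nodes within k hops of `target` via BFS; `adj_list`
    maps a node id to an iterable of neighbor ids."""
    seen = {target}
    frontier = deque([(target, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue
        for nb in adj_list[node]:
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, depth + 1))
    return seen
```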

6.2 Defense

Besides the open problems of attack mentioned above, there are still many interesting problems in defense on graphs that deserve more attention. Here we list some open problems worth discussing:

• Defense on Various Tasks. From Table 3 we can see that most existing works on defense [33, 67, 72, 75, 87] focus on the node classification task, achieving promising performance. Beyond node classification, there are various other important tasks in the graph domain that deserve more attention. However, only a few works [50, 81] investigate and improve model robustness for link prediction, and only a few [28, 29, 31] make an effort to defend other graph-related tasks (e.g., malware detection, anomaly detection, recommendation). Therefore, it is valuable to think about how to improve model robustness on various tasks or how to transfer current defense methods to other tasks.
• Efficient Defense. Training cost is a critically important factor in industrial settings. Therefore, how to guarantee acceptable cost while improving robustness is worth studying. However, currently proposed defense methods rarely discuss the space-time efficiency of their algorithms. Wang et al. [67] design a bottleneck mapping to reduce the dimension of the input for better efficiency, but their experiments lack a consideration of large-scale graphs. Zugner et al. [87] test the number of convergence iterations of their certification method under different numbers of nodes, but the datasets they use are still small. Wang et al. [69] do not restrict the regularized adjacency matrix to be discrete during the training process, which improves training efficiency.

4 Here k depends on the specific method of message propagation and is usually a small value, e.g., k = 2 for a two-layer GCN.


• Certification of Robustness. The robustness of graph models is the central issue in all the existing works on attack and defense mentioned above. However, most researchers focus only on improving model (e.g., GCN) robustness and try to demonstrate it via the performance of their models. Experimental results can indeed demonstrate robustness to a certain extent. However, considering that model performance is easily influenced by hyperparameters, implementation details, random initialization, and even hardware constraints (e.g., CPU, GPU, memory), it is difficult to guarantee that promising and stable performance can be reproduced in different scenarios (e.g., datasets, tasks, parameter settings). A novel and cheaper way to prove robustness is via certification, which certifies a node's absolute robustness under arbitrary perturbations; currently only a few works have paid attention to certifying the robustness of GNNs [3, 87], and this remains a valuable research direction.

6.3 Evaluation Metric

As seen in Section 5, there are many evaluation metrics in the graph adversarial learning field; however, the number of effectiveness-relevant metrics is far larger than that of the other two categories, which reflects an unbalanced research emphasis on model performance. Therefore, we propose several potential directions for further study:

• Measurement of Cost. There are not many metrics for exploring a model's efficiency, which reflects the lack of research attention on it. The known methods roughly measure the cost via the number of modified edges, which raises the question: is there another perspective from which to quantify the cost of attack and defense more precisely? In the real world, the cost of adding an edge is rather different from that of removing one,5 so the costs of different edge modifications are unbalanced, calling for more reasonable evaluation metrics.
• Measurement of Imperceptibility. Unlike image data, where perturbations are bounded in an ℓp-norm [26, 40, 57] that can serve as an evaluation metric for attack impact, such a measure is ill-defined on graph data. Therefore, how can we better measure the effect of perturbations and make attacks more unnoticeable?
• Appropriate Metric Selection. With so many metrics proposed to evaluate the performance of attack and defense algorithms, a problem arises: how should the evaluation metrics be chosen in different scenarios?

7 CONCLUSION

In this survey, we conduct a comprehensive review of graph adversarial learning, covering attacks, defenses, and the corresponding evaluation metrics. To the best of our knowledge, this is the first work to systematically summarize the extensive literature in this field. Specifically, we present the recent developments of this area and introduce the arms race between attackers and defenders. Besides, we provide a reasonable taxonomy for both of them and give a unified problem formulation, which makes the field clear and understandable. Moreover, under the scenario of graph adversarial learning, we summarize and discuss the major contributions and limitations of current attack and defense methods, along with the open problems of this area that are worth exploring. Our work also covers most of the relevant evaluation metrics in the graph adversarial learning field, aiming to provide a better understanding of these methods. In the future, we will measure the performance of proposed models with relevant metrics based on extensive empirical studies.

5 Usually, removing an edge is more expensive than adding one.


Hopefully, our work will serve as a reference and give researchers a comprehensive and systematic understanding of the fundamental issues, thus becoming a good starting point for studying this field.

ACKNOWLEDGMENTS

The paper was supported by the National Natural Science Foundation of China (61702568, U1711267), the Key-Area Research and Development Program of Guangdong Province (2020B010165003), the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2017ZT07X355), and the Fundamental Research Funds for the Central Universities under Grant (17lgpy117).

REFERENCES
[1] Battista Biggio and Fabio Roli. 2018. Wild patterns: Ten years after the rise of adversarial machine learning. Pattern Recognition 84 (2018), 317–331.
[2] Aleksandar Bojchevski and Stephan Gunnemann. 2019. Adversarial Attacks on Node Embeddings via Graph Poisoning. In International Conference on Machine Learning. 695–704.
[3] Aleksandar Bojchevski and Stephan Gunnemann. 2019. Certifiable Robustness to Graph Perturbations. In Advances in Neural Information Processing Systems. 8317–8328.
[4] Avishek Joey Bose, Andre Cianflone, and William Hamilton. 2019. Generalizable Adversarial Attacks Using Generative Models. arXiv preprint arXiv:1905.10864 (2019).
[5] Kendrick Boyd, Kevin H Eng, and C David Page. 2013. Area under the precision-recall curve: point estimates and confidence intervals. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 451–466.
[6] Heng Chang, Yu Rong, Tingyang Xu, Wenbing Huang, Honglei Zhang, Peng Cui, Wenwu Zhu, and Junzhou Huang. 2019. A Restricted Black-box Adversarial Framework Towards Attacking Graph Embedding Models. arXiv:cs.SI/1908.01297
[7] Gary Chartrand. 1977. Introductory graph theory. Courier Corporation.
[8] Jinyin Chen, Lihong Chen, Yixian Chen, Minghao Zhao, Shanqing Yu, Qi Xuan, and Xiaoniu Yang. 2019. GA-Based Q-Attack on Community Detection. IEEE Transactions on Computational Social Systems 6, 3 (2019), 491–503.
[9] Jinyin Chen, Yixian Chen, Lihong Chen, Minghao Zhao, and Qi Xuan. 2019. Multiscale Evolutionary Perturbation Attack on Community Detection. arXiv preprint arXiv:1910.09741 (2019).
[10] Jinyin Chen, Ziqiang Shi, Yangyang Wu, Xuanheng Xu, and Haibin Zheng. 2018. Link prediction adversarial attack. arXiv preprint arXiv:1810.01110 (2018).
[11] Jinyin Chen, Yangyang Wu, Xiang Lin, and Qi Xuan. 2019. Can Adversarial Network Attack be Defended? arXiv preprint arXiv:1903.05994 (2019).
[12] Jinyin Chen, Yangyang Wu, Xuanheng Xu, Yixian Chen, Haibin Zheng, and Qi Xuan. 2018. Fast gradient attack on network embedding. arXiv preprint arXiv:1809.02797 (2018).
[13] Jinyin Chen, Jian Zhang, Zhi Chen, Min Du, Feifei Li, and Qi Xuan. 2019. Time-aware Gradient Attack on Dynamic Network Link Prediction. arXiv preprint arXiv:1911.10561 (2019).
[14] Liang Chen, Yang Liu, Xiangnan He, Lianli Gao, and Zibin Zheng. 2019. Matching User with Item Set: Collaborative Bundle Recommendation with Deep Attention Network. In IJCAI. 2095–2101.
[15] Liang Chen, Yang Liu, Zibin Zheng, and Philip Yu. 2018. Heterogeneous Neural Attentive Factorization Machine for Rating Prediction. In CIKM. ACM, 833–842.
[16] Liang Chen, Yangjun Xu, Fenfang Xie, Min Huang, and Zibin Zheng. 2019. Data Poisoning Attacks on Neighborhood-based Recommender Systems. arXiv:cs.IR/1912.04109
[17] Yizheng Chen, Yacin Nadji, Athanasios Kountouras, Fabian Monrose, Roberto Perdisci, Manos Antonakakis, and Nikolaos Vasiloglou. 2017. Practical attacks against graph-based clustering. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. ACM, 1125–1142.
[18] Konstantina Christakopoulou and Arindam Banerjee. 2018. Adversarial recommendation: Attack of the learned fake users. arXiv preprint arXiv:1809.08336 (2018).
[19] Ronan Collobert, Jason Weston, Leon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa. 2011. Natural language processing (almost) from scratch. Journal of Machine Learning Research 12, Aug (2011), 2493–2537.
[20] Hanjun Dai, Hui Li, Tian Tian, Xin Huang, Lin Wang, Jun Zhu, and Le Song. 2018. Adversarial attack on graph structured data. arXiv preprint arXiv:1806.02371 (2018).
[21] Leon Danon, Albert Diaz-Guilera, Jordi Duch, and Alex Arenas. 2005. Comparing community structure identification. Journal of Statistical Mechanics: Theory and Experiment 2005, 09 (2005), P09008.
[22] Zhijie Deng, Yinpeng Dong, and Jun Zhu. 2019. Batch Virtual Adversarial Training for Graph Convolutional Networks. arXiv preprint arXiv:1902.09192 (2019).
[23] Kaize Ding, Jundong Li, Rohit Bhanushali, and Huan Liu. 2019. Deep Anomaly Detection on Attributed Networks. In Proceedings of the 2019 SIAM International Conference on Data Mining. SIAM, 594–602.
[24] Fuli Feng, Xiangnan He, Jie Tang, and Tat-Seng Chua. 2019. Graph Adversarial Training: Dynamically Regularizing Based on Graph Structure. arXiv preprint arXiv:1902.08226 (2019).
[25] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative Adversarial Networks. arXiv:stat.ML/1406.2661
[26] Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. 2014. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014).
[27] Palash Goyal and Emilio Ferrara. 2018. Graph embedding techniques, applications, and performance: A survey. Knowledge-Based Systems 151 (2018), 78–94.
[28] Xiangnan He, Zhankui He, Xiaoyu Du, and Tat-Seng Chua. 2018. Adversarial personalized ranking for recommendation. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval. ACM, 355–364.
[29] Shifu Hou, Yujie Fan, Yiming Zhang, Yanfang Ye, Jingwei Lei, Wenqiang Wan, Jiabin Wang, Qi Xiong, and Fudong Shao. 2019. αCyber: Enhancing Robustness of Android Malware Detection System against Adversarial Attacks on Heterogeneous Graph based Model. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management. ACM, 609–618.
[30] Rana Hussein, Dingqi Yang, and Philippe Cudre-Mauroux. 2018. Are Meta-Paths Necessary?: Revisiting Heterogeneous Graph Embeddings. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management. ACM, 437–446.
[31] Vassilis N Ioannidis, Dimitris Berberidis, and Georgios B Giannakis. 2019. GraphSAC: Detecting anomalies in large-scale graphs. arXiv preprint arXiv:1910.09589 (2019).
[32] Vassilis N Ioannidis and Georgios B Giannakis. 2019. Edge Dithering for Robust Adaptive Graph Convolutional Networks. arXiv preprint arXiv:1910.09590 (2019).
[33] Ming Jin, Heng Chang, Wenwu Zhu, and Somayeh Sojoudi. 2019. Power up! Robust Graph Convolutional Network against Evasion Attacks based on Graph Powering. arXiv preprint arXiv:1905.10029 (2019).
[34] Thomas N. Kipf and Max Welling. 2017. Semi-Supervised Classification with Graph Convolutional Networks. In International Conference on Learning Representations (ICLR).
[35] Balaji Krishnamurthy and Mausoom Sarkar. 2018. Deep-learning network architecture for object detection. US Patent 10,152,655.
[36] Solomon Kullback and Richard A Leibler. 1951. On information and sufficiency. The Annals of Mathematical Statistics 22, 1 (1951), 79–86.
[37] Ziqi Liu, Chaochao Chen, Xinxing Yang, Jun Zhou, Xiaolong Li, and Le Song. 2018. Heterogeneous graph neural networks for malicious account detection. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management. ACM, 2077–2085.
[38] Linyuan Lu and Tao Zhou. 2011. Link prediction in complex networks: A survey. Physica A: Statistical Mechanics and its Applications 390, 6 (2011), 1150–1170.
[39] Yao Ma, Suhang Wang, Lingfei Wu, and Jiliang Tang. 2019. Attacking Graph Convolutional Networks via Rewiring. arXiv preprint arXiv:1906.03750 (2019).
[40] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. 2017. Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083 (2017).
[41] Franco Manessi, Alessandro Rozza, and Mario Manzo. 2020. Dynamic graph convolutional networks. Pattern Recognition 97 (2020), 107000.
[42] Andrew Kachites McCallum, Kamal Nigam, Jason Rennie, and Kristie Seymore. 2000. Automating the construction of internet portals with machine learning. Information Retrieval 3, 2 (2000), 127–163.
[43] Benjamin A Miller, Mustafa Camurcu, Alexander J Gomez, Kevin Chan, and Tina Eliassi-Rad. 2019. Improving Robustness to Attacks Against Vertex Classification. (2019).
[44] Mark EJ Newman. 2003. Mixing patterns in networks. Physical Review E 67, 2 (2003), 026126.
[45] Mark EJ Newman and Michelle Girvan. 2004. Finding and evaluating community structure in networks. Physical Review E 69, 2 (2004), 026113.
[46] Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z Berkay Celik, and Ananthram Swami. 2017. Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security. ACM, 506–519.
[47] Omkar M Parkhi, Andrea Vedaldi, Andrew Zisserman, et al. 2015. Deep face recognition. In BMVC, Vol. 1. 6.
[48] Yulong Pei, Nilanjan Chakraborty, and Katia Sycara. 2015. Nonnegative matrix tri-factorization with graph regularization for community detection in social networks. In Twenty-Fourth International Joint Conference on Artificial Intelligence.
[49] Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. 2014. Deepwalk: Online learning of social representations. In KDD. ACM, 701–710.
[50] Pouya Pezeshkpour, Yifan Tian, and Sameer Singh. 2019. Investigating Robustness and Interpretability of Link Prediction via Adversarial Modifications. In Proceedings of NAACL-HLT. 3336–3347.
[51] David Martin Powers. 2011. Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation. (2011).
[52] Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. 2009. BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence. AUAI Press, 452–461.
[53] Jorge M Santos and Mark Embrechts. 2009. On the use of the adjusted rand index as a metric for evaluating supervised classification. In International Conference on Artificial Neural Networks. Springer, 175–184.
[54] Prithviraj Sen, Galileo Namata, Mustafa Bilgic, Lise Getoor, Brian Galligher, and Tina Eliassi-Rad. 2008. Collective classification in network data. AI Magazine 29, 3 (2008), 93–93.
[55] Yash Sharma. 2018. Gradient-based Adversarial Attacks to Deep Neural Networks in Limited Access Settings. Ph.D. Dissertation. Cooper Union.
[56] Ke Sun, Zhouchen Lin, Hantao Guo, and Zhanxing Zhu. 2019. Virtual adversarial training on graph convolutional networks in node classification. In Chinese Conference on Pattern Recognition and Computer Vision (PRCV). Springer, 431–443.
[57] Lichao Sun, Ji Wang, Philip S Yu, and Bo Li. 2018. Adversarial attack and defense on graph data: A survey. arXiv preprint arXiv:1812.10528 (2018).
[58] Mingjie Sun, Jian Tang, Huichen Li, Bo Li, Chaowei Xiao, Yao Chen, and Dawn Song. 2018. Data poisoning attack against unsupervised node embedding methods. arXiv preprint arXiv:1810.12881 (2018).
[59] Yizhou Sun, Jiawei Han, Xifeng Yan, Philip S Yu, and Tianyi Wu. 2011. Pathsim: Meta path-based top-k similarity search in heterogeneous information networks. Proceedings of the VLDB Endowment 4, 11 (2011), 992–1003.
[60] Richard S Sutton and Andrew G Barto. 2018. Reinforcement learning: An introduction. MIT Press.
[61] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. 2013. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199 (2013).
[62] Xianfeng Tang, Yandong Li, Yiwei Sun, Huaxiu Yao, Prasenjit Mitra, and Suhang Wang. 2019. Robust graph neural network against poisoning attacks via transfer learning. arXiv preprint arXiv:1908.07558 (2019).
[63] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems. 5998–6008.
[64] Heinrich Von Stackelberg. 2010. Market structure and equilibrium. Springer Science & Business Media.
[65] Binghui Wang and Neil Zhenqiang Gong. 2019. Attacking Graph-based Classification via Manipulating the Graph Structure. arXiv preprint arXiv:1903.00553 (2019).
[66] Jizhe Wang, Pipei Huang, Huan Zhao, Zhibo Zhang, Binqiang Zhao, and Dik Lun Lee. 2018. Billion-scale commodity embedding for e-commerce recommendation in Alibaba. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 839–848.
[67] Shen Wang, Zhengzhang Chen, Jingchao Ni, Xiao Yu, Zhichun Li, Haifeng Chen, and Philip S Yu. 2019. Adversarial Defense Framework for Graph Neural Network. arXiv preprint arXiv:1905.03679 (2019).
[68] Xiaoyun Wang, Joe Eaton, Cho-Jui Hsieh, and Felix Wu. 2018. Attack graph convolutional networks by adding fake nodes. arXiv preprint arXiv:1810.10751 (2018).
[69] Xiaoyun Wang, Xuanqing Liu, and Cho-Jui Hsieh. 2019. GraphDefense: Towards Robust Graph Convolutional Networks. arXiv preprint arXiv:1911.04429 (2019).
[70] Marcin Waniek, Kai Zhou, Yevgeniy Vorobeychik, Esteban Moro, Tomasz P Michalak, and Talal Rahwan. 2018. Attack Tolerance of Link Prediction Algorithms: How to Hide Your Relations in a Social Network. arXiv preprint arXiv:1809.00152 (2018).
[71] Darrell Whitley. 1994. A genetic algorithm tutorial. Statistics and Computing 4, 2 (1994), 65–85.
[72] Huijun Wu, Chen Wang, Yuriy Tyshetskiy, Andrew Docherty, Kai Lu, and Liming Zhu. 2019. Adversarial examples for graph data: deep insights into attack and defense. In Proceedings of the 28th International Joint Conference on Artificial Intelligence. AAAI Press, 4816–4823.
[73] Fenfang Xie, Liang Chen, Yongjian Ye, Zibin Zheng, and Xiaola Lin. 2018. Factorization Machine Based Service Recommendation on Heterogeneous Information Networks. In ICWS. IEEE, 115–122.
[74] Wayne Xiong, Jasha Droppo, Xuedong Huang, Frank Seide, Mike Seltzer, Andreas Stolcke, Dong Yu, and Geoffrey Zweig. 2016. Achieving human parity in conversational speech recognition. arXiv preprint arXiv:1610.05256 (2016).
[75] Kaidi Xu, Hongge Chen, Sijia Liu, Pin-Yu Chen, Tsui-Wei Weng, Mingyi Hong, and Xue Lin. 2019. Topology Attack and Defense for Graph Neural Networks: An Optimization Perspective. arXiv preprint arXiv:1906.04214 (2019).
[76] Xiaojun Xu, Yue Yu, Bo Li, Le Song, Chengfeng Liu, and Carl Gunter. 2018. Characterizing Malicious Edges targeting on Graph Neural Networks. (2018).
[77] Qi Xuan, Jun Zheng, Lihong Chen, Shanqing Yu, Jinyin Chen, Dan Zhang, and Qingpeng Zhang. 2019. Unsupervised Euclidean Distance Attack on Network Embedding. arXiv preprint arXiv:1905.11015 (2019).
[78] Ke Yang, MingXing Zhang, Kang Chen, Xiaosong Ma, Yang Bai, and Yong Jiang. 2019. KnightKing: a fast distributed graph random walk engine. In Proceedings of the 27th ACM Symposium on Operating Systems Principles. 524–537.
[79] Hengtong Zhang, Tianhang Zheng, Jing Gao, Chenglin Miao, Lu Su, Yaliang Li, and Kui Ren. 2019. Data poisoning attack against knowledge graph embedding. In Proceedings of the 28th International Joint Conference on Artificial Intelligence. AAAI Press, 4853–4859.
[80] Yingxue Zhang, S Khan, and Mark Coates. 2019. Comparing and detecting adversarial attacks for graph deep learning. In Proc. Representation Learning on Graphs and Manifolds Workshop, Int. Conf. Learning Representations, New Orleans, LA, USA.
[81] Kai Zhou, Tomasz P Michalak, and Yevgeniy Vorobeychik. 2019. Adversarial robustness of similarity-based link prediction. arXiv preprint arXiv:1909.01432 (2019).
[82] Kai Zhou, Tomasz P Michalak, Marcin Waniek, Talal Rahwan, and Yevgeniy Vorobeychik. 2018. Attacking Similarity-Based Link Prediction in Social Networks. In Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems. International Foundation for Autonomous Agents and Multiagent Systems, 305–313.
[83] Dingyuan Zhu, Ziwei Zhang, Peng Cui, and Wenwu Zhu. 2019. Robust Graph Convolutional Networks Against Adversarial Attacks. (2019).
[84] Xiaowei Zhu, Wenguang Chen, Weimin Zheng, and Xiaosong Ma. 2016. Gemini: A computation-centric distributed graph processing system. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16). 301–316.
[85] Daniel Zugner, Amir Akbarnejad, and Stephan Gunnemann. 2018. Adversarial attacks on neural networks for graph data. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2847–2856.
[86] Daniel Zugner and Stephan Gunnemann. 2019. Adversarial attacks on graph neural networks via meta learning. arXiv preprint arXiv:1902.08412 (2019).
[87] Daniel Zugner and Stephan Gunnemann. 2019. Certifiable robustness and robust training for graph convolutional networks. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 246–256.
