

Source: sensorweb.engr.uga.edu/wp-content/uploads/2019/04/Dropbox_baso…


Data Integrity Attack Detection in Smart Grid: A Deep Learning Approach

Sunitha Basodi*, Song Tan and Yi Pan

Department of Computer Science,
Georgia State University,
Atlanta, GA, USA
E-mail: [email protected]
E-mail: [email protected]
E-mail: [email protected]
*Corresponding author

WenZhan Song

School of Electrical and Computer Engineering,
University of Georgia,
Atlanta, GA, USA
E-mail: [email protected]

Abstract: Cybersecurity in smart grids plays a crucial role in determining their reliable functioning and availability. This paper mainly addresses data integrity attacks at the physical layer of smart grids. State Vector Estimation (SVE) methods are widely used to detect such attacks, but they fail to identify attacks that comply with the physical properties of the grid, known as unobservable attacks. In this paper, we formulate a distance measure to be employed as the cost function in deep learning models using feed-forward neural network architectures to classify malicious and secured measurements. The efficiency and performance of these models are compared with existing state-of-the-art detection algorithms and supervised machine learning models. Our analysis shows better performance for deep learning models in detecting centralized data attacks.

Keywords: smart grids; bad data detection; state vector estimation; deep learning; IEEE test bus systems; MATPOWER; Keras with TensorFlow.

Reference to this paper should be made as follows: Basodi, S., Tan, S., Song, W. and Pan, Y. (2019) 'Data Integrity Attack Detection in Smart Grid: A Deep Learning Approach', Int. J. Security and Networks, Vol. x, No. x, pp.xxx–xxx.

Biographical notes: Sunitha Basodi is currently a PhD student in the Department of Computer Science, Georgia State University. Her current research is focused on machine learning and deep learning methods, and their applications in the Bioinformatics and Smart Grid domains. She has worked at Amazon Inc., India as a developer and received her Bachelor's degree from Jawaharlal Nehru Technological University, Hyderabad, India.

Dr. Song Tan received his PhD from the Department of Computer Science, Georgia State University. His research is focused on cyber-physical security in smart grid systems, which includes bad data detection, electricity market security, and the design of a cyber-physical security testbed for the smart grid. He holds an MS from Georgia State University and a BS from Northeast Normal University, China.

Dr. WenZhan Song is a Chair Professor of Electrical and Computer Engineering at the University of Georgia. Dr. Song's research focuses on cyber-physical systems informatics and security, and their applications in energy, health, and the environment. Dr. Song has an outstanding record of leading large multidisciplinary research projects on those issues, with multi-million-dollar grant support from NSF, NASA, USGS, and industry.

Dr. Yi Pan is currently a Regents' Professor and Chair of Computer Science at Georgia State University, USA. He has served as an Associate Dean and Chair of the Biology Department during 2013-2017 and Chair of Computer Science during 2006-2013. Dr. Pan joined Georgia State University in 2000, was promoted to full professor in 2004, named a Distinguished University Professor in 2013, and designated a Regents' Professor in 2015.

Int. J. Security and Networks, Vol. x, No. x, 201X


1 Introduction

Cybersecurity plays a vital role in determining the reliability and availability of the entire smart grid infrastructure. Potential intrusions can make the system vulnerable to various types of attacks, each of which may lead to serious outcomes, ranging from the leakage of customer private information to cascades of system failures, as described in Aloul et al. [2012], Metke and Ekl [2010], Wang and Lu [2013]. Liu et al. [2012] provide an overview of the potential attacks and suggest the importance of cybersecurity in smart grids. In the current study, we primarily focus on data integrity attacks, in which the measurements of a set of compromised meters are altered by the attacker. Such attacks can mislead the operational metrics at the control center, causing the operators to make fatal decisions and disrupting the reliability of measured data. State Vector Estimation (SVE) methods, detailed in Abur and Exposito [2004], are generally employed to identify such malicious data attacks. In these methods, the state vector, which corresponds to an optimal power system state, is maintained, and a new state vector is reconstructed from measurements. If the difference between the actual and the reconstructed state vector is above a threshold value τ, then the data is considered to be attacked. Liu et al. [2011] suggest a way of constructing data integrity attack vectors that fall in the column space of the Jacobian matrix (H) of the measurements. In those cases, there exist cooperative attacks on meters, known as unobservable attacks, that cannot be identified using state vector estimation or any other known bad data detection techniques. Many algorithms have been proposed in the literature to detect such attacks, as in Kosut et al. [2010], Giani et al. [2011], Xiao et al. [2013]. Supervised machine learning classification models have also been proposed to identify such attacks (Esmalifalak et al. [2014]; Ozay et al. [2016]). The work in Esmalifalak et al. [2014] uses a series of measurements as time series data, preprocesses the data using principal component analysis (PCA), and then uses support vector machines (SVM) with a Gaussian kernel to identify malicious data attacks. The authors in Ozay et al. [2016] employ a wide array of supervised and semi-supervised machine learning methods on the measurements and analyze the performance of these methods. Kundur et al. [2011] analyze the impact of such physical and cyber attacks in smart grids using a graph-based model.

Identifying attacked measurements from secure measurements using machine learning techniques is an active research area. In this paper, we use a deep learning feed-forward neural network model to learn and classify malicious data from secured data. This is done by geometrically analyzing and visualizing the measurement space, which helps in identifying a distance metric for measurements that can be used in machine learning algorithms as the cost function to optimally discriminate secured data from attacked (malicious) data. We employ deep learning models, in this case feed-forward neural networks, to classify secured vs. malicious data. We perform an exhaustive performance analysis of the classification models for a range of attacked meters on multiple IEEE bus test systems and also compare our results with other physical and machine learning models.

The remainder of this paper is organized as follows. State estimation and centralized attack data generation methods used in this paper are briefly described in Section 2. The problem formulation is presented in Section 3. The deep learning models employed and their characteristics are described in Section 4. Implementation and experimental results and comparisons are covered in Sections 5 and 6, followed by conclusions in Section 7.

2 Preliminaries

2.1 State Estimation

State estimation is the process of identifying the current state of the system from measurements gathered across various meters. Monitoring such information is necessary for the reliability of the system, as the operators at the control center use this information to re-dispatch power, call on operating/contingency reserves, re-balance control areas, and for auditing and settlement (Giani et al. [2011]). In this process, the control center collects real-time measurements z across various meters of the smart grid system and combines the network topology and parameter information to calculate real-time estimates of the unknown system state variables x. Let z = (z_1, z_2, ..., z_m)^T denote the meter measurements generated using bus power and branch power flow measurements, and x = (x_1, x_2, ..., x_n)^T denote the system state variables computed using voltage phases at all buses, where m is the number of meters, n is the number of unknown state variables, and m ≥ n. Let e = (e_1, e_2, ..., e_m)^T denote the measurement errors, modeled as Gaussian noise with zero mean and covariance matrix σ²I. The measurement model for DC power flow is given as follows:

z = Hx + e (1)

where H is an m × n full-rank Jacobian matrix of the measurement model. The matrix H signifies the relationship between the vector z of measurements, consisting of m meter readings, and the state vector x, consisting of n state variables x_1, ..., x_n. From the vector of measurements, the estimated state vector x̂ is computed using

x̂ = (H^T W H)^(-1) H^T W z (2)

where W is a diagonal matrix whose elements are the reciprocals of the variances of the meter errors.
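As a concrete illustration, equation (2) can be sketched in NumPy. The small H, z, and meter variances below are hypothetical toy values, not a real bus system:

```python
import numpy as np

def wls_state_estimate(H, z, meter_vars):
    """Weighted least-squares estimate x_hat = (H^T W H)^-1 H^T W z,
    where W is diagonal with the reciprocals of the meter error variances."""
    W = np.diag(1.0 / np.asarray(meter_vars, dtype=float))
    return np.linalg.solve(H.T @ W @ H, H.T @ W @ z)

# Toy example (not a real bus system): m = 3 meters, n = 2 state variables.
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
x_true = np.array([0.2, -0.1])
z = H @ x_true  # noise-free measurements
x_hat = wls_state_estimate(H, z, meter_vars=[1e-4, 1e-4, 1e-4])
print(np.allclose(x_hat, x_true))  # True in the noise-free case
```

With noisy measurements the estimate would only approximate x; the full-rank assumption on H is what makes H^T W H invertible.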


2.2 Bad Data Detection

Bad measurements may be introduced due to faulty transmission lines, meter failures, or malicious attacks. The estimated state vector of normal measurements is close to the actual state vector, while it deviates from it for malicious measurements. The difference between the observed measurements z and the computed measurements ẑ = Hx̂ is known as the measurement residual, z − ẑ. The L2-norm of this residual, ||z − Hx̂||_2, is generally used to detect bad measurements using a threshold value τ. If the measurement residual exceeds this threshold, i.e., ||z − Hx̂||_2 > τ, the measurements are considered to be compromised; otherwise, they are considered secure measurements.
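A minimal sketch of this residual test, assuming NumPy and toy values for H, x̂, and τ:

```python
import numpy as np

def is_bad_data(H, z, x_hat, tau):
    """Flag measurements as compromised when the L2 norm of the
    residual z - H @ x_hat exceeds the threshold tau."""
    residual = z - H @ x_hat
    return np.linalg.norm(residual, 2) > tau

H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x_hat = np.array([0.2, -0.1])
z_clean = H @ x_hat
z_bad = z_clean + np.array([0.0, 0.5, 0.0])  # one corrupted meter reading

print(is_bad_data(H, z_clean, x_hat, tau=0.1))  # False
print(is_bad_data(H, z_bad, x_hat, tau=0.1))    # True
```

The choice of τ trades false alarms against missed detections; the attack in the next subsection is constructed so that this residual never changes.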

2.3 Unobservable Attacks

Injecting malicious data involves modifying the meter readings so that the actual measurements are augmented with an attack vector a = (a_1, a_2, ..., a_m)^T, which has non-zero elements for the compromised meters. The compromised vector of measurements can then be constructed as:

z_a = z + a (3)

The attacker can be assumed to have access to the matrix H, in addition to some or all of the meters. Let A = {t_1, t_2, ..., t_k}, where t_i ∈ {1, 2, ..., m}, be the set of compromised meters, and Ā be the set of uncompromised meters. When the compromised measurements are coherent with the physical power flow constraints of the system, such attacks are called unobservable attacks. These attacks cannot be detected by state estimation methods or any other bad data detection algorithms. Such an unobservable attack vector a = (a_1, a_2, ..., a_m)^T can be generated using the following condition:

a = Hc (4)

where c is a sparse vector drawn from a Gaussian distribution, and a_i ≠ 0 for i ∈ A and a_i = 0 for i ∉ A. The properties and constraints for generating unobservable data integrity attacks are briefly described in Liu et al. [2011].

The compromised measurements generated after adding the above attack vector are as follows:

z_a = H(x + c) + e (5)

where the injected vector c is the residue error introduced into the measurements, which goes undetected.

Two types of centralized data integrity attacks are possible, namely, random false data injection attacks and targeted false data injection attacks. In the former, the attacker can corrupt measurements from any of the meters to make the attack undetectable in state estimation, whereas in the latter, the attacker has access to only a specific set of meters. In this paper, we use the LASSO optimization methods suggested in Ozay et al. [2013] to generate attack vectors.
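The construction in equation (4) can be sketched as follows. This is a simplified illustration, not the LASSO procedure of Ozay et al. [2013]: for brevity it sparsifies c rather than enforcing a κ-sparse attack vector a, and the Jacobian and measurements are random toy values. It does demonstrate the key property that the L2 residual is unchanged by the attack:

```python
import numpy as np

rng = np.random.default_rng(0)

def unobservable_attack(H, sparsity):
    """Build a = H c with a sparse Gaussian c (equation (4)). Because a
    lies in the column space of H, the residual-based test cannot see it."""
    m, n = H.shape
    c = np.zeros(n)
    idx = rng.choice(n, size=min(sparsity, n), replace=False)
    c[idx] = rng.normal(size=idx.size)
    return H @ c, c

def residual_norm(H, z):
    """L2 norm of z - H x_hat with x_hat re-estimated from z."""
    x_hat, *_ = np.linalg.lstsq(H, z, rcond=None)
    return np.linalg.norm(z - H @ x_hat)

H = rng.normal(size=(6, 3))  # toy full-rank Jacobian, m = 6, n = 3
z = rng.normal(size=6)       # toy measurements
a, c = unobservable_attack(H, sparsity=2)

# The attack shifts the estimated state by c but leaves the residual intact.
print(np.isclose(residual_norm(H, z), residual_norm(H, z + a)))  # True
```

Generating an attack vector a that is itself κ-sparse (touching only κ meters) is the harder problem solved by the LASSO formulation referenced above.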

2.4 Deep learning

Deep learning is a branch of machine learning that uses artificial neural networks, complex layered networks of non-linear processing units called neurons, to capture multiple levels of hidden representations of the data. These extracted representations can be used for feature extraction, classification, or any other related tasks. Artificial neural networks work very well with high-dimensional data, both for classification and regression problems. Deep learning has been successful in a wide variety of areas such as automatic speech recognition, natural language processing, etc. Several architectures exist for deep learning, such as feed-forward neural networks, recurrent neural networks, and LSTMs. A detailed survey of deep learning methods and models, along with their applications, can be found in LeCun et al. [2015], Schmidhuber [2015].

Deep learning models use artificial neural networks, which capture multiple levels of hidden representations of the data to perform the classification task. They can also perform other tasks such as object recognition, speech recognition, series prediction, and so on. An artificial neural network consists of an input layer, an output layer, and a set of hidden layers that connect them using computation units called neurons. Neurons of each layer are connected to the preceding and succeeding layers via weighted links. Each neuron takes inputs from the neurons of its preceding layer, computes a weighted sum of the corresponding inputs, and then applies an activation function to generate the transformed output. Activation functions are non-linear functions that help determine the output states of neurons. This output is passed on as input to the neurons in the next layer. This process is repeated until the values are propagated to the output layer, which predicts the final result. The weights of the connected links are adjusted to minimize the error between expected and predicted outputs. The connections between layers can be uni-directional or bi-directional.
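The weighted-sum-and-activation step described above can be sketched as a single layer in NumPy; the weights, bias, and inputs below are hypothetical:

```python
import numpy as np

def layer_forward(inputs, weights, bias, activation):
    """One feed-forward layer: a weighted sum of the inputs from the
    preceding layer, followed by an element-wise non-linear activation."""
    return activation(weights @ inputs + bias)

relu = lambda v: np.maximum(v, 0.0)  # a common non-linear activation

x = np.array([1.0, -2.0])                # outputs of the previous layer
W = np.array([[0.5, 0.5], [1.0, -1.0]])  # hypothetical link weights
b = np.array([0.0, -1.0])                # hypothetical biases
print(layer_forward(x, W, b, relu))      # [0. 2.]
```

Stacking such layers, with the output of one fed as the input of the next, gives the feed-forward networks used later in the paper.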

3 Problem Formulation

Attacked measurements cannot be identified using state estimation methods (SVE) when the attack vectors are in the column space of H, as in equation (4). Such vectors can always be constructed when the number of attacked meters is at least m − n + 1, as per Theorem 2 of Liu et al. [2011]. These generated attack vectors are closely located around the normal values, as shown in Fig. 1 for the IEEE 57-bus test system, which generally holds for any system. Clearly, the compromised measurements are


spaced very closely to secure measurements, separated by short distances. These inter-measurement distances can be used in machine learning models to separate attacked and unattacked measurements in space and to identify the discriminating boundaries that help classify new measurements efficiently. Since the distances between the measurements are usually small, the measurements are first transformed to logarithmic space, and then the distances are computed.

Formally, let z = (z_1, z_2, ..., z_m)^T denote the actual meter measurements, and a^p = (a^p_1, a^p_2, ..., a^p_m)^T and a^q = (a^q_1, a^q_2, ..., a^q_m)^T be the potential attack vectors generated for z at timestamps t_p and t_q respectively. Let z^p = z + a^p and z^q = z + a^q. Clearly, z^p = (z^p_1, z^p_2, ..., z^p_m)^T, where z^p_i = z_i + a^p_i. Let M^p_i and M^q_i represent meter i of the measurement vectors z^p and z^q respectively. As the meter measurements can be attacked independently at t_p and t_q, either none, one, or both of the M^p_i and M^q_i values can be attacked. The distance measure computation problem can be formulated as below:

d(z^p_i, z^q_i) =
    ||log(|a^p_i|) − log(|a^q_i|)||_2,  if M^p_i, M^q_i ∈ A
    ||log(|a^p_i|)||_2,                 if M^p_i ∈ A, M^q_i ∉ A
    0,                                  if M^p_i, M^q_i ∉ A

where d(·, ·) is the distance measure between the two points. Since the meters of the measurement vectors can be attacked independently, M^p_i and/or M^q_i can be attacked; the above formulation handles all these cases. Whenever there is an attack on the meter readings, the distance between the estimated and the actual measurements is non-zero, irrespective of the type of the attack, which helps in identifying the attack. This distance measure can be used in any machine learning algorithm as the loss or optimization function to train models that effectively separate secure and attacked measurements in vector space and detect attacks on new measurements. We use the above formulation in our deep learning model as the loss function and optimize it to build a robust model for classifying normal vs. malicious measurements.
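A per-meter sketch of this distance measure, assuming scalar attack components. The symmetric case (only M^q_i attacked) is handled analogously to the second branch, which the piecewise definition leaves implicit:

```python
import numpy as np

def meter_distance(a_p_i, a_q_i, p_attacked, q_attacked):
    """Distance for meter i between timestamps t_p and t_q, following
    the piecewise formulation above (attack magnitudes in log space)."""
    if p_attacked and q_attacked:
        return abs(np.log(abs(a_p_i)) - np.log(abs(a_q_i)))
    if p_attacked or q_attacked:          # exactly one timestamp attacked
        a = a_p_i if p_attacked else a_q_i
        return abs(np.log(abs(a)))
    return 0.0                            # neither reading attacked

print(meter_distance(0.01, 0.1, True, True))   # |log 0.01 - log 0.1| = log 10
print(meter_distance(0.5, 0.0, True, False))   # |log 0.5|
print(meter_distance(0.0, 0.0, False, False))  # 0.0
```

The log transform expands the small gaps between attacked and secure measurements noted above, which is what makes this usable as a loss function.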

Given S = {s_i}_{i=1}^m, the set of measurements from all the meters, and the corresponding label set Y = {y_i}_{i=1}^m, where s_i ∈ R and y_i ∈ {0, 1}, the value of y_i corresponding to the measurement s_i is defined as follows:

y_i =
    1, if the measurement s_i is compromised
    0, otherwise

The input data (s_i, y_i) ∈ S × Y is assumed to be independent and identically distributed according to a joint distribution P (Bousquet et al. [2004]). The sampling from P is done independently and identically, as per the assumptions in Ozay et al. [2016]. The aim is to determine a learning model defined as a function mapping f : S → Y, which determines the relationships between S and Y using the distance measure d(·, ·) and separates the attacked and secure measurements.

4 Deep Learning For Attack Detection

In this section, we briefly describe the design of the software and the architecture of the deep learning model employed to build a classification model that identifies secure data and attacked data.

4.1 Workflow Design

The design of our tool is shown in Fig. 3. The Data Generator module is responsible for generating the metered data of the smart grids. Any software that simulates power grid models to generate meter readings (or measurements) at different time intervals, or actual meter readings available from power grid websites like PJM [2017], can be used in the Data Generator. Measurements of all the meters for each timestamp are stored in a matrix. This matrix data is passed to the Attack Simulator, which has two modules: the Attack Identifier and the Attack Generator. The Attack Identifier determines the timestamps for the attack by selecting them randomly. Once a timestamp is considered for the attack, all the meter readings corresponding to that timestamp are passed to the Attack Generator. An attack vector is generated for the given measurement vector, and the attacked measurements along with the attacked meters are passed to the Data Persistor. Here, the module looks at the attacked meters passed from the Attack Generator. If any meters are passed as attacked, it labels the corresponding measurements with label 1; otherwise, the measurements are labeled 0. This measurement data along with the labels is stored and also passed to the Attack Predictor. This module reads the labeled measurements and builds a classification model to predict whether unseen measurements are secured or attacked. In our work, we use deep learning models to perform this classification. The Attack Predictor module can also be executed independently using the data stored by the Data Persistor.

4.2 Deep Learning Model Architecture

There are many neural network architectures, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and LSTM networks, which have excelled in image recognition, object identification, natural language processing, and next-sequence prediction. However, our data consists of measurements, and therefore a simple feed-forward network architecture is employed. Feed-forward networks are uni-directional networks in which the connections from the input layer to the output layer are all in the forward direction, without any cycles. The information flows from the input layer through the hidden layers, and finally to the output layer, without any looping back of the information (see Fig. 2). In our classification model, the neurons in the input layer accept the actual meter measurements, transform them using activation


functions, and pass them on to the successive hidden layers, which in turn transform the data and finally pass it to the output layer, which outputs a binary value, 0 or 1, classifying the data as secure or attacked respectively.

The feed-forward architecture of our neural network is shown in Fig. 4. The input layer is fully connected to the subsequent layer and accepts the input data. The output layer is fully connected to its preceding layer and gives us the classification result. Each of the hidden layers is fully connected to its subsequent layer; however, dropout (Srivastava et al. [2014]) is used to drop a few neurons and network connections in each hidden layer to prevent the network from overfitting. Batch normalization is also performed while training the model. The number of neurons in each layer is controlled by a parameter K, and the dropout factor for each layer is determined by the parameter p.
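A sketch of this architecture in Keras (the functional API is used here). The layer width K, dropout factor p, batch normalization, sigmoid output, and 'elu' hidden activations follow the descriptions in this paper; the optimizer and loss shown are placeholders, since the paper trains with its custom distance-based loss rather than binary cross-entropy:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_classifier(input_dim, hidden_layers=2, K=100, p=0.25):
    """Feed-forward classifier per Fig. 4: hidden layers of K neurons
    with batch normalization and dropout (rate p), and a single
    sigmoid output unit (0 = secure, 1 = attacked)."""
    inputs = keras.Input(shape=(input_dim,))
    h = inputs
    for _ in range(hidden_layers):
        h = layers.Dense(K, activation="elu")(h)  # 'elu' as in Section 6
        h = layers.BatchNormalization()(h)
        h = layers.Dropout(p)(h)
    outputs = layers.Dense(1, activation="sigmoid")(h)
    model = keras.Model(inputs, outputs)
    # Placeholder loss/optimizer: the paper substitutes its distance
    # formulation from Section 3 as the loss function.
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_classifier(input_dim=57)  # e.g. one input per meter
print(model.output_shape)               # (None, 1)
```

A fitted model would then be trained with `model.fit(S, Y, ...)` on the labeled measurements produced by the Data Persistor.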

5 Implementation

The Data Generator module is implemented in Matlab, and the simulated measurements for different power grid bus architectures are generated using the MATPOWER toolbox (Zimmerman et al. [2011]). We use MATPOWER to simulate the IEEE 9-bus, 14-bus, 30-bus, and 57-bus test systems. The Attack Identifier randomly determines whether the measurements obtained from the Data Generator should be attacked. If it determines that the measurements should not be attacked, the measurements are sent to the Data Persistor module and labeled 0. Otherwise, the attack vector for the given measurements is generated using false data injection attacks. The Attack Generator module is implemented using Algorithm 1 of Ozay et al. [2013]. This module generates attack vectors of different sparseness based on the ratio κ/m ∈ [0, 1]. In our implementation, we generate 22 ratio values equally spaced in (0, 1). The ratio value multiplied by the number of meters for the test system determines the number of meters to be attacked. This value is added as one of the stopping constraints of Algorithm 1 of Ozay et al. [2013] so that it corresponds to the sparseness of the attack vector. These generated unobservable attack vectors are added to the original measurements to get the corrupted or attacked measurements. These attacked measurements are passed to the Data Persistor module and stored with label 1. The Data Persistor module stores these measurements along with the corresponding labels in the file system as CSV files. These CSV files, along with their labels, are used by the Attack Predictor module to perform classification. Our deep learning models are implemented using Keras (Chollet [2015]), a high-level API for neural networks that works on top of TensorFlow (Abadi et al. [2015]) and Theano. This API

Table 1 Calculation of performance measures

                          Attacked   Secure
  Predicted as Attacked   tp         fp
  Predicted as Secure     fn         tn

provides a simple interface to implement complex neural network models. Our code implements the architecture shown in Fig. 4 to train a feed-forward neural network with one input layer, one output layer, and multiple hidden layers. Weight initialization methods, activation functions, loss functions, optimizer methods, and the metrics to be optimized can be passed as method parameters. In our code, we train the model to achieve the best accuracy metric using our distance formulation as the loss function. The weights in the network are adjusted so as to achieve the best possible accuracy.

5.1 Sampling and Metrics

Stratified 10-fold cross-validation is used for training and testing the model. Accuracy, precision, and recall are the metrics used to study the performance of the model. Accuracy (Acc) is defined as the proportion of test cases correctly classified as either attacked or secure. Precision (Prec) measures the fraction of the test cases classified as attacked that have truly been attacked. Recall (Rec) measures the fraction of the attacked test cases that are correctly predicted as attacked. To calculate these metrics, we measure true positives (tp), true negatives (tn), false positives (fp), and false negatives (fn), as shown in Table 1.

Acc = (tp + tn) / (tp + tn + fp + fn),  Prec = tp / (tp + fp),  Rec = tp / (tp + fn)
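These three metrics follow directly from the confusion-matrix counts of Table 1; the counts below are hypothetical, for illustration only:

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision and recall from the confusion-matrix
    counts of Table 1."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    return acc, prec, rec

# Hypothetical counts from one cross-validation fold.
acc, prec, rec = classification_metrics(tp=80, tn=90, fp=10, fn=20)
print(acc, prec, rec)  # 0.85 0.888... 0.8
```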

In our models, the input data (s_i, y_i) ∈ S × Y is sampled and used for training feed-forward neural networks. Each measurement s_i is fed as input to the model, and the output layer with a single neuron results in either 0 or 1, representing whether the data is secure or attacked respectively. The problem formulation defined in Section 3 is employed in the cost function, and the model is trained so as to optimize the cost function and reduce the prediction errors.

6 Experimental Results

The classification models are analyzed for the IEEE 9-bus, 14-bus, 30-bus, and 57-bus test systems in our experiments. The MATPOWER toolbox (Zimmerman et al. [2011]) is used to compute the measurement Jacobian matrices H. The operating state of the system x is obtained from MATPOWER, and then the measurements z are computed. For each of the measurement vectors, we independently determine whether the measurements need to be attacked. Following this, the corresponding attacked measurements are generated using random and targeted false injection attacks (Liu et al.


[2011], Ozay et al. [2013]). Data is labeled '1' if the measurements are attacked, or '0' otherwise. In the experiments, we assume that the attacker has access to κ meters, and these meters are randomly chosen to generate a κ-sparse attack vector. These attack vectors have non-zero values with the same mean and variance as z, and follow a Gaussian distribution. We evaluate the behavior of the models in classifying both observable and unobservable attacks for different values of κ/m ∈ [0, 1]. Specifically, we analyze the performance of the models for κ ≥ m − n + 1, where unobservable attacks are generated and the attack vectors are not detectable by SVE (Liu et al. [2011]). Otherwise, the generated attacks are observable.

Feed-forward neural networks are implemented using the Keras API (Chollet [2015]) in Python with TensorFlow as the backend library. There are multiple parameters to learn in the deep learning models, such as the number of hidden layers, the number of neurons in each of these layers, the type of activation functions for neurons, the dropout factor, and the loss functions. In our models, we integrate the distance metric from the problem formulation as the loss function, and use Exponential Linear Units (ELU) (Clevert et al. [2015]) as the activation function for all the neuronal units of the hidden layers. Sigmoid is used as the activation function for the output layer so that the output can be directly related to the class label. The numbers of neurons in the input and output layers are fixed, as they depend on the dimension of the input data and the type of output. To determine the appropriate number of hidden layers, the model is parameterized to have up to 100 hidden layers, and the number of neurons (f) in each layer is taken from the set {0, 50, 100, ..., 450, 500}. Dropout factors (p) are chosen from the set {0.25, 0.50}. The scikit-learn (Pedregosa et al. [2011]) library for Python is used to perform a grid search through the parameter space and determine the best possible parameters for the model. We also train linear SVM models using the scikit-learn library for comparison studies.
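The grid search can be sketched as an exhaustive loop over the hyperparameter space. The paper uses scikit-learn for this step; here a hand-rolled equivalent is shown with a reduced grid, and the score function is a toy stand-in for a trained model's cross-validated accuracy:

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Exhaustive search over a hyperparameter grid, returning the
    best-scoring combination (scikit-learn's grid search plays this
    role in the paper)."""
    best_params, best_score = None, float("-inf")
    for combo in product(*param_grid.values()):
        params = dict(zip(param_grid.keys(), combo))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Grid shaped after the text: hidden layers up to 100, neurons f from
# {50, ..., 500}, dropout p from {0.25, 0.50}; a subset is shown here.
grid = {"hidden_layers": [2, 50, 100],
        "f": [50, 100, 500],
        "p": [0.25, 0.50]}

# Toy stand-in for the cross-validated accuracy of a trained network.
toy_score = lambda hidden_layers, f, p: -abs(hidden_layers - 50) - abs(f - 100) - p

best, _ = grid_search(grid, toy_score)
print(best)  # {'hidden_layers': 50, 'f': 100, 'p': 0.25}
```

In the actual pipeline, the score function would train the Keras model with those hyperparameters and return its stratified 10-fold cross-validation accuracy.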

Fig. 5 shows the accuracies of SVE, SVM (Linear), and deep learning models for the IEEE 57-bus test system. Accuracies are measured using stratified 10-fold cross validation. It can be seen that SVM performs comparably to the feed-forward deep neural network model until the ratio κ/m reaches 0.5, after which the feed-forward (deep learning) model performs better. SVE (State Vector Estimation) performance increases linearly with the number of meters, but it also increases the false positives (the number of outputs predicted as attacked that are actually unattacked). This can be inferred from the precision and recall comparisons shown in Figs. 6 and 7. Linear SVM models have better recall values when fewer than half of the meters are attacked, but very low precision values due to the presence of false positives; the reverse holds when more than half of the meters are attacked. SVE models have good precision but lower recall, which shows that not all the attacked meters are identified correctly. Deep learning models have lower precision than SVE, but better recall, and they also have better accuracy than both the SVE and SVM models.
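The precision/recall trade-off discussed above can be made concrete with a small helper. The counts below are illustrative values chosen only to mirror the qualitative behaviour described (they are not taken from the experiments):

```python
def precision_recall(tp, fp, fn):
    """Precision = TP/(TP+FP); recall = TP/(TP+FN).
    'Attacked' is treated as the positive class, as in the comparison figures."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Many false positives -> decent recall but low precision (SVM-like regime);
# many false negatives -> high precision but low recall (SVE-like regime).
p_svm, r_svm = precision_recall(tp=90, fp=60, fn=10)
p_sve, r_sve = precision_recall(tp=60, fp=5, fn=40)
print(round(p_svm, 2), round(r_svm, 2))  # 0.6 0.9
print(round(p_sve, 2), round(r_sve, 2))  # 0.92 0.6
```

With such a helper, per-model precision and recall can be computed from the confusion matrix of each fold and averaged across the 10 folds.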

The performance of a neural network model increases slightly with the number of layers, then stabilizes or sometimes decreases after reaching a maximum. One reason for this reduction in accuracy is overfitting of the data as the number of layers grows. Fig. 11 shows the accuracy of the models as the number of layers increases. Some IEEE bus test systems reach their highest accuracy with as few as two layers, while other models require more layers. In Fig. 11, we observe that the IEEE 9-bus and 14-bus test systems achieve their maximum accuracy around 50 hidden layers, the IEEE 30-bus test system achieves similar highest accuracies around 2 or 90 hidden layers, and the IEEE 57-bus test system achieves it around 2 or 40 hidden layers.

The accuracies of the models vary with the number of attacked meters and the other hyperparameters. We report metrics for the neural network models with the best performance values among all trained models for the IEEE 9-bus, 14-bus, 30-bus, and 57-bus test systems; the performance charts are shown in Figs. 8, 9, and 10. All the models start to converge to high accuracy when more than 55% of the meters are attacked. The IEEE 9-bus test system achieves around 90% accuracy with just 2 hidden layers. The number of hidden layers and neurons increases with the number of buses in the IEEE test systems. Feed-forward neural networks achieve more than 90% accuracy when the ratio of attacked meters is greater than 0.3 for the IEEE 9-bus and 14-bus test systems, and greater than 0.7 for the IEEE 30-bus and 57-bus test systems.

We also analyze the computational time for each of the models as the number of layers increases. In Fig. 12, we observe that the average training time increases with the number of layers. The horizontal axis in Fig. 12 represents the number of hidden layers in the model, and the vertical axis represents the average time taken for the model to train across the parameter space. The interesting observation is that as the complexity of the IEEE bus test system increases, the average training time is lower than that of the simpler bus systems. That is, as the number of layers increases, the IEEE 9-bus system has higher average training times than the IEEE 14-bus test system, which in turn has higher average training times than the IEEE 30-bus test system. The IEEE 30-bus and 57-bus systems have comparable (close) training times, which are lower than those of the IEEE 9-bus and 14-bus systems.

Since we experiment with multiple IEEE bus test systems, namely the 9-bus, 14-bus, 30-bus, and 57-bus systems, we perform the data generation for all the bus systems in parallel using multiple processes to obtain results quickly. For each of the test systems,


we analyze the scenario where different numbers of meters are attacked and evaluate the classification performance. This means that for each of the datasets corresponding to a different ratio of attacked meters, a model needs to be trained and tested. To accelerate the training process for all the models, we train these models in parallel on a server and persist the optimal network models.
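A minimal sketch of this parallel scheme using Python's multiprocessing module is shown below. Here generate_and_train is a hypothetical stand-in for the actual MATPOWER data generation and Keras training steps, and the ratio grid is illustrative:

```python
from multiprocessing import Pool

# One task per (bus system, attacked-meter ratio) pair; the ratio values
# below are assumptions of this sketch, not the paper's exact grid.
BUS_SYSTEMS = [9, 14, 30, 57]
ATTACK_RATIOS = [round(0.1 * k, 1) for k in range(1, 10)]  # 0.1 .. 0.9
TASKS = [(buses, ratio) for buses in BUS_SYSTEMS for ratio in ATTACK_RATIOS]

def generate_and_train(task):
    """Placeholder worker: generate measurements, inject attacks,
    train a model, and persist the best network for this task."""
    buses, ratio = task
    # ... MATPOWER data generation + Keras training would go here ...
    return (buses, ratio, "done")

def run_all_in_parallel():
    """Dispatch one worker process per task across the pool."""
    with Pool() as pool:
        return pool.map(generate_and_train, TASKS)

if __name__ == "__main__":
    print(len(run_all_in_parallel()))  # 4 systems x 9 ratios = 36 tasks
```

Because each (bus system, ratio) dataset is independent, the tasks are embarrassingly parallel and a process pool scales roughly linearly with the number of cores available on the server.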

7 Conclusion & Future Work

The data integrity attack detection problem in smart grids is redefined as a machine learning problem, and a feed-forward neural network model is employed for the classification of attacked and secured data. The performance of this model is analyzed and compared with State Vector Estimation (SVE) and supervised machine learning algorithms over a range of attacked meters. This analysis shows that feed-forward neural networks can detect centralized data integrity attacks with better performance than attack detection algorithms that employ state vector estimation methods. The performance of feed-forward deep neural networks on various IEEE bus test systems is also compared. It is clearly seen that feed-forward neural networks perform better than State Vector Estimation methods and have comparable (often better) performance to other supervised machine learning algorithms. In future work, we would like to work with real-time measurement data from PJM and employ other deep learning models, such as recurrent networks and LSTMs, which are known to work better for sensor data, time series, and high-dimensional data, to improve the classification of secured and attacked data.

Acknowledgement

This work is supported by the National Science Foundation under Award Number DUE-1303359. We would like to thank Mete Ozay for the productive discussions and comments, and Krishna Pusuluri for valuable feedback.

References

Mohammad Esmalifalak, Lanchao Liu, Nam Nguyen, Rong Zheng, and Zhu Han. 'Detecting stealthy false data injection using machine learning in smart grid.' IEEE Systems Journal, 2014.

Annarita Giani, Eilyan Bitar, Manuel Garcia, Miles McQueen, Pramod Khargonekar, and Kameshwar Poolla. 'Smart grid data integrity attacks: characterizations and countermeasures.' In 2011 IEEE International Conference on Smart Grid Communications (SmartGridComm), pages 232–237. IEEE, 2011.

Oliver Kosut, Liyan Jia, Robert J. Thomas, and Lang Tong. 'Malicious data attacks on smart grid state estimation: Attack strategies and countermeasures.' In 2010 First IEEE International Conference on Smart Grid Communications (SmartGridComm), pages 220–225. IEEE, 2010.

Deepa Kundur, Xianyong Feng, Salman Mashayekh, Shan Liu, Takis Zourntos, and Karen L. Butler-Purry. 'Towards modelling the impact of cyber attacks on a smart grid.' International Journal of Security and Networks, 6(1):2–13, 2011.

Ali Abur and Antonio Gomez Exposito. Power System State Estimation: Theory and Implementation. 1st ed., CRC Press, 2004.

Fadi Aloul, A. R. Al-Ali, Rami Al-Dalky, Mamoun Al-Mardini, and Wassim El-Hajj. 'Smart grid security: Threats, vulnerabilities and solutions.' International Journal of Smart Grid and Clean Energy, 1(1):1–6, 2012.

Olivier Bousquet, Stéphane Boucheron, and Gábor Lugosi. Introduction to statistical learning theory. In Advanced Lectures on Machine Learning, pages 169–207. Springer, 2004.

François Chollet. Keras. [online] https://github.com/fchollet/keras, 2015. (Accessed 15 February 2017).

Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mane, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viegas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. [online] URL http://tensorflow.org/. Software available from tensorflow.org (Accessed 11 March 2017).

Djork-Arné Clevert, Thomas Unterthiner, and Sepp Hochreiter. Fast and accurate deep network learning by exponential linear units (ELUs). arXiv preprint arXiv:1511.07289, 2015.

Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. 'Deep learning.' Nature, 521(7553):436–444, 2015.

Jing Liu, Yang Xiao, Shuhui Li, Wei Liang, and C. L. Philip Chen. 'Cyber security and privacy issues in smart grids.' IEEE Communications Surveys & Tutorials, 14(4):981–997, 2012.

Yao Liu, Peng Ning, and Michael K. Reiter. 'False data injection attacks against state estimation in electric power grids.' ACM Transactions on Information and System Security (TISSEC), 14(1):13, 2011.

Anthony R. Metke and Randy L. Ekl. 'Smart grid security technology.' In Innovative Smart Grid Technologies (ISGT), 2010, pages 1–7. IEEE, 2010.

Mete Ozay, Inaki Esnaola, Fatos T. Yarman Vural, Sanjeev R. Kulkarni, and H. Vincent Poor. 'Sparse attack construction and state estimation in the smart grid: Centralized and distributed models.' IEEE Journal on Selected Areas in Communications, 31(7):1306–1318, 2013.

Mete Ozay, Inaki Esnaola, Fatos T. Yarman Vural, Sanjeev R. Kulkarni, and H. Vincent Poor. 'Machine learning methods for attack detection in the smart grid.' IEEE Transactions on Neural Networks and Learning Systems, 27(8):1773–1786, 2016.

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. 'Scikit-learn: Machine learning in Python.' Journal of Machine Learning Research, 12:2825–2830, 2011.

PJM: Markets and Operations, 2017. [online] URL http://www.pjm.com/markets-and-operations.aspx. (Accessed 23 June 2017).

Jürgen Schmidhuber. 'Deep learning in neural networks: An overview.' Neural Networks, 61:85–117, 2015.

Nitish Srivastava, Geoffrey E. Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 'Dropout: a simple way to prevent neural networks from overfitting.' Journal of Machine Learning Research, 15(1):1929–1958, 2014.

Wenye Wang and Zhuo Lu. 'Cyber security in the smart grid: Survey and challenges.' Computer Networks, 57(5):1344–1371, 2013.

Zhifeng Xiao, Yang Xiao, and David Hung-Chang Du. 'Exploring malicious meter inspection in neighborhood area smart grids.' IEEE Transactions on Smart Grid, 4(1):214–226, 2013.

Ray Daniel Zimmerman, Carlos Edmundo Murillo-Sanchez, and Robert John Thomas. 'MATPOWER: Steady-state operations, planning, and analysis tools for power systems research and education.' IEEE Transactions on Power Systems, 26(1):12–19, 2011.

Figure 1 Secure and malicious measurements of the IEEE 57-bus test system. All the meter measurements of the vector z = (z1, z2, ..., zm)T are plotted using (i, zi) for i ∈ {1, 2, ..., m}. The Y-axis shows the measurement values for the meters on the X-axis. The attacked and secure measurement values are close to each other, meaning the difference between them is very small. We demonstrate the difference by plotting those values on a logarithmic scale, as shown in the small rectangular box. For the 57-bus system, m = 137. Here, we show only the part of the plot for i ∈ {13, ..., 62} to clearly demonstrate that the malicious and secure measurements are very closely placed in vector space, separated by very short distances, as shown in the rectangular box. The points on the histogram connected to the rectangle correspond to the zoomed regions showing how these points are separated by some distance.

Figure 2 Example of a feed-forward neural network with one input on the input layer (blue), one output on the output layer (green), and 2 hidden layers with three neurons each (grey). This is a fully connected feed-forward network in which a neuron receives inputs from all the neurons in the previous layer and then uses the weights on its connections and its activation function to determine the value to be passed to the next layer.


Figure 3 Workflow design of the smart grid data generation and prediction model.

Figure 4 Architecture of deep learning model

Figure 5 Comparison of accuracies among State Vector Estimation (SVE), SVM (Linear) and deep learning models for the IEEE 57-bus test system.

Figure 6 Comparison of precision values among State Vector Estimation (SVE), SVM (Linear) and deep learning models for the IEEE 57-bus test system.

Figure 7 Comparison of recall values among State Vector Estimation (SVE), SVM (Linear) and deep learning models for the IEEE 57-bus test system.


Figure 8 Accuracies of deep learning models for IEEE 9-bus, 14-bus, 30-bus, and 57-bus test systems for a range of attacked meters.


Figure 9 Precision values of deep learning models for IEEE 9-bus, 14-bus, 30-bus, and 57-bus test systems for a range of attacked meters.

Figure 10 Recall performance values of deep learning models for IEEE 9-bus, 14-bus, 30-bus, and 57-bus test systems for a range of attacked meters.


Figure 11 Accuracy of deep learning models with the increase in number of layers for IEEE 9-bus, 14-bus, 30-bus, and 57-bus test systems when there are 10 attacked meters.


Figure 12 Average training times of deep learning models for IEEE 9-bus, 14-bus, 30-bus, and 57-bus test systems when there are 10 attacked meters. The horizontal axis represents the number of hidden layers in the model and the vertical axis represents the average time taken for the model to train across the parameter space.