

Household Energy Disaggregation based on Difference Hidden Markov Model

Ali Hemmatifar ([email protected])    Manohar Mogadali ([email protected])

Abstract—We aim to address the problem of energy disaggregation into individual appliance contributions for a given household. We do this with a machine learning approach that models each appliance's load contribution as a Markov chain. Once the Hidden Markov Model (HMM) is trained, we obtain parameters such as the number of states and their prior probabilities, the transition matrix, and the emission characteristics for each appliance (each state is modeled as a Gaussian distribution). The disaggregation of the total load is modeled as a difference Hidden Markov Model (difference HMM), built on the HMMs learned previously for each appliance. Finally, the most probable set of states for all the appliances is obtained using a custom Viterbi algorithm. This approach to energy disaggregation works well for fixed-state appliances, as can be seen in the results.

Key Words: Energy disaggregation, Markov chain, HMM, Difference HMM, Viterbi algorithm

I. INTRODUCTION

Exponential increase in energy demand makes energy conservation one of the biggest challenges of our time. Any effort to minimize energy waste therefore has a direct impact on our lives, both economically and environmentally. It is estimated that, on average, residential and commercial buildings consume as much as a third of total energy generation [7]. Recent studies [1], [7] suggest that household-specific energy monitoring and direct feedback (as opposed to indirect feedback such as monthly bills) can help reduce the energy consumption of residential buildings by about 10%. It could also provide important information to utilities and lawmakers who shape energy policy. Energy disaggregation has been an active area of research in the recent past. [3] use sparse coding to disaggregate loads even from coarse, monthly demand data. Non-Intrusive Appliance Load Monitoring (NIALM) techniques are among the promising monitor-and-feedback systems. NIALM is a method of evaluating total current/voltage (or power) readings in order to deduce appliance-level energy consumption patterns. To this end, NIALM monitors the total load and identifies certain appliance “signatures”. These signatures correspond to certain (hidden) states of the appliances and can be used to disaggregate the total load into its individual components. Energy disaggregation, powered by various machine learning algorithms, has been an active area of research [2], [3], [4], [7]. Among these, supervised algorithms show better performance than unsupervised algorithms [3], owing to prior knowledge of individual appliance energy states and signatures.

In this project, we use a mixture of supervised learning algorithms for energy disaggregation. We first extract individual appliance energy states by approximating the states as a mixture of Gaussians. We then use a Hidden Markov Model (HMM) with Gaussian outputs (or emissions) to learn energy signatures and transitions between states for each individual appliance. We then use the Difference HMM (DHMM) algorithm developed by [5], along with a custom Viterbi algorithm, to approximate the most probable sequence of hidden states for each appliance from the total power consumption data. Using our model, we are equipped with a method to provide energy saving recommendations to consumers.


Energy consumption segmentation into ‘classes’ of appliances has been studied. These methods, though stochastically sophisticated and accurate, do not by themselves provide valuable recommendations to a customer. We believe that by comparing with other households having similar energy patterns, one can provide more relevant recommendations and have a greater chance of inducing behavioral change. To our knowledge, this feature has not been researched much, and it is central to our recommendation scheme.

II. APPLIANCE MODELING

In this section, we describe the procedure for extracting energy-consumption features of individual appliances. We use the publicly available REDD dataset, which contains both aggregate and individual appliance power data for 6 households. We used the data from the first home (sampled every 2 to 5 s) for our appliance-level supervised learning. We identified the major energy-consuming appliances (7 in total) and extracted their power curves. To this end, for each appliance, we “chopped” the power usage into the periods of time in which the appliance was on. This provides a multi-observation sequence, which can potentially result in more accurate training. We then used a histogram analysis to identify the number of states of each appliance and to provide reasonable initial guesses to the training algorithm. Figure 1(a) shows the actual power reading of the dishwasher as well as its corresponding energy histogram. This analysis and all the following algorithms are implemented in MATLAB. As shown in Figure 1(a), we created an energy histogram for each appliance (normalized frequency vs. energy level) to identify the most frequent energy levels and the standard deviation corresponding to those states. This is a crude yet essential step in appliance modeling, as it provides reasonable initial values for the next step. These results are then fed into an HMM as initial values.
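A minimal MATLAB sketch of such a histogram-based initialization is shown below. It is an illustration of the idea rather than the exact code used for this report; the variable names (power, K) and the peak-picking heuristic are our own assumptions.

```matlab
% Sketch: crude histogram-based initial guesses for an appliance's states.
% 'power' is a column vector of the appliance's power readings (W) and 'K'
% is the number of states suggested by visual inspection of the histogram.
[counts, edges] = histcounts(power, 100);
centers = (edges(1:end-1) + edges(2:end)) / 2;

[~, order] = sort(counts, 'descend');        % most frequent power levels first
mu0 = sort(centers(order(1:K)))';            % initial state means (adjacent bins of
                                             % one broad peak may need merging by hand)

% Initial standard deviations from the samples nearest each candidate mean
% (uses implicit expansion, R2016b+).
[~, assign] = min(abs(power - mu0'), [], 2);
sigma0 = zeros(K, 1);
for k = 1:K
    sigma0(k) = max(std(power(assign == k)), 1);   % floor at 1 W to avoid zero variance
end
```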

Since the power readings are continuous, we approximate the output (emission) of the HMM as Gaussian, with each state corresponding to a single Gaussian. For this version of the HMM, we used the algorithm formulated by [6], which we briefly introduce here. In state $i$ the emission is distributed as $\mathcal{N}(\mu_i, \sigma_i^2)$, with emission parameters $\phi_i = \{\mu_i, \sigma_i^2\}$. The estimation formulae for the parameters of this model can be shown to be of the form [6]

$$\mu_i = \frac{\sum_{t=1}^{T} \gamma_t(i)\, x_t}{\sum_{t=1}^{T} \gamma_t(i)} \qquad \text{and} \qquad \sigma_i^2 = \frac{\sum_{t=1}^{T} \gamma_t(i)\,(x_t - \mu_i)^2}{\sum_{t=1}^{T} \gamma_t(i)}$$

where $x_t$ is the observation at time $t$, and $\mu_i$ and $\sigma_i^2$ are respectively the mean and variance of the $i$-th state. $\gamma_t(i)$ is the probability of being in state $i$ at time $t$ and is defined as

$$\gamma_t(i) = \frac{\alpha_t(i)\,\beta_t(i)}{\sum_{i=1}^{n} \alpha_t(i)\,\beta_t(i)}$$

where $\alpha_t(i)$ and $\beta_t(i)$ are the forward and backward variables. The other parameters, such as the prior probabilities and the transition matrix, are estimated as in a standard HMM.
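Written as code, these updates are only a few lines. The sketch below is our own illustration, assuming the scaled forward variables alpha and backward variables beta (T-by-n matrices for one observation sequence) have already been computed; it relies on MATLAB's implicit expansion (R2016b+).

```matlab
% Sketch: Gaussian-emission re-estimation from the forward/backward pass.
% alpha, beta: T-by-n forward and backward variables; x: T-by-1 power readings.
gamma  = (alpha .* beta) ./ sum(alpha .* beta, 2);          % gamma_t(i), rows sum to 1

mu     = (gamma' * x) ./ sum(gamma, 1)';                    % mu_i, weighted mean per state
sigma2 = sum(gamma .* (x - mu').^2, 1)' ./ sum(gamma, 1)';  % sigma_i^2, weighted variance
```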

The result of the Gaussian-emission HMM for the dishwasher is shown in Figure 1(b). This appliance has four states, so the graph representation has four nodes. In Figure 1(c), we show the most probable state sequence (the output of the Viterbi algorithm) for the data shown in Figure 1(a). The results show reasonable agreement between the predicted and measured power curves. We similarly train the remaining appliances and then use a difference HMM for disaggregation.

III. DISAGGREGATION

Once all the appliances are modeled, we address the task of disaggregation. We use a difference Hidden Markov Model for this. For the disaggregation of the total load, we have two input vectors: $\mathbf{x}$ (the time series of the total load curve) and $\mathbf{y}$ (the time series of differences in $\mathbf{x}$ between consecutive time steps), i.e., $y_t = x_t - x_{t-1}$ (hence this model is referred to as a difference HMM). This is done because $\mathbf{y}$ is a more informative feature than the total load itself: any change of state in an appliance results in a non-zero $y_t$, while steady operation yields $y_t \approx 0$.


Figure 1. (a) Measured power draw and histogram of states for the dishwasher, (b) State transition model, (c) Predicted state sequence via Viterbi algorithm

Figure 2. Schematic of difference HMM

Figure 2 shows the schematic of a difference HMM, where $\mathbf{x}$ and $\mathbf{y}$ are defined above and $\mathbf{z}$ is the vector of appliance states at each time step.
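As a minimal sketch (assuming the aggregate meter readings are stored in a column vector x_total, a name of our choosing), the two input series can be formed as follows.

```matlab
% Sketch: difference-HMM input features built from the aggregate load.
x = x_total;          % total load curve, one reading per time step
y = [0; diff(x)];     % y_t = x_t - x_{t-1}; y_1 has no predecessor and is set to 0
% Steady operation gives y_t ~ 0, while an appliance changing state produces a
% spike in y_t roughly equal to the change in that appliance's power draw.
```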

For each appliance $n$, the model parameters can be defined as $\theta^{(n)} = \{\Pi^{(n)}, \mathbf{A}^{(n)}, \phi^{(n)}\}$, respectively corresponding to the probability of an appliance's initial state, the transition probabilities between states, and the probability that an observation was generated by an appliance state [5]. The probability of an appliance's starting state at $t = 1$ is represented by the prior distribution, hence:

$$p(z_1 = k) = \Pi_k \qquad (1)$$

The transition probabilities from state $i$ at $t-1$ to state $j$ at $t$ are represented by the transition matrix $\mathbf{A}$ such that:

$$p(z_t = j \mid z_{t-1} = i) = A_{i,j} \qquad (2)$$

We assume that each appliance has a Gaussian-distributed power demand:

$$p(w_{z_t} \mid z_t, \phi) = \mathcal{N}(\mu_{z_t}, \sigma^2_{z_t}) \qquad (3)$$

where $w_{z_t}$ is the actual power draw of the appliance in state $z_t$. Then $y_t$, being the difference of two Gaussian-distributed quantities, is itself Gaussian:

$$y_t \mid z_t, z_{t-1}, \phi \sim \mathcal{N}\!\left(\mu_{z_t} - \mu_{z_{t-1}},\; \sigma^2_{z_t} + \sigma^2_{z_{t-1}}\right) \qquad (4)$$

where $\phi_k = \{\mu_k, \sigma^2_k\}$, and $\mu_k$ and $\sigma^2_k$ are the mean and variance of the Gaussian distribution describing this appliance's power draw in state $k$. So far, a predicted change of state does not ensure that the appliance could actually have produced the corresponding power draw (or even that the appliance was on). We enforce this by imposing the total power demand $x_t$ as an upper limit on the appliance power draw $w_{z_t}$:

$$P(w_{z_t} \le x_t \mid z_t, \phi) = \int_{-\infty}^{x_t} \mathcal{N}(w;\, \mu_{z_t}, \sigma^2_{z_t})\, dw = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{x_t - \mu_{z_t}}{\sigma_{z_t}\sqrt{2}}\right)\right] \qquad (5)$$
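To make the emission terms concrete, the sketch below evaluates the equation (4) and equation (5) terms for a single candidate state change of one appliance. The two-state parameter values are hypothetical, chosen only for illustration.

```matlab
% Sketch: likelihood terms of eqs. (4) and (5) for one appliance at time t.
mu     = [0; 1200];     % hypothetical state means (W), e.g. OFF and ON
sigma2 = [25; 400];     % hypothetical state variances (W^2)

i  = 1;  j = 2;         % candidate transition: state i at t-1 -> state j at t
yt = 1190;              % observed change in the total load (W)
xt = 1500;              % observed total load (W)

% Eq. (4): y_t | z_t, z_{t-1} ~ N(mu_j - mu_i, sigma2_j + sigma2_i)
dmu  = mu(j) - mu(i);
dvar = sigma2(j) + sigma2(i);
p_y  = exp(-(yt - dmu)^2 / (2*dvar)) / sqrt(2*pi*dvar);

% Eq. (5): probability that the appliance's draw in state j fits under x_t
p_fit = 0.5 * (1 + erf((xt - mu(j)) / sqrt(2*sigma2(j))));
```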

We further need another constraint, which ensures that a change of state is not predicted based on noise in the observed total load $\mathbf{x}$.


To achieve this, we filter out changes of state whose joint probability falls below a threshold $C$ specific to each appliance; that is, $t \in S$ if

$$\max_{z_{t-1}, z_t} \, p(x_t, z_{t-1}, z_t \mid \theta) \ge C \qquad (6)$$

Combining equations (1) through (6), we end up with the joint probability:

$$p(\mathbf{x}, \mathbf{y}, \mathbf{z} \mid \theta) = p(z_1 \mid \Pi) \prod_{t=2}^{T} p(z_t \mid z_{t-1}, \mathbf{A}) \prod_{t=1}^{T} P(w_{z_t} \le x_t \mid z_t, \phi) \prod_{t \in S} p(y_t \mid z_t, z_{t-1}, \phi) \qquad (7)$$

Using equation (7), we can figure out the most probable state for any appliance at a given time $t$ if we know its state at $t-1$. We still do not have a unique state change (a non-zero $y_t$ could predict a change of state for more than one appliance). To overcome this, we employ the Viterbi algorithm, a dynamic programming algorithm that returns the most probable change of state across all appliances. For running the custom Viterbi algorithm, we make the assumption that there can be only one state change across all appliances at each time step. This is a fair assumption owing to the high sample rate of the data used (2 to 5 seconds). Only the most likely change of state (as predicted by the Viterbi algorithm) is accepted and all others are discarded at each time step. The results of this algorithm are discussed in the next section.
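A sketch of this one-change-per-time-step decoding is given below. It is our own simplified, greedy rendering of the procedure described above, not the exact implementation used for this report; the containers mu{n}, sigma2{n}, A{n}, the per-appliance log-threshold logC(n), and the initial states z0 are assumed to hold the quantities learned in Section II.

```matlab
% Sketch: greedy one-change-per-step decoding of N appliances from the total load x.
T = numel(x);
N = numel(mu);                    % mu, sigma2, A are 1-by-N cell arrays
Z = zeros(T, N);                  % decoded state sequence, one column per appliance
Z(1, :) = z0;                     % initial states, assumed known or estimated

for t = 2:T
    yt = x(t) - x(t-1);
    best = -inf; bestN = 0; bestJ = 0;
    for n = 1:N                   % score every possible single-appliance state change
        i = Z(t-1, n);
        for j = 1:numel(mu{n})
            if j == i, continue; end
            dmu  = mu{n}(j) - mu{n}(i);
            dvar = sigma2{n}(j) + sigma2{n}(i);
            % log of the eq. (2), eq. (4) and eq. (5) factors for this candidate
            score = log(A{n}(i, j)) ...
                  - 0.5*log(2*pi*dvar) - (yt - dmu)^2 / (2*dvar) ...
                  + log(0.5*(1 + erf((x(t) - mu{n}(j)) / sqrt(2*sigma2{n}(j)))));
            if score > best
                best = score; bestN = n; bestJ = j;
            end
        end
    end
    Z(t, :) = Z(t-1, :);                    % default: no appliance changes state
    if bestN > 0 && best >= logC(bestN)     % eq. (6): reject low-confidence changes
        Z(t, bestN) = bestJ;
    end
end
```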

IV. RESULTS AND DISCUSSION

In Figure 3, we show the total load curve of a household disaggregated into 4 appliance loads: an oven, a dryer, a microwave, and a refrigerator. The total power demand curve is taken from one of the households in the REDD dataset. Table I shows the energy usage (the area under the power-time curve) for each appliance.

Figure 3. Energy disaggregation into four different appliances (oven, dryer, microwave, and refrigerator)

Appliance        Actual Energy (%)    Estimated Energy (%)
Dryer            60.84                61.47
Oven             26.97                27.09
Refrigerator     11.54                10.58
Microwave         0.65                 0.86

Table I. Actual and estimated total energy consumption for 4 appliances (oven, dryer, microwave, and refrigerator)

We observe that the L1 norm of the error for this example is 2.58% of the total load curve. The results show that we have successfully disaggregated the total load for the building into its appliance components.
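For reference, under one natural reading of these metrics, the per-appliance energy shares and the L1 error can be computed as follows (variable names are our own; P_hat holds the estimated appliance power draws and x the measured total load).

```matlab
% Sketch: energy shares and L1 reconstruction error.
% P_hat: T-by-N matrix of estimated appliance power draws (W); x: T-by-1 total load (W).
% With uniform sampling, the time step cancels out of the percentages.
energy_share = 100 * sum(P_hat, 1) / sum(P_hat(:));    % estimated energy per appliance (%)

x_hat  = sum(P_hat, 2);                                % reconstructed total load
L1_err = 100 * sum(abs(x_hat - x)) / sum(x);           % L1 error as a % of the total load
```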

V. APPLICATIONS AND SCOPE

One application of the disaggregated data is to provide energy saving recommendations to individual households. These could come from identifying inefficient appliances, or from usage recommendations such as running the washer-dryer in the afternoon, when the electricity price is low. One major requirement of a recommendation system is that it be relevant to the consumer. To address this, we cluster the users based on their energy habits using the k-means++ algorithm; we observe that k = 5 clusters gives the best results. Individual household data is obtained from the PecanStreet dataset, which provides load curves for 600 homes. The results are presented in Figure 4. The clustering of users makes comparison within the same energy cluster far more meaningful than in a general setting.


Figure 4. Clustered households via k-means++ clustering (k = 5); least-squares error for various k values

Any inaccuracies that occur in the disaggregation algorithm can be treated as an online learning problem within each cluster to increase the accuracy.
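A sketch of this clustering step is shown below. It assumes each household is summarized by a feature vector derived from its load curve (here, hypothetically, the average load per hour of day) and uses kmeans from the Statistics and Machine Learning Toolbox, which defaults to k-means++ ('plus') initialization in recent MATLAB releases; the feature choice and variable names are our assumptions.

```matlab
% Sketch: clustering households by energy habits with k-means++.
% H: M-by-24 matrix, one row per household (e.g. average load for each hour of day).
rng(0);                                             % reproducible initialization
sumd_k = zeros(10, 1);
for k = 1:10
    [~, ~, sumd] = kmeans(H, k, 'Replicates', 5);   % k-means++ initialization by default
    sumd_k(k) = sum(sumd);                          % total within-cluster distance
end
% Choose k at the "elbow" of sumd_k; the report settles on k = 5.
[idx, centroids] = kmeans(H, 5, 'Replicates', 5);
```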

VI. CONCLUSION AND FUTURE WORK

We find that the difference HMM approach to energy disaggregation, built on HMM appliance models, has a good prediction rate in the case of fixed-state appliances. In our example, we obtain an L1 error norm of 2.58% of the total load. It should be noted that the power surge that typically occurs when appliances start from the OFF state is not captured by the current model. Further, infinite-state devices (such as dimmer lamps) cannot be modeled accurately using a finite-state HMM, and hence their contribution is not disaggregated accurately. Finally, we describe a feasible application of this work in the form of a novel energy-saving recommendation provider that compares households with similar energy patterns.

REFERENCES

[1] K Carrie Armel, Abhay Gupta, Gireesh Shrimali, and Adrian Albert. Is disaggregation the holy grail of energy efficiency? The case of electricity. Energy Policy, 52:213–234, 2013.

[2] D Kelly and W Knottenbelt. Disaggregating smart meter readings using device signatures. Imperial Computing Science MSc Individual Project, 2011.

[3] J Zico Kolter, Siddharth Batra, and Andrew Y Ng. Energy disaggregation via discriminative sparse coding. In Advances in Neural Information Processing Systems, pages 1153–1161, 2010.

[4] J Zico Kolter and Tommi Jaakkola. Approximate inference in additive factorial HMMs with application to energy disaggregation. In International Conference on Artificial Intelligence and Statistics, pages 1472–1482, 2012.

[5] Oliver Parson, Siddhartha Ghosh, Mark Weal, and Alex Rogers. Non-intrusive load monitoring using prior models of general appliance types. 2012.

[6] Lawrence R Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257–286, 1989.

[7] Ahmed Zoha, Alexander Gluhak, Muhammad Ali Imran, and Sutharshan Rajasegarar. Non-intrusive load monitoring approaches for disaggregated energy sensing: A survey. Sensors, 12(12):16838–16866, 2012.