Evaluating Electricity Theft Detectors in Smart Grid Networks. Group 5:

Similar documents
Evaluating Electricity Theft Detectors in Smart Grid Networks

Detection of Unauthorized Electricity Consumption using Machine Learning

Journal of Chemical and Pharmaceutical Research, 2014, 6(5): Research Article

CS570 Data Mining. Anomaly Detection. Li Xiong. Slide credits: Tan, Steinbach, Kumar Jiawei Han and Micheline Kamber.

Anomaly Detection via Over-sampling Principal Component Analysis

SHORT TERM LOAD FORECASTING

Anomaly Detection. Jing Gao. SUNY Buffalo

Data Mining Based Anomaly Detection In PMU Measurements And Event Detection

False Data Injection Attacks Against Nonlinear State Estimation in Smart Power Grids

An Overview of Outlier Detection Techniques and Applications

Università di Pisa A.A Data Mining II June 13th, < {A} {B,F} {E} {A,B} {A,C,D} {F} {B,E} {C,D} > t=0 t=1 t=2 t=3 t=4 t=5 t=6 t=7

Unsupervised Learning Methods

Chart types and when to use them

Justin Appleby CS 229 Machine Learning Project Report 12/15/17 Kevin Chalhoub Building Electricity Load Forecasting

Multi-Plant Photovoltaic Energy Forecasting Challenge: Second place solution

Department of Computer Science, University of Pittsburgh. Brigham and Women's Hospital and Harvard Medical School

GLAD: Group Anomaly Detection in Social Media Analysis

A Framework for Adaptive Anomaly Detection Based on Support Vector Data Description

Anomaly (outlier) detection. Huiping Cao, Anomaly 1

Introduction to Time Series (I)

Real-time Sentiment-Based Anomaly Detection in Twitter Data Streams

Multi-criteria Anomaly Detection using Pareto Depth Analysis

About Nnergix +2, More than 2,5 GW forecasted. Forecasting in 5 countries. 4 predictive technologies. More than power facilities

Anomaly Detection via Online Oversampling Principal Component Analysis

Electrical Energy Modeling In Y2E2 Building Based On Distributed Sensors Information

Available online at ScienceDirect. Procedia Engineering 119 (2015 ) 13 18

Unsupervised Anomaly Detection for High Dimensional Data

Quantifying Cyber Security for Networked Control Systems

Detection and Estimation Final Project Report: Modeling and the K-Sigma Algorithm for Radiation Detection

CS4445 Data Mining and Knowledge Discovery in Databases. B Term 2014 Solutions Exam 2 - December 15, 2014

Surprise Detection in Multivariate Astronomical Data Kirk Borne George Mason University

Cyber Attacks, Detection and Protection in Smart Grid State Estimation

Identification of False Data Injection Attacks with Considering the Impact of Wind Generation and Topology Reconfigurations

Data and prognosis for renewable energy

Description of the case study

Predicting Solar Irradiance and Inverter power for PV sites

ASSIGNMENT - 1 M.Sc. DEGREE EXAMINATION, MAY 2019 Second Year STATISTICS. Statistical Quality Control MAXIMUM : 30 MARKS ANSWER ALL QUESTIONS

Forecasting demand in the National Electricity Market. October 2017

Anomaly Detection in Logged Sensor Data. Master s thesis in Complex Adaptive Systems JOHAN FLORBÄCK

Machine Learning Linear Classification. Prof. Matteo Matteucci

Support Measure Data Description for Group Anomaly Detection

The Changing Landscape of Land Administration

Learning an Optimal Policy for Police Resource Allocation on Freeways

Stat 500 Midterm 2 8 November 2007 page 0 of 4

Improving Biosurveillance System Performance. Ronald D. Fricker, Jr. Virginia Tech July 7, 2015

Prompt Network Anomaly Detection using SSA-Based Change-Point Detection. Hao Chen 3/7/2014

AWERProcedia Information Technology & Computer Science

b) Explain about charts for attributes and explain their uses. 4) a) Distinguish the difference between CUSUM charts and Shewartz control charts.

Weather Normalization: Model Selection and Validation EFG Workshop, Baltimore Prasenjit Shil

Predictive Modeling: Classification. KSE 521 Topic 6 Mun Yi

Detecting Non-Technical Energy Losses through Structural Periodic Patterns in AMI data

Structured Denoising Autoencoder for Fault Detection and Analysis

Doing Right By Massive Data: How To Bring Probability Modeling To The Analysis Of Huge Datasets Without Taking Over The Datacenter

Hierarchical Anomaly Detection in Load Testing with StormRunner Load

Aalborg Universitet. CLIMA proceedings of the 12th REHVA World Congress Heiselberg, Per Kvols. Publication date: 2016

Predicting New Search-Query Cluster Volume

Distribution-Free Monitoring of Univariate Processes. Peihua Qiu 1 and Zhonghua Li 1,2. Abstract

Prediction for night-time ventilation in Stanford s Y2E2 building

EXAD: A System for Explainable Anomaly Detection on Big Data Traces

Anomaly detection and. in time series

Nonparametric forecasting of the French load curve

Jeffrey D. Ullman Stanford University

Artificial Neural Networks Examination, June 2005

Roberto Perdisci^+, Guofei Gu^, Wenke Lee^ presented by Roberto Perdisci. ^Georgia Institute of Technology, Atlanta, GA, USA

Application-Aware Anomaly Detection of Sensor Measurements in Cyber-Physical Systems

Concept Drift. Albert Bifet. March 2012

Analysis. Components of a Time Series

Anomaly Detection for the CERN Large Hadron Collider injection magnets

Quality Control & Statistical Process Control (SPC)

Graph-Based Anomaly Detection with Soft Harmonic Functions

Improving Gas Demand Forecast During Extreme Cold Events

Performance of Conventional X-bar Chart for Autocorrelated Data Using Smaller Sample Sizes

Wireless Sensor Network Wormhole Detection using an Artificial Neural Network

Real-Time Detection of Hybrid and Stealthy Cyber-Attacks in Smart Grid

2.830J / 6.780J / ESD.63J Control of Manufacturing Processes (SMA 6303) Spring 2008

Performance Evaluation

Predicting Solar Flares by Converting GOES X-ray Data to Gramian Angular Fields (GAF) Images

ECE 5424: Introduction to Machine Learning

ECML PKDD Discovery Challenges 2017

MS&E 226: Small Data

Demand and Supply Integration:

David John Gagne II, NCAR

Figure 1. Time Series Plot of arrivals from Western Europe

Gov 2002: 3. Randomization Inference

2.830J / 6.780J / ESD.63J Control of Manufacturing Processes (SMA 6303) Spring 2008

Seasonal ARMA-based SPC charts for anomaly detection: Application to emergency department systems

8. Classifier Ensembles for Changing Environments

Automatic Singular Spectrum Analysis and Forecasting Michele Trovero, Michael Leonard, and Bruce Elsheimer SAS Institute Inc.

Energy Consumption Anomaly Monitoring Model

Robust and Rapid Adaption for Concept Drift in Software System Anomaly Detection

Uncertainty. Jayakrishnan Unnikrishnan. CSL June PhD Defense ECE Department

Development of System for Supporting Lock Position Adjustment Work for Electric Point Machine

The Dayton Power and Light Company Load Profiling Methodology Revised 7/1/2017

MULTIVARIATE PATTERN RECOGNITION FOR CHEMOMETRICS. Richard Brereton

Introduction to Statistical Inference

Evaluation. Andrea Passerini Machine Learning. Evaluation

Lessons learned from the theory and practice of. change detection. Introduction. Content. Simulated data - One change (Signal and spectral densities)

CHAPTER 6 CONCLUSION AND FUTURE SCOPE

Correlative Monitoring for Detection of False Data Injection Attacks in Smart Grids

Chapter-1 Introduction

Transcription:

Evaluating Electricity Theft Detectors in Smart Grid Networks Group 5: Ji, Xinyu Rivera, Asier

Agenda Introduction Background Electricity Theft Detectors and Attacks Results Discussion Conclusion 2

Introduction Bypass the Meter, so that, it doesn t count the consumed electricity. Detected under human inspection. 3

Background Observation (Electricity Consumption) X Sensor (SmartMeter) Y Classifier (Algorithm) Average attack: Set Y to low value => Really cheap electricity => Easy to detect. Sophisticated attack: Set Y to reasonable value => Cheaper electricity => Difficult to detect. Adversarial Classification: evaluate the effectiveness of the classifier against undetected attacks. Adversarial Learning: avoid the attacker to provide false data to the learning algorithm. 4

Adversarial classification Assumptions: A random process generates observations x X with probability distribution P 0 Use a maximum level of false negatives, α, that can be affordable Each classifier requires a threshold (Ƭ) to decide when raise the alert Attack vectors: 1. For each classifier, find the maximum value for the Ƭ among the P 0 distribution that fulfils α 2. Run all the classifiers set with the Ƭ selected in step 1, select the results that would produce the worst undetected attacks (more cost for the company) 3. Select the less costly (for the attacker) among the classifiers selected in step 2 5

Adversarial learning Contamination attacks: The classifiers uses a dataset to detect anomalies The attacker can inject false values to poison the dataset After some time, the data set is modified to fulfil the attackers requirements Contamination attack Y = Classifier would find purple as an anomaly Classifier would find purple as normal 6

Electricity-Theft Detectors and Attacks 1. Average Detector 2. ARMA-GLR 3. Nonparametric Statics (EWMA and CUSUM) 4. Unsupervised Learning (LOF) 7

Average Detector തY = 1 σ N i=1 N Y i In the paper they select the threshold in the following way: 1. 1. Given a training dataset, say T days in the most recent past, we can compute T daily averages, D i (i = 1,...,T). 2. 2. τ = min i (D i ) If തY <τ, where τ is a variable threshold. sending τ as Y i all the day Formulas from ECI Telecom. Fighting Electricity Theft with Advanced Metering Infrastructure (March 2011) 8

ARMA-GLR (Auto-Regressive Moving Average) ARMA probability distribution p0: If Y k > Y The attacker creates the probability distribution (P γ ) based on: Formulas from Forecast package for R, http://robjhyndman.com/software/forecast/ 9

EWMA (Exponentially-weighted Moving Average) EWMA i = λy i + (1 λ) EWMA i 1 If EWMA i < τ, where τ is a configurable parameter. λ is a weighting factor and 0 < λ 1 Yi is one of the time series measurements When EWMA i 1 When EWMA i 1 = τ, send Y i = τ. > τ, send Y i = MAX(0, τ (1 λ)ewma i 1 λ ) Formulas from EWMA Control Charts, http://itl.nist.gov/div898/handbook/pmc/section3/pmc324.html 10

CUSUM (Cumulative Sum Control Chart) S i = MAX(0, S i 1 + (μ Y i b)) μ is the expected value of the time-series, (i = 1,...,N) b is a slack constant defined so that E[ μ Y i b] < 0 under normal operation if S i > τ Calculate M = τ+nb N and send Y i = μ M Formulas from Non-Parametric Methods in Change-Point Problems. Kluwer Academic Publishers by Brodsky, B., Darkhovsky, B 11

EXTRA: XMR chart (individuals and moving range chart) EWMA and CUSUM identify minor changes in small regions XMR chart identifies large changes over the time, it determines whether the process is stable and predictable or not It calculates various thresholds and classifies the anomalies depending on how the process overpasses those thresholds Images adapted from Fraud detection in registered electricity time series by Josif V. Spiric, Miroslav B. Docic and Slobodan S. Stankovic 12

LOF (Local Outlier Factor) 1. Create a vector containing all measurements of a day to be tested in order, V test = {Y 1,..., Y n } where N is the number of measurements per day. 2. For all days in a training dataset, create vectors in the same way, V i = {x i1, x in } (i = 1,...,T). 3. Create a set containing V test and all V is, and apply LOF to this set. 4. If LOF test < τ where LOF test is a score corresponding to V test, conclude V test is normal and exit. Select the value with the minimum consumption among the ones that their LOF score is less than τ. Reduce that value without raising the alarm and send it. Fomulas from Lof: Identifying density-based local outliers by Breunig, M., Kriegel, H.-P., Ng, R.T., Sander, J. 13

EXTRA: COF (Connectivity-based Outlier Factor) It is a variation of LOF, created because of the excessive complexity (N 2 ) LOF creates a neighbourhood based on the main given instance COF creates the neighbourhood in an aggregative way from the main given instance Image adapted from Anomaly Detection: A Survey by Varun Chandola, Arindam Banerjee and Vipin Kumar 14

Results In the paper, they use real meter-reading data measured by an electric utility company during 6 months. The meter reading consisted of 108 customers with a mix of residential and commercial customers and recorded every 15 minutes. 15

Adversarial Evaluation: Cost of Undetected Attacks Image adapted from Evaluating Electricity Theft Detectors in Smart Grid Networks by Daisuke Mashima and Alvaro A. Cárdenas 16

Monetary Loss Caused by Different Customers Average customers High consumption Customers Image adapted from Evaluating Electricity Theft Detectors in Smart Grid Networks by Daisuke Mashima and Alvaro A. Cárdenas 17

Adversarial Learning: Detecting Contaminated Datasets Image adapted from Evaluating Electricity Theft Detectors in Smart Grid Networks by Daisuke Mashima and Alvaro A. Cárdenas 18

Adversarial learning: Types Of Contamination Image adapted from Evaluating Electricity Theft Detectors in Smart Grid Networks by Daisuke Mashima and Alvaro A. Cárdenas 19

Discussion Cross-Correlation among customers Attackers should exhibit different trends compared to similar honest customers LOF to identify outliers Auto-Correlation in ARMA-GLR The residuals of generated attacks have high auto-correlation Durbin-Watson statistics then can detect attacks against ARMA-GLR Energy Efficiency Customers can add green-energy technology to the system, such as, solar panels Company should know about this information to set the classifiers properly Not general classifiers Depends on consumers lifestyle (changes from countries, areas, etc) 20

Conclusion These algorithms will perform much better under average cases where the attacker does not know the algorithm or time intervals we use for anomaly detection For companies, they need to combine all the information to consider their network and accurate the electricity-theft reports The proposed anomaly detector will only output indicators of an attack 21