HPC and High-end Data Science

Similar documents
Parallel Asynchronous Hybrid Krylov Methods for Minimization of Energy Consumption. Langshi CHEN 1,2,3 Supervised by Serge PETITON 2

Role of Synchronized Measurements In Operation of Smart Grids

c 2017 Society for Industrial and Applied Mathematics

Operational modal analysis using forced excitation and input-output autoregressive coefficients

State Estimation of Linear and Nonlinear Dynamic Systems

Estimation of electromechanical modes in power systems using system identification techniques

Model Order Selection for Probing-based Power System Mode Estimation

Parallel Singular Value Decomposition. Jiaxing Tan

Energy Consumption Evaluation for Krylov Methods on a Cluster of GPU Accelerators

Distributed Real-Time Electric Power Grid Event Detection and Dynamic Characterization

Chapter 4 Systems of Linear Equations; Matrices

State Estimation and Power Flow Analysis of Power Systems

Principal Component Analysis

AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences)

Mode Identifiability of a Multi-Span Cable-Stayed Bridge Utilizing Stochastic Subspace Identification

PCA, Kernel PCA, ICA

Applications of Randomized Methods for Decomposing and Simulating from Large Covariance Matrices

A Cross-Associative Neural Network for SVD of Nonsquared Data Matrix in Signal Processing

Deriving Principal Component Analysis (PCA)

A new truncation strategy for the higher-order singular value decomposition

1 Singular Value Decomposition and Principal Component

ECEN 667 Power System Stability Lecture 20: Oscillations, Small Signal Stability Analysis

VARIANCE COMPUTATION OF MODAL PARAMETER ES- TIMATES FROM UPC SUBSPACE IDENTIFICATION

Robust extraction of specific signals with temporal structure

System 1 (last lecture) : limited to rigidly structured shapes. System 2 : recognition of a class of varying shapes. Need to:

Smart Grid State Estimation by Weighted Least Square Estimation

Performance Evaluation and Review of System Protection Scheme Design with the help of Synchrophasor Measurements

Processing Big Data Matrix Sketching

DESIGN AND IMPLEMENTATION OF SENSORLESS SPEED CONTROL FOR INDUCTION MOTOR DRIVE USING AN OPTIMIZED EXTENDED KALMAN FILTER

ECEN 667 Power System Stability Lecture 23:Measurement Based Modal Analysis, FFT

A Constraint Generation Approach to Learning Stable Linear Dynamical Systems

Applications of GIS in Electrical Power System. Dr. Baqer AL-Ramadan Abdulrahman Al-Sakkaf

Scikit-learn. scikit. Machine learning for the small and the many Gaël Varoquaux. machine learning in Python

Power Grid State Estimation after a Cyber-Physical Attack under the AC Power Flow Model

ECS231: Spectral Partitioning. Based on Berkeley s CS267 lecture on graph partition

MULTIPLE EXPOSURES IN LARGE SURVEYS

Reducing The Data Transmission in Wireless Sensor Networks Using The Principal Component Analysis

DEVELOPMENT AND USE OF OPERATIONAL MODAL ANALYSIS

MODAL IDENTIFICATION AND DAMAGE DETECTION ON A CONCRETE HIGHWAY BRIDGE BY FREQUENCY DOMAIN DECOMPOSITION

WEATHER DEPENENT ELECTRICITY MARKET FORECASTING WITH NEURAL NETWORKS, WAVELET AND DATA MINING TECHNIQUES. Z.Y. Dong X. Li Z. Xu K. L.

Optimal PMU Placement

Linearized Alternating Direction Method: Two Blocks and Multiple Blocks. Zhouchen Lin 林宙辰北京大学

Efficient and Accurate Rectangular Window Subspace Tracking

Lecture: Face Recognition and Feature Reduction

DESIGNING A KALMAN FILTER WHEN NO NOISE COVARIANCE INFORMATION IS AVAILABLE. Robert Bos,1 Xavier Bombois Paul M. J. Van den Hof

The Singular Value Decomposition (SVD) and Principal Component Analysis (PCA)

Dynamic Decomposition for Monitoring and Decision Making in Electric Power Systems

Real-Time Wide-Area Dynamic Monitoring Using Characteristic Ellipsoid Method

STA141C: Big Data & High Performance Statistical Computing

A New Subspace Identification Method for Open and Closed Loop Data

Dynamic state based on autoregressive model for outgoing line of high voltage stations

A Novel Approach for Solving the Power Flow Equations

SINGLE DEGREE OF FREEDOM SYSTEM IDENTIFICATION USING LEAST SQUARES, SUBSPACE AND ERA-OKID IDENTIFICATION ALGORITHMS

Background Mathematics (2/2) 1. David Barber

Lecture: Face Recognition and Feature Reduction

Modal Testing and System identification of a three story steel frame ArdalanSabamehr 1, Ashutosh Bagchi 2, Lucia Trica 3

PMU-Based Power System Real-Time Stability Monitoring. Chen-Ching Liu Boeing Distinguished Professor Director, ESI Center

WHEN studying distributed simulations of power systems,

Structure in Data. A major objective in data analysis is to identify interesting features or structure in the data.

Assignment #10: Diagonalization of Symmetric Matrices, Quadratic Forms, Optimization, Singular Value Decomposition. Name:

USING SINGULAR VALUE DECOMPOSITION (SVD) AS A SOLUTION FOR SEARCH RESULT CLUSTERING

Updating the Centroid Decomposition with Applications in LSI

Numerical Linear Algebra

On Moving Average Parameter Estimation

STATE AND OUTPUT FEEDBACK CONTROL IN MODEL-BASED NETWORKED CONTROL SYSTEMS

Jacobi-Based Eigenvalue Solver on GPU. Lung-Sheng Chien, NVIDIA

Stabilizing Hybrid Data Traffics in Cyber Physical Systems with Case Study on Smart Grid

Identification of modal parameters from ambient vibration data using eigensystem realization algorithm with correlation technique

Parallel Computation of the Eigenstructure of Toeplitz-plus-Hankel matrices on Multicomputers

Extracting inter-area oscillation modes using local measurements and data-driven stochastic subspace technique

A Modular NMF Matching Algorithm for Radiation Spectra

Postgraduate Course Signal Processing for Big Data (MSc)

Sample questions for Fundamentals of Machine Learning 2018

Normalized power iterations for the computation of SVD

Big Data Analytics: Optimization and Randomization

Least costly probing signal design for power system mode estimation

arxiv:cs/ v1 [cs.ms] 13 Aug 2004

STATE ESTIMATION IN DISTRIBUTION SYSTEMS

The optimal filtering of a class of dynamic multiscale systems

PREDICTING SOLAR GENERATION FROM WEATHER FORECASTS. Chenlin Wu Yuhan Lou

Quantum Recommendation Systems

WECC Dynamic Probing Tests: Purpose and Results

Eigenvalue Problems Computation and Applications

Bearing fault diagnosis based on EMD-KPCA and ELM

Dimensionality Reduction

UNIT 6: The singular value decomposition.

Out-of-Core SVD and QR Decompositions

BATCH-FORM SOLUTIONS TO OPTIMAL INPUT SIGNAL RECOVERY IN THE PRESENCE OF NOISES

Enhancing Multicore Reliability Through Wear Compensation in Online Assignment and Scheduling. Tam Chantem Electrical & Computer Engineering

Loadability Enhancement by Optimal Load Dispatch in Subtransmission Substations: A Genetic Algorithm

Matrix Computations: Direct Methods II. May 5, 2014 Lecture 11

Seminar Course 392N Spring2011. ee392n - Spring 2011 Stanford University. Intelligent Energy Systems 1

Data Mining and Matrices

Determination of stability limits - Angle stability -

STA141C: Big Data & High Performance Statistical Computing

System Identification by Nuclear Norm Minimization

Data Mining Lecture 4: Covariance, EVD, PCA & SVD

Collaborative Filtering Matrix Completion Alternating Least Squares

Linear Algebra & Geometry why is linear algebra useful in computer vision?

BLIND SOURCE SEPARATION TECHNIQUES ANOTHER WAY OF DOING OPERATIONAL MODAL ANALYSIS

Transcription:

HPC and High-end Data Science for the Power Grid Alex Pothen August 3, 2018

Outline High-end Data Science 1 3 Data Anonymization Contingency Analysis 2 4 Parallel Oscillation Monitoring 2 / 22

PMUs in the U.S. Phasor Measurement Units in the North American Grid as of March 2015 [1] 3 / 22 [1] NASPI website: https://www.naspi.org/documents.

Emerging Data Logistics 4 / 22

Challenges in High-end Data Analysis Computing is a major consumer of electricity, recent estimates about 10%. Data movement within a processor consumes more energy than arithmetic. Data movement from the instrument to the curation and archival sites, visualizations, researcher s work spaces, etc. represents a significant use of energy. Hence energy is an overarching challenge for sustainable High-end Data Analysis (HDA). Solutions include: Reduce computational costs by matching platforms to each stage of the scientific method Reduce data movement costs by collocation, compression and caching Reuse data through sharing, metadata, and catalogs 5 / 22

Challenges in High-end Data Analysis Data reduction is a fundamental pattern in the convergence of HPC and HDA. need new algorithms for loss-less and lossy compression of scientific data, and need algorithms that can work with the compressed data. New algorithms and software for HPC, data science, analytics, Machine learning, etc. are needed here. A major challenge is the convergence in the software ecosystem for HPC and HDA. 6 / 22

Challenges in High-end Data Analysis High volume data needs high performance computing Streaming data needs on-line algorithms Integration of heterogeneous data e.g.., consumption forecasting in smart grids: smart meters, utility databases, weather data, microgrid sensor networks, etc. Anonymization of consumer data 7 / 22

Outline High-end Data Science 1 3 Data Anonymization Contingency Analysis 2 4 Parallel Oscillation Monitoring 8 / 22

Contingency Analysis N x contingency analysis on power flow evaluates the stability of a power system by simulating the failures of x out of N transmission lines or generators (components). 2 7 8 9 3 gen-2 5 load 6 gen-3 load gen-1 load 4 1 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 9 / 22

Contingency Analysis N x contingency analysis on power flow evaluates the stability of a power system by simulating the failures of x out of N transmission lines or generators (components). 2 7 8 9 3 gen-2 5 load 6 gen-3 load gen-1 load 4 1 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 9 / 22

Contingency Analysis N x contingency analysis on power flow evaluates the stability of a power system by simulating the failures of x out of N transmission lines or generators (components). 2 7 8 9 3 gen-2 5 load 6 gen-3 load gen-1 load 4 1 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 9 / 22

Contingency Analysis Currently each N x contingency analysis is performed independently, but since the number of cases increases exponentially with x, this is computationally impractical. (For example, if N = 3120 and x = 5, there are in total 2.46 10 15 cases.) Heuristics are needed for choosing the cases to be analyzed. To be able to analyze more cases, we have to speed up the solution time for each case. We propose new algorithms for the problem by observing that only a principal submatrix of the system is changed when a component is removed upon failure. 10 / 22

Augmented Formulation The augmented matrix system can be expressed as: n{ A AH H m{ H A H ÂH x 1 b 0 x 2 = H b. m{ H 0 0 x 3 0 A is the original matrix; H is a submatrix of the identity matrix; Â = A HEH T updated matrix; m n; b is the new right-hand-side of the updated rows. 11 / 22

Runtimes Power Grid Contingency Analysis of 777,646 Nodes 10 1 time (second) 10 0 10 1 PARDISO LUSOL CHOLMOD augmented (Direct) augmented (GMRES) 10 2 1 2 3 4 5 6 7 8 9 1011121314151617181920 x (number of components removed) 12 / 22

Outline High-end Data Science 1 3 Data Anonymization Contingency Analysis 2 4 Parallel Oscillation Monitoring 13 / 22

Adaptive Anonymity Users Features F 1 F 2 F 3 F 4 F 5 F 6 U1 1 0 1 0 1 0 U2 1 1 1 1 1 0 U3 0 1 0 1 0 1 U4 0 0 0 0 0 1 U5 1 1 0 0 0 0 U6 1 1 0 0 0 1 14 / 22

Adaptive Anonymity Users Features F 1 F 2 F 3 F 4 F 5 F 6 U1 1 0 1 0 1 0 U2 1 1 1 1 1 0 U3 0 1 0 1 0 1 U4 0 0 0 0 0 1 U5 1 1 0 0 0 0 U6 1 1 0 0 0 1 Users Features F 1 F 2 F 3 F 4 F 5 F 6 U1 1 1 1 0 U2 1 1 1 0 U3 0 0 0 1 U4 0 0 0 1 U5 1 1 0 0 0 U6 1 1 0 0 0 15 / 22

Adaptive Anonymity with Approximation Algorithms Prob. Inst. Feat. b EdgeCover Time Util. Census1990 158K 68 9m 90% PokerHands 500K 95 2h 17m 84% CMS 1M 512 10h 33m 81% Earlier Algorithms Belief propagation: used exact b-matching algorithm, could solve problems with few thousand instances only. The approximate edge cover approach is two to three orders of magnitude faster on those smaller problems. 16 / 22

Adaptive Anonymity Computation on Cori 17 / 22

Outline High-end Data Science 1 3 Data Anonymization Contingency Analysis 2 4 Parallel Oscillation Monitoring 18 / 22

PMUs in the U.S. Stochastic Subspace Identification Covariance State Space Model xk 1 Axk wk, yk Cxk vk Past & Future Output Matrices y y1 y 0 J 1 y1 y2 y J Yp, y y y I I I J 1 2 y y y I I 1 I J 1 yi 1 yi 2 y I J Yf, y y y I I I J 1 2 2 2 2 Covariance Matrix T H Y Y O G, I f P I 2 I 1 T T, G Exkyk, O C CA CA CA Singular Value Decomposition (SVD) T H O G USV, O U S I I 1/2 1 1, Fast SVD Approaches State & Output Matrices Mode Identification A OI OI, C OI 1: l,:, A c = 1 log A, C T c = C, S Mode Freq. and Damping: eigenvalues of Aˆ c, s ModeShape: ˆ m C. i c i 19 / 22

PMUs in the U.S. Case 1 102 Voltage Phase Angles Computational Time Comparison Mac. No. Full SVD (sec) Strat. 1 sec (speedup) Randomized SVD h = 20, q = 1 Strat. 2 sec (speedup) Strat. 3 sec (speedup) Strat. 4 sec (speedup) Strat. 1 sec (speedup) Lanczos SVD k = 16 Strat. 2 sec (speedup) Strat. 3 sec (speedup) Strat. 4 sec (speedup) 1 112,075 105,826 (1.06x) 206.8 (542x) 135.4 (828x) 105.7 (1060x) 106,234 (1.05x) 81.2 (1380x) 100.2 (1119x) 96.0 (1167x) 2 125,807 121,007 (1.04x) 237.9 (529x) 174.0 (723x) 141.8 (887x) 115,458 (1.09x) 87.4 (1439x) 148.0 (850x) 122.0 (1031x) 3 138,236 135,304 (1.02x) 298.4 (463x) 230.3 (600x) 134.2 (1030x) 133,961 (1.03x) 101.9 (1357x) 177.0 (781x) 115.0 (1202x) 4 170,518 Theoretical speedup 168,617 (1.01x) 58x 304.4 (560x) 226.2 (754x) 140.0 (1218x) 163,028 (1.05x) 140x 104.8 (1627x) 158.0 (1079x) 138.3 (1233x) 20 / 22

References HDA and HPC M. Asch, I. Moore, R. Badia, J. Dongarra et al, Big Data and Extreme-Scale Computing: Pathways to Convergence, Internat. J. High Performance Computing Applications, 2018, 32(4), pp. 435-479. Data Anonymization A. Khan, K. Choromanski, A. Pothen, SM Ferdous, M. Halappanavar and A. Tumeo, Adaptive Anonymization of Data using b-edge cover, ACM/IEEE Proc. Supercomputing, Nov. 2018. Data Reduction for Oscillation Monitoring T. Wu, V. Venkatasubramanian, and A. Pothen, Fast parallel stochastic subspace algorithms for large-scale ambient oscillation monitoring, IEEE Trans. Smart Grid, 8(3), pp. 1494-1503, 2017. 21 / 22

References Contingency Analysis Y. Yeung, A. Pothen, M. Halappanavar and Z. Huang, AMPS: An augmented matrix formulation for principal submatrix updates with application to power grids, SIAM J. Sci. Comput., 39 (3), S809-S827, 2017. 22 / 22