Introduction to Side Channel Analysis. Elisabeth Oswald University of Bristol


Outline
- Part 1: SCA overview & leakage
- Part 2: SCA attacks & exploiting leakage
and very briefly
- Part 3: Countermeasures
- Part 4: Current research
Please do ask questions as and when they arise.

Types of attacks
- Target: device (smart card, phone, ...) running crypto (AES, RSA-OAEP, ECC, ...) as an implementation of the algorithm (software + hardware)
- Measure some physical characteristics such as power consumption, electromagnetic emanation, timing
- Dismantle
- Induce errors (laser, voltage, electromagnetic pulse, ...)

Types of attacks in a nutshell
- Non-invasive attacks: device attacked as is, only accessible interfaces exploited, relatively inexpensive
- Semi-invasive attacks: device is depackaged but no direct electrical contact is made to the chip surface, more expensive
- Invasive attacks: no limits on what is done with the device
Passive attacks (the device is operated largely or even entirely within its specification):
- Non-invasive: side-channel attacks (timing attacks, power + EM attacks, cache trace)
- Semi-invasive: read out memory of device without probing or using the normal read-out circuits
- Invasive: probing depackaged devices but only observing data signals
Active attacks (the device, its inputs, and/or its environment are manipulated in order to make the device behave abnormally):
- Non-invasive: insert faults in device without depackaging (clock glitches, power glitches, or changing the temperature)
- Semi-invasive: induce faults in depackaged devices with e.g. X-rays, electromagnetic fields, or light
- Invasive: depackaged devices are manipulated by probing, laser beams, focused ion beams

Security is hard in practice!
- Crypto devices ought to protect keys from being revealed/extracted
- Cryptographers have been very good at proving algorithms secure in theory
- Engineers have learned how to deal with the practical mess
- Real life often gets in the way
  - Limited computing power
  - Limited memory
  - Limited time

Part 1: SCA overview & leakage (focus on power analysis)

Power Analysis Attacks
- The power consumption of a cryptographic device depends on the instructions it executes and the data it processes.
[Figure: CMOS inverter between V_dd and GND with input a and output q; labels: Data, Power consumption]

(Simple) Power analysis
- Use snapshot of power consumption
- Single or few power traces
- Analyse patterns within one trace
- Patterns correspond to secret key

Differential power analysis
- Attacker requires many power traces
- Fixed key, varying data
- Analyse patterns/differences across different traces (but at the same point in time)
[Figure: three power traces over time t, recorded with the same key (K=0) and different data (D=1, D=2, D=3)]

SPA and DPA exploit leakage
- Global leakage per time index: power analysis, EM analysis
- Local leakage per time index: EM analysis, timing derived via EM
- Global leakage: timing, cache trace, timing derived via power consumption
The amount of leakage does depend on the side channel!

Does leakage behaviour change?
Of course: it depends on e.g. power consumption, which in turn depends on parameters such as supply voltage and clock frequency, but also on which parts of a device are accessed, the scheduling of processes, etc.

How can we measure leakage?
- Theoretical issues
  - Average entropy vs. min entropy
  - Univariate vs. multivariate
- Practical issues
  - Which configuration of device? (parallel processing, pipelining, interrupts, etc.)
  - Multivariate: points of interest?
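The difference between average (Shannon) entropy and min-entropy can be made concrete with a small sketch. The distribution below is a made-up example, not measured data, and the function names are only illustrative.

```python
import math

def shannon_entropy(probs):
    """Average (Shannon) entropy in bits: -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def min_entropy(probs):
    """Min-entropy in bits: -log2 of the most likely outcome."""
    return -math.log2(max(probs))

# Hypothetical distribution over four key candidates after observing leakage.
posterior = [0.5, 0.25, 0.125, 0.125]

print("average entropy:", shannon_entropy(posterior))
print("min-entropy:    ", min_entropy(posterior))
```

Min-entropy only looks at the single best guess, so it is never larger than the average entropy; it is the more pessimistic of the two measures.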

Part 2: SCA attacks & exploiting leakage

SCA attack types
Disclaimer: there are no universally accepted definitions/descriptions for any of the concepts I will mention.
- SPA, template-based SPA, collision attacks
- DPA, template-based DPA, univariate, multivariate

SPA-type attacks
- SPA attacks exploit key-dependent differences that occur within a trace.
- Use only very few power traces.
- Some SPA attacks can be extremely successful
  - Unprotected multiplications or scalar multiplications (ECC) can be trivially broken
  - Simple timing analysis can be shockingly successful
  - Just think of all the PIN/password comparisons implemented efficiently (i.e. check item by item and stop as soon as a mismatch is detected)
- Obstacles in practice
  - Often need to know how the implementation works
  - Profiling required for template-based SPA attacks
[Figure: power traces for k = 31 and k = 21, and their difference]
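The PIN comparison remark can be illustrated with a toy model. This is a sketch under assumptions: the number of loop iterations stands in for measured execution time, and `SECRET_PIN`, `early_exit_compare` and `recover_pin` are hypothetical names, not part of any real product.

```python
SECRET_PIN = "4927"  # hypothetical secret held by the device

def early_exit_compare(secret, guess):
    """Compare item by item and stop at the first mismatch.
    Returns (match, steps); steps is a stand-in for execution time."""
    steps = 0
    for s, g in zip(secret, guess):
        steps += 1
        if s != g:
            return False, steps
    return len(secret) == len(guess), steps

def oracle(guess):
    """What the attacker observes: the result and the 'time' taken."""
    return early_exit_compare(SECRET_PIN, guess)

def recover_pin(oracle, length, alphabet="0123456789"):
    """Recover the PIN one position at a time from the step count."""
    known = ""
    for pos in range(length):
        padding = alphabet[0] * (length - pos - 1)
        for candidate in alphabet:
            match, steps = oracle(known + candidate + padding)
            # A correct digit lets the comparison run past position `pos`.
            if match or steps > pos + 1:
                known += candidate
                break
    return known

print(recover_pin(oracle, len(SECRET_PIN)))
```

The attack needs at most 10 guesses per digit instead of up to 10^4 guesses for the whole PIN. In Python, `hmac.compare_digest` is the standard-library comparison designed to avoid this content-dependent early exit.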

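DPA-type attacks
[Figure: block diagram. The same data is fed to the device under attack (which holds the key) and to a model of the device under attack (which takes a key hypothesis). The device yields the real power consumption, the model the hypothetical power consumption. A statistical analysis compares the two and leads to a decision about the key hypothesis.]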

DPA: Measuring real power consumption (1/5)
- Setup: cryptographic device (device under attack), measurement circuit/probe, oscilloscope/PC
- Challenge is not to induce too much noise
- Maybe more art than science

DPA: key hypothesis (2/5)
- Key guess in model is typically small
- Example: AES
  - Mixes key with message bit-wise
  - Uses key byte-wise
[Figure: one AES round transforming the 4x4 byte State into State'. SubBytes applies the S-box to each byte $s_{i,j}$; ShiftRows rotates row $r$ left by $r$ bytes; MixColumns multiplies each column by $\{03\}x^3 + \{01\}x^2 + \{01\}x + \{02\}$ mod $(x^4 + 1)$; AddRoundKey computes $s'_{i,j} = s_{i,j} \oplus k_{i,j}$.]

DPA: model (3/5)
Model of device:
- Implement cryptographic algorithm (similar architecture if possible)
- Calculate intermediate value using key guess
- Map intermediate value to hypothetical power consumption value
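A minimal sketch of such a model, assuming the targeted intermediate value is the first-round AES S-box output and that the device leaks the Hamming weight (HW) of that byte. Both choices are common illustrations, not something fixed by the steps above; the helper names are hypothetical.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def rotl8(x, n):
    """Rotate an 8-bit value left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF

def aes_sbox(x):
    """AES S-box: multiplicative inverse followed by the affine map."""
    inv = 0
    if x:
        inv = 1
        for _ in range(254):  # x^254 is the inverse of x in GF(2^8)
            inv = gf_mul(inv, x)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63

SBOX = [aes_sbox(x) for x in range(256)]

def hamming_weight(v):
    return bin(v).count("1")

def hypothetical_power(plaintext_byte, key_guess):
    """Intermediate value from the key guess, mapped to a power value."""
    intermediate = SBOX[plaintext_byte ^ key_guess]
    return hamming_weight(intermediate)

print(hypothetical_power(0x53, 0x00))
```

Because AES uses the key byte-wise, only 256 key guesses per byte need to be modelled.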

DPA: Three popular statistical tests (4/5)
- Correlation analysis using Pearson's correlation coefficient: $\arg\max_{s^*} \rho(L, M^{s^*})$
- Bayesian analysis using a normal distribution to determine the probability: $\arg\max_{s^*} p(M^{s^*} \mid L)$
- Distance of means test: $\arg\max_{s^*} \left( E(L \mid M^{s^*} = 0) - E(L \mid M^{s^*} = 1) \right)$
Here $M$ is the hypothetical leakage, determined from the model, and $L$ is the physical leakage, measured from the device.

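DPA: evaluation/key ranking (5/5)
[Figure: the five steps of a DPA attack. (1) Data inputs $d_1, \dots, d_q$ and key hypotheses $k_1, \dots, k_K$ are fed to the algorithm. (2) Measured traces $l_{i,t}$ form a $q \times T$ matrix. (3) The algorithm yields hypothetical intermediate values $V_{i,j}$, a $q \times K$ matrix. (4) The power model maps them to hypothetical power values $m_{i,j}$, a $q \times K$ matrix. (5) Statistics compare the hypothetical power values with the traces and yield results $r_{j,t}$, a $K \times T$ matrix, from which the key hypotheses are ranked.]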
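The five steps can be sketched end to end on simulated data. Everything here is an assumption made for illustration: a 4-bit toy target using the PRESENT S-box instead of AES, one time sample per trace (T = 1), Hamming-weight leakage plus Gaussian noise, and correlation as the statistic.

```python
import random

# 4-bit PRESENT S-box, used only to keep the toy example small.
SBOX4 = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
         0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def hw(v):
    return bin(v).count("1")

def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if one input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def simulate_traces(key, q, noise_sd, rng):
    """Step 2: 'measure' q traces (simulated HW leakage plus noise)."""
    data = [rng.randrange(16) for _ in range(q)]
    leak = [hw(SBOX4[d ^ key]) + rng.gauss(0, noise_sd) for d in data]
    return data, leak

def rank_keys(data, leak):
    """Steps 3 to 5: intermediate values, power model, statistics."""
    scores = {}
    for k in range(16):                       # key hypotheses k_1..k_K
        values = [SBOX4[d ^ k] for d in data]  # step 3: V
        model = [hw(v) for v in values]        # step 4: m
        scores[k] = pearson(model, leak)       # step 5: r
    return sorted(scores, key=scores.get, reverse=True)

rng = random.Random(2024)
TRUE_KEY = 0xB  # hypothetical secret nibble
data, leak = simulate_traces(TRUE_KEY, q=500, noise_sd=1.0, rng=rng)
ranking = rank_keys(data, leak)
print("best key hypothesis:", hex(ranking[0]))
```

The position of the correct key in `ranking` is what a key-ranking evaluation reports.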

Effectiveness of DPA attacks: using ρ
- We know the relationship between ρ and n: $n = 3 + 8 \frac{z_{1-\alpha/2}^2}{\ln^2 \frac{1+\rho}{1-\rho}}$
- A simple device (power model) allows the attacker to determine $\rho = \rho_{ck,ct}$ via simulation or computation
- Previous slides: 8-bit microcontroller showing HW leakage, bit model: $\rho_{ck,ct} = \rho(m_{ck}, l_{ct}) = \rho(\mathrm{lsb}(v_{ck}), \mathrm{HW}(v_{ck})) = 0.35$, so $n \approx 220$ traces
- Disadvantage: works in specific scenarios only
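The value ρ = 0.35 and the resulting number of traces can be checked numerically. This is a sketch assuming noise-free Hamming-weight leakage of a uniformly distributed byte, the rule of thumb read as n = 3 + 8 z² / ln²((1+ρ)/(1−ρ)) with z the (1 − α/2) quantile of the standard normal distribution, and a confidence parameter α = 0.0001, which is an assumed value.

```python
import math
from statistics import NormalDist

def hw(v):
    return bin(v).count("1")

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Bit model (lsb) against Hamming-weight leakage, over all byte values.
values = list(range(256))
rho = pearson([v & 1 for v in values], [hw(v) for v in values])

def traces_needed(rho, alpha):
    """Rule-of-thumb number of traces for a given correlation."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 3 + 8 * z ** 2 / math.log((1 + rho) / (1 - rho)) ** 2

n = traces_needed(rho, alpha=1e-4)  # alpha is an assumed confidence level
print(round(rho, 4), round(n))
```

The computed ρ equals 1/√8 ≈ 0.35, and with this assumed α the estimate lands in the low 200s, the same order as the figure quoted above.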

Effectiveness of DPA attacks: success rate (SR)
- Ranking of key hypotheses
- $\mathrm{Succ}_A(q) = sr$: the correct key is ranked top in a fraction $sr$ of the attack runs using $q$ queries
- Generic measure
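A success rate can be estimated empirically by repeating an attack many times. The sketch below assumes a simulated 4-bit target (PRESENT S-box), Hamming-weight leakage with Gaussian noise, and a correlation attack; the number of runs and the noise level are arbitrary choices.

```python
import random

SBOX4 = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
         0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
HW_SBOX = [bin(s).count("1") for s in SBOX4]  # HW of each S-box output

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def attack_succeeds(q, noise_sd, rng):
    """One attack run with q queries; True if the correct key ranks top."""
    key = rng.randrange(16)
    data = [rng.randrange(16) for _ in range(q)]
    leak = [HW_SBOX[d ^ key] + rng.gauss(0, noise_sd) for d in data]
    best = max(range(16),
               key=lambda k: pearson([HW_SBOX[d ^ k] for d in data], leak))
    return best == key

def success_rate(q, runs=100, noise_sd=2.0, seed=1):
    """Fraction of attack runs in which the correct key is ranked top."""
    rng = random.Random(seed)
    return sum(attack_succeeds(q, noise_sd, rng) for _ in range(runs)) / runs

sr = {q: success_rate(q) for q in (5, 20, 80, 200)}
for q, value in sr.items():
    print(q, value)
```

The measure is generic in the sense that `attack_succeeds` could wrap any attack; only the ranking of the correct key matters.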

Does it matter which statistics to use?
All 3 previously mentioned statistics are equally effective (in standard DPA attacks using mean-free data).
- Correlation: $\arg\max_{s^*} \rho(L, M^{s^*}) = \arg\max_{s^*} \frac{E(L \cdot M^{s^*})}{\sqrt{E((M^{s^*})^2) - E(M^{s^*})^2}}$
- Bayes: $\arg\max_{s^*} p(M^{s^*} \mid L) = \arg\max_{s^*} \frac{E(L \cdot M^{s^*})}{\sqrt{E((M^{s^*})^2)}}$

Relationship between information and correlation
- Assuming we have Gaussian leakages and models which are close enough to Gaussian, there is a direct relationship between correlation (i.e. how well an attack works) and leakage
- In all other cases this relationship is NOT that simple

Experiments: Entropy (3/3) Practical evidence for Thm.4: Holds even for HW and Binary power models!

Other attacks: e.g. multivariate stuff, aka template attacks
- Characterisation phase: determine interesting points, build templates. A template consists of the pair (m, C), a mean vector and a covariance matrix, that defines a multivariate normal distribution.
- Analysis phase: match the templates to the given trace(s). The template that fits best indicates the correct key.
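The two phases can be sketched on simulated data. Loud assumptions: a 4-bit toy target (PRESENT S-box), two points of interest per trace, and a diagonal covariance matrix (independent noise at the two points), which simplifies the full multivariate normal template. All names are illustrative.

```python
import math
import random

SBOX4 = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
         0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def hw(v):
    return bin(v).count("1")

def leak(v, rng, sd=1.0):
    """Simulated trace at two points of interest (assumed leakage)."""
    return (hw(v) + rng.gauss(0, sd), 0.5 * hw(v) + rng.gauss(0, sd))

def build_templates(rng, per_value=300):
    """Characterisation phase: mean and variance per value and point."""
    templates = {}
    for v in range(16):
        traces = [leak(v, rng) for _ in range(per_value)]
        means = [sum(t[i] for t in traces) / per_value for i in range(2)]
        variances = [sum((t[i] - means[i]) ** 2 for t in traces) / (per_value - 1)
                     for i in range(2)]
        templates[v] = (means, variances)
    return templates

def log_likelihood(trace, template):
    """Gaussian log-likelihood with diagonal covariance (simplification)."""
    means, variances = template
    return sum(-0.5 * math.log(2 * math.pi * s2) - (x - m) ** 2 / (2 * s2)
               for x, m, s2 in zip(trace, means, variances))

def template_attack(templates, data, traces):
    """Analysis phase: the key whose templates fit the traces best."""
    scores = [sum(log_likelihood(t, templates[SBOX4[d ^ k]])
                  for d, t in zip(data, traces))
              for k in range(16)]
    return max(range(16), key=scores.__getitem__)

rng = random.Random(5)
templates = build_templates(rng)          # on a device under our control
TRUE_KEY = 0x3                            # hypothetical secret nibble
data = [rng.randrange(16) for _ in range(200)]
traces = [leak(SBOX4[d ^ TRUE_KEY], rng) for d in data]
recovered = template_attack(templates, data, traces)
print(hex(recovered))
```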

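Other attacks: e.g. second-order DPA (on masked implementations)
- Pre-processing prepares traces for the DPA step.
- Targeted intermediate values $(P \oplus K) \oplus M$ and $S(P \oplus K) \oplus M$ occur in
  - different clock cycles
  - the same clock cycle
  - the same clock cycle, but the power consumption characteristic allows exploiting the leakage directly
- Pre-processing: $|L(S(P \oplus K) \oplus M) - L(P \oplus K \oplus M)|$
- Then DPA attack on the pre-processed traces using a suitable hypothesis: $\mathrm{HW}(S(P \oplus K) \oplus P \oplus K)$
[Figure: S-box mapping $(P \oplus K) \oplus M$ to $S(P \oplus K) \oplus M$]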
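A simulated second-order attack along these lines, as a sketch under assumptions: a 4-bit toy target (PRESENT S-box), noise-free Hamming-weight leakage of the masked S-box input and output in two different clock cycles, and the absolute difference as pre-processing.

```python
import random

SBOX4 = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
         0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def hw(v):
    return bin(v).count("1")

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def masked_trace(p, key, rng):
    """Two leakage points of a masked S-box look-up (simulated)."""
    m = rng.randrange(16)              # fresh mask, unknown to the attacker
    l_in = hw((p ^ key) ^ m)           # leakage of (P xor K) xor M
    l_out = hw(SBOX4[p ^ key] ^ m)     # leakage of S(P xor K) xor M
    return l_in, l_out

# Hypothesis table: HW(S(x) xor x) for x = P xor K.
HYP = [hw(SBOX4[x] ^ x) for x in range(16)]

rng = random.Random(7)
TRUE_KEY = 0x6  # hypothetical secret nibble
data = [rng.randrange(16) for _ in range(20000)]
traces = [masked_trace(p, TRUE_KEY, rng) for p in data]

# Pre-processing: combine the two points so that the mask cancels out.
pre = [abs(l_out - l_in) for l_in, l_out in traces]

def score(k):
    return pearson([HYP[p ^ k] for p in data], pre)

best = max(range(16), key=score)
print(hex(best), round(score(best), 3))
```

The correlation is much weaker than in an unmasked attack, which is why far more traces are used here than in a first-order example.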

Part 3: Countermeasures

Countermeasures
- Intermediate values correspond to values processed in the device
- Power consumption of the device can be related to processed values
[Figure: chain "intermediate values as predicted by attacker" -> "intermediate values as processed by device" -> "power consumption of device". Masking breaks the first link, hiding breaks the second.]
Goal of any countermeasure: make power consumption independent of intermediate values!

Masking: concealing intermediate values by random values
- Each intermediate value v is concealed by a random value m, which is called the mask
  - m is generated at random and independent of v
  - m is not known to the attacker
  - m is generated anew for each new encryption run
  - $v_m = v \oplus m$
- Power consumption characteristics of the device are not changed!
- Can be used to protect existing devices, logic styles
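Why a fresh uniform mask conceals v can be checked exhaustively. The sketch assumes Boolean (XOR) masking of a byte and Hamming-weight leakage; both are illustrative assumptions.

```python
def hw(v):
    return bin(v).count("1")

def mean_masked_leakage(v):
    """Average HW of v XOR m over all 256 equally likely masks m."""
    return sum(hw(v ^ m) for m in range(256)) / 256

unmasked = {hw(v) for v in range(256)}                 # depends on v
masked = {mean_masked_leakage(v) for v in range(256)}  # same for every v

print(sorted(unmasked))
print(masked)
```

The unmasked Hamming weight ranges over 0 to 8, while the average leakage of the masked value is identical for every v, so a first-order attack on the masked value alone has nothing to correlate with.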

Hiding: modifying relationship between interm. values and power consumption
- Power consumption of the device is independent of processed data if
  - the device consumes random amounts of power in each clock cycle, or
  - the device consumes equal amounts of power in each clock cycle
- Power consumption characteristics of the device are changed
- Same intermediate values are processed
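One way to picture "equal amounts of power in each clock cycle" is a dual-rail encoding, where every bit is stored together with its complement. This is only one illustrative hiding technique, chosen here as an assumption, and Hamming weight again stands in for power consumption.

```python
def hw(v):
    return bin(v).count("1")

def dual_rail(byte):
    """Encode each bit b as the pair (b, not b): 8 data bits on 16 wires."""
    encoded = 0
    for i in range(8):
        b = (byte >> i) & 1
        encoded |= (b << (2 * i)) | ((b ^ 1) << (2 * i + 1))
    return encoded

# The same intermediate values are processed, but the modelled
# power consumption no longer depends on them.
weights = {hw(dual_rail(v)) for v in range(256)}
print(weights)
```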

Countermeasures: Protocol level
- First patents by Kocher: key update mechanism
- First academic contribution: PhD thesis of Borst
- Today: plenty of works that define leakage-resilient schemes based on the key update idea
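The key update idea can be sketched with a hash chain: both parties replace the key with a one-way function of itself after every use, so an attacker sees only a few traces per key. This is a generic illustration, not the patented mechanism or any specific leakage-resilient scheme; the label `b"key-update"` and the function names are hypothetical.

```python
import hashlib

def update_key(key: bytes) -> bytes:
    """Derive the next session key with a one-way function (SHA-256 here)."""
    return hashlib.sha256(b"key-update" + key).digest()

def session_keys(initial_key: bytes, count: int) -> list:
    """Keys used for the first `count` transactions."""
    keys, key = [], initial_key
    for _ in range(count):
        key = update_key(key)
        keys.append(key)
    return keys

shared_secret = bytes(32)  # placeholder initial key, for illustration only
device_keys = session_keys(shared_secret, 5)
server_keys = session_keys(shared_secret, 5)

print(device_keys == server_keys, len(set(device_keys)))
```

Both sides stay in sync as long as they count transactions consistently, and each key is used only once.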

Part 4: Current research

Current research
- Practitioner community: seems to be caught up with details -> find variation of attack X on implementation Y
  - Does not look at the system holistically
- Theory community: seems to focus only on high-level protocols -> can prove scheme X secure in model Y
  - Does not pay any attention to practice (i.e. does model Y make sense, can scheme X be implemented?)

Current research
Fundamental questions still unanswered:
- How do we measure leakage (univariate, multivariate, configuration of device, statistical method, etc.)?
- How does leakage translate into SR of attacks?
- How can high-level ideas be mapped onto secure implementations (SCA-aware compilers, design flow)?

Want more?
- Check out on IACR ePrint: One for all, Leakage resilient cryptography
- The DPA book: visit www.dpabook.org
- OpenSCA toolbox: follow links from http://www.cs.bris.ac.uk/home/eoswald/opensca.html