Vulnerability Extrapolation Give me more bugs like that Blackhat Briefings Fabian fabs Yamaguchi Recurity Labs GmbH, Germany

Similar documents
Class Diagrams. CSC 440/540: Software Engineering Slide #1

LU N C H IN C LU D E D

c. What is the average rate of change of f on the interval [, ]? Answer: d. What is a local minimum value of f? Answer: 5 e. On what interval(s) is f

Information System Desig

Grain Reserves, Volatility and the WTO

600 Billy Smith Road, Athens, VT

Pr[X = s Y = t] = Pr[X = s] Pr[Y = t]

Capacitor Discharge called CD welding

Lan Performance LAB Ethernet : CSMA/CD TOKEN RING: TOKEN

Växjö University. Software Security Testing. A Flexible Architecture for Security Testing. School of Mathematics and System Engineering

Final Review Sheet. B = (1, 1 + 3x, 1 + x 2 ) then 2 + 3x + 6x 2

Math 110, Spring 2015: Midterm Solutions

Machine Learning (Spring 2012) Principal Component Analysis

Math 3361-Modern Algebra Lecture 08 9/26/ Cardinality

A L A BA M A L A W R E V IE W

ANNUAL MONITORING REPORT 2000

Isomorphisms and Well-definedness

Quadratic Equations Part I

Lab 4. Current, Voltage, and the Circuit Construction Kit

STEEL PIPE NIPPLE BLACK AND GALVANIZED

Robust Programs with Filtered Iterators

1 General Tips and Guidelines for Proofs

Operation Manual for Automatic Leveling Systems

MySQL 5.1. Past, Present and Future. Jan Kneschke MySQL AB

MOLINA HEALTHCARE, INC. (Exact name of registrant as specified in its charter)

AGRICULTURE SYLLABUS

TECHNICAL MANUAL OPTIMA PT/ST/VS

MITOCW ocw-18_02-f07-lec02_220k

Principal Component Analysis

UNITED STATES SECURITIES AND EXCHANGE COMMISSION Washington, D.C FORM 8-K

TTM TECHNOLOGIES, INC. (Exact Name of Registrant as Specified in Charter)

What Every Programmer Should Know About Floating-Point Arithmetic DRAFT. Last updated: November 3, Abstract

Unsupervised Machine Learning and Data Mining. DS 5230 / DS Fall Lecture 7. Jan-Willem van de Meent

Lecture 4: Constructing the Integers, Rationals and Reals

7.2 P rodu c t L oad/u nload Sy stem s

Lecture 27: Theory of Computation. Marvin Zhang 08/08/2016

Software Architecture. CSC 440: Software Engineering Slide #1

UNITED STATES SECURITIES AND EXCHANGE COMMISSION Washington, D.C Form 8-K/A (Amendment No. 2)

DIFFERENTIAL EQUATIONS

, p 1 < p 2 < < p l primes.

Classification. The goal: map from input X to a label Y. Y has a discrete set of possible values. We focused on binary Y (values 0 or 1).

Functional pottery [slide]

gender mains treaming in Polis h practice

Automata Theory CS S-12 Turing Machine Modifications

Effective Entropy for Memory Randomization Defenses

IV. Matrix Approximation using Least-Squares

, (1) e i = ˆσ 1 h ii. c 2016, Jeffrey S. Simonoff 1

Math 308 Midterm Answers and Comments July 18, Part A. Short answer questions

COMS 4721: Machine Learning for Data Science Lecture 10, 2/21/2017

Exercises * on Principal Component Analysis

CHAPTER 8: MATRICES and DETERMINANTS

CprE 281: Digital Logic

Machine Learning, Fall 2009: Midterm

B ooks Expans ion on S ciencedirect: 2007:

IE418 Integer Programming

CPSC 340: Machine Learning and Data Mining. More PCA Fall 2017

S U E K E AY S S H A R O N T IM B E R W IN D M A R T Z -PA U L L IN. Carlisle Franklin Springboro. Clearcreek TWP. Middletown. Turtlecreek TWP.

Breakup of weakly bound nuclei and its influence on fusion. Paulo R. S. Gomes Univ. Fed. Fluminense (UFF), Niteroi, Brazil

Exploratory Factor Analysis and Principal Component Analysis

CSCI3390-Assignment 2 Solutions

Beechwood Music Department Staff

Let s now begin to formalize our analysis of sequential machines Powerful methods for designing machines for System control Pattern recognition Etc.

Bayesian Classifiers and Probability Estimation. Vassilis Athitsos CSE 4308/5360: Artificial Intelligence I University of Texas at Arlington

CS168: The Modern Algorithmic Toolbox Lecture #8: How PCA Works

Data Mining Techniques

Abstractions and Decision Procedures for Effective Software Model Checking

A primer on matrices

Exploratory Factor Analysis and Principal Component Analysis

A REVIEW OF RESIDUES AND INTEGRATION A PROCEDURAL APPROACH

Comp 11 Lectures. Mike Shah. July 12, Tufts University. Mike Shah (Tufts University) Comp 11 Lectures July 12, / 33

CS281 Section 4: Factor Analysis and PCA

Lecture 4: Training a Classifier

Form and content. Iowa Research Online. University of Iowa. Ann A Rahim Khan University of Iowa. Theses and Dissertations

Calculus II. Calculus II tends to be a very difficult course for many students. There are many reasons for this.

Introduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin

M a n a g e m e n t o f H y d ra u lic F ra c tu rin g D a ta

One-to-one functions and onto functions

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n =

A Primer on Statistical Inference using Maximum Likelihood

EXAM 2 REVIEW DAVID SEAL

A new ThermicSol product

CS 124 Math Review Section January 29, 2018

High Capacity Double Pillar Fully Automatic Bandsaw. p h a r o s 2 8 0

We are here. Assembly Language. Processors Arithmetic Logic Units. Finite State Machines. Circuits Gates. Transistors

Introduction to Error Analysis

CSE373: Data Structures and Algorithms Lecture 2: Math Review; Algorithm Analysis. Hunter Zahn Summer 2016

Dimensionality Reduction: PCA. Nicholas Ruozzi University of Texas at Dallas

One sided tests. An example of a two sided alternative is what we ve been using for our two sample tests:

Chapter 15. Probability Rules! Copyright 2012, 2008, 2005 Pearson Education, Inc.

Ordinary Least Squares Linear Regression

Tutorial on Principal Component Analysis

Physics 225 Relativity and Math Applications. Fall Unit 7 The 4-vectors of Dynamics

Lecture 13. Principal Component Analysis. Brett Bernstein. April 25, CDS at NYU. Brett Bernstein (CDS at NYU) Lecture 13 April 25, / 26

Predicting AGI: What can we say when we know so little?

PRINCIPAL COMPONENTS ANALYSIS

Introduction to Algebra: The First Week

CS168: The Modern Algorithmic Toolbox Lecture #7: Understanding Principal Component Analysis (PCA)

CPSC 340: Machine Learning and Data Mining. Sparse Matrix Factorization Fall 2018

MITOCW ocw f99-lec30_300k

UNITED STATES SECURITIES AND EXCHANGE COMMISSION FORM 8-K. Farmer Bros. Co.

Transcription:

Vulnerability Extrapolation Give me more bugs like that Blackhat Briefings 2011 Fabian fabs Yamaguchi Recurity Labs GmbH, Germany

Agenda Patterns you find when auditing code Exploiting these patterns: Vulnerability Extrapolation Using machine learning to get there A method to assist in manual code audits based on this idea The method in practice A detailed showcase

Exploring a new code base Like an area of mathematics you don t yet know. It s not completely different from the mathematics you already know. But there are secrets specific to this area: Vocabulary Reoccurring patterns in argumentation Weird tricks used in proofs Understanding the specifics of the area makes it a lot easier to reason about it.

It s also a lot like DOOM Dropped into some code-base, no idea where you are Only a handgun to begin with Secrets of this particular.wad matter! Where do I get weapons around here? Effective ways of killing these monsters?

It s all the same -attitude Let s not be too romantic about what we do: Things tend to break in the same way over and over again. Think of how you automatically stop scrolling whenever you see sprintf. Or those lists of dangerous functions used by old-school tools like RATS, ITS4 and flawfinder. The method employs heuristics.

Or think of the success of fuzzers Fuzzers use patterns in input that will get many targets very upset. Often very effective. Why? Because things tend to break in the same way over and over again. Try giving it a lemon.

or Taint Propagation Example: Monitor flow of integer from read to malloc, detect unsafe operations and monitor if integer is ever checked. API usage patterns, we ve seen blow up again and again. Taint Sources Operations Taint Sinks

Overfished These methods are too generic. They find what people screw up in most applications. And that s also what most people have looked at in the application.

Specifics matter! For software that people actually care about, you can be sure all the lowhanging fruit is gone. grep ing for memcpy : not much of a strategy anymore. Why? Because that s what the last 100 people did before you came along. It is no longer optional to learn about the specifics of the code base!

Which is why manual audits are so successful Because auditors find the weak programming patterns used in this application. Find the interfaces that are causing trouble in this application! Find the secrets in this.wad! (Seriously, there was a ghostbusters.wad)

An example Poppler (CVE-2009-3607) s ta tic c a iro _ s u rfa c e _ t * c re a te _ s u rfa c e _ fro m _ th u m b n a il_ d a ta ( g u c h a r * d a ta, g in t w id th, g in t h e ig h t, g in t ro w s trid e ) { g u c h a r * c a iro _ p ix e ls ; c a iro _ s u rfa c e _ t * s u rfa c e ; s ta tic c a iro _ u s e r_ d a ta _ k e y _ t key; in t j; c a iro _ p ix e ls = ( g u c h a r *)g_ m a llo c ( 4 * w id th * h e ig h t); s u rfa c e = c a iro _ im a g e _ s u rfa c e _ c re a te _ fo r_ d a ta ((unsigned c h a r *) c a iro _ p ix e ls, C A IR O _ F O R M A T _ R G B 2 4, w id th, h e ig h t, 4 * w id th ); c a iro _ s u rfa c e _ s e t_ u s e r_ d a ta ( s u rfa c e, & key, c a iro _ p ix e ls, ( c a iro _ d e s tro y _ fu n c _ t) g _ fre e ); [..] r e tu r n s u rfa c e ;

Do you see the bugs? The integer overflow is obvious. The missing check, not so much. You need to know the API to see this! /* This function always returns a valid pointer, but it will return a * pointer to a "nil" surface in the case of an error such as out of * memory or an invalid stride value. In case of invalid stride value * the error status of the returned surface will be * %CAIRO_STATUS_INVALID_STRIDE. You can use * cairo_surface_status() to check for this. */

Evince Bug, silently fixed. s ta tic c a iro _ s u rfa c e _ t * d jv u _ d o c u m e n t_ re n d e r ( E v D o c u m e n t * d o c u m e n t, E v R e n d e rc o n te x t * rc) { [..] # ifd e f H A V E _ C A IR O _ F O R M A T _ S T R ID E _ F O R _ W ID T H row stride = cairo_form at_stride_for_w idth (CAIRO_FORM AT_RGB24, p a g e _ w id th ); # e ls e ro w s trid e = p a g e _ w id th * 4 ; # e n d if p ix e ls = ( g c h a r *) g _ m a llo c ( p a g e _ h e ig h t * ro w s trid e ); s u rfa c e = c a iro _ im a g e _ s u rfa c e _ c re a te _ fo r_ d a ta ((guchar *)pixels, C A IR O _ F O R M A T _ R G B 2 4, p a g e _ w id th, p a g e _ h e ig h t, ro w s trid e ); c a iro _ s u rfa c e _ s e t_ u s e r_ d a ta ( s u rfa c e, & key, p ix e ls, ( c a iro _ d e s tro y _ fu n c _ t) g _ fre e ); [..] r e tu r n s u rfa c e ; version 2.28.1

A simple case This case is simple, because the APIsymbols, which lead to these two bugs were exactly the same. But does that have to be the case?

Another Example: libtiff CVE-2006-3459 CVE-2010-2067 s ta tic in t T IF F F e tc h S h o rtp a ir ( T IF F * tif, T IF F D ire n try * d ir) { s w itc h ( d ir->tdir_ ty p e ) { c a s e T IF F _ B Y T E : c a s e T IF F _ S B Y T E : { u in t8 v [ 4 ]; r e tu r n T IF F F e tc h B y te A rra y ( tif, d ir, v ) && T IF F S e tf ie ld ( tif, d ir->tdir_ ta g, v [ 0 ], v [ 1 ]); c a s e T IF F _ S H O R T : c a s e T IF F _ S S H O R T : { u in t1 6 v [ 2 ]; r e tu r n T IF F F e tc h S h o rta rra y ( tif, d ir, v ) && T IF F S e tf ie ld ( tif, d ir->tdir_ ta g, v [ 0 ], v [ 1 ]); d e fa u lt : r e tu r n 0 ;

Another Example: libtiff CVE-2006-3459 CVE-2010-2067 s ta tic in t T IF F F e tc h S u b je c td is ta n c e ( T IF F * tif, T IF F D ire n try * d ir) s ta tic in t { T IF F F e tc h S h o rtp a ir ( T IF F * tif, TuIF in F Dt3 ire2 n try l[ 2 *]; d ir) { flo a t v ; in t ok = 0 ; s w itc h ( d ir->tdir_ ty p e ) { c a s e T IF F _ B Y T E : c a s e T IF F _ S B Y T E : if ( T IF F F e tc h D a ta ( tif, d ir, ( c h a r *)l) { u in t8 v [ 4 ]; && c v tr a tio n a l( tif, d ir, l[ 0 ], l[ 1 ], & v )) { r e tu r n T IF F F e tc h B y te/* A rra y ( tif, d ir, v ) && T IF F S e tf ie * ld XXX: ( tif, dn ir->td um erator ir_ ta g 0xFFFFFFFF, v [ 0 ], v [ 1 ]); m eans that we have infinite * distance. Indicate that with a negative floating point c a s e T IF F _ S H O R T : * SubjectDistance value. c a s e T IF F _ S S H O R T : */ { ok = T IF F S e tf ie ld ( tif, d ir->tdir_ ta g, u in t1 6 v [ 2 ]; ( l[ 0 ]!= 0 x F F F F F F F F )? v : - v ); r e tu r n T IF F F e tc h S h o rta rra y ( tif, d ir, v ) && T IF F S e tf ie ld ( tif, d ir->tdir_ ta g, v [ 0 ], v [ 1 ]); d e fa u lt : r e tu r n ok; r e tu r n 0 ;

LibTIFF: Bug Analysis TIFFFetchShortArray is actually a wrapper around TIFFFetchData. The two are pretty much synonyms. These functions are part of an API local to libtiff. Badly designed API: the amount of data to be copied into the buffer is passed in one of the fields of the dir-structure and not explicitly! Developers missed this in both cases and it s hard to blame them.

The times of grep memcpy./*.c may be over. But that does not mean patterns of API use that lead to vulnerabilities no longer exist!

Vulnerability Extrapolation Given a function known to be vulnerable, determine functions similar to this one in terms of application-specific API usage patterns. Why? Because these are most likely to contain another incarnation of this bug. Why? Because developers tend to make the same mistakes over and over and over again. Especially if motivated by a bad API. Vulnerability Extrapolation exploits the information leak you get every time a vulnerability is disclosed!

What needs to be done We need to be able to determine how similar functions are in terms of these programming patterns. We need to find a way to extract these programming patterns from a code-base in the first place. How do we do that?

Similarity A decomposition Decomposition into shape and rotation: If rotation is just a detail, these are pretty similar. Signal Processing: Decomposition into components of different frequencies: Noise is suspected to be of high frequency while the signal is of lower frequency. In Face-Recognition, faces are decomposed into weighted sums of commonly found patterns + a noise-term.

Signal and Noise Checking if two things are similar always requires a decomposition into The big picture and the details or signal and noise. In general, if the big picture is the same and only the details differ, things are pretty similar. What s signal and what s noise depends on the problem you re dealing with.

Decomposing Code s ta tic in t T IF F F e tc h S u b je c td is ta n c e ( T IF F * tif, T IF F D ire n try * d ir) { detail u in t3 2 l[ 2 ]; flo a t v ; in t ok = 0 ; if ( T IF F F e tc h D a ta ( tif, d ir, ( c h a r *)l) && c v tr a tio n a l( tif, d ir, l[ 0 ], l[ 1 ], & v )) { /* * XXX: N um erator 0xFFFFFFFF m eans that we have infinite * distance. Indicate that with a negative floating point * SubjectDistance value. */ r e tu r n ok; Big picture ok = T IF F S e tf ie ld ( tif, d ir->td ir_ ta g, ( l[ 0 ]!= 0 x F F F F F F F F )? v : - v ); Function = Dominant Pattern + Noise Once you know a code-base, you start to decompose automatically: Dominant patterns of API use: Some FetchData- Function followed by TIFFSetField Symbols occurring in this function but not necessarily in any of the other functions employing the pattern

Think of it as zooming out Increasing level of detail/frequency s ta tic in t T IF F F e tc h S u b je c td is ta n c e ( T IF F * tif, T IF F D ire n try * d ir) { u in t3 2 l[ 2 ]; flo a t v ; in t ok = 0 ; Decreasing dominance of pattern if ( T IF F F e tc h D a ta ( tif, d ir, ( c h a r *)l) && c v tr a tio n a l( tif, d ir, l[ 0 ], l[ 1 ], & v )) { /* * XXX: N um erator 0xFFFFFFFF m eans that we have infinite * distance. Indicate that with a negative floating point * SubjectDistance value. */ ok = T IF F S e tf ie ld ( tif, d ir->tdir_ ta g, ( l[ 0 ]!= 0 x F F F F F F F F )? v : - v ); Usage Pattern Usage Pattern Usage Pattern r e tu r n ok; Linear approximation of each function by the most dominant API usage patterns of the code-base it is contained in!

Extracting dominant patterns How do we identify the most dominant API usage patterns of a code-base? How do other fields identify dominant patterns in their data?

Principal Component Analysis in Face Recognition Images have a natural vectorial representation. Each image can be interpreted as a $numberofpixelsdimensional vector. Directions in this space correspond to dependencies among pixels. Brightness of first pixel of second pixel of last pixel A set of images can be represented by a set of vectors.

Images with two pixels Brightness of Pixel 2 Brightness of Pixel 1 The most dominant pattern: Either both pixels light up, or both don t. As opposed to, for example, either pixel 1 lights up and pixel 2 doesn t and vice versa. Geometrically, this corresponds to the direction where the data varies most. In other words, if you were to project onto the red vector, you d best describe the data with a single dimension.

We can make direct use of this! Directions of highest variance correspond to dominant patterns in the data. These correspond to the eigenvectors of the data covariance matrix. A singular value decomposition can be used to obtain these. Let s make use of this to determine dominant API usage patterns!

Mapping code to the vector space Describe functions by the API-symbols they contain. API-symbols are extracted using a fuzzy parser. Each API-symbol is associated with a dimension. func1(){ int *ptr = malloc(64); fetcharray(pb, ptr);

Approximation of functions by most dominant API usage patterns! PCA implemented by truncated Singular Value Decomposition. Directions of highest variance Strength of pattern on diagonal Representation of functions in terms of these dominant patterns

A closer look at the decomposition Data Matrix (Contains all function-vectors) Strength of pattern Each column of U is a dominant pattern. Each row is a representation of an API-symbol in terms of the most dominant patterns Representation of functions in terms of the most dominant patterns

In summary

A toy problem to gain an intuition Group 1 v o id g u if u n c 1 ( G tk W id g e t * w id g e t) { in t j; g u i_ m a k e _ w in d o w ( w id g e t); G tk B u tto n * b u tto n ; b u tto n = g u i_ n e w _ b u tto n (); g u i_ s h o w _ w in d o w (); v o id g u if u n c 2 ( G tk W id g e t * w id g e t) { g u i_ m a k e _ w in d o w ( w id g e t); G tk B u tto n * m y B u tto n ; b u tto n 1 = g u i_ n e w _ b u tto n (); b u tto n 2 = g u i_ n e w _ b u tto n (); b u tto n 3 = g u i_ n e w _ b u tto n (); fo r ( in t i = 10; i!= i; i++) d o _ g u i_ s tu ff();

Group2 v o id n e tf u n c 1 () { in t fd; in t i = 0 ; s tru c t s o c k a d d r_ in in; fd = s o c k e t( a rg u m e n ts ); re c v ( fd, m o re A rg u m e n ts ); v o id n e tf u n c 2 () { in t fd; s t r u c t s o c k a d d r_ in in; h o s te n t h o s t; fd = s o c k e t( a rg u m e n ts ); re c v ( fd, m o re A rg u m e n ts ); g e th o s tb y n a m e ( h o s t) if( c o n d itio n ){ i++; s e n d ( fd, i, a rg ); s e n d ( fd, i, a rg ); c lo s e ( fd); if( c o n d itio n ){ in t i = 0 ; i++; s e n d ( fd, i, a rg ); c lo s e ( fd);

Group 3 v o id lis tf u n c 1 ( in t e le m ) { G L is t m y L is t; if(! lis t_ c h e c k ( m y L is t)){ d o _ lis t_ e rro r_ s tu ff(); r e tu r n ; lis t_ a d d ( m y L is t, e le m ); v o id lis tf u n c 2 ( in t e le m ) { G L is t m y L is t; if(! lis t_ c h e c k ( m y L is t)){ d o _ lis t_ e rro r_ s tu ff(); r e t u r n ; lis t_ re m o v e ( m y L is t, e le m ); lis t_ d e le te ( m y L is t);

Projection onto the first two principal components Core API Functions Occurs in this context but does not constitute the pattern

We get a lot more than just a method to extrapolate! We get a projection of all functions and API symbols into a space where API symbols constituting a pattern are close to one another. functions using the same pattern are close to one another. functions are close to the dominant API symbols of their most dominant patterns. You can browse code in this space.

But visualization isn t the way to go for real code-bases ;)

It turns out tables don t look as fancy but are a lot more useful. At least if the visualization you re doing is as bare-bones as the one just shown. Let s browse FFmpeg.

DecodingContexts: Synonyms Decoders in FFmpeg form a group of functions related by API use. Each decoder has its own decoding-context. They are thus noise - terms to be found at a similar angle. foo

String-API in FFMpeg foo foo

Functions similar to function - Vulnerability Extrapolation Take a function that used to be vulnerable as an input. Measure distances to other functions to determine those functions, which are most similar. Let s try that for FFmpeg.

Original bug: CVE-2010-3429 s ta tic in t flic _ d e c o d e _ fra m e _ 8 B P P ( A V C o d e c C o n te x t * a v c tx, v o id * d a ta, in t * d a ta _ s iz e, c o n s t u in t8 _ t * b u f, in t b u f_ s iz e ) { [..] p ix e ls = s - > fr a m e.d a ta [ 0 ] ; [..] c a s e F L I_ D E L T A : y _ p tr = 0 ; c o m p re s s e d _ lin e s = A V _ R L 1 6 (&buf[ s tre a m _ p tr]); s tre a m _ p tr += 2 ; w h ile ( c o m p re s s e d _ lin e s > 0 ) { lin e _ p a c k e ts = A V _ R L 1 6 ( & b u f[ s tre a m _ p tr] ) ; s tre a m _ p tr += 2 ; if ((line_ p a c k e ts & 0 x C 0 0 0 ) == 0 x C 0 0 0 ) { / / lin e s k ip o p c o d e lin e _ p a c k e ts = - lin e _ p a c k e ts ; y _ p tr + = lin e _ p a c k e ts * s - > fr a m e.lin e s iz e [ 0 ] ; e ls e if ((line_ p a c k e ts & 0 x C 0 0 0 ) == 0 x 4 0 0 0 ) { [..] e ls e if ((line_ p a c k e ts & 0 x C 0 0 0 ) == 0 x 8 0 0 0 ) { / / "la s t b y t e " o p c o d e p ix e ls [ y _ p tr + s - > fr a m e.lin e s iz e [ 0 ] - 1 ] = lin e _ p a c k e ts & 0 x ff; e ls e { [..] y _ p tr + = s - > fr a m e.lin e s iz e [ 0 ] ; b r e a k ; [..] Decoder-Pattern: Usually a variable of type AvCodecContext AV_RL*-Functions used as sources. Lot s of primitive types with specified width used. Use of memcpy, memset, etc. unchecked index, Write to arbitrary location in memory.

Extrapolation The closest match contained the same vulnerability but it was fixed when the initial function was fixed. [You cannot expect this decoder to be the optimal prototype for the decoder class, so yes, it will find nondecoders.] 0-Bug

0-Bug s ta tic v o id v m d _ d e c o d e ( V m d V id e o C o n te x t * s ) { [...] in t fra m e _ x, fra m e _ y ; in t fra m e _ w id th, fra m e _ h e ig h t; in t d p _ s iz e ; fr a m e _ x = A V _ R L 1 6 ( & s - > b u f[ 6 ] ) ; fr a m e _ y = A V _ R L 1 6 ( & s - > b u f[ 8 ] ) ; fr a m e _ w id th = A V _ R L 1 6 ( & s - > b u f[ 1 0 ] ) - fr a m e _ x + 1 ; fr a m e _ h e ig h t = A V _ R L 1 6 ( & s - > b u f[ 1 2 ] ) - fr a m e _ y + 1 ; [...] if ( s ->size >= 0 ) { / * o r ig in a lly U n p a c k F r a m e in V A G 's c o d e * / pb = p ; m e th = * pb++; [...] d p = & s - > fr a m e.d a ta [ 0 ] [ fr a m e _ y * s - > fr a m e.lin e s iz e [ 0 ] + fr a m e _ x ] ; d p _ s iz e = s ->fram e.lin e s iz e [ 0 ] * s ->avctx->height; pp = & s ->prev_ fra m e.d a ta [ 0 ][fram e _ y * s ->prev_ fra m e.lin e s iz e [ 0 ] + fra m e _ x ]; s w itc h ( m e th ) { [...] c a s e 2 : fo r ( i = 0 ; i < fra m e _ h e ig h t; i++) { m e m c p y ( d p, p b, fr a m e _ w id th ) ; pb += fra m e _ w id th ; dp += s ->fram e.lin e s iz e [ 0 ]; pp += s ->prev_ fra m e.lin e s iz e [ 0 ]; b r e a k ; [...] Decoder-Pattern: Usually a variable of type AvCodecContext AV_RL*-Functions used as sources. Lot s of primitive types with specified width used. Use of memcpy, memset, etc. Again an unchecked index into the pixelbuffer!

From 0-Bug to 0-day Demonstrate that this can be turned into this.

Writing a binary exploit in 2011 Writing binary exploits in 2011 is an adventure you do not want to miss. ASLR and DEP have arrived to stay. The heap is hardened. Every bug is kind of different right now and new generic techniques are just emerging. But when you look deep into the code and use it creatively, you can get it done

The memory corruption aspect s ta tic v o id v m d _ d e c o d e ( V m d V id e o C o n te x t * s ) { s ta tic v o id v m d _ d e c o d e (V m d V id e o C o n te x t * s ) { [...] in t fra m e _ x, fra m e _ y ; in t fram e _ w idth, fram e _ height; in t d p _ s iz e ; fram e _ x = AV_ RL16( & s -> buf[6 ] ) ; fram e _ y = AV_ RL16( & s -> buf[8 ] ) ; fram e _ w idth = AV_ RL16( & s -> buf[10] ) - fram e _ x + 1 ; fram e _ height = AV_ RL16( & s -> buf[12] ) - fram e _ y + 1 ; if ((fra m e _ w id th == s ->a v c tx ->w id th && fra m e _ h e ig h t == s ->a v c tx ->h e ig h t) && ( fra m e _ x fra m e _ y )) { s ->x_ o ff = fra m e _ x ; s ->y_ o ff = fra m e _ y ; fra m e _ x -= s ->x_ o ff; fra m e _ y -= s ->y_ o ff; if ( fra m e _ x fra m e _ y ( fra m e _ w id th!= s ->a v c tx ->w id th ) ( fra m e _ h e ig h t!= s ->a v c tx ->h e ig h t)) { m e m c p y ( s ->fra m e.d a ta [ 0 ], s ->p re v _ fra m e.d a ta [ 0 ], s ->a v c tx ->h e ig h t * s ->fra m e.lin e s iz e [ 0 ]); [...] if ( s ->size >= 0 ) { / * o r ig in a lly U n p a c k F r a m e in V A G 's c o d e * / pb = p ; m e th = * pb++; [...] dp = & s -> fram e.data [0 ] [fram e _ y * s -> fram e.linesize [0 ] + fram e _ x ] ; d p _ s iz e = s ->fra m e.lin e s iz e [ 0 ] * s ->a v c tx ->h e ig h t; pp = & s ->p re v _ fra m e.d a ta [ 0 ][fra m e _ y * s ->p re v _ fra m e.lin e s iz e [ 0 ] + fra m e _ x ]; s w itc h ( m e th ) { [...] c a s e 2 : b r e a k ; [...] [...] in t fra m e _ x, fra m e _ y ; in t fra m e _ w id th, fra m e _ h e ig h t; in t d p _ s iz e ; fr a m e _ x = A V _ R L 1 6 ( & s - > b u f[ 6 ] ) ; fr a m e _ y = A V _ R L 1 6 ( & s - > b u f[ 8 ] ) ; fo r ( i = 0 ; i < fra m e _ h e ig h t; i++) { m e m c p y (d p, p b, fra m e _ w id th ) ; pb += fra m e _ w id th ; fr a m e _ w id th = A V _ R L 1 6 ( & s - > b u f[ 1 0 ] ) - fr a m e _ x + 1 ; fr a m e _ h e ig h t = A V _ R L 1 6 ( & s - > b u f[ 1 2 ] ) - fr a m e _ y + 1 ; [...] if ( s ->s iz e >= 0 ) { dp += s ->fra m e.lin e s iz e [ 0 ]; / * o r ig in a lly U n p a c k F r a m e in V A G 's c o d e * / pb = p ; m e th = * pb++; [...] d p = & s - > fr a m e.d a ta [ 0 ] [ fr a m e _ y * s - > fr a m e.lin e s iz e [ 0 ] + fr a m e _ x ] ; d p _ s iz e = s ->fram e.lin e s iz e [ 0 ] * s ->avctx->height; pp = & s ->p re v _ fra m e.d a ta [ 0 ][fra m e _ y * s ->p re v _ fra m e.lin e s iz e [ 0 ] + fra m e _ x ]; s w itc h ( m e th ) { [...] c a s e 2 : pp += s ->p re v _ fra m e.lin e s iz e [ 0 ]; fo r ( i = 0 ; i < fra m e _ h e ig h t; i++) { m e m c p y ( d p, p b, fr a m e _ w id th ) ; pb += fra m e _ w id th ; dp += s ->fra m e.lin e s iz e [ 0 ]; pp += s ->p re v _ fra m e.lin e s iz e [ 0 ]; b r e a k ; [...]

Our Bug v.s. ASLR Due to ASLR, the startaddress of the heap will change from run to run. The relative address of the pixel-buffer within the heap will not be affected by ASLR. As long as we choose an offset that remains within the boundaries of the heap, ASLR does not hurt us yet. run n run n+1

What this bug gives us The ability to write up to 65535 bytes of data specified by us to a location relative to the pixel-buffer. Constraint: Offsets in the interval [-65535; -1] cannot be specified.

What do we overwrite? s ta tic in t v m d v id e o _ d e c o d e _ fra m e ( A V C o d e c C o n te x t * a v c tx, v o id * d a ta, in t * d a ta _ s iz e, A V P a c k e t * a v p k t) { [..] v m d _ d e c o d e ( s ) ; / * m a k e th e p a le tte a v a ila b le o n th e w a y o u t * / m e m c p y ( s ->fra m e.d a ta [ 1 ], s ->p a le tte, P A L E T T E _ C O U N T * 4 ); / * s h u f f le f r a m e s * / F F S W A P ( A V F ra m e, s ->fram e, s ->prev_ fra m e ); if ( s ->fram e.d a ta [ 0 ]) a v c tx - > r e le a s e _ b u ffe r ( a v c tx, & s - > fr a m e ) ; [..] r e tu r n b u f_ s iz e ; Overwrite the release_buffer pointer! Of course, that s in the interval of offsets we can t use

The state changing aspect s ta tic v o id v m d _ d e c o d e ( V m d V id e o C o n te x t * s ) { s ta tic v o id v m d _ d e c o d e ( V m d V id e o C o n te x t * s ) { && [...] [...] in t fra t mfra e _ x m, frae m_ e _ x y,; fra m e _ y ; in t fra m e _ w id th, fra m e _ h e ig h t; t d p _ s iz e ; in t fra m e _ w id th, fra m e _ h e ig h t; in t d p _ s iz e ; fra m e _ x = A V _ R L 1 6 (&s ->b u f[ 6 ]); fra m e _ y = A V _ R L 1 6 (&s ->b u f[ 8 ]); fra m e _ w id th = A V _ R L 1 6 (&s->buf[ 10]) - fra m e _ x + 1 ; fra m e _ h e ig h t = A V _ R L 1 6 (&s->buf[ 12]) - fra m e _ y + 1 ; fra m e _ x = A V _ R L 1 6 (&s ->b u f[ 6 ]); if ( ( fr a m e _ w id th = = s - > a v c tx - > w id th & & fr a m e _ h e ig h t = = s - > a v c tx - > h e ig h t) & & (fram e_x fram e_y)) { fra m e _ y = A V _ R L 1 6 (&s ->b u f[ 8 ]); s-> x_off = fram e_x; s-> y_off = fram e_y; fra m e _ w id th = A V _ R L 1 6 (&s->buf[ 10]) - fra m e _ x + 1 ; a m e _ x - = s - > x _ o ff; fra a mme _ y e -_ = hs - e> y ig_ oh ff; t = A V _ R L 1 6 (&s->buf[ 12]) - fra m e _ y + 1 ; if ( fra m e _ x fra m e _ y ( fra m e _ w id th!= s ->a v c tx ->w id th ) [...] ( fra m e _ h e ig h t!= s ->a v c tx ->h e ig h t)) { if ( ( fr a m e _ w id th = = s - > a v c tx - > w id th & & fr a m e _ h e ig h t = = s - > a v c tx - > h e ig h t) m e m c p y ( s ->fra m e.d a ta [ 0 ], s ->p re v _ fra m e.d a ta [ 0 ], s ->avctx ->h eigh t * s ->fram e.lin esize [ 0 ]); if ( s ->s iz e >= 0 ) { (fram e_x fram e_y)) { / * originally U npackfram e in VA G 's code * / pb = p ; m e th = * pb++; s-> x_off = fram e_x; [...] dp = & s ->fra m e.d a ta [ 0 ][fra m e _ y * s ->fra m e.lin e s iz e [ 0 ] + fra m e _ x ]; s-> y_off = fram e_y; dp_ size = s ->fram e.lin esize [ 0 ] * s ->avctx ->h eigh t; pp = & s ->p re v _ fra m e.d a ta [ 0 ][fra m e _ y * s ->p re v _ fra m e.lin e s iz e [ 0 ] + fra m e _ x ]; s w itc h ( m e th ) { [...] c a s e 2 : fr a mfo r e ( i _ = x 0 ; i - < = fra ms e -_ > h e igx h_ t; i++) o ff; { m e m c p y ( dp, pb, fra m e _ w id th ); pb += fra m e _ w id th ; fr a m e _ y - = s - > y _ o ff; dp += s ->fram e.lin e s iz e [ 0 ]; [...] b r e a k ; [...] pp += s ->p re v _ fra m e.lin e s iz e [ 0 ];

First frame We exploit the bug using two video frames. First frame: Put the decoder in a state where s->x_off contains our desired signinverted offset. First frame State 1: s->x_off = $offset Second frame State 2: frame_x = -s->x_off OVERWRITE

Second frame Second frame: Makes the decoder set frame_x to s>x_off and thus we add a negative offset to the pixelbuffer. First frame State 1: s->x_off = $offset Second frame State 2: frame_x = -s->x_off OVERWRITE

Now we can overwrite the pointer! Specify the sign-inverted desired offset in the first frame s frame_x and do not trigger memory corruption. The offset will be saved in s- >x_off. In the second frame, set frame_x to 0 so that frame_x will be set to s->x_off. Trigger memory corruption with this second frame.

Now, where do we jump? We re in luck: the base image of mplayer is not compiled as position-independent code. A small portion of code will be at constant addresses despite ASLR! What about this? 0x080cc5c2: mov %eax, (%esp) 0x080cc5c5: call <system@plt>

When shellcode contains shell-commands When the call is made, %eax has just been used as an auxiliary register to hold a pointer to avctx. avctx will be interpreted as a string of commands for /bin/sh! Lightly spray the heap with avctx-structures accordingly. Of course, this is not 100% stable but works remarkably well

DEMO

Evaluation: How good does it work in general? Of course, if no further similar bug exists in the code-base, this will not work. Second: We did not look for the same bug but for the same usage-pattern. That does not mean the pattern has to be used wrong in the cases we find as well. We just know the things to look for with this pattern.

What we can show If we do API-discovery by hand, we can check whether the method would have found these groups as well. Manually extracted code from the Linux kernel and FFmpeg was used to evaluate the method.

Evaluation Before PCA After PCA

Summary There are lots of patterns in your bugs. There s an info-leak when you disclose a bug: You re providing a sample of what went wrong in this specific code-base. Fixing a bug without performing proper extrapolation may be contra-productive You may be disclosing related 0-day!

Where to go from here for this method in particular. This will probably work well on binaries. While PCA was the vanilla algorithm to try out, there are better representations: NMF looks a lot more promising. We re currently investigating whether structural features can improve the method.

Some ideas and for security in general. Learn context-free grammars from input to generate fuzzers. Cluster fuzz-traces to group input that hits the same bug. More robust OS- and rate-limiter detection in port-scanning. Heap-chunk usage patterns to identify how APIs interact at runtime?

Final words Whenever you encounter patterns in security research or need to make a fuzzy decision, you may want to give machine learning a shot. It can be beneficial and refreshing to find ways of applying research from other fields to your problems.

Questions? Fabian Yamaguchi Vulnerability Researcher fabs@recurity-labs.com Recurity Labs GmbH, Berlin, Germany http://www.recurity-labs.com