Efficient Techniques for Fast Packet Classification


Efficient Techniques for Fast Packet Classification
Network Reading Group
Alok Tongaonkar, R. Sekar
Stony Brook University
Sept 16, 2008

What is Packet Classification?
- Packet classification: a mechanism that inspects network packets and determines how to process each packet based on the values of header fields and/or the payload.
- Fundamental operation: identify the rules R_i that match a packet p from a rule set {R_1, ..., R_n}, where each rule has the form R_i : condition → action.
- Example: R_1 : dhost = PLUTO && dport = HTTP && content: "Bad command" → DENY
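The fundamental operation can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Rule` class, the dictionary packet model, the mapping of HTTP to port 80, and the second rule `R2` are not from the talk.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Packet = Dict[str, object]  # hypothetical packet model: field name -> value


@dataclass
class Rule:
    name: str
    condition: Callable[[Packet], bool]
    action: str


def matching_rules(rules: List[Rule], packet: Packet) -> List[Rule]:
    """Return every rule whose condition holds for the packet, in rule order."""
    return [r for r in rules if r.condition(packet)]


rules = [
    # R1 mirrors the slide's example; HTTP is assumed to mean port 80.
    Rule("R1",
         lambda p: p["dhost"] == "PLUTO" and p["dport"] == 80
         and b"Bad command" in p["payload"],
         "DENY"),
    # R2 is a made-up second rule so that the list has more than one entry.
    Rule("R2", lambda p: p["dport"] == 22, "ALLOW"),
]

packet = {"dhost": "PLUTO", "dport": 80, "payload": b"500 Bad command\r\n"}
matched = matching_rules(rules, packet)
print([(r.name, r.action) for r in matched])
```

This linear scan is only the specification of the problem; the rest of the talk is about computing the same answer without evaluating every rule separately.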

Applications
- Firewalls: identify the highest-priority matching rule.
- Intrusion detection systems: use unordered rules; identify all matching rules.
- Network monitoring: packet filtering, i.e. determine whether a packet satisfies any of the conditions.

Previous Techniques
- Naive technique: Berkeley Packet Filter (BPF)
  - Matches one rule at a time.
  - A test that occurs in multiple rules is tested once on behalf of each of those rules.

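Previous Techniques
- Trie-based techniques: PathFinder, Dynamic Packet Filter (DPF)
  - Identify common prefixes and share them.
[Figure: trie for filters F1 and F2. The tests type = IP and proto = TCP are shared by {F1, F2}; the trie then branches on dport = A, leading to {F1}, and dport = B, leading to {F2}.]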

Previous Techniques
- DAG automaton: Berkeley Packet Filter (BPF+)
  - Recognizes some equivalent states.
  - Uses data flow analysis to eliminate tests that are implied by tests performed earlier on the path.
[Figure: two decision graphs over the tests "Shost X?", "Dhost Y?", "Shost Y?", and "Dhost X?", with T/F edges leading to the match sets {F1} and {F2}.]

Previous Techniques
- Adaptive traversal
  - Changes the order of testing to promote sharing.
[Figure: two automata for patterns {p1, p2, p3}, one testing x first and one testing y first, with transitions labeled x = a, x ≠ a, y = a, y = b, and y ≠ a && y ≠ b, and final states {p1}, {p2, p3}, {p3}, and φ.]

Objective
- Promote sharing of tests
  - Adaptive automata traversal was developed in the context of term matching.
  - It is restricted to equality tests; we need to support inequalities, disequalities, and bit-masking operations.
  - Several new techniques in the context of the application domain.
- Flexibility to support diverse applications
  - Ordered (firewalls) and unordered (intrusion detection) rulesets.
  - Packet filtering (network monitoring).

Organization of Talk
- Part I: Packet Field Matching
  - Algorithm
  - Techniques
  - Intrusion Detection Systems
  - Firewalls
  - Evaluation
- Part II: Content Matching
  - Integrating String Matching
- Part III
  - Related Work
  - Summary

Techniques for Packet Classification
- Naive technique
  - A test that occurs in multiple rules is tested once on behalf of each of those rules.
- Automata-based techniques
  - Automaton states are used to remember tests.
  - Avoids repetition of tests.
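The cost difference between the two approaches can be sketched by counting test evaluations. This is an illustration under stated assumptions: filters are lists of equality tests, the field values are made up, and a memo table stands in for the automaton states that remember test outcomes (a real automaton encodes this in its structure rather than in a runtime cache).

```python
from typing import Dict, List, Tuple

Packet = Dict[str, int]
EqTest = Tuple[str, int]  # (field, expected value)


def naive_match(filters: Dict[str, List[EqTest]], packet: Packet):
    """Check filters one at a time and count every test evaluation."""
    evaluations, matched = 0, []
    for name, tests in filters.items():
        ok = True
        for field, value in tests:
            evaluations += 1
            if packet.get(field) != value:
                ok = False
                break
        if ok:
            matched.append(name)
    return matched, evaluations


def shared_match(filters: Dict[str, List[EqTest]], packet: Packet):
    """Same result, but each distinct test is evaluated at most once."""
    cache: Dict[EqTest, bool] = {}
    evaluations, matched = 0, []
    for name, tests in filters.items():
        ok = True
        for test in tests:
            if test not in cache:
                evaluations += 1
                cache[test] = packet.get(test[0]) == test[1]
            if not cache[test]:
                ok = False
                break
        if ok:
            matched.append(name)
    return matched, evaluations


# Two hypothetical filters that share their first two tests.
filters = {
    "F1": [("type", 0x0800), ("proto", 6), ("dport", 80)],
    "F2": [("type", 0x0800), ("proto", 6), ("dport", 22)],
}
packet = {"type": 0x0800, "proto": 6, "dport": 22}
print(naive_match(filters, packet))
print(shared_match(filters, packet))
```

Both functions report the same match, but the naive version re-evaluates the shared tests for each filter.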

Deterministic Packet Classification Automaton
Filters:
- F_1 : (icmp_type = ECHO)
- F_2 : (icmp_type = ECHO_REPLY) ∧ (ttl = 1)
- F_3 : (ttl = 1)
[Figure: deterministic automaton for {F1, F2, F3}. The start state branches on icmp_type = ECHO, icmp_type = ECHO_REPLY, and "other" (icmp_type ≠ ECHO ∧ icmp_type ≠ ECHO_REPLY); each successor branches on ttl = 1 versus ttl ≠ 1. Final states: {F1, F3}, {F1}, {F2, F3}, {F3}, and φ.]
- All but one of the transitions from a state are labeled with a test. The remaining transition is labeled "other": the conjunction of the negations of all tests on the rest of the transitions.
- Transitions are simultaneously distinguishable: all tests except "other" are mutually exclusive, so the applicable transition can be determined using a single operation, with O(1) expected time complexity.
- Each final state S correctly identifies the match set corresponding to any packet satisfying all the tests along a path from the start state to S.
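A hand-built version of this automaton can be sketched as nested tuples. The encoding is an assumption made for illustration: each non-final state is `(field, branches, other)`, final states are frozensets of filter names, and a dictionary lookup plays the role of the single operation that selects a transition. The ICMP type numbers 8 (echo) and 0 (echo reply) are the standard values.

```python
ECHO, ECHO_REPLY = 8, 0  # standard ICMP type numbers


def final(*names):
    """A final state is just its match set."""
    return frozenset(names)


# Non-final state: (field to test, {value: next state}, state for "other").
AUTOMATON = (
    "icmp_type",
    {
        ECHO: ("ttl", {1: final("F1", "F3")}, final("F1")),
        ECHO_REPLY: ("ttl", {1: final("F2", "F3")}, final()),
    },
    ("ttl", {1: final("F3")}, final()),
)


def classify(state, packet):
    """Follow exactly one transition per state until a final state is reached."""
    while not isinstance(state, frozenset):
        field, branches, other = state
        state = branches.get(packet[field], other)
    return state


p1 = {"icmp_type": ECHO, "ttl": 1}
p2 = {"icmp_type": ECHO_REPLY, "ttl": 1}
print(sorted(classify(AUTOMATON, p1)), sorted(classify(AUTOMATON, p2)))
```

Each packet is classified with two lookups, regardless of how many filters mention the same tests.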

Non-deterministic Packet Classification Automaton
Filters:
- F_1 : (icmp_type = ECHO)
- F_2 : (icmp_type = ECHO_REPLY) ∧ (ttl = 1)
- F_3 : (ttl = 1)
[Figure: non-deterministic automaton for {F1, F2, F3}. The start state has transitions labeled icmp_type = ECHO, icmp_type = ECHO_REPLY, and icmp_type ≠ ECHO_REPLY; later states test ttl = 1 versus ttl ≠ 1. Final states: {F1}, {F2, F3}, {F3}, and φ.]
- "other" is the conjunction of the negations of a subset of the tests on the rest of the transitions.
- Nondeterminism is simulated using backtracking at runtime.

Principal Design Criteria for PCA
- Operate in real time on high-speed networks without dropping packets.
- Scale to support the thousands of rules typical in intrusion detection systems and firewalls.
Computational Issues
- Matching time: closely related to path lengths.
- Memory: size of the automata.

Problem Formulation
Tests involve a variable x and one or two constants (denoted by c).
- Equality tests: x = c, e.g. tcp_sport = 80
- Equality tests with bitmasks: x & c_1 = c, e.g. tcp_flags & 0x03 = 0x03
- Disequality tests: x ≠ c, e.g. tcp_sport ≠ 80
- Disequality tests with bitmasks: x & c_1 ≠ c, e.g. tcp_flags & 0x03 ≠ 0x03
- Inequality tests: x ≤ c or x ≥ c, e.g. a bound on tcp_dport relative to 1024
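The five kinds of tests fit a single small representation. The sketch below is an assumption made for illustration (the `FieldTest` class, its operator names, and the convention that a mask of -1 means "no mask" are not from the talk); it only shows that every test reduces to masking a field and comparing against a constant.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class FieldTest:
    field: str
    op: str          # one of "eq", "ne", "le", "ge"
    const: int
    mask: int = -1   # -1 has all bits set, so it leaves the value unchanged

    def holds(self, packet: Dict[str, int]) -> bool:
        value = packet[self.field] & self.mask
        if self.op == "eq":
            return value == self.const
        if self.op == "ne":
            return value != self.const
        if self.op == "le":
            return value <= self.const
        if self.op == "ge":
            return value >= self.const
        raise ValueError(f"unknown operator: {self.op}")


packet = {"tcp_sport": 80, "tcp_dport": 443, "tcp_flags": 0x13}
checks = [
    FieldTest("tcp_sport", "eq", 80),
    FieldTest("tcp_flags", "eq", 0x03, mask=0x03),
    FieldTest("tcp_sport", "ne", 80),
    FieldTest("tcp_flags", "ne", 0x03, mask=0x03),
    FieldTest("tcp_dport", "le", 1024),
]
print([t.holds(packet) for t in checks])
```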

Filters and Priorities
- A filter F is a conjunction of tests, e.g. (dport = 22) ∧ (an inequality test on sport against 1024) ∧ (flags & 0xb = 0x3).
- A set ℱ of filters may be partially ordered by a priority relation. The priority of F is denoted Pri(F).
- A filter F matches a packet p if:
  - the packet satisfies F, i.e. F(p) is true, and
  - the packet does not satisfy any rule that has higher priority than F.
- The match set of p consists of all filters that match p, except that among filters of equal priority at most one is retained.

Example of Prioritized Matching
Filters:
- F_1 : (icmp_type = ECHO)
- F_2 : (icmp_type = ECHO_REPLY) ∧ (ttl = 1)
- F_3 : (ttl = 1)
Packets:
- p_1 : ICMP echo packet with a ttl of 1
- p_2 : ICMP reply packet with a ttl of 1
Multi-matching (intrusion detection systems): set incomparable priorities.
- M(p_1) = {F_1, F_3}
- M(p_2) = {F_2, F_3}
Ordered matching (firewalls): assign monotonically decreasing priorities, Pri(F_1) > Pri(F_2) > Pri(F_3).
- M(p_1) = {F_1}
- M(p_2) = {F_2}
Packet filtering (network monitoring): set equal priorities, Pri(F_1) = Pri(F_2) = Pri(F_3).
- p_1 can match either F_1 or F_3
- p_2 can match either F_2 or F_3
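The three regimes can be reproduced with one match-set function. This is a sketch under assumptions: priorities are integers, `None` stands for "incomparable with everything" (a simplification of a general partial order), and when filters tie the first one in dictionary order is kept, which is one of the choices the definition allows.

```python
from typing import Callable, Dict, List, Optional

Packet = Dict[str, int]

FILTERS: Dict[str, Callable[[Packet], bool]] = {
    "F1": lambda p: p["icmp_type"] == 8,                    # ECHO
    "F2": lambda p: p["icmp_type"] == 0 and p["ttl"] == 1,  # ECHO_REPLY
    "F3": lambda p: p["ttl"] == 1,
}


def match_set(priority: Dict[str, Optional[int]], packet: Packet) -> List[str]:
    satisfied = [n for n, cond in FILTERS.items() if cond(packet)]
    result, seen = [], set()
    for name in satisfied:
        pri = priority[name]
        if pri is None:              # incomparable: always reported
            result.append(name)
            continue
        # Drop the filter if a satisfied filter has strictly higher priority.
        if any(priority[m] is not None and priority[m] > pri for m in satisfied):
            continue
        if pri in seen:              # keep at most one filter per priority level
            continue
        seen.add(pri)
        result.append(name)
    return result


p1 = {"icmp_type": 8, "ttl": 1}
p2 = {"icmp_type": 0, "ttl": 1}
multi = {"F1": None, "F2": None, "F3": None}
ordered = {"F1": 3, "F2": 2, "F3": 1}
equal = {"F1": 0, "F2": 0, "F3": 0}
for mode in (multi, ordered, equal):
    print(match_set(mode, p1), match_set(mode, p2))
```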

Matching Automata Construction
Key new idea: decompose and reorder tests to increase sharing of tests among rules.
Example: F_1 : (x = 5), F_2 : a bit-masked test comparing x & 0x03 with 1.
[Figure, without decomposition: the automaton tests x = 5 versus x ≠ 5 first and then tests x & 0x03 against 1 on both branches; final states {F1, F2}, {F1}, {F2}, and φ.]
[Figure, with decomposition: x = 5 is split into (x & 0x03 = 1) ∧ (x & 0xfc = 4). The automaton tests x & 0x03 against 1 once at the root and then tests x & 0xfc = 4 versus x & 0xfc ≠ 4.]
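The decomposition in the example can be checked mechanically. The helper below is a hypothetical illustration (its name and the 8-bit width are assumptions): it splits an equality test on x into two masked equality tests over disjoint bit sets and verifies that their conjunction is equivalent to the original test.

```python
def split_equality(c: int, mask: int, width: int = 8):
    """Split x == c into masked tests on the bits in mask and on the other bits."""
    full = (1 << width) - 1
    rest = full & ~mask
    return (mask, c & mask), (rest, c & rest)


(low_mask, low_val), (high_mask, high_val) = split_equality(5, 0x03)
print(hex(low_mask), low_val, hex(high_mask), high_val)

# The conjunction of the two masked tests agrees with x == 5 on every byte value.
equivalent = all(
    (x == 5) == ((x & low_mask) == low_val and (x & high_mask) == high_val)
    for x in range(256)
)
print(equivalent)
```

Once x = 5 is written this way, its first factor is a test on x & 0x03, which is the same quantity that F_2 tests, so the two filters can share it.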

Condition Factorization
- Decomposes filters into combinations of more primitive tests.
- Similar to factorization of integers.
- Based on the residue operation, which is analogous to integer division.
Residue
- We want to determine whether there is a match for a filter C_1.
- We have so far tested a condition C_2.
- A residue captures the additional tests that need to be performed at this point to verify C_1.

Residue Operation
Definition (Residue). The residue $C_1/C_2$ is another condition $C_3$ such that:
1. $C_2 \wedge C_3 \Rightarrow C_1$
2. $C_1 \wedge C_2 \Rightarrow C_3$
Examples, with C_1 : x ∈ [1, 20] throughout:
- C_2 : x ∈ [15, 25] gives C_3 : x ≤ 20
- C_2 : x = 15 gives C_3 : true
- C_2 : x = 35 gives C_3 : false
- C_2 : y = 15 gives C_3 : x ∈ [1, 20]
Ideally, C_3 would be the weakest condition such that (1) holds.
In practice, we might not want the minimal condition, since it can be:
- expensive to compute
- inefficient to use (it may contain many disjunctions)
Example of approximation: C_1 : x ∈ [1, 20], C_2 : x ≠ 15
- Exact: C_3 : x ∈ [1, 14] ∨ x ∈ [16, 20]
- Approximation: C_3 : x ∈ [1, 20]
Need for (2): C_3 shouldn't be too strong, or else we may miss matches for C_1.
- C_1 : x ∈ [1, 20], C_2 : x ∈ [10, 30]
- C_3 : x ∈ [10, 15] satisfies (1) but not (2)
- It will miss matches for x ∈ [1, 9] or x ∈ [16, 20]
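The two conditions in the definition can be checked by brute force on a small domain. This is only a sanity check under assumptions: conditions are modeled as Python predicates over a single integer x, and the domain 0..63 is chosen arbitrarily to cover the constants in the examples.

```python
from typing import Callable

Cond = Callable[[int], bool]
DOMAIN = range(0, 64)


def is_residue(c1: Cond, c2: Cond, c3: Cond) -> bool:
    """Check (1) C2 and C3 imply C1, and (2) C1 and C2 imply C3, on DOMAIN."""
    cond1 = all(c1(x) for x in DOMAIN if c2(x) and c3(x))
    cond2 = all(c3(x) for x in DOMAIN if c1(x) and c2(x))
    return cond1 and cond2


def c1(x: int) -> bool:
    return 1 <= x <= 20


print(is_residue(c1, lambda x: 15 <= x <= 25, lambda x: x <= 20))
print(is_residue(c1, lambda x: x == 15, lambda x: True))
print(is_residue(c1, lambda x: x == 35, lambda x: False))
# The coarser approximation for C2: x != 15 is still a valid residue.
print(is_residue(c1, lambda x: x != 15, c1))
# A residue that is too strong violates condition (2).
print(is_residue(c1, lambda x: 10 <= x <= 30, lambda x: 10 <= x <= 15))
```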

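Computing Residue on Tests
(~c denotes the bitwise complement of c.)
T1 | T2 | T1/T2 | Condition
T | T | true |
T | ¬T | false |
T | x = c | T[x ← c] (T with c substituted for x) |
x = c | x & c1 = c2 | x & ~c1 = c & ~c1 | c & c1 = c2
x = c | x & c1 = c2 | false | c & c1 ≠ c2
x = c | x & c1 ≠ c2 | false | c & c1 = c2
x = c | x ∈ [c1, c2] | false | c ∉ [c1, c2]
x ≠ c | x & c1 = c2 | x & ~c1 ≠ c & ~c1 | c & c1 = c2
x ≠ c | x & c1 = c2 | true | c & c1 ≠ c2
x ≠ c | x & c1 ≠ c2 | true | c & c1 = c2
x ≠ c | x ∈ [c1, c2] | true | (c < c1) ∨ (c > c2)
x ∈ [c1, c2] | x ∈ [c3, c4] | true | c1 ≤ c3 ≤ c4 ≤ c2
x ∈ [c1, c2] | x ∈ [c3, c4] | x ∈ [−∞, c2] | c1 ≤ c3 ≤ c2 ≤ c4
x ∈ [c1, c2] | x ∈ [c3, c4] | x ∈ [c1, ∞] | c3 ≤ c1 ≤ c4 ≤ c2
x ∈ [c1, c2] | x ∈ [c3, c4] | x ∈ [c1, c2] | c3 ≤ c1 ≤ c2 ≤ c4
x ∈ [c1, c2] | x ∈ [c3, c4] | false | (c2 < c3) ∨ (c4 < c1)
x ∈ [c1, c2] | x & c3 = c4 | false | c4 > c2
x & c1 = c2 | x & c3 = c4 | x & (c1 & ~c3) = (c2 & ~c3) | c2 & c3 = c1 & c4
x & c1 = c2 | x & c3 = c4 | false | otherwise
x & c1 = c2 | x ∈ [c3, c4] | false | c2 > c4
x & c1 ≠ c2 | x & c3 = c4 | x & (c1 & ~c3) ≠ (c2 & ~c3) | c2 & c3 = c1 & c4
x & c1 ≠ c2 | x & c3 = c4 | true | otherwise
x & c1 ≠ c2 | x ∈ [c3, c4] | true | c2 > c4
T | any other test | T |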
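The interval rows of the table translate directly into code. The sketch below covers only the case where both tests are closed intervals on the same variable; the function name and the use of `math.inf` for an open bound are assumptions made for illustration.

```python
import math
from typing import Tuple, Union

Interval = Tuple[float, float]  # closed interval [lo, hi]


def interval_residue(t1: Interval, t2: Interval) -> Union[bool, Interval]:
    """Residue of x in [c1, c2] once x in [c3, c4] is known to hold."""
    (c1, c2), (c3, c4) = t1, t2
    if c2 < c3 or c4 < c1:        # disjoint intervals: T1 cannot hold
        return False
    if c1 <= c3 and c4 <= c2:     # T2 lies inside T1: nothing left to test
        return True
    if c1 <= c3:                  # c1 <= c3 <= c2 < c4: only the upper bound remains
        return (-math.inf, c2)
    if c4 <= c2:                  # c3 < c1 <= c4 <= c2: only the lower bound remains
        return (c1, math.inf)
    return (c1, c2)               # c3 < c1 <= c2 < c4: both bounds remain


print(interval_residue((1, 20), (15, 25)))
print(interval_residue((1, 20), (15, 15)))
print(interval_residue((1, 20), (35, 35)))
print(interval_residue((1, 20), (0, 30)))
```

The first three calls correspond to the residue examples x ≤ 20, true, and false for C_1 : x ∈ [1, 20], with the point tests x = 15 and x = 35 written as degenerate intervals.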

Build Algorithm
- Recursive procedure.
- Takes an automaton state s as its first parameter and builds the subautomaton that is rooted at s.
- Takes two other parameters:
  - C_s, the candidate set of the state s
  - M_s, the match set of s
- Candidate set C_s: filters that haven't completed a match, but for which future matches can't be ruled out either.
- Match set M_s: all filters for which a match can be announced at s.

Build Algorithm
    procedure Build(s, C_s, M_s)
      if C_s is empty
        then match[s] = M_s
      else
        (D, T) = select(C_s)
        T_o = conjunction of ¬T_i over all d_i ∈ D with d_i = true
        for each T_i ∈ (T ∪ {T_o}) do
          C_i = C_s / T_i
          if (T_i ≠ T_o) and the condition on d_i holds then combine C_i with C/T_o endif
          compute M_si and C_si from C_i and M_s
          if a state s_i corresponding to (C_si, M_si) isn't present
            create a new state s_i
            Build(s_i, C_si, M_si)
          endif
          create a transition from s to s_i on T_i
        end
      endif
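The recursive structure of Build can be sketched in a heavily simplified setting. The assumptions are substantial and deliberate: only equality tests, at most one test per field in each filter, every transition deterministic, no priorities (multi-matching), and a `select` stand-in that simply picks the most frequently tested field. It is not the talk's algorithm, only an illustration of how candidate sets, match sets, residues, and state reuse fit together.

```python
from collections import Counter
from typing import Dict, FrozenSet, Tuple

Filter = Tuple[str, FrozenSet[Tuple[str, int]]]  # (name, remaining equality tests)


def build(cands: FrozenSet[Filter], matches: FrozenSet[str], states: dict):
    """Build the subautomaton for (candidate set, match set), reusing equal states."""
    key = (cands, matches)
    if key in states:                  # an equivalent state exists: share it (DAG)
        return states[key]
    if not cands:                      # nothing left to test: final state
        states[key] = matches
        return matches
    # Stand-in for select: the field tested by the most candidate filters.
    counts = Counter(f for _, tests in sorted(cands) for f, _ in sorted(tests))
    field = counts.most_common(1)[0][0]
    values = sorted({v for _, tests in cands for f, v in tests if f == field})
    branches = {}
    for value in values + [None]:      # None plays the role of the "other" transition
        next_c, next_m = set(), set(matches)
        for name, tests in cands:
            wanted = dict(tests).get(field)
            if wanted is not None and wanted != value:
                continue               # residue is false: filter ruled out
            rest = frozenset(t for t in tests if t[0] != field)
            if rest:
                next_c.add((name, rest))   # residue still has tests to perform
            else:
                next_m.add(name)           # residue is true: match announced
        branches[value] = build(frozenset(next_c), frozenset(next_m), states)
    states[key] = (field, branches)
    return states[key]


def classify(node, packet: Dict[str, int]):
    while isinstance(node, tuple):
        field, branches = node
        node = branches.get(packet.get(field), branches[None])
    return node


filters = frozenset({
    ("F1", frozenset({("icmp_type", 8)})),
    ("F2", frozenset({("icmp_type", 0), ("ttl", 1)})),
    ("F3", frozenset({("ttl", 1)})),
})
states: dict = {}
root = build(filters, frozenset(), states)
print(sorted(classify(root, {"icmp_type": 8, "ttl": 1})))
```

A filter that does not test the selected field keeps its tests unchanged on every branch, because its residue with respect to an unrelated test is the filter itself. Keying states on the (candidate set, match set) pair is what turns the tree into a DAG.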

Improving Automata Size
Key idea: pick tests which avoid duplication of filters in next states.
- T = {x = 5, x = 6, (x ≠ 5) ∧ (x ≠ 6)}
- C = {x = 5, x = 6, x > 7}
- C' = {x = 6, x > 4}
[Figure: with the tests in T, the state {C1, C2, C3} for C has successors {C1}, {C2}, and {C3}; the state for C' has successors {C2}, {C1, C2}, and {C2}, so C2 appears in every successor.]
Definition (Discriminating Set). A set T of conditions is said to be a discriminating set for a filter set ℱ iff for every F ∈ ℱ there exists at most one T_i ∈ T such that F belongs to the candidate set of ℱ/T_i.
The concept of discriminating tests is similar to the concept of an index in the context of term matching.
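The definition can be tried out on the two example filter sets. The check below rests on an assumption: a filter is taken to remain a candidate after a test exactly when the filter and the test can hold together, which is evaluated by brute force over a small, arbitrarily chosen domain.

```python
from typing import Callable, List

Cond = Callable[[int], bool]
DOMAIN = range(0, 16)


def is_discriminating(tests: List[Cond], filters: List[Cond]) -> bool:
    """True if every filter stays a candidate under at most one of the tests."""
    for f in filters:
        compatible = sum(
            1 for t in tests if any(f(x) and t(x) for x in DOMAIN)
        )
        if compatible > 1:
            return False
    return True


T = [lambda x: x == 5, lambda x: x == 6, lambda x: x != 5 and x != 6]
C = [lambda x: x == 5, lambda x: x == 6, lambda x: x > 7]
C_prime = [lambda x: x == 6, lambda x: x > 4]

print(is_discriminating(T, C))        # each filter follows a single transition
print(is_discriminating(T, C_prime))  # x > 4 is compatible with all three tests
```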

Ensuring Polynomial-Size Automata
- Breadth of the subautomaton rooted at s, where s_1, ..., s_k are the successors of s:
  $B(|C_s|) = \sum_{i=1}^{k} B(|C_{s_i}|)$
- P(n) is the desired polynomial in n that bounds the automaton size. The constraint is:
  $P(|C_s|) \ge \sum_{i=1}^{k} P(|C_{s_i}|)$   (1)
- Pick tests that satisfy the bound.
- Otherwise, pick a test that comes closest to satisfying this constraint and make some outgoing transitions nondeterministic.

Benign Nondeterminism
- Two filters F_1 and F_2 are said to be independent of each other if they do not have a common test.
- Build a separate automaton for each independent set.
- Match packets against each automaton: this gives non-determinism without incurring any performance penalty.
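Grouping filters into independent sets amounts to finding connected components, where two filters are linked if they share a test. The sketch below uses union-find; the representation of a test as a `(field, operator, constant)` tuple and the example filters are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

TestKey = Tuple[str, str, int]  # hypothetical encoding: (field, operator, constant)


def independent_groups(filters: Dict[str, Set[TestKey]]) -> List[Set[str]]:
    """Partition filters so that filters in different groups share no test."""
    parent = {name: name for name in filters}

    def find(n: str) -> str:
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    owner: Dict[TestKey, str] = {}
    for name, tests in filters.items():
        for t in tests:
            if t in owner:
                parent[find(name)] = find(owner[t])  # shared test: merge groups
            else:
                owner[t] = name

    groups: Dict[str, Set[str]] = {}
    for name in filters:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


filters = {
    "F1": {("dport", "eq", 80), ("proto", "eq", 6)},
    "F2": {("dport", "eq", 80)},
    "F3": {("ttl", "eq", 1)},
}
groups = independent_groups(filters)
print(sorted(sorted(g) for g in groups))
```

Each group would then get its own automaton, and a packet is run through all of them.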

Effect of Benign Nondeterminism on Automata Size
- Leads to a dramatic reduction in automata size, especially for intrusion detection systems.
- If F_1 and F_2 are independent, a packet may match F_1, F_2, both, or neither.
- Suppose the automaton for F_1 has k_1 states and the automaton for F_2 has k_2 states.
- A single automaton for F_1 and F_2 together has k_1 · k_2 states.
- The independent automata for F_1 and F_2 have k_1 + k_2 states combined.

Improving Matching Time
Utility
- How much a test goes towards checking a filter.
- Based on the notion of assigning costs to tests and filters.
- Compares the cost of a filter with the combined cost of a test and the residue of the filter w.r.t. the test.
select strategy (size reduction is more important than matching time)
1. Pick a discriminating test when available; pick the test with higher utility.
2. Examine opportunities for benign nondeterminism.
3. Pick tests that satisfy the polynomial bound.

Measuring Matching Time
- We want an implementation-independent metric for matching time.
- Suppose we could guess the set of rules that match a packet.
- The match verification cost is then a lower bound for any algorithm that tries to identify the matching rules.
- We use the ratio of the actual matching cost to this lower bound as the metric for matching time.

Experiments: Setup for IDS
- Snort
  - Open source, with comprehensive default signatures.
  - Signatures consist of packet field tests and content-matching operations.
  - Snort Next Generation (Snort-NG) matches packet fields in parallel.
  - Snort version 2 (Snort v2) tries to parallelize matching for some fields.
- Used the 1635 default rules that come with Snort; combined rules with the same packet field tests to get 305 rules.
- System: 1.70 GHz Pentium 4 processor, 520 MB, CentOS 4.2 (Linux kernel 2.6)

Automaton Size
[Figure: number of states (0 to 20000) versus number of filtering rules (0 to 300), comparing Condition Factorization and Snort-NG.]

Effect of Optimizations on Size
[Figure: number of states (up to 40000) versus number of filtering rules (0 to 300) for LR Tree, LR DAG, Adaptive Tree, Adaptive DAG, and Adaptive DAG with benign non-determinism.]

Matching Time Lower Bound
[Figure: average path length in terms of tests (0 to 25) versus number of filtering rules (0 to 300), comparing Adaptive Traversal with the Lower Bound.]

Matching Time
[Figure: matching time (axis labeled "in s", 0 to 90) versus number of filtering rules (0 to 300) for Snort 2, Snort-NG, and Condition Factorization.]

Experiments: Setup for Firewall
- Department firewall rules
  - Firewall rules in the form of iptables rules for a Linux machine.
  - Network divided into different subnets.
  - 140 filtering rules.
- System: 1.70 GHz Pentium 4 processor, 520 MB, CentOS 4.2 (Linux kernel 2.6)

Automaton Size
[Figure: number of states (0 to 4000) versus number of filtering rules (0 to 140) for the Adaptive Traversal DAG.]

Matching Time Lower Bound
[Figure: average path length in terms of tests (0 to 14) versus number of filtering rules (0 to 140), comparing the Lower Bound with the Actual Path Length.]

Extending Our Techniques for Content Matching
Key idea
- Use boolean variables corresponding to the strings being matched.
- Test the boolean variables to check for the presence of the corresponding string in the payload.
- Treat tests on these boolean variables just like tests on other packet fields.
Filters:
- F_1 : (tcp_sport = 80) ∧ (content = "Command complete")
- F_2 : (tcp_sport = 80) ∧ (content = "Bad command") ∧ (content = "Bad filename")
- F_3 : (tcp_sport = 25) ∧ (content = "Command complete")
Rewritten with boolean variables X_1, X_2, X_3, where C_i is the packet field condition of F_i:
- F_1 : C_1 ∧ (X_1 = 1)
- F_2 : C_2 ∧ (X_2 = 1) ∧ (X_3 = 1)
- F_3 : C_3 ∧ (X_1 = 1)
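The rewriting into boolean variables can be sketched as follows. The data layout and function names are assumptions made for illustration, and the payload check uses a plain substring search as a stand-in for whatever string matcher the real system uses.

```python
from typing import Dict, List, Tuple

# filter name -> (required tcp_sport, required content strings)
FILTERS: Dict[str, Tuple[int, List[bytes]]] = {
    "F1": (80, [b"Command complete"]),
    "F2": (80, [b"Bad command", b"Bad filename"]),
    "F3": (25, [b"Command complete"]),
}


def assign_variables(filters) -> Dict[bytes, str]:
    """Give each distinct content string one boolean variable X1, X2, ..."""
    variables: Dict[bytes, str] = {}
    for _, strings in filters.values():
        for s in strings:
            variables.setdefault(s, f"X{len(variables) + 1}")
    return variables


VARS = assign_variables(FILTERS)


def evaluate(payload: bytes) -> Dict[str, int]:
    """Set X_i to 1 if its string occurs in the payload, else 0."""
    return {var: int(s in payload) for s, var in VARS.items()}


def match(sport: int, payload: bytes) -> List[str]:
    x = evaluate(payload)
    return [
        name
        for name, (port, strings) in FILTERS.items()
        if sport == port and all(x[VARS[s]] == 1 for s in strings)
    ]


print(VARS)
print(match(80, b"200 Command complete"))
```

Because F_1 and F_3 share the string "Command complete", they share the variable X_1, so the string is searched for once and its result is tested like any other packet field.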

Interesting Questions
When to perform string matching?
- Perform string matching before passing the packet to the packet classification automata.
- Lazy evaluation: perform string matching only when the packet classification automata cannot proceed.
How to handle regular expressions?
- Use combined packet-field and string matching as a prefilter for RE matching.
- How can we include parts of an RE in string matching to maximize the gains from prefiltering?

Related Work
- Left-to-right traversal based techniques
  - PathFinder, DPF: share common prefixes.
  - BPF+: uses global data flow techniques to eliminate redundant tests.
  - Cannot reason about semantic redundancies in the presence of complex tests.
- Adaptive traversal
  - Sekar et al.; Adaptive Binary Matching [Gustafsson].
  - Do not handle inequalities, disequalities, or bit-fields.
  - Automata have exponential worst-case space complexity.
  - Linear-size guarded sequential automata require runtime manipulation of match sets.
- Dynamic reordering techniques
  - DPF, Al-Shaer et al.: maintain statistics regarding traffic.
- Techniques for routers
  - Work on a fixed number of fields.
  - Srinivasan et al., Lakshman et al.: multidimensional searching problem.
  - Woo et al., Gupta et al.: decision-tree-based techniques.

Conclusion and Future Work
Summary
- Developed a new technique for fast packet classification.
  - Flexible: supports diverse applications in a uniform framework.
  - Promotes sharing of tests.
- Developed novel techniques for generating packet classification automata that
  - have polynomial size
  - have virtually constant matching time
- Demonstrated the gains from our technique for intrusion detection systems and firewalls.
Future work
- Complete the integration, then evaluate the combined content matching and packet field matching operation.

Thank You
Acknowledgement: Sreenath Vasudevan
Questions?

Computing Match and Candidate Sets
- P_s denotes the conjunction of tests on the path from the start state to s.
- Maintain only the residuals of the original filters in C_s and M_s with respect to P_s.
Match set
- $M_1 = \{ M \in \mathcal{F}/P_s \mid M = \mathit{true} \}$
- $M_2 = \{ M \in M_1 \mid \neg\exists M' \in \mathcal{F}/P_s : Pri(M') > Pri(M) \}$
- M_s is obtained by considering filters with equal priorities in M_2 and deleting all but one of them.
Candidate set
- $C(\mathcal{F}, \mathcal{M}) = \{ C \in \mathcal{F} \mid \neg\exists M' \in \mathcal{M} \text{ with } Pri(M') \ge Pri(C) \}$
- $C_s = C(\mathcal{F}/P_s, M_s)$