
Chapter 5 STATISTICAL CODING: Presentation, Entropy

Information data coding

Information data coding: a coded representation of information, i.e. an injective correspondence between the messages and the sequence {b_i}. Coding plays multiple roles:
- preparing the transformation message => transmitted signal;
- adapting the source bit rate to the channel capacity (compression);
- protective encoding against transmission errors (error detection / correction);
- encrypting (secret communications);
- watermarking (ownership markers);
- transcoding (alphabet changes, transmission constraints).

The goal of a communication system is to transport messages from a sender (the information source) towards a recipient (the information user). The signal supporting the information being transmitted has to be compatible with the characteristics of the transmission channel. Information coding must establish an injective correspondence between the message produced by the source and the sequence of information {b_i} sent to the transmitter. This information coding plays numerous roles:
- It prepares the transformation of a message into a signal (carried out in the transmitter part: signal information encoding);
- It adapts the source of information to the capacity of the transmitting channel;
- It can be used to protect information against transmission errors (within certain limits), so as to detect and/or correct them (due to channel disturbances);
- It can also be used in certain cases for secret communications (encrypting) and for watermarking to protect ownership.

A given transmission channel must be able to transport messages of various types; that is why transcoding is necessary: it transforms the message representation from a codebook M_k using an alphabet A_k into a representation of the same message from a codebook M_0 using an alphabet A_0. This particular alphabet is often the binary set {0, 1}, but it is not the only one.

Information data coding: Definitions

Message source S: production of a sequence of messages, each of them being selected in a set M of messages (M: codebook of possible messages, M = {m_1, m_2, ...}; the m_i are also called "words").
Message: finite sequence of symbols (characters taken from A: alphabet).
Alphabet: finite set of symbols A = {a_1, a_2, ..., a_K}.

Definitions: A message is any finite sequence of characters taken from an alphabet A, a finite set of symbols (for example: letters, digits, etc.). A message source S is the system that produces a temporal series of messages m_i, each of them taken from a set of possible messages M. M is called a message (or word) codebook. The transmitted message is in fact a text formed, according to syntactic rules, of elementary messages called words: M = {m_1, m_2, ...}; each word is written with a fixed, finite set of symbols taken from the alphabet A. Depending on the application, the message sources S can use dictionaries of very different types: from messages written using characters of the alphabet, numbers, punctuation and tab marks, to visual messages where the messages are digital images in which, for example, each word is a pixel represented as a sequence of 8 binary symbols taken from the alphabet {0, 1} (bits).

Entropy of a source (SHANNON 1948)

Definition of uncertainty and of entropy.
Uncertainty I of an event E: I(E) = -log2 Pr{E}.
Units: the bit (BInary uniT, if log2) and the nat (NAtural uniT, if natural log): 1 nat = 1.443 bits.
If the source is simple: I(s_n) = Σ_{i=1..n} I(m_αi).
Entropy H of a discrete random variable X: H(X) = E_X[I(X)] = Σ_{i=1..N} p_i·I(x_i) = -Σ_{i=1..N} p_i·log2(p_i).
Properties of entropy: H ≥ 0; H is continuous, symmetrical; H(p_1, ..., p_N) ≤ log2 N; if (p_1, ..., p_N) and (q_1, ..., q_N) are two probability distributions, then Σ_{i=1..N} p_i·log2(q_i/p_i) ≤ 0, because ln x ≤ x - 1.

The uncertainty I of an event E of probability Pr(E) is defined by: I(E) = log2(1 / Pr(E)) = -log2 Pr(E).

Notes:
- if Pr(E) = 1/2 then I(E) = 1 (unitary uncertainty);
- if Pr(E) = 1 then I(E) = 0: the uncertainty is null for a certain event;
- the uncertainty unit is the bit (Binary Unit); it is not the same as the bit meaning Binary digit;
- we can use the natural logarithm instead of the base 2 logarithm, in which case the unit is the nat (Natural Unit = 1.443 bits).

We now consider that the events E are in fact realizations of a discrete random variable X. We define the entropy H as the average uncertainty of the random variable X. If we consider each event x_i, i in [1, N], as a realization of a random variable X (i.e. X is a random variable with values in {x_1, x_2, ..., x_N}):

H(X) = E_X{I(X)} = Σ_{i=1..N} Pr{X = x_i}·I(x_i) = Σ_{i=1..N} p_i·I(x_i), with p_i = Pr{X = x_i}

The entropy depends on the probability law of X but is not a function of the values taken by X. It is expressed in bits (or nats) and represents the average number of bits necessary to encode in binary the different realizations of X.

Now let us consider an information source S defined by a set of possible messages m_i (the codebook): S = {m_1, m_2, ..., m_N}, and by a mechanism for emitting messages: s_n = {m_α1, m_α2, ..., m_αn}, with m_α1 the first emitted message, ..., m_αn the n-th emitted message. Warning: the index i in αi defines the temporal position in the sequence of messages emitted by the source, while αi is the index, in the codebook M of possible messages, of the i-th emitted message; in general, n ≠ N.

The choice of m_αi occurs according to a given probability law. The emission of a discrete source of information thus corresponds to a sequence of random variables X_i, i in [1, n]. The probability of s_n can be expressed as a product of conditional probabilities:

Pr(s_n) = Pr{X_1 = m_α1} · Pr{X_2 = m_α2 | X_1 = m_α1} · ... · Pr{X_n = m_αn | X_1 = m_α1, ..., X_{n-1} = m_α(n-1)}

In the case of simple sources, the random variables X_i are independent and identically distributed, which gives: for all (i, j) in [1, n] x [1, N], Pr{X_i = m_j} = p_j, and Pr{s_n} = p_α1·p_α2·...·p_αn. Hence:

I(s_n) = -log2 Pr{s_n} = -log2(p_α1·p_α2·...·p_αn) = Σ_{i=1..n} -log2 p_αi = Σ_{i=1..n} I(m_αi)

In the case of a discrete source of N messages m_i, where each message m_i is associated with a probability p_i, the entropy H of the source S is given by:

H(S) = -Σ_{i=1..N} p_i·log2 p_i

Properties of entropy:

- As 0 ≤ p_i ≤ 1 and Σ_{i=1..N} p_i = 1, we have H(X) ≥ 0: the entropy is non-negative.

- Given (p_1, p_2, ..., p_N) and (q_1, q_2, ..., q_N) two probability laws, then Σ_{i=1..N} p_i·log2(q_i/p_i) ≤ 0. Indeed, for x > 0 we have ln x ≤ x - 1, so ln(q_i/p_i) ≤ q_i/p_i - 1, that is log2(q_i/p_i) ≤ (1/ln 2)·(q_i/p_i - 1). Thus:
Σ_{i=1..N} p_i·log2(q_i/p_i) ≤ (1/ln 2)·Σ_{i=1..N} p_i·(q_i/p_i - 1) = (1/ln 2)·(Σ_{i=1..N} q_i - Σ_{i=1..N} p_i) = (1/ln 2)·(1 - 1) = 0.

- The entropy of a random variable X with N possible values is maximal, and equal to log2 N, when X follows a uniform probability law. Taking q_1 = q_2 = ... = q_N = 1/N (uniform law) in the previous property:
Σ_{i=1..N} p_i·log2(1/(N·p_i)) ≤ 0, hence -Σ_{i=1..N} p_i·log2 p_i ≤ Σ_{i=1..N} p_i·log2 N, i.e. H(X) ≤ log2 N.

- The entropy is continuous and symmetrical.

For the rest of this course, we will systematically use the logarithm in base 2.

Simple example: Let us consider a source S, of uniform law, that sends messages built from the 26 characters of the French alphabet (a, b, c, ..., z). To this alphabet we add the "space" character as a word separator. The alphabet is thus made up of 27 characters:

H(S) = -Σ_{i=1..27} (1/27)·log2(1/27) = log2(27) ≈ 4.75 bits of information per character.

In practice, the entropy measured on a very large amount of French text is closer to 4 bits of information per character.
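To make the definition concrete, here is a minimal Python sketch (the function name entropy and the example values are my own additions) that computes H(X) = -Σ p_i·log2 p_i and checks the uniform 27-character case against the bound log2 N:

import math

def entropy(probs):
    # Entropy H = -sum p_i * log2(p_i) of a discrete probability law, in bits.
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Uniform law over the 27-character alphabet (26 letters + the space character)
uniform = [1 / 27] * 27
print(entropy(uniform))   # ~4.755 bits/character, i.e. log2(27)
print(math.log2(27))      # the upper bound log2 N, reached for the uniform law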

Chapter 5 STATISTICAL CODING: Huffman Coding

Optimal statistical coding

Definitions:
- S: discrete and simple source of messages m_i with probability law p = (p_1, ..., p_N) (homogeneous source);
- coding over an alphabet A = {a_1, a_2, ..., a_q};
- entropy of the source H(S) and average length of the code-words E(n).

MacMillan's theorem: there exists at least one irreducible inverting code that satisfies
H / log2 q ≤ E(n) < (H / log2 q) + 1,
with equality if the p_i are of the form p_i = q^(-n_i) (if q = 2, then n_i = -log2 p_i).

Shannon's theorem (first theorem, on noiseless coding):
H / log2 q ≤ E(n) < (H / log2 q) + ε, with ε > 0 arbitrarily small.

In the previous resource, "Defining coding and properties", we saw that the objectives of coding are mainly to transcribe information and to reduce the quantity of symbols necessary to represent it. By optimizing the coding, we attempt to reduce this quantity of symbols as much as possible.

Let us consider a simple source of homogeneous information S = {m_1, m_2, ..., m_N} equipped with a probability law p = {p_1, p_2, ..., p_N}, where p_i = Pr{m = m_i}. If M_i is the code-word that corresponds to the message m_i, we call n_i = n(M_i) the number of characters belonging to the alphabet A (Card(A) = q) needed for the coding of m_i; n_i is thus the length of the code-word M_i. The average length of the code-words is then:

E(n) = Σ_{i=1..N} p_i·n_i

The average uncertainty of the source is its entropy H. The average uncertainty per character of the alphabet A is equal to H / E(n), and it satisfies H / E(n) ≤ log2 q because the alphabet A contains q characters. From this inequality, we deduce that E(n) ≥ H / log2 q.

To optimize the coding, we want to reduce E(n), the average length of the code-words. This average length cannot be lower than H / log2 q. Nevertheless, two theorems show that it is possible to reach, or at least to approach, the equality E(n) = H / log2 q, i.e. an optimal code:

MacMillan's theorem: an information source S with entropy H, coded in an inverting way with an alphabet counting q characters, is such that E(n) ≥ H / log2 q, and there exists at least one irreducible code for the given law such that E(n) < (H / log2 q) + 1. Equality is reached if p_i = q^(-n_i) (i.e. n_i = -log_q p_i), and we then have an optimal coding.

Note: in the particular case of a binary alphabet {0, 1}, we have q = 2. If the relationship n_i = -log2 p_i holds, we then have E(n) = H: the entropy is the lower bound of the set of code-word average lengths, and this lower bound is reached.

Shannon's theorem of noiseless coding: for any homogeneous source of information there exists an irreducible coding for which the average length of the code-words is as close as we want to the lower bound H / log2 q. The demonstration of this theorem uses irreducible coding of blocks (block coding assigns a code-word to each block of "k" messages of S, consecutive or not).
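As a quick illustration of these bounds for a binary alphabet (q = 2, so log2 q = 1), here is a minimal Python sketch, with names of my own choosing, that derives Shannon-style lengths n_i = ceil(-log2 p_i), checks Kraft's inequality (which guarantees that an irreducible code with those lengths exists), and verifies H ≤ E(n) < H + 1:

import math

def shannon_lengths(probs):
    # Code-word lengths n_i = ceil(-log2 p_i) for a binary alphabet.
    return [math.ceil(-math.log2(p)) for p in probs]

probs = [0.5, 0.25, 0.125, 0.125]
lengths = shannon_lengths(probs)
H = -sum(p * math.log2(p) for p in probs)
E_n = sum(p * n for p, n in zip(probs, lengths))
kraft = sum(2 ** -n for n in lengths)   # <= 1: an irreducible (prefix) code with these lengths exists
print(lengths, H, E_n, kraft)           # here every p_i = 2^-n_i, so E(n) = H = 1.75 exactly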

Optimal statistical coding

Fano-Shannon coding.
Arithmetic coding (block encoding, interval-type encoding), with possibilities of on-line adaptation.
Huffman coding, based on 3 basic principles:
- if p_i < p_j then n_i ≥ n_j;
- the 2 least likely code-words have the same length;
- the 2 least likely code-words (of maximum length) have the same prefix of length n_max - 1.

The presentation above lists three types of codes that are close to the optimal code:

Shannon-Fano coding: it tries to approach as closely as possible the most compact irreducible code, but the probabilities p_i are not usually equal to 2^(-n_i), so the coding can only be close to optimal. The probabilities p_i associated with the messages m_i are arranged in decreasing order, then the lengths n_i are fixed so that log2(1/p_i) ≤ n_i < log2(1/p_i) + 1. Finally, we choose each code-word M_i of length n_i so that none of the previously chosen code-words forms a prefix of it (this avoids decoding ambiguity, cf. "classification of the codes").

Arithmetic coding: the code is associated with a sequence of messages from the source and not with each message. Unlike Huffman coding, which must use an integer number of bits per message and therefore does not always achieve an optimal compression, arithmetic coding lets you code a message on a non-integer number of bits: it is the most effective method, but it is also the slowest. The different aspects of this coding are developed further in the resource "Statistical coding: arithmetic coding".

Huffman coding: Huffman coding is the optimal irreducible code. It is based on three principles:
- if p_j > p_i then n_j ≤ n_i;
- the two least likely code-words have equal lengths;
- the latter are written with the same first n_max - 1 characters.

By applying this procedure iteratively, we build the code-words M_i of the messages m_i.

Example: Let S = {m_1, m_2, ..., m_8} be a source with the probability law: p_1 = 0.4; p_2 = 0.18; p_3 = p_4 = 0.1; p_5 = 0.07; p_6 = 0.06; p_7 = 0.05; p_8 = 0.04.

We place these probabilities in decreasing order in the column p_i^(0) of the table below. In the column p_i^(0), the probabilities of the messages m_7 and m_8 are the smallest; we add them and reorder the probabilities, still in decreasing order, to create the column p_i^(1). More generally, we add the two smallest probabilities in the column p_i^(k), then we reorder the probabilities in decreasing order to obtain the column p_i^(k+1). Finally we get the following table:

We assign the bits 0 and 1 to the last two elements of each column. For each message m_i, we go through the table from left to right, following in each column the associated probability p_i^(k) (blue path on the illustration below). The code-word M_i is then obtained by starting from the last column on the right and moving back to the first column on the left, selecting the bits associated with the probabilities p_i^(k) of the message m_i (green rectangles on the illustration below). For example, suppose we want to determine the code-word M_6 of the message m_6. We locate all the probabilities p_6^(k):

The code-word M_6 is thus obtained by simply reading, from right to left, the bits contained in the green rectangles: 0 1 0 1. By following the same procedure for each message, we obtain the complete code.

The average length of the code-words is equal to:

E(n) = Σ_{i=1..8} p_i·n_i = 0.4×1 + 0.18×3 + 0.1×3 + 0.1×4 + 0.07×4 + 0.06×4 + 0.05×5 + 0.04×5 = 2.61

We can compare this length with the entropy H of the source:

H = -Σ_{i=1..8} p_i·log2 p_i = 2.552

The efficiency η of the Huffman coding for this example is thus 2.552 / 2.61 = 97.8 %. For comparison purposes, 3 bits are needed to code 8 different messages in natural binary (2^3 = 8); for this example, the efficiency of the natural binary coding is only 2.552 / 3 = 85 %.
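For readers who want to reproduce these figures, here is a minimal Python sketch of the Huffman construction using a heap of merged groups (a standard textbook formulation rather than the column table used above; the function name huffman_code is my own). Because of ties between equal probabilities, individual code-word lengths may differ slightly from the table, but the average length E(n) = 2.61 and the efficiency are the same:

import heapq, math

def huffman_code(probs):
    # Build a binary Huffman code: returns {message: code-word} from {message: probability}.
    heap = [(p, i, {m: ""}) for i, (m, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # the two least likely groups...
        p2, _, c2 = heapq.heappop(heap)
        merged = {m: "0" + w for m, w in c1.items()}        # ...receive the bits 0 and 1
        merged.update({m: "1" + w for m, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))    # and are merged into one group
        counter += 1
    return heap[0][2]

probs = {"m1": 0.4, "m2": 0.18, "m3": 0.1, "m4": 0.1,
         "m5": 0.07, "m6": 0.06, "m7": 0.05, "m8": 0.04}
code = huffman_code(probs)
E_n = sum(p * len(code[m]) for m, p in probs.items())   # 2.61
H = -sum(p * math.log2(p) for p in probs.values())      # ~2.552
print(code, E_n, H / E_n)                               # efficiency ~0.978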

Chapter 5 STATISTICAL CODING: Arithmetic Coding

Arithmetic Codes

Basic principles:
- the code is associated with the sequence of symbols m_i (messages), not with every symbol in the sequence;
- coding of intervals of type [c_i, d_i[ for each symbol;
- iteration on the selected interval for the next symbol of the sequence;
- the sequence of symbols is coded by a real value in [0, 1[.

Arithmetic codes allow you to encode a sequence of events by using estimates of the probabilities of the events. The arithmetic code assigns one code-word to each possible data set. This technique differs from Huffman codes, which assign one code-word to each possible event. The code-word assigned to one sequence is a real number which belongs to the half-open unit interval [0, 1[. Arithmetic codes are calculated by successive subdivisions of this original unit interval. For each new event in the sequence, the subinterval is refined using the probabilities of all the individual events. Finally we obtain a half-open subinterval of [0, 1[ such that any value in this subinterval encodes the original sequence.

Arithmetic Codes: Definitions
- Let S = {s_1, s_2, ..., s_N} be a source and p_k = Pr(s_k);
- [Ls_k, Hs_k[ is the interval corresponding to the symbol s_k, with Hs_k - Ls_k = p_k.

Encoding algorithm:
1) Initialization: L_c = 0; H_c = 1
2) Calculate the code sub-intervals
3) Get the next input symbol s_k
4) Update the code sub-interval
5) Repeat from step 2 until the whole sequence has been encoded

Let S = {s_1, ..., s_N} be a source which can produce N source symbols. The probabilities of the source symbols are denoted by Pr{s_k} = p_k, for k in [1, N]. Here is the basic algorithm for the arithmetic coding of a sequence s_M = {s_α1, s_α2, ..., s_αM} of M source symbols (s_αk stands for the k-th source symbol that occurs in the sequence we want to encode):

Step 1: We begin with a current half-open interval [L_c, H_c[ initialized to [0, 1[ (this interval corresponds to the probability of choosing the first source symbol s_α1). The length of the current interval is defined by: length = H_c - L_c.

Step 2: For each source symbol in the sequence, we subdivide the current interval into half-open subintervals, one for each possible source symbol s_k. The size of the subinterval associated with s_k depends on its probability p_k: the symbol interval [Ls_k, Hs_k[ has length Hs_k - Ls_k = p_k, with

Ls_k = Σ_{i=1..k-1} p_i and Hs_k = Σ_{i=1..k} p_i,

and the subinterval of the current interval reserved for s_k is [L_c + length×Ls_k, L_c + length×Hs_k[.

Step 3: We select the subinterval corresponding to the source symbol s_k that occurs next and make it the new current interval [L_c, H_c[:

L_c ← L_c + length × Ls_k
H_c ← L_c + length × Hs_k

(where L_c and length on the right-hand side refer to the previous current interval).

Step 4: This new current interval is subdivided again as described in Step 2.

Step 5: Steps 2, 3 and 4 are repeated until the whole sequence of source symbols has been encoded.
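Here is a minimal Python sketch of this encoder, under the conventions above (cumulative bounds Ls_k and Hs_k derived from the probabilities; the function name arithmetic_encode is my own). A practical implementation would use integer arithmetic and renormalization to avoid running out of floating-point precision on long sequences:

def arithmetic_encode(sequence, probs):
    # Return the final half-open interval [L_c, H_c[ that encodes the symbol sequence.
    # probs is a dict {symbol: probability}; the symbol intervals are cumulative sums.
    bounds, cum = {}, 0.0
    for sym, p in probs.items():
        bounds[sym] = (cum, cum + p)     # [Ls_k, Hs_k[ with Hs_k - Ls_k = p_k
        cum += p

    L_c, H_c = 0.0, 1.0
    for sym in sequence:
        length = H_c - L_c
        Ls, Hs = bounds[sym]
        L_c, H_c = L_c + length * Ls, L_c + length * Hs
    return L_c, H_c

# Motion-vector source used in the example that follows: p(-2)=0.1, p(-1)=0.2, p(0)=0.4, p(1)=0.2, p(2)=0.1
probs = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 2: 0.1}
print(arithmetic_encode([0, -1, 0, 2], probs))   # ~(0.3928, 0.396)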

Arithmetic Coding: Example

The source S = {-2, -1, 0, 1, 2} is the set of 5 possible motion vector values (used for video coding); 0 is the null motion vector. Y is the random variable associated with the motion vector values, with the following probabilities:

Pr{Y = -2} = p_1 = 0.1
Pr{Y = -1} = p_2 = 0.2
Pr{Y = 0} = p_3 = 0.4
Pr{Y = 1} = p_4 = 0.2
Pr{Y = 2} = p_5 = 0.1

We want to encode the motion vector sequence (0, -1, 0, 2). Here is an example of encoding the motion vectors of a video signal with an arithmetic code.

Arithmetic Coding: Subdivisions of the current sub-intervals for s_α1 = 0, s_α2 = -1, s_α3 = 0, s_α4 = 2.

To encode the sequence {s_α1, s_α2, s_α3, s_α4} = {s_3, s_2, s_3, s_5} = {0, -1, 0, 2}, we subdivide the unit interval [0, 1[ into 5 half-open intervals, then we select the subinterval corresponding to the first motion vector that occurs in the sequence (value 0). This subinterval becomes the new current interval. We subdivide it and we select the subinterval corresponding to the next event (the motion vector -1). We repeat these steps for each source symbol of the sequence (here the source symbols are the motion vectors).

Consequently, we can encode the sequence (0, -1, 0, 2) of vertical motion vectors by any value in the half-open range [0.3928, 0.396[. The value 0.3945 encodes this sequence, and 8 bits are enough to represent it:

0.3945 ≈ 0×2^-1 + 1×2^-2 + 1×2^-3 + 0×2^-4 + 0×2^-5 + 1×2^-6 + 0×2^-7 + 1×2^-8

So we need 8/4 = 2 bits per symbol.
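To check this bit count, here is a small Python sketch (the helper to_bits is my own) that extracts the first 8 binary digits of the chosen value (0.39453125, quoted as 0.3945 above) and verifies that the number they represent still falls inside the interval [0.3928, 0.396[:

def to_bits(x, n):
    # First n binary digits of x in [0, 1[: x ~ sum(b_i * 2**-i).
    bits = []
    for _ in range(n):
        x *= 2
        b = int(x)
        bits.append(b)
        x -= b
    return bits

bits = to_bits(0.39453125, 8)
print(bits)                                               # [0, 1, 1, 0, 0, 1, 0, 1]
value = sum(b * 2 ** -(i + 1) for i, b in enumerate(bits))
print(0.3928 <= value < 0.396)                            # True: 8 bits identify the interval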

Arithmetic Coding: Decoding algorithm
1) Initialization: L_c = 0; H_c = 1
2) Calculate the code sub-interval length: length = H_c - L_c
3) Find the symbol sub-interval [Ls_k, Hs_k[, with 1 ≤ k ≤ N, such that: Ls_k ≤ (codeword - L_c) / length < Hs_k
4) Output symbol: s_k
5) Update the subinterval: L_c ← L_c + length × Ls_k; H_c ← L_c + length × Hs_k
6) Repeat from step 2 until the last symbol is decoded

The algorithm above decodes a codeword obtained after having encoded a sequence of source symbols with an arithmetic code. Let us apply it to the previous example, with the codeword M_c = 0.3945:

[Algorithm beginning]

Step 1: We initialize the current interval [L_c, H_c[: L_c = 0 and H_c = 1.
Step 2: We calculate the length L of the current interval: L = H_c - L_c = 1.
Step 3: We calculate the value (M_c - L_c) / L = 0.3945 and we select the subinterval [Ls_k, Hs_k[ that contains it. Here, the selected subinterval is [0.3, 0.7[, which corresponds to the half-open interval [Ls_3, Hs_3[.
Step 4: The first symbol s_α1 of the sequence is thus s_3 (motion vector 0).

Step 5: We create the new current interval [L_c, H_c[ for decoding the next source symbol:
L_c = L_c + L × Ls_3 = 0 + 1 × 0.3 = 0.3
H_c = L_c + L × Hs_3 = 0 + 1 × 0.7 = 0.7

Step 2: We calculate the length L of the current interval: L = H_c - L_c = 0.7 - 0.3 = 0.4.
Step 3: (M_c - L_c) / L = (0.3945 - 0.3) / 0.4 = 0.2363. This value belongs to the subinterval [Ls_2, Hs_2[ = [0.1, 0.3[.
Step 4: The second symbol s_α2 of the sequence is thus s_2 (motion vector -1).
Step 5:
L_c = L_c + L × Ls_2 = 0.3 + 0.4 × 0.1 = 0.34
H_c = L_c + L × Hs_2 = 0.3 + 0.4 × 0.3 = 0.42

Step 2: We calculate the length L of the current interval: L = H_c - L_c = 0.42 - 0.34 = 0.08.
Step 3: (M_c - L_c) / L = (0.3945 - 0.34) / 0.08 = 0.6812. This value belongs to the subinterval [Ls_3, Hs_3[ = [0.3, 0.7[.
Step 4: The third symbol s_α3 of the sequence is thus s_3 (motion vector 0).
Step 5:
L_c = L_c + L × Ls_3 = 0.34 + 0.08 × 0.3 = 0.364
H_c = L_c + L × Hs_3 = 0.34 + 0.08 × 0.7 = 0.396

Step 2: We calculate the length L of the current interval: L = H_c - L_c = 0.396 - 0.364 = 0.032.

Step 3: (M_c - L_c) / L = (0.3945 - 0.364) / 0.032 = 0.9531. This value belongs to the subinterval [Ls_5, Hs_5[ = [0.9, 1[.
Step 4: The fourth symbol s_α4 of the sequence is thus s_5 (motion vector 2).

[Algorithm end]

The decoding of the value 0.3945 allows us to rebuild the original sequence {s_α1, s_α2, s_α3, s_α4} = {s_3, s_2, s_3, s_5} = {0, -1, 0, 2}.

Contrary to Huffman codes, arithmetic codes make it possible to allocate fractional numbers of bits to symbols, so data compression with arithmetic codes is more efficient. However, arithmetic coding is slower than Huffman coding, and it is not possible to start decoding before the entire sequence of symbols has been received, which is possible with Huffman coding. The compression rate can also be increased by using probability models which are not static: the probabilities are adapted according to the current and the previous sequences, so arithmetic coding naturally handles adaptive coding.
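The decoding loop translates almost directly into Python; here is a minimal sketch under the same assumptions as the encoder above (the number of symbols to decode is passed explicitly, since this toy version has no terminating symbol):

def arithmetic_decode(codeword, n_symbols, probs):
    # Decode n_symbols from a real codeword in [0, 1[, given the symbol probabilities.
    bounds, cum = {}, 0.0
    for sym, p in probs.items():
        bounds[sym] = (cum, cum + p)
        cum += p

    L_c, H_c, decoded = 0.0, 1.0, []
    for _ in range(n_symbols):
        length = H_c - L_c
        target = (codeword - L_c) / length
        for sym, (Ls, Hs) in bounds.items():      # find [Ls_k, Hs_k[ containing the target
            if Ls <= target < Hs:
                decoded.append(sym)
                L_c, H_c = L_c + length * Ls, L_c + length * Hs
                break
    return decoded

probs = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 2: 0.1}
print(arithmetic_decode(0.3945, 4, probs))   # [0, -1, 0, 2]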

Chapter 5 STATISTICAL CODING: Presentation, Classification of the statistical codes

Information data coding: Objectives
- Transcription of information to facilitate coding: code => signal (transcoding)
- Information compression: reducing the information size
- Protection against transmission errors: against losses and decision errors
- Keeping transmitted information secret: encryption

Definition of a code: an application of S into M, where A = {a_1, a_2, ..., a_q}: to each message m_i of S corresponds a code-word M_i of M, the set of finite sequences of characters of A.

Information coding consists of transcribing the messages from an information source in the form of sequences of characters taken from a predefined alphabet. The objectives of coding fall into four main categories:

- transcribing information in a form that makes it easy to produce a signal carrying the information, or to handle the information automatically; to do this, different codes for representing the information are used depending on the envisaged application, with transcoding operations frequently being needed;
- reducing the number of symbols needed to represent the information (in terms of the total number of symbols used): this is a space-saving role;
- protecting against the quality losses (distortion, noise) introduced by the transmission channel, which lead to errors when the information is reconstructed at the channel output (upon reception);
- protecting confidential information by making it unintelligible except for its intended recipient.

Definition of a code: given a set called the alphabet A, made up of q characters a_i: A = {a_1, a_2, ..., a_q}, let M be the set of finite sequences of characters of A (for example: M_i = a_10 a_4 a_7). Given a finite set of messages emitted by a message source S: S = {m_1, ..., m_N}, a code is any application of S into M: a coding of S through the use of the alphabet A. The element M_i of M which corresponds to the message m_i of S is called the code-word of m_i. Its length, noted n_i, is the number of characters belonging to A which compose M_i. The decoding of a sequence of sent messages m_i requires being able to separate the code-words in a received sequence of code-words M_i. This is why a special spacing character is sometimes added to the alphabet.

Information data coding (4)

Alphabet A = {a_1, a_2, ..., a_q}
Finite set of messages S = {m_1, m_2, ..., m_i, ..., m_N}
Coding C = {M_1, M_2, ..., M_i, ..., M_N}
Length of the code-words: n_i = n(M_i)
Average length of the code-words: E(n) = Σ_{i=1..N} p_i·n_i
Entropy of the source: H(p_1, ..., p_N) ≤ log2 N
Average quantity of information per character: H / E(n), with H / E(n) ≤ log2 q, hence E(n) ≥ H / log2 q
Rate of a source of information coded with an average of D characters per second: R = D·H / E(n), hence R ≤ D·log2 q bits/second

From here on, we shall call m_i the messages produced by the information source and M_i the code-words associated with them. We call n_i = n(M_i) the number of characters belonging to the alphabet A (Card(A) = q) necessary for coding m_i, n_i being the length of the code-word M_i. If the source uses N different possible messages, the average length of the code-words is given by:

E(n) = Σ_{i=1..N} p_i·n_i, where p_i = Pr{m_i}.

H is the average uncertainty (i.e. the entropy) of the source S per message sent, so the average uncertainty (i.e. the entropy) per character of the alphabet A equals H / E(n), and we have H / E(n) ≤ log2 q (because there are q characters in the alphabet A), hence E(n) ≥ H / log2 q.

Finally, if the coded information source produces D characters per second taken from the alphabet A, H / E(n) being the average information transported per character in bits/character, the information rate R is: R = D·H / E(n). This rate is then limited by R ≤ D·log2 q.

Coding and decoding information (5)

Efficiency η of a code: η = n_min / E(n), i.e. η = H / (E(n)·log2 q)
Redundancy ρ of a code: ρ = 1 - η
Simple examples: codes C_1 and C_2
Constraints: separation of the code-words and unambiguous reading of the code-words => regular and inverting codes
Regular code: if m_i ≠ m_j then M_i ≠ M_j (injective application)
Inverting code: 2 distinct sequences of messages ==> 2 distinct code sequences:
if (m_α1, ..., m_αi) ≠ (m_β1, ..., m_βj) then (M_α1, ..., M_αi) ≠ (M_β1, ..., M_βj)
examples: fixed-length codes; codes with separator
Irreducible code: inverting code that can be decoded without any special device: M_i is not a prefix of M_j, for all i ≠ j

Some definitions and properties linked to information encoding and decoding:

Efficiency: for a given alphabet A, the efficiency of a code is given by:
η = n_min / E(n) = (H / log2 q) / E(n) = H / (E(n)·log2 q), with η in [0, 1]

Redundancy: the mathematical redundancy is defined by the factor ρ = 1 - η. Redundancy can be used to increase the robustness of the coding when faced with transmission errors in the coded information (error detection and correction).

Here is a simple example: we consider a source of 4 possible messages {m_1, m_2, m_3, m_4} with probabilities p_1 = 0.5; p_2 = 0.25; p_3 = p_4 = 0.125, respectively. Given the following two codes C_1 (simple binary coding) and C_2 (variable-length code):

Messages:  m_1   m_2   m_3   m_4
C_1:       00    01    10    11
C_2:       0     10    110   111

For C_1: η = 1.75 / 2 = 0.875 and ρ = 1 - η = 0.125.
For C_2: η = 1.75 / 1.75 = 1 and ρ = 1 - η = 0.

The code C_2 is of maximum (unitary) efficiency while the code C_1 is not.

Regular code: any given code-word is associated with only one possible message (the application S → M is injective): if m_i ≠ m_j then M_i ≠ M_j.

Inverting code: the code is inverting if two distinct sequences of messages (m_α1, ..., m_αi) and (m_β1, ..., m_βj) necessarily lead to distinct codings (for example codes of fixed length such as C_1, or codes with separator). An inverting code is thus a special case of a regular code.

Irreducible code: this is a decodable code that can be read directly without any special device (fixed-length code, separator). For that, no code-word M_i of a message m_i may have another code-word M_j as prefix. In this way, we can create a hierarchical classification to characterize a code's type:
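A small Python check of these efficiency figures (variable names are mine; the alphabet is binary, so log2 q = 1):

import math

probs = [0.5, 0.25, 0.125, 0.125]
C1 = ["00", "01", "10", "11"]     # fixed-length binary code
C2 = ["0", "10", "110", "111"]    # variable-length (irreducible) code

H = -sum(p * math.log2(p) for p in probs)                  # 1.75 bits/message
for name, code in (("C1", C1), ("C2", C2)):
    E_n = sum(p * len(w) for p, w in zip(probs, code))
    eta = H / E_n                                          # q = 2, so log2 q = 1
    print(name, E_n, eta, 1 - eta)   # C1: 2, 0.875, 0.125 ; C2: 1.75, 1.0, 0.0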

Code examples: regular codes / inverting codes / irreducible codes

Messages   Proba.   C_1   C_2   C_3    C_4
m_1        0.5      1     0     1      1
m_2        0.25     1     1     01     10
m_3        0.125    0     11    001    100
m_4        0.125    00    01    000    1000

C_1 is a non-regular code
C_2 is a non-inverting code
C_3 is an inverting and irreducible code
C_4 is only an inverting code

Here are four codes C_1, C_2, C_3 and C_4 given as examples of the previous definitions and properties. We suppose that the four messages m_1, m_2, m_3 and m_4 are distinct.

The code C_1 is not regular: m_1 ≠ m_2 but C_1(m_1) = C_1(m_2).

The code C_2 is a non-inverting code: the two texts {m_1, m_2} and {m_4} are different, but they lead to the same code "01".

The code C_3 is an inverting and irreducible code: two distinct texts made up of sequences of messages, for example {m_1, m_3, m_4} and {m_1, m_2}, always lead to different codes, and no code-word M_i = C_3(m_i) is a prefix of another code-word M_j = C_3(m_j).

The code C_4 is an inverting code but not an irreducible one: two distinct texts always lead to different codes, but each code-word M_i = C_4(m_i) is a prefix of every code-word M_j = C_4(m_j) with i < j.
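Two of these properties can be checked mechanically. Here is a small Python helper (my own naming) that tests regularity (injectivity) and the prefix condition of irreducible codes on C_1 through C_4; the inverting property (unique decodability) needs a deeper test such as the Sardinas-Patterson algorithm, not shown here:

def is_regular(code):
    # Regular: distinct messages get distinct code-words (injective mapping).
    return len(set(code.values())) == len(code)

def is_irreducible(code):
    # Irreducible (prefix) code: no code-word is a prefix of another one.
    words = list(code.values())
    return not any(w1 != w2 and w2.startswith(w1) for w1 in words for w2 in words)

C1 = {"m1": "1", "m2": "1", "m3": "0", "m4": "00"}
C2 = {"m1": "0", "m2": "1", "m3": "11", "m4": "01"}
C3 = {"m1": "1", "m2": "01", "m3": "001", "m4": "000"}
C4 = {"m1": "1", "m2": "10", "m3": "100", "m4": "1000"}

for name, code in (("C1", C1), ("C2", C2), ("C3", C3), ("C4", C4)):
    print(name, "regular:", is_regular(code), "irreducible:", is_irreducible(code))
# C1 is not regular; C3 is the only irreducible (prefix) code.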