
MEXT Grant-in-Aid for Scientific Research on Priority Areas "Statistical Mechanical Approach to Probabilistic Information Processing", Workshop on Mathematics of Statistical Inference (December 2004, Tohoku University, Sendai, Japan)

Statistical Mechanics of Multi-Terminal Data Compression: Theory and Practice

Tatsuto Murayama (E-mail: murayama@cslab.kecl.ntt.co.jp)
NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation, Kyoto 619-0237, Japan

Abstract: This paper presents an efficient LDPC-based algorithm for multi-terminal data compression. In our scenario, a codeword sequence is calculated by multiplying a data sequence by a predetermined LDPC matrix. Conversely, the decoder generates a proper reproduction sequence from a codeword sequence using a message passing algorithm. The key result is the discovery that the LDPC coding technique can provide a suboptimal solution to the decoding problem, which often suffers from the curse of dimensionality. Our analysis shows that the achievable rate region is described in terms of first-order phase transitions among several phases. The typical performance of our practical decoder is also well evaluated by the replica method.

1 Introduction

Data compression, or source coding, is a scheme to reduce the size of a message (data) in information representation. In his seminal paper [1], Shannon showed that for an information source represented by a distribution $P(S)$ of an $N$-dimensional Boolean (binary) vector $S$, one can employ another representation in which the message length $N$ is reduced to $M$ ($\le N$) without any distortion, if the code rate $R = M/N$ satisfies $R \ge H_2(S)$ in the limit $N, M \to \infty$. Here, $H_2(S) = -(1/N)\,\mathrm{Tr}_S\, P(S)\log_2 P(S)$ represents the binary entropy per bit of the original representation $S$, indicating the optimal compression rate.

Unfortunately, Shannon's theorem itself is non-constructive and does not provide explicit rules for devising optimal codes. It is therefore surprising that a practical code proposed by Ziv and Lempel in 1977 [2] asymptotically saturates Shannon's optimal compression limit in the case of point-to-point communication, when lossless compression is considered. However, it should be emphasized that generalizing the Lempel-Ziv codes to advanced data compression suitable for a network is difficult, although the importance of networks is rapidly increasing with the development of the Internet. This is because all the practical codes that saturate Shannon's limit to date require complete knowledge of all source vectors coming into the communication network, whereas in usual situations the compression must be carried out independently at each terminal. Therefore, the quest for more efficient compression codes that are suitable for a network remains one of the most important topics in information theory [3].
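For concreteness, the Shannon limit quoted above is easy to evaluate for a memoryless source. The following minimal Python sketch (an illustration under our own assumptions; the helper name `binary_entropy` is hypothetical) computes the entropy per bit of a biased binary source and compares it with a candidate code rate.

```python
import numpy as np

def binary_entropy(p: float) -> float:
    """Entropy in bits per symbol of a Bernoulli(p) source."""
    if p in (0.0, 1.0):
        return 0.0
    return float(-p * np.log2(p) - (1 - p) * np.log2(1 - p))

# For a memoryless source, H2(S) reduces to the per-bit entropy H2(p).
# A source emitting 1 with probability 0.1 is compressible to about
# 0.47 bits per input bit, so any rate R >= H2(p) suffices.
p, R = 0.1, 0.5
print(f"H2(S) = {binary_entropy(p):.3f} bits/bit; "
      f"R = {R} is {'achievable' if R >= binary_entropy(p) else 'not achievable'}")
```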

The purpose of this chapter is to apply recent advances in the research on error-correcting codes to this problem. More specifically, we investigate the efficacy and the limitations of a linear compression scheme inspired by Gallager's error-correcting codes [4], which have been actively investigated in both the information theory and physics communities [5, 6], when it is applied to the data compression problem introduced by Slepian and Wolf in their research on network-based information theory [7]. Unlike the existing arguments in information theory, our approach based on statistical mechanics makes it possible not only to assess the theoretical bounds on the achievable performance but also to provide practical encoding/decoding methods that run in linear time with respect to the data length.

2 General Scenario

Let us start by setting up the framework of the Slepian-Wolf coding-decoding problem [7]. In the general scenario, two correlated $N$-dimensional Boolean vectors $\xi$ and $\eta$ are independently compressed to an $M_1$-dimensional vector $u$ and an $M_2$-dimensional vector $v$, respectively. These compressed data (or codewords) $u$ and $v$ are decoded simultaneously by a single decoder to retrieve the original data. A schematic representation of this system is shown in Figure 1.

[Figure 1: The Slepian-Wolf system: a simple communication network introduced in the data compression theorem of Slepian and Wolf. Two encoders compress $\xi$ and $\eta$ separately at rates $R_1$ and $R_2$ into $u$ and $v$; a single decoder reconstructs the pair. Separate coding is assumed in the distributed system.]

The codes used in this chapter are composed of randomly selected sparse matrices $A$ and $B$ of dimensionality $M_1 \times N$ and $M_2 \times N$, respectively. These are constructed similarly to those of Gallager's error-correcting codes [4], being characterized by $K_1$ and $K_2$ nonzero unit elements per row and $C_1$ and $C_2$ nonzero unit elements per column, respectively. The compression rates can differ between the two terminals.

Corresponding to the matrices $A$ and $B$, the rates are defined as $R_1 = M_1/N = C_1/K_1$ and $R_2 = M_2/N = C_2/K_2$, respectively. While both matrices are known to the decoder, each encoder needs to know only its own matrix; that is, encoding is carried out separately in this scheme as $u = A\xi$ and $v = B\eta$, where mod-2 (Boolean) arithmetic is applied to the Boolean vectors. After receiving the codewords $u$ and $v$, the pair of equations

$$u = AS, \qquad v = B\tau \qquad (1)$$

should be solved with respect to $S$ and $\tau$, which become the estimates of the original data $\xi$ and $\eta$, respectively.
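To make the separate encoding step concrete, here is a minimal sketch (the helper `gallager_matrix` is a hypothetical name; it loosely follows Gallager's permutation construction for a regular ensemble) that builds one sparse matrix and computes a codeword in mod-2 arithmetic.

```python
import numpy as np

rng = np.random.default_rng(0)

def gallager_matrix(N: int, K: int, C: int) -> np.ndarray:
    """Sparse Boolean matrix with K ones per row and C ones per column,
    built from C randomly column-permuted copies of a base block
    (Gallager's construction); requires N divisible by K."""
    assert N % K == 0
    base = np.zeros((N // K, N), dtype=np.uint8)
    for r in range(N // K):
        base[r, r * K:(r + 1) * K] = 1       # partition columns into groups of K
    return np.vstack([base[:, rng.permutation(N)] for _ in range(C)])

# Rate R = M/N = C/K; here R = 3/6 = 1/2, matching the example of Section 5.
N, K, C = 24, 6, 3
A = gallager_matrix(N, K, C)                 # shape (N*C//K, N) = (12, 24)
xi = rng.integers(0, 2, size=N, dtype=np.uint8)
u = (A @ xi) % 2                             # codeword u = A xi over GF(2)
print(A.shape, u)
```

The second encoder would apply the same recipe with its own matrix $B$ to produce $v = B\eta$; only the joint decoder needs to know both matrices.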

3 Statistical Mechanical Analysis

To facilitate the current investigation, we first map the problem to that of an Ising model with finite connectivity [8]. We employ the binary $(+1, -1)$ representation of the dynamical variables $S$ and $\tau$ and of the vectors $u$ and $v$, rather than the Boolean $(0, 1)$ one; the vector $u$ is generated by taking products of the relevant binary data bits, $u_{i_1, i_2, \dots, i_{K_1}} = \xi_{i_1} \xi_{i_2} \cdots \xi_{i_{K_1}}$, where the indices $i_1, i_2, \dots, i_{K_1}$ correspond to the nonzero elements of $A$, producing a binary version of $u$, and similarly for $v$. Assuming the thermodynamic limit $N, M_1, M_2 \to \infty$ while keeping the code rates $R_1 = M_1/N$ and $R_2 = M_2/N$ finite is quite natural, as communication today generally involves large data sets, for which finite-size corrections are likely to be negligible. To explore the system's capabilities we examine the partition function

$$Z = \mathrm{Tr}_{S,\tau}\, P(S, \tau) \prod_{\langle i_1, \dots, i_{K_1} \rangle} \left\{ 1 + \frac{A_{i_1, \dots, i_{K_1}}}{2}\left( u_{i_1, \dots, i_{K_1}} S_{i_1} S_{i_2} \cdots S_{i_{K_1}} - 1 \right) \right\} \prod_{\langle i_1, \dots, i_{K_2} \rangle} \left\{ 1 + \frac{B_{i_1, \dots, i_{K_2}}}{2}\left( v_{i_1, \dots, i_{K_2}} \tau_{i_1} \tau_{i_2} \cdots \tau_{i_{K_2}} - 1 \right) \right\}. \qquad (2)$$

Here the tensor product $A_{i_1,\dots,i_{K_1}} u_{i_1,\dots,i_{K_1}}$, where $u_{i_1,\dots,i_{K_1}} = \xi_{i_1}\xi_{i_2}\cdots\xi_{i_{K_1}}$, is the binary equivalent of $A\xi$. Elements of the sparse connectivity tensor $A_{i_1,\dots,i_{K_1}}$ take the value 1 if the corresponding indices of data are chosen (i.e., if all corresponding entries of the matrix $A$ are 1) and 0 otherwise; the tensor has $C_1$ unit elements per index $i$, representing the system's degree of connectivity. Notice that if the product $S_{i_1} S_{i_2} \cdots S_{i_{K_1}}$ disagrees with the corresponding element $u_{i_1,\dots,i_{K_1}}$, which implies a parity-check error, the partition function $Z$ vanishes. Similar arguments hold for $B_{i_1,\dots,i_{K_2}}$ and $v_{i_1,\dots,i_{K_2}}$. The probability $P(S, \tau)$ represents our prior knowledge of the data, including the correlation between the sources $\xi$ and $\eta$. Note that the dynamical variables $\tau$, introduced to estimate $\eta$, are irrelevant to the performance measure with respect to the other data $\xi$.

Since the partition function (2) is invariant under the gauge transformations

$$S_i \to S_i \xi_i, \qquad u_{i_1,\dots,i_{K_1}} \to u_{i_1,\dots,i_{K_1}}\, \xi_{i_1}\xi_{i_2}\cdots\xi_{i_{K_1}} = 1,$$
$$\tau_i \to \tau_i \eta_i, \qquad v_{i_1,\dots,i_{K_2}} \to v_{i_1,\dots,i_{K_2}}\, \eta_{i_1}\eta_{i_2}\cdots\eta_{i_{K_2}} = 1, \qquad (3)$$

it is useful to decouple the correlations between the vectors $S$, $\tau$ and $\xi$, $\eta$. Rewriting Eq. (2) in this gauge, one obtains a similar expression apart from the first factor, which becomes $P(S\xi, \tau\eta)$, where $S\xi = (S_i \xi_i)$ and $\tau\eta = (\tau_i \eta_i)$ for $i = 1, 2, \dots, N$.

The random selection of elements in $A$ and $B$ introduces disorder into the system; we average the logarithm of the partition function $Z(A, B, u, v)$ over the disorder and the statistical properties of both data sets, using the replica method [9]. In the calculation, a set of order parameters

$$q_{\alpha,\beta,\dots,\gamma} = \frac{1}{N} \sum_{i=1}^{N} Z_i\, S_i^{\alpha} S_i^{\beta} \cdots S_i^{\gamma}, \qquad (4)$$
$$r_{\alpha,\beta,\dots,\gamma} = \frac{1}{N} \sum_{i=1}^{N} Y_i\, \tau_i^{\alpha} \tau_i^{\beta} \cdots \tau_i^{\gamma} \qquad (5)$$

arises, where $\alpha, \beta, \dots, \gamma$ represent replica indices, and the variables $Z_i$ and $Y_i$ come from enforcing the restriction of $C_1$ and $C_2$ connections per index, respectively:

$$\delta\!\left(\sum_{i_2,\dots,i_{K_1}} A_{i,i_2,\dots,i_{K_1}} - C_1\right) = \oint \frac{dZ}{2\pi}\, Z^{\sum_{i_2,\dots,i_{K_1}} A_{i,i_2,\dots,i_{K_1}} - (C_1+1)},$$
$$\delta\!\left(\sum_{i_2,\dots,i_{K_2}} B_{i,i_2,\dots,i_{K_2}} - C_2\right) = \oint \frac{dY}{2\pi}\, Y^{\sum_{i_2,\dots,i_{K_2}} B_{i,i_2,\dots,i_{K_2}} - (C_2+1)}. \qquad (6)$$
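Before proceeding, the indicator structure of Eq. (2) can be checked on a toy instance. The sketch below is a simplified single-source version with a uniform prior (our own example values); it enumerates all configurations and confirms that $Z$ equals the prior mass of the configurations satisfying every parity check.

```python
import itertools
import numpy as np

# Toy single-source version of Eq. (2), uniform prior P(S) = 2**(-N).
A = np.array([[1, 1, 1, 0],                  # two checks with K = 3 ones each
              [0, 1, 1, 1]], dtype=int)
xi = np.array([1, -1, 1, -1])                # true data in the (+1, -1) picture
u = np.array([np.prod(xi[row == 1]) for row in A])  # binary codeword bits

Z = 0.0
for S in itertools.product((-1, 1), repeat=4):
    S = np.array(S)
    # each factor is 1 when its check is satisfied (or absent), 0 when violated
    factors = [1 + 0.5 * (u[m] * np.prod(S[A[m] == 1]) - 1) for m in range(len(A))]
    Z += 0.5 ** 4 * np.prod(factors)

print(Z)   # 0.25: four of the sixteen configurations satisfy both checks
```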

To proceed further, we must make an assumption about the symmetry of the order parameters. The assumption made here, and validated later on, is that of replica symmetry, expressed in the following representation of the order parameters and the related conjugate variables [6]:

$$q_{\alpha,\beta,\dots,\gamma} = a_q \int dx\, \pi(x)\, x^l, \qquad \hat{q}_{\alpha,\beta,\dots,\gamma} = a_{\hat{q}} \int d\hat{x}\, \hat{\pi}(\hat{x})\, \hat{x}^l,$$
$$r_{\alpha,\beta,\dots,\gamma} = a_r \int dy\, \rho(y)\, y^l, \qquad \hat{r}_{\alpha,\beta,\dots,\gamma} = a_{\hat{r}} \int d\hat{y}\, \hat{\rho}(\hat{y})\, \hat{y}^l, \qquad (7)$$

where $l$ is the number of replica indices and the $a$'s are normalization factors that make $\pi(x)$, $\hat{\pi}(\hat{x})$, $\rho(y)$ and $\hat{\rho}(\hat{y})$ probability distributions. Unspecified integrals are carried out over the range $[-1, +1]$. Extremizing the averaged expression with respect to the probability distributions, we obtain the following free energy per spin:

$$F = -\frac{1}{N}\langle \ln Z \rangle_{A,B,P} = \mathop{\mathrm{Extr}}_{\pi,\hat{\pi},\rho,\hat{\rho}} \Bigg\{ -\frac{C_1}{K_1}\Big\langle \ln \frac{1+\prod_{i=1}^{K_1} x_i}{2} \Big\rangle_{\pi} - \frac{C_2}{K_2}\Big\langle \ln \frac{1+\prod_{i=1}^{K_2} y_i}{2} \Big\rangle_{\rho} + C_1 \Big\langle \ln \frac{1+x\hat{x}}{2} \Big\rangle_{\pi,\hat{\pi}} + C_2 \Big\langle \ln \frac{1+y\hat{y}}{2} \Big\rangle_{\rho,\hat{\rho}} - \frac{1}{N}\Big\langle \ln \mathrm{Tr}_{S,\tau} \prod_{i=1}^{N}\Big[ \prod_{\mu=1}^{C_1} \frac{1+\hat{x}_{\mu i} S_i}{2} \prod_{\mu=1}^{C_2} \frac{1+\hat{y}_{\mu i} \tau_i}{2} \Big] P(S\xi, \tau\eta) \Big\rangle_{\hat{\pi},\hat{\rho},P} \Bigg\}, \qquad (8)$$

where the brackets with subscripts $\pi$ and $\hat{\pi}$ denote averages over the probability distributions $\pi(x)$ and $\hat{\pi}(\hat{x})$ with respect to the variables denoted by $x$ and $\hat{x}$ (with and without subscripts), respectively. Similar notation is used for $\rho$ and $\hat{\rho}$. The bracket with subscript $P$ denotes the average with respect to $\xi$ and $\eta$ following the data distribution $P(\xi, \eta)$.

Taking the functional derivative with respect to the distributions $\pi$, $\hat{\pi}$, $\rho$ and $\hat{\rho}$, we obtain the following saddle-point equations:

$$\pi(x) = \frac{1}{N}\sum_{i=1}^{N}\Big\langle \delta\Big( x - \tanh\Big( F_i(\hat{x}, \hat{y}; \xi, \eta)\,\xi_i + \sum_{\mu=1}^{C_1-1}\tanh^{-1}(\hat{x}_{\mu i}) \Big) \Big) \Big\rangle_{\hat{\pi},\hat{\rho},P}, \qquad \hat{\pi}(\hat{x}) = \Big\langle \delta\Big( \hat{x} - \prod_{i=1}^{K_1-1} x_i \Big) \Big\rangle_{\pi}, \qquad (9)$$

where the effective fields denoted by $F_i$ are implicitly defined by

$$\frac{e^{F_i(\hat{x},\hat{y};\xi,\eta)\,\xi_i S_i}}{2\cosh F_i(\hat{x},\hat{y};\xi,\eta)} = \frac{\mathrm{Tr}_{S/S_i,\,\tau}\ \prod_{j\neq i}\prod_{\mu=1}^{C_1}\frac{1+\hat{x}_{\mu j}S_j}{2}\ \prod_{j=1}^{N}\prod_{\mu=1}^{C_2}\frac{1+\hat{y}_{\mu j}\tau_j}{2}\ P(S\xi,\tau\eta)}{\mathrm{Tr}_{S,\tau}\ \prod_{j\neq i}\prod_{\mu=1}^{C_1}\frac{1+\hat{x}_{\mu j}S_j}{2}\ \prod_{j=1}^{N}\prod_{\mu=1}^{C_2}\frac{1+\hat{y}_{\mu j}\tau_j}{2}\ P(S\xi,\tau\eta)}, \qquad (10)$$

and similarly for $\rho(y)$ and $\hat{\rho}(\hat{y})$. Here the notation $S/S_i$ represents the set of all dynamical variables $S$ except $S_i$, while $L_1(\mu)$ and $L_2(\mu)$ denote the sets of all indices of nonzero components in the $\mu$th rows of $A$ and $B$, respectively; $L_1(\mu)/i$ represents the set of all indices belonging to $L_1(\mu)$ except $i$, and likewise for the others. After solving these equations, the expectation of the overlap can be evaluated as

$$m_1 = \frac{1}{N}\sum_{i=1}^{N}\big\langle \xi_i\, \mathrm{sign}\langle S_i \rangle \big\rangle_{A,P} = \int dz\, \phi(z)\, \mathrm{sign}(z), \qquad (11)$$

where $\langle \cdot \rangle$ denotes a thermal average and

$$\phi(z) = \frac{1}{N}\sum_{i=1}^{N}\Big\langle \delta\Big( z - \tanh\Big( F_i(\hat{x}, \hat{y}; \xi, \eta)\,\xi_i + \sum_{\mu=1}^{C_1}\tanh^{-1}\hat{x}_{\mu i} \Big)\Big)\Big\rangle_{\hat{\pi},\hat{\rho},P}, \qquad (12)$$

and similarly for the overlap $m_2$ between $\eta$ and its estimator.
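Saddle-point equations of the type (9) are typically solved by population dynamics, in which each distribution is represented by a large pool of field values ("bins") that are resampled against one another until they converge. The sketch below is a simplified, hypothetical single-source analogue in log-likelihood units, decoupled from the second source, rather than the full four-distribution system (9); it illustrates the kind of computation behind the numerical phase boundaries reported in Section 5.

```python
import numpy as np

rng = np.random.default_rng(1)

# Population dynamics for a (K, C)-regular construction at rate R = C/K.
# Here p plays the role of the bias of the gauged local fields; perfect
# retrieval (the ferromagnetic solution) corresponds to an overlap m -> 1.
K, C, p = 6, 3, 0.06
POP, SWEEPS = 100_000, 50
h0 = np.log((1 - p) / p)                     # prior field in LLR units

L = np.zeros(POP)                            # pool representing pi(x)
for _ in range(SWEEPS):
    # pi-hat pool: each check combines K - 1 randomly drawn cavity fields
    t = np.prod(np.tanh(L[rng.integers(0, POP, (POP, K - 1))] / 2), axis=1)
    Lhat = 2 * np.arctanh(np.clip(t, -1 + 1e-12, 1 - 1e-12))
    # pi pool: gauged prior field plus C - 1 incoming check messages
    z = np.where(rng.random(POP) < p, -1.0, 1.0)
    L = z * h0 + Lhat[rng.integers(0, POP, (POP, C - 1))].sum(axis=1)

# overlap as in Eq. (11): the full local field uses all C check messages
z = np.where(rng.random(POP) < p, -1.0, 1.0)
H = z * h0 + Lhat[rng.integers(0, POP, (POP, C))].sum(axis=1)
print(f"overlap m ~ {np.mean(np.sign(H)):.3f}")   # close to 1 in the F phase
```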

4 Structure of Solutions

The performance of the current compression method can be measured by the vector $m = (m_1, m_2)$. Hereafter, we use the term ferromagnetic to specify perfect retrieval, that is, $m_1 = 1$ (or $m_2 = 1$), while the term paramagnetic implies distortion, that is, $m_1 < 1$ (or $m_2 < 1$). For instance, a term such as ferromagnetic-paramagnetic phase denotes the phase characterized by the performance vector $m \in \{(m_1, m_2) \mid m_1 = 1,\ m_2 < 1\}$, and so on.

One can show that the ferromagnetic-ferromagnetic state (FF): $\pi(x) = \delta(x-1)$, $\hat{\pi}(\hat{x}) = \delta(\hat{x}-1)$, $\rho(y) = \delta(y-1)$ and $\hat{\rho}(\hat{y}) = \delta(\hat{y}-1)$ always satisfies Eq. (9). In addition, in the limit $C_1, C_2 \to \infty$, three further solutions are obtained analytically for an arbitrary joint distribution $P(\xi, \eta)$: the paramagnetic-paramagnetic state (PP): $\pi(x) = \delta(x)$, $\hat{\pi}(\hat{x}) = \delta(\hat{x})$, $\rho(y) = \delta(y)$ and $\hat{\rho}(\hat{y}) = \delta(\hat{y})$; the paramagnetic-ferromagnetic state (PF): $\pi(x) = \delta(x)$, $\hat{\pi}(\hat{x}) = \delta(\hat{x})$, $\rho(y) = \delta(y-1)$ and $\hat{\rho}(\hat{y}) = \delta(\hat{y}-1)$; and the ferromagnetic-paramagnetic state (FP): $\pi(x) = \delta(x-1)$, $\hat{\pi}(\hat{x}) = \delta(\hat{x}-1)$, $\rho(y) = \delta(y)$ and $\hat{\rho}(\hat{y}) = \delta(\hat{y})$. The free energies corresponding to these solutions follow from Eq. (8) as

$$F_{\mathrm{FF}} = -\frac{1}{N}\,\mathrm{Tr}_{\xi,\eta}\, P(\xi,\eta)\ln P(\xi,\eta), \qquad F_{\mathrm{PP}} = (R_1 + R_2)\ln 2,$$
$$F_{\mathrm{FP}} = R_2 \ln 2 - \frac{1}{N}\,\mathrm{Tr}_{\xi}\, P(\xi)\ln P(\xi), \qquad F_{\mathrm{PF}} = R_1 \ln 2 - \frac{1}{N}\,\mathrm{Tr}_{\eta}\, P(\eta)\ln P(\eta), \qquad (13)$$

where the subscripts stand for the corresponding states and $P(\xi) = \mathrm{Tr}_{\eta} P(\xi,\eta)$ and $P(\eta) = \mathrm{Tr}_{\xi} P(\xi,\eta)$ are the marginal distributions of the two source vectors.

4.1 Case of Dense Matrices

Perfect decoding is theoretically possible if $F_{\mathrm{FF}}$ is the lowest of the above four free energies. The corresponding parameter regime, termed the achievable rate region, is shown in Fig. 2 as the intersection of the inequalities

$$R_1 + R_2 \ge H_2(\xi, \eta), \qquad R_1 \ge H_2(\xi \mid \eta), \qquad R_2 \ge H_2(\eta \mid \xi), \qquad (14)$$

where

$$H_2(\xi,\eta) = -\frac{1}{N}\,\mathrm{Tr}_{\xi,\eta}\, P(\xi,\eta)\log_2 P(\xi,\eta), \qquad H_2(\xi \mid \eta) = H_2(\xi,\eta) - H_2(\eta), \qquad H_2(\eta \mid \xi) = H_2(\xi,\eta) - H_2(\xi). \qquad (15)$$

[Figure 2: Achievable rate region: code rates are classified into four categories according to whether each of the two compressed data sets is decodable. In the $(R_1, R_2)$ plane the phases FF, FP, PF and PP meet along the lines $R_1 = H_2(\xi \mid \eta)$, $R_2 = H_2(\eta \mid \xi)$ and $R_1 + R_2 = H_2(\xi,\eta)$; the regime where both data sets are decodable without any distortion is termed the achievable rate region.]

It is worth noticing that this coincides with the achievable rate region of the optimal data compression in the current framework, previously established by Slepian and Wolf [7]. Namely, in the limit $C_1, C_2 \to \infty$, the current compression codes provide the optimal performance for arbitrary information sources $P(\xi, \eta)$.
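For an i.i.d. pairwise source, the entropies in (14)-(15) are elementary to evaluate, so membership in the achievable rate region can be tested directly. The sketch below (an added example with an arbitrary joint table; `achievable` is our own helper) does exactly that.

```python
import numpy as np

def H_bits(P: np.ndarray) -> float:
    """Shannon entropy in bits of a probability array."""
    P = P[P > 0]
    return float(-(P * np.log2(P)).sum())

# Joint pmf of a single source pair (xi_i, eta_i) over {+1, -1}^2;
# rows index xi, columns index eta.  Example values only.
P = np.array([[0.40, 0.10],
              [0.10, 0.40]])

H_joint = H_bits(P.ravel())                       # H2(xi, eta) per component
H_xi, H_eta = H_bits(P.sum(axis=1)), H_bits(P.sum(axis=0))
H_xi_g_eta, H_eta_g_xi = H_joint - H_eta, H_joint - H_xi

def achievable(R1: float, R2: float) -> bool:
    """Slepian-Wolf region test, Eq. (14)."""
    return R1 + R2 >= H_joint and R1 >= H_xi_g_eta and R2 >= H_eta_g_xi

print(round(H_joint, 3), round(H_xi_g_eta, 3))    # 1.722 0.722
print(achievable(0.9, 0.9), achievable(0.5, 0.9)) # True False
```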

4.2 Case of Sparse Matrices

For finite $C_1$ and $C_2$, the saddle-point equations (9) can be solved numerically; however, the properties of the system depend strongly on the source distribution $P(\xi, \eta)$, which makes it difficult to proceed without an assumption on the distribution. As a simple but non-trivial example, we focus here on a component-wise correlated joint distribution

$$P(S, \tau) = \prod_{i=1}^{N} \frac{1 + m_1 S_i + m_2 \tau_i + q S_i \tau_i}{4}, \qquad (16)$$

where the parameters $m_1$, $m_2$ and $q$ characterize the data sources. For Eq. (16) to be a distribution, these parameters must satisfy the four inequalities $1 + m_1 + m_2 + q \ge 0$, $1 - m_1 + m_2 - q \ge 0$, $1 + m_1 - m_2 - q \ge 0$ and $1 - m_1 - m_2 + q \ge 0$.

5 Decoding

Solving Eq. (1) rigorously for decoding is computationally hard in general. However, one can construct a practical decoding algorithm based on belief propagation (BP) [10, 5] or the Thouless-Anderson-Palmer (TAP) approach [11]. It has recently been shown that these two frameworks provide the same algorithm in the case of error-correcting codes [12], as mentioned in the previous chapter; this is also the case in the current context. For the distribution (16), the algorithm derived from the BP-based framework becomes

$$m^1_{\mu i} = \frac{a_{\mu i} + m_1 + m_2\, a_{\mu i} b_i + q\, b_i}{1 + m_1 a_{\mu i} + m_2 b_i + q\, a_{\mu i} b_i}, \qquad m^2_{\mu i} = \frac{b_{\mu i} + m_2 + m_1\, a_i b_{\mu i} + q\, a_i}{1 + m_1 a_i + m_2 b_{\mu i} + q\, a_i b_{\mu i}},$$
$$\hat{m}^1_{\mu i} = u_{\mu} \prod_{j \in L_1(\mu)/i} m^1_{\mu j}, \qquad \hat{m}^2_{\mu i} = v_{\mu} \prod_{j \in L_2(\mu)/i} m^2_{\mu j}, \qquad (17)$$

where we denote

$$a_{\mu i} \equiv \tanh\Big( \sum_{\nu \in M_1(i)/\mu} \tanh^{-1} \hat{m}^1_{\nu i} \Big), \qquad a_i \equiv \tanh\Big( \sum_{\mu \in M_1(i)} \tanh^{-1} \hat{m}^1_{\mu i} \Big), \qquad (18)$$

and similarly for the $b$'s. Here, $M_1(i)$ and $M_2(i)$ denote the sets of indices of nonzero components in the $i$th columns of the sparse matrices $A$ and $B$, respectively. Equation (17) can be solved iteratively from appropriate initial conditions.
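As a concrete rendering of the message-passing recursion, the sketch below implements the decoupled special case $q = 0$, $m_2 = 0$ of Eq. (17), in which the two codes separate and each source is decoded on its own; the name `bp_decode` and the clipping constant are our own choices, and the full coupled recursion would additionally exchange the $a_i$, $b_i$ terms between the two codes as written in (17).

```python
import numpy as np

def _atanh(x):
    return np.arctanh(np.clip(x, -0.999999, 0.999999))  # numerical guard

def bp_decode(A, u_pm, m1, iters=100):
    """BP decoding of u = A xi for a +/-1 source of mean m1:
    Eq. (17) in the decoupled special case q = 0, m2 = 0.
    A: 0/1 matrix (M x N); u_pm: +/-1 codeword bits."""
    M, N = A.shape
    rows = [np.flatnonzero(A[mu]) for mu in range(M)]      # L1(mu)
    cols = [np.flatnonzero(A[:, i]) for i in range(N)]     # M1(i)
    mhat = {(mu, i): 0.0 for mu in range(M) for i in rows[mu]}
    h0 = _atanh(m1)                                        # prior bias field
    for _ in range(iters):
        # variable-to-check messages m1_{mu i}: prior plus the other checks
        mvc = {(mu, i): np.tanh(h0 + sum(_atanh(mhat[nu, i])
                                         for nu in cols[i] if nu != mu))
               for (mu, i) in mhat}
        # check-to-variable messages: u_mu times the product of the others
        mhat = {(mu, i): u_pm[mu] * np.prod([mvc[mu, j]
                                             for j in rows[mu] if j != i])
                for (mu, i) in mhat}
    # posterior means as in Eq. (19) with q = 0, then sign estimates
    H = [h0 + sum(_atanh(mhat[mu, i]) for mu in cols[i]) for i in range(N)]
    return np.sign(np.tanh(H))
```

Fed a matrix from the encoder sketch of Section 2 with, say, $m_1 = 0.9$ (so that the single-source entropy lies below $R_1 = 1/2$), the sign estimates typically reproduce $\xi$ exactly; pushing the bias toward the phase boundary reproduces the degradation discussed below.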

After a solution is obtained, approximate posterior means can be calculated for $i = 1, 2, \dots, N$ as

$$m^1_i = \langle S_i \rangle = \frac{a_i + m_1 + m_2\, a_i b_i + q\, b_i}{1 + m_1 a_i + m_2 b_i + q\, a_i b_i}, \qquad m^2_i = \langle \tau_i \rangle = \frac{b_i + m_2 + m_1\, a_i b_i + q\, a_i}{1 + m_1 a_i + m_2 b_i + q\, a_i b_i}, \qquad (19)$$

which yield approximations to the Bayes-optimal estimators as $\hat{\xi}_i = \mathrm{sign}(m^1_i)$ and $\hat{\eta}_i = \mathrm{sign}(m^2_i)$, respectively.

In order to investigate the efficacy of the current method for finite $C_1$ and $C_2$, we have numerically solved Eqs. (9) and (17) for $K_1 = K_2 = 6$ and $C_1 = C_2 = 3$ ($R_1 = R_2 = 1/2$); the results are summarized in Fig. 3. Numerical results for the saddle-point equation (9) were obtained by an iterative method using $10^4$ to $10^5$ bins to model each probability distribution; $10$ to $10^2$ updates were sufficient for convergence in most cases. As in the case $C_1, C_2 \to \infty$, there can be four types of solutions, corresponding to the combinations of decoding success and failure on the two sources. The obtained phase diagram is quite similar to that for $C_1, C_2 \to \infty$. This implies that the current compression code theoretically has a performance close to the optimum saturated in the limit $C_1, C_2 \to \infty$, even though the choice $C_1 = C_2 = 3$ is far from that limit. However, this does not directly mean that the suggested performance can be obtained in practice. Since the variables are updated locally in the BP-based decoding algorithm (17), it may become difficult to find the thermodynamically dominant state when suboptimal states with large basins of attraction appear. This suggests that the practical limit of perfect decoding is determined by the spinodal points of the suboptimal states, as in the case of channel coding [6]. To confirm this conjecture, we have numerically compared the practical limit of perfect decoding obtained by the BP-based decoding algorithm (17) with the spinodal points of the non-FF solutions. The two results exhibit an excellent consistency, supporting our conjecture.

In Fig. 3, the perfectly decodable region obtained by the BP-based algorithm for the case $m_1 = 0.7$ is indicated as the area bounded by the spinodal points and by the feasibility boundaries $1 + 0.7 - m_2 - q = 0$ and $1 + 0.7 + m_2 + q = 0$. This region looks narrow compared with the theoretical limit, which might give a negative impression of the practical utility of this code. Nevertheless, we consider that the current method may still be practically useful, because the amount of information that can be represented by parameters in the region is not as small as the area suggests. Moreover, for the shaded regions the retrieval cannot be achieved by the time-sharing scheme.

[Figure 3: Phase diagram for the $K_1 = K_2 = 6$, $C_1 = C_2 = 3$ code in the case of the component-wise correlated information source (16). The feasible region in the $m_2$-$q$ plane for $m_1 = 0.7$ is classified into three states. Phase boundaries obtained by numerical methods are indicated with error bars (FF/PP and FF/PF) and for PF/PP; these are close to those for $K_1 = K_2$, $C_1 = C_2 \to \infty$ (curves and the vertical line). Practical decoding limits of the BP-based algorithm, obtained for systems of size $N = 10^4$, are well predicted by the spinodal points of the non-FF solutions (with error bars). Inset: the practical limits represented by the sizes of transmitted information; the horizontal and vertical axes show the entropy of the second source $\tau$ and the joint entropy. The shaded regions indicate where retrieval cannot be achieved by the time-sharing scheme.]

6 Conclusion

Today almost all digital communications are network-based, even when only point-to-point communications are considered. We have therefore investigated the problem of multi-terminal data compression, a typical topic in network-based communication schemes. Furthermore, we selected the simplest model of distributed data compression, the Slepian-Wolf system, in order to reveal its theoretical aspects. The system was introduced in the data compression theorem of Slepian and Wolf in 1973, which corresponds to the source coding theorem given by Shannon. We have derived the achievable rate region given in the data compression theorem by making use of linear compression codes in the dense-matrix limit. Although this result is purely theoretical, in that infinite computational power is assumed for decoding, the rediscovery of the data compression theorem is a beautiful outcome. Moreover, we have generalized message-passing decoding to the multi-terminal case and found that it works well in practice: Figure 3 shows that our practical decoder outperforms the simple method based on the time-sharing scheme.

Acknowledgements

The author thanks Y. Kabashima and T. Ohira for valuable discussions. This research was partially supported by the Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Young Scientists (B), No. 1576088, 2003.

References

[1] C. E. Shannon. A mathematical theory of communication. Bell Syst. Tech. J., 27:379-423 & 623-656, 1948.

[2] J. Ziv and A. Lempel. A universal algorithm for sequential data compression. IEEE Trans. Inf. Theory, IT-23:337-343, 1977.

[3] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley, 1991.

[4] R. G. Gallager. Low-density parity-check codes. IRE Trans. Inf. Theory, IT-8:21-28, 1962.

[5] D. J. C. MacKay. Good error-correcting codes based on very sparse matrices. IEEE Trans. Inf. Theory, IT-45:399-431, 1999.

[6] T. Murayama, Y. Kabashima, D. Saad, and R. Vicente. Statistical physics of regular low-density parity-check error-correcting codes. Phys. Rev. E, 62:1577-1591, 2000.

[7] D. Slepian and J. K. Wolf. Noiseless coding of correlated information sources. IEEE Trans. Inf. Theory, IT-19:471-480, 1973.

[8] N. Sourlas. Spin-glass models as error-correcting codes. Nature, 339:693-695, 1989.

[9] K. Y. M. Wong and D. Sherrington. Graph bipartitioning and spin glasses on a random network of fixed finite valence. J. Phys. A, 20:L793-L799, 1987.

[10] J. Pearl. Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, 1988.

[11] D. J. Thouless, P. W. Anderson, and R. G. Palmer. Solvable model of a spin glass. Philos. Mag., 35:593-601, 1977.

[12] Y. Kabashima and D. Saad. Belief propagation vs. TAP for decoding corrupted messages. Europhys. Lett., 44:668-674, 1998.