Information Projection Algorithms and Belief Propagation
Slide 1: Information Projection Algorithms and Belief Propagation
Phil Regalia
Department of Electrical Engineering and Computer Science, Catholic University of America, Washington, DC
with J. M. Walsh, Drexel University
P. Dewilde Workshop, Wassenaar, June 2008
Slide 2: Outline
1. Introduction
2. Bregman Distance
3. Iterative Projection Algorithms
4. Belief Propagation
5. Conclusion
Slides 3-5: Belief Propagation vs. Iterative Convex Projection Algorithms
Belief propagation (Pearl, 1986) has met with success in:
- error correction decoding (low-density parity-check codes, turbo codes, ...);
- network diagnostics and link monitoring;
- sensor self-localization;
- distributed estimation in sensor networks;
- lossy source quantization;
- multi-user communications;
- et cetera.
Iterative convex projection algorithms have convergence proofs, unlike BP, for which proofs assume either:
- a tree or forest dependency graph; or
- arbitrarily large factor-graph girth.
Slides 6-8: Basic Query
Can iterative convex projection algorithms lend insight into belief propagation?
A priori yes, and the role of information projections and information geometry in BP is nothing new:
- Moher and Gulliver (Trans. IT-98) rephrased iterative decoding in terms of the information projections of Csiszár;
- Grant (ISIT-99) examined turbo decoding via information projections;
- Richardson (Trans. IT-03) viewed the nonlinear dynamics of iterative decoding through (tacitly) information geometry;
- Ikeda, Tanaka and Amari (Trans. IT-04) rephrased all of the above in terms of information geometry.
Key obstacle: extrinsic information extraction goes against projections onto invariant sets.
Slide 9: Bregman Distance
The graph of a convex function $f(\cdot)$ is lower bounded by any tangent hyperplane. This induces the gradient inequality
$$f(r) \ge f(q) + \langle \nabla f(q),\, r - q \rangle,$$
whose discrepancy in turn induces the Bregman distance
$$D_f(r, q) = f(r) - f(q) - \langle \nabla f(q),\, r - q \rangle \ge 0.$$
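As a concrete check, here is a minimal Python sketch (not from the talk) of the Bregman distance; with the negative Shannon entropy taken over the full coordinate vector, it reproduces the Kullback-Leibler divergence of the following slides.

```python
import numpy as np

def bregman_distance(f, grad_f, r, q):
    """D_f(r, q) = f(r) - f(q) - <grad f(q), r - q>."""
    return f(r) - f(q) - np.dot(grad_f(q), r - q)

# Negative Shannon entropy over the full coordinate vector; on the
# probability simplex its Bregman distance reduces to the KL divergence.
neg_entropy = lambda q: float(np.sum(q * np.log(q)))
grad_neg_entropy = lambda q: np.log(q) + 1.0

r = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])
print(bregman_distance(neg_entropy, grad_neg_entropy, r, q))
print(np.sum(r * np.log(r / q)))   # same value: KL(r || q)
```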
Slide 10: Common examples
Probability mass functions defined on $M$ outcomes: $q_i = \Pr(x = x_i)$, $i = 0, 1, 2, \ldots, M-1$. Introduce the convex domain
$$\mathcal{D} = \Big\{ q : q_1 \ge 0, \ldots, q_{M-1} \ge 0,\ q_1 + q_2 + \cdots + q_{M-1} \le 1 \Big\}.$$
Setting $q_0 = 1 - \sum_{i=1}^{M-1} q_i$, the negative Shannon entropy
$$f(q) = \Big(1 - \sum_{i=1}^{M-1} q_i\Big) \log\Big(1 - \sum_{i=1}^{M-1} q_i\Big) + \sum_{i=1}^{M-1} q_i \log q_i$$
is convex over $\mathcal{D}$.
Slides 11-12:
The induced Bregman distance becomes
$$D_f(r, q) = \sum_{i=0}^{M-1} r_i \log \frac{r_i}{q_i}$$
and is recognized as the Kullback-Leibler divergence. Closely related is the Fenchel conjugate function
$$f^{*}(\theta) = \sup_q \big( \langle q, \theta \rangle - f(q) \big) = \log \sum_{i=0}^{M-1} \exp(\theta_i), \qquad \text{with } \theta_i = \log \frac{q_i}{q_0}.$$
This is convex over a domain $\mathcal{D}^{*} \subset \mathbb{R}^M$ of vectors having first component zero (log probability ratios).
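A quick numerical sanity check (my addition): the supremum defining $f^{*}$ is attained at the softmax of $\theta$, and substituting that maximizer reproduces the log-sum-exp closed form. The sketch uses the simplex-constrained negative entropy, which agrees with the reduced parameterization above on its domain.

```python
import numpy as np

def neg_entropy(q):
    return float(np.sum(q * np.log(q)))

theta = np.array([0.0, 0.7, -1.2])            # first component zero, as on the slide
q_star = np.exp(theta) / np.exp(theta).sum()  # softmax attains the supremum
attained = float(np.dot(q_star, theta)) - neg_entropy(q_star)
closed_form = float(np.log(np.exp(theta).sum()))  # f*(theta) = log-sum-exp
print(attained, closed_form)                  # the two agree
```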
Slide 13:
The induced Bregman distance is now
$$D_{f^{*}}(\rho, \theta) = \log \frac{\sum_{i=0}^{M-1} \exp(\rho_i)}{\sum_{i=0}^{M-1} \exp(\theta_i)} - \sum_{i=0}^{M-1} \frac{\exp(\theta_i)}{\sum_{j=0}^{M-1} \exp(\theta_j)} (\rho_i - \theta_i).$$
By associating
$$q_i = \frac{\exp(\theta_i)}{\sum_{j=0}^{M-1} \exp(\theta_j)}, \qquad r_i = \frac{\exp(\rho_i)}{\sum_{j=0}^{M-1} \exp(\rho_j)},$$
this assumes the more familiar form
$$D_{f^{*}}(\rho, \theta) = \sum_{i=0}^{M-1} q_i \log \frac{q_i}{r_i}.$$
Slide 14:
Observe that
$$[\nabla f(q)]_i = \frac{\partial f(q)}{\partial q_i} = \log \frac{q_i}{q_0} = \theta_i, \qquad [\nabla f^{*}(\theta)]_i = \frac{\partial f^{*}(\theta)}{\partial \theta_i} = \frac{\exp(\theta_i)}{\sum_j \exp(\theta_j)} = q_i,$$
giving inverse maps, as $f(q)$ and $f^{*}(\theta)$ are Legendre transforms of each other. In general, for a convex $f$ and its conjugate $f^{*}$ in the Legendre class, the induced Bregman distances satisfy
$$D_f(r, q) = D_{f^{*}}\big( \nabla f(q), \nabla f(r) \big).$$
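Both relations are easy to verify numerically. The following sketch (illustrative, using the full-coordinate convention $\theta_0 = 0$) checks the inverse-map property and the identity $D_f(r,q) = D_{f^{*}}(\nabla f(q), \nabla f(r))$ for the entropy / log-sum-exp pair.

```python
import numpy as np

def grad_f(q):
    """[grad f(q)]_i = log(q_i / q_0) = theta_i (slide 14)."""
    return np.log(q / q[0])

def grad_f_star(theta):
    """[grad f*(theta)]_i = exp(theta_i) / sum_j exp(theta_j) = q_i (softmax)."""
    e = np.exp(theta)
    return e / e.sum()

def kl(r, q):
    return float(np.sum(r * np.log(r / q)))

def d_f_star(rho, theta):
    """Bregman distance induced by f*(theta) = log sum_i exp(theta_i)."""
    lse = lambda t: float(np.log(np.sum(np.exp(t))))
    return lse(rho) - lse(theta) - float(np.dot(grad_f_star(theta), rho - theta))

q = np.array([0.4, 0.4, 0.2])
r = np.array([0.5, 0.3, 0.2])
print(np.allclose(grad_f_star(grad_f(q)), q))                # inverse maps
print(np.isclose(kl(r, q), d_f_star(grad_f(q), grad_f(r))))  # duality identity
```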
Slide 15:
Consider finally the energy function
$$f(q) = \frac{1}{2} \sum_{i=0}^{M-1} q_i^2, \qquad q \in \mathbb{R}^M;$$
the Bregman distance becomes
$$D_f(r, q) = \frac{1}{2} \sum_{i=0}^{M-1} (r_i - q_i)^2.$$
As $\nabla f(q) = q$ and $f^{*} = f$, this gives a symmetric Bregman distance: $D_f(r, q) = D_f(q, r)$.
Slide 16: Bregman Projections
If $f(q)$ is a convex function over a domain $\mathcal{D}$, and $\mathcal{C}$ is a convex subset of $\mathcal{D}$, the Bregman projection of $q$ onto $\mathcal{C}$ is
$$\pi_{\mathcal{C}}(q) = \arg\min_{r \in \mathcal{C}} D_f(r, q),$$
characterized by the inequality
$$D_f(r, q) \ge D_f\big( r, \pi_{\mathcal{C}}(q) \big) + D_f\big( \pi_{\mathcal{C}}(q), q \big) \quad \text{for all } r \in \mathcal{C},$$
or, equivalently,
$$\big\langle \nabla f(q) - \nabla f\big( \pi_{\mathcal{C}}(q) \big),\, r - \pi_{\mathcal{C}}(q) \big\rangle \le 0 \quad \text{for all } r \in \mathcal{C}.$$
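For intuition, a small worked example of an information projection ($f$ = negative entropy), with a set and numbers of my own choosing: the KL projection of a pmf onto $\mathcal{C} = \{r : \sum_i i\, r_i = m\}$ has the exponential-tilt form $r_i \propto q_i e^{\lambda i}$, with $\lambda$ found by bisection.

```python
import numpy as np

def kl_project_to_mean(q, m, lam_lo=-50.0, lam_hi=50.0, iters=100):
    """I-projection of the pmf q onto C = {r : sum_i i * r_i = m}.
    The minimizer of KL(r || q) over C is the exponential tilt
    r_i ∝ q_i exp(lam * i); find lam by bisection on the tilted mean."""
    idx = np.arange(len(q))
    def tilted(lam):
        w = q * np.exp(lam * idx)
        return w / w.sum()
    for _ in range(iters):       # the tilted mean is monotone increasing in lam
        lam = 0.5 * (lam_lo + lam_hi)
        if np.dot(idx, tilted(lam)) < m:
            lam_lo = lam
        else:
            lam_hi = lam
    return tilted(lam)

q = np.array([0.25, 0.25, 0.25, 0.25])
r = kl_project_to_mean(q, m=2.0)   # move the mean from 1.5 to 2.0
print(r, float(np.dot(np.arange(4), r)))
```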
Slide 17: Dykstra's Cyclic Projection Algorithm
Seek the minimization
$$\pi_{\mathcal{C}}(q) = \arg\min_{r \in \mathcal{C}} D_f(r, q),$$
where $\mathcal{C}$ is the intersection of convex sets: $\mathcal{C} = \bigcap_{n=1}^{N} \mathcal{C}_n$.
First stab (a sometimes-convergent algorithm):
$$r_n = \pi_{\mathcal{C}_n}(r_{n-1}) = \pi_{\mathcal{C}_n}\Big( \nabla f^{*}\big( \nabla f(r_{n-1}) \big) \Big),$$
using $\mathcal{C}_{n+N} = \mathcal{C}_n$ and initialization $r_0 = q$.
Slide 18: Dykstra's Cyclic Projection Algorithm
Seek the same minimization $\pi_{\mathcal{C}}(q) = \arg\min_{r \in \mathcal{C}} D_f(r, q)$, with $\mathcal{C} = \bigcap_{n=1}^{N} \mathcal{C}_n$.
Improved algorithm, $r_n \to \pi_{\mathcal{C}}(q)$:
$$r_n = \pi_{\mathcal{C}_n}\Big( \nabla f^{*}\big( \nabla f(r_{n-1}) + s_{n-N} \big) \Big),$$
$$s_n = \nabla f(r_{n-1}) + s_{n-N} - \nabla f(r_n),$$
using $\mathcal{C}_{n+N} = \mathcal{C}_n$ and initialization $r_0 = q$, $s_{-(N-1)} = \cdots = s_{-1} = s_0 = 0$.
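In the Euclidean case ($\nabla f$ the identity) this reduces to Dykstra's classical algorithm. A minimal sketch, with two example sets of my own choosing (the unit ball and a halfspace):

```python
import numpy as np

def dykstra(q, projectors, sweeps=200):
    """Euclidean Dykstra: projects q onto the intersection of convex sets,
    given one Euclidean projector per set. With grad f the identity, the
    recursion above reads r_n = P_n(r_{n-1} + s_{n-N}),
    s_n = r_{n-1} + s_{n-N} - r_n, i.e., one stored correction per set."""
    r = q.copy()
    s = [np.zeros_like(q) for _ in projectors]
    for _ in range(sweeps):
        for n, proj in enumerate(projectors):
            y = r + s[n]          # re-add this set's last correction
            r = proj(y)
            s[n] = y - r          # store the new correction
    return r

# Example sets (my own choice): the unit ball and the halfspace {x : x0 + x1 >= 1.2}.
proj_ball = lambda x: x / max(1.0, float(np.linalg.norm(x)))
a, b = np.array([1.0, 1.0]), 1.2
proj_half = lambda x: x if a @ x >= b else x + ((b - a @ x) / (a @ a)) * a

print(dykstra(np.array([2.0, -0.5]), [proj_ball, proj_half]))
```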
Slide 19: Minimum Distance Algorithm
Find the closest members of two disjoint convex sets:
$$D_f(r^{\star}, q^{\star}) = \inf_{r \in \mathcal{C}_1,\ q \in \mathcal{C}_2} D_f(r, q), \qquad \mathcal{C}_1 \cap \mathcal{C}_2 = \emptyset.$$
The cyclic projection algorithm becomes
$$r_n = \pi_{\mathcal{C}_1}\Big( \nabla f^{*}\big( \nabla f(q_{n-1}) + v_{n-1} \big) \Big), \qquad v_n = \nabla f(q_{n-1}) + v_{n-1} - \nabla f(r_n),$$
$$q_n = \pi_{\mathcal{C}_2}\Big( \nabla f\big( \nabla f^{*}(r_n) + w_{n-1} \big) \Big), \qquad w_n = \nabla f^{*}(r_n) + w_{n-1} - \nabla f^{*}(q_n).$$
This is convergent in the Euclidean case $D_f(r, q) = \frac{1}{2} \sum_k (r_k - q_k)^2$.
Slide 20: Belief Propagation
An iterative (sometimes "fast") algorithm to calculate marginal probability functions. Given a binary vector $x = [x_1, \ldots, x_M] \in \{0,1\}^M$ and a probability function $G(x)$, seek the marginals
$$f_k(x_k) = \sum_{x_1=0}^{1} \cdots \sum_{x_{k-1}=0}^{1} \sum_{x_{k+1}=0}^{1} \cdots \sum_{x_M=0}^{1} G(x).$$
The calculation is hard, since there are $2^M$ evaluations over $x$. BP is applicable when $G(x)$ splits into simpler factors:
$$G(x) = \prod_{k=1}^{K} g_k(x),$$
where usually each factor $g_k$ depends only on a subset of the variables in $x$.
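To make the $2^M$ cost concrete, a brute-force marginalization sketch for small $M$, with illustrative factors of my own choosing:

```python
import itertools
import numpy as np

def marginals_brute_force(factors, M):
    """Exact marginals of G(x) = prod_k g_k(x) over x in {0,1}^M by summing
    all 2^M configurations -- tractable only for small M."""
    marg = np.zeros((M, 2))
    for x in itertools.product((0, 1), repeat=M):
        g = np.prod([gk(x) for gk in factors])
        for k in range(M):
            marg[k, x[k]] += g
    return marg / marg.sum(axis=1, keepdims=True)   # normalize each row

# Illustrative factors (my own): a soft 3-bit parity check plus two
# single-variable "evidence" factors.
factors = [
    lambda x: 1.0 if (x[0] ^ x[1] ^ x[2]) == 0 else 0.1,
    lambda x: 0.8 if x[0] == 0 else 0.2,
    lambda x: 0.3 if x[1] == 0 else 0.7,
]
print(marginals_brute_force(factors, M=3))
```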
Slide 21: Message Passing Algorithm
Variable $x_i$ to factor $g_k$:
$$m_{x_i \to g_k}(x_i) = \beta_k \prod_{l \ne k} m_{g_l \to x_i}(x_i).$$
Factor $g_k$ to variable $x_i$:
$$m_{g_k \to x_i}(x_i) = \alpha_i \sum_{x_n : n \ne i} g_k(x) \prod_{l \ne i} m_{x_l \to g_k}(x_l).$$
[Figure: factor graph with variable nodes $x_1, \ldots, x_5$ and factor nodes $g_1(x), \ldots, g_6(x)$; example messages $m_{x_1 \to g_1}(x_1)$ and $m_{g_6 \to x_5}(x_5)$.]
The outgoing message on an edge depends only on the incoming messages from the other edges at that node ("extrinsic information"). If convergent, the beliefs
$$b_i(x_i) = \beta_i \prod_l m_{g_l \to x_i}(x_i)$$
are thresholded: $b_i(0) \gtrless b_i(1)$. Ideally, the beliefs would be the marginals.
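On a tree-structured graph these messages yield exact marginals. A minimal sketch on a two-variable chain (toy factor values of my own choosing), checked against brute-force marginalization:

```python
import numpy as np

# Toy tree-structured factor graph: g1(x0) -- x0 -- g2(x0, x1) -- x1 -- g3(x1),
# with binary variables and factor values chosen arbitrarily for illustration.
g1 = np.array([0.9, 0.1])        # single-variable factor on x0
g3 = np.array([0.4, 0.6])        # single-variable factor on x1
g2 = np.array([[0.7, 0.3],       # pairwise factor, indexed g2[x0, x1]
               [0.2, 0.8]])

# Variable-to-factor message from x0 to g2: product of other incoming messages.
m_x0_to_g2 = g1
# Factor-to-variable message from g2 to x1: sum out x0 against that message.
m_g2_to_x1 = g2.T @ m_x0_to_g2
# Belief at x1: product of all incoming factor messages, normalized.
b1 = g3 * m_g2_to_x1
b1 /= b1.sum()

# On a tree the belief equals the exact marginal; verify by brute force.
G = g1[:, None] * g2 * g3[None, :]        # G(x0, x1), unnormalized
print(b1, G.sum(axis=0) / G.sum())        # the two agree
```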
Slide 22:
For the projection interpretation, let $x^1, \ldots, x^K$ be $K$ copies of $x$, and consider the extended likelihood function
$$G(x^1, \ldots, x^K) = \prod_{k=1}^{K} g_k(x^k).$$
The original function is obtained with the constraint $x^1 = \cdots = x^K = x$. Two convex sets arise:
- Product distributions over the $MK$ binary variables:
$$\mathcal{P} = \Big\{ q : q(x^1, \ldots, x^K) = \prod_{k=1}^{K} \prod_{m=1}^{M} q_{k,m}(x_m^k) \Big\};$$
- Constraint (or "sparse") distributions:
$$\mathcal{Q} = \big\{ r : r(x^1, \ldots, x^K) = 0 \text{ if } x^i \ne x^j \text{ for any } i \ne j \big\}.$$
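A small sketch (my construction) of the lifting for $M = K = 2$: zeroing out configurations where the copies disagree recovers the original $G(x) = g_1(x)\, g_2(x)$ on the surviving diagonal.

```python
import itertools

# Lifting for M = 2 variables and K = 2 factors (toy values of my choosing):
# copies x^1, x^2 of a 2-bit vector, extended likelihood G(x^1, x^2) = g1(x^1) g2(x^2).
g1 = lambda x: 0.9 if x[0] == x[1] else 0.1
g2 = lambda x: 0.7 if x[0] == 0 else 0.3

configs = list(itertools.product((0, 1), repeat=2))   # all 2-bit vectors
ext = {(x1, x2): g1(x1) * g2(x2) for x1 in configs for x2 in configs}

# The "sparse" set Q zeroes out configurations where the copies disagree;
# the surviving diagonal is exactly the original G(x) = g1(x) g2(x).
diag = {x: ext[(x, x)] for x in configs}
G = {x: g1(x) * g2(x) for x in configs}
print(diag == G)   # True
```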
Slides 23-24: Information geometric view
At factor nodes: given a pmf $q$, if $r \in \mathcal{P}$ is the product distribution built from the marginals of $q$, then
$$D_f(q, s) = D_f(q, r) + \underbrace{D_f(r, s)}_{\ge 0} \quad \text{for all } s \in \mathcal{P},$$
so that $r$ is the Bregman projection of $q$ onto $\mathcal{P}$.
At variable nodes: given a pmf $q$, if $\mathcal{Q}$ denotes a set of sparse distributions, then
$$b_i = \begin{cases} \beta\, q_i, & i \in \text{index set for } \mathcal{Q}; \\ 0, & \text{otherwise}; \end{cases}$$
satisfies
$$D_f(s, q) = D_f(s, b) + \underbrace{D_f(b, q)}_{\ge 0} \quad \text{for all } s \in \mathcal{Q},$$
so that $b$ (containing the beliefs) is the Bregman projection of $q$ onto $\mathcal{Q}$.
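The factor-node Pythagorean identity is easy to check numerically; this sketch (random joint pmf and an arbitrary product pmf $s$, both my own choices) verifies $D_f(q,s) = D_f(q,r) + D_f(r,s)$ with $D_f$ the KL divergence and $r$ the product of the marginals of $q$.

```python
import numpy as np

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(0)
q = rng.random((2, 2)); q /= q.sum()                  # random joint pmf q(x, y)
r = np.outer(q.sum(axis=1), q.sum(axis=0))            # product of q's marginals

sx, sy = np.array([0.3, 0.7]), np.array([0.6, 0.4])   # arbitrary product pmf s
s = np.outer(sx, sy)
print(np.isclose(kl(q, s), kl(q, r) + kl(r, s)))      # True: Pythagorean identity
```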
Slide 25: Calculation of marginal pmfs
The ideal solution is
$$p = \pi_{\mathcal{Q}}\big( \nabla f^{*}(\chi_{-1}) \big) \quad \text{(constrain to sparse distributions)},$$
$$\chi_0 = \pi_{\mathcal{P}}\big( \nabla f(p) \big) \quad \text{(calculate marginal distributions)},$$
with initialization
$$\chi_{-1} = \nabla f\Big( \prod_k g_k(x^k) \Big).$$
Slide 26:
Belief propagation is the cyclic projection algorithm
$$\xi_n = \pi_{\mathcal{P}}(\chi_{n-1} + \sigma_{n-1}), \qquad \sigma_n = \chi_{n-1} + \sigma_{n-1} - \xi_n,$$
$$p_n = \pi_{\mathcal{Q}}\big( \nabla f^{*}(\xi_n + \tau_{n-1}) \big), \qquad \chi_n = \pi_{\mathcal{P}}\big( \nabla f(p_n) \big), \qquad \tau_n = \xi_n + \tau_{n-1} - \chi_n,$$
with initialization
$$\chi_{-1} = \nabla f\Big( \prod_k g_k(x^k) \Big), \qquad \sigma_{-1} = \tau_{-1} = 0.$$
[Figure: cycle diagram between the sets $\mathcal{P}$ and $\mathcal{Q}$, alternating the projections $\pi_{\mathcal{P}}$ and $\pi_{\mathcal{Q}} \circ \nabla f^{*}$ with the correction terms $\sigma_n$ and $\tau_n$.]
Slides 27-28: Special Case
Replace the negative Shannon entropy by the energy function to get "Euclidean belief propagation":
$$\xi_n = \pi_{\mathcal{P}}(\chi_{n-1} + \sigma_{n-1}), \qquad \sigma_n = \chi_{n-1} + \sigma_{n-1} - \xi_n,$$
$$p_n = \pi_{\mathcal{Q}}(\xi_n + \tau_{n-1}), \qquad \chi_n = \pi_{\mathcal{P}}(p_n), \qquad \tau_n = \xi_n + \tau_{n-1} - \chi_n.$$
Given arbitrary convex sets $\mathcal{P}$ and $\mathcal{Q}$, one can show that $\{\chi_n\}$ converges for any initial condition $\chi_{-1}$. Conventional belief propagation, by contrast, is convergent when either of the following applies:
- the factor graph is a tree/forest (or nearly so: large girth);
- the initialization $\chi_{-1}$ is a product distribution (or nearly so: $\pi_{\mathcal{P}}(\chi_{-1}) \approx \chi_{-1}$).
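A minimal sketch of Euclidean belief propagation with toy convex sets of my own choosing ($\mathcal{P}$ a line, $\mathcal{Q}$ a box); the iterates $\chi_n$ settle to a point of $\mathcal{P}$ regardless of the initialization, illustrating the convergence claim:

```python
import numpy as np

# Toy sets (my own choice): P = {x : x0 = x1} (a line), Q = the box [0.2, 0.4]^2.
proj_P = lambda x: np.full_like(x, x.mean())   # Euclidean projection onto the line
proj_Q = lambda x: np.clip(x, 0.2, 0.4)        # Euclidean projection onto the box

chi = np.array([3.0, -1.0])                    # arbitrary initialization chi_{-1}
sigma = np.zeros(2)
tau = np.zeros(2)
for _ in range(50):                            # the recursion from the slide
    xi = proj_P(chi + sigma)
    sigma = chi + sigma - xi
    p = proj_Q(xi + tau)
    chi = proj_P(p)
    tau = xi + tau - chi
print(chi)   # the iterates settle (here to [0.4, 0.4], a point of P and Q)
```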
Slide 29: Further perspectives
- Rephrasing belief propagation in terms of cyclic convex projection algorithms suggests that convergence studies of the latter might extend to the former.
- Euclidean belief propagation is always convergent, although conventional (information projection-based) belief propagation can diverge in some settings.
- Continuity of projectors can be invoked to extend the cases where belief propagation is proved to converge to good solutions.
- Can some modification of belief propagation give a more faithful transcription of Dykstra's algorithm?