
Red represents a -1 potential, green represents a +1 potential. More precisely, it's the diagram of the following model, where the x's are {-1,1}-valued:
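With J_{ij} denoting the edge potential strengths and h_i the local node potentials (red = -1, green = +1), a plausible reconstruction of the model is

$$P(x) \propto \exp\Big(\sum_{(i,j)\in E} J_{ij}\, x_i x_j + \sum_i h_i\, x_i\Big)$$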

We'd like to find the odds of x1 being 1. You could do this by dividing the sum of products of potentials over labellings with x1=1 by the corresponding sum over labellings with x1=-1.
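In symbols, using the reconstruction above:

$$\frac{P(x_1 = 1)}{P(x_1 = -1)} = \frac{\sum_{x:\, x_1 = 1} \exp\big(\sum_{(i,j)\in E} J_{ij}\, x_i x_j + \sum_i h_i\, x_i\big)}{\sum_{x:\, x_1 = -1} \exp\big(\sum_{(i,j)\in E} J_{ij}\, x_i x_j + \sum_i h_i\, x_i\big)}$$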

In a graph without cycles, such sums can be computed in time linear in the number of nodes by means of dynamic programming. In such cases it reduces to a series of local updates
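(in standard sum-product notation, writing $\psi_{ij}(x_i, x_j) = \exp(J_{ij}\, x_i x_j)$ and $\phi_i(x_i) = \exp(h_i\, x_i)$):

$$m_{i,j}(x_j) = \sum_{x_i} \phi_i(x_i)\, \psi_{ij}(x_i, x_j) \prod_{k \in N(i) \setminus \{j\}} m_{k,i}(x_i)$$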


where the evidence (or potential) from each branch is found by marginalizing over the separating variable.

For binary states, the update equations become particularly simple.
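One way to write them, consistent with the ArcTanh[Tanh[J] Tanh[x]] squashing function used in the implementation below:

$$m_{i,j} = \operatorname{arctanh}\!\Big(\tanh(J_{ij})\, \tanh\Big(h_i + \sum_{k \in N(i) \setminus \{j\}} m_{k,i}\Big)\Big)$$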

Here, m_{i,j} is the "message" being sent from node i to node j. If we view message passing as a way of computing the marginal conditioned on the evidence, then the message m_{i,j} represents the log-odds that node i is positive, given all the evidence on the other side of the edge (i,j). If our graph has loops, then the "other side" of an edge no longer makes sense, but you can use the equations anyway, and this is called "loopy BP".
For binary networks with at most one loop, Yair Weiss showed that even though loopy BP may give incorrect probability estimates, the highest-probability labels under BP will be the same as those from exact inference.
An implementation of loopy BP is below. Basically, you add up all the messages from the neighbours, subtract the message from the target node, then "squash" the result towards zero using the ArcTanh[Tanh[J] Tanh[x]] function. The procedure loopyBP takes a matrix of connections and local potential strengths, and returns a function that performs one iteration of BP. Below you see the first few iterations for the graphical model above.
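A minimal sketch of what such a procedure might look like (not the notebook's actual code; the names J for the connection matrix, h for the local potential strengths, and m for the message matrix are assumptions):

loopyBP[J_, h_] := Function[m,
  Table[
    If[J[[i, j]] == 0, 0,
      (* sum the messages arriving at i, drop the one coming from the target j,
         add the local potential, then squash towards zero *)
      ArcTanh[Tanh[J[[i, j]]] Tanh[h[[i]] + Total[m[[All, i]]] - m[[j, i]]]]],
    {i, Length[h]}, {j, Length[h]}]]

One iteration is then step = loopyBP[J, h], and Nest[step, ConstantArray[0., {Length[h], Length[h]}], 10] runs ten iterations starting from all-zero messages.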


To get the final potential at any given node, just add up all the incoming messages. For the graph above, we can plot the resulting potentials and see that they fall on the correct side of zero after 10 iterations, which is a much smaller computational burden than exact marginalization.
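In the sketch above, the final potentials and probabilities would be computed along these lines:

messages = Nest[loopyBP[J, h], ConstantArray[0., {Length[h], Length[h]}], 10];
beliefs = h + Total[messages];    (* local potential plus all incoming messages *)
probs = (1 + Tanh[beliefs])/2;    (* probability of each node being +1, in the ±1 parametrization *)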

Below you can see the estimates produced by running loopy BP for 200 iterations. The top bar at each node is the actual probability; the bottom bar is the estimated probability from loopy BP. You can see that the feedback makes the estimates overly confident.

For graphs with more than one cycle, loopy BP is no longer guaranteed to converge, or to give the MAP labelling when it does converge. For instance, here's an animation of loopy BP applied to a square Ising lattice with randomly initialized neighbour potentials.
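A hypothetical version of that experiment using the loopyBP sketch above (the lattice size and the ±1 edge potentials are made-up choices):

n = 10;  (* 10x10 lattice *)
idx[r_, c_] := (r - 1) n + c;
J = ConstantArray[0., {n^2, n^2}];
Do[
  If[c < n, J[[idx[r, c], idx[r, c + 1]]] = J[[idx[r, c + 1], idx[r, c]]] = RandomChoice[{-1., 1.}]];
  If[r < n, J[[idx[r, c], idx[r + 1, c]]] = J[[idx[r + 1, c], idx[r, c]]] = RandomChoice[{-1., 1.}]],
  {r, n}, {c, n}];
h = ConstantArray[0., n^2];
frames = NestList[loopyBP[J, h], ConstantArray[0., {n^2, n^2}], 50];  (* message matrices over 50 iterations *)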
Some methods like the Convex-Concave procedure and Generalized Belief Propagation improve on BP's convergence and accuracy.
Mathematica notebook used to produce these figures