Loopy Belief Propagation for Bipartite Maximum Weight b-matching
1 Loopy Belief Propagation for Bipartite Maximum Weight b-matching. Bert Huang and Tony Jebara, Computer Science Department, Columbia University, New York, NY 10027.
2 Outline 1. Bipartite Weighted b-matching 2. Edge Weights As a Distribution 3. Efficient Max-Product 4. Convergence Proof Sketch 5. Experiments 6. Discussion
4 Bipartite Weighted b-matching
5-6 Bipartite Weighted b-matching
On a bipartite graph G = (U, V, E), with U = {u_1, ..., u_n}, V = {v_1, ..., v_n}, and E = {(u_i, v_j)} for all i, j.
A = weight matrix such that the weight of edge (u_i, v_j) is A_ij.
7-8 Bipartite Weighted b-matching
Task: find the maximum weight subset of E such that each vertex has exactly b neighbors.
Example: n = 4, b = 2.
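To make the task concrete, here is a brute-force reference solver in Python (a hypothetical helper, not one of the algorithms discussed in these slides); it enumerates edge subsets and is only feasible for tiny instances:

```python
import itertools

def brute_force_b_matching(A, b):
    """Exhaustive maximum weight b-matching for tiny instances.

    A is an n x n weight matrix with A[i][j] the weight of edge (u_i, v_j).
    A valid b-matching uses exactly b*n edges, with every vertex on both
    sides having degree exactly b. Cost is exponential in n, so this is
    only a correctness reference, not a practical algorithm.
    """
    n = len(A)
    edges = [(i, j) for i in range(n) for j in range(n)]
    best_weight, best = float("-inf"), None
    for subset in itertools.combinations(edges, b * n):
        deg_u, deg_v = [0] * n, [0] * n
        for i, j in subset:
            deg_u[i] += 1
            deg_v[j] += 1
        if all(d == b for d in deg_u) and all(d == b for d in deg_v):
            w = sum(A[i][j] for i, j in subset)
            if w > best_weight:
                best_weight, best = w, subset
    return best_weight, best

# For n = 2, b = 1 this is ordinary maximum weight perfect matching:
assert brute_force_b_matching([[1, 2], [3, 0]], 1)[0] == 5
```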
9 Bipartite Weighted b-matching
Classical application: resource allocation (manual labor).
- n workers, n tasks.
- A team of b workers is needed per task.
- A_ij = skill of worker u_i at performing task v_j.
[Figure: workers u_1, ..., u_4 on one side, tasks v_1, ..., v_4 on the other.]
10 Bipartite Weighted b-matching Alternate uses of b-matching: - Balanced k-nearest-neighbors - Each node can only be picked k times. - Robust to translations of test data. - When test data is collected under different conditions (e.g. time, location, instrument calibration).
11 Bipartite Weighted b-matching
Classical algorithms solve maximum weight b-matching in O(bn³) running time, such as:
- Blossom Algorithm (Edmonds 1965)
- Balanced Network Flow (Fremuth-Paeger & Jungnickel 1999)
12 Outline 1. Bipartite Weighted b-matching 2. Edge Weights As a Distribution 3. Efficient Max-Product 4. Convergence Proof Sketch 5. Experiments 6. Discussion
13 Edge Weights As a Distribution - Bayati, Shah, and Sharma (2005) formulated the 1-matching problem as a probability distribution. - This work generalizes to arbitrary b.
14 Edge Weights As a Distribution Variables: Each vertex chooses b neighbors.
15 Example: the possible settings for u_1 with b = 2. [Figure: six copies of u_1, each connected to a different pair from {v_1, ..., v_4}, one per (4 choose 2) = 6 possible neighbor sets.]
17 Edge Weights As a Distribution
Variables: each vertex chooses b neighbors.
For vertex u_i: X_i ⊆ V with |X_i| = b. Similarly, for v_j we have variable Y_j.
Note: each variable has (n choose b) possible settings.
18 Edge Weights As a Distribution
Weights as probabilities: since we sum weights but multiply probabilities, exponentiate:
φ(X_i) = exp( (1/2) Σ_{v_j ∈ X_i} A_ij ),  φ(Y_j) = exp( (1/2) Σ_{u_i ∈ Y_j} A_ij ).
19 Edge Weights As a Distribution Enforce b-matching: Neighbor choices must agree
20 Example: invalid settings, where neighbor choices disagree (e.g. u_1 selects v_1 but v_1 does not select u_1).
22 Edge Weights As a Distribution
Enforce b-matching: neighbor choices must agree.
Pairwise compatibility function:
ψ(X_i, Y_j) = 1(v_j ∈ X_i ⟺ u_i ∈ Y_j).
23-25 Edge Weights As a Distribution
P(X, Y) = (1/Z) ∏_{i,j=1}^n ψ(X_i, Y_j) ∏_{k=1}^n φ(X_k) φ(Y_k),
where φ(X_i) = exp( (1/2) Σ_{v_j ∈ X_i} A_ij ), φ(Y_j) = exp( (1/2) Σ_{u_i ∈ Y_j} A_ij ), and ψ(X_i, Y_j) = 1(v_j ∈ X_i ⟺ u_i ∈ Y_j).
Ignoring the Z normalization, P(X, Y) is exactly the exponentiated weight of the b-matching.
Also, since we're maximizing, we can ignore the 1/2 (it makes the math more readable).
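That claim can be checked numerically (a sketch with our own helper names): for any consistent setting, the unnormalized product of potentials equals the exponentiated matched weight, since each matched edge contributes A_ij/2 from the u side and A_ij/2 from the v side.

```python
import math

def unnormalized_p(X, Y, A):
    """Product of potentials without 1/Z. X[i] and Y[j] are index sets:
    X[i] holds the v-indices chosen by u_i, Y[j] the u-indices chosen by v_j."""
    n = len(A)
    # pairwise compatibility: psi = 0 unless (v_j in X_i) <=> (u_i in Y_j)
    for i in range(n):
        for j in range(n):
            if (j in X[i]) != (i in Y[j]):
                return 0.0
    # each side contributes half the weight of its chosen edges
    log_p = sum(0.5 * A[i][j] for i in range(n) for j in X[i])
    log_p += sum(0.5 * A[i][j] for j in range(n) for i in Y[j])
    return math.exp(log_p)

A = [[1.0, 2.0], [3.0, 4.0]]
X = [{1}, {0}]   # u_1 -> v_2, u_2 -> v_1 (a 1-matching)
Y = [{1}, {0}]   # the consistent reverse choices
# matched weight = A[0][1] + A[1][0] = 5
assert abs(unnormalized_p(X, Y, A) - math.exp(5.0)) < 1e-9
# inconsistent neighbor choices get probability zero
assert unnormalized_p([{1}, {0}], [{0}, {1}], A) == 0.0
```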
26 Outline 1. Bipartite Weighted b-matching 2. Edge Weights As a Distribution 3. Efficient Max-Product 4. Convergence Proof Sketch 5. Experiments 6. Discussion
27 Standard Max-Product
Send messages between variables:
m_{X_i}(Y_j) = (1/Z) max_{X_i} φ(X_i) ψ(X_i, Y_j) ∏_{k ≠ j} m_{Y_k}(X_i).
Fuse messages to obtain beliefs (estimates of max-marginals):
b(X_i) = (1/Z) φ(X_i) ∏_k m_{Y_k}(X_i).
28-29 Standard Max-Product
Converges to the true maximum on any tree-structured graph (Pearl 1986). We show that it converges to the correct maximum on our graph.
But what about the (n choose b)-length message and belief vectors?
30 Efficient Max-Product
- Use algebraic tricks to reduce the (n choose b)-length message vectors to scalars.
- Derive a new update rule for the scalar messages.
- Use a similar trick to maximize belief vectors efficiently.
Let's speed through the math.
31-33 Efficient Max-Product
Take advantage of the binary ψ(X_i, Y_j) function: each message vector consists of only two values,
μ_{x_i → y_j} = max_{x_i : v_j ∈ x_i} φ(x_i) ∏_{k ≠ j} m_{y_k}(x_i), used when u_i ∈ y_j,
ν_{x_i → y_j} = max_{x_i : v_j ∉ x_i} φ(x_i) ∏_{k ≠ j} m_{y_k}(x_i), used when u_i ∉ y_j.
If we rename these two values we can break up the product: each incoming message m_{y_k}(x_i) contributes its μ_{y_k → x_i} value when v_k ∈ x_i and its ν_{y_k → x_i} value when v_k ∉ x_i.
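The two-value claim can be verified numerically on a tiny instance (a sketch; the weights and indices are illustrative, incoming messages are set to 1, and the 1/2 is already dropped as above): over all (n choose b) settings of Y_j, the message from X_i takes exactly two distinct values, depending only on whether u_i ∈ Y_j.

```python
import itertools
import math

n, b, i, j = 4, 2, 0, 1
A_row = [1.0, 2.0, 3.0, 4.0]      # illustrative weights A[i][k]

def message(y_j):
    """m_{x_i}(y_j) with uniform incoming messages: max over settings x_i
    of phi(x_i) * psi(x_i, y_j)."""
    best = 0.0
    for x_i in itertools.combinations(range(n), b):
        if (j in x_i) != (i in y_j):   # psi = 0: inconsistent setting
            continue
        best = max(best, math.exp(sum(A_row[k] for k in x_i)))
    return best

values = {round(message(y_j), 9)
          for y_j in itertools.combinations(range(n), b)}
# exactly two values: mu (when u_i in y_j) and nu (when u_i not in y_j)
assert len(values) == 2
```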
34-35 Efficient Max-Product
Normalize messages by dividing the whole vector by ν_{x_i → y_j}:
μ̂_{x_i → y_j} = μ_{x_i → y_j} / ν_{x_i → y_j}  and  ν̂_{x_i → y_j} = 1.
Each (n choose b)-length vector reduces to a scalar.
36-39 Efficient Max-Product
Derive the update rule:
μ̂_{x_i → y_j} = [ max_{x_i : j ∈ x_i} φ(x_i) ∏_{k ∈ x_i \ j} μ̂_{k → i} ] / [ max_{x_i : j ∉ x_i} φ(x_i) ∏_{k ∈ x_i} μ̂_{k → i} ]
= [ max_{x_i : j ∈ x_i} ∏_{k ∈ x_i} exp(A_ik) ∏_{k ∈ x_i \ j} μ̂_{k → i} ] / [ max_{x_i : j ∉ x_i} ∏_{k ∈ x_i} exp(A_ik) ∏_{k ∈ x_i} μ̂_{k → i} ]
= exp(A_ij) [ max_{x_i : j ∈ x_i} ∏_{k ∈ x_i \ j} exp(A_ik) μ̂_{k → i} ] / [ max_{x_i : j ∉ x_i} ∏_{k ∈ x_i} exp(A_ik) μ̂_{k → i} ]
40 Efficient Max-Product
After canceling terms, the message update simplifies to
μ̂_{x_i → y_j} = exp(A_ij) / ( exp(A_il) μ̂_{y_l → x_i} ),
where l is the index of the bth greatest value of exp(A_ik) μ̂_{y_k → x_i} over k ≠ j.
And we maximize beliefs with
max_{x_i} b(x_i) = max_{x_i} φ(x_i) ∏_{k ∈ x_i} μ̂_{y_k → x_i} = max_{x_i} ∏_{k ∈ x_i} exp(A_ik) μ̂_{y_k → x_i},
i.e. x_i is set to the b indices k with the greatest exp(A_ik) μ̂_{y_k → x_i}.
Both of these updates take O(bn) time per vertex.
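The simplified updates fit in a short Python sketch (our variable names; a parallel message schedule, log-domain messages for numerical stability, and a plain sort in place of the O(bn) selection the slides assume; it assumes b ≤ n − 1 and a unique optimum):

```python
def bp_b_matching(A, b, iters=20):
    """Scalar max-product for bipartite b-matching, in log domain.

    mu[i][j] is the log-message from x_i to y_j; nu[j][i] from y_j to x_i.
    In log domain the update exp(A_ij) / (exp(A_il) * mu_hat) becomes
    A_ij - (A_il + log_mu_hat), with l the b-th greatest term over k != j.
    """
    n = len(A)
    mu = [[0.0] * n for _ in range(n)]
    nu = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        new_mu = [[0.0] * n for _ in range(n)]
        new_nu = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                vals = sorted((A[i][k] + nu[k][i] for k in range(n) if k != j),
                              reverse=True)
                new_mu[i][j] = A[i][j] - vals[b - 1]
        for j in range(n):
            for i in range(n):
                vals = sorted((A[k][j] + mu[k][j] for k in range(n) if k != i),
                              reverse=True)
                new_nu[j][i] = A[i][j] - vals[b - 1]
        mu, nu = new_mu, new_nu
    # maximize beliefs: each u_i keeps the b best A_ik + incoming log-message
    return [set(sorted(range(n), key=lambda k: A[i][k] + nu[k][i],
                       reverse=True)[:b])
            for i in range(n)]

# b = 1: the unique best perfect matching pairs u_1-v_2 and u_2-v_1
assert bp_b_matching([[1.0, 2.0], [3.0, 0.0]], 1) == [{1}, {0}]
```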
41 Outline 1. Bipartite Weighted b-matching 2. Edge Weights As a Distribution 3. Efficient Max-Product 4. Convergence Proof Sketch 5. Experiments 6. Discussion
42 Convergence Proof Sketch
43 Convergence Proof Sketch
Assumptions:
- The optimal b-matching is unique.
- ε = the difference between the weight of the best and 2nd-best b-matchings; assumed constant.
- Weights treated as constants.
44-48 Convergence Proof Sketch
Basic mechanism: the unwrapped graph T.
1. Pick a root node.
2. Copy all its neighbors.
3. Continue, but don't backtrack.
4. Continue to depth d.
[Figure: G with vertices u_1, ..., u_4 and v_1, ..., v_4, and its unwrapped tree rooted at u_1: the root connects to copies of v_1, ..., v_4, each of which connects to copies of the remaining u's, and so on to depth d.]
49 Convergence Proof Sketch
Basic mechanism: the unwrapped graph T.
- The construction follows loopy BP messages in reverse.
- The true max-marginals of the root node are exactly its beliefs at iteration d.
- The max of the unwrapped graph's distribution is the maximum weight b-matching on the tree.
50 Convergence Proof Sketch Proof by contradiction: What happens if optimal b-matching on T differs from optimal b-matching on G at root?
51 Convergence Proof Sketch
Best b-matching on T vs. best b-matching on G copied onto T.
[Figure: the unwrapped graph T of G at 3 iterations, with the matching M_T highlighted. Leaf nodes cannot have perfect b-matchings, but all inner nodes and the root do.]
52-53 Convergence Proof Sketch
There exists at least one path on T that alternates between edges that are b-matched in each of the two b-matchings.
[Figure: the alternating path highlighted on the unwrapped tree.]
54-56 Convergence Proof Sketch
Claim: if the depth d is great enough, then replacing the blue edges of this path with the red edges in the optimal b-matching on T yields a new b-matching with greater weight.
[Figure: the alternating path on the unwrapped tree, with edges colored by which b-matching they belong to.]
57 Convergence Proof Sketch
Modifying the optimal b-matching produced a better b-matching, so the original assumption is contradicted: the two b-matchings cannot disagree at the root.
We can analyze the change in weight by looking only at the edges on the path.
58-59 Convergence Proof Sketch
Loopy BP converges to the true maximum weight b-matching in d iterations, where
d ≥ n max_{i,j} A_ij / ε = O(n),
with ε the difference between the weight of the best and 2nd-best b-matchings.
Running time of the full algorithm: O(bn³).
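Plugged into code, the bound is a one-liner (illustrative numbers; ε comes from the problem instance and is generally unknown in advance, so this is an analysis tool rather than a stopping rule):

```python
def iteration_bound(A, eps):
    """d >= n * max_ij A_ij / eps iterations suffice for convergence."""
    n = len(A)
    return n * max(max(row) for row in A) / eps

# n = 2, max weight 4, optimality gap eps = 2  ->  d = 4 iterations suffice
assert iteration_bound([[1.0, 2.0], [3.0, 4.0]], 2.0) == 4.0
```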
60 Outline 1. Bipartite Weighted b-matching 2. Edge Weights As a Distribution 3. Efficient Max-Product 4. Convergence Proof Sketch 5. Experiments 6. Discussion
61 Experiments
Running time comparison against the GOBLIN graph optimization library.
- Random weights.
- Varied graph size n (starting from 3).
- Varied b from 1 to n/2.
62 Experiments
[Plots: median running times of BP and GOBLIN vs. n and b, on a t^{1/3} scale; one panel shows b = ⌈n/2⌉.]
63 Experiments: Translated Test Data
On toy data, translation cripples kNN but b-matching makes no classification errors.
[Plots: accuracy vs. amount of translation and accuracy vs. k, for b-matching and k-nearest-neighbor.]
64 Experiments MNIST Digits with pseudo-translation - Image data with background changes is like translation. - Train on MNIST digits 3, 5, and 8. - Test on new examples with various bluescreen textures.
65 Experiments
[Plots: average accuracy per background texture and average accuracy vs. k, for b-matching (BM) and k-nearest-neighbor (KNN).]
66 Outline 1. Bipartite Weighted b-matching 2. Edge Weights As a Distribution 3. Efficient Max-Product 4. Convergence Proof Sketch 5. Experiments 6. Discussion
67 Discussion
Provably convergent belief propagation for a new type of graph (b-matchings).
+ Empirically faster than previous algorithms.
+ Parallelizable.
- Only the bipartite case.
- Requires a unique maximum.
Interesting theoretical results are coming out of sum-product for approximating marginals.
More informationLecture 8: PGM Inference
15 September 2014 Intro. to Stats. Machine Learning COMP SCI 4401/7401 Table of Contents I 1 Variable elimination Max-product Sum-product 2 LP Relaxations QP Relaxations 3 Marginal and MAP X1 X2 X3 X4
More informationIntelligent Systems Discriminative Learning, Neural Networks
Intelligent Systems Discriminative Learning, Neural Networks Carsten Rother, Dmitrij Schlesinger WS2014/2015, Outline 1. Discriminative learning 2. Neurons and linear classifiers: 1) Perceptron-Algorithm
More informationFractional Belief Propagation
Fractional Belief Propagation im iegerinck and Tom Heskes S, niversity of ijmegen Geert Grooteplein 21, 6525 EZ, ijmegen, the etherlands wimw,tom @snn.kun.nl Abstract e consider loopy belief propagation
More informationStatistical Machine Learning from Data
Samy Bengio Statistical Machine Learning from Data 1 Statistical Machine Learning from Data Ensembles Samy Bengio IDIAP Research Institute, Martigny, Switzerland, and Ecole Polytechnique Fédérale de Lausanne
More informationCS264: Beyond Worst-Case Analysis Lecture #11: LP Decoding
CS264: Beyond Worst-Case Analysis Lecture #11: LP Decoding Tim Roughgarden October 29, 2014 1 Preamble This lecture covers our final subtopic within the exact and approximate recovery part of the course.
More informationSupport Vector Machine (SVM) and Kernel Methods
Support Vector Machine (SVM) and Kernel Methods CE-717: Machine Learning Sharif University of Technology Fall 2016 Soleymani Outline Margin concept Hard-Margin SVM Soft-Margin SVM Dual Problems of Hard-Margin
More informationExpectation propagation for symbol detection in large-scale MIMO communications
Expectation propagation for symbol detection in large-scale MIMO communications Pablo M. Olmos olmos@tsc.uc3m.es Joint work with Javier Céspedes (UC3M) Matilde Sánchez-Fernández (UC3M) and Fernando Pérez-Cruz
More informationLinear Response for Approximate Inference
Linear Response for Approximate Inference Max Welling Department of Computer Science University of Toronto Toronto M5S 3G4 Canada welling@cs.utoronto.ca Yee Whye Teh Computer Science Division University
More informationCourse 16:198:520: Introduction To Artificial Intelligence Lecture 9. Markov Networks. Abdeslam Boularias. Monday, October 14, 2015
Course 16:198:520: Introduction To Artificial Intelligence Lecture 9 Markov Networks Abdeslam Boularias Monday, October 14, 2015 1 / 58 Overview Bayesian networks, presented in the previous lecture, are
More information3 : Representation of Undirected GM
10-708: Probabilistic Graphical Models 10-708, Spring 2016 3 : Representation of Undirected GM Lecturer: Eric P. Xing Scribes: Longqi Cai, Man-Chia Chang 1 MRF vs BN There are two types of graphical models:
More information(i) The optimisation problem solved is 1 min
STATISTICAL LEARNING IN PRACTICE Part III / Lent 208 Example Sheet 3 (of 4) By Dr T. Wang You have the option to submit your answers to Questions and 4 to be marked. If you want your answers to be marked,
More informationWalk-Sum Interpretation and Analysis of Gaussian Belief Propagation
Walk-Sum Interpretation and Analysis of Gaussian Belief Propagation Jason K. Johnson, Dmitry M. Malioutov and Alan S. Willsky Department of Electrical Engineering and Computer Science Massachusetts Institute
More informationPart 4: Conditional Random Fields
Part 4: Conditional Random Fields Sebastian Nowozin and Christoph H. Lampert Colorado Springs, 25th June 2011 1 / 39 Problem (Probabilistic Learning) Let d(y x) be the (unknown) true conditional distribution.
More informationLecture 18 Generalized Belief Propagation and Free Energy Approximations
Lecture 18, Generalized Belief Propagation and Free Energy Approximations 1 Lecture 18 Generalized Belief Propagation and Free Energy Approximations In this lecture we talked about graphical models and
More informationOutline: Ensemble Learning. Ensemble Learning. The Wisdom of Crowds. The Wisdom of Crowds - Really? Crowd wiser than any individual
Outline: Ensemble Learning We will describe and investigate algorithms to Ensemble Learning Lecture 10, DD2431 Machine Learning A. Maki, J. Sullivan October 2014 train weak classifiers/regressors and how
More informationExpectation Propagation Algorithm
Expectation Propagation Algorithm 1 Shuang Wang School of Electrical and Computer Engineering University of Oklahoma, Tulsa, OK, 74135 Email: {shuangwang}@ou.edu This note contains three parts. First,
More informationConsensus Algorithms for Camera Sensor Networks. Roberto Tron Vision, Dynamics and Learning Lab Johns Hopkins University
Consensus Algorithms for Camera Sensor Networks Roberto Tron Vision, Dynamics and Learning Lab Johns Hopkins University Camera Sensor Networks Motes Small, battery powered Embedded camera Wireless interface
More informationStatistical Approaches to Learning and Discovery
Statistical Approaches to Learning and Discovery Graphical Models Zoubin Ghahramani & Teddy Seidenfeld zoubin@cs.cmu.edu & teddy@stat.cmu.edu CALD / CS / Statistics / Philosophy Carnegie Mellon University
More informationApproximate Message Passing
Approximate Message Passing Mohammad Emtiyaz Khan CS, UBC February 8, 2012 Abstract In this note, I summarize Sections 5.1 and 5.2 of Arian Maleki s PhD thesis. 1 Notation We denote scalars by small letters
More informationThe min cost flow problem Course notes for Search and Optimization Spring 2004
The min cost flow problem Course notes for Search and Optimization Spring 2004 Peter Bro Miltersen February 20, 2004 Version 1.3 1 Definition of the min cost flow problem We shall consider a generalization
More information5. Sum-product algorithm
Sum-product algorithm 5-1 5. Sum-product algorithm Elimination algorithm Sum-product algorithm on a line Sum-product algorithm on a tree Sum-product algorithm 5-2 Inference tasks on graphical models consider
More informationBayesian Learning in Undirected Graphical Models
Bayesian Learning in Undirected Graphical Models Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College London, UK http://www.gatsby.ucl.ac.uk/ Work with: Iain Murray and Hyun-Chul
More informationProbabilistic Graphical Models
School of Computer Science Probabilistic Graphical Models Variational Inference IV: Variational Principle II Junming Yin Lecture 17, March 21, 2012 X 1 X 1 X 1 X 1 X 2 X 3 X 2 X 2 X 3 X 3 Reading: X 4
More informationThe Ising model and Markov chain Monte Carlo
The Ising model and Markov chain Monte Carlo Ramesh Sridharan These notes give a short description of the Ising model for images and an introduction to Metropolis-Hastings and Gibbs Markov Chain Monte
More informationFinal Exam, Machine Learning, Spring 2009
Name: Andrew ID: Final Exam, 10701 Machine Learning, Spring 2009 - The exam is open-book, open-notes, no electronics other than calculators. - The maximum possible score on this exam is 100. You have 3
More informationBelief propagation decoding of quantum channels by passing quantum messages
Belief propagation decoding of quantum channels by passing quantum messages arxiv:67.4833 QIP 27 Joseph M. Renes lempelziv@flickr To do research in quantum information theory, pick a favorite text on classical
More informationHomomorphism Testing
Homomorphism Shpilka & Wigderson, 2006 Daniel Shahaf Tel-Aviv University January 2009 Table of Contents 1 2 3 1 / 26 Algebra Definitions 1 Algebra Definitions 2 / 26 Groups Algebra Definitions Definition
More informationAdaptive Crowdsourcing via EM with Prior
Adaptive Crowdsourcing via EM with Prior Peter Maginnis and Tanmay Gupta May, 205 In this work, we make two primary contributions: derivation of the EM update for the shifted and rescaled beta prior and
More informationProbabilistic and Bayesian Machine Learning
Probabilistic and Bayesian Machine Learning Day 4: Expectation and Belief Propagation Yee Whye Teh ywteh@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit University College London http://www.gatsby.ucl.ac.uk/
More informationApproximating the Partition Function by Deleting and then Correcting for Model Edges (Extended Abstract)
Approximating the Partition Function by Deleting and then Correcting for Model Edges (Extended Abstract) Arthur Choi and Adnan Darwiche Computer Science Department University of California, Los Angeles
More informationLecture 3: The Shape Context
Lecture 3: The Shape Context Wesley Snyder, Ph.D. UWA, CSSE NCSU, ECE Lecture 3: The Shape Context p. 1/2 Giant Quokka Lecture 3: The Shape Context p. 2/2 The Shape Context Need for invariance Translation,
More informationEnergy minimization via graph-cuts
Energy minimization via graph-cuts Nikos Komodakis Ecole des Ponts ParisTech, LIGM Traitement de l information et vision artificielle Binary energy minimization We will first consider binary MRFs: Graph
More informationExpectation Propagation in Factor Graphs: A Tutorial
DRAFT: Version 0.1, 28 October 2005. Do not distribute. Expectation Propagation in Factor Graphs: A Tutorial Charles Sutton October 28, 2005 Abstract Expectation propagation is an important variational
More informationNeural Network Training
Neural Network Training Sargur Srihari Topics in Network Training 0. Neural network parameters Probabilistic problem formulation Specifying the activation and error functions for Regression Binary classification
More information