    1. Probabilistic Model

    There are two different ways to represent a distribution over several random variables:

    • product of conditional probabilities: \(p(x_1,x_2,x_3,x_4)=p(x_4)\,p(x_3|x_4)\,p(x_2|x_3,x_4)\,p(x_1|x_2,x_3,x_4)\)

    • global energy function:

    \(p(x_1,x_2,x_3,x_4)=\frac{1}{Z}e^{-E(x_1,x_2,x_3,x_4)}\),

    where \(Z\) is the partition function.

    Directed graphical models (Bayesian networks) use conditional probabilities, while undirected graphical models (Markov random fields, Boltzmann machines) use energy functions that are a sum of several terms. A deep belief net (DBN) is a hybrid model that combines both.
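
As a rough numerical check of the two representations, the sketch below builds a made-up joint table over four binary variables, recovers the conditionals of the chain-rule factorization from it, and also rewrites the same table as an energy function. The table values and all variable names are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical joint table p(x1, x2, x3, x4) over four binary variables.
# The 0.1 offset keeps every entry strictly positive so the log below is finite.
joint = rng.random((2, 2, 2, 2)) + 0.1
joint /= joint.sum()

# Marginals obtained by summing out variables (axis 0 is x1, axis 3 is x4).
p4 = joint.sum(axis=(0, 1, 2))   # p(x4)
p34 = joint.sum(axis=(0, 1))     # p(x3, x4)
p234 = joint.sum(axis=0)         # p(x2, x3, x4)

# Conditionals as ratios of marginals; broadcasting aligns the trailing axes.
p3_given_4 = p34 / p4            # p(x3 | x4)
p2_given_34 = p234 / p34         # p(x2 | x3, x4)
p1_given_234 = joint / p234      # p(x1 | x2, x3, x4)

# Representation 1: product of conditional probabilities.
reconstructed = p4 * p3_given_4 * p2_given_34 * p1_given_234

# Representation 2: global energy function, E(x) = -log p(x), so that Z = 1.
energy = -np.log(joint)
Z = np.exp(-energy).sum()
boltzmann = np.exp(-energy) / Z

print(np.allclose(reconstructed, joint), np.allclose(boltzmann, joint))
```

Both forms describe the same distribution here; the energy is only defined up to an additive constant, which \(Z\) absorbs.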

    Directed Graphs

    Directed graphs are useful for expressing causal relationships between random variables.

    • The joint distribution defined by the graph is given by the product of a conditional distribution for each node conditioned on its parents.

    • For example, the joint distribution over \(x_1,\dots,x_7\) factorizes:

    \(p(x)=p(x_1)\,p(x_2)\,p(x_3)\,p(x_4|x_1,x_2,x_3)\,p(x_5|x_1,x_3)\,p(x_6|x_4)\,p(x_7|x_4,x_5)\)
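
The factorization above can be sketched in code by storing one conditional table per node and multiplying the entries selected by a configuration. The sketch assumes binary variables and random conditional tables; the names `parents`, `cpts`, and `joint_prob` are hypothetical.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Parents of each node, read off the factorization (0-based: node 0 is x1).
parents = {0: (), 1: (), 2: (), 3: (0, 1, 2), 4: (0, 2), 5: (3,), 6: (3, 4)}

def random_cpt(n_parents):
    """Table of p(node = 1 | parents), one entry per parent configuration."""
    return np.asarray(rng.random((2,) * n_parents))

cpts = {node: random_cpt(len(pa)) for node, pa in parents.items()}

def joint_prob(x):
    """p(x) as the product of each node's conditional given its parents."""
    prob = 1.0
    for node, pa in parents.items():
        p_one = cpts[node][tuple(x[j] for j in pa)]
        prob *= p_one if x[node] == 1 else 1.0 - p_one
    return prob

# Because every factor is a normalized conditional, the product sums to 1
# over all 2**7 configurations without any extra normalization constant.
total = sum(joint_prob(x) for x in itertools.product((0, 1), repeat=7))
print(total)
```

No partition function appears: local normalization of each conditional is enough to make the joint a valid distribution.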

    Markov Random Fields

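    The joint distribution is a product of potential functions \(\phi_c\), one per clique \(c\) of the graph, normalized by the partition function \(Z\):

    \(p(x)=\frac{1}{Z}\prod_c \phi_c(x_c)\)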

    • Each potential function is a mapping from joint configurations of random variables in a clique to non-negative real numbers.

    • The choice of potential functions is not restricted to having specific probabilistic interpretations.

    • Potential functions are often represented as exponentials:

    \(p(x)=\frac{1}{Z}\prod_c \phi_c(x_c)=\frac{1}{Z}\exp\left(-\sum_c E(x_c)\right)=\frac{1}{Z}\exp(-E(x))\) (Boltzmann distribution)

    • Computing \(Z\) is very hard in general, since it requires summing over every joint configuration of \(x\); this is a major limitation of undirected models.
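
To make the cost of the partition function concrete, here is a minimal brute-force sketch for a small chain-structured Markov random field. The model is an assumption made up for illustration: binary variables, cliques given by neighbouring pairs, and clique energies of the form \(-w_c x_i x_{i+1}\) with random weights.

```python
import itertools
import numpy as np

# Hypothetical chain MRF: cliques are the edges (i, i+1),
# each with energy E_c(x_c) = -w_c * x_i * x_{i+1}.
n = 10
rng = np.random.default_rng(0)
weights = rng.normal(size=n - 1)

def energy(x):
    """Global energy E(x) as the sum of the clique energies."""
    x = np.asarray(x)
    return -np.sum(weights * x[:-1] * x[1:])

# Enumerate all 2**n joint configurations to compute Z exactly.
states = list(itertools.product((0, 1), repeat=n))
unnormalized = np.array([np.exp(-energy(x)) for x in states])
Z = unnormalized.sum()
probs = unnormalized / Z   # Boltzmann distribution over all states

print(len(states), probs.sum())
```

The enumeration visits \(2^n\) states, so each added variable doubles the work; this is only feasible for very small models.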


     

    2. Singular Value Decomposition

    Singular Value Decomposition (SVD) is a factorization of a real or complex matrix. Formally, the singular value decomposition of an \(m\times n\) matrix \(M\) is a factorization of the form

    \(M=U\Sigma V^*\)

    where \(U\) is an \(m\times m\) unitary matrix, \(\Sigma\) is an \(m\times n\) rectangular diagonal matrix with nonnegative real numbers on the diagonal, and \(V^*\) is an \(n\times n\) unitary matrix. Here \(V^*\) is the conjugate transpose of \(V\), \((V^*)_{ij}=\overline{V_{ji}}\); for a real matrix it equals the transpose.

    A complex square matrix \(U\) is unitary if \(U^*U=UU^*=I\).

    The diagonal entries \(\Sigma_{ii}\) of \(\Sigma\) are known as the singular values of \(M\); the non-zero ones are the square roots of the non-zero eigenvalues of \(MM^*\). The \(m\) columns of \(U\) and the \(n\) columns of \(V\) are called the left-singular vectors and right-singular vectors of \(M\), respectively.

    The SVD and the eigendecomposition are closely related:

    • The left-singular vectors of \(M\) (columns of \(U\)) are eigenvectors of \(MM^*\).

    • The right-singular vectors of \(M\) (columns of \(V\)) are eigenvectors of \(M^*M\).

    • The non-zero singular values of \(M\) (diagonal entries of \(\Sigma\)) are the square roots of the non-zero eigenvalues of both \(M^*M\) and \(MM^*\).
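
The relations listed above can be checked numerically. The sketch below uses NumPy's `np.linalg.svd` and `np.linalg.eigvalsh` on a random complex \(4\times 3\) matrix; the matrix and its size are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 3
M = rng.normal(size=(m, n)) + 1j * rng.normal(size=(m, n))

# U is m x m, s holds the min(m, n) singular values, Vh is V* (n x n).
U, s, Vh = np.linalg.svd(M, full_matrices=True)
V = Vh.conj().T
k = len(s)

# Rebuild the m x n rectangular diagonal matrix Sigma from s.
Sigma = np.zeros((m, n))
Sigma[:k, :k] = np.diag(s)

MMh = M @ M.conj().T   # M M*, an m x m Hermitian matrix
MhM = M.conj().T @ M   # M* M, an n x n Hermitian matrix

checks = {
    "M = U Sigma V*": np.allclose(U @ Sigma @ Vh, M),
    "U unitary": np.allclose(U.conj().T @ U, np.eye(m)),
    "V unitary": np.allclose(V.conj().T @ V, np.eye(n)),
    # eigvalsh returns ascending eigenvalues; reverse and keep the k largest.
    "s^2 = eig(M* M)": np.allclose(np.linalg.eigvalsh(MhM)[::-1][:k], s**2),
    "s^2 = eig(M M*)": np.allclose(np.linalg.eigvalsh(MMh)[::-1][:k], s**2),
    # Singular vectors are eigenvectors with eigenvalue s_i^2.
    "left vectors": np.allclose(MMh @ U[:, :k], U[:, :k] * s**2),
    "right vectors": np.allclose(MhM @ V[:, :k], V[:, :k] * s**2),
}
for name, ok in checks.items():
    print(name, ok)
```

The remaining \(m-k\) columns of \(U\) in this example are eigenvectors of \(MM^*\) with eigenvalue zero, which is why only the non-zero eigenvalues are shared between \(M^*M\) and \(MM^*\).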

References:

  1. U Toronto CSC2535: http://www.cs.toronto.edu/~hinton/csc2535/lectures.html