Basic

  • Distributions I – Binomial
  • Distributions II – Multinomial
  • KL Divergence
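For two discrete distributions P and Q, KL(P‖Q) = Σᵢ pᵢ log(pᵢ/qᵢ); it is zero exactly when P = Q. A minimal sketch (function name my own):

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) for two discrete distributions given as probability lists.

    Terms with p_i = 0 contribute 0 by the convention 0 * log 0 = 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# KL(P || P) = 0; diverging distributions give a positive value
print(kl_divergence([0.5, 0.5], [0.9, 0.1]))
```

Note that KL divergence is not symmetric: KL(P‖Q) generally differs from KL(Q‖P).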

Sampling

Importance Sampling
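Importance sampling estimates an expectation under a target p by drawing from an easier proposal q and reweighting each draw by p(x)/q(x). A sketch (all names mine), estimating E[x²] = 1 under N(0, 1) using a wider N(0, 2) proposal:

```python
import math
import random

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def importance_estimate(f, n=100_000, seed=0):
    """Estimate E_p[f(x)] for p = N(0, 1) by sampling from q = N(0, 2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0, 2)                              # draw from the proposal q
        w = normal_pdf(x, 0, 1) / normal_pdf(x, 0, 2)    # importance weight p(x)/q(x)
        total += w * f(x)
    return total / n

est = importance_estimate(lambda x: x * x)  # true value is E[x^2] = 1
```

The proposal must cover the target's support; a q that is too narrow relative to p makes the weights blow up and the estimate high-variance.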

MCMC includes:

MCMC (Markov Chain Monte Carlo) - a method that repeatedly draws random values for the parameters of a distribution based on their current values. Each sample is random, but the candidate values are constrained by the current state and the assumed prior distribution of the parameters.

MCMC can be considered a random walk that gradually converges to the target distribution

Metropolis–Hastings is an MCMC method for obtaining a sequence of random samples from a probability distribution from which direct sampling is difficult
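A minimal random-walk Metropolis–Hastings sketch (pure Python, names my own). It needs only an unnormalized log-density: each step proposes a Gaussian perturbation and accepts it with probability min(1, target(x')/target(x)). Here the target is a standard normal:

```python
import math
import random

def metropolis_hastings(log_target, n_samples=50_000, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings over an unnormalized log-density."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0, step)                 # symmetric proposal
        log_alpha = log_target(proposal) - log_target(x)  # log acceptance ratio
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x = proposal                                  # accept; otherwise stay put
        samples.append(x)
    return samples

# target: N(0, 1) up to its normalizing constant
samples = metropolis_hastings(lambda x: -0.5 * x * x)
```

Because the proposal is symmetric, the Hastings correction term cancels; an asymmetric proposal would need the full ratio q(x|x')/q(x'|x) in `log_alpha`.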

Gibbs Sampling is an MCMC method for obtaining a sequence of observations that are approximately drawn from a specified multivariate probability distribution, when direct sampling from the joint distribution is difficult
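Gibbs sampling works when each variable's conditional distribution is easy to sample even though the joint is not. A sketch (names mine) for a bivariate normal with correlation ρ, where both conditionals are known in closed form:

```python
import math
import random

def gibbs_bivariate_normal(rho=0.8, n_samples=50_000, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho,
    alternating exact draws from the conditionals x|y and y|x."""
    rng = random.Random(seed)
    sd = math.sqrt(1 - rho * rho)
    x = y = 0.0
    samples = []
    for _ in range(n_samples):
        x = rng.gauss(rho * y, sd)   # x | y ~ N(rho * y, 1 - rho^2)
        y = rng.gauss(rho * x, sd)   # y | x ~ N(rho * x, 1 - rho^2)
        samples.append((x, y))
    return samples

samples = gibbs_bivariate_normal()
```

The empirical correlation of the chain should approach ρ as the sampler mixes.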

Stochastic Optimization

Stochastic optimization (SO) methods minimize or maximize an objective function when randomness is present, e.g. noisy function evaluations or random sampling of the training data

SO includes:

  • Stochastic Gradient Descent
  • Mini-Batch Stochastic Gradient Descent
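A minimal mini-batch SGD sketch (all names mine) fitting y ≈ w·x + b by least squares: each update uses the averaged gradient of the squared error over one small shuffled batch rather than the full dataset.

```python
import random

def minibatch_sgd(data, lr=0.05, batch_size=16, epochs=50, seed=0):
    """Fit y ~ w*x + b by mini-batch SGD on squared error."""
    rng = random.Random(seed)
    data = list(data)          # copy so the caller's list is not reordered
    w = b = 0.0
    for _ in range(epochs):
        rng.shuffle(data)      # reshuffle each epoch -> stochastic batches
        for i in range(0, len(data), batch_size):
            batch = data[i:i + batch_size]
            gw = gb = 0.0
            for x, y in batch:
                err = (w * x + b) - y
                gw += 2 * err * x
                gb += 2 * err
            w -= lr * gw / len(batch)   # step along the averaged batch gradient
            b -= lr * gb / len(batch)
    return w, b

# synthetic data: y = 3x + 1 plus small Gaussian noise
rng = random.Random(1)
xs = [rng.uniform(-1, 1) for _ in range(200)]
data = [(x, 3 * x + 1 + rng.gauss(0, 0.1)) for x in xs]
w, b = minibatch_sgd(data)
```

With `batch_size=1` this reduces to plain SGD; with `batch_size=len(data)` it becomes full-batch gradient descent.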

Energy-based Model

Energy-based Model includes:

An Energy-based Model (EBM) captures dependencies by associating a scalar energy (a measure of compatibility; lower energy means more compatible) with each configuration of the variables

A Restricted Boltzmann Machine (RBM) is a generative stochastic artificial neural network that can learn a probability distribution over its set of inputs

RBMs are a variant of Boltzmann machines, with the restriction that their neurons must form a bipartite graph: a pair of nodes from each of the two groups of units (commonly referred to as the “visible” and “hidden” units respectively) may have a symmetric connection between them; and there are no connections between nodes within a group

RBM related:

  • the gradient-based Contrastive Divergence algorithm
  • Deep Belief Networks (stacking RBMs)
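A toy sketch of CD-1 training for a small Bernoulli–Bernoulli RBM (pure Python, all names mine). Contrastive Divergence approximates the likelihood gradient by contrasting statistics of the data against statistics of a one-step Gibbs reconstruction:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class RBM:
    """Bernoulli-Bernoulli RBM trained with CD-1 (one Gibbs step)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.W = [[self.rng.gauss(0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.vb = [0.0] * n_visible   # visible biases
        self.hb = [0.0] * n_hidden    # hidden biases

    def hidden_probs(self, v):
        return [sigmoid(self.hb[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.hb))]

    def visible_probs(self, h):
        return [sigmoid(self.vb[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.vb))]

    def sample(self, probs):
        return [1.0 if self.rng.random() < p else 0.0 for p in probs]

    def cd1_update(self, v0, lr=0.1):
        h0 = self.hidden_probs(v0)          # positive phase: data statistics
        h_sample = self.sample(h0)          # one Gibbs step:
        v1 = self.visible_probs(h_sample)   #   reconstruct visibles,
        h1 = self.hidden_probs(v1)          #   recompute hiddens
        # CD gradient: <v h>_data - <v h>_reconstruction
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        for i in range(len(v0)):
            self.vb[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.hb[j] += lr * (h0[j] - h1[j])

    def reconstruction_error(self, v):
        v1 = self.visible_probs(self.sample(self.hidden_probs(v)))
        return sum((a - b) ** 2 for a, b in zip(v, v1))

# two repeated binary patterns the RBM should learn to reconstruct
data = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]] * 50
rbm = RBM(n_visible=6, n_hidden=2)
before = sum(rbm.reconstruction_error(v) for v in data[:2])
for epoch in range(200):
    for v in data:
        rbm.cd1_update(v)
after = sum(rbm.reconstruction_error(v) for v in data[:2])
```

The bipartite restriction is what makes this cheap: all hidden units are conditionally independent given the visibles (and vice versa), so each Gibbs half-step is a single vectorizable pass. Stacking trained RBMs layer by layer yields a Deep Belief Network.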

Unsupervised

  • RBMs
  • Autoencoder