Variational Inference
Preliminaries

We usually have a dataset $\{x_i\}_{i=1}^{N}$ and a parametrized family of distributions $p_\theta$. We would like to find the parameters $\theta$ that best describe the data. This is typically done using [[MLE and MAP|maximum likelihood estimation (MLE)]], in which the optimal parameters are those that maximize the average log-likelihood of the data. Mathematically speaking,

$$ \hat{\theta}_\mathrm{MLE} = \arg\max_\theta \frac{1}{N}\sum_{i=1}^{N}\log p_{\theta}(x_i)....
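
To make the maximum likelihood objective concrete, here is a minimal sketch under an assumption that is not in the note itself: the family $p_\theta$ is taken to be a one-dimensional Gaussian with $\theta = (\mu, \sigma)$, for which the maximizer is known in closed form. The function names `avg_log_likelihood` and `gaussian_mle` and the synthetic data are hypothetical and chosen only for illustration.

```python
import math
import random


def avg_log_likelihood(data, mu, sigma):
    """Return (1/N) * sum_i log p_theta(x_i) for a Gaussian with theta = (mu, sigma)."""
    const = -0.5 * math.log(2 * math.pi) - math.log(sigma)
    return sum(const - 0.5 * ((x - mu) / sigma) ** 2 for x in data) / len(data)


def gaussian_mle(data):
    """Closed-form maximizer of avg_log_likelihood over the Gaussian family."""
    n = len(data)
    mu = sum(data) / n
    # The MLE of the variance divides by N, not N - 1.
    sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / n)
    return mu, sigma


# Synthetic dataset (illustrative assumption): 1000 draws from N(2.0, 1.5^2).
random.seed(0)
data = [random.gauss(2.0, 1.5) for _ in range(1000)]

mu_hat, sigma_hat = gaussian_mle(data)
best = avg_log_likelihood(data, mu_hat, sigma_hat)

# Moving away from the estimate in any direction should not raise the objective.
for d_mu, d_sigma in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]:
    assert avg_log_likelihood(data, mu_hat + d_mu, sigma_hat + d_sigma) <= best

print(mu_hat, sigma_hat, best)
```

For families without a closed-form solution, the same `avg_log_likelihood` quantity would typically be maximized numerically, for example by gradient ascent on $\theta$.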