Semi-supervised Learning with Deep Generative Models


Idea

The authors aim to tackle the shortage of labeled data in many domains, which motivates the use of semi-supervised learning.

The authors present a stochastic variational inference algorithm that allows joint optimization of both the model and variational parameters, and that scales to large datasets.
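Concretely, both parameter sets are trained by ascending a single evidence lower bound (ELBO). Writing $\theta$ for the model parameters and $\phi$ for the variational parameters (both introduced below), the standard bound for a latent-variable model is:

$$
\log p_{\theta}(x) \;\geq\; \mathbb{E}_{q_{\phi}(z|x)}\left[\log p_{\theta}(x|z)\right] - \mathrm{KL}\left(q_{\phi}(z|x) \,\|\, p(z)\right)
$$

A single gradient step on this bound updates $\theta$ and $\phi$ jointly, and minibatch estimates of the bound are what make the procedure scalable.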

Background

The simplest algorithm for semi-supervised learning is a self-training scheme, in which the model is bootstrapped with the available labeled data and its high-confidence predictions on unlabeled data are fed back in as labeled examples in an iterative process. This method is heuristic and error-prone, because poor predictions may be reinforced.
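For reference, a minimal self-training loop might look like the sketch below (the use of a scikit-learn classifier, and names like `self_train` and `threshold`, are illustrative assumptions, not anything specified in the paper):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_labeled, y_labeled, X_unlabeled, threshold=0.95, max_rounds=10):
    """Iteratively promote high-confidence predictions to labeled examples."""
    X, y = X_labeled.copy(), y_labeled.copy()
    pool = X_unlabeled.copy()
    clf = LogisticRegression(max_iter=1000)
    for _ in range(max_rounds):
        clf.fit(X, y)
        if len(pool) == 0:
            break
        probs = clf.predict_proba(pool)   # class probabilities per unlabeled example
        keep = probs.max(axis=1) >= threshold
        if not keep.any():
            break  # nothing left that the model is confident about
        # Pseudo-label the confident examples and add them to the training set.
        X = np.vstack([X, pool[keep]])
        y = np.concatenate([y, clf.classes_[probs[keep].argmax(axis=1)]])
        pool = pool[~keep]
    return clf
```

Note that a confidently wrong pseudo-label is never revisited once added, which is exactly the error-reinforcement failure mode described above.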

Method

The authors use a variational autoencoder (VAE) to compress the input into a lower-dimensional latent representation and then classify that compressed representation.

The encoder parameters are $\phi$ and the decoder parameters are $\theta$.
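A minimal sketch of this split in PyTorch, assuming a diagonal-Gaussian encoder and a Bernoulli decoder (the layer sizes and activations here are illustrative choices, not the paper's exact architecture):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=50, h_dim=500):
        super().__init__()
        # Encoder (parameters phi): maps x to the mean and log-variance of q_phi(z|x).
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.Softplus())
        self.enc_mu = nn.Linear(h_dim, z_dim)
        self.enc_logvar = nn.Linear(h_dim, z_dim)
        # Decoder (parameters theta): maps z to Bernoulli logits over x.
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.Softplus(),
                                 nn.Linear(h_dim, x_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.enc_mu(h), self.enc_logvar(h)
        # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I),
        # keeps the sampling step differentiable with respect to phi.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

def negative_elbo(x, logits, mu, logvar):
    # Reconstruction term -E_q[log p_theta(x|z)] plus the analytic
    # KL(q_phi(z|x) || p(z)) for a standard-normal prior.
    recon = F.binary_cross_entropy_with_logits(logits, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```

Because the loss is a single scalar, one optimizer over `model.parameters()` updates $\phi$ and $\theta$ jointly, matching the joint optimization described above.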

There are two components in the overall model:

The inference network $q_{\phi}(z|x)$ is used on both the labeled and unlabeled data sets. The approximate posterior learned by the encoder is then used as a feature extractor to train the classifier.
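Under this reading, the feature-extraction step might look like the following sketch, reusing the hypothetical `VAE` class above; the choice of an SVM classifier and the variables `vae`, `X_labeled`, and `y_labeled` are assumptions for illustration:

```python
import torch
from sklearn.svm import SVC

@torch.no_grad()
def extract_features(vae, X):
    # Use the mean of q_phi(z|x) as a deterministic feature vector for each input.
    return vae.enc_mu(vae.enc(X))

# X_labeled, y_labeled: the small labeled subset (assumed given).
z_labeled = extract_features(vae, torch.as_tensor(X_labeled, dtype=torch.float32))
clf = SVC().fit(z_labeled.numpy(), y_labeled)
```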

$q_{\phi}(z,y|x)$ can be treated as a continuous-discrete mixture, since $z$ is a continuous variable with a Gaussian posterior and $y$ is a discrete variable with a multinomial (categorical) posterior.
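One concrete way to write this down is the factorized form commonly used for this model (I believe it matches the paper's, but treat the exact parameterization as an assumption):

$$
q_{\phi}(z,y|x) = q_{\phi}(z|x,y)\, q_{\phi}(y|x), \qquad q_{\phi}(y|x) = \mathrm{Cat}\left(y \,|\, \pi_{\phi}(x)\right), \qquad q_{\phi}(z|x,y) = \mathcal{N}\left(z \,|\, \mu_{\phi}(x,y),\, \mathrm{diag}(\sigma^{2}_{\phi}(x))\right)
$$

For unlabeled data the label is simply another latent variable, so the bound marginalizes over it: if $-\mathcal{L}(x,y)$ denotes the labeled-data lower bound on $\log p_{\theta}(x,y)$, then

$$
\log p_{\theta}(x) \;\geq\; \sum_{y} q_{\phi}(y|x)\left(-\mathcal{L}(x,y)\right) + \mathcal{H}\left(q_{\phi}(y|x)\right),
$$

so the classifier $q_{\phi}(y|x)$ is trained on unlabeled data as a by-product of maximizing the bound.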

Experiment Setup

Architecture

[Figure: model architecture]

Observations