Tale of Three Probabilistic Families: Discriminative, Descriptive and Generative (arxiv.org)
97 points by ArtWomb on Oct 11, 2018 | 6 comments



These guys have done what looks like an impressive amount of work... but I was disappointed to see there's no mention of generative language modeling at all and only a very brief mention of key advances in image generation over the past few years.

Examples of generative language modeling ignored by this paper (not even mentioned):

* Unsupervised Transformer - https://blog.openai.com/language-unsupervised/

* ELMo - https://arxiv.org/abs/1802.05365

* ULMFit - https://arxiv.org/abs/1801.06146

Examples of generative image modeling ignored by this paper (some are mentioned only in passing):

* Glow - https://blog.openai.com/glow/

* RealNVP - https://arxiv.org/abs/1605.08803

* NICE - https://arxiv.org/abs/1410.8516

* Pixel CNN - https://arxiv.org/abs/1606.05328 (and its cousin PixelRNN - https://arxiv.org/abs/1601.06759)


The text misappropriates (or, to use a more positive term, re-purposes) the notions of generative and discriminative models.

* Classically, a generative model is one that describes a statistical process that generates a whole observation (maybe plus some hidden part) and thus gives you scores that are consistent between different observations. An example would be (unidirectional) language models, where the total probabilities of different possible sentences can be compared.

* Classically, discriminative models are models where you also get a probability over outputs, but only conditional on some observed facts. As an example, a CNN transforms an image into logits that express the likelihood of different classes for that image. Hence, CNNs (used in this fashion) are a form of discriminative modeling. (A toy sketch contrasting the two follows below.)
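
To make the contrast concrete, here is a minimal numpy sketch (my own toy example, nothing from the paper): a bigram language model that assigns a joint probability to a whole sentence, so scores for different sentences are comparable, next to a softmax "classifier" that only gives probabilities over labels conditional on an observed input. The vocabulary, bigram table, and logits are all made up for illustration.

    import numpy as np

    # Toy generative (unidirectional) language model: a bigram model over a
    # tiny vocabulary. It assigns a joint probability p(sentence), so the
    # scores of *different* sentences are directly comparable.
    vocab = ["<s>", "the", "cat", "sat", "</s>"]
    idx = {w: i for i, w in enumerate(vocab)}
    # Made-up bigram probabilities p(next word | previous word); rows sum to 1.
    bigram = np.array([
        [0.0, 0.9, 0.1, 0.0, 0.0],   # after <s>
        [0.0, 0.0, 0.7, 0.2, 0.1],   # after "the"
        [0.0, 0.1, 0.0, 0.7, 0.2],   # after "cat"
        [0.0, 0.3, 0.1, 0.0, 0.6],   # after "sat"
        [0.2, 0.2, 0.2, 0.2, 0.2],   # after </s> (unused)
    ])

    def sentence_log_prob(words):
        """Joint log p(sentence) under the bigram model."""
        tokens = ["<s>"] + words + ["</s>"]
        return sum(np.log(bigram[idx[a], idx[b]])
                   for a, b in zip(tokens, tokens[1:]))

    # Comparable scores for whole observations:
    print(sentence_log_prob(["the", "cat", "sat"]))   # higher
    print(sentence_log_prob(["the", "sat", "cat"]))   # lower

    # Toy discriminative model: probabilities over labels *conditional* on an
    # observed input; it never assigns a probability to the input itself.
    def classify(logits):
        """Softmax over class logits, i.e. p(class | observed input)."""
        z = np.exp(logits - logits.max())
        return z / z.sum()

    print(classify(np.array([2.0, 0.5, -1.0])))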

If we map this to their "probabilistic families", Generative Models (in the old sense) would be Descriptive in their book, while Discriminative stays Discriminative.

The closest counterpart to their "Generative" models would be generative (in the old sense) models that assume a common distribution over hidden variables and a process that generates the actual output. However, this only loosely fits how GANs are used today, where you don't really assume a distribution over the hidden seed for a given output, and where you get a single output per seed (i.e. a deterministic function from seed to output instead of a distribution).
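
To illustrate the "deterministic function from seed to output" point, here is a minimal sketch of a GAN-style generator, with random made-up weights standing in for a trained network (the adversarial training itself is omitted):

    import numpy as np

    rng = np.random.default_rng(0)

    # Made-up weights for a tiny generator; in a real GAN these would be
    # learned adversarially against a discriminator.
    W1 = rng.normal(size=(16, 8))
    W2 = rng.normal(size=(2, 16))

    def generator(z):
        """Deterministic map from seed z to an output sample. All of the
        randomness lives in the prior over z; the mapping itself assigns
        no probability to its outputs."""
        h = np.tanh(W1 @ z)
        return W2 @ h

    z = rng.normal(size=8)   # seed drawn from a fixed prior, e.g. N(0, I)
    x = generator(z)         # one deterministic output for this seed
    print(x)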

Neural networks work with deterministic functions where MRFs, Gibbs sampling, and all the pre-2012 machine learning goodies work with probability distributions. It's not really helpful to use the term "probabilistic families" here when the most important bits of what they describe are non-probabilistic.
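
For contrast with that deterministic picture, here is a small Gibbs-sampling sketch for an Ising-style MRF (again my own toy example, not anything from the paper), where the model really is a probability distribution over the whole signal and each site is resampled from its conditional given its neighbours:

    import numpy as np

    rng = np.random.default_rng(0)

    # Ising-style MRF on a periodic grid: p(x) is proportional to
    # exp(beta * sum of agreements between neighbouring +/-1 spins).
    beta, size = 0.8, 16
    x = rng.choice([-1, 1], size=(size, size))

    def neighbour_sum(x, i, j):
        return (x[(i - 1) % size, j] + x[(i + 1) % size, j] +
                x[i, (j - 1) % size] + x[i, (j + 1) % size])

    for sweep in range(50):
        for i in range(size):
            for j in range(size):
                field = beta * neighbour_sum(x, i, j)
                p_up = 1.0 / (1.0 + np.exp(-2.0 * field))  # p(x_ij = +1 | rest)
                x[i, j] = 1 if rng.random() < p_up else -1

    print(x.mean())  # magnetisation of one sample drawn from the MRF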


This actually answers my question! I was getting mixed up with their re-purposing of generative and descriptive models.

Can you clear up what you mean by a hidden seed? My understanding of GANs was that the generator learns the distribution of the input data without ever directly observing it.


The paper mentions Markov random fields as a descriptive model, but I've read that MRFs are a type of generative classifier. Are generative classifiers a type of descriptive model?


Sibling comment explains the article's issue with nomenclature:

https://news.ycombinator.com/item?id=18200882


The descriptive model specifies the probability distribution of the signal, based on an energy function defined on the signal.
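
Spelled out, that definition is the usual energy-based (Gibbs) form p(x) = exp(-E(x)) / Z. A toy sketch over binary signals of length 4, with an energy function invented purely for illustration:

    import itertools
    import numpy as np

    # Descriptive / energy-based model: p(x) = exp(-E(x)) / Z, with the
    # energy E defined directly on the signal x. This energy just penalises
    # flips between neighbouring entries (made up for illustration).
    def energy(x):
        x = np.asarray(x, dtype=float)
        return float(np.sum((x[1:] - x[:-1]) ** 2))

    states = list(itertools.product([0, 1], repeat=4))
    unnorm = np.array([np.exp(-energy(s)) for s in states])
    Z = unnorm.sum()          # partition function, tractable for this tiny space
    probs = unnorm / Z

    # Smooth signals (0,0,0,0) and (1,1,1,1) get the highest probability:
    for s, p in sorted(zip(states, probs), key=lambda t: -t[1])[:3]:
        print(s, round(float(p), 3))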

I understand Discriminative and Generative models and their use in inferring a quantity from a given signal.

However, I don't quite get what a Descriptive model does. What is it used for? It doesn't sound like inference is the goal.



