Difference between Generative, Discriminative and Parametric, Nonparametric Algorithms/Models


Here on SO I found the following explanation of generative and discriminative algorithms:

"A generative algorithm models how the data was generated in order to categorize a signal. It asks the question: based on my generation assumptions, which category is most likely to generate this signal?

A discriminative algorithm does not care about how the data was generated, it simply categorizes a given signal."

And here is the definition of parametric and nonparametric algorithms:

"Parametric: data are drawn from a probability distribution of specific form up to unknown parameters. Nonparametric: data are drawn from a certain unspecified probability distribution."

So, essentially, can we say that generative and parametric algorithms assume an underlying model, whereas discriminative and nonparametric algorithms don't assume any model?

thanks.


There are 3 answers

Ben Allison (best answer)

Say you have inputs X (probably a vector) and output Y (probably univariate). Your goal is to predict Y given X.

A generative method uses a model of the joint probability p(X,Y) to determine p(Y|X), via p(Y|X) = p(X,Y) / p(X). Given a generative model with known parameters, it is thus possible to sample jointly from p(X,Y) to produce new samples of both input X and output Y (note that such samples are distributed according to the assumed distribution, not the true one). Contrast this with discriminative approaches, which only have a model of the form p(Y|X). Provided with an input X, they can sample Y; however, they cannot sample new X.
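To make the contrast concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy model (a binary Y with a Gaussian X given Y), the parameter values, and the function names are all hypothetical. The generative model can both sample (x, y) pairs and produce p(Y|X) through Bayes' rule, while the discriminative function only maps x to a probability for Y.

```python
import math
import random

# Hypothetical, hand-picked parameters of a toy generative model.
PRIOR_Y1 = 0.3          # p(Y = 1)
MU = {0: 0.0, 1: 2.0}   # mean of X given Y = 0 and Y = 1
SIGMA = 1.0             # shared standard deviation of X given Y


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def sample_joint(rng):
    """Draw (x, y) from the assumed joint p(X, Y) = p(Y) p(X | Y)."""
    y = 1 if rng.random() < PRIOR_Y1 else 0
    x = rng.gauss(MU[y], SIGMA)
    return x, y


def generative_posterior(x):
    """p(Y = 1 | X = x), obtained from the joint via Bayes' rule."""
    joint_1 = PRIOR_Y1 * normal_pdf(x, MU[1], SIGMA)
    joint_0 = (1.0 - PRIOR_Y1) * normal_pdf(x, MU[0], SIGMA)
    return joint_1 / (joint_0 + joint_1)


def discriminative_posterior(x, w, b):
    """A model of p(Y = 1 | X = x) only; it says nothing about p(X)."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


rng = random.Random(0)
samples = [sample_joint(rng) for _ in range(5)]  # new (x, y) pairs

# For this particular toy model the Bayes posterior is exactly logistic,
# so these weights make the two functions agree.
w = (MU[1] - MU[0]) / SIGMA**2
b = (MU[0] ** 2 - MU[1] ** 2) / (2 * SIGMA**2) + math.log(PRIOR_Y1 / (1 - PRIOR_Y1))
```

In this sketch the two posteriors coincide only because the weights were derived from the generative parameters. A discriminative method would normally fit w and b directly from labelled data and would have no way to produce new x values.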

Both assume a model. However, discriminative approaches assume only a model of how Y depends on X, not of X itself. Generative approaches model both. Thus, given a fixed number of parameters, you might argue (and many have) that it's easier to use them to model the thing you care about, p(Y|X), than the distribution of X, since you'll always be provided with the X for which you wish to know Y.

Useful references: this (very short) paper by Tom Minka. This seminal paper by Andrew Ng and Michael Jordan.

The distinction between parametric and non-parametric models is probably going to be harder to grasp until you have more stats experience. A parametric model has a fixed and finite number of parameters regardless of how many data points are observed. Most probability distributions are parametric: consider a variable z which is the height of people, assumed to be normally distributed. As you observe more people, your estimates of the parameters \mu and \sigma, the mean and standard deviation of z, become more accurate, but you still only have two parameters.
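As a rough illustration of the height example, the sketch below fits a normal model to a small and a large simulated sample. The "true" values of 170 cm and 10 cm, the sample sizes, and the helper name fit_normal are made up for this example. Whatever the sample size, the fitted model is described by the same two numbers.

```python
import random
import statistics


def fit_normal(heights):
    """Return the two parameters (mu, sigma) of a normal model."""
    return statistics.fmean(heights), statistics.stdev(heights)


rng = random.Random(0)
# Simulated heights in cm; the generating values 170 and 10 are made up.
small = [rng.gauss(170.0, 10.0) for _ in range(20)]
large = [rng.gauss(170.0, 10.0) for _ in range(20000)]

params_small = fit_normal(small)   # two numbers
params_large = fit_normal(large)   # still two numbers, typically closer to (170, 10)
```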

In contrast, the number of parameters in a non-parametric model can grow with the amount of data. Consider an induced distribution over people's heights which places a normal distribution over each observed sample, with mean given by the measurement and a fixed standard deviation. The marginal distribution over new heights is then a mixture of normal distributions, and the number of mixture components increases with each new data point. This is a non-parametric model of people's height. This specific example is called a kernel density estimator. Popular (but more complicated) non-parametric models include Gaussian processes for regression and Dirichlet processes.
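The kernel density estimator described above can be sketched in a few lines of plain Python. The height values, the bandwidth of 5 cm, and the function names are assumptions chosen for illustration only. Each observation contributes one normal component, so the model's description grows with the data.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def kde_density(x, observations, bandwidth):
    """Equal-weight mixture of normals, one component per observation."""
    total = sum(normal_pdf(x, obs, bandwidth) for obs in observations)
    return total / len(observations)


heights = [158.0, 163.5, 171.0, 171.5, 180.0]  # made-up measurements in cm
BANDWIDTH = 5.0                                 # fixed std of every component

density_at_170 = kde_density(170.0, heights, BANDWIDTH)

# Observing one more person adds one more mixture component.
more_heights = heights + [166.0]
n_components_before = len(heights)
n_components_after = len(more_heights)
```

Note that the bandwidth is still a choice you have to make; the estimator is non-parametric in the sense that the number of components is not fixed in advance.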

A pretty good tutorial on non-parametrics can be found here, which constructs the Chinese Restaurant Process as the limit of a finite mixture model.

DatamineR

I don't think you can say that. For example, linear regression is a discriminative algorithm: you make an assumption about P(Y|X) and then estimate the parameters directly from the data, without making any assumption about P(X) or P(X|Y), as you would in the case of generative models. At the same time, any inference based on linear regression, including the properties of the parameters, is parametric estimation, since there is an assumption about the behaviour of the unobserved errors.
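A small sketch of this point, using the closed-form least-squares fit for a single input. The data and the name fit_line are made up for the example. The fit uses the X values only as given inputs and never models how they are distributed, yet the result is a fixed pair of parameters, which is what makes it both discriminative and parametric.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up, noise-free data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
```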

aerin

Here I'm only talking about parametric/non-parametric. Generative/discriminative is a separate concept.

A non-parametric model means you don't assume that your data follow a particular distributional form. For example, real-world data will never exactly follow theoretical distributions like the Gaussian, beta, Poisson, Weibull, etc. Those distributions were developed to meet our need to model the data.

On the other hand, parametric models try to explain our data completely using parameters. In practice, this approach is often preferred because it makes it easier to define how the model should behave in different circumstances (for example, we already know the derivatives/gradients of the model, what happens when we set the rate too high or too low in a Poisson model, etc.).