Generative Engine – #2

This is a non-parametric probabilistic engine for classification.

Supported feature types for the data set

1. Quantitative variable, e.g. a price

2. Categorical variable, e.g. a label (red, white, blue, etc.)

Inference Steps

Inferred label = argmax over target labels of P( target label | features )
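The argmax step can be sketched as below. This is a minimal illustration, not the engine's actual code: the function name infer_label, the score callable, and the example scores are all hypothetical. The score only has to be proportional to P( target label | features ), since a factor shared by every label does not change which label wins.

```python
def infer_label(candidate_labels, score):
    """Return the candidate label with the highest score.

    `score(label)` stands in for P(label | features), up to a constant
    factor shared by all labels (hypothetical interface).
    """
    return max(candidate_labels, key=score)


# Hypothetical scores for three labels, for illustration only.
example_scores = {"red": 0.2, "white": 0.5, "blue": 0.3}
best = infer_label(example_scores, example_scores.get)
```

Here best is the key of example_scores with the largest value.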

Depending on the graph structure, we expand the formula as follows:

P(c | features) = P(c) P( q1 | q2, c) ... = P(c) P(q1,q2) ...

P(q1 | q2, c) = P(q1, q2)

Here, a probabilistic term that depends on a categorical variable drops that dependency, which leaves a term containing only quantitative variables.

After expanding the formula, you get the following two kinds of terms:

P(qi, qj, …) : a term depending only on quantitative variables

P(ci, cj, …) : a term depending only on categorical variables

For P(qi, qj, …) : we simply use Multiplicative Kernel Density Estimation, i.e. a kernel built as the product of one-dimensional kernels, one per quantitative variable.
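A multiplicative kernel density estimate can be sketched as follows. This is only an illustration under stated assumptions: it assumes a Gaussian kernel and bandwidths supplied by the caller, neither of which the text above specifies, and the names gaussian_kernel and product_kde are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard normal density evaluated at u (assumed kernel choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def product_kde(point, samples, bandwidths):
    """Estimate the joint density at `point` with a product kernel.

    point      : sequence of d quantitative values (q1, ..., qd)
    samples    : list of training rows, each a sequence of d values
    bandwidths : one positive bandwidth per dimension (assumed given)
    """
    if not samples:
        raise ValueError("at least one sample is required")
    total = 0.0
    for row in samples:
        # Multiply the one-dimensional kernels across all dimensions.
        contribution = 1.0
        for x, xi, h in zip(point, row, bandwidths):
            contribution *= gaussian_kernel((x - xi) / h) / h
        total += contribution
    # Average the per-sample contributions.
    return total / len(samples)
```

For example, product_kde([0.0, 0.0], [[0.0, 0.0], [1.0, 1.0]], [1.0, 1.0]) evaluates the two-dimensional estimate at the origin; points close to the training rows receive a higher density than points far from them.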

For P(ci, cj, …) : we simply return 1, which for this classification engine behaves the same as a uniform distribution, because a constant factor does not change the argmax.
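The effect of returning 1 for a categorical-only term can be sketched as below. The helper names categorical_term and combine_terms and the density value 0.25 are hypothetical and serve only to show that such a term leaves the product of the expanded formula unchanged.

```python
def categorical_term(*categorical_values):
    """Term depending only on categorical variables: always 1."""
    return 1.0


def combine_terms(term_values):
    """Multiply the values of all terms of the expanded formula."""
    product = 1.0
    for value in term_values:
        product *= value
    return product


# Hypothetical density value for a quantitative term, for illustration.
quantitative_value = 0.25
score = combine_terms([quantitative_value, categorical_term("red", "large")])
```

The resulting score equals the quantitative term alone, so only the quantitative terms influence which label is selected.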

Because this is a non-parametric model, there is no target function to optimize in order to find hyperparameters.