Our task is the same as in the preceding class on (generative) classification. But this time, the class-conditional data distributions look very non-Gaussian, yet a linear discrimination boundary looks easy enough:

[A.] Take inspiration from the generative approach: choose the familiar softmax structure with linear discrimination boundaries for the posterior class probability
$$
p(\mathcal{C}_k|x_n,\theta) = \frac{e^{\theta_k^T x_n}}{\sum_j e^{\theta_j^T x_n}}
$$
but do not impose a Gaussian structure on the class features.
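As a quick illustration, here is a minimal sketch of this softmax posterior in Julia. The function name and the matrix layout of $\theta$ (one parameter column $\theta_k$ per class) are our assumptions, not the lesson's code:

```julia
# Minimal sketch of the softmax class posterior (illustrative names; assumes
# θ is a matrix holding one parameter column θ_k per class, x a feature vector).
function class_posterior(θ::AbstractMatrix, x::AbstractVector)
    a = θ' * x                                           # linear discriminants θ_kᵀ x
    exp.(a .- maximum(a)) ./ sum(exp.(a .- maximum(a)))  # numerically stable softmax
end
```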

$\Rightarrow$ There are two key differences between the discriminative and generative approach:

In the discriminative approach, the parameters $\theta_k$ are not structured into $\{\mu_k,\Sigma,\pi_k \}$. This gives the discriminative approach more flexibility.

In the discriminative approach, ML learning proceeds by optimizing the conditional likelihood $\prod_n p(y_n|x_n,\theta)$, rather than the joint likelihood $\prod_n p(y_n,x_n|\theta)$.
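For reference (this step is not spelled out above, but it is the standard softmax-regression result), with one-hot coded labels $y_{nk}$ the conditional log-likelihood and its gradient are
$$
\ell(\theta) = \sum_n \sum_k y_{nk}\,\theta_k^T x_n - \sum_n \log \sum_j e^{\theta_j^T x_n}, \qquad \nabla_{\theta_k}\,\ell(\theta) = \sum_n \left( y_{nk} - p(\mathcal{C}_k|x_n,\theta) \right) x_n .
$$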

Let us perform ML estimation of $\theta$ on the data set from the introduction. To allow an offset in the discrimination boundary, we append a constant 1 to the feature vector $x$. We only have to specify the negative log-likelihood and its gradient w.r.t. $\theta$. Then, we use an off-the-shelf optimization library to minimize the negative log-likelihood.
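A minimal sketch of this procedure, reusing the `class_posterior` sketch above. It assumes a $(D{+}1)\times N$ feature matrix `X` (with the constant 1 already appended) and a $K\times N$ one-hot label matrix `Y`; these names and the use of Optim.jl are our assumptions, not necessarily the lesson's code:

```julia
using Optim

# Sketch: X is (D+1)×N (constant 1 appended to each feature vector),
# Y is K×N one-hot; names and Optim.jl usage are assumptions, not the lesson's code.
function negloglik(θvec, X, Y)
    D, K = size(X, 1), size(Y, 1)
    θ = reshape(θvec, D, K)
    -sum(log(class_posterior(θ, X[:, n])' * Y[:, n]) for n in 1:size(X, 2))
end

function negloglik_grad!(G, θvec, X, Y)
    D, K = size(X, 1), size(Y, 1)
    θ = reshape(θvec, D, K)
    G .= 0.0
    for n in 1:size(X, 2)
        p = class_posterior(θ, X[:, n])
        G .+= vec(X[:, n] * (p - Y[:, n])')   # per-sample NLL gradient: xₙ (p - y)ᵀ
    end
end

# Minimize the negative log-likelihood with an off-the-shelf optimizer (L-BFGS).
result = optimize(θ -> negloglik(θ, X, Y),
                  (G, θ) -> negloglik_grad!(G, θ, X, Y),
                  zeros(size(X, 1) * size(Y, 1)), LBFGS())
θ̂ = reshape(Optim.minimizer(result), size(X, 1), size(Y, 1))
```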

We plot the resulting maximum likelihood discrimination boundary. For comparison, we also plot the ML discrimination boundary obtained from the generative Gaussian classifier from lesson 7.

Given $\hat{\theta}$, we can classify a new input $x_\bullet = [3.75, 1.0]^T$:

In [9]:

x_test = [3.75; 1.0]
println("P(C1|x•,θ) = $(p_1(x_test))")

P(C1|x•,θ) = 0.6476513551215346
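For context, `p_1` is not shown in this excerpt; it could be defined along these lines (an assumption, reusing the sketches above):

```julia
# Hypothetical definition of p_1 (not shown in this excerpt): the posterior
# probability of class C1, with the constant 1 appended to the input as above.
p_1(x) = class_posterior(θ̂, [x; 1.0])[1]
```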

The generative model gives a poor result because the feature distribution of one of the classes is clearly non-Gaussian, so the Gaussian model does not fit the data well.

The discriminative approach does not suffer from this problem because it makes no assumptions about the feature distribution $p(x|y)$; it estimates the conditional class distribution $p(y|x)$ directly.