Kernel conditional random fields are introduced as a framework for
discriminative modeling of graph-structured data. A representer
theorem for conditional graphical models is given, showing how
kernel conditional random fields arise from risk minimization
procedures defined using Mercer kernels on labeled graphs. A
procedure for greedily selecting cliques in the dual representation is
then proposed, which allows sparse representations. By incorporating
kernels and implicit feature spaces into conditional graphical models,
the framework enables semi-supervised learning algorithms for
structured data through the use of graph kernels. The clique
selection and semi-supervised methods are demonstrated in experiments
on synthetic data and are also applied to protein secondary structure
prediction.