Abstract

We develop and present the deep-structured conditional random field (CRF), a multi-layer CRF model in which each higher layer’s input observation sequence consists of the previous layer’s observation sequence and the resulting frame-level marginal probabilities. Such a structure can closely approximate long-range state dependencies using only linear-chain or zeroth-order CRFs by constructing features on the previous layer’s output (belief). Although the final layer is trained to maximize the log-likelihood of the state (label) sequence, each lower layer is optimized by maximizing the frame-level marginal probabilities. In this deep-structured CRF, both parameter estimation and state sequence inference are carried out efficiently layer by layer from bottom to top. We evaluate the deep-structured CRF on two natural language processing tasks: search query tagging and advertisement field segmentation. The experimental results demonstrate that the deep-structured CRF achieves word labeling accuracies that are significantly higher than the best results reported on these tasks using the same labeled training set.
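The layer-wise construction described above can be illustrated with a minimal sketch. It assumes zeroth-order layers only, so each frame's marginal is a softmax over per-label linear scores; a linear-chain layer would obtain its marginals from forward-backward instead. Training is omitted, and the function names and weight matrices are hypothetical, chosen only to show how each layer's input is formed from the previous layer's observations and marginals.

```python
import math

def softmax(scores):
    """Normalize a list of scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def zeroth_order_marginals(frames, weights):
    """Frame-level label marginals of a zeroth-order CRF (no transition features).

    frames:  list of T feature vectors
    weights: one weight vector per label (hypothetical, assumed already trained)
    """
    return [
        softmax([sum(w * x for w, x in zip(row, frame)) for row in weights])
        for frame in frames
    ]

def next_layer_input(frames, marginals):
    """Append each frame's marginals to the observation that produced them."""
    return [list(f) + list(p) for f, p in zip(frames, marginals)]

def forward(frames, layer_weights):
    """Run the stack bottom to top and return the top layer's marginals."""
    x = frames
    for weights in layer_weights[:-1]:
        x = next_layer_input(x, zeroth_order_marginals(x, weights))
    return zeroth_order_marginals(x, layer_weights[-1])

# Illustrative only: 3 frames, 2 features, 2 labels, 2 layers.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w1 = [[1.0, -1.0], [-1.0, 1.0]]                        # layer 1 sees 2 features
w2 = [[0.5, 0.0, 1.0, -1.0], [0.0, 0.5, -1.0, 1.0]]    # layer 2 sees 2 + 2
top_marginals = forward(frames, [w1, w2])
```

Because every layer only needs the output of the layer below it, inference in this sketch is a single bottom-to-top pass, and the input dimension grows by the number of labels at each layer.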