For this experiment, all models were estimated from the training set and evaluated on the development set.

Experiments

For test set evaluations, we trained on the combination of the training and development sets (§2), to maximize the amount of training data for the final experiments.

Experiments

We selected a threshold for binarization from a grid of 1001 points from 1 to 4, choosing the value that maximized the accuracy of binarized predictions from a model trained on the training set and evaluated on the binarized development set.
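This selection step is a one-dimensional grid search. Below is a minimal sketch, assuming the model emits real-valued scores and that accuracy against binarized gold labels is the selection criterion; the function and variable names are illustrative, not from the paper:

```python
import numpy as np

def select_binarization_threshold(dev_scores, dev_gold):
    """Grid-search a binarization threshold on the development set.

    dev_scores: real-valued model predictions on the dev set
    dev_gold:   binarized gold labels (0/1)
    """
    scores = np.asarray(dev_scores)
    gold = np.asarray(dev_gold)
    best_t, best_acc = None, -1.0
    for t in np.linspace(1.0, 4.0, 1001):  # 1001 points from 1 to 4
        acc = float(np.mean((scores >= t).astype(int) == gold))
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc
```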

System Description

On 10 preliminary runs with the development set, this variance

System Description

Table 1: Pearson's r on the development set, for our full system and for variations excluding each feature type.

We conducted experiments on the Penn Chinese Treebank (CTB) version 5.1 (Xue et al., 2005): Articles 001-270 and 400-1151 were used as the training set, Articles 301-325 were used as the development set, and Articles 271-300 were used as the test set.
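For concreteness, the split can be written down as article-ID ranges; the helper below is an illustrative sketch of this standard CTB split, not code from the paper:

```python
# CTB 5.1 split by article ID (ranges inclusive); illustrative only.
CTB_SPLIT = {
    "train": list(range(1, 271)) + list(range(400, 1152)),
    "dev":   list(range(301, 326)),
    "test":  list(range(271, 301)),
}

def fold_of(article_id):
    """Return the fold ("train"/"dev"/"test") containing an article."""
    for fold, ids in CTB_SPLIT.items():
        if article_id in ids:
            return fold
    return None
```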

Experiment

We tuned the number of iterations of the perceptron training algorithm on the development set.
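In effect this is early stopping on the development set. A rough sketch using scikit-learn's Perceptron as a stand-in for the paper's (likely structured) perceptron; each partial_fit call runs one additional pass over the training data:

```python
from sklearn.linear_model import Perceptron

def tune_perceptron_epochs(X_train, y_train, X_dev, y_dev, max_epochs=50):
    """Pick the number of training passes that maximizes dev accuracy."""
    clf = Perceptron()
    classes = sorted(set(y_train))
    best_epoch, best_acc = 0, -1.0
    for epoch in range(1, max_epochs + 1):
        clf.partial_fit(X_train, y_train, classes=classes)  # one more pass
        acc = clf.score(X_dev, y_dev)                       # dev accuracy
        if acc > best_acc:
            best_epoch, best_acc = epoch, acc
    return best_epoch
```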

Experiment

We trained these three systems on the training set and evaluated them on the development set .

One example of this distinction that appeared in the development set is the pair of forms glossed "my topic".

Experiments

F1 scores provide a more informative assessment of performance than word-level or character-level accuracy scores, as over 80% of tokens in the development sets consist of only one segment, with an average of one segmentation every 4.7 tokens (or one every 20.4 characters).
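A generic way to compute segment-level F1, with segments represented as character spans; this illustrates the metric itself and is not the paper's exact scorer:

```python
def segmentation_f1(gold_segments, pred_segments):
    """F1 over segments given as (start, end) character spans."""
    gold, pred = set(gold_segments), set(pred_segments)
    tp = len(gold & pred)                        # exactly matching spans
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# e.g. segmentation_f1({(0, 3), (3, 5)}, {(0, 3), (3, 4)}) == 0.5
```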

Experiments

Table 1 contains results on the development set for the model of Green and DeNero and our improvements.

We split each dataset into a training fold (70%), a development fold (15%), and a test fold (15%): the training fold is used to fit models; the development fold is used to select parameters (anchor threshold M, document prior α, regularization weight λ); and final results are reported on the test fold.
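A minimal sketch of such a split, assuming documents can simply be shuffled (whether the paper's split is random or stratified is not stated here):

```python
import random

def three_way_split(docs, seed=0):
    """Shuffle and split into 70% train / 15% dev / 15% test folds."""
    docs = list(docs)
    random.Random(seed).shuffle(docs)
    n = len(docs)
    n_train = int(0.70 * n)
    n_dev = int(0.15 * n)
    return (docs[:n_train],                   # fit models here
            docs[n_train:n_train + n_dev],    # tune M, α, λ here
            docs[n_train + n_dev:])           # report final results here
```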

Table 2: Results for the Penn Treebank development set, sentences of length ≤ 40, for different annotation schemes implemented on top of the X-bar grammar.

Features

Table 1 shows the results of incrementally building up our feature set on the Penn Treebank development set .

Other Languages

(2013) only report results on the development set for the Berkeley-Rep model; however, the task organizers also use a version of the Berkeley parser provided with parts of speech from high-quality POS taggers for each language (Berkeley-Tags).

Other Languages

On the development set , we outperform the Berkeley parser and match the performance of the Berkeley-Rep parser.

There are 660 citations in the development set and 367 citations in the test set.

Citation Extraction Data

We then use the development set to learn the penalties for the soft constraints, using the perceptron algorithm described in section 3.1.
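A hedged sketch of that idea: a perceptron-style update that raises a constraint's penalty when the model's prediction violates it but the gold labeling does not, and lowers it in the opposite case. The `predict` and `violated` hooks stand in for the paper's inference procedure and constraint checks and are assumptions here:

```python
def learn_penalties(penalties, dev_data, predict, violated, lr=1.0, epochs=5):
    """Perceptron-style penalty learning on the development set.

    penalties: dict mapping constraint -> nonnegative penalty
    predict:   callable (x, penalties) -> predicted labeling
    violated:  callable (constraint, labeling) -> bool
    """
    for _ in range(epochs):
        for x, y_gold in dev_data:
            y_hat = predict(x, penalties)
            for c in penalties:
                # update on the difference of violation indicators
                penalties[c] += lr * (violated(c, y_hat) - violated(c, y_gold))
                penalties[c] = max(penalties[c], 0.0)  # keep penalties >= 0
    return penalties
```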

Citation Extraction Data

We instantiate constraints from each template in section 5.1, iterating over all possible labels that contain a B prefix at any level in the hierarchy and pruning all constraints with imp(c) < 2.75 calculated on the development set.
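Schematically, the instantiation-and-pruning step looks like the following sketch, where `importance` computes imp(c) on the development set; all names here are illustrative:

```python
def instantiate_and_prune(templates, labels, importance, threshold=2.75):
    """Generate candidate constraints and prune low-importance ones."""
    # keep labels with a B prefix at some level of the hierarchy
    b_labels = [l for l in labels
                if any(part.startswith("B") for part in l.split("/"))]
    candidates = [(t, l) for t in templates for l in b_labels]
    # discard constraints with imp(c) < threshold on the dev set
    return [c for c in candidates if importance(c) >= threshold]
```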

Soft Constraints in Dual Decomposition

We found it beneficial, though not theoretically necessary, to learn the constraint penalties on a held-out development set, separately from the other model parameters: during training, most constraints are satisfied because the model overfits the training data, which leads to an underestimate of the relevant penalties.

We tuned λ_{M+1} on the development set but found that λ_{M+1} = 1 resulted in faster training and equal accuracy.

Expected BLEU Training

We fix θ and re-optimize λ in the presence of the recurrent neural network model using Minimum Error Rate Training (Och, 2003) on the development set (§5).

Experiments

We rescore either lattices or the unique 100-best output of the phrase-based decoder and re-estimate the log-linear weights by running a further iteration of MERT on the n-best list of the development set, augmented by scores corresponding to the neural network models.
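The rescoring step amounts to appending one feature per neural model to each n-best entry before re-running MERT. A minimal sketch, with `nn_models` as callables scoring a (source, hypothesis) pair; the names are assumptions, not the authors' code:

```python
def augment_nbest(nbest, nn_models):
    """Append neural-model scores to each n-best entry's feature vector.

    nbest:     list of (source, hypothesis, feature_list) triples
    nn_models: callables mapping (source, hypothesis) -> float score
    """
    augmented = []
    for src, hyp, feats in nbest:
        extra = [model(src, hyp) for model in nn_models]
        augmented.append((src, hyp, list(feats) + extra))
    return augmented  # feed this to a further MERT iteration to retune weights
```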