PYTHY is a trainable extractive summarization engine that learns a log-linear
sentence ranking model by maximizing three metrics of sentence goodness. Two of
the metrics are based on ROUGE scores against model summaries; the third is based
on the Semantic Content Unit (SCU) weights, obtained during the Pyramid
evaluations, of sentences selected by past peers. In addition to sentences
from the document set, our system considers simplified sentences for inclusion in
the generated summaries. The feature weights of the model were optimized on the
DUC 2005 data, and the final feature set for the submitted system was chosen by
ROUGE-2 scores against the DUC 2006 model summaries. For the DUC update task,
the model was augmented with a novelty detection classifier.
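
To make the idea of a log-linear sentence ranker concrete, the following is a minimal sketch, not PYTHY's actual implementation. It assumes each sentence already has a numeric feature vector and that the weights have been learned elsewhere. The function names, the toy features, the softmax normalization, and the greedy word-budget selection are all illustrative assumptions; the training procedure, sentence simplification, and novelty detection are not shown.

```python
import math


def loglinear_scores(features, weights):
    """Turn linear scores w . f(s) into a normalized distribution over sentences."""
    raw = [sum(w * x for w, x in zip(weights, f)) for f in features]
    m = max(raw)  # subtract the max before exponentiating, for numerical stability
    exps = [math.exp(r - m) for r in raw]
    z = sum(exps)
    return [e / z for e in exps]


def select_sentences(sentences, probs, max_words):
    """Greedily keep the highest-scoring sentences that fit the word budget."""
    order = sorted(range(len(sentences)), key=lambda i: probs[i], reverse=True)
    chosen, used = [], 0
    for i in order:
        n_words = len(sentences[i].split())
        if used + n_words <= max_words:
            chosen.append(i)
            used += n_words
    return sorted(chosen)  # return indices in original document order


# Toy data (hypothetical): three sentences with two made-up features each.
sentences = ["a b c", "d e", "f g h i"]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [1.0, 2.0]  # assumed to come from a separate training step

probs = loglinear_scores(features, weights)
summary_ids = select_sentences(sentences, probs, max_words=6)
```

Here `summary_ids` holds the indices of the selected sentences. A real system would replace the toy features with learned ones and would likely add a redundancy check during selection.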