Modelling Compression with Discourse Constraints

James Clarke and Mirella Lapata. 2007. Modelling Compression with Discourse Constraints. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and on Computational Natural Language Learning, pages 1–11. Prague, Czech Republic.

Received the Best Paper Award at EMNLP-CoNLL 2007.

Abstract

Sentence compression holds promise for many applications ranging from summarisation to subtitle generation. The task is typically performed on isolated sentences without taking the surrounding context into account, even though most applications would operate over entire documents. In this paper we present a discourse-informed model which is capable of producing document compressions that are coherent and informative. Our model is inspired by theories of local coherence and formulated within the framework of Integer Linear Programming. Experimental results show significant improvements over a state-of-the-art discourse-agnostic approach.

BibTeX

@inproceedings{Clarke:Lapata:07,
  author =       {James Clarke and Mirella Lapata},
  title =        {Modelling Compression with Discourse Constraints},
  booktitle =    {Proceedings of the Conference on Empirical Methods
                  in Natural Language Processing and on Computational
                  Natural Language Learning (EMNLP-CoNLL-2007)},
  pages =        {1--11},
  year =         {2007},
  address =      {Prague, Czech Republic},
  URL =          {http://jamesclarke.net/media/papers/clarke-lapata-emnlp07.pdf},
}