James Clarke & Research

I am a post-doctoral researcher in the Cognitive Computation Group at the University of Illinois at Urbana-Champaign. You can find my contact details on the front page.

My research interests lie in the field of natural language processing. I am currently interested in leveraging context expressed through a world model to help natural language interpretation and understanding. I am also interested in using integer linear programming and other methods to create more global models for natural language processing problems. This is closely related to my PhD research, in which I focused on developing methods to process, extract and summarise information from large natural text collections. In particular, I formalised the compression task within an integer linear programming framework, which allowed new and existing models to be supplemented with linguistic constraints.

Interact

We are currently creating a resource for people interested in Integer Linear Programming for Natural Language Processing.

I co-organised the Workshop on Integer Linear Programming for Natural Language Processing hosted at NAACL HLT 2009. More information can be found on the ILP for NLP wiki.

You can help our community by participating in our Language Experiments.

Publications

PhD Thesis

James Clarke. 2008. Global Inference for Sentence Compression: An Integer Linear Programming Approach. PhD Thesis, University of Edinburgh.
Full Details  ∞  pdf

Journal Papers

James Clarke and Mirella Lapata. 2008. Global Inference for Sentence Compression: An Integer Linear Programming Approach. In Journal of Artificial Intelligence Research, vol. 31, pages 399–429.
Full Details  ∞  pdf

Conference Papers

Jacob Eisenstein, James Clarke, Dan Goldwasser and Dan Roth. 2009. Reading to Learn: Constructing Features from Semantic Abstracts. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 958–967. Singapore.
Full Details  ∞  pdf

Sebastian Riedel and James Clarke. 2009. Revisiting Optimal Decoding for Machine Translation IBM Model 4. In Proceedings of the NAACL HLT 2009 Short Papers, pages 5–8. Boulder, Colorado.
Full Details  ∞  pdf

James Clarke and Mirella Lapata. 2007. Modelling Compression with Discourse Constraints. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and on Computational Natural Language Learning, pages 1–11. Prague, Czech Republic.
Full Details  ∞  pdf  ∞  Received the Best Paper Award EMNLP-2007

Sebastian Riedel and James Clarke. 2006. Incremental Integer Linear Programming for Non-projective Dependency Parsing. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, pages 129–137. Sydney, Australia.
Full Details  ∞  pdf

James Clarke and Mirella Lapata. 2006. Constraint-Based Sentence Compression: An Integer Programming Approach. In Proceedings of the COLING/ACL 2006 Main Conference Poster Session, pages 144–151. Sydney, Australia.
Full Details  ∞  pdf

James Clarke and Mirella Lapata. 2006. Models for Sentence Compression: A Comparison across Domains, Training Requirements and Evaluation Measures. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, pages 377–384. Sydney, Australia.
Full Details  ∞  pdf

The resources page contains the compression corpora I used in my compression experiments.

People

Other people you should really be paying attention to: