Prof. dr. C.J. (Kees) van Deemter

Natural Language Processing

I use computer algorithms in combination with controlled experiments to model how people speak and write. An example of my work is K. van Deemter (2016), "Computational Models of Referring: A Study in Cognitive Science", MIT Press (now freely downloadable). I outlined some of the challenges in my research area in my inaugural address at Utrecht University in September 2020. 


Here are some of the topics I'm particularly interested in.

1. Ambiguity and vagueness. I’m interested not only in the strengths of natural languages, but also in their potential weaknesses. English (Dutch, Chinese, ...) sentences can often be interpreted in a baffling number of ways, or they can lack precision. Why is this? Can it sometimes be a strength rather than a weakness?

  • Example: A while ago, I edited (with Stanley Peters) a book that helped to focus linguists' and logicians' attention on the phenomena of ambiguity and underspecification. Many of the issues raised at the time are still essentially unresolved.
  • Example: Not Exactly: in Praise of Vagueness (OUP 2010) addresses readers both within and outside my field. It discusses how vagueness comes up in most kinds of communication, how it can be modelled in mathematical logic, and how it can be handled when computers speak or write.
  • Example: In this book chapter (prefinal draft), entitled The Elusive Benefits of Vagueness, Matthew Green and I (2019) try to find out when it helps a listener if the speaker is vague. Our conclusion, after a series of controlled experiments, is that the advantages ascribed to vagueness can often be explained by other factors (e.g., the fact that non-subitizable numbers are processed slowly).

2. Computational psycholinguistics. With psychologists and other colleagues, I work on computational models of referring.

3. Logic in Language. We construct algorithms that capture the way logical structures are expressed in human language.

  • Example: Explaining Logical Formulas in English. This PhD project, which is part of an EU project on "Natural Language Technology for Explainable Artificial Intelligence" (NL4XAI), aims to generate clear and succinct textual explanations of formulas of First Order Predicate Logic. The project started in September 2020.
  • Example: Models of quantification. Guanyi Chen and I are trying to model how speakers choose combinations of quantified expressions to describe complex visual scenes.

4. Cross-linguistic Pragmatics. My colleagues and I use experiments and algorithms to understand the pragmatic mechanisms that govern communication in different languages. Our current focus is on Chinese:

  • Guanyi Chen's work on James Huang’s “coolness” hypothesis. This is the well-known but largely unproven idea that the languages of the Far East trade off clarity and brevity differently from those of Europe. To understand in detail what's going on, Guanyi makes computational models of the phenomena in question. 
  • Linda Li’s work on the computational modelling of lexical choice in Mandarin. Linda's starting point is the observation that most Chinese words have a short and a long version. She seeks to understand why this is, and what the choice between the long and the short version depends on.

A list of Master and/or Bachelor Thesis projects that I'd be keen to supervise can be found on my Teaching page.