Dagstuhl Perspectives Workshop on Human-in-the-Loop Learning through Grounded Interaction in Games
This Dagstuhl Perspectives Workshop aims to bring together the communities working on the related areas of learning through interaction, (conversational) agents in games, dialogue and interaction, and collecting judgments from crowds through games, so that each community becomes aware of the most recent developments in the others. We also intend to discuss current challenges and whether advances in one area (e.g., grounded interaction) can benefit other areas (e.g., interactive learning).
Over the past few years, there has been a decisive move in Artificial Intelligence (AI) towards human-centered intelligence and AI models that can learn through interaction. This shift is the result of the appearance of Large Language Models (LLMs) such as ChatGPT, Llama, and Gemini, which can act as intelligent assistants and achieve an entirely new level of performance on many AI tasks. Much of the success of these models is due to training regimes that combine supervised learning with learning from interaction with humans, such as Reinforcement Learning from Human Feedback (Christiano et al., 2017; Ouyang et al., 2022). The most recent of these models, such as GPT-4, are trained on multimodal data and can produce output in different modalities. However, these models also have well-known issues, such as hallucinations, leading researchers to speak of a Generative AI Paradox (West et al., 2023).
In parallel with these developments, there has also been substantial progress on grounded interaction: developing models that are aware of the situation in which they operate (a physical world in the case of robots, a virtual world in the case of artificial agents) and able, for example, to understand and produce references to that situation (FitzGerald et al., 2013; Kazemzadeh et al., 2014; Kennington & Schlangen, 2017; Chevalier-Boisvert et al., 2019; Testoni & Bernardi, 2021; Suglia et al., 2024), perhaps through negotiation (Clark & Brennan, 1991). However, communication between the intelligent-assistant and grounded-interaction communities is still limited (Krishnamurthy & Kollar, 2013).
A particularly promising way to study learning through grounded interaction with human agents is virtual world games: games in which conversational agents impersonating characters can learn to perform tasks, or improve their communicative ability, by interacting with human players on platforms such as Minecraft or LIGHT (Johnson et al., 2016; Urbanek et al., 2019; Narayan-Chen et al., 2019; Szlam et al., 2019; Kiseleva et al., 2022; Zhou et al., 2023). Games have been shown to be a promising platform for collecting data from thousands of players (von Ahn, 2006; Yu et al., 2023); virtual worlds approach the complexity of the real world; and agents operating in such worlds need to develop a variety of interactional skills to be perceived as 'real' (Schlangen, 2023; Chalamalasetti et al., 2023).
Researchers from the University of Utrecht and from around the world have organized a workshop, to be held at Schloss Dagstuhl on 2-6 December, to discuss these issues, take stock, and propose a Manifesto to inspire new research directions.