Open-ended spoken interactions are typically characterised by both structural complexity and high levels of uncertainty, making dialogue management in such settings a particularly challenging problem. Traditional approaches have focused on providing theoretical accounts for either the uncertainty or the complexity of spoken dialogue, but rarely considered the two issues simultaneously.
Others, including earlier versions of our system, bury discourse functions inside other modules, such as natural language interpretation or the back-end interface. An innovation of this work is the compartmentalization of discourse processing into three generically definable components (Dialogue Management, Context Tracking, and Pragmatic Adaptation, described in Section 1 below) and the software control structure for interaction between these and other components of a spoken dialogue system (Section 2).
In spoken dialogue systems, Partially Observable Markov Decision Processes (POMDPs) provide a formal framework for making dialogue management decisions under uncertainty, but efficiency and interpretability considerations mean that most current statistical dialogue managers are only MDPs. These MDP systems encode uncertainty explicitly in a single state representation.
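The contrast drawn above can be made concrete: an MDP manager collapses uncertainty into a single discrete state (for instance by bucketing ASR confidence), whereas a POMDP would keep a distribution over states. The following sketch of the MDP side uses invented state names, a hypothetical 0.7 confidence threshold, and a hand-written policy table; none of it is taken from a specific system.

```python
# Illustrative sketch (not from any cited work): an MDP-style dialogue
# manager that folds ASR confidence into a single discrete state,
# rather than maintaining a POMDP belief distribution.

# Each state pairs a slot status with a bucketed confidence level.
STATES = [("slot_empty", None),
          ("slot_filled", "low_conf"),
          ("slot_filled", "high_conf")]

# A hand-written policy: the action is a deterministic function of state.
POLICY = {
    ("slot_empty", None): "ask_slot",
    ("slot_filled", "low_conf"): "confirm_slot",  # explicit confirmation
    ("slot_filled", "high_conf"): "proceed",      # accept the value
}

def choose_action(slot_filled: bool, asr_confidence: float) -> str:
    """Map an observation onto a single MDP state and look up the action."""
    if not slot_filled:
        state = ("slot_empty", None)
    elif asr_confidence < 0.7:  # threshold is an arbitrary example value
        state = ("slot_filled", "low_conf")
    else:
        state = ("slot_filled", "high_conf")
    return POLICY[state]
```

Because the confidence bucket is baked into the state itself, all uncertainty handling happens at the moment of state construction, which is exactly what makes such systems tractable but less expressive than a full belief-state POMDP.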
Over several years, we have developed an approach to spoken dialogue systems that includes rule-based and trainable dialogue managers, spoken language understanding and generation modules, and a comprehensive dialogue system architecture. We present a Reinforcement Learning-based dialogue system that goes beyond standard rule-based models and computes the best dialogue moves on-line. The key concept of this work is that we bridge the gap between manually written dialogue models (e.g. ...
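The on-line choice of dialogue moves with RL can be sketched as a small tabular Q-learning loop. Everything below (the states, moves, reward scheme, toy environment, and hyper-parameters) is invented for illustration and is not taken from the system described above.

```python
# Hedged sketch: tabular Q-learning over invented dialogue states and
# moves. The step() function is a toy stand-in for real user turns.
import random

random.seed(0)
STATES = ["no_info", "have_info"]
MOVES = ["ask", "confirm", "close"]
Q = {(s, m): 0.0 for s in STATES for m in MOVES}
alpha, gamma = 0.5, 0.9  # arbitrary example hyper-parameters

def step(state, move):
    """Toy environment: asking gathers info; closing with info succeeds."""
    if state == "no_info":
        return ("have_info", 0.0) if move == "ask" else ("no_info", -1.0)
    return ("have_info", 5.0) if move == "close" else ("have_info", -1.0)

for _ in range(500):                    # simulated dialogues
    state = "no_info"
    for _ in range(5):
        move = random.choice(MOVES)     # pure exploration, for brevity
        nxt, reward = step(state, move)
        best_next = max(Q[(nxt, m)] for m in MOVES)
        Q[(state, move)] += alpha * (reward + gamma * best_next - Q[(state, move)])
        state = nxt
```

After training, taking the argmax of Q in each state recovers the sensible policy (ask first, then close), which is the sense in which the learned model can replace a manually written one.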
In this paper we present results on developing robust natural language interfaces by combining shallow and partial interpretation with dialogue management. The key issue is to reduce the effort needed to adapt the knowledge sources for parsing and interpretation to a necessary minimum. In the paper we identify different types of information and present corresponding computational models. The approach utilizes an automatically generated lexicon which is updated with information from a corpus of simulated dialogues. The grammar is developed manually from the same knowledge sources. ...
Adaptive Dialogue Systems are rapidly becoming part of our everyday lives. As they progress and adopt new technologies, they become more intelligent and able to adapt better and faster to their environment. Research in this field is currently focused on how to achieve adaptation, and particularly on applying Reinforcement Learning (RL) techniques, making a comparative study of the related methods, such as this one, necessary.
Spoken dialogue managers have benefited from using stochastic planners such as Markov Decision Processes (MDPs). However, so far, MDPs do not handle noisy and ambiguous speech utterances well. We use a Partially Observable Markov Decision Process (POMDP)-style approach to generate dialogue strategies by inverting the notion of dialogue state; the state represents the user’s intentions, rather than the system state.
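The inverted-state idea, where the hidden state is the user's intention and the system tracks a belief over it, amounts to a Bayesian filtering step. A minimal sketch, with invented intentions, an invented toy observation model, and made-up probabilities:

```python
# Hedged sketch of POMDP-style belief tracking over user intentions.
# All intentions, likelihoods, and numbers are invented for illustration.

def update_belief(belief, observation, likelihood):
    """One Bayesian filtering step: b'(s) is proportional to P(o|s) * b(s)."""
    unnormalised = {intent: likelihood[intent].get(observation, 0.0) * p
                    for intent, p in belief.items()}
    total = sum(unnormalised.values())
    return {intent: p / total for intent, p in unnormalised.items()}

# Uniform prior over two hypothetical user intentions.
belief = {"book_flight": 0.5, "cancel_flight": 0.5}
# P(observed word | intention): a toy observation model.
likelihood = {"book_flight":   {"book": 0.8, "cancel": 0.1},
              "cancel_flight": {"book": 0.2, "cancel": 0.7}}

belief = update_belief(belief, "book", likelihood)
# Belief now favours book_flight: 0.8*0.5 / (0.8*0.5 + 0.2*0.5) = 0.8
```

The point of the inversion is visible here: a noisy or ambiguous word shifts the belief gradually rather than forcing a single hard system state.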
This paper describes a system for managing dialogue in a natural language interface. The proposed approach uses a dialogue manager as the overall control mechanism. The dialogue manager accesses domain-independent resources for interpretation, generation and background system access. It also uses information from domain-dependent knowledge sources, which are customized for various applications.
This book is based on publications from the ISCA Tutorial and Research Workshop on Multi-Modal Dialogue in Mobile Environments held at Kloster Irsee, Germany, in 2002. The workshop covered various aspects of development and evaluation of spoken multimodal dialogue systems and components, with particular emphasis on mobile environments, and discussed the state-of-the-art within this area. On the development side, the major aspects addressed include speech recognition, dialogue management, multimodal output generation, system architectures, full applications, and user interface issues.
Dialogues may be seen as comprising commonplace routines on the one hand and specialized, task-specific interactions on the other. Object-orientation is an established means of separating the generic from the specialized. The system under discussion combines this object-oriented approach with a self-organizing, mixed-initiative dialogue strategy, raising the possibility of dialogue systems that can be assembled from ready-made components and tailored, specialized components.
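The generic/specialized split described above maps naturally onto inheritance: a base class supplies the commonplace routines, and a subclass overrides only the task-specific behaviour. A minimal sketch with invented class and method names:

```python
# Hedged sketch of the object-oriented split: class names, methods,
# and the booking domain are all invented for illustration.

class GenericDialogue:
    """Ready-made component: routines shared by every dialogue."""
    def greet(self):
        return "Hello, how can I help you?"
    def clarify(self, utterance):
        return f"Sorry, I did not understand '{utterance}'."
    def handle_task(self, utterance):
        raise NotImplementedError  # specialized components fill this in

class TrainBookingDialogue(GenericDialogue):
    """Tailored component: overrides only the task-specific part."""
    def handle_task(self, utterance):
        return "Which station are you travelling to?"

dm = TrainBookingDialogue()
# greet() and clarify() are inherited unchanged; handle_task() is
# specialized, which is the assembly-from-components idea in miniature.
```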
An increasing number of telephone services are offered in a fully automatic way with the help of speech technology. The underlying systems, called spoken dialogue systems (SDSs), possess speech recognition, speech understanding, dialogue management, and speech generation capabilities, and enable a more-or-less natural spoken interaction with the human user. Nevertheless, the principles underlying this type of interaction are different from the ones which govern telephone conversations between humans, because of the limitations of the machine interaction partner.
The paper considers how to scale up dialogue protocols to multilogue, i.e. settings with multiple conversationalists. We extract two benchmarks to evaluate scaled-up protocols, based on the long-distance resolution possibilities of non-sentential utterances in dialogue and multilogue in the British National Corpus. In light of these benchmarks, we then consider three possible transformations to dialogue protocols, formulated within an issue-based approach to dialogue management. We show that one such transformation yields protocols for querying and assertion that fulfill these benchmarks. ...
Mobile interfaces need to allow the user and system to adapt their choice of communication modes according to user preferences, the task at hand, and the physical and social environment. We describe a multimodal application architecture which combines finite-state multimodal language processing, a speech-act based multimodal dialogue manager, dynamic multimodal output generation, and user-tailored text planning to enable rapid prototyping of multimodal interfaces with flexible input and adaptive output. ...
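Finite-state multimodal language processing of the kind mentioned here aligns a speech stream with a gesture stream. A much-simplified sketch, with an invented deictic vocabulary and an invented gesture format, resolves "that"/"there" against pointing gestures in order:

```python
# Hedged sketch of multimodal integration: the vocabulary, gesture
# placeholders, and alignment rule are invented for illustration and
# are far simpler than a real finite-state transducer over both streams.

def integrate(speech_tokens, gestures):
    """Replace each deictic 'that'/'there' with the next gesture referent."""
    gesture_iter = iter(gestures)
    resolved = []
    for token in speech_tokens:
        if token in ("that", "there"):
            resolved.append(next(gesture_iter))  # consume one gesture
        else:
            resolved.append(token)
    return resolved

print(integrate(["put", "that", "there"], ["<cup>", "<table>"]))
# -> ['put', '<cup>', '<table>']
```

A real finite-state approach would encode both streams and their temporal alignment in one machine; the point here is only that the combination can be expressed with purely finite means.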
This paper presents the first demonstration of a statistical spoken dialogue system that uses automatic belief compression to reason over complex user goal sets. Reasoning over the power set of possible user goals allows complex sets of user goals to be represented, which leads to more natural dialogues. The use of the power set results in a massive expansion in the number of belief states maintained by the Partially Observable Markov Decision Process (POMDP) spoken dialogue manager.
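The scale of the power-set expansion is easy to quantify: with n atomic user goals there are 2^n possible goal sets, so the belief state grows exponentially. A small illustration (the goal counts are arbitrary examples):

```python
# The power-set blow-up in numbers; purely illustrative figures.

def num_goal_sets(n_goals: int) -> int:
    """Size of the power set of n atomic user goals."""
    return 2 ** n_goals

for n in (3, 10, 20):
    print(n, num_goal_sets(n))
# Even 20 atomic goals already yield 1,048,576 goal sets, which is why
# automatic belief compression is needed to keep the POMDP tractable.
```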
This paper describes MIMUS, a multimodal and multilingual dialogue system for the in-home scenario, which allows users to control some home devices by voice and/or clicks. Its design relies on Wizard of Oz experiments and is targeted at disabled users. MIMUS follows the Information State Update approach to dialogue management, and supports English, German and Spanish, with the possibility of changing language on-the-fly. MIMUS includes a gestures-enabled talking head which endows the system with a human-like personality. ...
We describe how context-sensitive, user-tailored output is specified and produced in the COMIC multimodal dialogue system. At the conference, we will demonstrate the user-adapted features of the dialogue manager and text planner. ... three-dimensional walkthrough of the finished bathroom. We will focus on how context-sensitive, user-tailored output is generated in the third, guided-browsing phase of the interaction. Figure 2 shows a typical user request and response from COMIC in this phase.
Given the growing complexity of tasks that spoken dialogue systems are trying to handle, Reinforcement Learning (RL) has been increasingly used as a way of automatically learning the best policy for a system to follow. While most work has focused on generating better policies for a dialogue manager, very little work has been done in using RL to construct a better dialogue state. This paper presents an RL approach for determining what dialogue features are important to a spoken dialogue tutoring system. ...
In this paper we discuss the use of discourse context in spoken dialogue systems and argue that knowledge of the domain, modelled with the help of dialogue topics, is important in maintaining the robustness of the system and improving the recognition accuracy of spoken utterances. We propose a topic model which consists of a domain model, structured into a topic tree, and the Predict-Support algorithm, which assigns topics to utterances on the basis of the topic transitions described in the topic tree and the words recognized in the input utterance. ...
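The Predict-Support idea as summarised here can be sketched in two steps: the topic tree's transitions predict candidate next topics, and the recognized words provide support for choosing among them. The topics, transition table, and word lists below are invented for illustration and are not taken from the paper:

```python
# Hedged sketch of a Predict-Support-style topic assignment step.
# Topics, transitions, and support words are invented examples.

TOPIC_TRANSITIONS = {            # topic -> topics likely to follow it
    "greeting": ["timetable", "fares"],
    "timetable": ["fares", "timetable"],
}
TOPIC_WORDS = {                  # words that lend support to each topic
    "timetable": {"train", "leave", "arrive"},
    "fares": {"cost", "ticket", "price"},
}

def assign_topic(previous_topic, recognised_words):
    """Pick the predicted topic best supported by the recognised words."""
    predicted = TOPIC_TRANSITIONS.get(previous_topic, [])
    def support(topic):
        return len(TOPIC_WORDS.get(topic, set()) & set(recognised_words))
    return max(predicted, key=support, default=None)

assign_topic("greeting", ["when", "does", "the", "train", "leave"])
```

Restricting the search to predicted topics is what lets noisy recognition output still land on a plausible topic, which is the robustness argument the abstract makes.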