An increasing number of telephone services are offered in a fully automatic
way with the help of speech technology. The underlying systems, called spoken
dialogue systems (SDSs), possess speech recognition, speech understanding,
dialogue management, and speech generation capabilities, and enable a
more-or-less natural spoken interaction with the human user. Nevertheless, the
principles underlying this type of interaction are different from the ones which
govern telephone conversations between humans, because of the limitations of
the machine interaction partner.
In this study, a novel approach to robust dialogue act detection for error-prone speech recognition in a spoken dialogue system is proposed. First, partial sentence trees are proposed to represent a speech recognition output sentence. Semantic information and the derivation rules of the partial sentence trees are extracted and used to model the relationship between the dialogue acts and the derivation rules.
We present a human-robot dialogue system that enables a robot to work together with a human user to build wooden construction toys. We then describe a study in which na¨ve subjects interacted with this ı system under a range of conditions and then completed a user-satisfaction questionnaire. The results of this study provide a wide range of subjective and objective measures of the quality of the interactions.
Techniques for automatically training modules of a natural language generator have recently been proposed, but a fundamental concern is whether the quality of utterances produced with trainable components can compete with hand-crafted template-based or rulebased approaches. In this paper We experimentally evaluate a trainable sentence planner for a spoken dialogue system by eliciting subjective human judgments.
In this paper we discuss the use of discourse context in spoken dialogue systems and argue that the knowledge of the domain, modelled with the help of dialogue topics is important in maintaining robustness of the system and improving recognition accuracy of spoken utterances. We propose a topic model which consists of a domain model, structured into a topic tree, and the Predict-Support algorithm which assigns topics to utterances on the basis of the topic transitions described in the topic tree and the words recognized in the input utterance. ...
Others, including earlier versions of our system, bury discourse functions inside other modules, such as natural language interpretation or the back-end interface. An innovation of this work is the compartmentalization of discourse processing into three generically definable components--Dialogue Management, Context Tracking, and Pragmatic Adaptation (described in Section 1 below)--and the software control structure for interaction between these and other components of a spoken dialogue system (Section 2).
In-vehicle dialogue systems often contain more than one application, e.g. a navigation and a telephone application. This means that the user might, for example, interrupt the interaction with the telephone application to ask for directions from the navigation application, and then resume the dialogue with the telephone application. In this paper we present an analysis of interruption and resumption behaviour in human-human in-vehicle dialogues and also propose some implications for resumption strategies in an in-vehicle dialogue system. determine type of workload and act accordingly. ...
We present a novel approach to Information Presentation (IP) in Spoken Dialogue Systems (SDS) using a data-driven statistical optimisation framework for content planning and attribute selection. First we collect data in a Wizard-of-Oz (WoZ) experiment and use it to build a supervised model of human behaviour. This forms a baseline for measuring the performance of optimised policies, developed from this data using Reinforcement Learning (RL) methods.
Individual utterances often serve multiple communicative purposes in dialogue. We present a data-driven approach for identiﬁcation of multiple dialogue acts in single utterances in the context of dialogue systems with limited training data. Our approach results in signiﬁcantly increased understanding of user intent, compared to two strong baselines.
Over several years, we have developed an approach to spoken dialogue systems that includes rule-based and trainable dialogue managers, spoken language understanding and generation modules, and a comprehensive dialogue system architecture. We present a Reinforcement Learning-based dialogue system that goes beyond standard rule-based models and computes on-line decisions of the best dialogue moves. The key concept of this work is that we bridge the gap between manually written dialog models (e.g.
Spoken language generation for dialogue systems requires a dictionary of mappings between semantic representations of concepts the system wants to express and realizations of those concepts. Dictionary creation is a costly process; it is currently done by hand for each dialogue domain. We propose a novel unsupervised method for learning such mappings from user reviews in the target domain, and test it on restaurant reviews.
In this paper we explore the utility of the Navigation Map (NM), a graphical representation of the discourse structure. We run a user study to investigate if users perceive the NM as helpful in a tutoring spoken dialogue system. From the users’ perspective, our results show that the NM presence allows them to better identify and follow the tutoring plan and to better integrate the instruction. It was also easier for users to concentrate and to learn from the system if the NM was present. Our preliminary analysis on objective metrics further strengthens these findings. ...
We present a prototype natural-language problem-solving application for a financial services call center, developed as part of the Amitiés multilingual human-computer dialogue project. Our automated dialogue system, based on empirical evidence from real call-center conversations, features a datadriven approach that allows for mixed system/customer initiative and spontaneous conversation. Preliminary evaluation results indicate efficient dialogues and high user satisfaction, with performance comparable to or better than that of current conversational travel information systems. ...
To tackle the problem of presenting a large number of options in spoken dialogue systems, we identify compelling options based on a model of user preferences, and present tradeoffs between alternative options explicitly. Multiple attractive options are structured such that the user can gradually reﬁne her request to ﬁnd the optimal tradeoff. We show that our approach presents complex tradeoffs understandably, increases overall user satisfaction, and signiﬁcantly improves the user’s overview of the available options.
We demonstrate a multimodal dialogue system using reinforcement learning for in-car scenarios, developed at Edinburgh University and Cambridge University for the TALK project1. This prototype is the ﬁrst “Information State Update” (ISU) dialogue system to exhibit reinforcement learning of dialogue strategies, and also has a fragmentary clariﬁcation feature. This paper describes the main components and functionality of the system, as well as the purposes and future use of the system, and surveys the research issues involved in its construction. ...
We present and evaluate a new model for Natural Language Generation (NLG) in Spoken Dialogue Systems, based on statistical planning, given noisy feedback from the current generation context (e.g. a user and a surface realiser). We study its use in a standard NLG problem: how to present information (in this case a set of search results) to users, given the complex tradeoffs between utterance length, amount of information conveyed, and cognitive load. We set these trade-offs by analysing existing MATCH data.
It is not always clear how the differences in intrinsic evaluation metrics for a parser or classiﬁer will affect the performance of the system that uses it. We investigate the relationship between the intrinsic evaluation scores of an interpretation component in a tutorial dialogue system and the learning outcomes in an experiment with human users. Following the PARADISE methodology, we use multiple linear regression to build predictive models of learning gain, an important objective outcome metric in tutorial dialogue.
This paper presents the ﬁrst demonstration of a statistical spoken dialogue system that uses automatic belief compression to reason over complex user goal sets. Reasoning over the power set of possible user goals allows complex sets of user goals to be represented, which leads to more natural dialogues. The use of the power set results in a massive expansion in the number of belief states maintained by the Partially Observable Markov Decision Process (POMDP) spoken dialogue manager.
This system demonstration paper presents IRIS (Informal Response Interactive System), a chat-oriented dialogue system based on the vector space model framework. The system belongs to the class of examplebased dialogue systems and builds its chat capabilities on a dual search strategy over a large collection of dialogue samples.
We present a data-driven approach to learn user-adaptive referring expression generation (REG) policies for spoken dialogue systems. Referring expressions can be difﬁcult to understand in technical domains where users may not know the technical ‘jargon’ names of the domain entities. In such cases, dialogue systems must be able to model the user’s (lexical) domain knowledge and use appropriate referring expressions.