We present a novel approach to Information Presentation (IP) in Spoken Dialogue Systems (SDS) using a data-driven statistical optimisation framework for content planning and attribute selection. First we collect data in a Wizard-of-Oz (WoZ) experiment and use it to build a supervised model of human behaviour. This forms a baseline for measuring the performance of optimised policies, developed from this data using Reinforcement Learning (RL) methods.
The written form of Arabic, Modern Standard Arabic (MSA), differs quite a bit from the spoken dialects of Arabic, which are the true “native” languages of Arabic speakers used in daily life. However, due to MSA’s prevalence in written form, almost all Arabic datasets have predominantly MSA content. We present the Arabic Online Commentary Dataset, a 52M-word monolingual dataset rich in dialectal content, and we describe our long-term annotation effort to identify the dialect level (and dialect itself) in each sentence of the dataset. ...
Miscommunication in speech recognition systems is unavoidable, but a detailed characterization of user corrections will enable speech systems to identify when a correction is taking place and to more accurately recognize the content of correction utterances. In this paper we investigate the adaptations of users when they encounter recognition errors in interactions with a voice-in/voice-out spoken language system.
For spoken dialogue systems to correctly understand user intentions to achieve certain tasks while conversing with users, the dialogue state has to be appropriately updated (Zue and Glass, 2000) after each user utterance. Here, a dialogue state means all the information that the system possesses concerning the dialogue. For example, a dialogue state includes intention recognition results after each user utterance, the user utterance history, the system utterance history, and so forth.
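As a rough illustration of the dialogue state described above, the sketch below holds the three components the text names and appends to them after each user utterance. The class name `DialogueState`, its field names, and the helper `update_after_user_utterance` are hypothetical choices made for this example, not taken from the source.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DialogueState:
    """Hypothetical container for what the system knows about the dialogue."""
    # One intention recognition result per user utterance (assumed to be a label).
    intention_results: List[str] = field(default_factory=list)
    user_utterances: List[str] = field(default_factory=list)
    system_utterances: List[str] = field(default_factory=list)


def update_after_user_utterance(
    state: DialogueState, utterance: str, intention: str
) -> DialogueState:
    """Return a new state that records the utterance and its recognised intention."""
    return DialogueState(
        intention_results=state.intention_results + [intention],
        user_utterances=state.user_utterances + [utterance],
        system_utterances=list(state.system_utterances),
    )


state = DialogueState()
state = update_after_user_utterance(
    state, "I would like a room for tomorrow", "request_reservation"
)
```

Returning a new state rather than mutating the old one is only a design choice for this sketch; it keeps earlier states available for inspection.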
In a language generation system, a content planner embodies one or more “plans” that are usually hand-crafted, sometimes through manual analysis of target text. In this paper, we present a system that we developed to automatically learn elements of a plan and the ordering constraints among them. As training data, we use semantically annotated transcripts of domain experts performing the task our system is designed to mimic.
Spoken word collections promise access to unique and compelling content, and most of the technology needed to realize that promise is now in place. Decreasing storage costs, increasing network capacity, and the availability of software to encode and exchange digital audio make possible physical access to spoken word collections at a previously unimaginable scale. Effective support for intellectual access — the problem of finding what you are looking for — is much more challenging, however. ...
At a later stage, pre-final draft chapters were presented to the Longman Linglex Advisory Committee, where again a strong impetus to improve the book's content and presentation was provided by valuable and (often) trenchant critiques from a group of leading British linguists, under the chairmanship of Lord Quirk: Rod Bolitho, Gillian Brown, David Crystal, Philip Scholfield, Katie Wales, John Wells, and Yorick Wilks. Alan Tonkyn offered useful advice and information on C-units (Chapter 14).
Today, digital audio applications are part of our everyday lives. Popular examples include audio CDs, MP3 audio players, radio broadcasts, TV or video DVDs, video games, digital cameras with sound track, digital camcorders, telephones, telephone answering machines and telephone enquiries using speech or word
We also received valuable comments from Bengt Altenberg and Gunnel Tottie, who read draft versions of individual chapters. (Bengt's online ICAME Bibliography also provided us with an extremely useful starting point for our own Bibliography.)
In this chapter we use the well-defined MPEG-7 Spoken Content description standard as an example to illustrate the challenges in this field. The audio part of MPEG-7 includes a high-level SpokenContent tool aimed at spoken data management applications.
Named reactions are still an important element of organic chemistry, and a thorough knowledge of such reactions is essential for the chemist. The scientific content behind the name is of great importance, and the names themselves are used as short expressions in order to ease spoken as well as written communication in organic chemistry. Furthermore, named reactions are a perfect aid for learning the principles of organic chemistry. This is not only true for the study of chemistry as a major subject, but also when studying chemistry as a minor
We have analyzed 607 sentences of spontaneous human-computer speech data containing repairs, drawn from a total corpus of 10,718 sentences. We present here criteria and techniques for automatically detecting the presence of a repair, locating it, and making the appropriate correction. The criteria involve integration of knowledge from several sources: pattern matching, syntactic and semantic analysis, and acoustics.
INTRODUCTION
Spontaneous spoken language often includes speech that is not intended by the speaker to be part of the content of the utterance. ...
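To give a feel for the pattern-matching knowledge source mentioned above, the sketch below flags word repetitions within a short window as repair candidates. It is an assumption-laden toy, not the authors' method: the function name `find_repeat_candidates`, the `max_gap` parameter, and the matching rule are all hypothetical.

```python
from typing import List, Tuple


def find_repeat_candidates(tokens: List[str], max_gap: int = 2) -> List[Tuple[int, int]]:
    """Return (i, j) pairs where tokens[i] == tokens[j] and at most
    max_gap tokens lie between them. Hypothetical rule for illustration."""
    candidates = []
    for i, word in enumerate(tokens):
        # Look ahead over the next max_gap + 1 positions only.
        for j in range(i + 1, min(i + max_gap + 2, len(tokens))):
            if tokens[j] == word:
                candidates.append((i, j))
                break  # keep only the nearest repetition of this token
    return candidates


# Adjacent repetition: "to to" is flagged at positions 3 and 4.
print(find_repeat_candidates("show me flights to to boston".split()))

# A fluent sentence can also match: "to ... to" at positions 2 and 4.
print(find_repeat_candidates("i want to fly to boston".split()))
```

The second call shows that surface matching alone flags fluent sentences too, which is consistent with the text's point that several knowledge sources need to be combined.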