We describe a novel method for coping with ungrammatical input based on the use of chart-like data structures, which permit anytime processing. Priority is given to deep syntactic analysis. Should this fail, the best partial analyses are selected, according to a shortest-paths algorithm, and assembled in a robust processing phase. The method has been applied in a speech translation project with large HPSG grammars.
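The selection of best partial analyses can be viewed as a shortest-path problem over the chart: each partial analysis is an edge spanning a stretch of the input, and the cheapest sequence of edges covering the whole utterance is chosen. A minimal sketch of that idea follows; the edge format, costs, and labels are illustrative assumptions, not the paper's actual data structures:

```python
import heapq

def best_cover(n, edges):
    """Pick a minimal-cost sequence of chart edges (partial analyses)
    covering positions 0..n, via Dijkstra's shortest-path search.
    `edges` maps a start position to a list of (end, cost, label)."""
    dist = {0: 0.0}
    pq = [(0.0, 0, [])]          # (accumulated cost, position, labels so far)
    while pq:
        d, pos, path = heapq.heappop(pq)
        if pos == n:
            return d, path       # reached the end of the input
        for end, cost, label in edges.get(pos, []):
            nd = d + cost
            if end not in dist or nd < dist[end]:
                dist[end] = nd
                heapq.heappush(pq, (nd, end, path + [label]))
    return None                  # no covering sequence exists
```

With edges for "the cat slept" such as a Det+N split competing with a single NP edge, the search returns whichever combination covers positions 0..n at lowest total cost.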
Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation of speech signals and methods for speech feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition.
This section addresses the inverse problem in robust speech processing. A problem that speaker and speech recognition systems regularly encounter in commercial applications is the dramatic degradation of performance caused by the mismatch between training and operating environments.
Research Article Likelihood-Maximizing-Based Multiband Spectral Subtraction for Robust Speech Recognition
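Independent of the likelihood-maximizing band weighting the article describes, the multiband spectral subtraction it builds on can be sketched as follows: the spectrum is split into frequency bands, and a band-specific over-subtraction factor scales the noise estimate removed from each bin, with a spectral floor to avoid negative power. The band boundaries, factors, and floor constant below are illustrative assumptions:

```python
def multiband_spectral_subtraction(power_spec, noise_spec, bands, alphas, beta=0.002):
    """Per-band spectral subtraction on a power spectrum.
    power_spec, noise_spec: per-bin power values (noisy speech, noise estimate);
    bands: list of (lo, hi) bin ranges; alphas: per-band over-subtraction
    factors; beta: spectral-floor fraction of the noisy power."""
    out = list(power_spec)
    for (lo, hi), alpha in zip(bands, alphas):
        for k in range(lo, hi):
            cleaned = power_spec[k] - alpha * noise_spec[k]
            # floor the result so no bin goes below beta * noisy power
            out[k] = max(cleaned, beta * power_spec[k])
    return out
```

Typically the lower bands (where speech energy dominates) get smaller over-subtraction factors than the upper bands, which is what the per-band `alphas` parameter allows.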
Research Article A Review of Signal Subspace Speech Enhancement and Its Application to Noise Robust Speech Recognition
Research Article Compensating Acoustic Mismatch Using Class-Based Histogram Equalization for Robust Speech Recognition
Research Article A Comprehensive Noise Robust Speech Parameterization Algorithm Using Wavelet Packet
Detection and Separation of Speech Event Using Audio and Video Information Fusion and Its Application to Robust Speech Interface
Digital speech processing is a major field in current research all over the world. In particular for automatic speech recognition, very significant achievements have been made since the first attempts at digit recognizers in the 1950s and 1960s, when spectral resonances were determined by analogue filters and logical circuits.
In this study, a novel approach to robust dialogue act detection for error-prone speech recognition in a spoken dialogue system is proposed. First, partial sentence trees are introduced to represent a speech recognition output sentence. Semantic information and the derivation rules of the partial sentence trees are extracted and used to model the relationship between the dialogue acts and the derivation rules.
We describe the design and function of a robust processing component which is being developed for the Verbmobil speech translation system. Its task consists of collecting partial analyses of an input utterance produced by three parsers and attempting to combine them into more meaningful, larger units.
While various aspects of syntactic structure have been shown to bear on the determination of phrase-level prosody, the text-to-speech field has lacked a robust working system to test the possible relations between syntax and prosody. We describe an implemented system which uses the deterministic parser Fidditch to create the input for a set of prosody rules.
Speech recognition affords automobile drivers a hands-free, eyes-free method of replying to Short Message Service (SMS) text messages. Although a voice search approach based on template matching has been shown to be more robust to the challenging acoustic environment of automobiles than using dictation, users may have difficulties verifying whether SMS response templates match their intended meaning, especially while driving. Using a high-fidelity driving simulator, we compared dictation for SMS replies versus voice search in increasingly difficult driving conditions. ...
We demonstrate that transformation-based learning can be used to correct noisy speech recognition transcripts in the lecture domain with an average word error rate reduction of 12.9%. Our method is distinguished from earlier related work by its robustness to small amounts of training data, and its resulting efficiency, in spite of its use of true word error rate computations as a rule scoring function.
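The core of transformation-based learning as applied to transcript correction is a greedy loop: score every candidate rewrite rule by the true word error it would leave against a reference transcript, and keep the best one. A minimal sketch of one such step; the rule format and helper names here are hypothetical, not the paper's implementation:

```python
def apply_rule(tokens, rule):
    """Apply one rewrite rule (target, replacement, left_context): rewrite
    `target` as `replacement`, optionally only when preceded by `left_context`
    (None means any context)."""
    target, repl, left = rule
    out = list(tokens)
    for i, tok in enumerate(out):
        if tok == target and (left is None or (i > 0 and out[i - 1] == left)):
            out[i] = repl
    return out

def word_errors(hyp, ref):
    """Word-level Levenshtein distance: the raw error count behind WER."""
    m, n = len(hyp), len(ref)
    d = list(range(n + 1))               # one DP row, updated in place
    for i in range(1, m + 1):
        prev, d[0] = d[0], i
        for j in range(1, n + 1):
            cur = d[j]
            d[j] = min(d[j] + 1,          # deletion
                       d[j - 1] + 1,      # insertion
                       prev + (hyp[i - 1] != ref[j - 1]))  # substitution/match
            prev = cur
    return d[n]

def best_rule(hyp, ref, candidates):
    """One greedy TBL step: pick the candidate rule whose application most
    reduces the true word error count against the reference."""
    return min(candidates, key=lambda r: word_errors(apply_rule(hyp, r), ref))
```

In full TBL this step repeats, each chosen rule being applied to the training transcripts before the next rule is scored, until no candidate reduces the error count further.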
Speech interfaces to question-answering systems offer significant potential for finding information with phones and mobile networked devices. We describe a demonstration of spoken question answering using a commercial dictation engine whose language models we have customized to questions, a Web-based text-prediction interface allowing quick correction of errors, and an open-domain question-answering system, AnswerBus, which is freely available on the Web.