Statistical prediction

  • Probability and statistics are concerned with events which occur by chance. Examples include occurrence of accidents, errors of measurements, production of defective and nondefective items from a production line, and various games of chance, such as drawing a card from a well-mixed deck, flipping a coin, or throwing a symmetrical six-sided die. In each case we may have some knowledge of the likelihood of various possible results, but we cannot predict with any certainty the outcome of any particular trial....

  • This series aims to capture new developments and summarize what is known over the whole spectrum of mathematical and computational biology and medicine. It seeks to encourage the integration of mathematical, statistical and computational methods into biology by publishing a broad range of textbooks, reference works and handbooks. The titles included in the series are meant to appeal to students, researchers and professionals in the mathematical, statistical and computational sciences, fundamental biology and bioengineering, as well as interdisciplinary researchers involved in the field....

  • Reading is known to be an essential task in language learning, but finding the appropriate text for every learner is far from easy. In this context, automatic procedures can support the teacher’s work. Some tools exist for English, but at present there are none for French as a foreign language (FFL). In this paper, we present an original approach to assessing the readability of FFL texts using NLP techniques and extracts from FFL textbooks as our corpus. Two logistic regression models based on lexical and grammatical features are explored and give quite good predictions on new texts. ...

  • The National Center for Education Statistics (NCES) is the primary federal entity for collecting, analyzing, and reporting data related to education in the United States and other nations.

  • In this paper, with a belief that a language model that embraces a larger context provides better prediction ability, we present two extensions to standard n-gram language models in statistical machine translation: a backward language model that augments the conventional forward language model, and a mutual information trigger model which captures long-distance dependencies that go beyond the scope of standard n-gram language models.

  • In this paper, we propose a linguistically annotated reordering model for BTG-based statistical machine translation. The model incorporates linguistic knowledge to predict orders for both syntactic and non-syntactic phrases. The linguistic knowledge is automatically learned from source-side parse trees through an annotation algorithm. We empirically demonstrate that the proposed model leads to a significant improvement of 1.55% in the BLEU score over the baseline reordering model on the NIST MT-05 Chinese-to-English translation task. ...

  • This paper focuses on the analysis and prediction of so-called aware sites, defined as turns where a user of a spoken dialogue system first becomes aware that the system has made a speech recognition error. We describe statistical comparisons of features of these aware sites in a train timetable spoken dialogue corpus, which reveal significant prosodic differences between such turns, compared with turns that ‘correct’ speech recognition errors as well as with ‘normal’ turns that are neither aware sites nor corrections. ...

  • It is important to correct the errors in the results of speech recognition to increase the performance of a speech translation system. This paper proposes a method for correcting errors using the statistical features of character co-occurrence, and evaluates the method. The proposed method comprises two successive correcting processes. The first process uses pairs of strings: the first string is an erroneous substring of the utterance predicted by speech recognition, the second string is the corresponding section of the actual utterance.

  • This paper examines the feasibility of using statistical methods to train a part-of-speech predictor for unknown words. By using statistical methods, without incorporating hand-crafted linguistic information, the predictor could be used with any language for which there is a large tagged training corpus. Encouraging results have been obtained by testing the predictor on unknown words from the Brown corpus. The relative value of information sources such as affixes and context is discussed.

  • This paper extends the training and tuning regime for phrase-based statistical machine translation to obtain fluent translations into morphologically complex languages (we build an English to Finnish translation system). Our methods use unsupervised morphology induction. Unlike previous work we focus on morphologically productive phrase pairs – our decoder can combine morphemes across phrase boundaries. Morphemes in the target language may not have a corresponding morpheme or word in the source language.

  • Efficient decoding for syntactic parsing has become a necessary research area as statistical grammars grow in accuracy and size and as more NLP applications leverage syntactic analyses. We review prior methods for pruning and then present a new framework that unifies their strengths into a single approach. Using a log linear model, we learn the optimal beam-search pruning parameters for each CYK chart cell, effectively predicting the most promising areas of the model space to explore.

  • In many natural language applications, there is a need to enrich syntactical parse trees. We present a statistical tree annotator augmenting nodes with additional information. The annotator is generic and can be applied to a variety of applications. We report 3 such applications in this paper: predicting function tags; predicting null elements; and predicting whether a tree constituent is projectable in machine translation. Our function tag prediction system significantly outperforms published results. ...

  • In this paper, we present a block-based model for statistical machine translation. A block is a pair of phrases which are translations of each other. For example, Fig. 1 shows an Arabic-English translation example that uses blocks. During decoding, we view translation as a block segmentation process, where the input sentence is segmented from left to right and the target sentence is generated from bottom to top, one block at a time. A monotone block sequence is generated except for the possibility to swap a pair of neighbor blocks.

  • This paper introduces an indexing method based on static analysis of grammar rules and type signatures for typed feature structure grammars (TFSGs). The static analysis tries to predict at compile-time which feature paths will cause unification failure during parsing at run-time. To support the static analysis, we introduce a new classification of the instances of variables used in TFSGs, based on what type of structure sharing they create. The indexing actions that can be performed during parsing are also enumerated. ...

  • Determining the relationship between the intonational characteristics of an utterance and other features inferable from its text is important both for speech recognition and for speech synthesis. This work investigates the use of text analysis in predicting the location of intonational phrase boundaries in natural speech, through analyzing 298 utterances from the DARPA Air Travel Information Service database. For statistical modeling, we employ Classification and Regression Tree (CART) techniques. ...

  • Sentence fluency is an important component of overall text readability but few studies in natural language processing have sought to understand the factors that define it. We report the results of an initial study into the predictive power of surface syntactic statistics for the task; we use fluency assessments done for the purpose of evaluating machine translation. We find that these features are weakly but significantly correlated with fluency. Machine and human translations can be distinguished with accuracy over 80%.

  • In this paper, we extend current state-of-the-art research on unsupervised acquisition of scripts, that is, stereotypical and frequently observed sequences of events. We design, evaluate and compare different methods for constructing models for script event prediction: given a partial chain of events in a script, predict other events that are likely to belong to the script.

  • Parallel data in the domain of interest is the key resource when training a statistical machine translation (SMT) system for a specific purpose. Since ad-hoc manual translation can represent a significant investment in time and money, a prior assessment of the amount of training data required to achieve a satisfactory accuracy level can be very useful. In this work, we show how to predict what the learning curve would look like if we were to manually translate increasing amounts of data.

  • This paper presents a novel method to suggest long word reorderings to a phrase-based SMT decoder. We address language pairs where long reordering concentrates on few patterns, and use fuzzy chunk-based rules to predict likely reorderings for these phenomena. Then we use reordered n-gram LMs to rank the resulting permutations and select the n-best for translation.

  • Speaker’s intention prediction modules can be widely used as a preprocessor for reducing the search space of an automatic speech recognizer. They also can be used as a preprocessor for generating a proper sentence in a dialogue system. We propose a statistical model to predict speakers’ intentions by using multi-level features. Using the multi-level features (morpheme-level features, discourse-level features, and domain knowledge-level features), the proposed model predicts speakers’ intentions that may be implicated in next utterances. ...

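The first entry in this list notes that we may know the likelihood of each result of a chance event without being able to predict any single trial. A minimal sketch of that idea follows, assuming a fair six-sided die; the function name `roll_frequencies`, the seed, and the number of rolls are illustrative choices and do not come from any of the listed documents.

```python
import random
from collections import Counter


def roll_frequencies(n_rolls, seed=0):
    """Simulate n_rolls throws of a fair six-sided die.

    Returns a dict mapping each face (1..6) to its observed relative
    frequency. The seed is fixed only so that the run is reproducible.
    """
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}


# A single throw is unpredictable, but over many throws each face's
# relative frequency settles near the theoretical probability 1/6.
freqs = roll_frequencies(60_000)
for face, freq in freqs.items():
    print(face, round(freq, 4), "theoretical:", round(1 / 6, 4))
```

With tens of thousands of rolls the observed frequencies should sit close to 1/6, while the outcome of any individual roll remains unknown in advance.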

