Probability model from

Showing 1-20 of 83 results for Probability model from
  • This paper proposes a novel method for learning probability models of subcategorization preference of verbs. We consider the issues of case dependencies and noun class generalization in a uniform way by employing the maximum entropy modeling method. We also propose a new model selection algorithm which starts from the most general model and gradually examines more specific models.

    pdf7p bunrieu_1 18-04-2013 23 4   Download

  • In the quest for knowledge, it is not uncommon for researchers to push the limits of simulation techniques to the point where they have to be adapted or totally new techniques or approaches become necessary. True multiscale modeling techniques are becoming increasingly necessary given the growing interest in materials and processes on which large-scale properties are dependent or that can be tuned by their low-scale properties. An example would be nanocomposites, where embedded nanostructures completely change the matrix properties due to effects occurring at the atomic level.

    pdf0p thienbinh1311 13-12-2012 15 3   Download

  • User simulations are shown to be useful in spoken dialog system development. Since most current user simulations deploy probability models to mimic human user behaviors, how to set up user action probabilities in these models is a key problem to solve. One generally used approach is to estimate these probabilities from human user data. However, when building a new dialog system, usually no data or only a small amount of data is available.

    pdf9p hongphan_1 14-04-2013 22 2   Download

  • Language models for speech recognition typically use a probability model of the form $\Pr(a_n \mid a_1, a_2, \ldots, a_{n-1})$. Stochastic grammars, on the other hand, are typically used to assign structure to utterances. A language model of the above form is constructed from such grammars by computing the prefix probability $\sum_{w \in \Sigma^*} \Pr(a_1 \cdots a_n w)$, where $w$ represents all possible terminations of the prefix. The main result in this paper is an algorithm to compute such prefix probabilities given a stochastic Tree Adjoining Grammar (TAG). The algorithm achieves the required computation in $O(n^6)$ time. ...

    pdf7p bunrieu_1 18-04-2013 19 2   Download

  • This paper explores the use of clickthrough data for query spelling correction. First, large amounts of query-correction pairs are derived by analyzing users' query reformulation behavior encoded in the clickthrough data. Then, a phrase-based error model that accounts for the transformation probability between multi-term phrases is trained and integrated into a query speller system.

    pdf9p hongdo_1 12-04-2013 15 1   Download

  • This paper compares a number of generative probability models for a wide-coverage Combinatory Categorial Grammar (CCG) parser. These models are trained and tested on a corpus obtained by translating the Penn Treebank trees into CCG normal-form derivations. According to an evaluation of unlabeled word-word dependencies, our best model achieves a performance of 89.9%, comparable to the figures given by Collins (1999) for a linguistically less expressive grammar. In contrast to Gildea (2001), we find a significant improvement from modeling word-word dependencies. ...

    pdf8p bunmoc_1 20-04-2013 13 1   Download

  • Chapter 4: Bayes Classifier covers the naïve Bayes probabilistic model, constructing a classifier from the probability model, an application of the naïve Bayes classifier, and Bayesian networks.

    ppt27p cocacola_10 08-12-2015 9 1   Download

  • Econometricians, as well as other scientists, are engaged in learning from their experience and data - a fundamental objective of science. Knowledge so obtained may be desired for its own sake, for example to satisfy our curiosity about aspects of economic behavior and/or for use in solving practical problems, for example to improve economic policymaking. In the process of learning from experience and data, description and generalization both play important roles.

    pdf112p phuonghoangnho 23-04-2010 254 146   Download

  • There are many books written about statistics, some brief, some detailed, some humorous, some colorful, and some quite dry. Each of these texts is designed for a specific audience. Too often, texts about statistics have been rather theoretical and intimidating for those not practicing statistical analysis on a routine basis. Thus, many engineers and scientists, who need to use statistics much more frequently than calculus or differential equations, lack sufficient knowledge of the use of statistics.

    pdf103p chuyenphimbuon 21-07-2012 24 8   Download

  • Continuing improvements led to the furnace and bellows and provided the ability to smelt and forge native metals (naturally occurring in relatively pure form).[38] Gold, copper, silver, and lead were such early metals. The advantages of copper tools over stone, bone, and wooden tools were quickly apparent to early humans, and native copper was probably used from near the beginning of Neolithic times (about 8000 BC).[39] Native copper does not naturally occur in large amounts, but copper ores are quite common and some of them produce metal easily when burned in wood or charcoal fires.

    pdf354p louisxlll 20-12-2012 22 5   Download

  • This paper presents an algorithm for learning the probabilities of optional phonological rules from corpora. The algorithm is based on using a speech recognition system to discover the surface pronunciations of words in speech corpora; using an automatic system obviates expensive phonetic labeling by hand. We describe the details of our algorithm and show the probabilities the system has learned for ten common phonological rules which model reductions and coarticulation effects.

    pdf8p bunmoc_1 20-04-2013 27 4   Download

  • We investigate a number of simple methods for improving the word-alignment accuracy of IBM Model 1. We demonstrate reduction in alignment error rate of approximately 30% resulting from (1) giving extra weight to the probability of alignment to the null word, (2) smoothing probability estimates for rare words, and (3) using a simple heuristic estimation method to initialize, or replace, EM training of model parameters.

    pdf8p bunbo_1 17-04-2013 13 3   Download

  • We present the PONG method to compute selectional preferences using part-of-speech (POS) N-grams. From a corpus labeled with grammatical dependencies, PONG learns the distribution of word relations for each POS N-gram. From the much larger but unlabeled Google N-grams corpus, PONG learns the distribution of POS N-grams for a given pair of words. We derive the probability that one word has a given grammatical relation to the other. PONG estimates this probability by combining both distributions, whether or not either word occurs in the labeled corpus. ...

    pdf10p bunthai_1 06-05-2013 18 3   Download

  • Several attempts have been made to learn phrase translation probabilities for phrase-based statistical machine translation that go beyond pure counting of phrases in word-aligned training data. Most approaches report problems with overfitting. We describe a novel leaving-one-out approach to prevent overfitting that allows us to train phrase models that show improved translation performance on the WMT08 Europarl German-English task.

    pdf10p hongdo_1 12-04-2013 19 2   Download

  • We propose a statistical method that finds the maximum-probability segmentation of a given text. This method does not require training data because it estimates probabilities from the given text. Therefore, it can be applied to any text in any domain. An experiment showed that the method is more accurate than or at least as accurate as a state-of-the-art text segmentation system.

    pdf8p bunrieu_1 18-04-2013 12 2   Download

  • Language modeling associates a sequence of words with an a priori probability and is a key part of many natural language applications such as speech recognition and statistical machine translation. In this paper, we present a language model based on a simple kind of dependency grammar. The grammar consists of head-dependent relations between words and can be learned automatically from a raw corpus using the reestimation algorithm, which is also introduced in this paper. Our experiments show that the proposed model performs better than n-gram models at 11% to 11.

    pdf5p bunrieu_1 18-04-2013 18 2   Download

  • Although PCFGs can be accurate, they suffer from vocabulary coverage problems: treebanks are small and lexicons induced from them are limited. The reason for this treebank-centric view in PCFG learning is threefold: the English treebank is fairly large and English morphology is fairly simple, so that in English the treebank does provide mostly adequate lexical coverage; lexicons enumerate analyses, but don’t provide probabilities for them; and, most importantly, the treebank and the external lexicon are likely to follow different annotation schemas, reflecting different linguistic perspectives.

    pdf9p bunthai_1 06-05-2013 12 2   Download

  • We describe a novel method that extracts paraphrases from a bitext, for both the source and target languages. In order to reduce the search space, we decompose the phrase-table into sub-phrase-tables and construct separate clusters for source and target phrases. We convert the clusters into graphs, add smoothing/syntactic-information-carrier vertices, and compute the similarity between phrases with a random walk-based measure, the commute time.

    pdf10p bunthai_1 06-05-2013 19 2   Download

  • In this paper, we extend the work on using latent cross-language topic models for identifying word translations across comparable corpora. We present a novel precision-oriented algorithm that relies on per-topic word distributions obtained by the bilingual LDA (BiLDA) latent topic model. The algorithm aims at harvesting only the most probable word translations across languages in a greedy fashion, without any prior knowledge about the language pair, relying on a symmetrization process and the one-to-one constraint.

    pdf11p bunthai_1 06-05-2013 19 2   Download

  • In data-oriented language processing, an annotated language corpus is used as a stochastic grammar. The most probable analysis of a new sentence is constructed by combining fragments from the corpus in the most probable way. This approach has been successfully used for syntactic analysis, using corpora with syntactic annotations such as the Penn Treebank. If a corpus with semantically annotated sentences is used, the same approach can also generate the most probable semantic interpretation of an input sentence. The present paper explains this semantic interpretation method. ...

    pdf9p bunthai_1 06-05-2013 20 2   Download

