Statistical computing

Showing 1-20 of 243 results for Statistical computing
  • A complete practical tutorial for RStudio, designed with the needs of both analysts and R developers in mind. Step-by-step examples apply the principles of reproducible research and good programming practices to R projects. Learn to generate reports, create graphics, perform analysis, and even build R packages with RStudio.

    pdf126p titatu_123 09-03-2013 30 4   Download

  • This series aims to capture new developments and summarize what is known over the whole spectrum of mathematical and computational biology and medicine. It seeks to encourage the integration of mathematical, statistical and computational methods into biology by publishing a broad range of textbooks, reference works and handbooks. The titles included in the series are meant to appeal to students, researchers and professionals in the mathematical, statistical and computational sciences, fundamental biology and bioengineering, as well as interdisciplinary researchers involved in the field....

    pdf0p 951628473 07-05-2012 31 7   Download

  • This paper describes an extension to the hidden Markov model for part-of-speech tagging using second-order approximations for both contextual and lexical probabilities. This model increases the accuracy of the tagger to state of the art levels. These approximations make use of more contextual information than standard statistical systems. New methods of smoothing the estimated probabilities are also introduced to address the sparse data problem.

    pdf8p bunrieu_1 18-04-2013 14 4   Download

  • This paper reports the on-going research of a thesis project investigating a computational model of early language acquisition. The model discovers word-like units from crossmodal input data and builds continuously evolving internal representations within a cognitive model of memory. Current cognitive theories suggest that young infants employ general statistical mechanisms that exploit the statistical regularities within their environment to acquire language skills.

    pdf9p bunthai_1 06-05-2013 22 4   Download

  • This paper presents a comparative study of five parameter estimation algorithms on four NLP tasks. Three of the five algorithms are well-known in the computational linguistics community: Maximum Entropy (ME) estimation with L2 regularization, the Averaged Perceptron (AP), and Boosting. We also investigate ME estimation with L1 regularization using a novel optimization algorithm, and BLasso, which is a version of Boosting with Lasso (L1) regularization. We first investigate all of our estimators on two re-ranking tasks: a parse selection task and a language model (LM) adaptation task. ...

    pdf8p hongvang_1 16-04-2013 22 3   Download

  • In this paper we describe a novel data structure for phrase-based statistical machine translation which allows for the retrieval of arbitrarily long phrases while simultaneously using less memory than is required by current decoder implementations. We detail the computational complexity and average retrieval times for looking up phrase translations in our suffix array-based data structure. We show how sampling can be used to reduce the retrieval time by orders of magnitude with no loss in translation quality. ...

    pdf8p bunbo_1 17-04-2013 20 3   Download

  • The search space of Phrase-Based Statistical Machine Translation (PBSMT) systems can be represented in the form of a directed acyclic graph (lattice). The quality of this search space can thus be evaluated by computing the best achievable hypothesis in the lattice, the so-called oracle hypothesis. For common SMT metrics, however, this problem is NP-hard and can only be solved using heuristics.

    pdf10p bunthai_1 06-05-2013 25 3   Download

  • Think Bayes is an introduction to Bayesian statistics using computational methods and the Python programming language. Bayesian statistics are usually presented mathematically, but many of the ideas are easier to understand computationally. Contents: Bayes's Theorem; Computational statistics; Tanks and Trains; Urns and Coins; Odds and addends; Hockey; The variability hypothesis; Hypothesis testing.

    pdf176p ringphone 06-05-2013 45 3   Download

  • As part of its new Digital Government program, the National Science Foundation (NSF) requested that the Computer Science and Telecommunications Board (CSTB) undertake an in-depth study of how information technology research and development could more effectively support advances in the use of information technology in government.

    pdf102p camnhung_1 14-12-2012 55 2   Download

  • This textbook was designed and developed to provide health care students, primarily health information management and health information technology students, and health care professionals with a rudimentary understanding of the terms, definitions, and formulae used in computing health care statistics and to provide self-testing opportunities and applications of the statistical formulae.

    pdf288p cronus75 16-01-2013 13 2   Download

  • We tackle the previously unaddressed problem of unsupervised determination of the optimal morphological segmentation for statistical machine translation (SMT) and propose a segmentation metric that takes into account both sides of the SMT training corpus. We formulate the objective function as the posterior probability of the training corpus according to a generative segmentation-translation model. We describe how the IBM Model-1 translation likelihood can be computed incrementally between adjacent segmentation states for efficient computation. ...

    pdf6p hongdo_1 12-04-2013 23 2   Download

  • Statistical models in machine translation exhibit spurious ambiguity. That is, the probability of an output string is split among many distinct derivations (e.g., trees or segmentations). In principle, the goodness of a string is measured by the total probability of its many derivations. However, finding the best string (e.g., during decoding) is then computationally intractable. Therefore, most systems use a simple Viterbi approximation that measures the goodness of a string using only its most probable derivation.

    pdf9p hongphan_1 14-04-2013 14 2   Download

  • State-of-the-art computer-assisted translation engines are based on a statistical prediction engine, which interactively provides completions to what a human translator types. The integration of human speech into a computer-assisted system is also a challenging area and is the aim of this paper. So far, only a few methods for integrating statistical machine translation (MT) models with automatic speech recognition (ASR) models have been studied. They were mainly based on an N-best rescoring approach. ...

    pdf8p hongvang_1 16-04-2013 25 2   Download

  • We describe a new loss function, due to Jeon and Lin (2006), for estimating structured log-linear models on arbitrary features. The loss function can be seen as a (generative) alternative to maximum likelihood estimation with an interesting information-theoretic interpretation, and it is statistically consistent. It is substantially faster (by an order of magnitude or more) than maximum (conditional) likelihood estimation of conditional random fields (Lafferty et al., 2001).

    pdf8p hongvang_1 16-04-2013 21 2   Download

  • In this paper we focus on how to improve pronoun resolution using statistics-based semantic compatibility information. We investigate two unexplored issues that influence the effectiveness of such information: statistics source and learning framework. Specifically, we propose for the first time to utilize the web and the twin-candidate model, in addition to the previous combination of the corpus and the single-candidate model, to compute and apply the semantic information.

    pdf8p bunbo_1 17-04-2013 18 2   Download

  • In statistical machine translation, the generation of a translation hypothesis is computationally expensive. If arbitrary word reorderings are permitted, the search problem is NP-hard. On the other hand, if we restrict the possible word reorderings in an appropriate way, we obtain a polynomial-time search algorithm. In this paper, we compare two different reordering constraints, namely the ITG constraints and the IBM constraints.

    pdf8p bunbo_1 17-04-2013 16 2   Download

  • The processes through which readers evoke mental representations of phonological forms from print constitute a hotly debated and controversial issue in current psycholinguistics. In this paper we present a computational analysis of the grapho-phonological system of written French, and an empirical validation of some of the obtained descriptive statistics.

    pdf7p bunrieu_1 18-04-2013 20 2   Download

  • This paper introduces a novel generation system that composes humanlike descriptions of images from computer vision detections. By leveraging syntactically informed word co-occurrence statistics, the generator filters and constrains the noisy detections output from a vision system to generate syntactic trees that detail what the computer vision system sees. Results show that the generation system outperforms state-of-the-art systems, automatically generating some of the most natural image descriptions to date. ...

    pdf10p bunthai_1 06-05-2013 15 2   Download

  • Since they cluster terms through statistical measures of context similarities, these tools exploit recurring situations. Since single-word terms denote broader concepts than multi-word terms, they appear more frequently in corpora and are therefore more appropriate for statistical clustering. The contribution of this paper is to propose an integrated platform for computer-aided term extraction and structuring that results from the combination of LEXTER, a Term Extraction tool (Bourigault et al., 1996), and FASTR, a Term Normalization tool (Jacquemin et al., 1997). ...

    pdf8p bunthai_1 06-05-2013 13 2   Download

  • Computer simulation is used to reduce the risk associated with creating new systems or with making changes to existing ones. More than ever, modern organizations want assurance that investments will produce the expected results. For instance, an assembly line may be required to produce a particular number of autos during an eight hour shift. Complex, interacting factors influence operation and so powerful tools are needed to develop an accurate analysis.

    pdf172p tuanloc_do 03-12-2012 19 1   Download

