Analyze tabular data using the BI Semantic Model (BISM) in Microsoft® SQL Server® 2012 Analysis Services—and discover a simpler method for creating corporate-level BI solutions. Led by three BI experts, you’ll learn how to build, deploy, and query a BISM tabular model with step-by-step guides, examples, and best practices. This hands-on book shows you how the tabular model’s in-memory database enables you to perform rapid analytics—whether you’re a professional BI developer new to Analysis Services or familiar with its multidimensional model....
Modern embedded systems come with contradictory design constraints. On one hand, these systems often target mass production and battery-based devices, and therefore should be cheap and power efficient. On the other hand, they still need to show high (sometimes real-time) performance, and often support multiple applications and standards which requires high programmability.
Level 1 shows you how to produce 3-D models using NURBS geometryand arrange models for export, annotation and plotting.In class, you will receive information at an accelerated pace. For best results, practice at a Rhino workstation between class sessions, and consult your Rhino reference manual and the Help file for additional information.
This book has two main goals: to provide a unifed and structured overview of this growing field, as well as to propose a corresponding software framework, the OpenTL library, developed by the author and his working group at TUM-Informatik.
The main objective of this work is to show, how most real-world application scenarios can be naturally cast into a common description vocabulary, and therefore implemented and tested in a fully modular and scalable way, through the defnition of a layered, object-oriented software architecture.
This paper presents a method to develop a class of variable memory Markov models that have higher memory capacity than traditional (uniform memory) Markov models. The structure of the variable memory models is induced from a manually annotated corpus through a decision tree learning algorithm. A series of comparative experiments show the resulting models outperform uniform memory Markov models in a part-of-speech tagging task.
This work investigates supervised word alignment methods that exploit inversion transduction grammar (ITG) constraints. We consider maximum margin and conditional likelihood objectives, including the presentation of a new normal form grammar for canonicalizing derivations. Even for non-ITG sentence pairs, we show that it is possible learn ITG alignment models by simple relaxations of structured discriminative learning objectives. For efﬁciency, we describe a set of pruning techniques that together allow us to align sentences two orders of magnitude faster than naive bitext CKY parsing.
This innovative text presents computer programming as a unified discipline in a way that is both practical and scientifically sound. The book focuses on techniques of lasting value and explains them precisely in terms of a simple abstract machine. The book presents all major programming paradigms in a uniform framework that shows their deep relationships and how and where to use them together.After an introduction to programming concepts, the book presents both well-known and lesser-known computation models ("programming paradigms"). ...
Active Learning (AL) is typically initialized with a small seed of examples selected randomly. However, when the distribution of classes in the data is skewed, some classes may be missed, resulting in a slow learning progress. Our contribution is twofold: (1) we show that an unsupervised language modeling based technique is effective in selecting rare class examples, and (2) we use this technique for seeding AL and demonstrate that it leads to a higher learning rate. The evaluation is conducted in the context of word sense disambiguation. ...
In many computational linguistic scenarios, training labels are subjectives making it necessary to acquire the opinions of multiple annotators/experts, which is referred to as ”wisdom of crowds”. In this paper, we propose a new approach for modeling wisdom of crowds based on the Latent Mixture of Discriminative Experts (LMDE) model that can automatically learn the prototypical patterns and hidden dynamic among different experts. Experiments show improvement over state-of-the-art approaches on the task of listener backchannel prediction in dyadic conversations. ...
We present a novel probabilistic classiﬁer, which scales well to problems that involve a large number of classes and require training on large datasets. A prominent example of such a problem is language modeling. Our classiﬁer is based on the assumption that each feature is associated with a predictive strength, which quantiﬁes how well the feature can predict the class by itself. The predictions of individual features can then be combined according to their predictive strength, resulting in a model, whose parameters can be reliably and efﬁciently estimated.
This paper addresses the task of handling unknown terms in SMT. We propose using source-language monolingual models and resources to paraphrase the source text prior to translation. We further present a conceptual extension to prior work by allowing translations of entailed texts rather than paraphrases only. A method for performing this process efﬁciently is presented and applied to some 2500 sentences with unknown terms. Our experiments show that the proposed approach substantially increases the number of properly translated texts. ...
Large-scale discriminative machine translation promises to further the state-of-the-art, but has failed to deliver convincing gains over current heuristic frequency count systems. We argue that a principle reason for this failure is not dealing with multiple, equivalent translations. We present a translation model which models derivations as a latent variable, in both training and decoding, and is fully discriminative and globally optimised. Results show that accounting for multiple derivations does indeed improve performance.
We propose a cascaded linear model for joint Chinese word segmentation and partof-speech tagging. With a character-based perceptron as the core, combined with realvalued features such as language models, the cascaded model is able to efﬁciently utilize knowledge sources that are inconvenient to incorporate into the perceptron directly. Experiments show that the cascaded model achieves improved accuracies on both segmentation only and joint segmentation and part-of-speech tagging. On the Penn Chinese Treebank 5.0, we obtain an error reduction of 18.
We would like to draw attention to Hidden Markov Tree Models (HMTM), which are to our knowledge still unexploited in the ﬁeld of Computational Linguistics, in spite of highly successful Hidden Markov (Chain) Models. In dependency trees, the independence assumptions made by HMTM correspond to the intuition of linguistic dependency. Therefore we suggest to use HMTM and tree-modiﬁed Viterbi algorithm for tasks interpretable as labeling nodes of dependency trees.
This paper presents a new web mining scheme for parallel data acquisition. Based on the Document Object Model (DOM), a web page is represented as a DOM tree. Then a DOM tree alignment model is proposed to identify the translationally equivalent texts and hyperlinks between two parallel DOM trees. By tracing the identified parallel hyperlinks, parallel web documents are recursively mined. Compared with previous mining schemes, the benchmarks show that this new mining scheme improves the mining coverage, reduces mining bandwidth, and enhances the quality of mined parallel sentences.
This paper presents an unsupervised topic identiﬁcation method integrating linguistic and visual information based on Hidden Markov Models (HMMs). We employ HMMs for topic identiﬁcation, wherein a state corresponds to a topic and various features including linguistic, visual and audio information are observed. Our experiments on two kinds of cooking TV programs show the effectiveness of our proposed method.
This paper presents empirical studies and closely corresponding theoretical models of the performance of a chart parser exhaustively parsing the Penn Treebank with the Treebank’s own CFG grammar. We show how performance is dramatically affected by rule representation and tree transformations, but little by top-down vs. bottom-up strategies.
Transition-based dependency parsers are often forced to make attachment decisions at a point when only partial information about the relevant graph conﬁguration is available. In this paper, we describe a model that takes into account complete structures as they become available to rescore the elements of a beam, combining the advantages of transition-based and graph-based approaches. We also propose an efﬁcient implementation that allows for the use of sophisticated features and show that the completion model leads to a substantial increase in accuracy.
This paper introduces new methods based on exponential families for modeling the correlations between words in text and speech. While previous work assumed the effects of word co-occurrence statistics to be constant over a window of several hundred words, we show that their influence is nonstationary on a much smaller time scale.
Sentence Similarity is the process of computing a similarity score between two sentences. Previous sentence similarity work ﬁnds that latent semantics approaches to the problem do not perform well due to insufﬁcient information in single sentences. In this paper, we show that by carefully handling words that are not in the sentences (missing words), we can train a reliable latent variable model on sentences.