Can we automatically compose a large set of Wiktionaries and translation dictionaries to yield a massive, multilingual dictionary whose coverage is substantially greater than that of any of its constituent dictionaries? The composition of multiple translation dictionaries leads to a transitive inference problem: if word A translates to word B which in turn translates to word C, what is the probability that C is a translation of A? The paper introduces a novel algorithm that solves this problem for 10,000,000 words in more than 1,000 languages. ...
Probabilistic inference is an attractive approach to uncertain reasoning and empirical
learning in artificial intelligence. Computational difficulties arise, however,
because probabilistic models with the necessary realism and
exibility lead to complex
distributions over high-dimensional spaces.
We present a novel approach to deciding whether two sentences hold a paraphrase relationship. We employ a generative model that generates a paraphrase of a given sentence, and we use probabilistic inference to reason about whether two sentences share the paraphrase relationship. The model cleanly incorporates both syntax and lexical semantics using quasi-synchronous dependency grammars (Smith and Eisner, 2006).
Recognizing entailment at the lexical level is an important and commonly-addressed component in textual inference. Yet, this task has been mostly approached by simpliﬁed heuristic methods. This paper proposes an initial probabilistic modeling framework for lexical entailment, with suitable EM-based parameter estimation. Our model considers prominent entailment factors, including differences in lexical-resources reliability and the impacts of transitivity and multiple evidence.
This paper presents an incremental probabilistic learner that models the acquistion of syntax and semantics from a corpus of child-directed utterances paired with possible representations of their meanings. These meaning representations approximate the contextual input available to the child; they do not specify the meanings of individual words or syntactic derivations. The learner then has to infer the meanings and syntactic properties of the words in the input along with a parsing model.
Variational EM has become a popular technique in probabilistic NLP with hidden variables. Commonly, for computational tractability, we make strong independence assumptions, such as the meanﬁeld assumption, in approximating posterior distributions over hidden variables. We show how a looser restriction on the approximate posterior, requiring it to be a mixture, can help inject prior knowledge to exploit soft constraints during the variational E-step. We show that empirically, injecting prior knowledge improves performance on an unsupervised Chinese grammar induction task. ...
Despite the great technological advancement experienced in recent years,
Programmable Logic Controllers (PLC) are still used in many applications from the real
world and still play a central role in infrastructure of industrial automation. PLC operate in
the factory-floor level and are responsible typically for implementing logical control,
regulatory control strategies, such as PID and fuzzy-based algorithms, and safety logics.
Usually PLC are interconnected with the supervision level through communication
network, such as Ethernet networks, in order to work in an integrated form....
Tuyển tập báo cáo các nghiên cứu khoa học quốc tế ngành hóa học dành cho các bạn yêu hóa học tham khảo đề tài: Research Article Inference of a Probabilistic Boolean Network from a Single Observed Temporal Sequence
We predict entity type distributions in Web search queries via probabilistic inference in graphical models that capture how entitybearing queries are generated. We jointly model the interplay between latent user intents that govern queries and unobserved entity types, leveraging observed signals from query formulations and document clicks. We apply the models to resolve entity types in new queries and to assign prior type distributions over an existing knowledge base.
Identifying background (context) information in scientiﬁc articles can help scholars understand major contributions in their research area more easily. In this paper, we propose a general framework based on probabilistic inference to extract such context information from scientiﬁc papers. We model the sentences in an article and their lexical similarities as a Markov Random Field tuned to detect the patterns that context data create, and employ a Belief Propagation mechanism to detect likely context sentences. ...
We show for both English POS tagging and Chinese word segmentation that with proper representation, large number of deterministic constraints can be learned from training examples, and these are useful in constraining probabilistic inference. For tagging, learned constraints are directly used to constrain Viterbi decoding.
This paper establishes a connection between two apparently very different kinds of probabilistic models. Latent Dirichlet Allocation (LDA) models are used as “topic models” to produce a lowdimensional representation of documents, while Probabilistic Context-Free Grammars (PCFGs) deﬁne distributions over trees. The paper begins by showing that LDA topic models can be viewed as a special kind of PCFG, so Bayesian inference for PCFGs can be used to infer Topic Models as well. Adaptor Grammars (AGs) are a hierarchical, non-parameteric Bayesian extension of PCFGs. ...
We present a probabilistic topic model for jointly identifying properties and attributes of social media review snippets. Our model simultaneously learns a set of properties of a product and captures aggregate user sentiments towards these properties. This approach directly enables discovery of highly rated or inconsistent properties of a product. Our model admits an efﬁcient variational meanﬁeld inference algorithm which can be parallelized and run on large snippet collections.
This paper introduces a machine learning method based on bayesian networks which is applied to the mapping between deep semantic representations and lexical semantic resources. A probabilistic model comprising Minimal Recursion Semantics (MRS) structures and lexicalist oriented semantic features is acquired. Lexical semantic roles enriching the MRS structures are inferred, which are useful to improve the accuracy of deep semantic parsing.
There are two decoding algorithms essential to the area of natural language processing. One is the Viterbi algorithm for linear-chain models, such as HMMs or CRFs. The other is the CKY algorithm for probabilistic context free grammars. However, tasks such as noun phrase chunking and relation extraction seem to fall between the two, neither of them being the best ﬁt. Ideally we would like to model entities and relations, with two layers of labels.
We consider the problem of predictive inference for probabilistic binary sequence labeling models under F-score as utility. For a simple class of models, we show that the number of hypotheses whose expected Fscore needs to be evaluated is linear in the sequence length and present a framework for efﬁciently evaluating the expectation of many common loss/utility functions, including the F-score. This framework includes both exact and faster inexact calculation methods.
Most current statistical natural language processing models use only local features so as to permit dynamic programming in inference, but this makes them unable to fully account for the long distance structure that is prevalent in language use. We show how to solve this dilemma with Gibbs sampling, a simple Monte Carlo method used to perform approximate inference in factored probabilistic models.