# Multinomial distribution

A New Feature Selection Score for Multinomial Naive Bayes Text Classification Based on KL-Divergence

We define a new feature selection score for text classification based on the KL-divergence between the distribution of words in training documents and their classes. The score favors words that have a similar distribution in documents of the same class but different distributions in documents of different classes. Experiments on two standard data sets indicate that the new method outperforms mutual information, especially for smaller categories.
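To make the idea of a divergence-based word score concrete, here is a minimal sketch. It is not the score defined in the paper: it assumes Laplace-smoothed multinomial estimates of P(w | c) and scores each word by its contribution to the class-prior-weighted KL divergence between the class word distributions and their mixture. The function names, the smoothing parameter `alpha`, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def class_word_distributions(docs, labels, alpha=1.0):
    """Estimate Laplace-smoothed multinomial P(w | c) from tokenized documents."""
    vocab = sorted({w for doc in docs for w in doc})
    counts = {c: Counter() for c in set(labels)}
    for doc, c in zip(docs, labels):
        counts[c].update(doc)
    dists = {}
    for c, cnt in counts.items():
        total = sum(cnt.values()) + alpha * len(vocab)
        dists[c] = {w: (cnt[w] + alpha) / total for w in vocab}
    return vocab, dists


def kl_word_scores(docs, labels, alpha=1.0):
    """Score each word by sum_c P(c) * P(w|c) * log(P(w|c) / P(w)).

    P(w) is the mixture sum_c P(c) * P(w|c), so every score is non-negative
    and is zero when the word is equally likely in all classes.
    This is an illustrative score, not the one proposed in the paper.
    """
    vocab, dists = class_word_distributions(docs, labels, alpha)
    n = len(labels)
    prior = {c: labels.count(c) / n for c in dists}
    scores = {}
    for w in vocab:
        p_w = sum(prior[c] * dists[c][w] for c in dists)
        scores[w] = sum(
            prior[c] * dists[c][w] * math.log(dists[c][w] / p_w) for c in dists
        )
    return scores


# Toy corpus (hypothetical): "the" occurs equally often in both classes.
docs = [
    ["goal", "match", "the"],
    ["match", "team", "the"],
    ["vote", "law", "the"],
    ["law", "party", "the"],
]
labels = ["sport", "sport", "politics", "politics"]

scores = kl_word_scores(docs, labels)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy corpus the class-neutral word "the" receives the lowest score, while words concentrated in one class, such as "match" and "law", rank highest. A feature selector built on such a score would keep the top-ranked words.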

Identifying Word Translations from Comparable Corpora Using Latent Topic Models

A topic model outputs a set of multinomial distributions over words for each topic. In this paper, we investigate the value of bilingual topic models, specifically a bilingual Latent Dirichlet Allocation model, for finding translations of terms in comparable corpora without using any linguistic resources. Experiments on a document-aligned English-Italian Wikipedia corpus confirm that the developed methods, which use only knowledge from word-topic distributions, outperform methods based on similarity measures in the original word-document space.
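One way to picture how word-topic distributions alone can suggest translations is sketched below. It is an assumption-laden illustration rather than the paper's method: it supposes that both languages share the same topics, assumes a uniform topic prior to turn P(w | z) into a topic profile P(z | w), and ranks target words by cosine similarity of those profiles. The function names and the probability tables are hypothetical.

```python
import math


def topic_profile(word, word_given_topic):
    """Return P(z | word) from per-topic P(w | z), assuming a uniform topic prior."""
    weights = [dist.get(word, 0.0) for dist in word_given_topic]
    total = sum(weights)
    return [x / total for x in weights] if total > 0 else weights


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def rank_translations(source_word, src_topics, tgt_topics):
    """Rank target-language words by similarity of their topic profiles."""
    src_vec = topic_profile(source_word, src_topics)
    candidates = {w for dist in tgt_topics for w in dist}
    scored = [(cosine(src_vec, topic_profile(w, tgt_topics)), w) for w in candidates]
    # Highest similarity first; ties broken alphabetically for determinism.
    return [w for _, w in sorted(scored, key=lambda t: (-t[0], t[1]))]


# Hypothetical P(w | z) tables for two topics shared across languages.
english_topics = [
    {"football": 0.6, "team": 0.3, "law": 0.1},
    {"law": 0.7, "vote": 0.2, "team": 0.1},
]
italian_topics = [
    {"calcio": 0.6, "squadra": 0.3, "legge": 0.1},
    {"legge": 0.7, "voto": 0.2, "squadra": 0.1},
]

best = rank_translations("law", english_topics, italian_topics)
print(best)
```

With these toy tables, "legge" has the same topic profile as "law" and is ranked first. No dictionary or other linguistic resource is consulted; the ranking depends only on the per-topic word distributions.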