Atherothrombosis describes the occurrence of both
atherosclerosis and thrombosis in an artery, a
common feature of peripheral arterial disease.1
It is estimated that 1 in 16 U.S. residents who were
at least 40 years of age in 2000 (approximately 8.5
million persons) had peripheral arterial disease.
The International Biological Programme (IPB), a world-wide plan of coordinated
research on the biological basis of productivity and human welfare, covered terrestrial
and aquatic environments, and ran for almost a decade from 1964. Recognising
the need for guidance in methodology, the IBP arranged for the publication of a series
of handbooks on techniques with the aim of achieving comparability of results
all over theworld. One of these handbooks dealt with the study of marine benthos.
Although indexes may overlap, the output of an automatic indexer is generally presented as a fiat and unstructured list of terms. Our purpose is to exploit term overlap and embedding so as to yield a substantial qualitative and quantitative improvement in automatic indexing through concept combination. The increase in the volume of indexing is 10.5% for free indexing and 52.3% for controlled indexing. The resulting structure of the indexed corpus is a partial conceptual analysis.
This report presents results of a three-year evaluation of Learn and Serve America, Higher Education (LSAHE), a program sponsored by the Corporation for National and Community Service that aims to increase involvement in community service by higher education institutions and students. LSAHE emphasizes an approach to
This paper presents an innovative, complex approach to semantic verb classiﬁcation that relies on selectional preferences as verb properties. The probabilistic verb class model underlying the semantic classes is trained by a combination of the EM algorithm and the MDL principle, providing soft clusters with two dimensions (verb senses and subcategorisation frames with selectional preferences) as a result. A language-model-based evaluation shows that after 10 training iterations the verb class model results are above the baseline results.
This paper proposes to solve the bottleneck of finding training data for word sense disambiguation (WSD) in the domain of web queries, where a complete set of ambiguous word senses are unknown. In this paper, we present a combination of active learning and semi-supervised learning method to treat the case when positive examples, which have an expected word sense in web search result, are only given. The novelty of our approach is to use “pseudo negative examples” with reliable confidence score estimated by a classifier trained with positive and unlabeled examples.
The paper describes two parsing schemes: a shallow approach based on machine learning and a cascaded ﬁnite-state parser with a hand-crafted grammar. It discusses several ways to combine them and presents evaluation results for the two individual approaches and their combination. An underspeciﬁcation scheme for the output of the ﬁnite-state parser is introduced and shown to improve performance.
A hybrid system is described which combines the strength of manual rulewriting and statistical learning, obtaining results superior to both methods if applied separately. The combination of a rule-based system and a statistical one is not parallel but serial: the rule-based system performing partial disambiguation with recall close to 100% is applied ﬁrst, and a trigram HMM tagger runs on its results. An experiment in Czech tagging has been performed with encouraging results.
In this paper, we study the problem of combining a Chinese thesaurus with a Chinese dictionary by linking the word entries in the thesaurus with the word senses in the dictionary, and propose a similar word strategy to solve the problem. The method is based on the definitions given in the dictionary, but without any syntactic parsing or sense disambiguation on them at all.
We predict discourse segment boundaries from linguistic features of utterances, using a corpus of spoken narratives as data. We present two methods for developing segmentation algorithms from training data: hand tuning and machine learning. When multiple types of features are used, results approach human performance on an independent test set (both methods), and using cross-validation (machine learning).
Kernel based methods dominate the current trend for various relation extraction tasks including protein-protein interaction (PPI) extraction. PPI information is critical in understanding biological processes. Despite considerable efforts, previously reported PPI extraction results show that none of the approaches already known in the literature is consistently better than other approaches when evaluated on different benchmark PPI corpora.
A back -propagation artificial neural net has been trained to estimate the activity values of a set of 18 N-alkyl-N-acyl- -aminoamide derivatives from the results of molecular mechanics and RHF/PM3/SCF MO semi-empirical calculations. The input descriptors include molecular properties such as the partition coefficient P, 3d structure dependent parameters, charge dependent parameters, and topological descriptors.
In this paper we show how to train statistical machine translation systems on reallife tasks using only non-parallel monolingual data from two languages. We present a modiﬁcation of the method shown in (Ravi and Knight, 2011) that is scalable to vocabulary sizes of several thousand words. On the task shown in (Ravi and Knight, 2011) we obtain better results with only 5% of the computational effort when running our method with an n-gram language model.
Rapid and inexpensive techniques for automatic transcription of speech have the potential to dramatically expand the types of content to which information retrieval techniques can be productively applied, but limitations in accuracy and robustness must be overcome before that promise can be fully realized. Combining retrieval results from systems built on various errorful representations of the same collection offers some potential to address these challenges.
This paper proposes a novel method that exploits multiple resources to improve statistical machine translation (SMT) based paraphrasing. In detail, a phrasal paraphrase table and a feature function are derived from each resource, which are then combined in a log-linear SMT model for sentence-level paraphrase generation. Experimental results show that the SMT-based paraphrasing model can be enhanced using multiple resources. The phrase-level and sentence-level precision of the generated paraphrases are above 60% and 55%, respectively.
Combining word alignments trained in two translation directions has mostly relied on heuristics that are not directly motivated by intended applications. We propose a novel method that performs combination as an optimization process. Our algorithm explicitly maximizes the effectiveness function with greedy search for phrase table training or synchronized grammar extraction. Experimental results show that the proposed method leads to signiﬁcantly better translation quality than existing methods. ...
This paper presents three methods that can be used to recognize paraphrases. They all employ string similarity measures applied to shallow abstractions of the input sentences, and a Maximum Entropy classiﬁer to learn how to combine the resulting features. Two of the methods also exploit WordNet to detect synonyms and one of them also exploits a dependency parser. We experiment on two datasets, the MSR paraphrasing corpus and a dataset that we automatically created from the MTC corpus. Our system achieves state of the art or better results. ...
Statistical machine translation is quite robust when it comes to the choice of input representation. It only requires consistency between training and testing. As a result, there is a wide range of possible preprocessing choices for data used in statistical machine translation. This is even more so for morphologically rich languages such as Arabic. In this paper, we study the effect of different word-level preprocessing schemes for Arabic on the quality of phrase-based statistical machine translation. ...
We introduce the possibility of combining lexical association measures and present empirical results of several methods employed in automatic collocation extraction. First, we present a comprehensive summary overview of association measures and their performance on manually annotated data evaluated by precision-recall graphs and mean average precision. Second, we describe several classiﬁcation methods for combining association measures, followed by their evaluation and comparison with individual measures. ...
We present a novel hybrid approach for Word Sense Disambiguation (WSD) which makes use of a relational formalism to represent instances and background knowledge. It is built using Inductive Logic Programming techniques to combine evidence coming from both sources during the learning process, producing a rule-based WSD model. We experimented with this approach to disambiguate 7 highly ambiguous verbs in EnglishPortuguese translation.