Collocation extraction methods

Showing 1-9 of 9 results for Collocation extraction methods
  • This paper presents the current state of an ongoing study of collocations, an essential linguistic phenomenon with a wide range of applications in natural language processing. The core of the work is an empirical evaluation of a comprehensive list of automatic collocation extraction methods using precision-recall measures, together with a proposal for a new approach that integrates multiple basic methods with statistical classification.

  • This paper focuses on the use of advanced techniques of text analysis as support for collocation extraction. A hybrid system is presented that combines statistical methods and multilingual parsing for detecting accurate collocational information from English, French, Spanish and Italian corpora. The advantage of relying on full parsing over using a traditional window method (which ignores the syntactic information) is first theoretically motivated, then empirically validated by a comparative evaluation experiment. ...

  • We introduce the possibility of combining lexical association measures and present empirical results for several methods employed in automatic collocation extraction. First, we present a comprehensive overview of association measures and their performance on manually annotated data, evaluated by precision-recall graphs and mean average precision. Second, we describe several classification methods for combining association measures, followed by their evaluation and comparison with individual measures. ...

  • Automatically acquiring synonymous collocation pairs from corpora is a challenging task. For this task we can, in general, draw on a large monolingual corpus and/or a very limited bilingual corpus. Methods that use monolingual corpora alone or bilingual corpora alone are inadequate because of low precision or low coverage. In this paper, we propose a method that uses both resources to reach an optimal compromise between precision and coverage.

  • The paper describes ongoing work on the evaluation of methods for extracting collocation candidates from large text corpora. Our research is based on a German treebank corpus used as gold standard. Results are available for adjective+noun pairs, which proved to be a comparatively easy extraction task. We plan to extend the evaluation to other types of collocations (e.g., PP+verb pairs).

  • In this paper, we describe a method for automatically retrieving collocations from large text corpora. The method retrieves collocations in two stages: (1) extracting strings of characters as units of collocations, and (2) extracting recurrent combinations of these strings, in accordance with their word order in a corpus, as collocations. With this method, a wide range of collocations, especially domain-specific collocations, can be retrieved.

  • This paper introduces a new method for identifying candidate phrasal terms (also known as multiword units) which applies a nonparametric, rank-based heuristic measure. Evaluation of this measure, the mutual rank ratio metric, shows that it produces better results than standard statistical measures when applied to this task. The ordinary vocabulary of a language like English contains thousands of phrasal terms: multiword lexical units including compound nouns, technical terms, idioms, and fixed collocations. ...

  • Collocation translation is important for machine translation and many other NLP tasks. Unlike previous methods using bilingual parallel corpora, this paper presents a new method for acquiring collocation translations by making use of monolingual corpora and linguistic knowledge. First, dependency triples are extracted from Chinese and English corpora with dependency parsers.

  • This paper describes an unsupervised learning algorithm for disambiguating verbal word senses using term weight learning. In our method, collocations that characterise each sense are extracted using similarity-based estimation. Term weight learning is then performed on the results: the parameters of term weighting are estimated so as to maximise the collocations that characterise each sense and minimise the other collocations. The results of the experiment demonstrate the effectiveness of the method. ...
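
Several of the abstracts above rank collocation candidates with lexical association measures and evaluate the ranking by precision-based scores. The following is a minimal Python sketch of that general idea, not the method of any listed paper. It assumes pointwise mutual information (PMI) over adjacent word pairs as the association measure and average precision against a hand-made gold set as the evaluation. The function names, the `min_count` threshold, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    Pairs seen fewer than `min_count` times are skipped, because PMI is
    unreliable for rare events. The threshold is an illustrative choice.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


def average_precision(ranked, gold):
    """Average precision of a ranked candidate list against a gold set.

    Precision is accumulated at each rank where a gold item appears and
    divided by the size of the gold set, so gold items missing from the
    ranking count as misses.
    """
    if not gold:
        return 0.0
    hits = 0
    total = 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in gold:
            hits += 1
            total += hits / rank
    return total / len(gold)


if __name__ == "__main__":
    # Toy corpus and gold standard, invented purely for illustration.
    corpus = ("strong tea and strong tea with the milk and "
              "the cup of strong tea and the milk").split()
    gold = {("strong", "tea")}
    scores = pmi_scores(corpus)
    ranked = sorted(scores, key=scores.get, reverse=True)
    for pair in ranked:
        print(pair, round(scores[pair], 3))
    print("average precision:", average_precision(ranked, gold))
```

Averaging `average_precision` over several gold sets or data splits would give a mean average precision figure. Other association measures could be substituted by changing only the scoring line in `pmi_scores`.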
