Automatically parsed text

Showing 1-19 of 19 results for Automatically parsed text
  • This paper describes a Verb Phrase Ellipsis (VPE) detection system, built for robustness, accuracy and domain independence. The system is corpus-based, and uses machine learning techniques on free text that has been automatically parsed. Tested on a mixed corpus comprising a range of genres, the system achieves a 70% F1-score. This system is designed as the first stage of a complete VPE resolution system that takes free text as input, detects VPEs, and proceeds to find the antecedents and resolve them. ...

    pdf6p bunbo_1 17-04-2013 19 1   Download

  • In this paper we describe a new technique for parsing free text: a transformational grammar is automatically learned that is capable of accurately parsing text into binary-branching syntactic trees with nonterminals unlabelled. The algorithm works by beginning in a very naive state of knowledge about phrase structure. By repeatedly comparing the results of bracketing in the current state to proper bracketing provided in the training corpus, the system learns a set of simple structural transformations that can be applied to reduce error.

    pdf7p bunmoc_1 20-04-2013 33 2   Download

  • This paper proposes a dependency parsing method that uses bilingual constraints to improve the accuracy of parsing bilingual texts (bitexts). In our method, a target-side tree fragment that corresponds to a source-side tree fragment is identified via word alignment and mapping rules that are automatically learned. Then it is verified by checking the subtree list that is collected from large-scale automatically parsed data on the target side.

    pdf9p hongdo_1 12-04-2013 26 1   Download

  • We propose a simple generative, syntactic language model that conditions on overlapping windows of tree context (or treelets) in the same way that n-gram language models condition on overlapping windows of linear context. We estimate the parameters of our model by collecting counts from automatically parsed text using standard n-gram language model estimation techniques, allowing us to train a model on over one billion tokens of data using a single machine in a matter of hours.

    pdf10p nghetay_1 07-04-2013 27 2   Download

  • This paper shows how finite approximations of long distance dependency (LDD) resolution can be obtained automatically for wide-coverage, robust, probabilistic Lexical-Functional Grammar (LFG) resources acquired from treebanks. We extract LFG subcategorisation frames and paths linking LDD reentrancies from f-structures generated automatically for the Penn-II treebank trees and use them in an LDD resolution algorithm to parse new text.

    pdf8p bunbo_1 17-04-2013 44 2   Download

  • In this paper we deal with several kinds of anaphora in unrestricted texts. These kinds of anaphora are pronominal references, surface-count anaphora and one-anaphora. To resolve these anaphors we work on the output of a part-of-speech tagger, to which we automatically apply partial parsing based on the Slot Unification Grammar formalism, which has been implemented in Prolog. We only use the following kinds of information: lexical (the lemma of each word), morphological (person, number, gender) and syntactic. ...

    pdf7p bunrieu_1 18-04-2013 29 2   Download

  • Bootstrapping semantics from text is one of the greatest challenges in natural language learning. We first define a word similarity measure based on the distributional pattern of words. The similarity measure allows us to construct a thesaurus using a parsed corpus. We then present a new evaluation methodology for the automatically constructed thesaurus. The evaluation results show that the thesaurus is significantly closer to WordNet than Roget Thesaurus is.

    pdf7p bunrieu_1 18-04-2013 31 2   Download

  • A method is presented for automatically augmenting the bilingual lexicon of an existing Machine Translation system, by extracting bilingual entries from aligned bilingual text. The proposed method only relies on the resources already available in the MT system itself. It is based on the use of bilingual lexical templates to match the terminal symbols in the parses of the aligned sentences.

    pdf8p bunrieu_1 18-04-2013 32 2   Download

  • The Constituent Likelihood Automatic Word-tagging System (CLAWS) was originally designed for the low-level grammatical analysis of the million-word LOB Corpus of English text samples. CLAWS does not attempt a full parse, but uses a first-order Markov model of language to assign word-class labels to words. CLAWS can be modified to detect grammatical errors, essentially by flagging unlikely word-class transitions in the input text.

    pdf8p buncha_1 08-05-2013 29 2   Download

  • This paper proposes a data-driven method for concept-to-text generation, the task of automatically producing textual output from non-linguistic input. A key insight in our approach is to reduce the tasks of content selection (“what to say”) and surface realization (“how to say”) into a common parsing problem.

    pdf10p nghetay_1 07-04-2013 44 1   Download

  • MorP is a system for automatic word class assignment on the basis of surface features. It has a very small lexicon of form words (%o entries), and for the rest works entirely on morphological and configurational patterns. This makes it robust and fast, and in spite of the (deliberate) restrictedness of the system, its performance reaches an average accuracy level above 91% when run on unrestricted Swedish text. Keywords: parsing, morphology. The development of the parser to be presented has been supported by the Swedish Research Council for the Humanities. ...

    pdf6p buncha_1 08-05-2013 30 1   Download

  • A problem frequently encountered in the automatic parsing of Russian texts is the correct structuring of prepositional phrases in sentences. Studies of text samples indicate that, when other criteria are absent, the syntactic governors of prepositions can be determined with a high degree of accuracy by reference to the relative position and part-of-speech of elements in the clausal environment.

    pdf0p nghetay_1 06-04-2013 38 2   Download

  • This paper proposes a forest-based tree sequence to string translation model for syntax-based statistical machine translation, which automatically learns tree sequence to string translation rules from word-aligned, source-side-parsed bilingual texts. The proposed model leverages the strengths of both tree sequence-based and forest-based translation models.

    pdf9p hongphan_1 14-04-2013 36 2   Download

  • Automatic acquisition of translation rules from parallel sentence-aligned text takes a variety of forms. Some machine translation (MT) systems treat aligned sentences as unstructured word sequences. Other systems, including our own ((Grishman, 1994) and (Meyers et al., 1996)), syntactically analyze sentences (parse) before acquiring transfer rules (cf. (Kaji et al., 1992), (Matsumoto et al., 1993), and (Kitamura and Matsumoto, 1995)). This has the advantage of acquiring structural as well as lexical correspondences. ...

    pdf5p bunrieu_1 18-04-2013 43 2   Download

  • The paper describes the development of software for automatic grammatical analysis of unrestricted, unedited English text at the Unit for Computer Research on the English Language (UCREL) at the University of Lancaster. The work is currently funded by IBM and carried out in collaboration with colleagues at IBM UK (Winchester) and IBM Yorktown Heights. The paper will focus on the lexicon component of the word tagging system, the UCREL grammar, the databanks of parsed sentences, and the tools that have been...

    pdf6p bungio_1 03-05-2013 33 2   Download

  • Mobile voice-enabled search is emerging as one of the most popular applications abetted by the exponential growth in the number of mobile devices. The automatic speech recognition (ASR) output of the voice query is parsed into several fields. Search is then performed on a text corpus or a database. In order to improve the robustness of the query parser to noise in the ASR output, in this paper we investigate two different methods for query parsing.

    pdf8p bunthai_1 06-05-2013 29 2   Download

  • The automatic extraction of relations between entities expressed in natural language text is an important problem for IR and text understanding. In this paper we show how different kernels for parse trees can be combined to improve the relation extraction quality. On a public benchmark dataset the combination of a kernel for phrase grammar parse trees and for dependency parse trees outperforms all known tree kernel approaches alone, suggesting that both types of trees contain complementary information for relation extraction. ...

    pdf4p hongphan_1 15-04-2013 28 1   Download

  • We present a novel translation model based on tree-to-string alignment template (TAT) which describes the alignment between a source parse tree and a target string. A TAT is capable of generating both terminals and non-terminals and performing reordering at both low and high levels. The model is linguistically syntax-based because TATs are extracted automatically from word-aligned, source-side-parsed parallel texts. To translate a source sentence, we first employ a parser to produce a source parse tree and then apply TATs to transform the tree into a target string. ...

    pdf8p hongvang_1 16-04-2013 28 1   Download

  • We present a weakly supervised approach to automatic Ontology Population from text and compare it with two other unsupervised approaches. In our experiments we populate a part of our ontology of Named Entities. We considered two high-level categories - geographical locations and person names - and ten sub-classes for each category. For each sub-class, from a list of training examples and a syntactically parsed corpus, we automatically learn a syntactic model - a set of weighted syntactic features, i.e.

    pdf8p bunthai_1 06-05-2013 33 1   Download


