  • The Head-driven Phrase Structure Grammar project (HPSG) is an English language database query system under development at Hewlett-Packard Laboratories. Unlike other product-oriented efforts in the natural language understanding field, the HPSG system was designed and implemented by linguists on the basis of recent theoretical developments. But, unlike other implementations of linguistic theories, this system is not a toy, as it deals with a variety of practical problems not covered in the theoretical literature. ...

  • We propose an improved, bottom-up method for converting CCG derivations into PTB-style phrase structure trees. In contrast with past work (Clark and Curran, 2009), which used simple transductions on category pairs, our approach uses richer transductions attached to single categories. Our conversion preserves more sentences under round-trip conversion (51.1% vs. 39.6%) and is more robust.

  • Phrase-structure grammars are an effective representation for important syntactic and semantic aspects of natural languages, but are computationally too demanding for use as language models in real-time speech recognition. An algorithm is described that computes finite-state approximations for context-free grammars and equivalent augmented phrase-structure grammar formalisms. The approximation is exact for certain context-free grammars generating regular languages, including all left-linear and right-linear context-free grammars. ...

  • This abstract outlines a parser implemented in a connectionist model of short-term memory and reasoning. This connectionist architecture, proposed by Shastri in [Shastri and Ajjanagadde, 1990], preserves the symbolic interpretation of the information it stores and manipulates, but does its computations with nodes which have roughly the same computational properties as neurons. The parser recovers the phrase structure of a sentence incrementally from beginning to end and is intended to be a plausible model of human sentence processing. ...

  • There is renewed interest in examining the descriptive as well as generative power of phrase structure grammars. The primary motivation has come from the recent investigations in alternatives to transformational grammars [e.g., 1, 2, 3, 4]. We will present several results and ideas related to phrase structure trees which have significant relevance to computational linguistics. We want to accomplish several objectives in this paper. ...

  • The formalism consists of two parts: 1. A declarative description phrase-structures and their translation. of basic associated additional syntactic semantic grammar. PSGs have efficient algorithms for parsing [3]. In a sense, all of the work of transformations has been pushed off into a pre-processing phase where new grammar rules are derived.

  • This paper describes a natural language processing system implemented at Hewlett-Packard's Computer Research Center. The system's main components are: a Generalized Phrase Structure Grammar (GPSG); a top-down parser; a logic transducer that outputs a first-order logical representation; and a "disambiguator" that uses sortal information to convert "normal-form" first-order logical expressions into the query language for HIRE, a relational database hosted in the SPHERE system.

  • Generalised phrase structure grammars (GPSGs) appear to offer a means by which the syntactic properties of natural languages may be very concisely described. The main reason for this is that the GPSG framework allows you to state a variety of meta-grammatical rules which generate new rules from old ones, so that you can specify rules with a wide variety of realisations via a very small number of explicit statements.

  • In this paper we present an efficient context-free (CF), bottom-up, nondeterministic parser. It is an extension of the ICA (Immediate Constituent Analysis) parser proposed by Grishman (1976), and its major improvements are described. It has been designed to run Augmented Phrase-Structure Grammars (APSG) and performs semantic interpretation in parallel with syntactic analysis.

  • Statistical parsing of noun phrase (NP) structure has been hampered by a lack of gold-standard data. This is a significant problem for CCGbank, where binary branching NP derivations are often incorrect, a result of the automatic conversion from the Penn Treebank. (N (N/N lung) (N (N/N cancer) (N deaths))) This structure is correct for most English NPs and is the best solution that doesn’t require manual reannotation. However, the resulting derivations often contain errors.

  • The Penn Treebank does not annotate within base noun phrases (NPs), committing only to flat structures that ignore the complexity of English NPs. This means that tools trained on Treebank data cannot learn the correct internal structure of NPs. This paper details the process of adding gold-standard bracketing within each noun phrase in the Penn Treebank. We then examine the consistency and reliability of our annotations. Finally, we use this resource to determine NP structure using several statistical approaches, thus demonstrating the utility of the corpus.

  • Most statistical machine translation systems employ a word-based alignment model. In this paper we demonstrate that word-based alignment is a major cause of translation errors. We propose a new alignment model based on shallow phrase structures, which can be automatically acquired from a parallel corpus. This new model achieved over 10% error reduction for our spoken language translation task.

  • Flat noun phrase structure was, up until recently, the standard in annotation for the Penn Treebanks. With the recent addition of internal noun phrase annotation, dependency parsing and applications down the NLP pipeline are likely affected. Some machine translation systems, such as TectoMT, use deep syntax as a language transfer layer. It is proposed that changes to the noun phrase dependency parse will have a cascading effect down the NLP pipeline and in the end, improve machine translation output, even with a reduction in parser accuracy that the noun phrase structure might cause.

  • Ristad (1986a) examines the computational complexity of two components of the GPSG formal system (metarules and the feature system) and shows how each of these systems can lead to computational intractability. Ristad also proves that the universal recognition problem for GPSGs is EXP-POLY hard, and intractable. In other words, the fastest recognition algorithm for GPSGs can take more than exponential time. These results may appear surprising, given GPSG's weak context-free generative power. ...

  • The lexicon now plays a central role in our implementation of a Head-driven Phrase Structure Grammar (HPSG), given the massive relocation into the lexicon of linguistic information that was carried by the phrase structure rules in the old GPSG system. HPSG's grammar contains fewer than twenty (very general) rules; its predecessor required over 350 to achieve roughly the same coverage. This simplification of the grammar is made possible by an enrichment of the structure and content of lexical entries, using both inheritance mechanisms and lexical rules to represent the linguistic...

  • Syntactic reordering of the source language to better match the phrase structure of the target language has been shown to improve the performance of phrase-based Statistical Machine Translation. This paper applies syntactic reordering to English-to-Arabic translation. It introduces reordering rules, and motivates them linguistically. It also studies the effect of combining reordering with Arabic morphological segmentation, a preprocessing technique that has been shown to improve Arabic-English and English-Arabic translation.

  • The following document, Comparison Structure Words and Phrases, will help you better understand the structure of words and phrases by comparing their similarities and differences. With a clear presentation and accompanying illustrative examples, the document will help you grasp the material more effectively.


  • Hierarchical phrase-based models are attractive because they provide a consistent framework within which to characterize both local and long-distance reorderings, but they also make it difficult to distinguish many implausible reorderings from those that are linguistically plausible. Rather than appealing to annotation-driven syntactic modeling, we address this problem by observing the influential role of function words in determining syntactic structure, and introducing soft constraints on function word relationships as part of a standard log-linear hierarchical phrase-based model. ...

  • In this paper we describe a novel data structure for phrase-based statistical machine translation which allows for the retrieval of arbitrarily long phrases while simultaneously using less memory than is required by current decoder implementations. We detail the computational complexity and average retrieval times for looking up phrase translations in our suffix array-based data structure. We show how sampling can be used to reduce the retrieval time by orders of magnitude with no loss in translation quality. ...

  • While various aspects of syntactic structure have been shown to bear on the determination of phrase-level prosody, the text-to-speech field has lacked a robust working system to test the possible relations between syntax and prosody. We describe an implemented system which uses the deterministic parser Fidditch to create the input for a set of prosody rules.
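
The abstract above on a suffix array-based data structure for phrase-based translation describes looking up arbitrarily long phrases and sampling their occurrences. As a rough illustration of that general idea only (not the authors' implementation), the sketch below builds a token-level suffix array over a toy corpus and finds every occurrence of a phrase by binary search. The function names, the `max_samples` parameter, and the evenly spaced sampling scheme are all assumptions made for this example.

```python
def build_suffix_array(tokens):
    """Return all suffix start positions, sorted by the token sequence they begin.

    Naive construction (sorting slices); fine for a toy corpus, not for real data.
    """
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def _bound(tokens, suffix_array, phrase, strict):
    """Binary search over suffixes truncated to len(phrase) tokens.

    strict=False: first index whose truncated suffix is >= phrase.
    strict=True:  first index whose truncated suffix is >  phrase.
    """
    n = len(phrase)
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        prefix = tokens[start:start + n]
        if prefix < phrase or (strict and prefix == phrase):
            lo = mid + 1
        else:
            hi = mid
    return lo


def find_phrase(tokens, suffix_array, phrase, max_samples=None):
    """Return sorted corpus positions where `phrase` occurs.

    If max_samples is given (hypothetical parameter), return at most that many
    positions, taken at evenly spaced ranks within the matching range.
    """
    lo = _bound(tokens, suffix_array, phrase, strict=False)
    hi = _bound(tokens, suffix_array, phrase, strict=True)
    hits = suffix_array[lo:hi]
    if max_samples is not None and len(hits) > max_samples:
        step = len(hits) / max_samples
        hits = [hits[int(k * step)] for k in range(max_samples)]
    return sorted(hits)


corpus = "the cat sat on the mat and the cat slept".split()
sa = build_suffix_array(corpus)
print(find_phrase(corpus, sa, ["the", "cat"]))
print(find_phrase(corpus, sa, ["the"], max_samples=2))
```

Because all suffixes that begin with a given phrase are contiguous in the sorted array, two binary searches are enough to delimit every occurrence regardless of phrase length. Capping the number of positions returned is one simple way to bound the work done per lookup for very frequent phrases.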

