  • This paper describes a syntactic representation for modeling speech repairs. This representation makes use of a right corner transform of syntax trees to produce a tree representation in which speech repairs require very few special syntax rules, making better use of training data. PCFGs trained on syntax trees using this model achieve high accuracy on the standard Switchboard parsing task.

  • We present a model for sentence compression that uses a discriminative large-margin learning framework coupled with a novel feature set defined on compressed bigrams as well as deep syntactic representations provided by auxiliary dependency and phrase-structure parsers. The parsers are trained out-of-domain and contain a significant amount of noise. We argue that the discriminative nature of the learning algorithm allows the model to learn weights relative to any noise in the feature set to optimize compression accuracy directly.

  • We study the impact of syntactic and shallow semantic information in automatic classification of questions and answers and answer re-ranking. We define (a) new tree structures based on shallow semantics encoded in Predicate Argument Structures (PASs) and (b) new kernel functions to exploit the representational power of such structures with Support Vector Machines. Our experiments suggest that syntactic information helps tasks such as question/answer classification and that shallow semantics makes a remarkable contribution when a reliable set of PASs can be extracted, e.g. from answers. ...

  • We present two approaches for syntactic and semantic transfer based on LFG f-structures and compare the results with existing co-description and restriction-operator-based approaches, focusing on aspects of ambiguity-preserving transfer, complex cases of syntactic structural mismatches as well as on modularity and reusability. The two transfer approaches are interfaced with an existing, implemented transfer component (Verbmobil), by translating f-structures into a term language, and by interfacing f-structure representations with an existing semantic-based transfer approach, respectively. ...

  • A major focus of current work in distributional models of semantics is to construct phrase representations compositionally from word representations. However, the syntactic contexts which are modelled are usually severely limited, a fact which is reflected in the lexical-level WSD-like evaluation methods used.

  • We present a syntactically enriched vector model that supports the computation of contextualized semantic representations in a quasi-compositional fashion. It employs a systematic combination of first- and second-order context vectors. We apply our model to two different tasks and show that (i) it substantially outperforms previous work on a paraphrase ranking task, and (ii) achieves promising results on a word-sense similarity task; to our knowledge, it is the first time that an unsupervised method has been applied to this task. ...

  • The task of aligning corresponding phrases across two related sentences is an important component of approaches for natural language problems such as textual inference, paraphrase detection and text-to-text generation. In this work, we examine a state-of-the-art structured prediction model for the alignment task which uses a phrase-based representation and is forced to decode alignments using an approximate search approach.

  • We combine multiple word representations based on semantic clusters extracted with the Brown et al. (1992) algorithm and syntactic clusters obtained from the Berkeley parser (Petrov et al., 2006) in order to improve discriminative dependency parsing in the MSTParser framework (McDonald et al., 2005).

  • It is widely recognized that the proliferation of annotation schemes runs counter to the need to re-use language resources, and that standards for linguistic annotation are becoming increasingly mandatory. To answer this need, we have developed a representation framework comprised of an abstract model for a variety of different annotation types (e.g., morpho-syntactic tagging, syntactic annotation, co-reference annotation, etc.), which can be instantiated in different ways depending on the annotator's approach and goals. ...

  • A two-tier model for the description of morphological, syntactic and semantic variations of multi-word terms is presented. It is applied to term normalization of French and English corpora in the medical and agricultural domains. Five different sources of morphological and semantic knowledge are exploited (MULTEXT, CELEX, AGROVOC, WordNet 1.6, and Microsoft Word97 thesaurus).

  • Ambiguities related to intension and their consequent inference failures are a diverse group, both syntactically and semantically. One particular kind of ambiguity that has received little attention so far is whether it is the speaker or the third party to whom a description in an opaque third-party attitude report should be attributed. The different readings lead to different inferences in a system modeling the beliefs of external agents. We propose that a unified approach to the representation of the alternative readings of intension-related ambiguities can be based on the...

  • Automatic detection of general relations between short texts is a complex task that cannot be carried out only relying on language models and bag-of-words. Therefore, learning methods to exploit syntax and semantics are required. In this paper, we present a new kernel for the representation of shallow semantic information along with a comprehensive study on kernel methods for the exploitation of syntactic/semantic structures for short text pair categorization.

  • This paper presents an incremental probabilistic learner that models the acquisition of syntax and semantics from a corpus of child-directed utterances paired with possible representations of their meanings. These meaning representations approximate the contextual input available to the child; they do not specify the meanings of individual words or syntactic derivations. The learner then has to infer the meanings and syntactic properties of the words in the input along with a parsing model.

  • We present a model of semantic processing of spoken language that (a) is robust against ill-formed input, such as can be expected from automatic speech recognisers, (b) respects both syntactic and pragmatic constraints in the computation of most likely interpretations, (c) uses a principled, expressive semantic representation formalism (RMRS) with a well-defined model theory, and (d) works continuously (producing meaning representations on a word-by-word basis, rather than only for full utterances) and incrementally (computing only the additional contribution by the new word, rather than ...

  • A new approach to structure-driven generation is presented that is based on a separate semantics as input structure. For the first time, a GPSG-based formalism is complemented with a system of pattern-action rules that relate the parts of a semantics to appropriate syntactic rules. This way a front end generator can be adapted to some application system (such as a machine translation system) more easily than would be possible with many previous generators based on modern grammar formalisms.

  • In this paper, we propose innovative representations for automatic classification of verbs according to mainstream linguistic theories, namely VerbNet and FrameNet. First, syntactic and semantic structures capturing essential lexical and syntactic properties of verbs are defined. Then, we design advanced similarity functions between such structures, i.e., semantic tree kernel functions, for exploiting distributional and grammatical information in Support Vector Machines.

  • This paper presents a comparative evaluation of several state-of-the-art English parsers based on different frameworks. Our approach is to measure the impact of each parser when it is used as a component of an information extraction system that performs protein-protein interaction (PPI) identification in biomedical papers. We evaluate eight parsers (based on dependency parsing, phrase structure parsing, or deep parsing) using five different parse representations.

  • The three-tiered discourse representation defined in (Luperfoy, 1991) is applied to multimodal human-computer interface (HCI) dialogues. In the applied system the three tiers are (1) a linguistic analysis ... The third tier is the knowledge base (KB) that describes the belief system of one agent in the dialogue, namely, the backend system being interfaced to. Figure 1 diagrams a partitioning of the information available to a dialogue processing agent.

  • This paper proposes a set of representations for tenses and a set of constraints on how they can be combined in adjunct clauses. The semantics we propose explains the possible meanings of tenses in a variety of sentential contexts. It also supports an elegant constraint on tense combination in adjunct clauses. These semantic representations provide insights into the interpretations of tenses, and the constraints provide a source of syntactic disambiguation that has not previously been demonstrated.

  • preconstructed dictionaries or thesauruses. Even in this relatively simplified environment one does not normally undertake a linguistic analysis of any scope. In fact, syntactic and semantic analysis have been used in bibliographic information retrieval only under special circumstances: to analyze query phrases [22], to process structured text samples of a certain kind [7,15], or finally to process texts in severely restricted topic areas [2]. Where special conditions do not obtain, the preferred approach in...

