Mapping and classification of chemical compound names are important tasks in BioNLP. This paper introduces the architecture of a system for the syntactic and semantic analysis of such names. Our system aims at yielding both the denoted chemical structure and a classification of a given name. We employ a novel approach to the task which promises an elegant and efficient way of solving the problem. The proposed system differs significantly from existing systems, in that it is also able to deal with underspecifying names and class names. ...
Probabilistic Latent Semantic Analysis (PLSA) models have been shown to provide a better model for capturing polysemy and synonymy than Latent Semantic Analysis (LSA). However, the parameters of a PLSA model are trained using the Expectation Maximization (EM) algorithm, and as a result, the trained model is dependent on the initialization values, so performance can be highly variable. In this paper we present a method for using LSA to initialize a PLSA model.
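The LSA-to-PLSA initialization described above can be sketched as follows. The particular scheme below (taking absolute values of the truncated singular vectors and renormalizing them into distributions) is an illustrative assumption, not necessarily the paper's exact mapping; the function name and normalization choices are hypothetical.

```python
import numpy as np

def lsa_init_plsa(X, k):
    """Initialize PLSA parameters P(z), P(w|z), P(d|z) from a rank-k
    truncated SVD (LSA) of the term-document matrix X (terms x docs).
    Hypothetical scheme: nonnegativize the singular vectors and
    renormalize so each aspect's distribution sums to one."""
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    U, S, Vt = U[:, :k], S[:k], Vt[:k, :]
    p_w_z = np.abs(U) / np.abs(U).sum(axis=0)        # P(w|z): columns sum to 1
    p_d_z = np.abs(Vt).T / np.abs(Vt).T.sum(axis=0)  # P(d|z): columns sum to 1
    p_z = S / S.sum()                                # P(z): singular-value mass
    return p_z, p_w_z, p_d_z
```

EM would then start from these parameters instead of random values, which is the source of the reduced run-to-run variance the abstract alludes to.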
This paper describes the use of an on-line system to do word-sense ambiguity resolution and content analysis of English paragraphs, using a system of semantic analysis programmed in Q32 LISP 1.5. The system of semantic analysis comprises dictionary codings for the text words, coded forms of permitted message, and rules producing message forms in combination on the basis of a criterion of semantic closeness. All these can be expressed as a single system of rules of phrase-structure form.
We discuss Feature Latent Semantic Analysis (FLSA), an extension to Latent Semantic Analysis (LSA). LSA is a statistical method that is ordinarily trained on words only; FLSA adds to LSA the richness of the many other linguistic features that a corpus may be labeled with. We applied FLSA to dialogue act classification with excellent results. We report results on three corpora: CallHome Spanish, MapTask, and our own corpus of tutoring dialogues.
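One natural way to realize the FLSA idea of enriching LSA with non-lexical features is to append feature-occurrence rows (e.g. counts of dialogue-act or POS labels per document) to the word-document matrix before the usual SVD. The sketch below assumes that construction; the paper's exact formulation may differ, and all names are illustrative.

```python
import numpy as np

def flsa_doc_vectors(word_doc, feat_doc, k=2):
    """Sketch of an FLSA-style pipeline: stack feature-count rows under
    the word-document count matrix, then run the standard LSA SVD and
    return k-dimensional document vectors in the latent space."""
    X = np.vstack([word_doc, feat_doc])   # (words + features) x docs
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    k = min(k, len(S))
    return Vt[:k, :].T * S[:k]            # docs x k latent coordinates
```

Dialogue-act classification would then operate on these document vectors, e.g. by nearest-neighbor comparison against labeled training dialogues.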
We present a novel fine-grained semantic representation of text and an approach to constructing it. This representation is largely extractable by today’s technologies and facilitates more detailed semantic analysis. We discuss the requirements driving the representation, suggest how it might be of value in the automated tutoring domain, and provide evidence of its validity.
Prosody can be useful in resolving certain lexical and structural ambiguities in spoken English. In this paper we present some results of employing two types of prosodic information, namely pitch and pause, to assist syntactic and semantic analysis during parsing. Steedman (1990) explores taking advantage of intonational structure in spoken sentence understanding in the combinatory categorial grammar formalism.
Term translation probabilities have proved an effective method of semantic smoothing in the language modelling approach to information retrieval tasks. In this paper, we use Generalized Latent Semantic Analysis to compute semantically motivated term and document vectors. The normalized cosine similarity between the term vectors is used as term translation probability in the language modelling framework. Our experiments demonstrate that GLSA-based term translation probabilities capture semantic relations between terms and improve performance on document classification. ...
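Turning cosine similarities between term vectors into translation probabilities can be sketched as below: each row of the similarity matrix is normalized into a distribution P(t | t'). The clipping of negative similarities and the simple row normalization are assumptions for illustration, not necessarily the paper's exact estimator.

```python
import numpy as np

def translation_probs(term_vecs):
    """From term vectors (terms x dims), build a matrix whose row t'
    is a distribution P(t | t') over translation targets: compute
    cosine similarities, drop negative ones, normalize each row."""
    norms = np.linalg.norm(term_vecs, axis=1, keepdims=True)
    unit = term_vecs / np.clip(norms, 1e-12, None)
    cos = unit @ unit.T
    cos = np.clip(cos, 0.0, None)               # keep non-negative similarities
    return cos / cos.sum(axis=1, keepdims=True)  # rows sum to 1
```

In the language-modelling framework, these probabilities smooth the document model by letting a query term "translate" into semantically related document terms.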
This paper presents a new method of analyzing Japanese noun phrases of the form N1 no N2. The Japanese postposition no roughly corresponds to of, but it has much broader usage. The method exploits a definition of N2 in a dictionary. For example, rugby no coach can be interpreted as a person who teaches technique in rugby. We illustrate the effectiveness of the method by the analysis of 300 test noun phrases.
At our institute a speech understanding and dialog system is being developed. As an example we model an information system for timetables and other information about intercity trains. In understanding spoken utterances, additional problems arise due to pronunciation variabilities and the vagueness of the word recognition process. Experiments so far have also shown that the syntactic analysis produces many more hypotheses rather than reducing the number of word hypotheses.
A framework for a structured representation of semantic knowledge (e.g. word-senses) has been defined at the IBM Scientific Center of Roma, as part of a project on Italian Text Understanding. This representation, based on the conceptual graphs formalism [SOW84], expresses deep (pragmatic) knowledge of word-senses. The knowledge base data structure is such as to provide easy access by the semantic verification algorithm. This paper discusses some important problems related to the definition of a semantic knowledge base, such as depth versus generality, hierarchical ordering of concept types, etc.
A system for semantic analysis of a wide range of English sentence forms is described. The system has been implemented in LISP 1.5 on the System Development Corporation (SDC) time-shared computer. Semantic analysis is defined as the selection of a unique word sense for each word in a natural-language sentence string and its bracketing in an underlying deep structure of that string.
We report on a mechanism for semantic and pragmatic interpretation that has been designed to take advantage of the generally compositional nature of semantic analysis, without unduly constraining the order in which pragmatic decisions are made. To achieve this goal, we introduce the idea of a conditional interpretation: one that depends upon a set of assumptions about subsequent pragmatic processing. Conditional interpretations are constructed compositionally according to a set of declaratively specified interpretation rules.
The variation in speech due to dialect is a factor which significantly impacts speech system performance. In this study, we investigate effective methods of combining acoustic and language information to take advantage of (i) speaker-based acoustic traits as well as (ii) content-based word selection across the text sequence. For acoustics, a GMM based system is employed, and for text-based dialect classification, we propose n-gram language models combined with Latent Semantic Analysis (LSA) based dialect classifiers. ...
We present a study aimed at investigating the use of semantic information in a novel NLP application, Electronic Career Guidance (ECG), in German. ECG is formulated as an information retrieval (IR) task, whereby textual descriptions of professions (documents) are ranked for their relevance to natural language descriptions of a person’s professional interests (the topic).
Discourse in formal domains, such as mathematics, is characterized by a mixture of telegraphic natural language and embedded (semi-)formal symbolic mathematical expressions. We present language phenomena observed in a corpus of dialogs with a simulated tutorial system for proving theorems as evidence for the need for deep syntactic and semantic analysis. We propose an approach to input understanding in this setting.
Interpreting metaphors is an integral and inescapable process in human understanding of natural language. This paper discusses a method of analyzing metaphors based on the existence of a small number of generalized metaphor mappings. Each generalized metaphor contains a recognition network, a basic mapping, additional transfer mappings, and an implicit intention component. It is argued that the method reduces metaphor interpretation from a reconstruction to a recognition task.
We argue that because the very concept of computation rests on notions of interpretation, the semantics of natural languages and the semantics of computational formalisms are in the deepest sense the same subject. The attempt to use computational formalisms in aid of an explanation of natural language semantics, therefore, is an enterprise that must be undertaken with particular care. We describe a framework for semantical analysis that we have used in the computational realm, and suggest that it may serve to underwrite computationally-oriented linguistic semantics as well. ...
Japanese has many noun phrase patterns of the type A no B consisting of two nouns A and B with an adnominal particle no. As the semantic relations between the two nouns in the noun phrase are not made explicit, the interpretation of the phrases depends mainly on the semantic characteristics of the nouns. This paper describes the semantic diversity of A no B and a method of semantic analysis for such phrases based on feature unification.
SEAFACT (Semantic Analysis For the Animation of Cooking Tasks) is a natural language interface to a computer-generated animation system operating in the domain of cooking tasks. SEAFACT allows the user to specify cooking tasks using a small subset of English. The system analyzes English input and produces a representation of the task which can drive motion synthesis procedures. This paper describes the semantic analysis of verbal modifiers on which the SEAFACT implementation is based. ...