The purpose of the special parasession on "Interactive Man/Machine Discourse" is to discuss some critical issues in the design of (computer-based) interactive natural language processing systems. This panel will be addressing the question of how the purpose of the interaction, or "problem context," affects what is said and how it is interpreted. Each of the panel members brings a different orientation toward the study of language to this question.
My comments are organized within the framework suggested by the Panel Chair, Barbara Grosz, which I find very appropriate. All of my comments pertain to the various issues raised by her; however, wherever possible I will discuss these issues in the context of the "information seeking" interaction and the data base domain. The primary question is how the purpose of the interaction, or "the problem context," affects what is said and how it is interpreted. The two separate aspects of this question that must be considered are the function and the domain of the discourse. I. Types of...
In contrast, the solution is defined by data structures that describe the original problem context indirectly and thus determine the search space within an evolutionary search (optimization process). An analogous situation exists in nature, where the genotype encodes the phenotype. Consequently, a genotype-phenotype mapping determines how the genotypic representation is mapped to the phenotypic property. In other words, the phenotypic property determines the solution in the original problem context. ...
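As a toy illustration of such an indirect encoding, the sketch below evolves bitstring genotypes while evaluating fitness only on the decoded phenotype; the genome length, interval, and fitness function are invented for illustration, not taken from any particular system.

```python
# Sketch of a genotype-phenotype mapping in an evolutionary search:
# the search operates on bitstring genotypes, while fitness is
# evaluated on the decoded phenotypic value.
import random

GENOME_LEN = 16          # bits per genotype (assumed for this toy)
LOW, HIGH = -5.0, 5.0    # phenotypic interval (assumed for this toy)

def decode(genotype):
    """Genotype-phenotype mapping: bitstring -> real number in [LOW, HIGH]."""
    as_int = int("".join(map(str, genotype)), 2)
    return LOW + (HIGH - LOW) * as_int / (2 ** GENOME_LEN - 1)

def fitness(phenotype):
    """Fitness is defined on the phenotype, not the genotype."""
    return -(phenotype ** 2)   # maximum at phenotype == 0

def evolve(pop_size=20, generations=50, mut_rate=0.05, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(GENOME_LEN)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Selection happens in phenotype space, via the mapping.
        pop.sort(key=lambda g: fitness(decode(g)), reverse=True)
        survivors = pop[: pop_size // 2]
        # Variation happens in genotype space (bit-flip mutation).
        children = [[b ^ (rng.random() < mut_rate) for b in parent]
                    for parent in survivors]
        pop = survivors + children
    return decode(max(pop, key=lambda g: fitness(decode(g))))

best = evolve()
```

The point of the sketch is the separation of concerns: mutation manipulates genotypes, while selection sees only the phenotypes that `decode` produces.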
One morning, long ago, John woke up and decided he wanted to write a book on theories
and techniques in counseling and psychotherapy. He thought, “Of all the classes I
teach, I love teaching theories and techniques best, so I should write a textbook.”
John then began using a cognitive self-instructional problem-solving strategy (see
Chapter 8). He identified the problems associated with existing theories and techniques
textbooks and formulated possible solutions.
We consider the problem of learning context-dependent mappings from sentences to logical form. The training examples are sequences of sentences annotated with lambda-calculus meaning representations. We develop an algorithm that maintains explicit, lambda-calculus representations of salient discourse entities and uses a context-dependent analysis pipeline to recover logical forms. The method uses a hidden-variable variant of the perceptron algorithm to learn a linear model used to select the best analysis.
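The hidden-variable perceptron update described above can be sketched roughly as follows; the candidate analyses, features, and logical forms are invented stand-ins for what the real analysis pipeline would produce, not the paper's system.

```python
# Hidden-variable perceptron sketch: the analysis (derivation) that
# produces a logical form is unobserved, so when the model is wrong we
# update toward the highest-scoring candidate that yields the gold
# logical form, and away from the highest-scoring candidate overall.
from collections import defaultdict

def score(weights, features):
    return sum(weights[f] * v for f, v in features.items())

def perceptron_update(weights, candidates, gold_form):
    """candidates: list of (logical_form, feature_dict) for one sentence."""
    best = max(candidates, key=lambda c: score(weights, c[1]))
    good = [c for c in candidates if c[0] == gold_form]
    if not good or best[0] == gold_form:
        return weights                 # correct (or gold unreachable): no update
    best_good = max(good, key=lambda c: score(weights, c[1]))
    for f, v in best_good[1].items():  # promote a gold-yielding analysis
        weights[f] += v
    for f, v in best[1].items():       # demote the current best analysis
        weights[f] -= v
    return weights

# Toy example: two candidate analyses for one sentence (invented).
w = defaultdict(float)
cands = [("show(fares)",   {"ignores_context": 1.0}),
         ("show(flights)", {"uses_context": 1.0})]
w = perceptron_update(w, cands, gold_form="show(flights)")
```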
We discuss some of the practical issues that arise from decoding with general synchronous context-free grammars. We examine problems caused by unary rules, as well as how virtual nonterminals resulting from binarization can best be handled. We also investigate adding more flexibility to synchronous context-free grammars through glue rules and phrases.
HAVING STARTED WORK on mechanical translation, we arrived at the conclusion that both the lexical meaning and the morphological shape of the word can and should be utilized in analyzing the text, and that for purposes of translation it is impractical to omit the information which can be thus obtained. The utilization of the lexical meanings of words as well as of their contexts may also affect problems of coding. These questions are extremely important to automatic translation. We based our work on the following principles: 1.
One problem with phrase-based statistical machine translation is long-distance reordering when translating between languages with different word orders, such as Japanese and English. In this paper, we propose a method of imposing reordering constraints using document-level context. As the document-level context, we use noun phrases that occur with statistical significance in context documents containing the source sentences.
In this paper we present a tool that uses comparable corpora to find appropriate translation equivalents for expressions that translators consider difficult. For a phrase in the source language, the tool identifies a range of possible expressions used in similar contexts in target-language corpora and presents them to the translator as a list of suggestions. In the paper we discuss the method and present results of a human evaluation of the tool's performance, which highlight its usefulness when dictionary solutions are lacking. ...
This paper presents a partial solution to a component of the problem of lexical choice: choosing the synonym most typical, or expected, in context. We apply a new statistical approach to representing the context of a word through lexical co-occurrence networks. The implementation was trained and evaluated on a large corpus, and results show that the inclusion of second-order co-occurrence relations improves the performance of our implemented lexical choice program.
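A minimal sketch of this idea, with a toy corpus and an invented scoring scheme rather than the paper's trained model: a synonym candidate earns full credit for a first-order co-occurrence with a context word, and partial credit for a second-order link (a shared neighbour in the network). The weight 0.5 and the window size are arbitrary assumptions.

```python
# Toy lexical co-occurrence network for choosing the synonym most
# typical in a given context (first- and second-order relations).
from collections import defaultdict

def cooccurrence_network(sentences, window=4):
    net = defaultdict(set)
    for sent in sentences:
        words = sent.split()
        for i, w in enumerate(words):
            for u in words[max(0, i - window): i]:
                net[w].add(u)
                net[u].add(w)
    return net

def choose_synonym(candidates, context_words, net, second_order_weight=0.5):
    def score(cand):
        s = 0.0
        for c in context_words:
            if c in net[cand]:
                s += 1.0                  # first-order co-occurrence
            elif net[cand] & net[c]:
                s += second_order_weight  # second-order: shared neighbour
        return s
    return max(candidates, key=score)

corpus = ["the bank approved the loan",
          "the river bank was muddy",
          "a loan from the credit union"]
net = cooccurrence_network(corpus)
```

On this toy corpus, `choose_synonym(["bank", "river"], ["loan"], net)` prefers `"bank"`, since "bank" co-occurs directly with "loan" while "river" is linked only at second order.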
We study the problem of finding the best head-driven parsing strategy for Linear Context-Free Rewriting System productions. A head-driven strategy must begin with a specified right-hand-side nonterminal (the head) and add the remaining nonterminals one at a time in any order. We show that it is NP-hard to find the best head-driven strategy in terms of either the time or space complexity of parsing.
Statistical MT has made great progress in the last few years, but current translation models are weak on re-ordering and target-language fluency. Syntactic approaches seek to remedy these problems. In this paper, we take the framework for acquiring multi-level syntactic translation rules of (Galley et al.
We consider the problem of parsing non-recursive context-free grammars, i.e., context-free grammars that generate finite languages. In natural language processing, this problem arises in several areas of application, including natural language generation, speech recognition, and machine translation. We present two tabular algorithms for parsing non-recursive context-free grammars, and show that they perform well in practical settings, despite the fact that the problem is PSPACE-complete.
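As a generic illustration of tabular parsing, the sketch below is a plain CKY recognizer over a toy grammar in Chomsky normal form; it is not one of the specialized non-recursive algorithms the paper develops, and the grammar and sentence are invented.

```python
# Tabular (CKY-style) recognition: chart[i][j] holds the nonterminals
# that derive the span words[i:j]; spans are filled shortest first.
def cky_recognize(words, lexical, binary, start="S"):
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):                      # lexical entries
        chart[i][i + 1] = {a for a, word in lexical if word == w}
    for span in range(2, n + 1):                       # longer spans
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):                  # split point
                for a, (b, c) in binary:
                    if b in chart[i][k] and c in chart[k][j]:
                        chart[i][j].add(a)
    return start in chart[0][n]

# Toy non-recursive grammar (CNF), invented for illustration:
#   S -> NP VP,  VP -> V NP,  NP -> "time" | "arrows",  V -> "flies"
lexical = [("NP", "time"), ("NP", "arrows"), ("V", "flies")]
binary = [("S", ("NP", "VP")), ("VP", ("V", "NP"))]
```

Since the grammar is non-recursive, the chart for any input is finite and small; the tabular algorithms in the paper exploit this structure much more aggressively than plain CKY does.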
The class of linear context-free rewriting systems has been introduced as a generalization of a class of grammar formalisms known as mildly context-sensitive. The recognition problem for linear context-free rewriting languages is studied at length here, presenting evidence that, even in some restricted cases, it cannot be solved efficiently. This entails the existence of a gap between, for example, tree adjoining languages and the subclass of linear context-free rewriting languages that generalizes the former class; such a gap is attributed to "crossing configurations".
We present an algorithm for computing n-gram probabilities from stochastic context-free grammars, a procedure that can alleviate some of the standard problems associated with n-grams (estimation from sparse data and lack of linguistic structure, among others). The method operates via the computation of substring expectations, which in turn is accomplished by solving systems of linear equations derived from the grammar. The procedure is fully implemented and has proved viable and useful in practice.
This paper addresses the problem of correcting spelling errors that result in valid, though unintended words (such as peace and piece, or quiet and quite) and also the problem of correcting particular word usage errors (such as amount and number, or among and between). Such corrections require contextual information and are not handled by conventional spelling programs such as Unix spell. First, we introduce a method called Trigrams that uses part-of-speech trigrams to encode the context. This method uses a small number of parameters compared to previous methods based on word trigrams.
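The POS-trigram idea can be sketched as follows; the tag assignments and trigram probabilities below are invented for illustration, and the real method estimates them from tagged corpora. Each member of a confusion set is scored by the probability of the part-of-speech trigrams it induces in the sentence.

```python
# Context-sensitive spelling correction via POS trigrams (toy sketch):
# substitute each confusion-set member, tag the sentence, and pick the
# candidate whose sequence of POS trigrams is most probable.
import math

# Invented single-tag lexicon and trigram log-probabilities.
TAG = {"a": "DET", "very": "ADV", "quiet": "ADJ", "quite": "ADV",
       "room": "NN"}
LOGP = {("DET", "ADV", "ADJ"): math.log(0.020),
        ("DET", "ADV", "ADV"): math.log(0.002),
        ("ADV", "ADJ", "NN"): math.log(0.030),
        ("ADV", "ADV", "NN"): math.log(0.003)}

def trigram_score(words):
    """Sum of log-probabilities of all POS trigrams in the sentence."""
    tags = [TAG[w] for w in words]
    return sum(LOGP.get(tuple(tags[i:i + 3]), math.log(1e-6))
               for i in range(len(tags) - 2))

def correct(words, position, confusion_set):
    """Pick the confusion-set member whose POS trigrams fit best."""
    def with_cand(c):
        return words[:position] + [c] + words[position + 1:]
    return max(confusion_set, key=lambda c: trigram_score(with_cand(c)))

sentence = "a very quite room".split()
choice = correct(sentence, 2, {"quiet", "quite"})
```

The sketch shows why the method is compact: it needs only tag-trigram statistics, not word-trigram statistics, which is the small-parameter-count advantage the abstract mentions.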
I will address the questions posed to the panel from within the context of a project at SRI, TEAM [Grosz, 1982b], that is developing techniques for transportable natural-language interfaces. The goal of transportability is to enable nonspecialists to adapt a natural-language processing system for access to an existing conventional database. TEAM is designed to interact with two different kinds of users.
Parallel Multiple Context-Free Grammar (PMCFG) is an extension of context-free grammar for which the recognition problem is still solvable in polynomial time. We describe a new parsing algorithm that has the advantage of being incremental and of supporting PMCFG directly rather than the weaker MCFG formalism. The algorithm is also top-down, which allows it to be used for grammar-based word prediction.
One of the major problems one is faced with when decomposing words into their constituent parts is ambiguity: the generation of multiple analyses for one input word, many of which are implausible. In order to deal with ambiguity, the MORphological PArser MORPA is provided with a probabilistic context-free grammar (PCFG), i.e. it combines a "conventional" context-free morphological grammar to filter out ungrammatical segmentations with a probability-based scoring function which determines the likelihood of each successful parse. ...
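A toy sketch of this combination, with an invented mini-grammar and probabilities rather than MORPA's: the grammar filters segmentations (those with no derivation get probability zero), and each surviving parse is scored by the product of its rule and lexical probabilities.

```python
# PCFG-style disambiguation of morphological segmentations (toy).
# Grammar rules and all probabilities are invented for illustration.
RULES = {("WORD", ("STEM", "SUFFIX")): 0.6,
         ("WORD", ("PREFIX", "STEM")): 0.4}
LEX = {("STEM", "walk"): 0.01,
       ("SUFFIX", "er"): 0.2}

def parse_prob(segmentation):
    """Probability of the best WORD derivation for a two-morph split.

    Segmentations the grammar cannot derive score 0.0, i.e. the
    grammar acts as a filter on ungrammatical splits."""
    a, b = segmentation
    best = 0.0
    for (lhs, rhs), p in RULES.items():
        pa = LEX.get((rhs[0], a), 0.0)
        pb = LEX.get((rhs[1], b), 0.0)
        best = max(best, p * pa * pb)
    return best

# Competing analyses of "walker": only one is derivable and likely.
candidates = [("walk", "er"), ("wal", "ker")]
best_seg = max(candidates, key=parse_prob)
```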