  • For a natural language database access system to be practical, it must achieve a good match between the capabilities of the user and the requirements of the task. The user brings his own natural language and his own style of interaction to the system. The task brings the questions that must be answered and the database domain's semantics. All natural language access systems achieve some degree of success. But to make progress as a field, we need to be able to evaluate the degree of this success. For too long, the best we have managed has been to...

  • In recent years tree kernels have been proposed for the automatic learning of natural language applications. Unfortunately, they show (a) an inherent superlinear complexity and (b) a lower accuracy than traditional attribute/value methods. In this paper, we show that tree kernels are very helpful in the processing of natural language as (a) we provide a simple algorithm to compute tree kernels in linear average running time and (b) our study on the classification properties of diverse tree kernels shows that kernel combinations always improve the traditional methods. ...

  • We have developed an approach to natural language processing in which the natural language processor is viewed as a knowledge-based system whose knowledge is about the meanings of the utterances of its language. The approach is oriented around the phrase rather than the word as the basic unit. We believe that this paradigm for language processing not only extends the capabilities of other natural language systems, but handles those tasks that previous systems could perform in a more systematic and extensible manner. ...

  • Based on experience with INTELLECT in the areas of quality assurance and customer support, a number of issues in evaluating a natural language database query system, particularly the INTELLECT system, will be discussed. A.I. Corporation offers licenses for customers to use the INTELLECT software on their computers, to access their databases. We now have a number of customer installations, plus reports from companies that are marketing INTELLECT under agreements with us, so that we can begin to discuss user reactions as possible criteria for evaluating our system. ...

  • In responding to the guidelines established by the session chairman of this panel, three of the five topics he set forth will be discussed. These include aggregate functions and quantity questions, querying semantically complex fields, and multi-file queries. As we will make clear in the sequel, the transformational apparatus utilized in the TQA Question Answering System provides a principled basis for handling these and many other problems in natural language access to databases.

  • The Layered Domain Class system (LDC) is an experimental natural language processor being developed at Duke University which reached the prototype stage in May of 1983. Its primary goals are (1) to provide English-language retrieval capabilities for structured but unnormalized data files created by the user; (2) to allow very complex semantics, in terms of the information directly available from the physical data file; and (3) to enable users to customize the system to operate with new types of data. In this paper we shall discuss (a) the types of modifiers LDC provides for; (b) ho...

  • Problem localization is the identification of the most significant failures in the AND-OR tree resulting from an unsuccessful attempt to achieve a goal, for instance, in planning, backward-chaining inference, or top-down parsing. We examine heuristics and strategies for problem localization in the context of using a planner to check for pragmatic failures in natural language input to computer systems, such as a cooperative natural language interface to Unix.

  • The undisputed favorite application for natural language interfaces has been data base query. Why? The reasons range from the relative simplicity of the task, including shallow semantic processing, to the potential real-world utility of the resultant system. Because of such reasons, the data base query task was an excellent paradigmatic problem for computational linguistics, and for the very same reasons it is now time for the field to abandon its protective cocoon and progress beyond this rather limiting task. ...

  • Do natural language database systems still provide a valuable environment for further work on natural language processing? Are there other systems which provide the same hard environment for testing, but allow us to explore more interesting natural language questions? In order to answer no to the first question and yes to the second (the position taken by our panel's chair), there must be an interesting language problem which is more naturally studied in some other system than in the database system. ...

  • The aim of this paper is to show how large-scale (computational) grammars of natural language benefit from an organization of semantics which is based on Minimal Recursion Semantics (MRS; Copestake et al. (1999)). We do this by providing an account of valence alternations in German based on MRS, showing how such an account makes a computational grammar more efficient and less complicated for the grammar writer.

  • WYSIWYM ('What you see is what you meant') is a user-interface technique which uses natural language generation (NLG) technology to provide feedback for user interactions. To date, the technology has been applied in a number of demonstrator applications, using customised, nonportable implementations. In this demonstration, we introduce a WYSIWYM library package, designed to be used as a modular component of a larger JAVA-based application.

  • An algorithm based on the Generalized Hebbian Algorithm is described that allows the singular value decomposition of a dataset to be learned based on single observation pairs presented serially. The algorithm has minimal memory requirements, and is therefore interesting in the natural language domain, where very large datasets are often used, and datasets quickly become intractable. The technique is demonstrated on the task of learning word and letter bigram pairs from text.

  • We present and evaluate a new model for Natural Language Generation (NLG) in Spoken Dialogue Systems, based on statistical planning, given noisy feedback from the current generation context (e.g. a user and a surface realiser). We study its use in a standard NLG problem: how to present information (in this case a set of search results) to users, given the complex trade-offs between utterance length, amount of information conveyed, and cognitive load. We set these trade-offs by analysing existing MATCH data.

  • We demonstrate an open-source natural language generation engine that produces descriptions of entities and classes in English and Greek from OWL ontologies that have been annotated with linguistic and user modeling information expressed in RDF. We also demonstrate an accompanying plug-in for the Protégé ontology editor, which can be used to create the ontology’s annotations and generate previews of the resulting texts by invoking the generation engine.

  • The subject of this demonstration is natural language interaction, focusing on adaptivity and profiling of the dialogue management and the generated output (text and speech). These are demonstrated in a museum guide use-case, operating in a simulated environment. The main technical innovations presented are the profiling model, the dialogue and action management system, and the text generation and speech synthesis systems.

  • We derive the rhetorical structures of texts by means of two new, surface-form-based algorithms: one that identifies discourse usages of cue phrases and breaks sentences into clauses, and one that produces valid rhetorical structure trees for unrestricted natural language texts. The algorithms use information that was derived from a corpus analysis of cue phrases.

  • Several recent efforts in statistical natural language understanding (NLU) have focused on generating clumps of English words from semantic meaning concepts (Miller et al., 1995; Levin and Pieraccini, 1995; Epstein et al., 1996; Epstein, 1996). This paper extends the IBM Machine Translation Group's concept of fertility (Brown et al., 1993) to the generation of clumps for natural language understanding. The basic underlying intuition is that a single concept may be expressed in English as many disjoint clumps of words. ...

  • This paper presents a method for the automatic extraction of subgrammars to control and speed up natural language generation (NLG). The method is based on explanation-based learning (EBL). The main advantage of the proposed new method for NLG is that the complexity of the grammatical decision making process during NLG can be vastly reduced, because the EBL method supports the adaptation of an NLG system to a particular use of a language.

  • Acquiring information systems specifications from natural language description is presented as a problem class that requires a different treatment of semantics when compared with other applied NL systems such as database and operating system interfaces. Within this problem class, the specific task of obtaining explicit conceptual data models from natural language text or dialogue is being investigated. The knowledge brought to bear on this task is classified into syntactic, semantic and systems analysis knowledge.

  • In the field of knowledge-based systems for natural language processing, one of the most challenging aims is to use parts of an existing knowledge base for different domains and/or different tasks. We support the point that this problem can only be solved by using adequate metainformation about the content and structuring principles of the representational systems concerned. One of the prerequisites in this respect is the transparency of modelling decisions.
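
As an illustration for the tree-kernel abstract in this list, the following is a minimal sketch of a subset-tree kernel, which counts the tree fragments two parse trees share. It assumes the widely used recursion in which two nodes contribute only if they have the same production. The nested-tuple tree encoding, the function names, and the decay parameter `lam` are all hypothetical choices made for this example. Indexing nodes by production is simply one way to skip node pairs that cannot match; it is not claimed to be the linear-average-time algorithm the abstract refers to.

```python
from collections import defaultdict

# Assumed encoding: a tree is (label, child, child, ...); a word is a plain str.

def production(node):
    """Return the grammar production at a node, e.g. ('NP', ('D', 'N'))."""
    label, *children = node
    return (label, tuple(c if isinstance(c, str) else c[0] for c in children))

def internal_nodes(tree):
    """Yield every non-leaf node of the tree, root first."""
    if isinstance(tree, str):
        return
    yield tree
    for child in tree[1:]:
        yield from internal_nodes(child)

def delta(n1, n2, lam, memo):
    """Weighted count of common fragments rooted at n1 and n2."""
    key = (id(n1), id(n2))
    if key in memo:
        return memo[key]
    if production(n1) != production(n2):
        result = 0.0
    else:
        result = lam
        # Same production, so children align one to one; words add no factor.
        for c1, c2 in zip(n1[1:], n2[1:]):
            if not isinstance(c1, str):
                result *= 1.0 + delta(c1, c2, lam, memo)
    memo[key] = result
    return result

def tree_kernel(t1, t2, lam=1.0):
    """Sum delta over node pairs, visiting only pairs with equal productions."""
    by_production = defaultdict(list)
    for node in internal_nodes(t2):
        by_production[production(node)].append(node)
    memo = {}
    total = 0.0
    for node in internal_nodes(t1):
        for other in by_production.get(production(node), []):
            total += delta(node, other, lam, memo)
    return total

dog = ("S", ("NP", ("D", "the"), ("N", "dog")), ("VP", ("V", "barks")))
cat = ("S", ("NP", ("D", "the"), ("N", "cat")), ("VP", ("V", "barks")))
print(tree_kernel(dog, dog), tree_kernel(dog, cat))
```

The two example trees differ only in one word, so the kernel of `dog` with `cat` is lower than the kernel of `dog` with itself, but still positive because most fragments are shared. Setting `lam` below 1 down-weights larger fragments.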

