Correctional process

  • This paper describes a spelling correction system that functions as part of an intelligent tutor that carries on a natural language dialogue with its users. Both the process that searches the lexicon and the system filter are adaptive, which speeds up the search. The basis of our approach is the interaction between the parser and the spelling corrector: alternative correction targets are fed back to the parser, which performs a series of syntactic and semantic checks based on the dialogue context, the sentence context, and the phrase context. ...

  • It is important to correct the errors in the results of speech recognition to increase the performance of a speech translation system. This paper proposes a method for correcting errors using the statistical features of character co-occurrence, and evaluates the method. The proposed method comprises two successive correcting processes. The first process uses pairs of strings: the first string is an erroneous substring of the utterance predicted by speech recognition, the second string is the corresponding section of the actual utterance.

  • Statistical methods require very large, high-quality corpora, but building a large and faultless annotated corpus is a very difficult job. This paper proposes an efficient method to construct a part-of-speech tagged corpus. A rule-based error correction method is proposed to find and correct errors semi-automatically using user-defined rules. We also make use of the user's correction log to reflect feedback. Experiments were carried out to show the efficiency of the error correction process of this workbench. The results show that about 63.2% of tagging errors can be corrected. ...

  • The enforcement of laws and the numerous tasks involved in establishing and maintaining civility in our communities are daily challenges to the criminal justice system. The administration of justice has become a series of competing mandates that demand bravery, accountability, and service. The administration and management of the criminal justice system have grown more complex for law enforcement, courts, corrections, and victim services.

  • The reduction in overall grain yield and value caused by cracking is one of the main problems that directly reduces income and the availability of staple food for farmers in the Mekong Delta. Cracking or fissuring of the rice grain may occur in the field when harvest timing is not right, or after harvest through improper drying conditions and inconsistent milling practices. A series of activities takes place during the harvesting and processing of rice. Figure 1 is a diagram showing the rice production system in the Mekong Delta of Vietnam today. All stages in this system may contribute to losses.

  • Conditional Random Fields (CRFs) have been applied with considerable success to a number of natural language processing tasks. However, these tasks have mostly involved very small label sets. When deployed on tasks with larger label sets, the requirements for computational resources mean that training becomes intractable. This paper describes a method for training CRFs on such tasks, using error correcting output codes (ECOC). A number of CRFs are independently trained on the separate binary labelling tasks of distinguishing between a subset of the labels and its complement. ...

  • Web text has been successfully used as training data for many NLP applications. While most previous work accesses web text through search engine hit counts, we created a Web Corpus by downloading web pages to create a topic-diverse collection of 10 billion words of English. We show that for context-sensitive spelling correction the Web Corpus results are better than using a search engine. For thesaurus extraction, it achieved similar overall results to a corpus of newspaper text.

  • It has previously been assumed in the psycholinguistic literature that finite-state models of language are crucially limited in their explanatory power by the locality of the probability distribution and the narrow scope of information used by the model. We show that a simple computational model (a bigram part-of-speech tagger based on the design used by Corley and Crocker (2000)) makes correct predictions on processing difficulty observed in a wide range of empirical sentence processing data. ...

  • Interpreting fully natural speech is an important goal for spoken language understanding systems. However, while corpus studies have shown that about 10% of spontaneous utterances contain self-corrections, or REPAIRS, little is known about the extent to which cues in the speech signal may facilitate repair processing. We identify several cues based on acoustic and prosodic analysis of repairs in a corpus of spontaneous speech, and propose methods for exploiting these cues to detect and correct repairs.

  • This paper describes a computational method for correcting users' misconceptions concerning the objects modelled by a computer system. The method involves classifying object-related misconceptions according to the knowledge-base feature involved in the incorrect information. For each resulting class, sub-types are identified according to the structure of the knowledge base, which indicate what information may be supporting the misconception and therefore what information to include in the response. ...

  • Faced with the problem of annotation errors in part-of-speech (POS) annotated corpora, we develop a method for automatically correcting such errors. Building on top of a successful error detection method, we first try correcting a corpus using two off-the-shelf POS taggers, based on the idea that they enforce consistency; with this, we find some improvement. After some discussion of the tagging process, we alter the tagging model to better account for problematic tagging distinctions.

  • RNA processing is an essential process in eukaryotic cells, creating different RNA species from one and the same gene. RNA processing occurs on nearly all kinds of RNAs, including mRNA that codes for proteins, ribosomal RNA, tRNA, snRNAs, and RNA. RNA processing usually occurs co-transcriptionally, and many factors are recruited by the RNA polymerase itself. This stimulates RNA processing by enhancing the correct assembly of factors as the RNA is being produced. Some factors, such as splice factors and cleavage factors for rRNA, are also recruited by the growing RNA-chain.

  • Department of Cognitive and Linguistic Sciences, Brown University, Providence, RI, USA. ... correction, the approximation is poor for hierarchical models, which are commonly used for NLP applications. We derive an improved O(1) formula that gives exact values for the expected counts in non-hierarchical models. For hierarchical models, where our formula is not exact, we present an efficient method for sampling from the HDP (and related models, such as the hierarchical Pitman-Yor process) that considerably decreases the memory footprint of such models as compared to the naive implementation. ...

  • We describe a program for assigning correct stress contours to nominals in English. It makes use of idiosyncratic knowledge about the stress behavior of various nominal types and general knowledge about English stress rules. We have also investigated the related issue of parsing complex nominals in English. The importance of this work and related research to the problem of text-to-speech is discussed.

  • We present the psycholinguistically motivated task of predicting human plausibility judgements for verb-role-argument triples and introduce a probabilistic model that solves it. We also evaluate our model on the related role-labelling task, and compare it with a standard role labeller. For both tasks, our model benefits from class-based smoothing, which allows it to make correct argument-specific predictions despite a severe sparse data problem. The standard labeller suffers from sparse data and a strong reliance on syntactic cues, especially in the prediction task. ...

  • Machine translation (MT) systems have improved significantly; however, their outputs often contain too many errors to communicate the intended meaning to their users. This paper describes a collaborative approach for mediating between an MT system and users who do not understand the source language and thus cannot easily detect translation mistakes on their own. Through a visualization of multiple linguistic resources, this approach enables the users to correct difficult translation errors and understand translated passages that were otherwise baffling. ...

  • Building on work detecting errors in dependency annotation, we set out to correct local dependency errors. To do this, we outline the properties of annotation errors that make the task challenging and their existence problematic for learning. For the task, we define a feature-based model that explicitly accounts for non-relations between words, and then use ambiguities from one model to constrain a second, more relaxed model. In this way, we are successfully able to correct many errors, in a way which is potentially applicable to dependency parsing more generally. ...

  • The quality of the part-of-speech (PoS) annotation in a corpus is crucial for the development of PoS taggers. In this paper, we experiment with three complementary methods for automatically detecting errors in the PoS annotation for the Icelandic Frequency Dictionary corpus. The first two methods are language independent and we argue that the third method can be adapted to other morphologically complex languages. Once possible errors have been detected, we examine each error candidate and hand-correct the corresponding PoS tag if necessary. ...

  • Automated Reasoning techniques applied to the problem of natural language correctness allow the design of flexible training aids for the teaching of foreign languages. The approach involves important advantages for both the student and the teacher by detecting possible errors and pointing out their reasons. Explanations may be given on four distinct levels, thus offering differently instructive error messages according to the needs of the student.

  • The correction method distinguishes between orthographic errors and typographical errors. Typographical errors (or mistypings) are non-cognitive errors that do not follow linguistic criteria. Orthographic errors are cognitive errors that occur when the writer does not know or has forgotten the correct spelling of a word. Because of their cognitive nature, they are more persistent and leave a worse impression, and their treatment is an interesting application for language standardization purposes. ...

