Annotation errors

Showing 1-20 of 34 results for Annotation errors

OMGene: Mutual improvement of gene models through optimisation of evolutionary conservation

The accurate determination of the genomic coordinates for a given gene – its gene model – is of vital importance to the utility of its annotation, and the accuracy of bioinformatic analyses derived from it. Currently available methods of computational gene prediction, while on the whole successful, frequently disagree on the model for a given predicted gene, with some or all of the variant gene models often failing to match the biologically observed structure. Many prediction methods can be bolstered by using experimental data such as RNA-seq.

18p vibeauty 23-10-2021 7 1 Download

Gene annotation errors are common in the mammalian mitochondrial genomes database

Although animal mitochondrial DNA sequences are known to evolve rapidly, their gene arrangements often remain unchanged over long periods of evolutionary time. Therefore, comparisons of mitochondrial genomes may result in significant insights into the evolution both of organisms and of genomes.

8p vigiselle2711 30-08-2021 13 1 Download

Seq2Ref: A web server to facilitate functional interpretation

The size of the protein sequence database has been exponentially increasing due to advances in genome sequencing. However, experimentally characterized proteins only constitute a small portion of the database, such that the majority of sequences have been annotated by computational approaches. Current automatic annotation pipelines inevitably introduce errors, making the annotations unreliable.

7p viwyoming2711 16-12-2020 13 1 Download
Semantic segmentation of cilia using fully-convolutional dense networks

Cilia are hairlike structures protruding from nearly every cell in the body. In ciliopathies, cilia function is disrupted, which can result in a wide spectrum of diseases. However, most techniques for assessing ciliary motion rely on manual identification and tracking of cilia. This annotation is tedious and error-prone, and more analytical techniques impose strong assumptions such as periodic motion of beating cilia.

8p vigeorgia2711 01-12-2020 9 1 Download
Vindel: A simple pipeline for checking indel redundancy

With the advance of next generation sequencing (NGS) technologies, a large number of insertion and deletion (indel) variants have been identified in human populations. Despite much research into variant calling, it has been found that a non-negligible proportion of the identified indel variants might be false positives due to sequencing errors, artifacts caused by ambiguous alignments, and annotation errors.

10p vikentucky2711 26-11-2020 11 0 Download
Understanding the causes of errors in eukaryotic protein‑coding gene prediction: A case study of primate proteomes

Recent advances in sequencing technologies have led to an explosion in the number of genomes available, but accurate genome annotation remains a major challenge.

16p vikentucky2711 24-11-2020 11 1 Download
An effective approach for annotation of protein families with low sequence similarity and conserved motifs: Identifying GDSL hydrolases across the plant kingdom

The massive accumulation of protein sequences arising from the rapid development of high-throughput sequencing, coupled with automatic annotation, results in high levels of incorrect annotations. In this study, we describe an approach to decrease annotation errors of protein families characterized by low overall sequence similarity.

17p vioklahoma2711 19-11-2020 6 1 Download
Scientific report: "Comparing a Linguistic and a Stochastic Tagger"

Concerning different approaches to automatic PoS tagging: EngCG-2, a constraint-based morphological tagger, is compared in a double-blind test with a state-of-the-art statistical tagger on a common disambiguation task using a common tag set. The experiments show that for the same amount of remaining ambiguity, the error rate of the statistical tagger is one order of magnitude greater than that of the rule-based one. The two related issues of priming effects compromising the results and disagreement between human annotators are also addressed. ...

8p bunthai_1 06-05-2013 48 3 Download
Scientific report: "Automatic Verb Classification Using Distributions of Grammatical Features"

We apply machine learning techniques to classify automatically a set of verbs into lexical semantic classes, based on distributional approximations of diatheses, extracted from a very large annotated corpus. Distributions of four grammatical features are sufficient to reduce error rate by 50% over chance. We conclude that corpus data is a usable repository of verb class information, and that corpus-driven extraction of grammatical features is a promising methodology for automatic lexical acquisition. ...

8p bunthai_1 06-05-2013 45 3 Download
Scientific report: "Compensating for Annotation Errors in Training a Relation Extractor"

The well-studied supervised Relation Extraction algorithms require training data that is accurate and has good coverage. To obtain such a gold standard, the common practice is to do independent double annotation followed by adjudication. This takes significantly more human effort than annotation done by a single annotator.

10p bunthai_1 06-05-2013 48 2 Download
Scientific report: "Dependency Parsing of Hungarian: Baseline Results and Challenges"

Hungarian is a stereotype of morphologically rich and non-configurational languages. Here, we introduce results on dependency parsing of Hungarian that employ an 80K, multi-domain, fully manually annotated corpus, the Szeged Dependency Treebank. We show that the results achieved by state-of-the-art data-driven parsers on Hungarian and English (which is at the other end of the configurational-nonconfigurational spectrum) are quite similar to each other in terms of attachment scores.

11p bunthai_1 06-05-2013 58 3 Download
Scientific report: "Semi-supervised Training for the Averaged Perceptron POS Tagger"

This paper describes POS tagging experiments with semi-supervised training as an extension to the (supervised) averaged perceptron algorithm, first introduced for this task by (Collins, 2002). Experiments with an iterative training on standard-sized supervised (manually annotated) dataset (10^6 tokens) combined with a relatively modest (in the order of 10^8 tokens) unsupervised (plain) data in a bagging-like fashion showed significant improvement of the POS classification task on typologically different languages, yielding better than state-of-the-art results for English and Czech (4.

9p bunthai_1 06-05-2013 37 1 Download
Scientific report: "Correcting a PoS-tagged corpus using three complementary methods"

The quality of the part-of-speech (PoS) annotation in a corpus is crucial for the development of PoS taggers. In this paper, we experiment with three complementary methods for automatically detecting errors in the PoS annotation for the Icelandic Frequency Dictionary corpus. The first two methods are language independent and we argue that the third method can be adapted to other morphologically complex languages. Once possible errors have been detected, we examine each error candidate and hand-correct the corresponding PoS tag if necessary. ...

9p bunthai_1 06-05-2013 56 1 Download
Scientific report: "Correcting Dependency Annotation Errors"

Building on work detecting errors in dependency annotation, we set out to correct local dependency errors. To do this, we outline the properties of annotation errors that make the task challenging and their existence problematic for learning. For the task, we define a feature-based model that explicitly accounts for non-relations between words, and then use ambiguities from one model to constrain a second, more relaxed model. In this way, we are successfully able to correct many errors, in a way which is potentially applicable to dependency parsing more generally. ...

9p bunthai_1 06-05-2013 49 2 Download
Scientific report: "From detecting errors to automatically correcting them"

Faced with the problem of annotation errors in part-of-speech (POS) annotated corpora, we develop a method for automatically correcting such errors. Building on top of a successful error detection method, we first try correcting a corpus using two off-the-shelf POS taggers, based on the idea that they enforce consistency; with this, we find some improvement. After some discussion of the tagging process, we alter the tagging model to better account for problematic tagging distinctions. This modification results in significantly improved performance, reducing the error rate of the corpus. ...

8p bunthai_1 06-05-2013 45 2 Download
Scientific report: "Detecting Errors in Part-of-Speech Annotation"

We propose a new method for detecting errors in "gold-standard" part-of-speech annotation. The approach locates errors with high precision based on n-grams occurring in the corpus with multiple taggings. Two further techniques, closed-class analysis and finite-state tagging guide patterns, are discussed. The success of the three approaches is illustrated for the Wall Street Journal corpus as part of the Penn Treebank.

8p bunthai_1 06-05-2013 39 2 Download
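
The idea in the abstract above, flagging n-grams that occur in a corpus with more than one tagging, can be illustrated with a minimal Python sketch. This is a simplified illustration and not the authors' implementation: the function name `variation_ngrams`, the fixed window size `n`, and the toy corpus are all assumptions introduced here, and a real detector would need further filtering to separate genuine ambiguity from annotation errors.

```python
from collections import defaultdict


def variation_ngrams(sentences, n=3):
    """Return word n-grams that occur with more than one tag sequence.

    `sentences` is a list of sentences, each a list of (word, tag) pairs.
    (Hypothetical helper for illustration only.)
    """
    taggings = defaultdict(set)
    for sent in sentences:
        for i in range(len(sent) - n + 1):
            window = sent[i:i + n]
            words = tuple(w for w, _ in window)
            tags = tuple(t for _, t in window)
            # Record every tag sequence seen for this exact word sequence.
            taggings[words].add(tags)
    # Keep only word n-grams whose tagging varies across occurrences.
    return {words: tags for words, tags in taggings.items() if len(tags) > 1}


# Toy corpus, invented for illustration: "share" is tagged NN once and VB once
# in an otherwise identical context, so the trigram is a candidate error.
corpus = [
    [("the", "DT"), ("share", "NN"), ("price", "NN"), ("fell", "VBD")],
    [("the", "DT"), ("share", "VB"), ("price", "NN"), ("rose", "VBD")],
]
flagged = variation_ngrams(corpus, n=3)
```

Here `flagged` maps each inconsistent word n-gram to the set of tag sequences observed for it. Longer contexts make it more likely that a difference in tagging is an error rather than true ambiguity, so `n` trades recall for precision.
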
Scientific report: "Using Grammatical Relations to Compare Parsers"

We use the grammatical relations (GRs) described in Carroll et al. (1998) to compare a number of parsing algorithms. A first ranking of the parsers is provided by comparing the extracted GRs to a gold standard GR annotation of 500 Susanne sentences: this required an implementation of GR extraction software for Penn Treebank style parsers. In addition, we perform an experiment using the extracted GRs as input to the Lappin and Leass (1994) anaphora resolution algorithm.

8p bunthai_1 06-05-2013 56 2 Download
Scientific report: "Machine Aided Error-Correction Environment for Korean Morphological Analysis and Part-of-Speech Tagging"

Statistical methods require very large, high-quality corpora, but building a large and faultless annotated corpus is a very difficult job. This paper proposes an efficient method to construct a part-of-speech tagged corpus. A rule-based error correction method is proposed to find and correct errors semi-automatically by user-defined rules. We also make use of the user's correction log to reflect feedback. Experiments were carried out to show the efficiency of the error correction process of this workbench. The result shows that about 63.2% of tagging errors can be corrected. ...

5p bunrieu_1 18-04-2013 46 3 Download
Scientific report: "A Debug Tool for Practical Grammar Development"

We have developed willex, a tool that helps grammar developers to work efficiently by using annotated corpora and recording parsing errors. Willex has two major new functions. First, it decreases ambiguity of the parsing results by comparing them to an annotated corpus and removing wrong partial results both automatically and manually. Second, willex accumulates parsing errors as data for the developers to clarify the defects of the grammar statistically. We applied willex to a large-scale HPSG-style grammar as an example. ...

4p bunbo_1 17-04-2013 55 1 Download
Scientific report: "Combining Lexical, Syntactic, and Semantic Features with Maximum Entropy Models for Extracting Relations"

Extracting semantic relationships between entities is challenging because of a paucity of annotated data and the errors induced by entity detection modules. We employ Maximum Entropy models to combine diverse lexical, syntactic and semantic features derived from the text. Our system obtained competitive results in the Automatic Content Extraction (ACE) evaluation. Here we present our general approach and describe our ACE results.

4p bunbo_1 17-04-2013 49 1 Download

