Parts of speech (POS)

Showing 1-20 of 35 results for Parts of speech (POS)

A review of Khmer word segmentation and part-of-speech tagging and an experimental study using bidirectional long short-term memory

This paper is structured as follows: a literature review, which discusses the methods and frameworks used in recent research on Khmer word segmentation and POS tagging; a section on bidirectional long short-term memory, which describes the experiment of this study and its results; and, finally, a section on future work.

12p vibego 02-02-2024 4 0 Download

A two-channel model for representation learning in Vietnamese sentiment classification problem

The main goal of SC is to classify user reviews in a document into opinion poles, such as positive, negative, and possibly neutral sentiments. There are two popular approaches for SC: the lexicon-based approach and the machine learning-based approach. The lexicon-based approach is usually based on a dictionary of negative and positive sentiment values assigned to words. This method thus depends on human effort to define a list of sentiment words, and it sometimes suffers from low coverage.

19p nguaconbaynhay11 07-04-2021 12 2 Download
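
The lexicon-based approach described in the entry above can be sketched roughly as follows. This is a minimal illustration, not the paper's method: the lexicon entries, their scores, and the function name `classify` are all assumptions made for the example. The second return value counts how many tokens the lexicon covers, which makes the low-coverage problem visible.

```python
# Toy sentiment lexicon. The words and scores are illustrative assumptions,
# not taken from the paper listed above.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}


def classify(review, lexicon=LEXICON):
    """Return (label, covered), where covered is the number of tokens
    that were found in the lexicon."""
    tokens = review.lower().split()
    # Words missing from the lexicon contribute nothing to the score.
    score = sum(lexicon.get(token, 0.0) for token in tokens)
    covered = sum(1 for token in tokens if token in lexicon)
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    return label, covered


if __name__ == "__main__":
    print(classify("great phone but bad battery"))
    print(classify("the screen is okay"))
```

A review with no lexicon words falls back to "neutral" with a coverage of zero, which is one way the low-coverage weakness shows up in practice.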

Lecture Natural language processing: Chapter 3 – Lê Ngọc Tấn

Lecture “Natural language processing – Chapter 3: Basic principles for NLP” has contents: POS – part of speech tagging, POS – part of speech examples for English, POS – Methods of tagging, sentence types,…and other contents.

28p dien_vi01 21-11-2018 26 1 Download
A formula to calculate pruning threshold for the part of speech tagging problem

One of the crucial factors in statistical approaches to POS (Part-of-Speech) tagging is processing time. In this paper, we propose an approach to calculate the pruning threshold, which can be applied in the Viterbi algorithm of a Hidden Markov model for tagging natural language text. An experiment on 1,000,000 words of the tagged Wall Street Journal corpus showed that our proposed solution is satisfactory.

10p cumeo3000 01-08-2018 27 0 Download
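
To illustrate where a pruning threshold enters Viterbi decoding, here is a minimal sketch. It does not reproduce the paper's formula: the threshold is simply a fixed parameter passed in by the caller, and the toy tags, probabilities, and function names are assumptions made for the example. At each step, any partial path whose log-probability falls more than `threshold` below the best one is dropped before the next word is processed.

```python
import math

NEG_INF = float("-inf")


def _log(p):
    """Log of a probability, mapping zero to negative infinity."""
    return math.log(p) if p > 0 else NEG_INF


def prune(scores, threshold):
    """Keep only tags whose score is within `threshold` of the best score."""
    if not scores:
        return scores
    best = max(s for s, _ in scores.values())
    return {t: sp for t, sp in scores.items() if best - sp[0] <= threshold}


def viterbi_pruned(words, tags, start_p, trans_p, emit_p, threshold):
    """First-order HMM Viterbi decoding in log space with threshold pruning."""
    if not words:
        return []
    # scores maps tag -> (log-prob of the best path ending in tag, that path)
    scores = {}
    for t in tags:
        s = _log(start_p.get(t, 0.0)) + _log(emit_p[t].get(words[0], 0.0))
        if s > NEG_INF:
            scores[t] = (s, [t])
    for w in words[1:]:
        scores = prune(scores, threshold)  # fewer predecessors to expand
        new_scores = {}
        for t in tags:
            e = _log(emit_p[t].get(w, 0.0))
            if e == NEG_INF:
                continue
            best = None
            for prev, (s, path) in scores.items():
                cand = s + _log(trans_p[prev].get(t, 0.0)) + e
                if cand > NEG_INF and (best is None or cand > best[0]):
                    best = (cand, path + [t])
            if best is not None:
                new_scores[t] = best
        scores = new_scores
    if not scores:
        return []
    return max(scores.values(), key=lambda sp: sp[0])[1]


# Toy model (all values are illustrative assumptions).
TAGS = ["DT", "NN", "VB"]
WORDS = ["the", "dog", "barks"]
START_P = {"DT": 0.8, "NN": 0.1, "VB": 0.1}
TRANS_P = {
    "DT": {"NN": 0.9, "VB": 0.05, "DT": 0.05},
    "NN": {"VB": 0.7, "NN": 0.2, "DT": 0.1},
    "VB": {"DT": 0.5, "NN": 0.3, "VB": 0.2},
}
EMIT_P = {
    "DT": {"the": 0.9},
    "NN": {"dog": 0.5, "barks": 0.1},
    "VB": {"barks": 0.6, "dog": 0.05},
}

if __name__ == "__main__":
    print(viterbi_pruned(WORDS, TAGS, START_P, TRANS_P, EMIT_P, threshold=5.0))
```

A smaller threshold discards more paths and runs faster, but risks dropping the path that would have been optimal; choosing that value well is the trade-off a threshold formula has to address.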
A clustering technique for the Vietnamese word categorization

In natural language processing, part-of-speech (POS) tagging plays an important role, as its output is the input of many other tasks (syntax analysis, semantic analysis, ...). One of the problems related to POS tagging is to define the POS set. This could be solved using unsupervised machine learning methods.

12p lehasiphuong 22-05-2018 36 2 Download
An Experimental Investigation of Part-Of-Speech Taggers for Vietnamese

Part-of-speech (POS) tagging plays an important role in Natural Language Processing (NLP). Its applications can be found in many other NLP tasks such as named entity recognition, syntactic parsing, dependency parsing and text chunking. In the investigation conducted in this paper, we utilize the techniques of two widely-used toolkits, ClearNLP and Stanford POS Tagger, and develop two new POS taggers for Vietnamese, then compare them to three well-known Vietnamese taggers, namely JVnTagger, vnTagger and RDRPOSTagger.

15p truongtien_09 10-04-2018 32 6 Download
Scientific report: "Simplifying Text for Language-Impaired Readers"

We download the original newspaper articles automatically from the WWW, and apply a number of processing stages sequentially. Lexical Tagger: The tagger (Elworthy, 1994) assigns and ranks part-of-speech (PoS) tags for each word in a sentence using a first-order HMM. The tagger includes an unknown word guesser with an accuracy of around 85%, and a large disk-resident lexicon specialised to newspaper text. Morphological Analyser: The morphological analyser (an enhanced version of the GATE project lemmatiser (Cunningham et al.

2p bunthai_1 06-05-2013 55 3 Download
Scientific report: "Feature-Rich Part-of-speech Tagging for Morphologically Complex Languages: Application to Bulgarian"

We present experiments with part-of-speech tagging for Bulgarian, a Slavic language with rich inflectional and derivational morphology. Unlike most previous work, which has used a small number of grammatical categories, we work with 680 morpho-syntactic tags. We combine a large morphological lexicon with prior linguistic knowledge and guided learning from a POS-annotated corpus, achieving accuracy of 97.98%, which is a significant improvement over the state-of-the-art for Bulgarian.

11p bunthai_1 06-05-2013 47 3 Download
Scientific report: "Inferring Selectional Preferences from Part-Of-Speech N-grams"

We present the PONG method to compute selectional preferences using part-of-speech (POS) N-grams. From a corpus labeled with grammatical dependencies, PONG learns the distribution of word relations for each POS N-gram. From the much larger but unlabeled Google N-grams corpus, PONG learns the distribution of POS N-grams for a given pair of words. We derive the probability that one word has a given grammatical relation to the other. PONG estimates this probability by combining both distributions, whether or not either word occurs in the labeled corpus. ...

10p bunthai_1 06-05-2013 43 4 Download
Scientific report: "Correcting a PoS-tagged corpus using three complementary methods"

The quality of the part-of-speech (PoS) annotation in a corpus is crucial for the development of PoS taggers. In this paper, we experiment with three complementary methods for automatically detecting errors in the PoS annotation for the Icelandic Frequency Dictionary corpus. The first two methods are language independent and we argue that the third method can be adapted to other morphologically complex languages. Once possible errors have been detected, we examine each error candidate and hand-correct the corresponding PoS tag if necessary. ...

9p bunthai_1 06-05-2013 56 1 Download
Scientific report: "Weakly Supervised Part-of-Speech Tagging for Morphologically-Rich, Resource-Scarce Languages"

This paper examines unsupervised approaches to part-of-speech (POS) tagging for morphologically-rich, resource-scarce languages, with an emphasis on Goldwater and Griffiths’s (2007) fully-Bayesian approach originally developed for English POS tagging. We argue that existing unsupervised POS taggers unrealistically assume as input a perfect POS lexicon, and consequently, we propose a weakly supervised fully-Bayesian approach to POS tagging, which relaxes the unrealistic assumption by automatically acquiring the lexicon from a small amount of POS-tagged data. ...

9p bunthai_1 06-05-2013 55 1 Download
Scientific report: "From detecting errors to automatically correcting them"

Faced with the problem of annotation errors in part-of-speech (POS) annotated corpora, we develop a method for automatically correcting such errors. Building on top of a successful error detection method, we first try correcting a corpus using two off-the-shelf POS taggers, based on the idea that they enforce consistency; with this, we find some improvement. After some discussion of the tagging process, we alter the tagging model to better account for problematic tagging distinctions. This modification results in significantly improved performance, reducing the error rate of the corpus. ...

8p bunthai_1 06-05-2013 45 2 Download
Scientific report: "Categorial Fluidity in Chinese and its Implications for Part-of-speech Tagging"

This paper discusses the theoretical and practical concerns in part-of-speech (POS) tagging for Chinese. Unlike other languages such as English, Chinese lacks morphological marking in association with categorial alternations. We consider such categorial fluidity a continuum, and any categorial shift a transition, with special focus on the verb-noun shift.

4p bunthai_1 06-05-2013 38 2 Download
Scientific report: "Lexicon acquisition with a large-coverage unification-based grammar"

We describe how unknown lexical entries are processed in a unification-based framework with large-coverage grammars and how lexical entries are extracted from their usage. To keep the time and space usage during parsing within bounds, information from external sources like Part of Speech (PoS) taggers and morphological analysers is taken into account when information is constructed for unknown words.

4p bunthai_1 06-05-2013 40 2 Download
Scientific report: "A Unified Statistical Model for the Identification of English BaseNP"

This paper presents a novel statistical model for automatic identification of English baseNP. It uses two steps: N-best Part-Of-Speech (POS) tagging, and baseNP identification given the N-best POS sequences. Unlike other approaches, where the two steps are separated, we integrate them into a unified statistical framework. Our model also integrates lexical information. Finally, the Viterbi algorithm is applied to perform a global search over the entire sentence, allowing us to obtain linear complexity for the entire process. ...

8p bunrieu_1 18-04-2013 51 2 Download
Scientific report: "Finding Anchor Verbs for Biomedical IE Using Predicate-Argument Structures"

For biomedical information extraction, most systems use syntactic patterns on verbs (anchor verbs) and their arguments. Anchor verbs can be selected by focusing on their arguments. We propose to use predicate-argument structures (PASs), which are outputs of a full parser, to obtain verbs and their arguments. In this paper, we evaluated the PAS method by comparing it to a method using part-of-speech (POS) pattern matching. POS patterns produced larger result sets containing incorrect arguments, which adversely affects the subsequent phase of selecting appropriate verbs. ...

4p bunbo_1 17-04-2013 39 1 Download
Scientific report: "Beyond N in N-gram Tagging"

The Hidden Markov Model (HMM) for part-of-speech (POS) tagging is typically based on tag trigrams. As such it models local context but not global context, leaving long-distance syntactic relations unrepresented. Using n-gram models for n > 3 in order to incorporate global context is problematic, as the tag sequences corresponding to higher-order models become increasingly rare in training data, leading to incorrect estimates of their probabilities.

6p bunbo_1 17-04-2013 51 4 Download
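
The sparsity problem described in the entry above can be seen even on a tiny corpus. The sketch below counts tag n-grams for increasing n and reports what fraction of the distinct n-grams occur only once. The three tag sequences, the `<s>` padding symbol, and the function names are assumptions made for illustration, not data or code from the paper.

```python
from collections import Counter


def tag_ngram_counts(tag_sequences, n):
    """Count tag n-grams, padding each sequence with n-1 start symbols."""
    counts = Counter()
    for tags in tag_sequences:
        padded = ["<s>"] * (n - 1) + list(tags)
        for i in range(len(padded) - n + 1):
            counts[tuple(padded[i:i + n])] += 1
    return counts


def singleton_fraction(counts):
    """Fraction of distinct n-grams that were observed exactly once."""
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c == 1) / len(counts)


# Toy tagged corpus (tags only); an illustrative assumption.
CORPUS = [
    ["DT", "NN", "VB", "DT", "NN"],
    ["DT", "JJ", "NN", "VB"],
    ["PRP", "VB", "DT", "NN"],
]

if __name__ == "__main__":
    for n in (1, 2, 3):
        counts = tag_ngram_counts(CORPUS, n)
        print(n, len(counts), round(singleton_fraction(counts), 3))
```

As n grows, the number of distinct n-grams rises while most of them are seen only once, so relative-frequency estimates of their probabilities become unreliable.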
Scientific report: "High Precision Treebanking — Blazing Useful Trees Using POS Information"

In this paper we present a quantitative and qualitative analysis of annotation in the Hinoki treebank of Japanese, and investigate a method of speeding annotation by using part-of-speech tags. The Hinoki treebank is a Redwoods-style treebank of Japanese dictionary definition sentences. 5,000 sentences are annotated by three different annotators and the agreement evaluated. An average agreement of 65.4% was found using strict agreement, and 83.5% using labeled precision. Exploiting POS tags allowed the annotators to choose the best parse with 19.5% fewer decisions. ...

8p bunbo_1 17-04-2013 38 4 Download
Scientific report: "Automatic Part-of-Speech Tagging for Bengali: An Approach for Morphologically Rich Languages in a Poor Resource Scenario"

This paper describes our work on building a Part-of-Speech (POS) tagger for Bengali. We have used Hidden Markov Model (HMM) and Maximum Entropy (ME) based stochastic taggers. Bengali is a morphologically rich language and our taggers make use of morphological and contextual information of the words. Since only a small labeled training set is available (45,000 words), a simple stochastic approach does not yield very good results. In this work, we have studied the effect of using a morphological analyzer to improve the performance of the tagger. ...

4p hongvang_1 16-04-2013 36 2 Download
Scientific report: "Chinese Named Entity and Relation Identification System"

In this interactive presentation, a Chinese named entity and relation identification system is demonstrated. The domain-specific system has a three-stage pipeline architecture which includes word segmentation and part-of-speech (POS) tagging, named entity recognition, and named entity relation identification. The experimental results have shown that the average F-measures for word segmentation and POS tagging after correcting errors reach 92.86 and 90.01, respectively.

4p hongvang_1 16-04-2013 45 2 Download
