Parts of speech (POS)
-
This paper is structured as follows: a literature review, which discusses different methods and frameworks in recent research on Khmer word segmentation and POS tagging; a bidirectional long short-term memory section, which describes the experiment of this study and its results; and finally a future work section.
12p
vibego
02-02-2024
4
0
-
The main goal of SC is to classify user reviews in a document into opinion poles, such as positive, negative, and possibly neutral sentiments. There are two popular approaches for SC: the lexicon-based approach and the machine learning-based approach. The lexicon-based approach is usually based on a dictionary of negative and positive sentiment values assigned to words. This method thus depends on human effort to define a list of sentiment words, and it sometimes suffers from low coverage.
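As a rough illustration of the lexicon-based approach described above, the following minimal sketch sums per-word sentiment values and maps the total to an opinion pole. The lexicon entries, their scores, and the function name `classify` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
# Hypothetical toy lexicon: the words and scores are illustrative assumptions.
LEXICON = {
    "good": 1.0, "great": 2.0, "excellent": 2.0,
    "bad": -1.0, "poor": -1.0, "terrible": -2.0,
}


def classify(review: str, lexicon: dict = LEXICON) -> str:
    """Label a review by summing the sentiment values of its known words."""
    # Naive whitespace tokenization with punctuation stripping (an assumption).
    tokens = [tok.strip(".,!?;:").lower() for tok in review.split()]
    # Words missing from the lexicon contribute nothing to the score.
    score = sum(lexicon.get(tok, 0.0) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify("The food was great, but the service was bad."))
print(classify("Superb phone"))
```

The second call shows the low-coverage problem: "superb" is not in the toy lexicon, so the review falls back to neutral.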
19p
nguaconbaynhay11
07-04-2021
12
2
-
The lecture “Natural language processing – Chapter 3: Basic principles for NLP” covers POS (part-of-speech) tagging, POS examples for English, methods of tagging, sentence types, and other contents.
28p
dien_vi01
21-11-2018
26
1
-
One of the crucial factors in statistical POS (Part-of-Speech) tagging approaches is the processing time. In this paper, we propose an approach to calculate the pruning threshold, which can be applied to the Viterbi algorithm of the Hidden Markov model for tagging texts in natural language processing. Experiments on 1,000,000 words of the tagged Wall Street Journal corpus showed that our proposed solution is satisfactory.
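To make the idea of pruning inside Viterbi decoding concrete, here is a minimal sketch of a first-order HMM decoder that drops any state scoring more than a fixed margin below the current best. The abstract does not say how the paper calculates its threshold, so the fixed `beam` value, the smoothing floor, and the toy probabilities are all assumptions made for illustration.

```python
import math

FLOOR = 1e-12  # assumed smoothing floor for unseen events


def _lp(p):
    """Log-probability with a floor so that log(0) never occurs."""
    return math.log(max(p, FLOOR))


def _prune(scores, beam):
    """Keep only states whose log score is within `beam` of the best one."""
    best = max(s for s, _ in scores.values())
    return {t: v for t, v in scores.items() if v[0] >= best - beam}


def viterbi_pruned(words, tags, start_p, trans_p, emit_p, beam=5.0):
    """Viterbi decoding in log space with a fixed pruning threshold."""
    # active maps each surviving tag to (log score, best path ending in it).
    active = {
        t: (_lp(start_p.get(t, 0.0)) + _lp(emit_p[t].get(words[0], 0.0)), [t])
        for t in tags
    }
    active = _prune(active, beam)
    for w in words[1:]:
        nxt = {}
        for t in tags:
            # Only surviving predecessors are expanded; this is the saving.
            score, path = max(
                ((s + _lp(trans_p[p].get(t, 0.0)), pth)
                 for p, (s, pth) in active.items()),
                key=lambda x: x[0],
            )
            nxt[t] = (score + _lp(emit_p[t].get(w, 0.0)), path + [t])
        active = _prune(nxt, beam)
    return max(active.values(), key=lambda x: x[0])[1]


# Toy model; every probability below is made up for the example.
TAGS = ["DET", "NOUN", "VERB"]
START = {"DET": 0.8, "NOUN": 0.1, "VERB": 0.1}
TRANS = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.2, "VERB": 0.7},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT = {
    "DET": {"the": 0.9},
    "NOUN": {"dog": 0.5, "barks": 0.1},
    "VERB": {"dog": 0.05, "barks": 0.6},
}

print(viterbi_pruned(["the", "dog", "barks"], TAGS, START, TRANS, EMIT))
```

A smaller `beam` expands fewer predecessor states per word and so runs faster, at the risk of discarding the path that would have been optimal; `beam=float("inf")` recovers ordinary Viterbi.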
10p
cumeo3000
01-08-2018
27
0
-
A clustering technique for the Vietnamese word categorization. In natural language processing, part-of-speech (POS) tagging plays an important role, as its output is the input of many other tasks (syntax analysis, semantic analysis, ...). One of the problems related to POS tagging is to define the POS set. This could be solved using unsupervised machine learning methods.
12p
lehasiphuong
22-05-2018
36
2
-
Part-of-speech (POS) tagging plays an important role in Natural Language Processing (NLP). Its applications can be found in many other NLP tasks such as named entity recognition, syntactic parsing, dependency parsing and text chunking. In the investigation conducted in this paper, we utilize the techniques of two widely-used toolkits, ClearNLP and Stanford POS Tagger, and develop two new POS taggers for Vietnamese, then compare them to three well-known Vietnamese taggers, namely JVnTagger, vnTagger and RDRPOSTagger.
15p
truongtien_09
10-04-2018
32
6
-
We download the original newspaper articles automatically from the WWW, and apply a number of processing stages sequentially. Lexical Tagger: The tagger (Elworthy, 1994) assigns and ranks part-of-speech (PoS) tags for each word in a sentence using a first-order HMM. The tagger includes an unknown word guesser with an accuracy of around 85%, and a large disk-resident lexicon specialised to newspaper text. Morphological Analyser: The morphological analyser (an enhanced version of the GATE project lemmatiser (Cunningham et al.
2p
bunthai_1
06-05-2013
55
3
-
We present experiments with part-of-speech tagging for Bulgarian, a Slavic language with rich inflectional and derivational morphology. Unlike most previous work, which has used a small number of grammatical categories, we work with 680 morpho-syntactic tags. We combine a large morphological lexicon with prior linguistic knowledge and guided learning from a POS-annotated corpus, achieving accuracy of 97.98%, which is a significant improvement over the state-of-the-art for Bulgarian.
11p
bunthai_1
06-05-2013
47
3
-
We present the PONG method to compute selectional preferences using part-of-speech (POS) N-grams. From a corpus labeled with grammatical dependencies, PONG learns the distribution of word relations for each POS N-gram. From the much larger but unlabeled Google N-grams corpus, PONG learns the distribution of POS N-grams for a given pair of words. We derive the probability that one word has a given grammatical relation to the other. PONG estimates this probability by combining both distributions, whether or not either word occurs in the labeled corpus. ...
10p
bunthai_1
06-05-2013
43
4
-
The quality of the part-of-speech (PoS) annotation in a corpus is crucial for the development of PoS taggers. In this paper, we experiment with three complementary methods for automatically detecting errors in the PoS annotation for the Icelandic Frequency Dictionary corpus. The first two methods are language independent and we argue that the third method can be adapted to other morphologically complex languages. Once possible errors have been detected, we examine each error candidate and hand-correct the corresponding PoS tag if necessary. ...
9p
bunthai_1
06-05-2013
56
1
-
This paper examines unsupervised approaches to part-of-speech (POS) tagging for morphologically-rich, resource-scarce languages, with an emphasis on Goldwater and Griffiths’s (2007) fully-Bayesian approach originally developed for English POS tagging. We argue that existing unsupervised POS taggers unrealistically assume as input a perfect POS lexicon, and consequently, we propose a weakly supervised fully-Bayesian approach to POS tagging, which relaxes the unrealistic assumption by automatically acquiring the lexicon from a small amount of POS-tagged data....
9p
bunthai_1
06-05-2013
55
1
-
Faced with the problem of annotation errors in part-of-speech (POS) annotated corpora, we develop a method for automatically correcting such errors. Building on top of a successful error detection method, we first try correcting a corpus using two off-the-shelf POS taggers, based on the idea that they enforce consistency; with this, we find some improvement. After some discussion of the tagging process, we alter the tagging model to better account for problematic tagging distinctions. This modification results in significantly improved performance, reducing the error rate of the corpus. ...
8p
bunthai_1
06-05-2013
45
2
-
This paper discusses the theoretical and practical concerns in part-of-speech (POS) tagging for Chinese. Unlike other languages such as English, Chinese lacks morphological marking in association with categorial alternations. We consider such categorial fluidity a continuum, and any categorial shift a transition, with special focus on the verb-noun shift.
4p
bunthai_1
06-05-2013
38
2
-
We describe how unknown lexical entries are processed in a unification-based framework with large-coverage grammars and how lexical entries are extracted from their usage. To keep the time and space usage during parsing within bounds, information from external sources like Part of Speech (PoS) taggers and morphological analysers is taken into account when information is constructed for unknown words.
4p
bunthai_1
06-05-2013
40
2
-
This paper presents a novel statistical model for automatic identification of English baseNP. It uses two steps: N-best Part-Of-Speech (POS) tagging and baseNP identification given the N-best POS sequences. Unlike other approaches where the two steps are separated, we integrate them into a unified statistical framework. Our model also integrates lexical information. Finally, the Viterbi algorithm is applied to perform a global search over the entire sentence, allowing us to obtain linear complexity for the entire process. ...
8p
bunrieu_1
18-04-2013
51
2
-
For biomedical information extraction, most systems use syntactic patterns on verbs (anchor verbs) and their arguments. Anchor verbs can be selected by focusing on their arguments. We propose to use predicate-argument structures (PASs), which are outputs of a full parser, to obtain verbs and their arguments. In this paper, we evaluated the PAS method by comparing it to a method using part-of-speech (POS) pattern matching. POS patterns produced larger result sets containing incorrect arguments, which adversely affect the phase that selects appropriate verbs. ...
4p
bunbo_1
17-04-2013
39
1
-
The Hidden Markov Model (HMM) for part-of-speech (POS) tagging is typically based on tag trigrams. As such it models local context but not global context, leaving long-distance syntactic relations unrepresented. Using n-gram models for n > 3 in order to incorporate global context is problematic, as the tag sequences corresponding to higher-order models will become increasingly rare in training data, leading to incorrect estimations of their probabilities.
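The sparsity argument above can be checked on a small scale. The sketch below counts how many distinct tag n-grams a corpus attests, how many are possible for a given tagset size, and how many were seen only once. The two-sentence corpus, the tagset size of 4, and the function name `ngram_stats` are illustrative assumptions, not data from the paper.

```python
from collections import Counter

# Hypothetical tagged corpus: two tag sequences over a 4-tag tagset.
CORPUS = [
    ["DET", "NOUN", "VERB", "DET", "NOUN"],
    ["DET", "NOUN", "VERB", "DET", "ADJ", "NOUN"],
]


def ngram_stats(tag_sequences, n, tagset_size):
    """Return (distinct n-grams seen, n-grams possible, n-grams seen once)."""
    counts = Counter()
    for tags in tag_sequences:
        for i in range(len(tags) - n + 1):
            counts[tuple(tags[i:i + n])] += 1
    possible = tagset_size ** n  # grows exponentially with n
    singletons = sum(1 for c in counts.values() if c == 1)
    return len(counts), possible, singletons


for n in (2, 3, 4):
    seen, possible, once = ngram_stats(CORPUS, n, tagset_size=4)
    print(f"n={n}: {seen}/{possible} n-grams attested, {once} seen only once")
```

As n grows, the attested share of the possible n-gram space shrinks and most attested n-grams rest on a single observation, which is why their estimated probabilities become unreliable.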
6p
bunbo_1
17-04-2013
51
4
-
In this paper we present a quantitative and qualitative analysis of annotation in the Hinoki treebank of Japanese, and investigate a method of speeding annotation by using part-of-speech tags. The Hinoki treebank is a Redwoods-style treebank of Japanese dictionary definition sentences. 5,000 sentences are annotated by three different annotators and the agreement evaluated. An average agreement of 65.4% was found using strict agreement, and 83.5% using labeled precision. Exploiting POS tags allowed the annotators to choose the best parse with 19.5% fewer decisions. ...
8p
bunbo_1
17-04-2013
38
4
-
This paper describes our work on building a Part-of-Speech (POS) tagger for Bengali. We have used Hidden Markov Model (HMM) and Maximum Entropy (ME) based stochastic taggers. Bengali is a morphologically rich language and our taggers make use of morphological and contextual information of the words. Since only a small labeled training set is available (45,000 words), a simple stochastic approach does not yield very good results. In this work, we have studied the effect of using a morphological analyzer to improve the performance of the tagger. ...
4p
hongvang_1
16-04-2013
36
2
-
In this interactive presentation, a Chinese named entity and relation identification system is demonstrated. The domain-specific system has a three-stage pipeline architecture which includes word segmentation and part-of-speech (POS) tagging, named entity recognition, and named entity relation identification. The experimental results have shown that the average F-measures for word segmentation and POS tagging after correcting errors reach 92.86 and 90.01, respectively.
4p
hongvang_1
16-04-2013
45
2