
Word segmentation

Showing 1-20 of 84 results for Word segmentation
  • Despite word-of-mouth (WOM) and electronic WOM (eWOM) influencing people’s willingness to donate blood, no research has explored this behavior among blood service employees who are also donors. This underexplored segment is highly important, as they are generally committed to both the organization and the cause and are likely more informed on the topic of blood donation than the average donor.

    pdf12p vishanshan 27-06-2024 1 1   Download

  • This paper is structured as follows: a literature review, which discusses the methods and frameworks used in recent research on Khmer word segmentation and POS tagging; a bidirectional long short-term memory section, which describes the experiment of this study and its results; and finally a future work section.

    pdf12p vibego 02-02-2024 2 0   Download

  • This paper proposes a sentiment analysis technique for Thai customers’ reviews. The proposed technique is based on the integration of Thai word extraction and sentiment analysis techniques for mining Thai customers’ opinions. Before the proposed technique is described in more detail, the segmentation problems of the Thai language are first discussed in the next section to clarify the problem.

    pdf19p meriday 20-04-2019 20 1   Download

  • In this chapter, students will learn about: phonology, co-articulation effects, suprasegmental features, morphology, word, morpheme, word and morpheme, phoneme and morpheme, lexeme and morpheme.

    ppt38p tieu_vu17 02-08-2018 38 1   Download

  • We propose a content-based video retrieval system whose main steps together yield good performance. From a source video, we extract keyframes and principal objects using the Segmentation of Aggregating Superpixels (SAS) algorithm. After that, Speeded Up Robust Features (SURF) are selected from those principal objects. Then, a bag-of-words model together with SVM classification is applied to obtain the retrieval result. Our system is evaluated on over 300 videos ranging from music, history, movies, sports, and natural scenes to TV programs.

    pdf10p tieuthi3006 16-03-2018 48 2   Download

  • In the second stage, we looked at cross-language correlations, first over a 1-year time period and then over a 2-year time period. Over a 1-year time period, Spanish performance at the end of second grade had a modest relationship to English performance at the end of third grade on the phonemic segmentation, word, and pseudoword tasks. Performance on the Spanish letter identification task at the end of second grade was positively related to English performance on the same task at the end of third grade, but only for the group of children instructed in Spanish only. Over...

    pdf22p commentcmnr 03-06-2013 67 5   Download

  • The measures administered over the course of the study included both researcher-developed tests and standardized tests of the components of reading described above. The researcher-developed tests included the following: a phonology test and a phonemic segmentation task (phonological awareness); a letter, word, and pseudoword naming task (word reading); and tests of cognate awareness and morphological awareness (word knowledge).

    pdf33p commentcmnr 03-06-2013 53 7   Download

  • A second criterion for meaningful research on cross-language transfer is the recognition that literacy comprises many component skills. The component skills of reading must be carefully assessed in the first and second language to trace the development of first- and second-language abilities in relation to one another.

    pdf25p commentcmnr 03-06-2013 63 8   Download

  • One of the major problems one is faced with when decomposing words into their constituent parts is ambiguity: the generation of multiple analyses for one input word, many of which are implausible. In order to deal with ambiguity, the MORphological PArser MORPA is provided with a probabilistic context-free grammar (PCFG), i.e. it combines a "conventional" context-free morphological grammar to filter out ungrammatical segmentations with a probability-based scoring function which determines the likelihood of each successful parse. ...

    pdf10p buncha_1 08-05-2013 43 2   Download

  • This paper presents a trainable rule-based algorithm for performing word segmentation. The algorithm provides a simple, language-independent alternative to large-scale lexical-based segmenters requiring large amounts of knowledge engineering. As a stand-alone segmenter, we show our algorithm to produce high performance Chinese segmentation. In addition, we show the transformation-based algorithm to be effective in improving the output of several existing word segmentation algorithms in three different languages. ...

    pdf8p bunthai_1 06-05-2013 45 2   Download

  • We propose a novel method for learning morphological paradigms that are structured within a hierarchy. The hierarchical structuring of paradigms groups morphologically similar words close to each other in a tree structure. This makes morphological similarities easy to detect, leading to improved morphological segmentation. Our evaluation using the datasets of Kurimo et al. (2011a; 2011b) shows that our method performs competitively when compared with current state-of-the-art systems.

    pdf10p bunthai_1 06-05-2013 51 2   Download

  • We introduce a word segmentation approach to languages where word boundaries are not orthographically marked, with application to Phrase-Based Statistical Machine Translation (PB-SMT). Instead of using manually segmented monolingual domain-specific corpora to train segmenters, we make use of bilingual corpora and statistical word alignment techniques. First of all, our approach is adapted for the specific translation task at hand by taking the corresponding source (target) language into account. ...

    pdf9p bunthai_1 06-05-2013 44 2   Download

  • Although a lot of progress has been made recently in word segmentation and POS tagging for Chinese, the output of current state-of-the-art systems is too inaccurate to allow for syntactic analysis based on it. We present an experiment in improving the output of an off-the-shelf module that performs segmentation and tagging, the tokenizer-tagger from Beijing University (PKU). Our approach is based on transformation-based learning (TBL).

    pdf9p bunthai_1 06-05-2013 35 2   Download

  • In this paper we introduce a dynamic programming algorithm to perform linear text segmentation by global minimization of a segmentation cost function which consists of: (a) within-segment word similarity and (b) prior information about segment length. The evaluation of the segmentation accuracy of the algorithm on Choi's text collection showed that the algorithm achieves the best segmentation accuracy so far reported in the literature. Keywords: Text Segmentation, Document Retrieval, Information Retrieval, Machine Learning. ...

    pdf8p bunthai_1 06-05-2013 53 1   Download
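The dynamic programming idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give the actual cost function, so everything specific here is an assumption made for illustration: sentences are represented as sets of words, within-segment similarity is measured with Jaccard overlap, and the length prior is a quadratic penalty around a hypothetical expected length `mean_len`. The function names are invented for this example.

```python
from itertools import combinations


def jaccard(a, b):
    """Word-overlap similarity between two sentences given as sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def segment_cost(sents, i, j, mean_len, weight):
    """Cost of making sentences i..j-1 one segment (assumed form).

    (a) low within-segment word similarity is penalized;
    (b) deviation from the expected segment length is penalized.
    """
    dissimilarity = sum(
        1.0 - jaccard(sents[p], sents[q])
        for p, q in combinations(range(i, j), 2)
    )
    length_penalty = weight * ((j - i) - mean_len) ** 2
    return dissimilarity + length_penalty


def segment_text(sents, mean_len=2.0, weight=0.5):
    """Return segment end indices that minimize the total cost."""
    n = len(sents)
    best = [0.0] + [float("inf")] * n  # best[j]: min cost of sents[:j]
    back = [0] * (n + 1)               # back[j]: start of last segment
    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + segment_cost(sents, i, j, mean_len, weight)
            if cost < best[j]:
                best[j], back[j] = cost, i
    bounds, j = [], n
    while j > 0:                       # walk the back-pointers
        bounds.append(j)
        j = back[j]
    return sorted(bounds)


if __name__ == "__main__":
    toy = [{"cat", "dog"}, {"cat", "dog"},
           {"stock", "market"}, {"stock", "market"}]
    print(segment_text(toy))
```

On the toy input, the global minimum places a boundary between the two topics. Because every split point is considered for every prefix, the search is exhaustive over linear segmentations while costing only a quadratic number of segment evaluations.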

  • Humans know a great deal about relationships among words. This paper discusses relationships among word pronunciations. We describe a computer system which models human judgement of rhyme by assigning specific roles to the location of primary stress, the similarity of phonetic segments, and other factors. By using the model as an experimental tool, we expect to improve our understanding of rhyme. A related computer model will attempt to generate pronunciations for unknown words by analogy with those for known words. ...

    pdf7p bungio_1 03-05-2013 43 1   Download

  • We present an iterative procedure to build a Chinese language model (LM). We segment Chinese text into words based on a word-based Chinese language model. However, the construction of a Chinese LM itself requires word boundaries. To get out of the chicken-and-egg problem, we propose an iterative procedure that alternates two operations: segmenting text into words and building an LM. Starting with an initial segmented corpus and an LM based upon it, we use a Viterbi-like algorithm to segment another set of data. Then, we build an LM based on the second set and use the resulting LM to...

    pdf5p bunmoc_1 20-04-2013 37 1   Download
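The alternation described in the abstract above (segment with the current LM, then rebuild the LM) might look roughly like the following sketch. The abstract is truncated and does not specify the model, so several choices here are assumptions: a unigram LM stands in for the paper's word-based LM, the smoothed score is an ad hoc choice rather than a normalized probability, unseen strings are only allowed as single characters, and the seed corpus is pooled with the newly segmented text at each round. All function names are hypothetical, and the toy data uses Latin letters in place of Chinese characters.

```python
import math
from collections import Counter


def build_lm(segmented):
    """Unigram counts from a list of word lists."""
    counts = Counter(w for sent in segmented for w in sent)
    return counts, sum(counts.values())


def viterbi_segment(text, counts, total, max_len=4):
    """Highest-scoring split of text into words under the unigram LM."""
    n = len(text)
    best = [0.0] + [-math.inf] * n  # best[j]: best score of text[:j]
    back = [0] * (n + 1)            # back[j]: start of the last word
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            word = text[i:j]
            c = counts.get(word, 0)
            if c == 0 and len(word) > 1:
                continue  # assumption: unseen words are single characters
            score = best[i] + math.log((c + 0.5) / (total + 1.0))
            if score > best[j]:
                best[j], back[j] = score, i
    words, j = [], n
    while j > 0:                    # walk the back-pointers
        words.append(text[back[j]:j])
        j = back[j]
    return words[::-1]


def iterate(seed_segmented, raw_texts, rounds=2):
    """Alternate segmenting raw text and rebuilding the LM."""
    counts, total = build_lm(seed_segmented)
    segmented = []
    for _ in range(rounds):
        segmented = [viterbi_segment(t, counts, total) for t in raw_texts]
        # assumption: pool the seed corpus with the new segmentation
        counts, total = build_lm(seed_segmented + segmented)
    return segmented


if __name__ == "__main__":
    seed = [["ab", "cd"], ["ab", "e"]]
    print(iterate(seed, ["abcdab"]))
```

The seed corpus breaks the chicken-and-egg dependency: it supplies enough word counts to segment new text once, and each later round lets the LM absorb statistics from text it segmented itself.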

  • We present a stochastic finite-state model for segmenting Chinese text into dictionary entries and productively derived words, and providing pronunciations for these words; the method incorporates a class-based model in its treatment of personal names. We also evaluate the system's performance, taking into account the fact that people often do not agree on a single segmentation.

    pdf8p bunmoc_1 20-04-2013 23 2   Download

  • Chinese sentences are written with no special delimiters such as space to indicate word boundaries. Existing Chinese NLP systems therefore employ preprocessors to segment sentences into words. Contrary to the conventional wisdom of separating this issue from the task of sentence understanding, we propose an integrated model that performs word boundary identification in lockstep with sentence understanding. In this approach, there is no distinction between rules for word boundary identification and rules for sentence understanding. These two functions are combined. ...

    pdf3p bunmoc_1 20-04-2013 46 1   Download

  • This paper proposes a new indicator of text structure, called the lexical cohesion profile (LCP), which locates segment boundaries in a text. A text segment is a coherent scene; the words in a segment are linked together via lexical cohesion relations. LCP records mutual similarity of words in a sequence of text. The similarity of words, which represents their cohesiveness, is computed using a semantic network. Comparison with the text segments marked by a number of subjects shows that LCP closely correlates with the human judgments.

    pdf3p bunmoc_1 20-04-2013 39 3   Download

  • We explore how active learning with Support Vector Machines works well for a non-trivial task in natural language processing. We use Japanese word segmentation as a test case. In particular, we discuss how the size of a pool affects the learning curve. It is found that in the early stage of training with a larger pool, more labeled examples are required to achieve a given level of accuracy than those with a smaller pool. In addition, we propose a novel technique to use a large number of unlabeled examples effectively by adding them gradually to a pool. ...

    pdf8p bunmoc_1 20-04-2013 43 2   Download
