Showing 1-20 of 263 results for "Same language"
  • Homophones are words of the same language that are pronounced alike even if they differ in spelling, meaning, or origin, such as "pair" and "pear". Homophones may also be spelled alike, as in "bear" (the animal) and "bear" (to carry). Other common homophones are write and right, meet and meat, peace and piece. You have to listen to the context to know which word someone means when the words are spoken aloud. If they say they like your jeans (genes?), they’re probably talking about your pants and not your height and eye color — but you’d...
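The idea can be illustrated programmatically: group words by pronunciation and report same-sounding, differently spelled words. A minimal sketch; the pronunciation keys here are made-up stand-ins, not a real phonetic lexicon:

```python
from collections import defaultdict

PRONUNCIATIONS = {  # toy lexicon: word -> simplified pronunciation key
    "pair": "PEHR", "pear": "PEHR",
    "write": "RAYT", "right": "RAYT",
    "meet": "MEET", "meat": "MEET",
    "peace": "PEES", "piece": "PEES",
}

def homophones(word):
    """Return words pronounced like `word` but spelled differently."""
    groups = defaultdict(set)
    for w, p in PRONUNCIATIONS.items():
        groups[p].add(w)
    return sorted(groups[PRONUNCIATIONS[word]] - {word})
```

For example, `homophones("pair")` returns `["pear"]`.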


  • In 1992, one of those future teachers was still toiling in the orchards and fields of Central Washington, struggling to learn English, and dreaming of a return to teaching. Alfonso Lopez was born in a small village in Oaxaca, Mexico. By the time he arrived in Wenatchee in his mid-20s, he had already struggled through more adversity than many people face in a lifetime. The son of poor farmers, he managed to attend college and earn his teaching degree and later a master’s degree in social science. Lopez taught for five years in rural schools in Oaxaca.


  • This book got its start as an experiment in modern technology. When I started teaching at my present university (1998), the organization and architecture course focused on the 8088 running MS-DOS—essentially a programming environment as old as the sophomores taking the class. (This temporal freezing is unfortunately fairly common; when I took the same class during my undergraduate days, the computer whose architecture I studied was only two years younger than I was.)


  • We investigate the empirical behavior of n-gram discounts within and across domains. When a language model is trained and evaluated on two corpora from exactly the same domain, discounts are roughly constant, matching the assumptions of modified Kneser-Ney LMs. However, when training and test corpora diverge, the empirical discount grows essentially as a linear function of the n-gram count. We adapt a Kneser-Ney language model to incorporate such growing discounts, resulting in perplexity improvements over modified Kneser-Ney and Jelinek-Mercer baselines. ...
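The "roughly constant discount" assumption above is the heart of absolute discounting. A minimal sketch of an absolutely discounted bigram model with a fixed discount D and unigram-MLE backoff (modified Kneser-Ney additionally uses count-dependent discounts and continuation counts):

```python
from collections import Counter

def absolute_discount_bigram(tokens, D=0.75):
    """Bigram model with a constant absolute discount D, backing off to
    the unigram MLE; the simplest member of the Kneser-Ney family."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    total = len(tokens)

    def prob(h, w):
        c_h = unigrams[h]
        if c_h == 0:                                   # unseen history
            return unigrams[w] / total
        seen = sum(1 for (a, _) in bigrams if a == h)  # distinct successors of h
        lam = D * seen / c_h                           # mass freed by discounting
        return max(bigrams[(h, w)] - D, 0) / c_h + lam * (unigrams[w] / total)

    return prob
```

With tokens from `"a b a b a c"`, the discounted probabilities for history `"a"` still sum to one over the vocabulary.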


  • This paper revisits the pivot language approach for machine translation. First, we investigate three different methods for pivot translation. Then we employ a hybrid method combining RBMT and SMT systems to fill up the data gap for pivot translation, where the source-pivot and pivot-target corpora are independent. Experimental results on spoken language translation show that this hybrid method significantly improves the translation quality, which outperforms the method using a source-target corpus of the same size. ...
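Triangulating through a pivot can be sketched as composing two translation tables, marginalizing over pivot words: P(t|s) = Σ_p P(p|s)·P(t|p). A toy word-level illustration (the vocabulary and probabilities are invented):

```python
def compose_pivot(src_to_piv, piv_to_tgt):
    """Compose source->pivot and pivot->target lexicons into a
    source->target table: P(t | s) = sum_p P(p | s) * P(t | p).
    Tables are {word: {translation: probability}} dicts (toy data)."""
    table = {}
    for s, pivots in src_to_piv.items():
        dist = {}
        for p, p_p_given_s in pivots.items():
            for t, p_t_given_p in piv_to_tgt.get(p, {}).items():
                dist[t] = dist.get(t, 0.0) + p_p_given_s * p_t_given_p
        table[s] = dist
    return table
```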


  • We propose a simple generative, syntactic language model that conditions on overlapping windows of tree context (or treelets) in the same way that n-gram language models condition on overlapping windows of linear context. We estimate the parameters of our model by collecting counts from automatically parsed text using standard n-gram language model estimation techniques, allowing us to train a model on over one billion tokens of data using a single machine in a matter of hours.
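The analogy to n-gram counting can be sketched by collecting parent-to-children production counts from parsed trees; this is a simplified stand-in for the paper's overlapping treelet contexts:

```python
from collections import Counter

def treelet_counts(tree, counts=None):
    """Collect parent -> child-labels production counts from a parse
    tree given as nested tuples, e.g. ("S", ("NP", "I"), ("VP", ...)).
    The counts feed n-gram-style relative-frequency estimation."""
    if counts is None:
        counts = Counter()
    label, children = tree[0], tree[1:]
    if children:
        child_labels = tuple(c[0] if isinstance(c, tuple) else c
                             for c in children)
        counts[(label, child_labels)] += 1
        for c in children:
            if isinstance(c, tuple):
                treelet_counts(c, counts)
    return counts
```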


  • Efficient decoding has been a fundamental problem in machine translation, especially with an integrated language model which is essential for achieving good translation quality. We develop faster approaches for this problem based on k-best parsing algorithms and demonstrate their effectiveness on both phrase-based and syntax-based MT systems. In both cases, our methods achieve significant speed improvements, often by more than a factor of ten, over the conventional beam-search method at the same levels of search error and translation accuracy. ...


  • Previous probabilistic part-of-speech tagging models for agglutinative languages have considered only the lexical forms of morphemes, not the surface forms of words. This causes an inaccurate calculation of the probability. The proposed model is based on the observation that when words (surface forms) share the same lexical form, their probabilities of appearing differ from each other. It is also designed to consider the lexical forms of words. By experiments, we show that the proposed model outperforms the bigram Hidden Markov model (HMM)-based tagging model.
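The bigram-HMM baseline mentioned above can be sketched with a compact Viterbi decoder; the transition and emission tables are toy inputs, and the start distribution is left implicitly uniform:

```python
def viterbi(words, tags, trans, emit):
    """Minimal bigram-HMM tagger: trans[(t_prev, t)] and emit[(t, word)]
    are probabilities (missing entries count as 0). Returns the most
    probable tag sequence for `words`."""
    # chart: tag -> (best probability, best tag path ending in that tag)
    chart = {t: (emit.get((t, words[0]), 0.0), [t]) for t in tags}
    for w in words[1:]:
        chart = {
            t: max(
                (p * trans.get((tp, t), 0.0) * emit.get((t, w), 0.0),
                 path + [t])
                for tp, (p, path) in chart.items()
            )
            for t in tags
        }
    return max(chart.values())[1]
```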


  • Previous comparisons of document and query translation suffered from the differing quality of machine translation in these two opposite directions. We avoid this difficulty by training identical statistical translation models for both translation directions using the same training data. We investigate information retrieval between English and French, incorporating both translation directions into both document translation- and query translation-based information retrieval, as well as into hybrid systems. ...


  • The undisputed favorite application for natural language interfaces has been data base query. Why? The reasons range from the relative simplicity of the task, including shallow semantic processing, to the potential real-world utility of the resultant system. Because of such reasons, the data base query task was an excellent paradigmatic problem for computational linguistics, and for the very same reasons it is now time for the field to abandon its protective cocoon and progress beyond this rather limiting task. ...


  • Do natural language database systems still provide a valuable environment for further work on natural language processing? Are there other systems which provide the same hard environment for testing, but allow us to explore more interesting natural language questions? In order to answer no to the first question and yes to the second (the position taken by our panel's chair), there must be an interesting language problem which is more naturally studied in some other system than in the database system. ...


  • We propose a novel algorithm for extracting dependencies from the derivations of a large fragment of CCG. Unlike earlier proposals, our dependency structures are always tree-shaped. We then use these dependency trees to compare the strong generative capacities of CCG and TAG and obtain surprising results: Both formalisms generate the same languages of derivation trees – but the mechanisms they use to bring the words in these trees into a linear order are incomparable.


  • In the past the evaluation of machine translation systems has focused on single system evaluations because there were only few systems available. But now there are several commercial systems for the same language pair. This requires new methods of comparative evaluation. In the paper we propose a black-box method for comparing the lexical coverage of MT systems. The method is based on lists of words from different frequency classes. It is shown how these word lists can be compiled and used for testing. We also present the results of using our method on 6 MT systems that translate...
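The black-box idea can be sketched as checking, per frequency class, what fraction of a word list a system recognizes; here a plain vocabulary set stands in for querying a real MT system:

```python
def lexical_coverage(word_lists, known_words):
    """Black-box lexical-coverage sketch: word_lists maps a frequency
    class name to its test words; known_words stands in for the set of
    words an MT system can translate. Returns per-class coverage."""
    return {cls: sum(w in known_words for w in words) / len(words)
            for cls, words in word_lists.items()}
```

Comparing these per-class fractions across systems gives the kind of lexical-coverage ranking the abstract describes, without inspecting any system internals.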


  • Due to historical and cultural reasons, English phrases, especially proper nouns and new words, frequently appear in Web pages written primarily in Asian languages such as Chinese and Korean. Although these English terms and their equivalents in the Asian languages refer to the same concept, they are erroneously treated as independent index units in traditional Information Retrieval (IR). This paper describes the degree to which the problem arises in IR and suggests a novel technique to solve it. ...


  • This paper presents a discriminative pruning method of n-gram language model for Chinese word segmentation. To reduce the size of the language model that is used in a Chinese word segmentation system, importance of each bigram is computed in terms of discriminative pruning criterion that is related to the performance loss caused by pruning the bigram. Then we propose a step-by-step growing algorithm to build the language model of desired size.
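A much-simplified version of the pruning idea: score each bigram by how much its estimate deviates from the unigram backoff, weighted by frequency (a stand-in for the paper's discriminative, performance-loss criterion), and prune the lowest-scoring ones:

```python
import math
from collections import Counter

def rank_bigrams(tokens):
    """Score each bigram (h, w) by c(h, w) * |log(P(w | h) / P(w))|:
    frequent bigrams whose estimate deviates most from the unigram
    backoff matter most. Returns bigrams ranked most-important-first;
    pruning to a desired model size keeps a prefix of this list."""
    uni = Counter(tokens)
    bi = Counter(zip(tokens, tokens[1:]))
    n = len(tokens)
    score = {(h, w): c * abs(math.log((c / uni[h]) / (uni[w] / n)))
             for (h, w), c in bi.items()}
    return sorted(score, key=score.get, reverse=True)
```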


  • Partition-based morphology is an approach to finite-state morphology in which a grammar describes a special kind of regular relation that splits all the strings of a given tuple into the same number of substrings. These grammars are compiled into finite-state machines. In this paper, we address the question of merging grammars using different partitionings into a single finite-state machine. A morphological description may then be obtained by parallel or sequential application of constraints expressed on different partition notions (e.g. morpheme, phoneme, grapheme). ...


  • We have established a phonotactic language model as the solution to spoken language identification (LID). In this framework, we define a single set of acoustic tokens to represent the acoustic activities in the world’s spoken languages. A voice tokenizer converts a spoken document into a text-like document of acoustic tokens. Thus a spoken document can be represented by a count vector of acoustic tokens and token n-grams in the vector space.
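The vector-space step can be sketched as building token n-gram count vectors and comparing them with cosine similarity; characters stand in here for the acoustic tokens the abstract describes:

```python
import math
from collections import Counter

def ngram_vector(tokens, n=2):
    """Count vector of token n-grams (in the paper the tokens are
    acoustic; here characters stand in)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(c * v[k] for k, c in u.items())
    norm = lambda x: math.sqrt(sum(c * c for c in x.values()))
    return dot / (norm(u) * norm(v))
```

A spoken document's vector can then be compared against per-language profile vectors, choosing the closest language.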


  • The n-gram model is a stochastic model, which predicts the next word (predicted word) given the previous words (conditional words) in a word sequence. The cluster n-gram model is a variant of the n-gram model in which similar words are classified in the same cluster. It has been demonstrated that using different clusters for predicted and conditional words leads to cluster models that are superior to classical cluster models which use the same clusters for both words. This is the basis of the asymmetric cluster model (ACM) discussed in our study. ...
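The asymmetry can be sketched as a two-factor bigram estimate that uses one clustering for the conditional word and another for the predicted word: P(w|h) ≈ P(C_pred(w)|C_cond(h))·P(w|C_pred(w)). The clusterings below are toy inputs:

```python
from collections import Counter

def acm_bigram(tokens, cond_cluster, pred_cluster):
    """Asymmetric cluster bigram sketch:
        P(w | h) ~= P(C_pred(w) | C_cond(h)) * P(w | C_pred(w))
    with *different* clusterings (word -> cluster dicts) for the
    conditional word h and the predicted word w."""
    pair_counts = Counter((cond_cluster[h], pred_cluster[w])
                          for h, w in zip(tokens, tokens[1:]))
    cond_totals = Counter(cond_cluster[h] for h in tokens[:-1])
    word_counts = Counter(tokens)
    pred_totals = Counter(pred_cluster[w] for w in tokens)

    def prob(h, w):
        ch, cw = cond_cluster[h], pred_cluster[w]
        p_cluster = pair_counts[(ch, cw)] / cond_totals[ch]
        p_word = word_counts[w] / pred_totals[cw]
        return p_cluster * p_word

    return prob
```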


  • In this paper, we explore statistical language modelling for a speech-enabled MP3 player application by generating a corpus from the interpretation grammar written for the application with the Grammatical Framework (GF) (Ranta, 2004). We create a statistical language model (SLM) directly from our interpretation grammar and compare recognition performance of this model against a speech recognition grammar compiled from the same GF interpretation grammar.
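Generating an SLM corpus from a grammar can be sketched by exhaustively expanding a small CFG into its finite sentence set (GF grammars are far richer; this shows only the shape of the idea, with an invented MP3-player grammar):

```python
import itertools

def generate(grammar, symbol="S"):
    """Enumerate every sentence of a small non-recursive CFG given as
    {nonterminal: [right-hand sides]}; symbols absent from the grammar
    are terminal words. The resulting sentence list can be used as a
    training corpus for a statistical language model."""
    if symbol not in grammar:
        return [[symbol]]                       # terminal word
    sentences = []
    for rhs in grammar[symbol]:
        expansions = [generate(grammar, s) for s in rhs]
        for combo in itertools.product(*expansions):
            sentences.append([w for part in combo for w in part])
    return sentences
```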


  • Chapter 8b - Semantics with dynamic typing. Here we consider the semantics of a dynamically typed language in order to examine the difference between statically and dynamically typed languages more fully. We use the same method of presentation as in section 8.2.
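The operational difference can be shown in a dynamically typed language such as Python, where type checks happen only at run time:

```python
def double(x):
    return x + x              # what + means depends on x's runtime type

assert double(3) == 6         # integer addition
assert double("ab") == "abab" # string concatenation

# A type error surfaces only when the offending call actually runs;
# a statically typed language would reject it at compile time.
caught = False
try:
    double(None)
except TypeError:
    caught = True
assert caught
```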

