Research on a non-statistical scheme for the insertion of English articles in machine-translated Russian is described. Ideal article insertion as a goal is challenged as unreasonable. Classification of English nouns, simple syntactic criteria, and multiple printout are the scheme's main features.
We have aligned Japanese and English news articles and sentences to make a large parallel corpus. We first used a method based on cross-language information retrieval (CLIR) to align the Japanese and English articles and then used a method based on dynamic programming (DP) matching to align the Japanese and English sentences in these articles. However, the results included many incorrect alignments.
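The DP matching step can be illustrated with a minimal sketch. It assumes a precomputed sentence-similarity matrix and a fixed skip penalty; the function name `align_sentences`, the `skip_penalty` parameter, and the scoring scheme are hypothetical and not taken from the abstract.

```python
from typing import List, Tuple


def align_sentences(sim: List[List[float]],
                    skip_penalty: float = 0.2) -> List[Tuple[int, int]]:
    """Monotone one-to-one alignment that maximizes total similarity.

    sim[i][j] is an assumed similarity between source sentence i and
    target sentence j. Leaving a sentence unaligned costs skip_penalty.
    """
    n = len(sim)
    m = len(sim[0]) if n else 0
    # best[i][j]: best score for the first i source and first j target sentences
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = best[i - 1][0] - skip_penalty
        back[i][0] = "skip_src"
    for j in range(1, m + 1):
        best[0][j] = best[0][j - 1] - skip_penalty
        back[0][j] = "skip_tgt"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = [
                (best[i - 1][j - 1] + sim[i - 1][j - 1], "match"),
                (best[i - 1][j] - skip_penalty, "skip_src"),
                (best[i][j - 1] - skip_penalty, "skip_tgt"),
            ]
            # On ties, max() keeps the first candidate, so "match" is preferred.
            best[i][j], back[i][j] = max(candidates, key=lambda c: c[0])
    # Trace back from the bottom-right cell to recover the aligned pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "match":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move == "skip_src":
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs
```

Because the alignment is forced to be monotone, a single wrongly high similarity score can pull neighbouring sentences into wrong pairs, which is one way incorrect alignments of the kind mentioned above could arise.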
For an 8,300-word sample of English text we have found that it is possible to provide at least an acceptable article for more than 90 per cent of the noun occurrences at a "cost" of providing a dual article for half of the occurrences.
Rule 1 - Identify Your Motivation and Desire
Identify your motivation by asking yourself why you want to learn the language.
Once you understand your motivation, you need to determine what your goal (GOAL) is.
Rule 2 - Realize that English is easy
Do you think English is very hard? If so, you will never learn English well.
ENGLISH IS EASY !!!...
Intonation is important for learners of English because even with satisfactory consonants and vowels, a phrase/sentence with an incorrect intonation contour may change the intended meaning of the whole utterance.
In this demo, we present SciSumm, an interactive multi-document summarization system for scientific articles. The document collection to be summarized is a list of papers cited together within the same source article, otherwise known as a co-citation. At the heart of the approach is a topic based clustering of fragments extracted from each article based on queries generated from the context surrounding the co-cited list of papers.
To verify hardware designs by model checking, circuit specifications are commonly expressed in the temporal logic CTL. Automatic conversion of English to CTL requires the definition of an appropriately restricted subset of English. We show how the limited semantic expressibility of CTL can be exploited to derive a hierarchy of subsets. Our strategy avoids potential difficulties with approaches that take existing computational semantic analyses of English as their starting point--such as the need to ensure that all sentences in the subset possess a CTL translation. ...
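As an illustration only, a restricted English specification such as "Whenever req is asserted, ack is eventually asserted" would correspond to the CTL formula below. The sentence and the signal names req and ack are hypothetical and do not come from the paper.

```latex
\mathbf{AG}\,\bigl(\mathit{req} \rightarrow \mathbf{AF}\,\mathit{ack}\bigr)
```

Here AG reads "on all paths, at every state" and AF reads "on all paths, at some future state".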
The goal of the project is to enhance the database of the Oxford Dictionary of English (a forthcoming new edition of the 1998 New Oxford Dictionary of English) so that it contains not only the original dictionary content but also additional sets of data formalizing, codifying, and supplementing this content. This will allow the dictionary to be exploited effectively as a resource for computational applications. The Oxford Dictionary of English (ODE) is a high-level dictionary intended for fluent English speakers (especially native speakers) rather than for learners.
This study attempts to present what the authors have experienced and applied in fostering learner autonomy in Country Studies (namely British and American Studies) at the Faculty of English, Hanoi National University of Education. Starting with some main definitions of learner autonomy and its conditions, and drawing on their own experience and beliefs, the authors discuss four main strategies used in teaching and learning Country Studies.
It is undeniable that English has become the most popular foreign language in Vietnam nowadays. Nevertheless, among the millions of people speaking English in Vietnam, there are many people who make mistakes in pronunciation. This creates some typical features of so-called “Vietnamese English”. This paper focuses on the mistakes made by Vietnamese users of English when pronouncing the four English sounds /ʃ/, / /, / / and / /. Reasons for the mistakes and some tentative suggestions to mitigate the problem are then discussed.
We study the challenges raised by Arabic verb and subject detection and reordering in Statistical Machine Translation (SMT). We show that post-verbal subject (VS) constructions are hard to translate because they have highly ambiguous reordering patterns when translated to English. In addition, implementing reordering is difficult because the boundaries of VS constructions are hard to detect accurately, even with a state-of-the-art Arabic dependency parser.
We report in this paper our work on accurately generating case markers and suffixes in English-to-Hindi SMT. Hindi is a relatively free word-order language, and makes use of a comparatively richer set of case markers and morphological suffixes for correct meaning representation. From our experience of large-scale English-Hindi MT, we are convinced that fluency and fidelity in the Hindi output get an order of magnitude facelift if accurate case markers and suffixes are produced.
This paper describes methods for relating (threading) multiple newspaper articles, and for visualizing various characteristics of them by using a directed graph. A set of articles is represented by a set of word vectors, and the similarity between the vectors is then calculated. The graph is constructed from the similarity matrix. By applying some constraints on the chronological ordering of articles, an efficient threading algorithm that runs in O(n) time (where n is the number of articles) is obtained. ...
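The pipeline of word vectors, similarity, and chronologically constrained graph edges can be sketched as follows. This is a naive quadratic-time illustration, not the paper's O(n) threading algorithm. The bag-of-words vectors, cosine similarity, `threshold` value, and function names are all assumptions made for the example.

```python
import math
from collections import Counter
from typing import Dict, List, Optional


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def thread_articles(texts: List[str],
                    threshold: float = 0.3) -> Dict[int, Optional[int]]:
    """Link each article to its most similar earlier article.

    texts must be in chronological order. The result maps each article
    index to the index of its predecessor (a directed edge from
    predecessor to article), or to None when no earlier article reaches
    the assumed similarity threshold.
    """
    vectors = [Counter(t.lower().split()) for t in texts]
    parent: Dict[int, Optional[int]] = {}
    for i, vec in enumerate(vectors):
        best_j, best_sim = None, threshold
        # Chronological constraint: only earlier articles are candidates.
        for j in range(i):
            s = cosine(vec, vectors[j])
            if s >= best_sim:
                best_j, best_sim = j, s
        parent[i] = best_j
    return parent
```

For example, given four short texts in time order about two unrelated stories, the later article on each story is linked back to the earlier one, and the first article of each story has no predecessor.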
Machine translation of locative prepositions is not straightforward, even between closely related languages. This paper discusses a system of translation of locative prepositions between English and French. The system is based on the premises that English and French do not always conceptualize objects in the same way, and that this accounts for the major differences in the ways that locative prepositions are used in these languages.
Academic writing is arguably the most important language skill for tertiary students, especially English majors, whose grades are largely determined by their performance in written assignments, academic reports, term examinations and graduation theses. In practice, however, Vietnamese learners have difficulty applying the right level of formality, lexical density, and objectivity.
We present a novel scheme to apply factored phrase-based SMT to a language pair with very disparate morphological structures. Our approach relies on syntactic analysis on the source side (English) and then encodes a wide variety of local and non-local syntactic structures as complex structural tags which appear as additional factors in the training data. On the target side (Turkish), we only perform morphological analysis and disambiguation but treat the complete complex morphological tag as a factor, instead of separating morphemes. ...
We present a disputant relation-based method for classifying news articles on contentious issues. We observe that the disputants of a contention are an important feature for understanding the discourse. The method performs unsupervised classification on news articles based on disputant relations, and helps readers intuitively view the articles through the opponent-based frame.
In contrast to many languages (like Russian or French), modern English does not distinguish formal and informal (“T/V”) address overtly, for example by pronoun choice. We describe an ongoing study which investigates to what degree the T/V distinction is recoverable in English text, and with what textual features it correlates. Our findings are: (a) human raters can label English utterances as T or V fairly well, given sufficient context; (b) lexical cues can predict T/V almost at human level. ...
In this paper we examine the task of sentence simplification, which aims to reduce the reading complexity of a sentence by incorporating more accessible vocabulary and sentence structure. We introduce a new data set that pairs English Wikipedia with Simple English Wikipedia and is orders of magnitude larger than any previously examined for sentence simplification.
In this paper, we propose a novel system for translating organization names from Chinese to English with the assistance of web resources. Firstly, we adopt a chunking-based segmentation method to improve the segmentation of Chinese organization names which is plagued by the OOV problem. Then a heuristic query construction method is employed to construct an efficient query which can be used to search the bilingual Web pages containing translation equivalents.