The search space of Phrase-Based Statistical Machine Translation (PBSMT) systems can be represented as a directed acyclic graph (lattice). The quality of this search space can thus be evaluated by computing the best achievable hypothesis in the lattice, the so-called oracle hypothesis. For common SMT metrics, however, this problem is NP-hard and can only be solved using heuristics.
This paper proposes a new discriminative training method for constructing phrase and lexicon translation models. In order to reliably learn the myriad parameters in these models, we propose an expected BLEU score-based utility function with KL regularization as the objective, and train the models on a large parallel dataset.
We present a Minimum Bayes Risk (MBR) decoder for statistical machine translation. The approach aims to minimize the expected loss of translation errors with regard to the BLEU score. We show that MBR decoding on N-best lists leads to an improvement of translation quality. We report the performance of the MBR decoder on four different tasks: the TCSTAR EPPS Spanish-English task 2006, the NIST Chinese-English task 2005 and the GALE Arabic-English and Chinese-English task 2006. The absolute improvement of the BLEU score is between 0.2% for the TCSTAR task and 1.
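As a rough illustration of the MBR decoding on N-best lists that the abstract above describes, the following is a minimal sketch, not the authors' implementation. It assumes the loss is 1 minus a sentence-level BLEU score, here an add-one smoothed variant over unigrams and bigrams chosen only to keep the example short. The names `sentence_bleu` and `mbr_decode`, and the toy N-best list with its posterior probabilities, are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hyp, ref, max_n=2):
    """Add-one smoothed sentence-level BLEU (simplified stand-in, an assumption)."""
    hyp_toks, ref_toks = hyp.split(), ref.split()
    if not hyp_toks:
        return 0.0
    log_prec = 0.0
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp_toks, n), ngrams(ref_toks, n)
        matches = sum(min(count, r[gram]) for gram, count in h.items())
        total = sum(h.values())
        # Add-one smoothing avoids log(0) when no n-gram matches.
        log_prec += math.log((matches + 1) / (total + 1)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the reference.
    bp = min(1.0, math.exp(1 - len(ref_toks) / len(hyp_toks)))
    return bp * math.exp(log_prec)


def mbr_decode(nbest):
    """Pick the hypothesis with minimum expected loss under the N-best posterior.

    nbest: list of (hypothesis string, posterior probability) pairs.
    Every other entry of the list acts as a pseudo-reference.
    """
    z = sum(p for _, p in nbest)  # renormalize over the N-best list

    def expected_loss(cand):
        return sum((p / z) * (1.0 - sentence_bleu(cand, other))
                   for other, p in nbest)

    return min(nbest, key=lambda item: expected_loss(item[0]))[0]


# Hypothetical N-best list: the single most probable entry is an outlier.
nbest = [
    ("the cat sat on the mat", 0.3),
    ("the cat sat on a mat", 0.3),
    ("a dog ran away", 0.4),
]
best = mbr_decode(nbest)
```

In this toy list the highest-probability hypothesis is the outlier, so maximum a posteriori selection would return it, whereas the MBR rule prefers a hypothesis that agrees with most of the probability mass.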
We present the results of an experiment on extending the automatic method of Machine Translation evaluation BLEU with statistical weights for lexical items, such as tf.idf scores. We show that this extension gives additional information about evaluated texts; in particular it allows us to measure translation Adequacy, which, for statistical MT systems, is often overestimated by the baseline BLEU method. The proposed model uses a single human reference translation, which increases the usability of the proposed method for practical purposes. ...
Founded in Paris in 1895, Le Cordon Bleu has over more than a century become the world's leading school for training personnel in hotel management, culinary arts and tourism. In Australia, Le Cordon Bleu opened in Sydney in 1996 and, in 1998, in the city of Adelaide in the south of the country. The programs of Le Cordon Bleu Australia are all taught by head chefs and senior hotel managers.
Choose the correct option (A, B, C or D) for each of the following sentences. Question 1: Ma tante m'a offert une jupe ______. A. bleu marine B. vertes C. noir D. bleus Question 2: Au cas où tu ______ tard, préviens-moi ! A. rentrerais B. rentrais C. es rentré D. rentreras Question 3: Ces actrices, je ______ ai rencontrées avant-hier dans un restaurant indien. A. les B. nous C. lui D. leur Question 4: Josette est retournée chez elle parce qu'elle ______ son billet d'avion. A. oublierait
Bleu's model of allowable variation in translation is crude, and in many cases it cannot distinguish between translations of clearly different quality. Since Bleu assigns similar scores to translations of different quality, it is reasonable
In this paper we introduce methods for evaluating the quality of a translation using the NIST and BLEU metrics. We then introduce a tool we developed to automatically evaluate the quality of online machine translation systems such as Reverso, Systran...
I was called to interview in DC before my overseas departure. I was quickly informed that I met
all the qualifications and thus was offered the job. I was so excited and actually in disbelief. I
thought it was too good to be true. I actually got a job before arriving at post! Had I
already been at post, the process would have been much more difficult since the interviewing is
done at the Department along with filing the paperwork, etc.
“The job is perfect, it interlaces my investigative skills with my knowledge of the government,...
In this paper, we propose a linguistically annotated reordering model for BTG-based statistical machine translation. The model incorporates linguistic knowledge to predict orders for both syntactic and non-syntactic phrases. The linguistic knowledge is automatically learned from source-side parse trees through an annotation algorithm. We empirically demonstrate that the proposed model leads to a significant improvement of 1.55% in the BLEU score over the baseline reordering model on the NIST MT-05 Chinese-to-English translation task. ...
Automatic tools for machine translation (MT) evaluation such as BLEU are well established, but have the drawbacks that they do not perform well at the sentence level and that they presuppose manually translated reference texts. Assuming that the MT system to be evaluated can deal with both directions of a language pair, in this research we suggest conducting automatic MT evaluation by determining the orthographic similarity between a back-translation and the original source text. This way we eliminate the need for human-translated reference texts.
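The round-trip idea in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which orthographic similarity measure is used, so a character-level Levenshtein distance normalized by the longer string stands in for it. The function names and the `translate_fwd` / `translate_back` callables are hypothetical placeholders for the two directions of the MT system being evaluated.

```python
def levenshtein(a, b):
    """Character-level edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


def orthographic_similarity(source, back_translation):
    """Similarity in [0, 1]: 1.0 means the two strings are identical.

    Assumption: normalized edit distance stands in for the measure
    left unspecified in the abstract.
    """
    longest = max(len(source), len(back_translation))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(source, back_translation) / longest


def round_trip_score(source, translate_fwd, translate_back):
    """Score a system by comparing the source with its back-translation.

    translate_fwd and translate_back are hypothetical callables wrapping
    the MT system in each direction; no reference translation is needed.
    """
    return orthographic_similarity(source, translate_back(translate_fwd(source)))
```

A call such as `round_trip_score(text, to_target, to_source)` would return a value near 1.0 when the back-translation reproduces the source closely; `to_target` and `to_source` are stand-ins for whatever system is under evaluation.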
Inspired by previous preprocessing approaches to SMT, this paper proposes a novel, probabilistic approach to reordering which combines the merits of syntax and phrase-based SMT. Given a source sentence and its parse tree, our method generates, by tree operations, an n-best list of reordered inputs, which are then fed to a standard phrase-based decoder to produce the optimal translation. Experiments show that, for the NIST MT-05 task of Chinese-to-English translation, the proposal leads to a BLEU improvement of 1.56%. ...
About Renard: Author of novels, short stories and serials, known for his fantastic tales. His best-known novel is «Les Mains d'Orlac», adapted several times for the cinema. Available on Feedbooks for Renard: • Château hanté (1920) • Le Maître de la lumière (1933) • Le Péril Bleu (1912) • L'Homme Truqué (1921) • La Rumeur dans la montagne (1921) Copyright: This work is available for countries where copyright is Life+70 and in the USA.
Many machine translation (MT) evaluation metrics have been shown to correlate better with human judgment than BLEU. In principle, tuning on these metrics should yield better systems than tuning on BLEU. However, due to issues such as speed, requirements for linguistic resources, and optimization difficulty, they have not been widely adopted for tuning.
Source language parse trees offer very useful but imperfect reordering constraints for statistical machine translation. A lot of effort has been made for soft applications of syntactic constraints. We alternatively propose the selective use of syntactic constraints. A classifier is built automatically to decide whether a node in the parse trees should be used as a reordering constraint or not. Using this information yields a 0.8 BLEU point improvement over a full constraint-based system.
We illustrate and explain problems of n-gram-based machine translation (MT) metrics (e.g. BLEU) when applied to morphologically rich languages such as Czech. A novel metric, SemPOS, based on the deep-syntactic representation of the sentence, tackles the issue and retains the performance for translation to English as well.
We introduce a novel semi-automated metric, MEANT, that assesses translation utility by matching semantic role fillers, producing scores that correlate with human judgment as well as HTER but at much lower labor cost. As machine translation systems improve in lexical choice and fluency, the shortcomings of widespread n-gram based, fluency-oriented MT evaluation metrics such as BLEU, which fail to properly evaluate adequacy, become more apparent.
We show that unseen words account for a large part of the translation error when moving to new domains. Using an extension of a recent approach to mining translations from comparable corpora (Haghighi et al., 2008), we are able to find translations for otherwise OOV terms. We show several approaches to integrating such translations into a phrase-based translation system, yielding consistent improvements in translation quality (between 0.5 and 1.5 Bleu points) on four domains and two language pairs.