
Scientific report: "THE FIRST CONFERENCE ON MECHANICAL TRANSLATION"


THE FOLLOWING is a report on the proceedings of the first MT Conference, held at the Massachusetts Institute of Technology, Cambridge, Mass., June 17-20, 1952, and my own reactions. At the Conference individuals working on MT in this country and in England met for the first time and presented their different approaches.


THE FIRST CONFERENCE ON MECHANICAL TRANSLATION

Erwin Reifler
Department of Far Eastern and Slavic Languages and Literature
University of Washington, Seattle, Wash.

THE FOLLOWING is a report on the proceedings of the first MT Conference, held at the Massachusetts Institute of Technology, Cambridge, Mass., June 17-20, 1952, and my own reactions. [1] At the Conference individuals working on MT in this country and in England met for the first time and presented their different approaches. A detailed list of participants appears below. The important point is that at this Conference linguists and electronic engineers joined for the first time to survey the linguistic and engineering problems presented by MT. At the end of the Conference it was the general impression of the participants that, for certain types of source material, a mechanization of the translation process is now a distinct possibility. Thus Dr. Warren Weaver's ideas about the possibility of MT in our time ceased to be a dream and moved into the realm of reality. As a matter of fact, the engineers envisaged the creation of pilot machines within the next few years; that is, machines with limited storage for the translation of a limited quantity of scientific material from a foreign language into intelligible English, built for the purpose of convincing the general public and, especially, foundations and other organizations able to support new ventures, of the feasibility of MT, in order to obtain the funds necessary for further research and improvements.

The Conference was ably organized by Dr. Y. Bar-Hillel of the Research Laboratory of Electronics at M.I.T. Half a year earlier Dr. Bar-Hillel had visited the different groups working on MT in this country and published an excellent REPORT ON THE PRESENT STATE OF RESEARCH ON MECHANICAL TRANSLATION. [2] There can be no doubt that much of the success of the Conference was due to Dr. Bar-Hillel's efforts, and it is, I believe, no overstatement to say that MT, if and when it materializes, will be very much indebted to him.

The Conference decided that the papers of the participants should be published together with the discussions. [3]

[1] This report was written in July, 1952. Opinions and facts are of that date.
[2] AMERICAN DOCUMENTATION, 2:229-237, 1951.
[3] Lack of sufficient funds has prevented the carrying out of this plan. However, a publisher has now been found for a volume of up-to-date essays reflecting present thinking on MT. This volume is scheduled to be published in the fall of 1954 jointly by the Technology Press of M.I.T. and John Wiley & Sons. It is being edited by A. D. Booth and W. N. Locke.

Participants in the Conference on Mechanical Translation

Dr. A. D. Booth, Director, Electronic Computer Section, Birkbeck College, London
Prof. William E. Bull, Department of Spanish, University of California, Los Angeles
Prof. Stuart C. Dodd, Director, Washington Public Opinion Laboratory, University of Washington, Seattle
Prof. Leon Dostert, Director, Institute of Languages and Linguistics, Georgetown University, Washington, D. C.
Dr. Olaf Helmer, Director of Research, Math. Division, Rand Corporation, Santa Monica, Calif.
Dr. Harry D. Huskey, Assistant Director, National Bureau of Standards, Institute for Numerical Analysis, University of California, Los Angeles
Mr. Duncan Harkin, Department of Defense, Washington, D. C.
Prof. Victor A. Oswald, Department of Germanic Languages, University of California, Los Angeles
Prof. Erwin Reifler, Far Eastern and Russian Institute, University of Washington, Seattle
Mr. Victor H. Yngve, University of Chicago, Chicago
Dr. Yehoshua Bar-Hillel, Research Associate, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge
Mr. Jay W. Forrester, Director of Digital Computer Laboratory, Massachusetts Institute of Technology, Cambridge
Prof. William N. Locke, Department of Modern Languages, Massachusetts Institute of Technology, Cambridge
Mr. James W. Perry, Research Associate, Center of International Studies, Massachusetts Institute of Technology, Cambridge
Dr. Vernon Tate, Director of Libraries, Massachusetts Institute of Technology, Cambridge
Dr. Jerome B. Wiesner, Director, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge
Mr. A. Craig Reynolds, Jr., Endicott Laboratories, I.B.M., Endicott, N. Y.
Mr. Dudley A. Buck, Research Assistant, Electrical Engineering Department, Massachusetts Institute of Technology, Cambridge

Automatic Dictionary

Of greatest interest to the Conference was Dr. Booth's report on the translation experiments he and Dr. R. H. Richens had programmed on a computer in London. Dr. Warren Weaver had previously, in his first memorandum on MT (July 15, 1949), referred to their work. According to him "their interest was, at least at that time, confined to the problem of the mechanization of a dictionary which in a reasonably efficient way would handle all forms of all words." In a longer paper, SOME METHODS OF MECHANIZED TRANSLATION, which Dr. Booth submitted to the Conference, he and Dr. Richens explain their approach. The translation they envisage is a word-for-word translation maintaining the word order of the input text and, in the case of multiple meanings, supplying alternative English equivalents. The machine determines by itself the stems and endings of the words of the input text and compares them with the entries in its separate stem and ending memories. These furnish not only the (often multiple) English equivalents for the input words, but also the (sometimes multiple) grammatical meanings involved. The latter are indicated in the output of the machine by abbreviations of the terms for the grammatical meaning concerned. At present only scientific material is considered for MT. Idio-glossaries are used for the various fields, which means a considerable decrease in the number of possible meanings of each technical
term and an appreciable reduction both in the amount of storage required and in the access time. A number of sample products of this machine show the degree of intelligibility of the mechanical translation product and demonstrate how much this solution of MT leaves to the interpretation of a post-editor. There can be no doubt as to the value of Richens' and Booth's approach. It is, however, as they themselves are, I believe, very ready to admit, still far from the ideal of MT which I would define as follows: A complete mechanization of the translation process - that is, a mechanical system which, without the intervention of either a pre- or post-editor, outputs translations satisfactory with regard to both semantic accuracy and intelligibility. [4]

[4] See my chapter in the volume mentioned in footnote 3.

Some of the participating linguists indicated in private conversations that the samples of automatic dictionary output were unintelligible to them. My own impression is that the time required for the interpretation of the meaning of the output of this machine will be a serious factor in the evaluation of its practicality. This time has to be added to the time required by the machine itself for its operations. People who know classical Chinese will, for obvious reasons, have less difficulty than others with the interpretation of the products of this machine.

"Word-by-Word" or "Block-by-Block" Translations

Other very valuable contributions were made by Professor Victor A. Oswald, Jr., of the UCLA who, together with Stuart L. Fletcher, Jr., had previously published PROPOSALS FOR THE MECHANICAL RESOLUTION OF GERMAN SYNTAX PATTERNS. [5] In his conference paper WORD-BY-WORD TRANSLATIONS Dr. Oswald exemplified the inadequacies of such translation, even going so far as to assert that such a "translation is literally impossible." He suggested instead "block-by-block transverbalization, in which process, problems of syntactic ambiguity are solved by the connection of syntactic segments with each other, and the fluid German word order is resolved into a rigid English sequence." This he had previously demonstrated in the PROPOSALS, "...and," he added, "we now know that a recognition of syntactic connection can be built into the 'memory' of machines of the high speed computer type."

[5] MODERN LANGUAGE FORUM, 36:1-24, 1951.

Idio-Glossaries

Another important suggestion made in his paper and elaborated in a second paper entitled MICROSEMANTICS is his "micro-glossaries - glossaries which will reduce the range of choice of meaning from a bewildering multiplicity to a matter of - at the most - two or three." It has to be emphasized here that on every page of almost every scientific text scientific terms are rare islands in an ocean of general language. Consequently his scheme envisages "micro-glossaries" for the non-technical vocabulary of a whole domain of a particular science. This may reduce the number of non-grammatical meaning alternatives of the general language portions of scientific material in a number of cases. In the majority of cases, however, the non-grammatical incident meaning, i.e., the particular meaning of the word in a given context, of these portions of the vocabulary is by no means determined or generally definable by the branch of science to which the material belongs, but has to be inferred from the meaning of co-occurrences of the narrow context. Therefore, although "micro-glossaries" (for which I suggested the obviously better term "idio-glossaries" - it is also preferable to speak of "idiosemantics" rather than of "micro-semantics") will certainly play a significant role in the ultimate solution of MT, in the case of scientific source material we are still faced with all the problems of multiple non-grammatical meaning presented by general language. Micro-glossaries "could," as Professor Oswald says, "serve to replace a team of specialists (on the post-editor side) in our proposed process of MT." But they will, I am afraid, not enable us to dispense with a human editor or editors for general language problems, whether on the input or on the output side, or on both sides of the MT assembly line. Moreover, Professor Oswald is well aware that "It is possible that it might be prohibitively expensive" to produce such glossaries.

Vocabulary Frequencies and Distribution

Of the greatest importance for the development of MT will be a conference paper by Professor William E. Bull of the UCLA, entitled PROBLEMS OF VOCABULARY FREQUENCY AND DISTRIBUTION. He exposes a number of "fallacies which are current in most discussions of word frequencies." From this highly technical paper I quote only the following passages of great relevance for the problem of "macro-" and "micro-glossaries":

"There exists no scientific method of establishing a limited vocabulary which will translate any predictable percentage of the content (not the volume) of heterogeneous material. An all-purpose mechanical memory will have to contain something approaching the total available vocabulary of both the foreign (original) language and the target (final) language. In order to cover most semantic variations several millions of items would be needed. At the present time we have no machine which can manage such a number at a profitable speed."

"A micro-vocabulary appears feasible only if one is dealing with a micro-subject, a field in which the number of objective entities and the number of possible actions are extremely limited. The number of such fields is, probably, insignificant."

"The limitations of machine translation which we must face are, vocabularywise, the inadequacy of a closed and rigid system operating as the medium of translation within an ever-expanding, open continuum."

Operational Syntax and Teaching Foreign Languages

Extremely valuable not only for MT, but also for all those interested in improving the teaching of languages is Professor Bull's second paper entitled TEACHING FOREIGN LANGUAGES. I can here only quote some of the important suggestions made in his paper:

"In teaching languages we should either replace rules by operational instructions or spell out in simple terms the operations necessary to make a rule work. I should like to stress in this connection, that the signs which may be used in teaching (and in the instruction of a machine) do not necessarily have to have any logical connection with the meaning. I shall give just two examples from Spanish. First, there are two verbs in Spanish commonly used to translate an English locative "to be": estar and haber. They are synonymous and even the educated native does not know what determines his choice. The signal is fundamentally non-semantic and the result of useless specialization in form usage. The problem, however, can be solved both for the machine and the student by isolating the fact that "the" in English takes estar and "a" takes haber.

The man is here. El hombre está aquí.
A man is here. Hay un hombre aquí." [6]

[6] TEACHING FOREIGN LANGUAGES, p. 3. For the second example, see the original.

It is interesting here to note that Professor Bull's rule is perfectly applicable to the use of modern Chinese [...] (haber). In the first case one cannot use [...], in the second case one has to use it. Incidentally, Dr. Bar-Hillel also strongly advocates the development of what he calls "operational syntax" for language teaching as well as for MT.

Other important statements in Bull's paper are the following:

"The total volume of the high frequency words is established by counting their uses with the words included in the selection and all their uses with the rare words excluded from the selection. The student, consequently, who learns this vocabulary is over-supplied with cement and under-supplied with things to be cemented together. He is like a builder who is given ten tons of cement and 500 bricks and told to build a home. If he keeps his proportions proper he has to be contented with an elegant privy. I submit that this is one of the major sources of irritation and frustration in our elementary courses in foreign languages. The reason our students cannot say anything much after a year of language is not because they haven't studied; they _haven't got a_ vocabulary whose proportions permit them to say anything but the obvious banalities." (The underscoring is mine.)

"The principle of excessive repetition cannot be sustained by the evidence of how a native is forced to learn his own language. This suggests strongly that we should increase the number of items given to the student and decrease, if possible, the number of repetitions of high frequency vocabulary."
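Professor Bull's estar/haber rule, quoted earlier in this section, is an operational instruction in the sense he describes: the sign that triggers the choice (the English article) needs no logical connection with the meaning. A minimal sketch in Python, assuming the article is the first word of the sentence; the names `ARTICLE_TO_VERB` and `locative_verb` are hypothetical, and the inclusion of "an" alongside "a" is an assumption not stated in the quotation:

```python
# Operational rule as quoted from Bull: the English article selects the
# Spanish verb for a locative "to be". Treating "an" like "a" is an
# assumption added here; the quotation mentions only "the" and "a".
ARTICLE_TO_VERB = {"the": "estar", "a": "haber", "an": "haber"}


def locative_verb(english_sentence: str) -> str:
    """Return the Spanish verb selected by the article opening the sentence."""
    first_word = english_sentence.split()[0].lower()
    if first_word not in ARTICLE_TO_VERB:
        # The rule as quoted says nothing about other sentence openings.
        raise ValueError(f"rule does not cover sentences starting with {first_word!r}")
    return ARTICLE_TO_VERB[first_word]


print(locative_verb("The man is here."))  # article "the" selects estar
print(locative_verb("A man is here."))    # article "a" selects haber
```

The lookup table is deliberately non-semantic: nothing in it encodes what either verb means, which is the point of the quoted passage.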
In his conclusion Professor Bull suggests the following points for consideration in the improvement of language teaching:

"(1) the abandonment of outmoded elementalism, and research directed at language as a structural whole
(2) a clear analysis of what is actually mechanical in language
(3) the description of what the native's language-feel actually is
(4) the substitution of operational instructions, whenever necessary for abstract rules
(5) research to discover the mechanical signposts which are guides to usage
(6) a new approach to the selection and teaching of vocabulary based on demonstrable facts"

Pivot Languages

Of the many valuable suggestions made by Professor Leon Dostert of Georgetown University I would especially like to mention one which will certainly become an important feature of future MT. Describing his experiences in multiple translations, he stressed the advantage of a "pivot language" or "pivot languages." General MT (mechanical translation from one into many languages), he said, should be so developed that one translates first from the input language into one "pivot" language (which in our case will, most likely, be English) and from that pivot language into any one of the output languages desired. This will, I believe, be very beneficial for MT, as will become clear from the following.

Model Target Languages

Professor Stuart C. Dodd of the University of Washington in Seattle addressed the Conference on MODEL TARGET LANGUAGES (i.e., a regularized form of the languages into which one translates). His paper caused a very lively discussion as a result of which I can say that "model TL-s," especially his "model target English" will constitute an important item in the mechanization of the translation process. As I pointed out in the first of my two papers (MT WITH A PRE-EDITOR AND WRITING FOR MT), if we aim at a practical solution of MT, then we can interfere neither with the language nor the conventional spelling (speaking here entirely with respect to alphabetized languages) of the original language. But on the output side we can, within certain definable limits, plan the form of the output language. We can put a selected vocabulary and a regularized morphology and syntax into the machine and, moreover, within the limitations of intelligibility, adjust the final language to certain peculiarities of each of the original languages.

Irregular Original Language - Model Pivot Language - Model Output Language

Now in General MT, if we do not work with a "pivot language," we shall (except in the case of original languages like Chinese and Japanese which by nature are very regular) in every case be faced with a mechanical correlation between one irregular and one regularized language. But if we do use a pivot language, then only at the first step will this be the case; that is, in the MT from a natural language into the pivot language. From here on, however, - that is, in the MT from the pivot language into any of the model output languages - we would in every case have a mechanical correlation between two regularized languages. Thus the use of a pivot language in General MT as suggested by Professor Dostert will mean a further simplification of the engineering problems involved.

Mechanical Abstraction of Grammatical Information

In my paper quoted above I also demonstrated how the graphic indication by a human agent of certain types of grammatical meaning in the input text might enable the machine to determine incident non-grammatical meaning. Drs. Bull and Oswald, however, in their papers foresaw the possibility that a machine might be designed to determine grammatical meaning by itself, on the basis of nothing more than the conventional graphic form of input texts. If this is possible, then that kind of pre-editorial work which my idea necessitates can be dispensed with. It will mean much for MT if it can be demonstrated that operational instructions can be abstracted from a language on which we can base the programming of a machine for the mechanical determination of certain types of grammatical meaning. But even so it is important to point out the following:

a) even if this is possible for some types of grammatical information, it may not be possible for other types. In his MICROSEMANTICS Dr. Oswald mentions one kind of grammatical information for which he can - at least for the present - see only a human supplier. He says:

"The German system of noun compounding
is such that a glossary based on the graphic forms would be both unwieldy and grossly inefficient because of unnecessary repetition. Almost any sequence of nouns in German not syntactically connected is automatically made into a compound, and your German noun strays gaily about appearing now as the "head" and now as the "tail" of a compound .... In a word, you must break up German compounds if you want to make any sort of efficient German-English glossary.... We know no mechanized process by which this could be accomplished, but an intelligent....pre-editor could indicate the dissection for any sort of context." [7]

[7] Shortly after distributing my report on the conference I completely solved this problem of the mechanical dissection and identification of all predictable and unpredictable compounds. A detailed description of this solution, first reported in my SIMT Nos. 6 & 7 (mimeographed), will be included in the forthcoming volume mentioned in footnote 3.

b) even though it is possible for some languages, it may not be possible for some others.

c) the machinery required may be so complex and expensive that we may ultimately prefer to have a human agent indicate the relevant grammatical information of the input text by some system of symbolization (pre-editor).

d) if, as in the case of German compounds (see under a), no mechanized process can supply the information relative to one grammatical situation, so that this information has to be supplied anyway by a pre-editor, then the latter might as well add "seam-signals" to indicate the position of the "seam" (Oswald's "fracture-surfaces") in different types of compounds. The same signal would thus serve to indicate more than one type of grammatical meaning. This might result in a simplification of the mechanism designed for the determination of grammatical meaning because then the machine has more instructions on the basis of which to supply less information.

Mechanical Determination of Incident Non-Grammatical Meaning and the Limited Storage Capacity of the Mechanical Memory

A most serious objection to my suggestion of a mechanical determination of incident non-grammatical meaning was voiced by Dr. Bar-Hillel. He said that such a plan would require a storage of billions or trillions of entries - obviously quite impossible to achieve. However, appearances are misleading here. Before I can show this, I have first to introduce a few new concepts:

In the following I shall call "clue-sets" a set of co-occurrent words of which one or one group "pinpoints" the meaning of the remainder. I shall name "pinpointers" the pinpointing words and "pinpointees" those whose meaning is pinpointed by such "pinpointers." Furthermore, I wish to remind the reader of the phenomenon of "Shared Transferred Meanings" discussed in # H /6 of my first paper on mechanical translation and of the vast possibilities of "Pseudo-One-To-One Correlations" exemplified in my second Conference paper. Lastly I shall speak about "Pinpointees with a Manageable or Unmanageable Number of Pinpointers" and about "Pinpointee Meanings Stable or Unstable in the Light of Source-Target Semantics." (I beg the indulgence of the reader for the freak terms "pinpointer" and "pinpointee." I could not think of any other terms more "to the point.")

Now Dr. Bar-Hillel's objection remains valid only if we are thinking of putting into the mechanized memory all possible clue-sets. This is, however, neither intended nor necessary. We have to consider here the following facts:

1. Each set of two languages shares a considerable number of semantic parallels (shared transferred meanings). For example English "will" which, like Chinese [...], is used in the sense of "to want, to wish" and also as an auxiliary verb, expressing future; French ça va, German es geht and Chinese [...], meaning "to go" and also used in the sense of "that does" or "that will do"; Latin noli, "don't," a contraction of non voli, meaning "not want," and Chinese [...], meaning "not want" and "don't"; etc., etc.

2. In an extremely large number of cases a literal translation, though resulting in an unaccustomed output form, is still perfectly intelligible either in the narrower or in the wider context. For example, in playing Chinese chess, a player may say [...]; which even in its literal translation, "I eat your elephant" (I take your elephant; the elephant is something like the bishop in Western chess), is perfectly intelligible to the English reader. We are in very many cases able to create artificial one-to-one correlations by selecting from the available output alternatives one which, though it may be customary or "good" only for certain context, is still intelligible in others. For example, Chinese [...], "to create, make, do,
act, etc.", is also used in contexts where the English translator usually prefers to render it by forms of the verb "to be." If we translate "make" also in these contexts, the result will often be horrible for the English hearer or reader, but it will still be intelligible. Thus "he is a teacher, student, father, son, etc., etc." would appear in the English translation as "he make teacher, student, father, son, etc.", which in its context, for example in answer to questions meaning something like "what is his profession, position, what is he doing? etc." or when discussing somebody's duties in relation to his position, will be perfectly intelligible. A speaker of standard English does not need to learn pidgin English in order to understand what "this master makee teacher" (this gentleman is a teacher) means.

3. In every language there is a large number of words which may co-occur with a large number of other words "pinpointing" their incident meanings, but among these we have to distinguish several groups:

a) "Pinpointees" whose meanings in the light of source-target semantics (semantic relationships between the pair of languages) are the same with all "pinpointers," either in fact (semantic parallel, cf. point 1) or in terms of artificial one-to-one correlations (cf. point 2). Here no clue-set entries are necessary. The number of possible "pinpointers" is here, of course, of no consequence whatsoever for MT. For example German kaufen "to buy", verkaufen "to sell", schreiben "to write", essen "to eat", in terms of German-English and German-Chinese semantics.

b) "Pinpointees" the number of whose "pinpointers" is comparatively small and whose meanings in the light of source-target semantics are, in terms of points 1 and 2 above, different with all "pinpointers." Here all clue-sets should and can be entered into the mechanized memory.

c) "Pinpointees" the number of whose "pinpointers" is large and whose meanings in the light of source-target semantics are, in terms of points 1 and 2, the same in the case of a very large number of "pinpointers," but different in the case of a small number of "pinpointers." Here no clue-set entry is necessary in the first case, whereas in the second all clue-sets should and can be entered.

d) "Pinpointees" the number of whose "pinpointers" is large and whose meanings in the light of source-target semantics are, in terms of points 1 and 2, the same in the case of a comparatively small number of "pinpointers," but different with regard to a large number of "pinpointers." Here no clue-set entry is necessary in the first case, whereas for the second the decision has to be deferred until we know more about the size of the total residual problem.

e) "Pinpointees" the number of whose "pinpointers" is large and whose meanings in the light of source-target semantics are, in terms of points 1 and 2, different with regard to different groups of "pinpointers." Here we can certainly enter all clue-sets relative to one of the groups, preferably the group with the largest still manageable number of "pinpointers," whereas for the remainder the decision has to be deferred until we know more about the size of the total residual problem.

f) "Pinpointees" the number of whose "pinpointers" is large and whose meanings in the light of source-target semantics are, in terms of points 1 and 2, different with regard to every "pinpointer" (this situation will be either rare or not occur at all). Here the decision has to be deferred until we know more about the size of the total residual problem.

Thus wherever transferred meanings are shared or wherever we can artificially create one-to-one correlations, no consideration of "pinpointers" is necessary and, consequently, we need not worry about the entry of clue-sets. Wherever transferred meanings are not shared, or wherever we cannot artificially create one-to-one correlations, and where the number of "pinpointers" is comparatively small, we certainly can enter all clue-sets. Thus we are ultimately concerned only with the residual problem of those cases where "pinpointers" have to be considered and are very numerous. No research has ever been done for any set of two languages to determine the size of the residual problem. It is, therefore, not possible to decide on its treatment at present. If it still required more than, say, 10 million entries, one would naturally hesitate to consider recording in the mechanized memory. What is important, however, is that, assuming the residual problem required too many entries to permit mechanization, the machine would leave only this residual group of multiple meanings to a pre- or post-editor. The editor would have much less editing to do and in the case of a post-editor the difficulty of semantic determination might well be diminished to a degree he would certainly appreciate: the larger the number of semantic decisions the machine makes for him, the clearer the output context he has to consider for the solution of the remaining riddles! Certainly, in MT wherever mechanization is practical, it should be carried out!
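The scheme above can be illustrated with a small lookup sketch in Python. It is only an illustration under stated assumptions, not a description of any machine discussed at the Conference: the `Entry` layout, the `MEMORY` table and the `resolve` function are hypothetical, and the German word Schloss ("lock" or "castle") with its two pinpointers is an example chosen here, not one from the report. Only kaufen "to buy" comes from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class Entry:
    """One pinpointee in a hypothetical mechanized memory."""
    default: Optional[str] = None          # stable or artificial one-to-one equivalent
    clue_sets: Dict[str, str] = field(default_factory=dict)  # pinpointer -> equivalent
    alternatives: Tuple[str, ...] = ()     # residual choices left to an editor


MEMORY = {
    # Group (a): same meaning with all pinpointers, so no clue-set entries.
    "kaufen": Entry(default="buy"),
    # Group (b): few pinpointers, so every clue-set is entered (illustrative data).
    "schloss": Entry(
        clue_sets={"schlüssel": "lock", "könig": "castle"},
        alternatives=("lock", "castle"),
    ),
}


def resolve(word: str, context: Iterable[str]) -> str:
    """Pick an English equivalent for `word` given its co-occurring words."""
    entry = MEMORY[word.lower()]
    if entry.default is not None:
        return entry.default               # no pinpointer needs to be consulted
    context_words = {w.lower() for w in context}
    for pinpointer, equivalent in entry.clue_sets.items():
        if pinpointer in context_words:
            return equivalent              # a stored clue-set pinpoints the meaning
    return "/".join(entry.alternatives)    # residual case: hand alternatives to an editor
```

In this sketch the residual problem shows up as the final `return`: only when no stored pinpointer occurs in the context does the output fall back to a list of alternatives for a pre- or post-editor.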
Pre-editor Versus Post-editor

In this context I should like to add some remarks to the problem "pre-editor versus post-editor." In my first two papers on MT I burdened the pre-editor not only with the signalization of the grammatical, but also with that of the incident non-grammatical meaning; that is, wherever source-target semantics presented a problem of multiple meaning. In #81 of the first paper I had actually previously considered the alternative possibility of using a post-editor to whom, in the case of multiple meanings, the machine would supply the various alternatives from which he would have to make the correct selection. I had said there that "from the point of view of complete mechanization this may seem to be preferable because then no human factor would interrupt the purely mechanical side of MT. However, from the point of view of MT as a whole, using a pre-editor is still much quicker for the following reasons: whereas the reader of the original text (i.e., pre-editor) has to select the meaning that "makes sense" in an original context which is completely intelligible to him, the output text reader (i.e., post-editor) has to do this in an output context which will necessarily contain a large number of non-distinctive words with transferred meanings different from those of the corresponding original language words, that is in a context that will often not be clear."

Dr. Bar-Hillel, on the other hand, advocates the determination of such incident meanings by a post-editor and has found much support for his idea. As a matter of fact, at this early stage of MT research I, too, cannot completely rule out the possibility that a MT post-editor (not to be confounded with a general post-editor concerned with stylistic improvements of the output text) may be necessary for the solution of at least some of the semantic problems involved.

Professor Oswald in his WORD-BY-WORD TRANSLATION voiced his scepticism concerning both the pre- and the post-editorial approach. "I do not believe," he says, "that his (i.e., Reifler's) combination of pre-editor with a mechanical dictionary constitutes the ultimate solution of our problem. In fact, I am of the opinion that we must grapple with the problem precisely at the point where Mr. Reifler abandons it. His proposals are most enlightening for the solution of problems of general language, but he has excluded problems of specific language (the jargon of medicine, mathematics, linguistics, geology, etc.) from the domain of mechanical solution. We shall be much closer to the realization of mechanical translation if we can mechanize the components of his "mechanized" dictionary.... A pre-editor can do much to simplify syntactic connection for mechanical 'digestion,' but I do not see how, as an operator in the FL (i.e., foreign or original language), he can effectively guide either the machine, or the machine plus a post-editor, through the mazes of multiple meaning on the TL (target or final language). Nor do I think we can hope for much accurate help from one monolingual post-editor or even from one bilingual consultant. What has been overlooked is the fact that the competence required in the post-editor, even if he be bilingual, is only partially linguistic. The real prerequisite for him is an intimate knowledge of the field to which the translated text pertains" (pp. 3-5).

Apart from the fact that I have in no way "excluded problems of specific language...from the domain of mechanical solution" (I am fully aware of the urgency of the translation of scientific material, but would point out that even in such material we have to solve problems of general language), I fully agree with Professor Oswald. But he had, when he wrote his paper, not yet seen my third paper (the first submitted to the Conference) in which I indicated my radical departure from my previous position, demonstrated the possibility of mechanizing the determination of incident non-grammatical meaning on the basis of information relative to certain types of grammatical meaning, and limited the work of the pre-editor to the signalization of these types of grammatical meaning.

Both Drs. Oswald and Bull have, on the other hand, mentioned the possibility that the determination of incident grammatical meaning may be mechanized. If this can be done, then there would remain only the question whether the solution of all multiple meaning problems (in case no portion of this problem can be mechanized) or of the semantic problems left over by the machine is - from the point of view of all-round practicality - better done by a pre- or a post-editor. I still feel that this task is easier for the pre-editor. The post-editor is faced with a non-conventional form of output context in which he has to make a selection from each of a number of conglomerations of output alternatives in consideration of one or more other conglomerations of output alternatives. He does, in fact, not fully understand the narrow output context before he has made at least some correct selections. The pre-editor, on the other hand, is confronted with a familiar linguistic medium without any conglomerations of alternative words and understands the contexts before he is informed about the existence of a multiple meaning problem in terms of source-target semantics and before he has chosen the appropriate supplementary signal from the dictionary entry supplied by the mechanized dictionary.

If we assume that a large portion of the multiple meaning problems can be solved mechanically along the lines I have suggested and that the pre-editor would thus be faced only with the residual semantic problems, then the combined man-machine procedure would be something like the following. The pre-editor sends the original text into the dictionary mechanism. In all cases of multiple meanings in which the dictionary mechanism can itself determine the incident meaning and supply the appropriate output equivalent on the basis of the supplementary grammatical signals which the pre-editor has added to the conventional graphic form of the original text (or on the basis of the grammatical information Bull's and Oswald's "grammar mechanism" has abstracted and supplied to the dictionary mechanism), the pre-editor would never have to know that multiple meanings in terms of source-target semantics are involved. The machine would do the work without giving any hint that there are such multiple meaning problems. In the case of a residual problem, however, the machine would in every case notify the pre-editor in some way and supply him with a dictionary entry (in his own language!) indicating the meaning alternatives in the light of source-target semantics. From these the pre-editor would have to choose and then add the appropriate supplementary signal to the portion of the input text involved. As pointed out above, he can make such a choice much quicker than a post-editor because he is dealing with a familiar linguistic medium and understands the output context before he makes his choice.

I should like to add that I am keeping an open mind with regard to this problem of pre-editor versus post-editor. It is, in fact, quite possible that, in terms of the time and money spent on linguistic and engineering research (linguistic research is probably less expensive than engineering research), mechanical complexity and construction time, speed and accuracy of translation, etc., etc., the optimum may be reached in an arrangement in which a pre-editor signalizes certain types of grammatical information, the machine abstracts some other types of grammatical information and on the basis of this information from two sources determines certain types of incident non-grammatical meaning and reshuffles the word order. A post-editor then solves the residual semantic problems on the basis of an output context which, because it does not contain too many clusters of alternatives, is much clearer.

Pilot Machines

Professor Dostert suggested the early creation of a pilot machine or of pilot machines proving to the world not only the possibility, but also the practicality of MT. Since the time necessary for the creation of such machines is an important factor, it will be best to develop a plan based on the simplest possible conditions. When this problem was raised at the Conference, the general opinion seemed to be that the simplest conditions are found in the mechanical correlation of certain European languages (Germani) with the English language. I pointed out, however, that contrary to appearances, a German-into-English scheme can not in the least compete with a Chinese (or Japanese) into English scheme. In the case of these two languages nature has already provided us with highly regular languages. Moreover, both in morphology and syntax Chinese and English happen to have more in common than German (or any other European language) and English. If we put into the translation mechanism a regularized English which is, furthermore, within the limitations of intelligibility, adjusted to certain peculiarities of Chinese, we have an ideal situation: a correlation between two regular and in many respects very similar languages. It is true that - as was stressed at the Conference - certain government agencies may be readier to supply the funds necessary for further research and improvements if the first pilot machine is designed for mechanical translation from Russian into English. But such a machine will be more complex and more expensive and the work necessary for its creation more time-consuming than in the case of a Chinese-English MT unit.

Thus the first pilot machine should, I feel, be programmed for a MT from Chinese into English. Moreover, if we want to go further and show the possibility and practicality of General MT (mechanical translation from one into many languages) on the basis of the concept of "pivot languages" as suggested by Dr. Dostert, our simplest proposition would be one in which we add to the Chinese-English unit a second unit for the translation of the English output of the first unit into Japanese. Then we would have a mechanical correlation merely between a regularized language (English) and another language (Japanese) which by nature is highly regular.

The Conference ended on an optimistic note
with the suggestion by Professor Booth that the next conference be held in London.

Chinese Characters Versus Alphabetization

I should like to add here a valuable suggestion which has come to me from Dr. Fang-kuei Li. With regard to languages with a non-alphabetic script I had hitherto thought of making use of an alphabetized form. I had pointed to the fact that, wherever different alphabetization systems have been suggested or are actually used, the graphio-semantically most distinctive one would be most beneficial for MT. For Chinese this would be the I.R. (Interdialect Romanization). But even in this romanization some additional differentiation is necessary in order to further reduce the still large number of homographs. Dr. Li suggested that, since even the I.R. requires further adjustments for purposes of graphio-semantic distinctiveness, it may be worthwhile to consider the development of sino-foreign MT on the basis of the Chinese characters themselves, which are graphio-semantically more distinctive than the I.R. He added that he had heard that a machine supplying the corresponding characters for the Chinese telegraph code numbers has already been developed in this country. There should be no reason why a machine which reverses this process could not be built. A pre-editor could add the supplementary grammatical signals just as well to a Chinese character text as to an alphabetized form of this text. The supplementary signals would be typed into the character-(code) number machine together with the characters to which they refer. Such an approach would eliminate the transcription into an alphabetization and thus save time.8

8 For dates and references to Dr. Reifler's papers on MT, see Vol. I, No. 1 of MT, March 1954.
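The idea of typing supplementary signals into a character-to-code-number machine can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the code numbers are invented placeholders rather than entries from the actual Chinese telegraph code, and the signal tags (`VERB`, `NOUN`) and function names are hypothetical.

```python
# Made-up code numbers for illustration; NOT the real telegraph code book.
CODE_TO_CHAR = {"1001": "我", "1002": "看", "1003": "書"}

# The reversed machine: characters back to code numbers.
CHAR_TO_CODE = {char: code for code, char in CODE_TO_CHAR.items()}


def encode(pre_edited):
    """Turn (character, signal) pairs into (code number, signal) pairs.

    `signal` is a hypothetical tag attached by the pre-editor, or None
    when no supplementary grammatical signal is needed.
    """
    return [(CHAR_TO_CODE[char], signal) for char, signal in pre_edited]


def decode(coded):
    """Map code numbers back to characters, keeping the signals attached."""
    return [(CODE_TO_CHAR[code], signal) for code, signal in coded]


# A pre-edited character text: each character travels with its signal.
text = [("我", None), ("看", "VERB"), ("書", "NOUN")]
coded = encode(text)
print(coded)
print(decode(coded) == text)
```

Because each signal is carried alongside the code number of the character it refers to, no intermediate alphabetized transcription is needed in this sketch.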