[Mechanical Translation, vol. 4, nos. 1 and 2, November 1957; pp. 28-34]

Syntactical Variants†

Bjarne Ulvestad, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts*

Traditional grammar is normally eclectic and vaguely formulated, and it often tends to overgeneralize or fails to state the range of validity for its rules. Grammars for mechanical translation must be all-inclusive and rigorously explicit. While the input language grammar must register all the grammatical constructions possible, the existence of basically synonymous morphological and syntactical variants permits considerable inventorial reduction in the output grammar. These considerations are discussed with reference to English and German examples: verb phrases with 'remember' / (sich) erinnern as the head; 'as if' / als ob clauses.

† This work was supported by the U.S. Army (Signal Corps), the U.S. Air Force (Office of Scientific Research, Air Research and Development Command), and the U.S. Navy (Office of Naval Research); and in part by the National Science Foundation.

* On leave from University of California, Berkeley, California; now at University of Bergen, Bergen, Norway.

IT IS POSSIBLE to imagine a series of poor but successively 'better' machine-made translations, ranging from, say, 'very poor' to 'fair' or 'not so very poor,' which might be found to be substantially adequate for their various purposes. Thus even a lowest-grade or 'very poor' translation would conceivably have a demonstrable adequacy, provided its purpose were merely to acquaint its prospective readers with the subject matter of the original (input language) text.¹ Leading up from this kind of primitive, low-standard mechanical translation to one that would be regarded by the pundits as 'correct,' to the finest shades of idiomatic nuances, there is an almost discouragingly long, devious path, or rather a long series of shorter excursions each of which is more complex and laborious than its predecessor. If we, as we should, consider it imperative never to compromise with perfection where perfection is attainable, all the words and all the syntactical constructions of a given pair of languages, and especially of the one on the input side of the translation machine, will ultimately have been 'tagged' or assigned their specific memberships in a large number of groups and subgroups of linguistic entities, and the more exhaustive this intricate taxonomy, the more adequate, i.e., the less liable to produce ungrammatical and nonsensical sentence sequences, will be the corresponding translation mechanism.

1. Cf. J. W. Perry, "Translation of Russian technical literature by machine," MT, Vol. 2, No. 1, pp. 15-24 (1955).

The tantalizing question as to whether an absolutely foolproof apparatus for the mechanical transfer of information from one language to another can be constructed, if only in theory, need not bother us too much at this stage, for even if the answer to the question should in the end turn out to be negative, less-than-perfect mechanical translation will nevertheless be useful for scholars, whose main concern is naturally to obtain an adequate communication of scientific facts and ideas rather than stylistically impeccable texts, desirable though the latter may be.

Judging from reports on the highly significant work which is at present carried on at various universities, we have every reason to believe that most of the general technical problems of mechanical translation are approaching their solution. As an example of this kind of promising study, one may mention N. Chomsky's and V. Yngve's research into workable recognition devices for use in sentence-for-sentence translation, which is vastly preferable to word-for-word transfer. While the bulk of linguistic work in the field of mechanical translation has thus far admittedly been of a rather general
and preliminary nature, researchers on both sides of the Atlantic are becoming more and more aware that the most pressing requirement for further progress is the composition of total-coverage grammars deliberately executed with mechanical translation in mind. We do not have such grammars for any language, except in rudimentary and fragmentary form, but even at this early date we can discuss some of their conspicuous features, as distinct from those of what we may term traditional grammars.

In this article a few problems in mechanical translation grammar will be presented and discussed, with some reference to their practical relevance to the input language and to the output language. English and German are the two languages chosen for this exposition. However, substantially similar problems will no doubt be found in any language.

We can state without reservation that in constructing grammars for the input language and for the output language, the input grammar must be subjected to the more piecemeal examination of particular problems. One of the most transparent reasons for this lies in the relatively large number of basically isosemantic morphological and syntactical variants that exist in every linguistic system. While all these variants will presumably have to be identified and registered in the input language grammar, considerable reduction in the number of corresponding variants will ordinarily be possible in the output grammar, as will be seen below. It must be emphasized that the chief difference between traditional grammar and what may be called mechanical translation (input language) grammar is that the former is eclectic and normally vaguely formulated, whereas the latter will be all-inclusive and rigorously explicit and formalized. Traditional grammars overgeneralize and rarely state the actual range of the validity of each rule; mechanical translation grammar must, ideally, explicate all the cases for which the given rule applies as well as those for which it does not. Furthermore, mechanical translation grammar must of necessity account for the total number of linguistic constructions that occur in a given language even if traditional grammars categorically state the nonoccurrence of certain members;² and misleading transformation rules must be recognized as such and correctly restated.³ Whereas variant constructions of low statistical probabilities may on the whole be disregarded in the grammar of the output language,⁴ they cannot, as a rule, be left out of the grammar of the input language without more or less serious consequences for the quality of the eventual translation. It is obvious from the remarks made above that the mechanical translation point of view will compel linguists to examine in detail problems that have hitherto been regarded as trivial or inconsequential. We can therefore expect that mechanical translation research will be of fundamental value to structural linguistics.

2. Cf. B. Ulvestad, "Object clauses without dass dependent on negative governing clauses in modern German," Monatshefte, 47.329-38 (1955).

3. A typical instance is furnished by E. E. Cochran, A Practical German Review Grammar, 11th printing (New York, 1947), p. 241: "Note: zu after sagen is dropped in an indirect statement." The example illustrating this dropping of zu is: Er sagte zu mir: "Ich kann es mir nicht leisten," vs. Er sagte mir, er könnte es sich nicht leisten. That this rule is invalid in its present categorical formulation is seen from such sentences as: Er sagte zu Sabine, er werde sie . . . abholen (Brentano), Franz . . . sagte einmal zu mir, es gebe in jedem Dorf ein oder zwei schwere Taten (Wittich).

4. This consideration will be taken up for separate discussion in a later article.

The important task of registering all syntactical variants, including those that are ordinarily overlooked in standard grammars, need not necessarily lead to a correspondingly greater complexity on the part of the eventual encoding program, although it may seem so at first glance. An example will perhaps help.

(1) Ich erinnere mich an ihn (den Mann)
(2) Ich erinnere mich auf ihn (den Mann)
(3) Ich erinnere mir ihn (den Mann)
(4) Ich erinnere mich ihn (den Mann)
(5) Ich erinnere ihn (den Mann)
(6) Ich erinnere mich seiner (des Mannes)

These German sentences are built around the weak verb (sich) erinnern 'remember' and correspond to the English sentences 'I remember him' and 'I remember the man.'
Only (1) and (6) belong to the generally accepted standard language, and for that particular code the traditional formula, 'sich (acc.) erinnern is followed by a genitive construction or by the preposition an with an accusative construction,' is correctly stated, provided, of course, that one does not take 'followed by' literally. In normal modern German literary prose, however, one may encounter any one of the six types. Now, if we want to register every one of the sentence types with reflexive erinnern in the input code (this excludes 5), we need only add the verb erinnern not only to the class of reflexive verbs with the reflexive pronoun in the accusative case, but also to the class of verbs that may occur with the reflexive pronoun in the dative, and subsequently state, e.g., that the verb erinnern with accusative reflexive may 'govern' the accusative, the genitive, or a prepositional phrase with an or auf followed by an accusative noun phrase (NP). Since these entities will presumably have been registered and classified in some department of the grammar anyway, they do not have to be restated, but only referred to in terms of a defined code signal. This signal will indicate, for instance, that the verb (sich) erinnern belongs with denken in that it 'governs' an an-phrase with the accusative, and with sehen in that it takes an auf-phrase with the accusative.

If the purpose of the mechanical translation grammar and translation apparatus were restricted exclusively to the transfer of German scientific texts, sentence types (1) and (6) above would probably be the only ones that would need to be encoded. Even for translation of current novelistic prose we need only add (5), which occurs much more frequently than (2) and (3). In this kind of literary prose, the frequency continuum runs as follows, from very high to very low: (6) - (1) - (5) - (2) - (3) - (4).⁵ If, on the other hand, a speaker of the Hamburg Umgangssprache were to be used as 'informant,' the first part of the frequency sequence would probably be (5) - (1); (6) can hardly be said to belong in this city language at all.⁶

5. The data for this were obtained from a corpus of 52 recent German novels; (3) and (4) occurred only five and three times, respectively, and there was a considerable frequency drop between (6), (1), and the rest.

6. Native informants refer to (6) as "stilted," "constructed," "archaic."

Whatever the tasks for which the translation machine is designed, the encoding will not be made too difficult by the requirement of full coverage. It is the patient grammar writer whose difficulties are enhanced by new decisions to improve the translation.

It is interesting that if German were the output language, the situation in the examples above would be reversed and considerably less complex. As input, we would have English sentences with the verbs 'remember,' 'recall,' and possibly 'recollect,' all of which are closely related from the point of view of multiple-class memberships. With German as the output language, one of the six types above is sufficient for mechanical translation purposes since we are primarily interested in cognitive meaning transfer, not in the kind of additional information 'natural language' may furnish (age, sex, dialect, education, business background, etc.).

Naturally, the reduction of the number of variants in the output language to one is advisable only if the variants are absolutely free or if there is no possibility of making a meaningful selection out of two or more output variants on the basis of clues found in the input language. We shall explain this below with reference to a typical mechanical translation problem, using as examples German and English clauses which may be termed 'quasi clauses' (in English, 'as if'-clauses; in German, als ob-Sätze). Presentation of a grammar of these clauses for mechanical translation is the purpose of the remainder of this paper.

Variations on the following statement, with its examples, are current in textbooks of German: 'The secondary subjunctive (past subjunctive) is usual after als ob 'as if.' Er sprach, als ob er das Buch gefunden hätte. . . . ob may be omitted and inverted order used. . . . Er sprach, als hätte er das Buch gefunden.'⁷

7. P. H. Curts, Basic German, revised ed. (New York, 1946), p. 71. It does not matter much whether one's description of als (ob, wenn) reads, (1) 'the ob, like the wenn, may be omitted,' or (2) 'the quasi conjunction is als, but ob or wenn may be added,' although logically (1) is preferable in a grammar of the spoken standard (Hochsprache, popularly also called Schriftsprache), and (2) better corresponds to the usage actually found in the written (novelistic) language.

It is not difficult to see that this 'quasi clause grammar' is far
too fragmentary to be used except for introducing the 'rudiments of elementary German' to beginners; so we shall not take time to demonstrate its shortcomings. Rather, we shall attempt to write as complete a grammar of the German 'quasi clauses' as possible from the data available to us. Subsequently some practical problems with reference to the transfer processing will be discussed.

Let us consider the following six sentences.

(7) Ihm war, als habe er sie seufzen gehört (Waggerl)
(8) Es war, als ob noch einmal die Sonne, Wasser und Wind ... dem Oberleutnant in dieser Gestalt vor die Augen treten wollten (Tügel)
(9) Mister Wenner ging durch das Dorf, als wenn es gar keine Schwalbacher gäbe (Kirschweng)
(10) Und doch war es, wie wenn ein schieferblanker, tödlicher Ernst sich auf den ganzen Platz gelegt hätte (Goes)
(11) Wenn ich im Fahren lange hinaufsah, war es mir, der ganze Himmel käme auf mich zu (Bauer)
(12) Ich lief schnell, wie als gälte es, sich ein Landgut zu erobern auf diesem Gang (Goes)

Sentences (7) to (12) have different 'quasi' conjunctions (QC's), namely, als, als ob, als wenn, wie wenn, zero (Ø), and wie als. The internal relationships between these sentences will be seen from the following regrouping of (7) to (12) symbolized in terms of significant constituents (the symbol / is read 'or'):⁸

(7)  -------- , als + Vfin + NP + (Vinf / Vpp)
(12) -------- , wie als + Vfin + NP + (Vinf / Vpp)
(8)  -------- , als ob + NP + (Vinf / Vpp) + Vfin
(9)  -------- , als wenn + NP + (Vinf / Vpp) + Vfin
(10) -------- , wie wenn + NP + (Vinf / Vpp) + Vfin
(11) -------- , Ø + NP + VP

8. The mode of the finite verb in the 'quasi' clause is not considered at this point. Note that the term 'Vinf' in parentheses is used in a wide sense and includes so-called passive infinitives such as gehört werden, gehört worden sein, etc.

We symbolize the noun phrase and the potentially succeeding infinitive or past participle under one sign, Z [NP + (Vinf / Vpp) = Z]; and the relationship between (7), (12) on the one hand, and (8), (9), (10) on the other will be seen to be one of constituency permutation to the right of the QC. For further simplification of the structural statements, we may operate with three classes of QC's: QC1 (als, wie als), QC2 (als ob, als wenn, wie wenn), and QC3 (zero).⁹ Note that a comma always separates a clause from a succeeding dependent clause and accordingly stands in an immediate concatenation relationship with the conjunction. We can therefore (and this may be useful for mechanical translation encoding) subsume under the term 'conjunction,' for maximum mechanical translation signal power, the conjunction itself with the preceding comma, so that, for example, the symbol QC1 shall be henceforth taken to mean 'comma followed by QC1.' The six 'quasi' sentences can accordingly be written as follows:

I.   (7), (12)      -------- QC1 + Vfin + Z
II.  (8), (9), (10) -------- QC2 + Z + Vfin
III. (11)           -------- QC3 + NP + VP

9. On a different level of analysis, one might make use of the structural relationships between (12) and a sentence such as es war mehr so, als hielte sich etwas an ihrem Bein fest (Nossack) and state that the adverb so in the governing clause can be shifted into the dependent clause, changing its status into that of a corresponding conjunction particle, thus: X + so, als + Y → X, wie als + Y. Note the positions of the comma in the two formulas.

Further reduction, stating the transformation relationship between I and II in formal terms, is possible. For instance, one might state the rules: 'for transforming I into II, rewrite QC1 as QC2 reversing the order of Vfin + Z, and for transforming II into I, rewrite QC2 as QC1 reversing the order of Z and Vfin,' but further study would disclose that T(I → II) is correctly stated, and not the reverse T(II → I). From er tat, als hätte er ihn nicht gesehen (I) we clearly obtain by this transformation: er tat, als ob er ihn nicht gesehen hätte (II), but there exist instances of so-called elliptic II-sentences that do not permit a direct transformation T(II → I), for instance, er tat als ob er ihn nicht gesehen, in which the finite verb (here,
hätte or habe) is dropped, or more correctly stated, does not occur. The ellipsis of the (readily predictable) finite verbs haben and sein after past participles is encountered occasionally in all subtypes of II, in (8) as well as in (9) and (10), whereas the finite verb must always be made explicit in I. And the omission of haben / sein is not restricted to 'quasi' clauses. [Cf. the dependent clauses of sentences like er fragte, ob er ihn gesehen [habe / hätte] and als er nach Hause gekommen [war], fand er, dass . . . ] This 'dropping' of haben / sein after past participles thus need not be specially explicated in the grammar of 'quasi' clauses; it will have been taken into account elsewhere. Another distinctive feature differentiating I and II may be adduced: The subjunctive mode of the finite verb, or rather the subjunctive ([er] höre, [er] ginge) or the nonovert, 'neutral, ambiguous' mode (indicative or subjunctive, such as [er] hörte, [er] suchte) is obligatory in the I-sentences, but not in the II-sentences; for instance, er tut, als höre / hörte er nichts, but er tut, als ob er nichts hört / höre / hörte, where hört is an overtly indicative weak verb. In a recent study of German 'quasi' sentences, based on twenty-four novels, no overt indicative finite verbs were found among 737 als-clauses (I), but fifteen were found among the 187 als ob- / als wenn-clauses (II) found in the corpus.¹⁰ Consequently, the establishment of groups I, II, and III appears so far to be the simplest possible classification and if we include reference to the mode of the finite verb in the 'quasi' clause, the following three statements or formulas describe the grammar of the 'quasi' clauses in German:

I.   QC1 + Vfin subj + Z
II.  QC2 + Z + Vfin subj / ind
III. QC3 + NP + VP subj / ind

Formulas I and II uniquely define German 'quasi' clauses. They can therefore be used directly, i.e., without additional specification, as clause identification formulas in standard written German. Thus X + I + Y or X + II + Y is normally sufficient information for establishing that one is concerned with sentences or sentence sequences that include 'quasi' clauses, e.g., er sagte, als hätte er nichts verstanden, dass er es morgen versuchen werde.¹¹ Here the 'quasi' clause is included in an indirect discourse sentence, and its special formula is simply X + QC1 + Vfin subj + Z. Note that 'Vfin + Z' is an indispensable element in formula I, because of the nonunique function of als as a dependent clause conjunction (cf. als er nach Hause kam, etc.), whereas in formula II the element 'Z + Vfin' can be considered predictable, and the simplified formula X + QC2 + Z would perhaps be an adequate statement for a sentence like am nächsten Tage lag er ganz still, als ob er tot wäre. The unique function of als ob as a conjunction makes this reduction possible.

10. B. Ulvestad, "The Structure of the German Quasi Clauses," to be published in Germanic Review (1957).

11. This statement needs to be qualified to exclude some rarely occurring clauses that would seem to correspond to II in its present formulations. The following sequence was found in W. v. Niebelschütz, Verschneite Tiefen (Berlin, 1940), p. 144: 'Doch wessen das Herz hier gierig ist, weiss niemand; nur ich. Vielleicht weiss es der Ritter auch? Mag sein. Mag es sein, es wäre leichter für mich, als wenn ich's ihm sagen müsste.' The clause starting with als wenn means: 'than if I had to tell it to him.' Such dependent clauses as this are found only after comparatives in the governing clauses, here, leichter.

Formula III is more recalcitrant in that its primitive form, (-------- Ø + NP + VP) is also the statement of the structure of indirect discourse sentences with zero conjunction; e.g., er sagte, er sei krank. Actually, III formalizes a genuine overlapping or ambiguous sentence type. [Cf. such sentences as mir scheint, dass . . . , mir scheint, Ø . . . , and mir scheint, als ob . . . ] Note that our token sentence (11) above can be translated either as '... it seemed to me as though...' or as '... it seemed to me (that)...,' with only trivial difference in cognitive meaning. There are two possible ways of solving the recognition problem in this case: (1) We can add specifications as to the context of the clause and state that zero is used as a 'quasi' conjunction after governing clauses such as mir ist, es scheint, or (2) we can drop III from our 'quasi' clause formulations altogether and consider it an indirect discourse formula only (the term 'indirect discourse' being used here in its traditional meaning). The second solution seems preferable for the following reasons: The zero
conjunction occurs only after governing clauses like es scheint, mir ist, es kommt mir vor, and it is infrequently found. Only thirteen examples [such as mir schien, ich könnte sie aussprechen, jedoch fehlte das Wort (Zweig)] were found among 1168 'quasi' sentences taken from twenty-four works. This in conjunction with the basic similarities in meaning ('it seemed to me that / as though . . . .'), appears to furnish sufficient justification for operating with only two types of 'quasi' clauses, I and II, and our reduced grammar now simply reads:

I.  QC1 + Vfin subj + Z
II. QC2 + Z + Vfin subj / ind

The tense-forms of the subjunctive in such clauses need not occupy us for long. In most traditional grammars, which are usually of the prescriptive type, statements indicating the obligatory nature of past subjunctive finite verbs are found. Table I amply demonstrates that these statements are untenable and unwarranted.

Table I. Frequencies of chosen present subjunctive (c.pr.) and chosen past subjunctive (c.pt.) in three different 'quasi' clause types in novels by 24 authors.

12. The term 'chosen present/past subjunctive' means that either tense form in a given case would represent the subjunctive mode unambiguously. In other words, we are interested in the ratios between the numbers of occurrence of such forms as, e.g., [er] sei, gehe, bringe (present subjunctive) and [er] wäre, ginge, brächte (past subjunctive). The names of the authors are of no importance in this context.
We would therefore be wrong in adding the word 'past' after 'subj' in formulas I and II; the correct statement is obviously one that does not specify tense-form. If German were the output language (in which case we would be faced with a choice, see below), the grammar would read, at least for the literary style level:

I. QC1 + Vfin subj past + Z

In this formula, QC1 would include only als, not wie als, and formula II would not occur in this grammar at all, unless compelling reasons for its inclusion were discovered.¹³

A similar problem emerges with regard to the translation of German into English: Should we register both 'as if' and 'as though' as correspondent conjunctions, and if not, which one would be preferable? Let us discuss this from the point of view of a particular transfer situation. The following German sentences are all grammatically correct:

Er tat, als ob er krank wäre
Er tat, als wenn er krank wäre
Er tat, wie wenn er krank wäre
Er tat, als wäre er krank
Er tat, wie als wäre er krank

These sentences are, at least from the point of view of mechanical translation, isosemantic and can be translated as either 'he acted as if he were ill,' or 'he acted as though he were ill.' Therefore, NP + VP + 'as if' + NP + VP seems just as good a correspondence formula as NP + VP + 'as though' + NP + VP.¹⁴ However, we would reasonably argue that the slightly 'elevated,' 'literary' connotation of 'as though' in contradistinction to the more 'colloquial' one of 'as if' corresponds to that of the German als (I) and als ob (II), respectively, in which case one may suggest as an adequate German-to-English transfer grammar of 'quasi' clauses:

I.  QC1 + Vfin subj + Z → 'as though' + NP + VP
II. QC2 + Z + Vfin subj / ind → 'as if' + NP + VP

13. The reasons for preferring I (with als) to II (with als ob, als wenn) for the output grammar, if only one formula were to be employed, can be read out of the table.

14. A more complete discussion of the English correspondences would, of course, include such 'quasi' clauses as 'as though being ill.'

The concise 'quasi' clause grammar which we have worked out above could be further simplified within the context of a full-scale input grammar of German, because most, perhaps all, of the constituents would already have been described and classified. For instance, the two clauses in the sentence wenn er mich sähe, würde er grüssen belong in the same classes as some of the 'quasi' clause constructions after als in [er tat,] als wenn er mich sähe and [er tat,] als würde er grüssen, respectively.

The classification and coding of sentence elements and the subsequent elaboration of the simplest possible grammatical rules in terms of these classes are indispensable preliminaries to a successful construction of a workable translation machine. Every new grammatical statement will also represent a step forward in our scientific description of the language whose structure the grammar explicates and formalizes. The ultimate grammar will constitute the central prerequisite for a translation machine.