Machine Translation Research at MIT in 1960, Lecture notes of Programming Languages

This document traces the history of mechanical translation research at MIT, including the first conference on mechanical translation in 1952 and the founding of the journal Mechanical Translation in 1954. It also describes the research approach taken by the MIT group, which emphasizes completeness and seeks definitive solutions rather than short-cut methods, and the framework within which the group is working, which involves understanding and using as much as possible of the syntax of the languages being translated and dividing the problem of mechanical translation into six parts.

[Proceedings of the National Symposium on Machine Translation, UCLA February 1960]

Session 2: CURRENT RESEARCH
MT AT THE MASSACHUSETTS INSTITUTE OF TECHNOLOGY^1
Victor H. Yngve
Massachusetts Institute of Technology

Mechanical translation has had a long history at M.I.T. Shortly after the Warren Weaver memorandum of 1949, Yehoshua Bar-Hillel became the first full-time worker in the field. He contributed many of the early ideas and will be well remembered for this. He organized the first conference on mechanical translation, held at M.I.T. in June of 1952. It was an international conference, and although there were only 18 persons registered, nearly everyone interested in MT in the world at that time was there. Of those 18 people, 4 are on the program of this conference: Leon Dostert, Victor Oswald, Erwin Reifler, and myself. The number of people here today gives a measure of how the field has grown in the intervening 7 1/2 years. And this is a national, not an international, conference. The second conference, also held at M.I.T. and also an international conference, took place in October of 1956. At that conference there were about 30 in attendance.

The reports or proceedings of both these conferences were published in the journal Mechanical Translation. This journal was founded at M.I.T. in 1954 when it became obvious that there was a need to improve communication among those interested in MT and to prevent needless duplication of effort. The journal has continued to grow. The first volume contained 57 pages. The current volume, volume five, will contain well over twice that number. Starting with the next volume we will abandon the electric typewriter and photo-offset format, and go to letterpress. This will give us a more attractive journal, will allow it to expand naturally, and will speed up the process of publication. We feel at M.I.T. that we are holding the journal in trust until the field comes of age. When the field has grown to the point where it becomes desirable to found a professional society, the journal can become its official organ.

Let us now turn to the research on mechanical translation at M.I.T. The group at M.I.T. has always stressed a basic, long-range

1 This work was supported in part by the National Science Foundation, and in part by the U.S. Army (Signal Corps), the U.S. Air Force (Office of Scientific Research, Air Research and Development Command), and the U.S. Navy (Office of Naval Research).

approach to the problem. We are placing an emphasis on completeness where completeness is possible and on the attempt to find out how to do a complete job where completeness is not now possible. We are not looking for short-cut methods that might yield partially adequate translations at an early date, an important goal pursued by other groups. Instead we are looking for methods that will be capable of yielding fully adequate results wherever they apply. We are thus seeking definitive solutions that will constitute permanent advances in the field rather than ad hoc or temporary solutions that may eventually have to be discarded because they are not compatible with improved systems.

The framework within which we are working was described about a year and a half ago in Mechanical Translation.^2 There were two main points in that paper. The first one was concerned with the aspect of completeness and with the point that it is essential for us to understand and use as much as possible of the syntax of the languages being translated. For many years the M.I.T. group has been working in the field of syntax. The other point in the paper was that it is possible, and perhaps necessary, to divide the problem of mechanical translation into six parts, each one fairly independent of the others. We are pleased that other groups are also adopting this same split, because we think it has a lot of merit. A split of the problem into six more or less separate problems is a great advantage, because not only can more people work in parallel on the over-all problem by a division of effort, but also each part is easier to solve than the whole problem.

The six-way split consists in reality of a two-way split and a three-way split. The two-way split is between the program or manipulative aspect of the problem and the static or stored knowledge aspect of the problem. Such a split would, for example, separate a recognition routine or a sentence-production routine from the grammar or rules of the language. With a split of this nature in a program it becomes much easier to make additions to the grammar rules without having to reprogram the routines. Another advantage is that the programs are easier to understand and thus easier to improve. The three-way split is

2 "A Framework for Syntactic Translation", Mechanical Translation, vol. IV, no. 3.
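
The two-way split between routine and grammar described above can be sketched in a modern language. The following Python fragment is only an illustration under stated assumptions: the toy grammar, its vocabulary, and the function name `produce` are invented here and are not taken from the M.I.T. programs or from COMIT. The grammar is held as plain data, and the sentence-production routine knows nothing about any particular rule.

```python
import random

# Static, stored knowledge: a toy phrase-structure grammar kept as data.
# The rules and words are hypothetical, chosen only for illustration.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V", "NP"]],
    "N": [["linguist"], ["program"]],
    "V": [["writes"], ["reads"]],
}


def produce(symbol, grammar, rng):
    """Manipulative part: expand a symbol top-down, left to right."""
    if symbol not in grammar:
        return [symbol]  # a terminal, i.e. a word of the language
    expansion = rng.choice(grammar[symbol])  # pick one rule for this symbol
    words = []
    for part in expansion:
        words.extend(produce(part, grammar, rng))
    return words


rng = random.Random(0)
print(" ".join(produce("S", GRAMMAR, rng)))

# Adding to the grammar needs no change to the routine itself.
GRAMMAR["N"].append(["grammar"])
print(" ".join(produce("S", GRAMMAR, rng)))
```

In this sketch the only way to change what sentences are produced is to edit `GRAMMAR`, which mirrors the advantage claimed for the split: additions to the rules do not require reprogramming the routine.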

other things, an adequate and detailed knowledge of the languages in question -- a knowledge of their formal properties as codes and a knowledge of how they are used to communicate. Linguistic research on the structure of individual languages thus constitutes an important part of our effort. German and English are being given primary attention. French is being studied also. We have no work going on in Russian. Each language is being studied as an isolated system. The relationship between languages is a separate question and is being given separate consideration.

Work on English grammar is being carried out by Edward S. Klima, David Lieberman, and V. H. Yngve. The work of Edward Klima, following the theoretical work of Noam Chomsky, has been most detailed and extensive. He has done work on the imperative, on the use of "ing", on the relative clause, on pronouns, and on negation. Some of this work has already been submitted for publication and should appear shortly.

Work on German grammar is being carried out by Joseph Applegate, John Bross, Rosemarie Straussnigg, and John Viertel. Some of the work of Joseph Applegate on the German noun phrase will be presented in a later report at this conference. Some work has also been done on German grammar by visitors in our regular summer program for visiting scholars. This includes the work on the German adverb by James Gough of Georgia Institute of Technology, and work by Leonard Brandwood of England, Bjarne Ulvestad of Norway, and Stanley Werbow of the University of Texas.

Our work on French is being carried out by David Dinneen. He is writing a French sentence production routine in COMIT. With such a routine he will be able to study certain questions of French syntax with the help of the computer.

General research on the logical structure of language is being carried out by Elinor Charney. She has started from the work that she did with Hans Reichenbach on the analysis of conversational language, and particularly on the tense forms. The results promise to be an opening wedge into many interesting problems in semantics. She is being assisted to some extent by several other members of the group.

In addition to the basic research effort into language problems, considerable effort is being made to provide adequate tools for research. At present these tools include two major sets of programs for the IBM 704 computer. The first of these is the COMIT system, a powerful programming aid which enables the linguist to do his own programming without the difficulties inherent in working through the intermediary of a professional programmer. The system will be described in a later talk. The other tool that is being provided is a method of handling large quantities of text that can be obtained from the publishing industry in the form of punched paper tape. This system of programs, which allows the computer to search through text for particular words or groups of words, is an invaluable aid to the linguist in his study of the structure of languages since it gives him ready access to his data.

The programming of the COMIT system is completed and the final check-out is in progress. We expect that it will be available for use soon. The programming has been done in a cooperative arrangement with the M.I.T. Computation Center.

When the COMIT system is finished, it will be made generally available. It is hoped that the availability of the system will materially increase the productivity not only of our own group but of many others as well. We have already been using the COMIT notation extensively in mechanical translation research at M.I.T. even though programs cannot yet be run. We have used it to write down in an unambiguous fashion our ideas on translation. This has aided greatly in clarifying our own thoughts and in communicating them to each other. We have come to realize that without an adequate notational system, research becomes very difficult.

The other set of programs, for handling large quantities of text, has now been completed and is already in use. Texts currently available include 100,000 words of American newspaper text and 100,000 words of German newspaper text, both derived from punched paper tape obtained from the publisher. A third text, consisting of U.S. Patents, is being punched by the U.S. Patent Office in a cooperative arrangement whereby they are providing text which we can use and we are providing programs tailored to their text. The design of an appropriate transliteration scheme was carried out by Kenneth Knowlton of M.I.T. and Simon Newman and Rowena Swanson of the Patent Office. A description of the system is available in a Patent Office Report. K. C. Knowlton has written the required transliteration
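
The kind of text search mentioned above, looking through a text for particular words or groups of words, can be illustrated with a short Python sketch. It is not a reconstruction of the IBM 704 programs: the function name `find_phrase`, the `width` parameter, and the sample sentence are all assumptions made for the example.

```python
import re


def find_phrase(text, phrase, width=3):
    """Return (left context, match, right context) for each occurrence.

    `phrase` may be a single word or a group of words; `width` is the
    number of context words kept on each side of a match.
    """
    words = re.findall(r"\w+", text.lower())  # crude word segmentation
    target = phrase.lower().split()
    n = len(target)
    hits = []
    for i in range(len(words) - n + 1):
        if words[i:i + n] == target:
            left = words[max(0, i - width):i]
            right = words[i + n:i + n + width]
            hits.append((" ".join(left), " ".join(target), " ".join(right)))
    return hits


# Hypothetical sample text, standing in for a newspaper corpus.
SAMPLE = (
    "The linguist reads the text. "
    "The computer searches the text for the linguist."
)

for left, match, right in find_phrase(SAMPLE, "the text", width=2):
    print(f"{left:>20} [{match}] {right}")
```

Listing each hit with a few words of context on either side is one simple way to give a linguist ready access to the data; a system working on hundreds of thousands of words would presumably need an index rather than a linear scan.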

to the discovery of some unsuspected aspects of English structure. It appears that the English sentences that occur never require more than about seven items to be remembered for future expansion. This is startling because it had previously been thought that one could have clauses within clauses without limit in English. It turns out that one can have clauses within clauses without limit only if most of them are the right-hand constituents of the construction they are a part of. In this way the speaker is relieved of having to remember to complete an indefinite number of constructions. It appears that most of the complications of English syntax can be attributed to phenomena associated with this restriction imposed by a person's memory. Some of the phenomena of English syntax that appear to be thus explained include: the hierarchy of sentence, clause, noun phrase, adjective and adverb; the different behavior of subject and object clauses; the phrase structure of the active and the passive with the "by" phrase; the reversal of order of direct and indirect object; the shifting of the position of the separable verb particle; the function of the anticipatory "it"; the first position of the interrogative pronoun; the discontinuous nature of adjectival and adverbial phrases; the position of certain adverbs before the article; the fact that when the genitive marker follows its noun phrase, it is an affix " 's", and when it precedes it is a separate word "of"; and that derivational affixes are suffixes, and prepositions, articles, and conjunctions are separate words. This work will be published soon in the Proceedings of the American Philosophical Society.

So you see that the mechanical translation research at M.I.T. is proceeding simultaneously on a number of fronts, and that some progress is being made toward a solution of the very difficult problems facing us in the development of mechanical translation to the point where mankind can count on it as a reliable means of bridging the language barriers.
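
The memory restriction discussed earlier, that only a limited number of items need be remembered for future expansion, can be made concrete with a small Python sketch. It rests on a simplifying assumption not spelled out in this text: when a construction is expanded from left to right, every constituent still waiting to the right of the one being expanded counts as one remembered item. The function names and the toy trees are hypothetical.

```python
def max_depth(tree, pending=0):
    """Largest number of postponed constituents while producing `tree`.

    A tree is either a word (a string) or a non-empty tuple of subtrees.
    `pending` counts the constituents already waiting further up the tree.
    """
    if isinstance(tree, str):
        return pending
    n = len(tree)
    # While child i is expanded, the n - 1 - i children to its right wait.
    return max(
        max_depth(child, pending + (n - 1 - i))
        for i, child in enumerate(tree)
    )


def right_branching(k):
    """Nest k constructions, each inside the right-hand constituent."""
    tree = "w"
    for _ in range(k):
        tree = ("w", tree)
    return tree


def left_branching(k):
    """Nest k constructions, each inside the left-hand constituent."""
    tree = "w"
    for _ in range(k):
        tree = (tree, "w")
    return tree


for k in (1, 5, 10):
    print(k, max_depth(right_branching(k)), max_depth(left_branching(k)))
```

Under this assumption the count stays constant however deeply constructions are nested on the right, while it grows by one with every level of nesting on the left, which is consistent with the observation that unlimited embedding is possible only in right-hand constituents.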