Comparable Corpora for Translation and Language Learning (English Language lecture notes)

This document discusses the use of comparable corpora, collections of texts in different languages that are not translations of each other but are selected according to similar criteria, for translation activities and language learning. The author outlines an experiment conducted with undergraduates to produce an English translation of an Italian newspaper article, and suggests how the procedures involved can contribute to enhancing language skills. It shows how learners can use relatively small comparable corpora to contrast formal features and functional similarities and differences between the source and target languages, leading to a variety of learning opportunities beyond the specific translation task. Key aspects covered include making comparable corpora, using concordancing software to analyze word usage and discourse structures, and how this process can provide insights into language and culture while developing reading and writing abilities.

Type: Lecture notes

2023/2024

Uploaded on 20/07/2024




SWIMMING IN WORDS:

CORPORA, TRANSLATION, AND LANGUAGE LEARNING

Federico Zanettin

0. Introduction

Translation can be a means to help learners develop reading and writing skills, as well as increasing their cross-cultural and cross-linguistic awareness. Translating consists of interpreting a discourse in the language of the source text and re-interpreting it by creating another discourse in the language of the target text. By recasting discourse A into discourse B, learners manipulate language to a meaningful end, transforming a text originally created to fulfil a communicative function in the language of the source text into another which may have varying degrees of similarity to it according to the function of the target text, going from word-to-word transliteration to restatement. Seen from this perspective, translating between languages is in principle no different from translating from one language variety to another, or from one register to another: all involve a shift in perspective and in recipient design (see Newmark 1988, 1991; Snell-Hornby 1988; Hatim and Mason 1990; Bassnett and Lefevere 1990; Gentzler 1993).

Corpora consisting of texts in two languages which are similar in subject and purpose not only allow for contrastive analysis of individual expressions but also provide learners with a mapping of the structures and strategies employed by the two language communities for "building discourse in different linguistic and socio-cultural settings" (Marmadou 1990: 564). In reading a text in the L1 and trying to formulate a suitable "equivalent" in the L2, or vice versa, learners have to strive to find the most appropriate words for the new audience. This is not simply a matter of terminological accuracy, but involves comparing higher-level cultural codes concerning conceptual and rhetorical structures. This paper presents a specific example of the use in a translation task of such "comparable corpora" - two collections of texts, one in L1, another in L2, selected on the basis of a criterion of equivalence and stored on computer.
I am distinguishing here between "comparable" and "parallel" corpora, two terms which overlap in much of the literature. The term "parallel corpus" is generally used to designate a collection of texts in language A and of their translations into language B: see for example Leech and Fligelstone (1992), Baker (1995), Marinai et al. (1991). The best known such collection is probably the proceedings of the Canadian Parliament (HANSARD), which are published in both French and English (the original text may be in either language).[1] Corpora of this kind are generally aligned on a sentence-by-sentence or phrase-by-phrase basis, either through reference to a bilingual dictionary (Picchi 1991), through statistical elaboration (Langé and Bonnet 1994), or a combination of the two (Johansson et al. 1996), so that instances of any textual string can be retrieved along with its equivalents in the parallel text: such corpora have been extensively used as a basis for the creation of bi- or multilingual terminology databases and thesauri, and for developing machine translation software.[2] The term "parallel corpus" has also been used, however, to refer to collections of texts which are not translations of each other, but are selected on the basis of analogous criteria. These may either be taken from different varieties of the same language (e.g. the various components of the ICE corpus, which are taken from different geographical varieties of English: Greenbaum 1992), or from different languages, for instance collections of laws in French and Danish (Dryberg and Tournay 1990), collections of service encounters in British and Italian (Gavioli and Mansfield 1990), or collections of public signs from various English- and German-speaking countries (Snell-Hornby 1984). It is this latter type that I refer to as "comparable" corpora.[3]

In this paper I discuss the basic operations necessary to create and use small comparable corpora, outlining an experiment conducted with undergraduates to produce an English translation of an Italian newspaper article, and suggest ways in which the procedures involved may contribute to language learning. While in this case the translation was from the learners' native language (Italian) into English, the methodology would also seem appropriate to translation into the mother tongue. The objective was to write a text which would sound as if it had been taken from a British newspaper, with the aid of a corpus of comparable English and Italian newspaper texts and concordancing software. Example 1 shows the original Italian text and the translation of it into English made by one student: while the final product was individually written, much of the research using the corpus involved interaction with other learners.

Example 1

In vasca. Sorvegliato speciale e' Matt Biondi, che cerca di vincere l'oro per la terza volta consecutiva ai Giochi, sul gradino piu' alto del podio ben cinque volte nell'edizione '88. Si esibisce nei 50 e 100 stile libero, oltre che nella 4x stile libero. Re del mezzofondo e' l'australiano Kieren Perkins, primatista mondiale dei 400, 800 e 1.500 stile libero.

Swimming. Matt Biondi, the defending champion, will be trying to win gold in his third successive Olympic Games. After gaining no less than five gold medals in 1988, this time he is back to contest the 50 and 100m freestyle, and 4x100m freestyle. Kieren Perkins of Australia, the world record holder for the 400m, 800m, and 1,500m freestyle, is top performer over the longer distances.

I will go through the steps followed in making this translation. Learners contrasted similar formal features in the two corpora which may nonetheless differ functionally (false friends, loan words, near synonyms, metaphorical expressions, etc.), and compared functionally similar segments of text which may nonetheless differ in their formal realisations (rhetorical structures, contextualising information, logical connectors, terminology, etc.). In this way they can use relatively small comparable corpora for a variety of activities which not only enhance the specific translation but also allow a wide range of learning to take place.

1. Making comparable corpora

Some of the most readily available sources of computerised text are newspapers, many of which are now available on the Internet, or commercialised on CD-ROM at an affordable price. A CD-ROM usually contains up to a year of issues (8 to 10 million words of text) from which selections can be downloaded to the user's hard disk. While not all CD-ROM and online newspaper services use the same search and retrieval software, there is a tendency to standardisation and some basic operations are common to most of them. Any user (teacher or student) who is computer/network literate should be capable of creating

Example 2

Biondi expects to be back to his best
Biondi is the big man of swimming,
Matt Biondi: Swimmer. Won five golds
Matt Biondi, the defending Olympic champion,
Matt Biondi, the defending champion
Matt Biondi, the first man to win seven swimming
Biondi, who gained five golds in 1988,
Matt Biondi, winner of five gold medals
Biondi, with five golds last time
MATT BIONDI will try to slip into his `Superman' guise

In the Italian article Matt Biondi is introduced as sorvegliato speciale. This is a phrase that belongs to the language of law, used to refer to a person under police surveillance. Here it is used metaphorically to convey the idea of Biondi, champion of the '88 Olympics, being under attack and defending his supremacy. Thus among the descriptions in the English corpus, "defending champion" seemed a feasible way of translating sorvegliato speciale.

The other proper name in the Italian text is that of Kieren Perkins, often referred to as "Australia's Kieren Perkins" or "Kieren Perkins of Australia". This surprised one learner, who had hypothesised using "the Australian Kieren Perkins" in his translation. By generating sample concordances to compare the use of adjectives of nationality, country names as possessives, and "of" followed by the country name, it was found that the third of these forms was by far the most frequent when referring to contestants in the English corpus. This form was therefore selected as a translation.

After proper names, a second strategy was to look for similar expressions and/or classes of expressions in the two corpora. Work with concordancing software favours an approach which starts from a relatively low level of text constituency - the behaviour of words (Brodine, this volume; Baker 1992). What the learner is typically looking for is something of the kind "how do you say this in English?" - the equivalent of a key word.
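The concordances referred to above are key-word-in-context (KWIC) listings. As a rough illustration only, and not a description of the software the learners actually used, a minimal concordancer can be sketched in Python. The function names `kwic` and `sort_by_left_word`, the 30-character context width and the sample sentences are all assumptions made for this sketch.

```python
import re

def kwic(text, keyword, width=30):
    """Return (left, hit, right) context triples for every match of keyword.

    A '*' in keyword stands for any run of word characters, imitating the
    wildcard offered by concordancers (an assumption, not a reimplementation
    of any particular program).
    """
    # Escape the keyword, then turn the escaped wildcard back into \w*
    pattern = re.escape(keyword).replace(r"\*", r"\w*")
    regex = re.compile(r"\b" + pattern + r"\b", re.IGNORECASE)
    lines = []
    for m in regex.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append((left, m.group(0), right))
    return lines

def sort_by_left_word(lines):
    """Sort concordance lines by the word immediately to the left of the hit."""
    def key(line):
        words = line[0].split()
        return words[-1].lower() if words else ""
    return sorted(lines, key=key)

# Invented sample text, standing in for a real corpus file
sample = ("Kieren Perkins of Australia won gold. "
          "Australia's Kieren Perkins won two golds.")

for left, hit, right in sort_by_left_word(kwic(sample, "gold*")):
    print(f"{left:>30} [{hit}] {right}")
```

Sorting by the neighbouring word is what brings recurrent patterns together on the screen. A long concordance can be thinned for display by slicing the list, for example `lines[::3]` to keep every third line.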
For instance, in the first sentence of the Italian text, two more things are said about Biondi: che cerca di vincere l'oro per la terza volta consecutiva ai Giochi (lit. "who is trying to win the gold for the third consecutive time at the Games"), and sul gradino più alto del podio ben cinque volte nell'edizione '88 (lit. "on the highest step of the podium no less than five times in the '88 edition"). Concordances were therefore generated for the presumed English equivalents of the key words oro (gold), podio (podium), and consecutiva (consecutive). A concordance of gold* produced nearly 850 lines. Sorting these by the words to the left/right and skimming through them, a number of patterns were noticed, for instance that one can "win/gain/earn/get the/a gold (medal)", or "win golds". By also generating a concordance of or* in the Italian corpus (109 lines) these expressions could be analysed contrastively. In Italian you can say "vincere/conquistare/prendere la/una medaglia d'oro" (lit. win/conquer/take the/a medal of gold), "vincere/conquistare/prendere un oro" (lit. win/conquer/take a gold), or "vincere/conquistare/prendere ori/medaglie d'oro" (lit. win/conquer/take golds/medals of gold). It was noted, however, that while in English you can "win gold" (ex. 3), Italian requires a definite article: "vincere l'oro" (ex. 4):

Example 3 (gold with win* or won immediately to the left, sorted by the word to the left of gold: every third citation)

1 way ahead. I badly wanted to win gold but I accept I probably won't now.' F
2 de open,' he said. 'Anyone can win gold.' Robb's Liverpool Harriers team-mate
3 run she destroyed the field to win gold with one of the greatest track perform
4 y, they sailed brilliantly to win gold. Windsurfers Barrie Edgington, 25, and
5 ke status in Turkey after winning gold in the 60kg in Seoul in 1988, had bee
6 heir tally, Romas Ubartas winning gold in the discus for Lithuania. When Atl
7 ed strong men to tears by winning gold in Munich aged 33. Brasher (3,000m ste
8 onze, and Ann Brightwell, who won gold and silver. So you can imagine how I f
9 and, for a long time, better won gold; when he was collecting bronze medals
10 here else - East German women won gold and silver in all events at the 1986
11 0m and 400m medley while Hong won gold in the 100m butterfly. BRITISH SWIMM
12 Mike McIntyre and Bryn Vaile won gold medals in South Korea four years ago,
13 Pattison, a naval officer, won gold in the Flying Dutchman class in 1968 a

Example 4 (or* with vinc* within two words to the left, sorted by the second word to the left: all citations)

1 ct Velasco _ non solo non si vince l'oro, ma non si arriva alla finale>. Nella
2 ablo Morales, un ragazzo che vince l'oro nei 100 farfalla a 28 anni (per il nu
3 Matt Biondi, che cerca di vincere l'oro per la terza volta consecutiva ai Gio
4 di ridicolo. Due bulgari vincitori dell'oro erano risultati positivi agli ster
5 to, io prendo un sabbatico e vinco l'oro>. Spitz, forse il piu' grande nuotato
6 ", Maurizio Damilano, che ha vinto l'oro nella 20 chilometri di marcia addirit

Learners rapidly discovered that the next two key words in the first sentence of the source text, podio and consecutiva, had cognates in the English corpus, podium and consecutive. The question was: if there is a cognate form in English, is it a true or a false friend (Holmes and Guerra Ramos 1993; Partington 1995)? As can be seen from the following concordances (podium* in the English texts: ex. 5; podi* in the Italian texts: ex. 6) the sense of podium does in fact correspond to that of podio in this context:

Example 5 (every third citation)

1 ay Michael Carruth was standing on a podium in the boxing arena here, listening
2 t finish anywhere better than on the podium of an Olympic Games.' Ever since his
3 win. But I was proud to stand on the podium after a race like that. 'It's a gr
4 me away from a definite place on the podium after crushing Australia 98-65 in t
5 two Americans stood on the winner's podium to salute the anthem. True that duo
6 .60m. Despite his climbing on to the podium along with Zelezny and Raty, there w
7 as Skah stepped jauntily out to the podium matched that which accompanied his

Example 6 (every third citation)

1 e Abebe), conosceva la sua ascesa al podio (terzo posto) dopo un quarto d'ora qu
2 spettacolare autorita' la scalata al podio piu' alto del torneo a squadre costri
3 assullo e Bomprezzi sono lontani dal podio. Nella vela un avvio in sordina dopo
4 avoriti per il gradino piu' alto del podio Scarpa e Josefa Idem; outsider Rossi-
5 era mista) sul gradino piu' alto del podio, la cinesina Zhang Shang. La seconda,
6 tiste azzurre che hanno sottratto il podio piu' alto alle tedesche. Da sinistra:
7 6 e 10,8 milioni. Il terzo posto sul podio equivale, quindi, a una vittoria in u
8 empio, se Michael Jordan salira' sul podio alla cerimonia di premiazione del tor
9 i di esilio, il Sud Africa torna sul podio: i tennisti Ferreira e Norval si sono

It was noticed, though, that there was a difference in the relative frequency with which the two terms occurred. There were 27 occurrences of podio in the Italian as opposed to only 22 of podium in the English corpus, even though the latter was four times as large. Inspecting the concordances highlighted that the expression in the source text, il

strategy adopted was to examine its function as a headline which indicates that this news item is about swimming. Learners looked for expressions in the English corpus which might fulfil this function. A concordance of swim* quickly revealed that an equivalent headline appeared to be Swimming, 14 out of 102 instances of this form being found in headlines.[5]

Even where learners were relatively confident in proposing a translation, the corpus often surprised them in unexpected ways. Rather than on terminological accuracy or functional appropriacy, this often involved focusing on style: is a certain phrase "native-like", would it be used in a British newspaper? In the second sentence of the source text, the expression 50 e 100 stile libero intuitively had a straightforward equivalent in "50 and 100 freestyle". With the corpus at their disposal, some students tried looking for numbers by typing *0 as a keyword, running the search on both the Italian and the English data. As this resulted in hundreds of citations, the search had to be narrowed by adding other characters to the string, looking for words matching the strings ?00 and *00m. This highlighted the fact that in the English corpus, race distances were expressed specifying the unit of length (m or metres), in such patterns as 100 metres freestyle or 100m freestyle, while in Italian they also appeared as 100 stile libero. The only exception was in coordinate constructions ("50 and 100 metres freestyle").

Some discoveries were made purely by chance. While examining the data related to numbers, one student happened to notice the following citation (ex. 10):

Example 10

Games, is back to contest the 100 metres, but Carl Lewis, the man who inhe

of which he proceeded to view the enlarged context (ex. 11):

Example 11

Ben Johnson, banned for two years for drug abuse after the Seoul Games, is back to contest the 100 metres, but Carl Lewis, the man who inherited his gold medal, will not be there, or in the 200m, following his failure in the trials. In the 110m hurdles the world champion, Greg Foster, and world record holder, Roger Kingdom, also failed to come through the trials, as did Antonio Pettigrew, the world 400m champion, and Dan O'Brien, the world decathlon champion and subject of a massive pre-Olympic publicity campaign.

This is not an article about swimming, but it nonetheless contained a couple of expressions which seemed particularly appropriate solutions to the problems of translating si esibisce (literally "exhibits him/herself") and primatista mondiale (literally "world record holder") in the source text, insofar as neither of these terms would seem to refer exclusively to swimming. Translating si esibisce had proved problematic, with searches being run in the Italian corpus for strings corresponding to synonyms such as gareggiare and disputare, and in the English corpus for cognates and functionally equivalent words to these, such as dispute and perform. However, a search in both corpora for perform* showed that out of 219 occurrences, only 21 were verb forms, the rest mainly being the noun performance (17 of these in the compound performance-enhancing drugs/substances, and 5 as a borrowed term in the Italian texts). Example 12 lists five typical examples:

Example 12

His magnificent performance at Tokyo attracted a warm tribute ...
ANOTHER masterly performance by Steve Redgrave and Matthew Pin ...
McKean's performance was reminiscent of other runs he ...
... Carruth's gold medal-winning performance from the ringside.
Moorhouse's performance tomorrow, and that of his team ...

Rather than attempting to rearrange the text to use this nominal form, this student decided to use the phrase "[he] is back to contest", which in example 11 refers to the runner Ben Johnson, but could easily be applied to the swimmer Matt Biondi.

Chance discoveries were worth checking systematically, however. In the same paragraph on Johnson there also appeared the expression world record holder, which seemed the equivalent of primatista mondiale. Checking this equivalence by examining concordances for these expressions confirmed that they were used in similar contexts, but also showed that world record holder is usually preceded by the article (ex. 13):

Example 13 (world record holder, sorted by the first word to the left: alternate citations)

1 ound Said Aouita, the 1,500m world record holder prevented by injury from compe
2 w a double world champion and world record holder. There is also a men's 200m fr
3 Stewart, the 200m butterfly world record holder, said. Mike Barrowman, who hol
4 the Olympic title. The former world record holder said a combination of the heat
5 ved Leroy Burrell, the former world record holder who ran the second leg in the
6 year because the 400 metres world record holder had competed while banned for
7 the fact that the 400 metres world record holder had failed a drugs test. Confu
8 m backstroke. Jeff Rouse, the world record holder from Petersburg, set the early
9 be his year but he wanted the world record holder in the race to prove he could
10 European champion, three times world record holder By MIKE ROWBOTTO

In this respect it differed from Italian.
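The wildcard searches on race distances described earlier in this section (*0, then ?00 and *00m) can be imitated with shell-style pattern matching. The sketch below is only an approximation of how a concordancer treats such patterns: it uses Python's standard `fnmatch` module, a deliberately crude tokeniser, and a hypothetical helper `wildcard_hits`. The English sentence is invented for the illustration; the Italian one is taken from the source text in Example 1.

```python
import re
from fnmatch import fnmatchcase

def wildcard_hits(text, wildcard):
    """Return (token, following_token) pairs for tokens matching wildcard.

    '?' matches exactly one character and '*' any run of characters,
    following shell-style conventions (assumed here to be close to the
    concordancer's own wildcards).
    """
    tokens = re.findall(r"\w+", text)  # crude tokeniser: runs of letters/digits
    hits = []
    for i, token in enumerate(tokens):
        if fnmatchcase(token, wildcard):
            following = tokens[i + 1] if i + 1 < len(tokens) else ""
            hits.append((token, following))
    return hits

# Invented English sentence and the Italian source sentence from Example 1
english = "He is back to contest the 100m freestyle and the 400 metres medley."
italian = "Si esibisce nei 50 e 100 stile libero."

print(wildcard_hits(english, "?00"))   # three-character tokens ending in 00
print(wildcard_hits(english, "*00m"))  # distances written with the unit attached
print(wildcard_hits(italian, "*0"))    # the broad first search on numbers
```

Returning the following token alongside each hit makes it easy to see whether a distance is followed by a unit of length or directly by the name of the stroke.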
To sum up, the Olympics corpora allowed learners to contrast the source and target languages at various levels, from single words and phrases to discourse functions and organisations. By looking for patterns and regularities in the English and Italian texts and comparing the uses of words and expressions which they felt had some kind of relationship either within or across languages, they were able to find evidence to formulate and to support specific translation hypotheses. As a final step, they were invited to check how far the patterns they had observed in English might be generalisable to other contexts by comparing concordances from the Olympics corpus with data from a more general newspaper corpus (MicroConcord A: Scott and Johns 1993).
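Comparisons across corpora of different sizes, such as the 27 occurrences of podio against 22 of podium reported earlier in this section, only make sense once raw counts are scaled by corpus size. The sketch below works that example through. Since the absolute word counts are not given in this excerpt, the sizes are expressed in relative units (Italian corpus = 1, English corpus = 4, following the statement that the English corpus was four times as large); the function name `relative_frequency` is an assumption of this sketch.

```python
def relative_frequency(count, corpus_size):
    """Occurrences per unit of corpus size."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return count / corpus_size

# Counts from the text; sizes in relative units (Italian = 1, English = 4)
podio = relative_frequency(27, 1.0)
podium = relative_frequency(22, 4.0)

ratio = podio / podium
print(f"podio is about {ratio:.1f} times as frequent as podium")
```

On these figures podio is roughly five times as frequent, relative to corpus size, as podium, which is a much larger gap than the raw counts of 27 and 22 suggest.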

3. Learning to create meaning

Like any other process of discourse construction, translation involves creating meaning, the difference being that translation is "guided creation of meaning" (Halliday 1992: 15). Using comparable corpora in this process increases the available guidance, insofar as it means building up the text using ready-made chunks of language, which have been used in similar contexts on similar occasions, selecting those judged most appropriate to convey the desired meaning. This does not mean, however, that translating simply becomes a "cut and paste" activity. The various pieces found will rarely fit together exactly, but have to be adjusted and linked in order to create a new text which is more than just a patchwork of pieces stolen from elsewhere. For instance, in the first sentence of the target language text in

contexts. By supplying meaningful instances of real language in use whose full context is always available at the touch of a key, concordances offer the learner both greater safety of numbers and greater certainty of contextual appropriacy than do dictionary examples. The argument concerns grammar as well as lexis: while one feature which seems particularly difficult for learners to master from dictionaries or grammars is article use, the corpora provided clear evidence as to the use of determiners with the phrases "win gold", "world record holder" and their Italian equivalents. They similarly highlighted regularities in the use of proper names, showing the preferred constructions used to refer to a country of origin (cf. 2 above; see also Bertaccini, this volume). Lastly, corpora also constitute a source of extralinguistic (world) knowledge, by providing information about people, places and institutions.

The specific information drawn from these small comparable corpora must not of course be treated as generalised "facts". The corpora do not claim to be representative of any wider category of texts, and features typical of them should not be interpreted as necessarily typical of the language as a whole, or even of sports journalism. They are reliable to the extent that specific features are attested with a certain frequency, in a set of texts which are credibly analogous. While certain findings, such as the preference in English for nominal constructions with performance over the use of the verb perform, may have a wider bearing, such hypotheses must always be checked prior to generalisation, and the learner needs to be wary of the influence which the specific composition of the corpus exerts on the meaning and function of items of all kinds (see Zorzi, this volume).

It is important to be careful about what may well be domain-specific words, senses and distributions. For instance, the word golds occurs over 50 times in the English Olympics corpus. Yet gold is usually classed as an uncountable noun in EFL textbooks (see e.g. Fowler et al. 1983: 71), and there are no examples of golds at all in the much larger and more varied MicroConcord A newspaper corpus. This suggests that the plural golds may only have the specific sense of "gold medals" in reference to sporting events. In contrast, both gold and golden occur as adjectives in the Olympics corpus, golden being sometimes used where in Italian we find d'oro (as in ragazza d'oro: "golden girl"). The latter example suggests that whereas gold indicates the metal, golden indicates the metaphorical quality, a hypothesis which is in fact confirmed by a search for these words in the MicroConcord A corpus. While not immediately useful for the translation, where it was clear from the start that the English for medaglia d'oro was "gold medal", this finding exemplifies the way analyses of comparable corpora can throw up hypotheses which may be noted and investigated as learning spin-offs for future use.

The findings of a search for an appropriate translation of re in re del mezzofondo (literally, "king of the middle distance") were also not used in constructing the TL text, but they again illustrate how the process of using the corpora can lead to unforeseen learning. In the English corpus, king was almost exclusively used to refer to Juan Carlos, the king of Spain. The sole metaphorical instance referred to an activity rather than an individual:

Basketball is the king in Lithuania and they were hoping to use the sport to strike a blow against the Commonwealth of Independent States - which they still regard as a symbol of the old Soviet regime.

Another possible translation of re was big man, a phrase found referring to Matt Biondi (see ex. 2 above). It was rejected for the reason that this use was not only metaphorical but also literal ("Biondi is the big man of swimming, standing 6ft 7in tall and the winner of six Olympic gold medals"). A further candidate was top performer, which had been noticed during another search. On searching for top, two occurrences of top dog were also found. However, both of these were in quoted statements, suggesting that this expression might be more typical of spoken registers, and top performer was the form eventually chosen. But as much if not more was probably learned from investigating hypotheses that were rejected than from examining those which were finally adopted in the target text.
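The check on golds described above, attested many times in the specialised Olympics corpus but absent from the general MicroConcord A corpus, amounts to counting a form in two corpora and flagging it when it occurs in only one. A minimal sketch follows; the helper names `count_form` and `attested_only_in_specialised` are hypothetical, and the two one-line "corpora" are invented stand-ins for real data.

```python
import re

def count_form(text, form):
    """Count whole-word, case-insensitive occurrences of form in text."""
    pattern = r"\b" + re.escape(form) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

def attested_only_in_specialised(form, specialised, reference):
    """True if form occurs in the specialised corpus but never in the reference one."""
    return count_form(specialised, form) > 0 and count_form(reference, form) == 0

# Invented miniature corpora, for illustration only
olympics = "Biondi, who gained five golds in 1988, wants another gold."
general = "The price of gold rose sharply. She wore a golden ring."

for form in ("golds", "gold"):
    flagged = attested_only_in_specialised(form, olympics, general)
    print(form, "-> possibly domain-specific" if flagged else "-> attested in both or neither")
```

A flag of this kind is only a prompt for closer inspection: absence from a reference corpus, especially a small one, does not by itself show that a form is restricted to one domain.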

5. Conclusions

Using comparable corpora and concordancing software for translation activities can help learners gain insights into the languages and the cultures involved and develop their reading and writing skills. By its very nature, translation is an activity in which the negotiation of meaning is mediated through the written medium, where the learner interacts first as a reader with the producer of the source text and then as a writer with the recipient of the target text. While the interaction which takes place between learner, text and computer is in primis an individual activity, it does not exclude extension to an interactive classroom setting. Duff (1989) argues that many translation activities can form a basis for group work and oral discussion, in which learners engage in a meaningful negotiation of meaning and exchange of their own contributions. Comparable corpora can also be a springboard for such activities. Besides the reward of a more natural-sounding final text - Appendix B gives examples of other translations of the same source text carried out without the aid of the corpus - I have tried to show that much else can be achieved through their use.

Notes

  1. Other aligned parallel corpora include those being developed from EC official journals and telecommunication texts (McEnery and Wilson 1994), from the multilingual technical manuals of computer companies like IBM or Microsoft, as well as from more heterogeneous sources (Johansson et al. 1996).
  2. Whereas machine translation (MT) issues are not within the scope of this paper, it may be useful to point out briefly possible areas of overlap, insofar as much of the work carried out with parallel and comparable corpora has been intended to have applications in this field. Wills (1993) distinguishes between four different procedures which go under the heading of MT: (a) word-for-word substitution, (b) machine-aided human translation (MAHT), (c) human-aided machine translation (HAMT) and (d) fully automatic machine translation (FAMT). The first of these is simply "a form of the interlinear version known from the Middle Ages" (Wills 1993: 1) and is generally of little use; MAHT consists essentially in a word processor equipped with the capability of interfacing with dictionaries and terminological data banks, which "may contain [...] a device to specify certain words in certain contextual environments"; HAMT, which requires human intervention in either or both the editing and the post-editing phase, is the area where most commercial software has been developed, ranging from professional tools such as IBM Translation Manager/2, Trados Translator's Workbench or Globalink to less sophisticated programs like Microtac Language Assistants. HAMT is also the

future research". Target, 7. 223-243.
Baker, M. 1996. "Corpus-based translation studies - the challenges that lie ahead". Paper presented at Unity in Diversity? International translation studies conference, Dublin City University, 9-11 May 1996.
Bassnett, S. and A. Lefevere (eds) 1990. Translation, history and culture. London: Pinter.
Butler, C. (ed) 1992. Computers and written texts. Oxford: Blackwell.
Cowie, A.P. 1992. "Multiword lexical units and communicative language teaching". In P.J.L. Arnaud and H. Bejoint (eds), Vocabulary and applied linguistics. London: Macmillan. 1-12.
Dryberg, G. and J. Tournay 1990. "Définition des équivalents de traduction de termes économiques et juridiques sur la base de textes parallèles". Cahiers de lexicologie, 56-57. 261-274.
Duff, A. 1989. Translation. Oxford: Oxford University Press.
Fontanelle, T. 1994. "Towards the construction of a collocational database for translation students". Meta, 39/1. 47-58.
Fowler, W.S., J. Pidcock, R. Rycroft and G. Del Giudice 1983. Sprint: a complete English programme. London: Nelson.
Gavioli, L. and G. Mansfield (eds) 1990. The PIXI corpora: bookshop encounters in English and Italian. Bologna: Cooperativa Libraria Universitaria Editrice.
Gentzler, E. 1993. Contemporary translation theories. London: Routledge.
Greenbaum, S. 1992. "A new corpus of English: ICE". In J. Svartvik (ed), Directions in corpus linguistics. Berlin: Mouton de Gruyter. 171-179.
Halliday, M.A.K. 1992. "Language theory and translation practice". Rivista internazionale di tecnica della traduzione, 0. 15-26.
Hatim, B. and I. Mason 1990. Discourse and the translator. London: Longman.
Holmes, J. and R. Guerra Ramos 1993. "False friends and reckless guessers: observing cognate recognition strategies". In T. Huckin and J. Coady (eds), Second language reading and vocabulary acquisition. Norwood, NJ: Ablex. 86-108.
Johansson, S., J. Ebeling and K. Hofland 1996. "Coding and aligning the English-Norwegian parallel corpus". In K. Aijmer, B. Altenberg and M. Johansson (eds), Languages in contrast. Lund: Lund University Press. 87-112.
Laffling, J. 1992. "On constructing a transfer dictionary for man and machine". Target, 4. 17-31.
Langé and Bonnet 1994. "The multiple uses of parallel corpora". Paper presented at the 1st International Conference on Teaching and Language Corpora (TALC), Lancaster University, 10-13 April 1994.
Leech, G. and S. Fligelstone 1992. "Computers and corpus analysis". In Butler. 115-140.
Lewis, D. 1992. "Computers and translation". In Butler. 75-113.
Marinai, E., C. Peters and E. Picchi 1991. "Bilingual reference corpora: a system for parallel text retrieval". Using Corpora: Proceedings of the Seventh Annual Conference of the UW Centre for the New OED and Text Research. St. Catherine's College: Oxford. 63-70.
Marmadou, S. 1990. "Contrastive analysis at the discourse level and the communicative teaching of languages". In J. Fisiak (ed), Further insights into contrastive linguistic analysis. Amsterdam: Benjamins. 561-571.
McEnery, A. and A. Wilson 1994. "Corpora and translation: uses and future prospects". In M.A. Lorgnet (ed), Atti della fiera internazionale della traduzione II. Bologna: Cooperativa Libraria Universitaria Editrice. 311-343.
Newmark, P. 1988. A textbook of translation. London: Prentice-Hall.

Newmark, P. 1991. About translation. Clevedon: Multilingual Matters.
Partington, A. 1995. "'True friends are hard to find': a machine-assisted investigation of false, true and just plain unreliable 'friends'". Perspectives: Studies in Translatology, 95:1. 99-112.
Peters, C. and E. Picchi. "Bilingual reference corpora for translators and translation studies". Paper presented at Unity in Diversity? International translation studies conference, Dublin City University, 9-11 May 1996.
Picchi, E. 1991. "DBT: a textual database system". In L. Cignoni and C. Peters (eds), Computational lexicology and lexicography. Linguistica computazionale, 7. 177-205.
Scott, M. and T. Johns 1993. MicroConcord ver. 1.0. Oxford: Oxford University Press.
Scott, M. and T. Johns 1993. MicroConcord - Corpus A: The Independent and The Independent on Sunday. Oxford: Oxford University Press.

Snell-Hornby, M. 1984. "The linguistic structure of public directives in German and English". Multilingua, 4. 203-211.
Snell-Hornby, M. 1988. Translation studies: an integrated approach. Amsterdam: Benjamins.
Somers, H.L. 1993. "Current research in machine translation". Machine translation, 7. 231-

Wills, W. 1993. "Basic concepts of MT". Meta, 38. 403-413.
Zanettin, F. 1994. "Parallel words: designing a bilingual database for translation activities". In A. Wilson and T. McEnery (eds), Corpora in language education and research: a selection of papers from Talc 94. UCREL technical papers, 4. Lancaster: UCREL. 99-111.

Appendix A

1 in the centre lane. On his right was Biondi, 26, with Jager, 27, on his left. Th
2 s showed no respect for reputations. Biondi, 26, made a brave attempt to add to
3 force his way into the record books. Biondi, a giant of a man in both stature -
4 1 sec. Foster was sixth in 22.52 and Biondi, a five-time winner in Seoul, missed
5 rged from the dive just behind Matt Biondi and Alexander Popov it did not augur
6 for Spain it was disappointment for Biondi and Evans in the 100m and 400m frees
7 s the United States favourites, Matt Biondi and Janet Evans, failed to retain th
8 ludes the outstanding talent of Matt Biondi and Tom Jager, of the United States.
9 ast night, which, on a day when Matt Biondi and Janet Evans were racing, was a s
10 lute the anthem. True that duo, Matt Biondi and Janet Evans, heard it often eno
11 st Borges was given the same time as Biondi but after the officials looked at th
12 ming golds were awarded to the pair, Biondi collecting five. Yesterday Evans ha
13 and my shoulders get cramped up.' Biondi expects to be back to his best by th
14 00m and 400m freestyle respectively. Biondi is the big man of swimming, standing
15 50-metre freestyle, and two relays. Biondi is hoping that a combination of less
16 son, Sergei Bubka, Mike Powell, Matt Biondi, Li Jing, Michael Jordan, too. Wha
17 DTL 30 JUL 92 / Olympics '92: Biondi must conquer the threat of Popov - S
18 reestyle title, writes Colin Gibson. Biondi (pictured), 27 the day after the Gam
19 for a run in the morning. Perversely Biondi's recent dip in form - he was third
20 and a bronze in Seoul MATT BIONDI returns to the Olympic fray in Barce
21 e of a factor of working too hard,' Biondi said. 'Fortunately, we're over the a
22 Foster (50m freestyle) might ruffle Biondi's supremacy. The women's chances ar
23 tsteps over the next fortnight? Matt Biondi: Swimmer. Won five golds, a silver
24 sprint freestyle relay to make Matt Biondi the first male to win seven swimming
25 tenders are the American duo of Matt Biondi, the defending Olympic champion, an
26 ailed to qualify for the final. Matt Biondi, the defending champion, and Tom Jag
27 too uspset to speak and walked out. Biondi's vulnerability was first hinted at
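
Readers without access to a dedicated concordancer may find it helpful to see how a keyword-in-context (KWIC) display like the one above can be produced. The following Python sketch is only an illustration under stated assumptions: it is not how MicroConcord works internally, the function name `kwic` and the `width` parameter are hypothetical, and the sample string simply joins two fragments copied from the concordance lines above rather than coming from the full corpus.

```python
import re

def kwic(text, keyword, width=36):
    """Return one line per match, with `width` characters of context on each side.

    Hypothetical helper for illustration; not taken from any concordancing package.
    """
    # Collapse line breaks and repeated spaces so the context reads as running text.
    text = " ".join(text.split())
    lines = []
    # Whole-word, case-insensitive search (so "BIONDI" and "Biondi's" are both found).
    pattern = r"\b" + re.escape(keyword) + r"\b"
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-justify the left context so that the keyword starts in the same column.
        lines.append(f"{left:>{width}}{m.group(0)}{right}")
    return lines

# Sample built from two fragments of the concordance lines above (assumption: joined for demo only).
sample = (
    "Biondi is the big man of swimming, standing "
    "Foster (50m freestyle) might ruffle Biondi's supremacy."
)
for number, line in enumerate(kwic(sample, "Biondi", width=30), start=1):
    print(f"{number:>2} {line}")
```

Padding the left context to a fixed width is what aligns the search word in a single column, which makes recurrent patterns on either side of it (for example "Matt Biondi and ...") easier to spot.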