Trados (beginner, intermediate) - CAT Tools, Translation notes

ENGLISH/ITALIAN - Beginner/intermediate introduction to Trados Studio, as taught in the Machine Translation course at the University of Bologna

Type: Notes

2021/2022

Uploaded on 20/02/2023

sofia.gonzalez3
Computer-assisted translation and post-editing
02.22.22
__________________________________________________________________________
Computer-assisted translation systems (CAT)
> A CAT system gives the translator a set of IT tools that support their work by
recovering text fragments and terms that have already been translated.
It is composed of 3 parts:
1. A terminology management system (termbase, TB)
2. A translation memory (TM) - the main element, without which there would be no
CAT tool
3. A translation interface with a text editor - the place where you physically type
your translation
CAT systems can be commercial (Trados Studio - by RWS, the most commonly used CAT tool,
with a share of around 80%; memoQ - by Kilgray, very user-friendly, often used) or open-source*/free*
(OmegaT - open-source CAT tool; MateCat - free, web-based CAT tool belonging
to the category of "software as a service")
*free: you can download and use them at no cost
*open-source: you can also modify them
Explaining the parts composing CAT tools.
1. Terminology management (TB)
Terminology management can be used for:
A. Storing, creating, organizing and retrieving terminology sheets (terminology databases,
aka termbases) and glossaries
> terminology sheets are more complex resources that also contain definitions,
equivalent terms, grammar notes, context, reliability degree, explanatory images etc.
> glossaries take a minimal approach: they only contain equivalent terms and notes, and are easier
to create
B. Searches, which can be performed
> manually, by the user
> automatically, by the programme (while you translate your document, the programme
automatically searches for matching terms within the CAT environment).
C. Use in combination with TMs, in particular
2. Translation Memory (TM)

A translation memory is an archive of multilingual texts that allows aligned segments to be stored and retrieved under various search conditions. In other words, with a TM you can recover old translations and re-use sentences (identical or similar) that you have already translated in previous texts. There are two methods for creating translation memories:

  • Incremental method: the TM grows automatically and gradually while you translate in the CAT tool
  • Recycling legacy content: you re-use previously translated documents by splitting them into sentences, aligning the source and target sentences, and sending the aligned sentences to the translation memory

Main requirements for using TMs:

  • you need to have the source texts (ST) in electronic format
  • the texts should be repetitive and require consistency and precision (the terminology and phraseology should be approved by the client); this is usually the case for technical texts, user manuals, and legal documents such as contracts or tender specifications
  • you need different TMs for different languages and domains, i.e. you have to differentiate your resources depending on clients and domains

KEY CONCEPTS AND TERMS ABOUT TRANSLATION MEMORIES

  1. Source text (ST) in the source language (SL); target text (TT) in the target language (TL)
  2. Translation units (TU): the unit of measurement of TMs. A TU is an aligned pair of sentences (segments), one in the SL and one in the TL, so ONE translation unit is composed of TWO segments. Example: if I have 1,000 translation units, I have 2,000 segments.
  3. Translation "candidates" (matches): while we translate, we automatically receive translation matches, which are candidates retrieved from the translation memory. They have different match levels:
     - exact/full/100% matches: identical to the segment we are translating
     - fuzzy matches (70-99%): similar to the segment we are translating; the TM only retrieves fuzzy matches between 70% and 99%. What happens below 70%? That depends on the minimum match value.
     - minimum match value (MMV): the degree of match that must exist between a source document segment and a translation memory segment for the TM segment to be offered as a match. On this basis, the TM proposes (or does not propose) translation candidates during the translation process, depending on the match percentage. The MMV is usually set to 70% by default.

I get a full match when the TM segment is identical except for a PLACEABLE ELEMENT (e.g. numbers, dates, etc.). Trados changes placeable elements automatically, so you just have to accept the match.
c. I GET A THIRD PROPOSAL: I only have to change a couple of words.
d. I GET A FOURTH PROPOSAL: a 60% fuzzy match should not appear, because the MMV I set was 70%, not lower.
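The match levels and the MMV described above can be illustrated with a minimal Python sketch. The notes do not say how Trados computes its match percentage, so `difflib.SequenceMatcher` is used here purely as a stand-in similarity measure; the function names (`similarity`, `classify`, `lookup`) and the sample TM are hypothetical.

```python
from difflib import SequenceMatcher

MMV = 70  # minimum match value in percent (the default mentioned in the notes)


def similarity(a: str, b: str) -> int:
    """Similarity of two segments as a whole percentage (0-100), rounded down.

    Assumption: difflib's ratio stands in for the real, undocumented scoring.
    """
    return int(SequenceMatcher(None, a, b).ratio() * 100)


def classify(score: int, mmv: int = MMV):
    """Label a score as 'exact', 'fuzzy', or None when it falls below the MMV."""
    if score == 100:
        return "exact"
    if score >= mmv:
        return "fuzzy"
    return None


def lookup(segment: str, tm: list, mmv: int = MMV) -> list:
    """Return (score, label, target) candidates for a segment, best first."""
    candidates = []
    for source, target in tm:
        score = similarity(segment, source)
        label = classify(score, mmv)
        if label is not None:  # matches below the MMV are never offered
            candidates.append((score, label, target))
    return sorted(candidates, key=lambda c: c[0], reverse=True)


if __name__ == "__main__":
    # Hypothetical translation units: (source segment, target segment)
    tm = [
        ("Press the power button.", "Premere il pulsante di accensione."),
        ("Press the reset button.", "Premere il pulsante di ripristino."),
    ]
    for candidate in lookup("Press the power button twice.", tm):
        print(candidate)
```

Raising or lowering `mmv` changes only which candidates are offered, not how they are scored, which is why a 60% candidate never shows up when the MMV is 70.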

3. Text Editor
The text editor is the place where the translator carries out the translation. Through it, the translator can see at the same time the text to be translated (divided into segments) and the potential proposals from the TB/glossary.

Benefits and limitations of CAT tools.
Benefits:

  • they speed up the translation process, especially for some text types such as certain legal documents and manuals; this increases your productivity
  • they ensure terminological and phraseological consistency; this means higher quality
  • they allow the management of complex translation projects: in a single project you can translate one or more texts into one or more target languages
  • they simplify the work of the translator, especially with some particular formats (you translate complex files by modifying only the translatable content, while the code, the structure and the other parts of the document are preserved)
  • they are a necessity in many technical domains; the use of CAT tools is required by many translation agencies/clients

Limitations:

  • useless/dangerous for some text types, like creative texts (novels, essays, etc.)
  • useless for texts with very long sentences, because the longer a sentence/segment is, the less likely it is to be recognised by a TM

24.02.
TRADOS STUDIO 2021
Part 1: translating single files.
The translator has a number of resources available; the translator needs to translate a single document into a given target language OR has received a project package. The resources available are:

> Translation Memory (TM) (SDLTM extension, SDL TM):
  • it's a bilingual database containing translation units (TU)
    TU: a source sentence and its target sentence stored in the database
  • it contains additional information as metadata (e.g. whether the TU was modified or not)
  • a TM works with automatic look-up (it is queried automatically while I translate) and with pre-translation
    pre-translation: comparing a text to be translated with a translation memory and automatically inserting into the target text all the segments that have exact matches in the translation memory
  • when a TM is searched it can provide 3 possible results:
    - 100% matches
    - fuzzy matches: similar matches (70-99%)
    - context matches: even more reliable than a 100% match, because they also check the correspondence of the segment before the one currently being analyzed
  • matches can be reused or edited
  • a TM can suggest whole sentences or even fragments of segments

> AutoSuggest Dictionary (AS) (BPM extension, Bilingual Phrase Mapping)

  • it's a bilingual database
  • it doesn't contain whole sentences like the TM, but sentence fragments associated with sentence fragments in the target language (which can be single words but also short expressions)
  • it works with predictive typing technology, so words are suggested automatically as you type
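Pre-translation and context matches, as described in the TM points above, can be sketched in a few lines of Python. This is a simplified model, not Studio's implementation: it assumes each translation unit also stores the source segment that preceded it, and the names `build_index` and `pre_translate` and the sample data are hypothetical.

```python
def build_index(tm_units):
    """Index TUs by source text, keeping the source segment that preceded each one."""
    index = {}
    for previous_source, source, target in tm_units:
        index.setdefault(source, []).append((previous_source, target))
    return index


def pre_translate(segments, tm_units):
    """Return (source, target, match_type) for each segment of the document.

    match_type is 'CM' (context match), '100%' (exact match) or None (left empty).
    Assumption: a context match is an exact match whose preceding segment is
    also identical to the one stored with the TU.
    """
    index = build_index(tm_units)
    result = []
    previous = None  # the first segment has no preceding segment
    for source in segments:
        target, match_type = None, None
        for tm_previous, tm_target in index.get(source, []):
            if tm_previous == previous:
                target, match_type = tm_target, "CM"
                break  # a context match is the most reliable hit
            if target is None:
                target, match_type = tm_target, "100%"
        result.append((source, target, match_type))
        previous = source
    return result


if __name__ == "__main__":
    # Hypothetical TUs: (preceding source segment, source segment, target segment)
    tm_units = [
        (None, "Introduction", "Introduzione"),
        ("Introduction", "Click OK.", "Fare clic su OK."),
    ]
    for row in pre_translate(["Introduction", "Click OK.", "Save the file."], tm_units):
        print(row)
```

Segments without an exact match are left empty here; fuzzy matches are deliberately not inserted, in line with the definition of pre-translation in the notes.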

application ribbon
file navigation pane: the rectangle on the left
view navigation buttons: you only have 4 at first, but when you open a new file two more appear
work pane: middle of the screen
status bar: in the welcome view the status bar is empty

Supported files:

> text-only formats:

  • delimited text files
  • custom text formats through a filter based on freely definable regular expressions

> PDF

> Bilingual formats:

  • DOC/X (SDL Trados)
  • TTX (SDL Trados)
  • ITD (SDLX)
  • SDLXLIFF

Translation processes: analyzing the perspective of a translator who translates directly for a client, without a translation agency in between.
Workflow:

  • they receive a file
  • they open the document and connect all the resources available: TM, TB, AS
  • they open the file in Trados Studio, which automatically converts it into an SDLXLIFF file
  • the translator translates by: looking up in the TM, TB and AS, confirming translations, adding comments, previewing, spell checking (F7), QA verification (F8), checking terminology, etc.
  • the translator finally saves the translation in the SAME format as the source text
  • the translator updates the TM (by confirming the segments), marks the file as complete and removes it from the list
  • they send the file

Practice: TRANSLATING A DOCX (WORD) DOCUMENT

  • autopropagation
  • autosubstitution of placeable elements
  • placeable elements: control + ,
  • confirming something in the TM (validating translation units): control + enter
  • saving: control + s

Single-file projects are different from normal projects because they are created automatically by Trados.

01.03.

Going from segment 7 to segment 8 you get a 99% fuzzy match; if you look above, it says that the difference is only in formatting. How do you add formatting?

  1. BOLD: the bold button available in the toolbar (or control + B)
  2. control + , (which is also what you use to add placeable elements) is the second method to add formatting: highlight the element you want to modify, then activate the QuickPlace list (keyboard shortcut)

To generate a real-time preview: go to Preview (top right, on the side) > click in the middle (Click here to generate a preview) > to close it just click outside of the window.

In-line tags (internal tags) are elements, like placeholders, used to encode extratextual elements in the segments (e.g. fn1 refers to the footnote number).
How to place the in-line tag in the segment: place the cursor just after the word "menu", then press control + , and select the tag.

Line 13 has a no-match, but I remember I already translated an expression contained in the segment (access keys), so I perform a concordance search to see how I translated it previously.
How can you perform a concordance search? Select the expression "access keys" in the source segment, then press (control +)F; this shows you all the times you translated the same expression and how you translated it.
How to insert the already translated expression "chiavi di accesso" in the other segment? 2 ways:

  • you type it manually
  • you highlight the word in the concordance search, then right-click and click on Insert into document (or control + alt + F3)

How to insert the tag pairs (they enclose a web address), 2 ways:

  • control + , > insert the first tag, then insert the second
  • first you translate the segment, then you highlight the web address, press control and, while keeping it pressed, click on the first or second tag: it should be transferred automatically to your translated text

memory > auto substitutions > date and time > this way you can select your preferred format

Segment 11 has a spelling mistake and you don't want to send a spelling mistake to your TM, so you can edit the source segment by right-clicking it. This option is not enabled by default; to activate it:
project settings > project > on the right: allow source editing > OK > go to the segment, right-click and choose "edit source"

Segment 16 was automatically completed and localized.

Segments 19 and 20 could be just one segment, but Trados automatically splits the two whenever one ends with a colon (:) and the next starts with an uppercase letter. How do you merge the two? Highlight the 1st segment, press control, press the number of the 2nd segment (20), then right-click > merge segments.

Segment 21: there is an arrow. When your source segment contains elements like soft line breaks, you have to reproduce them in the target segment! You do that with shift + enter.

TRANSLATING AN EXCEL FILE
sample_connecting > translate as single document > select languages (English US, Italian Italy) > set up the TM (use > file-based > translation memory)
now you have to upload the termbase > advanced > auto-suggest dictionaries
To upload a TB you have to select the "all language pairs" option, NOT English US - Italian Italy (because the termbase can be a multilingual database):
all language pairs > termbases > create OR connect a preexisting one > use > file-based > upload from your PC (printer TB and sample TB are the same TB) > OK > OK
Segment 5 has red and blue lines: the segment contains terms that have been recognized from the termbase. The term recognition window on the right should give you the term in the SL and its equivalent in Italian (in this case; you only see equivalents in your target language).

When you start typing the word from the termbase, you will automatically get a suggestion which you insert by pressing ENTER
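Term recognition and the typing suggestion described above can be approximated with a short Python sketch. It assumes the termbase is a plain dictionary of source terms and target equivalents and uses whole-word, case-insensitive matching; the real termbase lookup is more sophisticated, and the function names and sample entries are hypothetical.

```python
import re


def recognize_terms(segment, termbase):
    """Return (source_term, target_term) pairs for termbase terms found in the segment."""
    hits = []
    for source_term, target_term in termbase.items():
        # Whole-word, case-insensitive search for the term in the source segment
        pattern = r"\b" + re.escape(source_term) + r"\b"
        if re.search(pattern, segment, flags=re.IGNORECASE):
            hits.append((source_term, target_term))
    return hits


def suggest(typed_prefix, recognized):
    """Offer the target terms that start with what the translator has typed so far."""
    prefix = typed_prefix.lower()
    if not prefix:
        return []
    return [target for _, target in recognized if target.lower().startswith(prefix)]


if __name__ == "__main__":
    # Hypothetical EN-IT termbase entries
    termbase = {
        "print head": "testina di stampa",
        "cartridge": "cartuccia",
        "paper tray": "vassoio della carta",
    }
    hits = recognize_terms("Remove the cartridge before cleaning the print head.", termbase)
    print(hits)
    print(suggest("te", hits))
```

Only terms recognized in the current source segment are offered as suggestions, which mirrors the way the term recognition window lists equivalents for the active segment only.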

If you have content you don't need to translate, you can just:

  • copy and paste it
  • press control + Ins (Ins: Fn + Shift + M)
  • right-click + copy source to target

Save the bilingual document: control + s
Save the document as the target file (final version): control + F
Close: control + F4

TRANSLATING A PDF DOCUMENT
Only text can be translated, obviously not images. When translating a PDF document, Studio automatically converts it from PDF into DOCX. During the conversion stage a number of formatting and segmentation problems arise, so you should ask whoever sends you the files to send them in the original format rather than in PDF, if possible.

The PDF is very similar to another file you already translated, so you carry out a FILE ANALYSIS (a comparison between the document to be translated and the main translation memory connected to the document) to see how many full and fuzzy matches you can retrieve from your translation memory:
toolbar > batch tasks > analyze files > this way the file is automatically saved and closed
To see how compatible the 2 texts are, just go to view > reports: total / file details (they are the same if you're analyzing only one file).

Segments 1 and 2 should technically be merged (control + press the number of the other segment), but in this case it's not allowed, so go to project settings > project > allow source editing + enable merging segments across paragraphs.

FRAGMENT MATCHES
You translate one segment (very short, a fragment) and save it in your TM, so the next time you translate a segment containing that fragment, it is retrieved from your TM.

08.03.
Getting started part 2
Learning objectives:

Pre-production of resources:

  • how to create a TM based on previously translated documents through alignment
  • how to create an AutoSuggest dictionary
  • how to create a simple termbase
  • how to convert an Excel file into a termbase

TRANSLATION PROJECT PACKAGE FILES - TRANSLATOR'S PERSPECTIVE

  • sdlppx - file-based project packages
  • sflwsxz - server-based project packages
  • sdlrpx - return project package (PX = project package)

You open the package > you have a look at the file analysis report > you translate the content of the package using all of the tools you have* > when you're done you create the return package
*note: spell checking: F7

How to customize Trados: set a default language pair
file > options > editor > languages > select default languages > OK

How to translate file-based project packages:
home > open project package > select it (getting started 2 project package)
Project packages are based on standardized files; you can connect several resources to them (dictionaries, TM, TB, ...) and you can translate them into multiple languages. Instead of receiving a single file you receive many files: it's a sort of zip file containing the documents to be translated and all the resources needed to translate them.

When you open the PP, the files are automatically extracted and added to your Trados (kind of unzipped).
2 documents; the total number of words is 233 overall and the untranslated words are 229 (as for the translated ones: when creating the standard project, the project manager launched the pre-translation, so the matches contained in the TM were automatically added to the target sentences)
browse > desktop > CREATE NEW EMPTY FOLDER > finish

What resources are connected to the project?
project settings > language pairs > all language pairs > TM and automated translation > English-Italian TM > if you see a triangle, the TM needs to be reorganized (maintenance) > select the TM > upgrade > close
then open the termbases menu > you should see the printer termbase English US - Italian Italy > AutoSuggest dictionaries > they should be connected

Have a look at the REPORTS
The file analysis run by the project manager is visible in the report; project managers usually set their base on the pre-translated file.

  • file details: specific reports generated for this file and for the second file
  • totals: the general picture, with reports for the whole project; here you also get the number of context matches
  • context match: a 100% match that is also preceded by the same segment as in the translation memory, i.e. both the current and the previous sentence are 100% matches (more reliable than a full match)
  • cross-file repetitions: the same source segment occurring more than once, not in the same file but across the 2 different files contained in the project package

Now open the first file and translate it. How?
files view > you should see the 2 files > open the first file and translate it

comment > right click > add comment, selecting scope and severity level (in this case it's the lowest)
NOTE: since you added a comment, i.e. something you want to verify, you don't confirm the segment in the TM; just go down to the next segment with the down arrow.
Now I want to add a comment for the whole translated segment: right-click anywhere in the segment > add comment > change scope and severity level (e.g. warning) > type your comment
NOTE: the color is different because the severity level changed.
NOTE: you do not confirm this segment either.
You have translated the whole text. When you're done, you want to display only the segments containing comments:
review > all segments > segment review > with comments
You only see the commented segments > select comment > right click > delete comment
Now save the bilingual file with control + s and close the first file with control + F4.
Open the second file and start translating; when done, save the bilingual file with control + s.
You're done: create the return package (you have to send back the return package!):
project view > right click the project name > create return package

10.03.

  • Alignment of texts
  • Creation of AutoSuggest dictionaries from a preexisting TM
  • Translating new files with the TM created through the alignment process + the AutoSuggest dictionaries (aka using legacy resources)

ALIGNMENT PROJECT
Alignment is a process that allows us to import existing translations into translation memories (a preexisting TM or a new TM created on purpose), i.e. converting preexisting translations into new translation units to be inserted in a TM.
We have a source text and the corresponding text in Italian, but the content is not stored in a TM, perhaps because you worked without a TM during that translation. You want to recover and store the content of this document in a new TM because you want to re-use that information.
When you align a document with Trados or other alignment tools, there are 2 challenges:

  • segments, sentences or paragraphs were left out during translation (sometimes content is left out because it is considered not relevant for the target audience); this is a challenge because the content of the TT doesn't correspond exactly to the ST
  • the TT has a different segmentation of sentences (translators often do this; from EN into IT you might get 1 sentence out of the original 4, for example). Studio uses an algorithm to detect these differences and adjust for them, to facilitate the alignment process for users

HOW TO ALIGN
home > align documents > 3 options:

  1. when you need to align single file pairs
  2. when you have a number of texts in English and their TTs belonging to the same domain: you create 1 alignment project containing all the STs and TTs
  3. when you need to open a preexisting alignment project (useful if you need to reopen an alignment you didn't finish)

Open option 1 > now you can ADD an existing TM, or you can CREATE a new TM
create > new file-based TM > enter a NAME for the resource (the name always needs to be descriptive!) > specify location > select the alignment folder > check the language direction > finish
Add the two files to be aligned: source file > browse > upload > choose language pair > finish > you'll find yourself in the alignment view

Alignment view:

  • right: source text
  • left: TT, subdivided into a number of segments
  • middle: the connections made by Trados during the alignment process; some segments are connected with green dotted lines (not full lines), others with yellow and others with red ones. The colors indicate how reliable (according to Trados) a particular segment pair is considered (red - low, yellow - medium, green - high)

All the segment pairs need to be confirmed by a HUMAN.
Segment 1 in EN and segments 1 and 2 in IT: segment 1 EN should be aligned only with segment 1 IT; segment 2 EN should be aligned only with segment 4 IT

welcome > create AutoSuggest dictionary
An AS dictionary contains words or short expressions in one language and their equivalents in the other language. AS dictionaries are created on the basis of a translation memory, usually a big one with at least 1000 translation units.

all translation memories (SDLTM, TMX, TMX.GZ) > open the TMX file
Here you can see the number of translation units to process (you could reduce the number of units for memory reasons); keep all of them > next > select location

Save:

  1. bilingual file: control + s
  2. target file: shift + f
  3. close file: control + F4 (CHECK WHAT HAPPENED BEFORE THIS POINT; RECOVER NOTES)

You only want to view the segments containing the words "dialog box":

review > in source segments > type "dialog box"
Here you can replace all the occurrences of a word that was translated the wrong way. To do that, press CONTROL + H / home > global find and replace symbol.
Now that you have corrected the segments, go back to the original segment view by clicking on Save and close.
Open the second file to be translated (language support in English): you see that it's a 99% match and not 100% because there's a multiple translation (the match was found in both translation memories you set up, Software (main) and General (second)). To select one of them, click on it, right-click, select it and validate the segment.

Once you have translated everything contained in the project, you have to finalize it and update your translation memories.
FINALIZE: means updating translation memories and/or finalizing translations

  • Creating a new project based on another project
  • Translating an HTM project, focusing on how to merge different files in order to optimize the autosuggest feature when you have a lot of repetitions

Do the alignment exercise (see downloads).
Difference between simple and cross-file repetitions: a simple repetition is a segment occurring more than once in the same document; a cross-file repetition is a segment occurring several times across different documents.
new project > select the template (the one we created last time) > insert the name of the project (Termbase Search EN-IT) > add files (by clicking the + under project files)
Now MERGE the files: select all of them > right click > merge files > enter the name of the single SDLXLIFF file (Master File)
When you merge files you create 1 bilingual document by merging only the bilingual files; at the end you'll still have 4 different files in the original format.
next > you are using a preexisting template, so you're using the settings that were selected last time (folder: lecci project 1, TM and TB)
Open the MasterFile: in the file that opens, the orange tags indicate the end of one file and the beginning of the next.
We can optimize the autopropagation feature thanks to the many cross-file repetitions.
ctrl + s, ctrl + F4 > now you have to finalize and update your primary translation memory: projects > right click on the name of the project > batch tasks > finalize
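The difference between simple and cross-file repetitions defined above can be made concrete with a small Python sketch. It assumes the first occurrence of a segment is never counted as a repetition and that a segment already seen in an earlier file counts as a cross-file repetition; this is an assumption of the sketch, not a description of how Studio's analysis report counts. The function name and file names are hypothetical.

```python
from collections import Counter


def count_repetitions(files):
    """Count simple and cross-file repetitions in a project.

    `files` maps a file name to its list of source segments, in project order.
    Returns (simple, cross_file) as numbers of repeated occurrences.
    """
    simple = 0
    cross_file = 0
    seen_in_earlier_files = set()
    for segments in files.values():
        counts = Counter(segments)
        for segment, n in counts.items():
            if segment in seen_in_earlier_files:
                # Every occurrence repeats a segment from another file
                cross_file += n
            else:
                # The first occurrence is new; the rest repeat it within this file
                simple += n - 1
        seen_in_earlier_files.update(counts)
    return simple, cross_file


if __name__ == "__main__":
    # Hypothetical project with two files
    project = {
        "manual_a.docx": ["Click OK.", "Click OK.", "Save the file."],
        "manual_b.docx": ["Click OK.", "Print the page."],
    }
    print(count_repetitions(project))
```

Merging the files into one master file turns every cross-file repetition into a repetition inside a single document, which is why autopropagation can then fill them in as soon as the first occurrence is confirmed.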