









Summaries of Discourse Analysis (Brian Paltridge), Language and the Internet (David Crystal), English as a Global Language (David Crystal) and Corpora in Applied Linguistics (Hunston)
Type: Course summary










What is discourse analysis?

Discourse analysis focuses on knowledge about language beyond the word, clause, phrase and sentence that is needed for successful communication. The term discourse analysis was first introduced by Zellig Harris in 1952 as a way of analysing connected speech and writing. Harris had two main interests: the examination of language beyond the level of the sentence and the relationship between linguistic and non-linguistic behaviour.

The relationship between language and context

By “the relationship between language and context” Harris means how people know, from the situation they are in, how to interpret what someone says. Discourse analysis considers the relationship between language and the contexts in which it is used and is concerned with the description and analysis of both spoken and written interactions. Its primary purpose is to provide a deeper understanding and appreciation of texts and how they become meaningful to their users.

Discourse analysis and pragmatics

Pragmatics is concerned with how the interpretation of language depends on knowledge of the real world. Pragmatics is interested in what people mean by what they say, rather than what words in their most literal sense might mean by themselves.

The discourse structure of texts

Discourse analysts are also interested in how people organize what they say, in the sense of what they typically say first, what they say next, and so on, in a conversation or in a piece of writing. Mitchell was one of the first researchers to examine the discourse structure of texts. He looked at the ways in which people order what they say in buying and selling interactions. He introduced the notion of stages, that is, the steps that language users go through as they carry out particular interactions. Mitchell discusses how language is used as cooperative action and how the meaning of language lies in the situational context in which it is used and in the context of the text as a whole.

Other researchers have also investigated recurring patterns in spoken interactions. Researchers working in the area known as conversation analysis have looked at how people open and close conversations and how people take turns and overlap their speech in conversations. Their interest is in fine-grained analysis of spoken interactions, such as the use of overlap, pauses, increased volume and pitch, and what these reveal about how people relate to each other in what they are saying and doing with language. In an ordinary conversation, the overlapping of speech may be an attempt by one speaker to take over the conversation from the other person. If the other person does not want them to take over the conversation, they may increase the volume of what they are saying. In a different situation, overlapping speech may just be a case of co-operative conversational behaviour, such as when one speaker gives feedback to another speaker.

Cultural ways of speaking and writing

One useful way of looking at the ways in which language is used by particular cultural groups is through the notion of the ethnography of communication. Hymes considered aspects of speech events such as who is speaking to whom, about what, for what purpose, where, and when, and how these impact on how we say and do things in culture-specific settings.

Communicative competence and discourse

Hymes’s notion of communicative competence is an important part of the theoretical background to the ethnography of communication as well as to communicative perspectives on language teaching and learning. Communicative competence involves not only knowing a language, but also knowing what to say to whom, and how to say it appropriately in a particular situation; it includes knowledge of rules of speaking, as well as knowing how to use and respond to different speech acts.
All of this involves taking account of the social and cultural setting in which the speaking or writing occurs, speakers’ and writers’ relationships with each other, and the community’s norms, values and expectations for the kind of interaction.
Communicative competence is described as being made up of four components: grammatical competence, sociolinguistic competence, discourse competence and strategic competence.

Discursive competence

Discursive competence draws together the notions of textual competence, generic competence and social competence. Textual competence refers to the ability to produce and interpret contextually appropriate text. Generic competence describes how we are able to respond to both recurring and new communicative situations by constructing, interpreting, using and exploiting conventions associated with the use of particular kinds of text, or genres. Social competence describes how we use language to take part in social and institutional interactions in a way that enables us to express our social identity, within the constraints of the particular social situation and communicative interaction.

Different views of discourse analysis

Discourse as the social construction of reality

The view of discourse as the social construction of reality sees texts as communicative units which are embedded in social and cultural practices. The purpose of the text influences the discourse. Discourse, in turn, shapes the range of possible purposes of texts. It is in discourse that words acquire meaning. We cannot understand the significance of any word unless we attend closely to its relationship to other words and to the discourse in which words are always embedded.

Discourse and socially situated identities

Gee argues that the ways we make visible and recognizable who we are and what we are doing always involve more than just language. They involve acting, interacting and thinking in certain ways. They also involve valuing and talking in appropriate ways with appropriate “props”, at appropriate times, in appropriate places. Discourses involve the socially situated identities that we enact and recognize in the different settings that we interact in.
They include culture-specific ways of performing and culture-specific ways of recognizing identities and activities. Discourses also include the different styles of language that we use to enact and recognize these identities, that is, different social languages.

Discourse and performance

“A Discourse is a ‘dance’ that exists in the abstract as a coordinated pattern of words, deeds, values, beliefs, symbols, tools, objects, times and places and in the here and now as a performance that is recognizable as just such a coordination.” This notion of performance and performativity is taken up by authors such as Butler, Cameron, Eckert and McConnell-Ginet. The notion of performativity derives from speech act theory and the work of the linguistic philosopher Austin. It is based on the view that in saying something, we do it.

Discourse and intertextuality

All texts make their meanings against the background of other texts and things that have been said on other occasions. All texts are in an intertextual relationship with other texts.

Differences between spoken and written discourse

Grammatical intricacy and spoken discourse

A commonly held view is that writing is more structurally complex and elaborate than speech. Halliday argues, however, that speech is no less highly organized than writing. Spoken discourse has its own kind of complexity. He presents the notion of grammatical intricacy to account for the way in which the relationship between clauses in spoken discourse can be much more spread out, with more complex relations between them, than in writing.
Language, context and discourse

The context of situation of what someone says is crucial to understanding and interpreting the meaning of what is being said. This includes the physical context, the social context and the mental worlds and roles of the people involved in the interaction. There are a number of key aspects of context that are crucial to the production and interpretation of discourse: the situational context, the background knowledge context and the co-textual context. Meaning is produced in interaction. It is accomplished by both the speaker and the listener, or the writer and their reader. Discourse, in the words of Jaworski and Coupland, is “a form of collaborative social action in which language users jointly collaborate in the production of meanings and inferences”.

Speech acts and discourse

Austin and Searle argued that language is used to do things other than just refer to the truth or falseness of particular statements. In the same way that we perform physical acts, we also perform acts by using language. We use language to give orders, to make requests, to give warnings or to give advice, that is, to do things that go beyond the literal meaning of what we say. What we say often has both a literal meaning and an illocutionary meaning, that is, the meaning which goes beyond what someone has said. There are three kinds of act which occur with everything we say:

□ the locutionary act: refers to the literal meaning of the actual words;
□ the illocutionary act: refers to the speaker’s intention in uttering the words;
□ the perlocutionary act: refers to the effect this utterance has on the thoughts or actions of the other person.

Direct and indirect speech acts

Sometimes when we speak we do mean exactly what we say. Often, however, we say things indirectly: we intend something that is quite different from the literal meaning of what we say.
Felicity conditions and discourse

For a speech act to work, Austin argued that there are a number of conditions (felicity conditions) that must be met:

□ there must be a generally accepted procedure for successfully carrying out the speech act;
□ the circumstances must be appropriate for the use of the speech act;
□ the person who uses the speech act must be the appropriate person to use it in the particular context.

In short, the communication must be carried out by the right person, in the right place, at the right time and with a certain intention, or it will not work.

Rules versus principles

Searle argued that the felicity conditions of an utterance are “constitutive rules”; that is, they are rules that need to be followed for the utterance to work. They thus constitute the particular speech act. Thomas critiques this notion of constitutive rules and suggests that the notion of principles is more helpful to this discussion. In her view, the pragmatic use of language is constrained by maxims or principles rather than by rules. A principles-based view of speech act performance describes what people often do or are most likely to do rather than what they “must” do. The analysis of speech acts then needs to take account of the fact that we are often dealing with approximations, or “fuzzy” rather than discrete categories.

Presupposition and discourse

Presupposition refers to the common ground that is assumed to exist between language users, such as assumed knowledge of a situation and/or the world.
Two main kinds of presupposition are discussed in the area of pragmatics:

□ conventional presupposition: typically linked to particular linguistic forms (would you like some coffee?)
□ pragmatic presupposition: context-dependent, arising from the use of an utterance in a particular context.

The co-operative principle and discourse

Grice argues that in order for a person to interpret what someone else says, some kind of co-operative principle must be assumed to be in operation. Grice based his co-operative principle on four maxims:

□ maxim of quality: people should only say what they believe to be true
□ maxim of quantity: we should make our contribution as informative as is required for the particular purpose
□ maxim of relation: we should make our contribution relevant to the interaction
□ maxim of manner: we should be clear in what we say; we should avoid ambiguity and obscurity

If someone is unsure of what they want to say, or wants to avoid someone inferring they have evidence for what they say, they often use metadiscourse (I may be mistaken but; maybe; I won’t bore you with all the details; by the way).

Flouting the co-operative principle

On some occasions speakers flout the co-operative principle and intend their hearer to understand this; they purposely do not observe the maxim.

Differences between flouting and violating maxims

A speaker is flouting a maxim if they do not observe a maxim but have no intention of deceiving or misleading the other person. A person is violating a maxim if there is a likelihood that they are liable to mislead the other person. A speaker may also infringe a maxim when they fail to observe a maxim with no intention to deceive, such as where a speaker does not have the linguistic capacity to answer a question. A speaker may also decide to opt out of a maxim, such as where a speaker may, for ethical or legal reasons, refuse to say something that breaches a confidentiality agreement that they have with someone.
Overlaps between maxims

An utterance may be both unclear and longwinded, flouting the maxims of manner and quantity at the same time. It may be socially acceptable to flout a maxim for reasons of tact and politeness.

Cross-cultural pragmatics and discourse

The way in which people perform speech acts, and what they mean by what they say when they perform them, often varies across cultures.

Communication across cultures

Different languages and cultures often have different ways of dealing with pragmatic issues, as well as different ways of observing Grice’s maxims. Speakers of different languages may have different understandings of the maxim of quantity. Béal found that communication difficulties occurred between English and French speakers because the English speakers saw questions such as “How are you?” as examples of phatic communication and expected short, standard answers such as “Fine, thanks”. The French speakers saw the questions as real requests for information and, in the English speakers’ eyes, flouted the maxim of quantity.
Face-threatening acts

Some acts “threaten” a person’s face. These are called face-threatening acts. Often we use mitigation devices to take the edge off face-threatening acts.

Politeness and cross-cultural pragmatics

The particular nature of face varies across cultures and politeness strategies are not necessarily universal. What may be a face-threatening act in one culture may not be seen in the same way in another.

Discourse grammar

Cataphoric reference

Cataphoric reference describes an item which refers forward to another word or phrase which is used later in the text.

Exophoric reference

Exophoric reference looks outside the text to the situation in which the text occurs for the identity of the item being referred to.

Homophoric reference

Homophoric reference is where the identity of the item can be retrieved by reference to cultural knowledge rather than the specific context of the text.

Comparative and bridging reference

With comparative reference the identity of the presumed item is retrieved not because it has already been mentioned but because an item with which it is being compared has been mentioned. A bridging reference is where an item refers to something that has to be inferentially derived from the text or situation, something that has to be presumed indirectly.

Lexical cohesion

Lexical cohesion refers to relationships in meaning between lexical items in a text and, in particular, to content words and the relationships between them.

Repetition

Repetition refers to words that are repeated in a text.

Synonymy

Synonymy refers to words which are similar in meaning.

Antonymy

Antonymy describes opposite or contrastive meanings.

Hyponymy

Hyponymy refers to classes of lexical items where the relationship between them is one of general-specific.

Meronymy

Meronymy is where lexical items are in a whole-to-part relationship with each other.

Collocation

Collocation describes associations between vocabulary items which have a tendency to co-occur, such as combinations of adjectives and nouns.

Expectancy relations

Expectancy relations occur where there is a predictable relationship between a verb and either the subject or the object of the verb (waste/time).
Conjunction

Conjunctions are described by Halliday and Hasan under the groupings of:

□ additive conjunctions: and, or, moreover, in addition
□ comparative conjunctions: whereas, but, on the other hand, equally
□ temporal conjunctions: while, when, after, meanwhile
□ consequential conjunctions: so that, because, thus, since, if, therefore

Substitution and ellipsis

With substitution a substitute form is used for another language item, phrase or group. With ellipsis some essential element is omitted from the text and can be recovered by referring to a preceding element in the text.

Theme and rheme

Theme is the starting point of a clause, what the clause is about. The remainder of the clause is the rheme. The theme also introduces information prominence into the clause. Interpersonal theme refers to an item that comes before the rheme which indicates the relationship between participants in the text, or the position or point of view that is being taken in the clause (perhaps, sometimes, generally, surely, to my mind, frankly, kindly, no doubt, hopefully). A multiple theme is one where there is more than a single thematic element in the theme component of the clause.

Thematic progression

Thematic progression refers to the way in which the theme of a clause may pick up, or repeat, a meaning from a preceding theme or rheme. One example of thematic progression is the constant theme: theme 1 is picked up and repeated at the beginning of the next clause. Another pattern of thematic progression is when the subject matter of the rheme of one clause is taken up in the theme of a following clause; this is referred to as linear theme. In a multiple theme/split rheme progression, a rheme may include a number of different pieces of information, each of which may be taken up as the theme in a number of subsequent clauses.
The medium of Netspeak

The Internet is an electronic, global and interactive medium. A user’s communicative options are constrained by the nature of the hardware. The set of characters on a keyboard determines productive linguistic capacity, and the size and configuration of the screen determine receptive linguistic capacity. Both sender and receiver are constrained linguistically by the properties of the Internet software and hardware linking them.

Speech or writing?

What makes Netspeak so interesting as a form of communication is the way it relies on characteristics belonging to both sides of the speech/writing divide. The situations of e-mail, chatgroups and virtual worlds, though expressed through the medium of writing, display several of the core properties of speech. But there are several major differences between Netspeak and face-to-face conversations:

□ there is no simultaneous feedback;
□ messages sent via computer are complete and unidirectional;
□ the message does not leave our computer until we send it;
□ there is no technical way of allowing the receiver to send the electronic equivalent of a simultaneous nod;
□ messages cannot overlap;
□ the rhythm of an Internet interaction is very much slower than that found in a speech situation.

Netspeak is also unlike speech with respect to the formal properties of the medium; chief among these properties is the domain of prosody and paralanguage as expressed through vocal variations in pitch,
The language of e-mail

We can see in e-mails a fixed sequence of discourse elements.

Structural elements

□ Headers:
  o the e-address to which the message is being sent; this is an obligatory element
  o the e-address from which the message has been sent; this is an obligatory element
  o a brief description of the topic of the message; this is an optional element
  o the date and time at which the message is sent; this is inserted automatically by the software
□ Body: this too can be viewed in terms of obligatory and optional elements. The obligatory item is a message of some sort.

Several types of e-mail have no greeting at all. Between people who know each other, greetingless messages are usually promptly sent responses. The longer the delay in responding, the more likely the response will contain a greeting. Most interpersonal messages end with a pre-closing formula and the identification of the sender. A widely held view is that the body of a message should be entirely visible within a single screen view. Writers are recommended to use a line of white between paragraphs. They are advised to use short, simple sentences.

The language of the Web

The Web holds a mirror up to the graphic dimension of our linguistic nature (graphic refers to all aspects of written language, including typewritten, handwritten and printed text). It is a mirror that both distorts and enhances, providing new constraints and opportunities. It constrains in that we see language displayed within the physical limitations of a monitor screen and subjected to a user-controlled movement (scrolling).

Hypertext and interactivity

The most important use of colour in a well-designed Web site is to identify the hypertext links, the jumps that users can make if they want to move from one page or site to another. Nothing in traditional written language remotely resembles the dynamic flexibility of the Web.
Evolution and management

The most basic semantic criteria are missing from the heavily frequency-dominated information retrieval techniques currently used by search engines. The typical problem can be illustrated by the word depression which, if typed into the search box of a search engine, will produce a mixed bag of hits in which its senses within psychiatry, geography and economics are not distinguished.

Languages on the Web

The estimates for languages other than English have steadily risen. Some predict that before long the Web will be predominantly non-English, as communications infrastructure develops in Europe, Asia, Africa and South America. In 1998, the total number of newly created non-English Web sites passed that for newly created English Web sites. The Web is increasingly reflecting the distribution of language presence in the real world.
Why a global language?

What is a global language?

A language achieves a genuinely global status when it develops a special role that is recognized in every country. Such a role will be most evident in countries where large numbers of the people speak the language as a mother tongue, as in the case of English. To achieve global status, a language has to be taken up by other countries around the world. They must decide to give it a special place within their communities. There are two main ways in which this can be done. Firstly, a language can be made the official language of a country, to be used as a medium of communication in such domains as government, the law courts, the media and the educational system. Secondly, a language can be made a priority in a country’s foreign-language teaching; even though this language has no official status, it becomes the language which children are most likely to be taught when they arrive in school. English is now the language most widely taught as a foreign language. A quarter of the world’s population is already fluent or competent in English; no other language can match this growth.

What makes a global language?

Why a language becomes a global language has little to do with the number of people who speak it. It has much more to do with who those speakers are. Without a strong power-base no language can make progress as an international medium of communication. Language exists only in the brains and mouths and ears and hands and eyes of its users. When they succeed on the international stage, their language succeeds. A language does not become a global language because of its intrinsic structural properties or because it was once associated with a great culture. These are all factors which can motivate someone to learn a language, but they are not what makes it global. Correspondingly, inconvenient structural properties do not stop a language achieving international status either.
The history of a global language can be traced through the successful expeditions of its soldier/sailor speakers, and English has been no exception. The growth of competitive industry and business brought an explosion of international marketing and advertising. Any language at the centre of such an explosion of international activity would suddenly have found itself with a global status. By the beginning of the nineteenth century, Britain had become the world’s leading industrial and trading country. By the end of the century, the population of the USA was larger than that of any of the countries of western Europe, and its economy was the most productive and fastest growing in the world. British imperialism had sent English around the globe. During the twentieth century this presence was maintained almost single-handedly through the economic supremacy of America. Economics replaced politics as the chief driving force, and the language behind the US dollar was English.

Why do we need a global language?

Translation has played a central role in human interaction. The more a community is linguistically mixed, the less it can rely on individuals to ensure communication between different groups. In communities where only two or three languages are in contact, bilingualism is a possible solution. But in regions such as Africa and South-east Asia, where many languages are in contact, this solution does not readily apply. The problem has traditionally been solved by finding a language to act as a lingua franca. Sometimes when communities begin to trade with each other, they communicate by adopting a simplified language known as a pidgin, which combines elements of their different languages. The geographical extent to which a lingua franca can be used is entirely governed by political factors. The prospect that a lingua franca might be needed for the whole world is something which has emerged strongly only in the twentieth century.
International relations

The League of Nations was the first of many modern international alliances to allocate a special place to English in its proceedings: English was one of the two official languages, the other being French, and all documents were printed in both. The League was replaced in 1945 by the United Nations. The UN now consists of over fifty distinct organs, programmes, and specialized agencies; English is one of the official languages within all of these structures. In Europe, organizations which work only in English are surprisingly common, especially in science.

The media

□ The press: the English language has been an important medium of the press.
□ Advertising: English in advertising began very early on, when the weekly newspapers began to carry items about books, medicines, tea, and other domestic products.
□ Broadcasting: it took many decades of experimental research in physics, chiefly in Britain and America, before it was possible to send the first radio telecommunication signals through the air without wires. Although later to develop, the USA rapidly overtook Britain, becoming the leading provider of English-language services abroad.
□ Cinema: the technology of this industry has many roots in Europe and America during the nineteenth century. Despite the growth of the film industry in other countries in later decades, English-language movies still dominate the medium, with Hollywood coming to rely increasingly on a small number of annual productions aimed at huge audiences.
□ Popular music: when modern popular music arrived, it was almost entirely an English scene. “Pop music is virtually the only field in which the British have led the world in the past three decades” (1996, Nick Reynolds). In the 2000s, the English-language character of the international pop music world is extraordinary. Although every country has its popular singers, only a few manage to break through into the international arena.
Introduction to a corpus in use

Corpora and the study of corpora have revolutionised the study of language, and of the applications of language, over the last few decades. Corpora allow researchers not only to count categories in traditional approaches to language but also to observe categories and phenomena that have not been noticed before.

What a corpus can do

Strictly speaking, a corpus by itself can do nothing at all, being nothing other than a store of used language. Corpus access software, however, can re-arrange that store so that observations of various kinds can be made.

□ Frequency: the words in a corpus can be arranged in order of their frequency in that corpus. An example of differences in frequency is the words man, woman, husband and wife. The totals show that man occurs more frequently than woman, and it is therefore unexpected that wife should occur more frequently than husband. The most likely interpretation is that women are more frequently referred to in relation to the person they are married to than men are. The spoken corpus, however, reverses the trend apparent in the books and Times corpora. In both cases the most frequent phrases are with possessive determiners.
□ Phraseology: most people access a corpus through a concordancing program. Concordance lines bring together many instances of use of a word or phrase, allowing the user to observe regularities in use that tend to remain unobserved when the same words or phrases are met in their normal context. It is through concordances that phraseology is observed. One point of interest is the way that phraseology can be used as an alternative view of phenomena that teachers of English are frequently called upon to explain.
□ Collocation: collocation is the statistical tendency of words to co-occur. A list of the collocates of a given word can yield similar information to that provided by concordance lines, with the difference that more information can be processed more accurately by the statistical operations of the computer than can be dealt with by the human observer. Collocation can indicate pairs of lexical items such as shed + tears, or the association between a lexical word and its frequent grammatical environment.

What corpora are used for

For language teaching, corpora can give information about how a language works that may not be accessible to native speaker intuition. In addition, the relative frequency of different features can be calculated. Increasingly, in the language classroom, teachers are encouraging students to explore corpora for themselves, allowing them to observe nuances of usage and to make comparisons between languages. Translators use comparable corpora to compare the use of apparent translation equivalents in two languages, and parallel corpora to see how words and phrases have been translated in the past. General corpora can be used to establish norms of frequency and usage against which individual texts can be measured. Corpora are also used to investigate cultural attitudes expressed through language and as a resource for critical discourse studies.
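The three kinds of observation described under “What a corpus can do” (frequency, concordance lines and collocation) can be illustrated with a minimal Python sketch. It is not the software discussed in the source: the tokenizer is deliberately naive, the function names are hypothetical, and the three-sentence sample text is invented purely for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a naive tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def frequency_list(tokens):
    """Return (word, count) pairs ordered from most to least frequent."""
    return Counter(tokens).most_common()


def concordance(tokens, node, width=4):
    """Return key-word-in-context lines: `width` tokens either side of each hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}")
    return lines


def collocates(tokens, node, span=4):
    """Count the words found within `span` tokens left and right of the node."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts


# Invented sample text, used only to exercise the functions.
sample = "She shed tears of joy. He shed his coat. They shed tears again."
tokens = tokenize(sample)

print(frequency_list(tokens)[:3])
for line in concordance(tokens, "shed", width=2):
    print(line)
print(collocates(tokens, "shed", span=1).most_common(3))
```

A real corpus tool would add proper tokenization, sentence boundaries and sorting of concordance lines; the point here is only that all three views are re-arrangements of the same stored text.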
Types of corpora
Specialised corpus: a corpus of texts of a particular type, such as newspaper editorials, academic articles, lectures, essays etc. There is no limit to the degree of specialisation involved.
General corpus: a corpus of texts of many types. It may include written or spoken language, or both, and may include texts produced in one country or many. A general corpus is usually much larger than a specialised corpus. It may be used to produce reference material for language learning or translation, and it is often used as a baseline in comparison with more specialised corpora.
Comparable corpora: two (or more) corpora in different languages or in different varieties of a language. They are designed along the same lines.
Parallel corpora: two (or more) corpora in different languages, each containing texts that have been translated from one language into the other, or texts that have been produced simultaneously in two or more languages. They can be used by translators and learners to find potential equivalent expressions in each language.
Learner corpus: a collection of texts produced by learners of a language. The purpose of this corpus is to identify in what respects learners differ from each other and from the language of native speakers, for which a comparable corpus of native-speaker texts is required.
Pedagogic corpus: a corpus consisting of all the language a learner has been exposed to. For most learners, their pedagogic corpus does not exist in physical form.
Historical or diachronic corpus: a corpus of texts from different periods of time. It is used to trace the development of aspects of a language.
Monitor corpus: a corpus designed to track current changes in a language. A monitor corpus is added to annually, monthly or even daily, so it rapidly increases in size.
Some key terms
The collocation program then calculates the frequency of each item in a 4:4 span (four words to the left and four words to the right of the node word), giving these as the fifteen most frequent collocates. The problem with a list of raw frequencies is that it is impossible to attach a precise degree of importance to any of the figures in it. Three of the most commonly used measures of significance are the Mutual Information (MI) score, the t-score and the z-score. The t-score and MI-score both depend on two calculations: how many instances of the co-occurring word are found in the designated span of the node word (Observed), and how many instances might be expected in that span, given the frequency of the co-occurring word in the corpus as a whole (Expected).
□ The t-score is calculated by subtracting Expected from Observed and dividing the result by the standard deviation.
□ The MI-score is the Observed divided by the Expected, converted to a base-2 logarithm. An MI-score indicates the strength of a collocation.
The important differences between MI-score and t-score are as follows. MI-score is a measure of strength of collocation; t-score is a measure of certainty of collocation. The value of an MI-score is not particularly dependent on the size of the corpus; for the t-score, corpus size is important. MI-scores can be compared across corpora, even if the corpora are of different sizes, but absolute t-scores cannot be compared across corpora. Looking at the top collocates from the point of view of t-score tends to give information about the grammatical behaviour of a word. On the other hand, looking at the top collocates from the point of view of MI-score tends to give information about its lexical behaviour, particularly about the more idiomatic co-occurrences.
Tagging and parsing
Corpus annotation is the process of adding information to a corpus. This information is designed to interpret the corpus linguistically, for example by indicating the word-class of each of the words in it. The term annotation is used to cover tagging, parsing and other forms of annotation.
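Returning to the significance measures defined under "Some key terms", the MI-score and t-score can be sketched in Python as follows. Two points are assumptions not stated in these notes: the Expected value is estimated by assuming the collocate is spread evenly through the corpus, and the standard deviation in the t-score is approximated by the square root of Observed, a common simplification. The function names and the figures in the example are invented.

```python
import math

def expected_cooccurrences(node_freq, collocate_freq, corpus_size, span_size=8):
    """Estimate how often the collocate would fall in the span by chance.

    Assumption: the collocate is evenly distributed, so the expected count is
    its relative frequency multiplied by the number of span positions around
    all instances of the node (a 4:4 span gives 8 positions per instance).
    """
    return collocate_freq / corpus_size * node_freq * span_size

def mi_score(observed, expected):
    """MI-score: Observed divided by Expected, as a base-2 logarithm."""
    return math.log2(observed / expected)

def t_score(observed, expected):
    """t-score: (Observed - Expected) divided by the standard deviation.

    Assumption: the standard deviation is approximated by sqrt(Observed).
    """
    return (observed - expected) / math.sqrt(observed)

# Invented figures: node occurs 100 times, collocate 2,500 times,
# in a corpus of 1,000,000 words; they co-occur 16 times in a 4:4 span.
expected = expected_cooccurrences(node_freq=100, collocate_freq=2500,
                                  corpus_size=1_000_000)
print(mi_score(16, expected), t_score(16, expected))
```

Because the t-score grows with the raw number of observed co-occurrences while the MI-score depends only on the ratio, frequent grammatical words tend to rank high by t-score and rarer, more idiomatic partners by MI-score.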
“Corpus annotation is widely accepted as a crucial contribution to the benefit a corpus brings, since it enriches the corpus as a source of linguistic information for future research and development.” (Leech)
□ Tagging means allocating a part of speech (POS) label to each word in a corpus. For example, the word light is tagged as either a verb, a noun or an adjective each time it occurs in the corpus. The tag can be chosen to give general or specific information.
□ Parsing means analysing the sentences in a corpus into their constituent parts, that is, doing a grammatical analysis.
Other kinds of corpus annotations
Different systems of annotation can be used to analyse the anaphora in a text. Most do some or all of the following:
□ identify an anaphor and its antecedent
□ categorise the antecedent
□ identify the direction of connection
□ identify the type of anaphor
Semantic annotation refers to the categorisation of words and phrases in a corpus in terms of a set of semantic fields. Each word or multi-word item from a tagged corpus is matched against a lexicon in which the items are assigned to a semantic field. Partial annotation is a variation of semantic annotation in which only certain categories are selected.
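The lexicon-matching step of semantic annotation described above can be sketched in Python. Everything specific here is hypothetical: the lexicon entries, the POS tags and the semantic field labels are invented for illustration and do not come from any real annotation scheme. The optional `fields` argument imitates partial annotation by keeping only selected categories.

```python
# Hypothetical lexicon mapping (word, POS tag) to an invented semantic field.
LEXICON = {
    ("light", "NOUN"): "PERCEPTION",
    ("light", "ADJ"): "WEIGHT",
    ("light", "VERB"): "FIRE",
    ("candle", "NOUN"): "FIRE",
}

def annotate(tagged_tokens, lexicon, fields=None):
    """Attach a semantic field to each (word, POS) pair found in the lexicon.

    If `fields` is given, only those fields are kept (partial annotation);
    items with no matching or selected field receive None.
    """
    annotated = []
    for word, pos in tagged_tokens:
        field = lexicon.get((word.lower(), pos))
        if fields is not None and field not in fields:
            field = None
        annotated.append((word, pos, field))
    return annotated

# A hand-tagged example sentence; the tags are illustrative only.
tagged = [("She", "PRON"), ("will", "AUX"), ("light", "VERB"),
          ("the", "DET"), ("candle", "NOUN")]

print(annotate(tagged, LEXICON))
print(annotate(tagged, LEXICON, fields={"WEIGHT"}))
```

The sketch also shows why tagging comes first: the word light can only be assigned to the right semantic field once its part of speech is known.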