Corpus-based analysis, outlines and concept maps, Linguistics

Summary of corpus-based analysis

Type: Outlines and concept maps

2021/2022

Uploaded on 5 December 2023

mmasu30

10. Corpus-based discourse analysis
Corpus-based discourse analysis is a methodology that developed in the late 1990s. Because it relies on
computers to analyse data, its full development had to wait for the advent of computing.
o Corpus methodology has revolutionized nearly all branches of linguistics.
Why is corpus linguistics so important? Because:
o Its first strength is that it uses empirical data.
Empirical means that we are working with examples of real language in use.
o Its second strength is that it pools together the intuitions of a great number of speakers, each of whom
demonstrates their knowledge of the English language. A corpus thus brings together the knowledge
that speakers possess of their language.
o Its third strength is that it makes linguistic analysis more objective.
What do we use a corpus for?
o To test existing linguistic theories and hypotheses
o To generate and verify new linguistic hypotheses
o To provide textual evidence in text-based humanities
- A corpus can show us what is common and typical
- A corpus can readily give us accurate statistics
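As a rough illustration of how a corpus can show what is common and give statistics, the sketch below builds a simple word-frequency list. The two sentences are an invented toy corpus, and the helper names `tokenize` and `frequency_list` are assumptions made for this example, not part of any corpus tool.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def frequency_list(texts):
    """Count how often each word form occurs across all texts in the corpus."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


# Invented toy corpus, for illustration only (not real corpus data).
toy_corpus = [
    "The cat sat on the mat.",
    "The dog sat on the log.",
]

counts = frequency_list(toy_corpus)
total = sum(counts.values())

# Print the three most frequent word forms with raw and relative frequency.
for word, n in counts.most_common(3):
    print(f"{word}\t{n}\t{n / total:.2%}")
```

With a real corpus the same counting would run over thousands of texts, which is why the analysis is done by machine rather than by hand.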
What is a corpus?
o The word corpus comes from Latin, where it means "body"; corpus is the singular and corpora the plural.
A corpus is a collection, a body, of naturally occurring language, but it is rarely a random collection of
texts: according to Leech (1992), corpora “are generally assembled [created] with particular purposes in
mind”.
A definition that contains all the elements needed to say what a corpus is comes from Elena Tognini-Bonelli:
o A computerised collection of authentic texts that we can analyse. The texts are selected according to explicit
criteria in order to capture the regularities of a language:
- Computerised = a corpus is generally very large, so analysing it by hand is difficult and takes a long
time. Machines are therefore required, because they allow rapid analysis.
- Authentic texts = there are two types of corpus:
1. GENERAL CORPUS: serves as a resource for studies of general linguistic features.
A general corpus thus provides sample data from which we can make generalizations about
spoken and written discourse as a whole, and about the frequencies of occurrence and co-occurrence
of particular aspects of language in that discourse.
An example of a general corpus is the British National Corpus. Some corpora instead contain texts
that are sampled from a particular variety of a language, for example from a particular dialect or
from a particular subject area.
2. SPECIALIZED CORPUS: a corpus of texts of a particular type, such as newspaper articles or academic
papers.
It tells us about the use of written or spoken language in particular genres or domains of use.
A specialized corpus might be used, for example, to examine the use of hedges in casual conversation
or the ways in which people signal a change of topic in an academic presentation.
The Michigan Corpus of Academic Spoken English (MICASE) is an example; it contains only
spoken language from a university setting.
- Explicit criteria = the data we collect for the corpus is tied to the intended use of the corpus itself.
That intended use is expressed as a "research question", and this question guides the collection of our corpus.
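Selecting texts according to explicit criteria can be sketched as a simple filter over candidate texts and their metadata. This is only a minimal sketch: the `TextRecord` fields, the `select_texts` helper, the sample texts and the research question are all hypothetical, invented for this example.

```python
from dataclasses import dataclass


@dataclass
class TextRecord:
    """One candidate text plus the metadata used as selection criteria."""
    title: str
    mode: str   # "spoken" or "written"
    genre: str  # e.g. "academic", "news", "conversation"
    text: str


def select_texts(records, mode=None, genre=None):
    """Return the records that satisfy every criterion that was supplied."""
    return [
        r for r in records
        if (mode is None or r.mode == mode)
        and (genre is None or r.genre == genre)
    ]


# Invented candidate texts, for illustration only.
candidates = [
    TextRecord("lecture_01", "spoken", "academic", "so today we will look at hedges"),
    TextRecord("article_01", "written", "academic", "This paper examines hedging."),
    TextRecord("chat_01", "spoken", "conversation", "yeah I sort of think so"),
    TextRecord("news_01", "written", "news", "The council met on Monday."),
]

# Hypothetical research question: how do speakers hedge in academic speech?
# The question fixes the criteria: spoken mode, academic genre.
specialized = select_texts(candidates, mode="spoken", genre="academic")

print([r.title for r in specialized])
```

Stating the criteria as explicit parameters keeps the link between the research question and the collected texts visible and repeatable.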

The corpus must also be representative of the language, or of the genre of that language, that we are investigating, so that we can measure it against the research questions we have posed. However, achieving representativeness is an elusive and controversial goal.

As an example, consider two words: SOMEONE and SOMEBODY. Grammar reference books typically say only: someone/somebody: used to indicate an unknown, general subject. They do not investigate whether there is a distinction between the two, and if you ask speakers, they will simply say that there isn't any difference. More information can be obtained from corpora. For example, a corpus can tell us that:
● Someone is used in both written and spoken language, but is much preferred in writing;
● Somebody is used only in spoken language.

When we do corpus linguistics there are certain steps that we take:

  1. Description → we see what linguistic patterns are in the corpus
  2. Interpretation → we see how these patterns contribute towards discourses → triangulation
  3. Explanation → we see why these patterns are here, so we analyse the history/society/context
  4. Evaluation → we see who benefits or not from these patterns.
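
The description step can be sketched with the someone/somebody example from above: count each word in a written and a spoken subcorpus and normalize by subcorpus size, so that subcorpora of different lengths can be compared. The sentences below are invented toy data and the numbers they produce are not evidence about English; the helper names `tokenize` and `per_thousand` are assumptions made for this example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def per_thousand(word, texts):
    """Relative frequency of `word` per 1,000 tokens across `texts`."""
    tokens = [tok for text in texts for tok in tokenize(text)]
    if not tokens:
        return 0.0
    return 1000 * tokens.count(word) / len(tokens)


# Invented toy subcorpora, for illustration only (not real corpus data).
subcorpora = {
    "written": [
        "Someone had left the door open.",
        "The report was signed by someone in the office.",
    ],
    "spoken": [
        "well somebody told me about it",
        "did someone call or was it somebody else",
    ],
}

# Description: report the normalized frequency of each word in each subcorpus.
for mode, texts in subcorpora.items():
    for word in ("someone", "somebody"):
        print(f"{mode}\t{word}\t{per_thousand(word, texts):.1f}")
```

The later steps (interpretation, explanation, evaluation) are analytical work on top of figures like these and are not automated here.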