2- Is Corpus Linguistics a Branch of Linguistics?
The answer to this question is both yes and no. Corpus linguistics is not a
branch of linguistics in the same sense as syntax, semantics or
sociolinguistics, each of which explains or describes some aspect of
language use. Corpus linguistics, in contrast, is characterised by the
following points:
1- It is a methodology rather than an aspect of language requiring explanation
or description.
2- A corpus-based approach can be taken to many aspects of linguistic
enquiry. Syntax, semantics and pragmatics are just three examples of areas
of linguistic enquiry that have used a corpus-based approach.
3- Corpus linguistics is a methodology that may be used in almost any area
of linguistics.
4- Corpus linguistics does, however, allow us to differentiate between
approaches taken to the study of language and, in that respect, it does define
an area of linguistics or, at least, a series of areas of linguistics.
Thus we can speak of corpus-based syntax as opposed to non-corpus-based
syntax, corpus-based semantics as opposed to non-corpus-based semantics,
and so on.
5- So, while corpus linguistics is not an area of linguistic enquiry in itself,
it does, at least, allow us to discriminate between methodological
approaches taken to the same area of enquiry by different groups,
individuals or studies.
basic methodology that we can undoubtedly call corpus-based. Harris
(1993: 27) summarizes the approach well: 'The approach began ... with a
large collection of recorded utterances from some language, a corpus.'
In some ways it describes all linguistics before Chomsky and links it to the
modern methodology of corpus linguistics to which it has affinity.
4- Language acquisition
Linguistics proceeded by corpus-based description in the nineteenth
as well as the early twentieth century.
The studies of child language in the diary studies period of language
acquisition research (roughly 1876-1926) were based on carefully
composed parental diaries recording the child's locutions.
Longitudinal studies have been dominant from 1957 to the present.
They are again based on the collection of utterances, but this time about
three children are used as a source of data over time. Brown (1973) and
Bloom (1970) are both examples of longitudinal studies.
5- Spelling conventions
Käding (1897) used a large corpus of German, some 11 million words,
to collate frequency distributions of letters and sequences of letters in
German. The corpus, by size alone, is impressive for its time and compares
favourably in terms of size with some modern corpora.
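
The kind of count described above, frequencies of letters and of sequences of letters, can be illustrated with a minimal Python sketch. This is not a reconstruction of the 1897 procedure: the function name letter_ngram_counts and the sample sentence are assumptions made purely for illustration, and sequences are counted within words only.

```python
from collections import Counter


def letter_ngram_counts(text: str, n: int = 1) -> Counter:
    """Count sequences of n consecutive letters inside each word.

    Hypothetical helper for illustration; sequences never span a word boundary.
    """
    counts = Counter()
    for word in text.lower().split():
        # Keep alphabetic characters only (drops punctuation and digits).
        letters = "".join(ch for ch in word if ch.isalpha())
        for i in range(len(letters) - n + 1):
            counts[letters[i:i + n]] += 1
    return counts


# Toy sample sentence (an assumption, not data from any real corpus).
sample = "Der Hund und die Katze schlafen in der Küche"
single = letter_ngram_counts(sample, 1)  # single letters
pairs = letter_ngram_counts(sample, 2)   # two-letter sequences

print(single.most_common(3))
print(pairs.most_common(3))
```

On a real corpus the same counting would be applied to millions of words; the toy sentence only shows the shape of the result.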
6- Language pedagogy
Fries and Traver (1940) and Bongers (1947) are examples of linguists
who used the corpus in research on foreign language pedagogy. Indeed, as
noted by Kennedy (1992), the corpus and second language pedagogy had
a strong link in the early half of the twentieth century, with vocabulary lists
for foreign learners often being derived from corpora. The word counts
derived from such studies as Thorndike (1921) and Palmer (1933) were
important in defining the goals of the vocabulary control movement in
second language pedagogy.
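
As a rough illustration of how a vocabulary list can be derived from word counts, the following Python sketch ranks the words of a tiny invented corpus by frequency. The function name word_frequency_list, the tokenisation rule and the two sample sentences are all assumptions for illustration; they do not reproduce the method of any of the studies cited above.

```python
import re
from collections import Counter


def word_frequency_list(texts, top_n=5):
    """Return the top_n most frequent word forms across the given texts.

    Hypothetical helper: tokenisation is a simple lowercase letter match.
    """
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts.most_common(top_n)


# Toy corpus (invented sentences, for illustration only).
toy_corpus = [
    "The cat sat on the mat.",
    "The dog sat on the log.",
]
vocabulary = word_frequency_list(toy_corpus, top_n=3)
print(vocabulary)
```

A list for learners would then be read off from the top of such a ranking, with the cut-off point being a pedagogical decision rather than something the counts themselves determine.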
7- Comparative linguistics
Comparative linguistics also shows evidence of a corpus-based
inclination. A good example here is Eaton's (1940) study comparing the
frequency of word meanings in Dutch, French, German and Italian.
McEnery and Oakes (1996) is an example of 1990s work that uses corpora
to derive such information.
Modern contrastive corpus linguistics is casting a wider net than lexis now,
with work such as that of Johansson (1997) and work reported in Johansson
and Oksefjell (1998).
8- Syntax and semantics
The semantic frequency lists used by Eaton were also used by other
researchers interested in monolingual description. Lorge (1949) is an
example of this. Syntax was also examined. Fries (1952) is an early
example of a descriptive grammar of English based on a corpus. This
work predates the corpus-based grammars of the late 1980s, for example,
A Comprehensive Grammar of the English Language (Quirk et al., 1985),
by thirty years and more.
Indeed, it is no exaggeration to suggest that, as a methodology, the
corpus-based approach was widely perceived as being intellectually
discredited for a time. This event
can be placed so accurately because its source lies almost exclusively with
one man and his criticisms of the corpus as a source of information. That
man was Noam Chomsky.
9- Chomskyan View on Corpus Linguistics
Why study it?
1- Without studying Chomsky's criticisms, it is difficult to truly appreciate
why corpus data played only a marginal role in language studies until quite
recently, and how corpus linguistics stands in relation to other
approaches to linguistics.
2- As Chomsky's views were so deeply influential, it is important to take
his criticisms seriously and respond to them, as corpus linguists have.
Chomsky's attacks on corpus data evoked a response from linguists who
wanted to use corpus data.
3- Finally, as previously noted, the debate that Chomsky triggered in
linguistics is a very old one - the debate between rationalists and
empiricists.
10- His Contribution
o Chomsky changed the object of linguistic enquiry from abstract
descriptions of language to theories which reflected a psychological reality,
cognitively plausible models of language.
o He apparently invalidated the corpus as a source of evidence in linguistic
enquiry.
o Chomsky suggested that the corpus could never be a useful tool for the
linguist, as the linguist must seek to model language competence rather
than performance.
o Chomsky argued that it was competence rather than performance that
the linguist was trying to model. It is competence which both explains and
characterises a speaker's knowledge of the language.
11- How do we determine from any given utterance what are the
linguistically relevant performance phenomena?