Susan Jean Howcroft
English For Science and Technology: a Computer Corpus-based
Analysis of English Science and Technology Texts for Application
in Higher Education.
Universidade de Aveiro, Departamento de Línguas e Culturas
1999
Examiners
Prof. Dr. Anthony David Barker
Associate Professor, Departamento de Línguas e Culturas,
Universidade de Aveiro
Acknowledgements
Abstract
Resumo (Summary)
This thesis presents two analyses: first, an analysis of computer
corpora, created from the textbooks used by undergraduate students, to
isolate the (American) English of science and technology that they
present; second, an analysis of the knowledge of English that these
students have when they begin their university studies in science and
technology. These two analyses are contrasted so that the results
obtained can be applied to the design of an English language syllabus for
first-year students.
A range and frequency list of the words in a broad-based corpus was
created, to be contrasted with the main corpora compiled from the
physics and chemistry books listed in the students’ bibliographies, as a
source for syllabus design. Next, four corpora, two main corpora and two
sub-corpora, produced from the physics and chemistry books referred to
in the students’ bibliographies, were analysed using the algorithms and
functions of Biber (1988) for variation between spoken and written
language.
Over five years, students were tested on entry to the University and
the results were analysed. Considerable variation was found in the
students’ level of knowledge of the language. However, there was a close
correlation between the students’ competence and the number of years
they had studied English in secondary school. Nevertheless, there were
students with extremely advanced competence and others with limited,
or almost no, competence in English. Comprehension of scientific texts
was generally correlated with the more advanced levels of competence
and a greater number of years of study.
The range and frequency list showed the appropriate contexts of the
materials to be used with these students and demonstrated that there
were differences from many of the accepted views of the language of
science and technology. The analysis of the computer corpora varies from
Biber’s categories of the language of academic prose. The sub-corpora
show greater variation, which is thought to be due to specific cultural
and/or literary materials used in the analogies in the textbooks.
The heavy load of background knowledge that students need in
order to work adequately with the textbooks was also found in the
exercises they have to do to practise what is covered in the chapter
topics. This, together with the interpretation of the images in the books,
were considered the two main factors needing emphasis in the syllabus
for first-year students. However, given the time constraints on language
teaching for science and technology students, the methodology that
would lead to greater learner autonomy will be based on the use of
computer corpora (data-driven learning) and computer-assisted distance
learning.
CONTENTS
Jury..................................................................................................................... iii
Acknowledgements............................................................................................ v
Abstract............................................................................................................. vii
Resumo .............................................................................................................. ix
Contents............................................................................................................. xi
Chapter 1 Introduction............................................... 3
1.1 Science and Technology Education ............................................11
1.2 Lifelong Education.....................................................................15
1.3 The Impact of New Technology ..................................................21
1.4 The Dominance of English in Science and Technology................26
1.5 The Situation in Portugal ...........................................................28
1.6 Science and Technology Undergraduates and English ................32
1.7 The Ano Comum..............................................................................34
1.8 Appropriate Text Types ..............................................................36
1.9 The Corpora ..............................................................................37
1.10 CD ROM Material ....................................................................38
1.11 The Syllabus ...........................................................................42
1.12 The Research...........................................................................43
1.13 Methodology ............................................................................44
Appendices............................................................... 369
Figures
3.1 Pie Graph for the Academic Year 1993/94 showing the Students’
Number of Years of English ........................................................145
3.2 Pie Graph for the Academic Year 1994/95 showing the Students’
Number of Years of English ........................................................146
3.3 Pie Graph for the Academic Year 1995/96 showing the Students’
Number of Years of English ........................................................146
3.4 Pie Graph for the Academic Year 1996/97 showing the Students’
Number of Years of English ........................................................147
3.5 Pie Graph for the Academic Year 1997/98 showing the Students’
Number of Years of English ........................................................147
6.1 Dimension 1 ‘Involved versus Informational Production’ ...........250
6.2 Dimension 2 ‘Narrative versus Non-Narrative Concerns’...........256
6.3 Dimension 3 ‘Explicit versus Situation-Dependent Reference’...261
6.4 Dimension 4 ‘Overt Expression of Persuasion’ ..........................265
6.5 Dimension 5 ‘Abstract versus Non-Abstract Information’ ..........268
6.6 Dimension 6 ‘On-Line Informational Elaboration’ ......................271
6.7 Dimension 1 ‘Involved versus Informational Production’ for the
Academic Prose Sub-Genres............................................................275
6.8 Dimension 2 ‘Narrative versus Non-Narrative Concerns’ for the
Academic Prose Sub-Genres............................................................277
6.9 Dimension 3 ‘Explicit versus Situation-Dependent Reference’ for
the Academic Prose Sub-Genres......................................................278
6.10 Dimension 4 ‘Overt Expression of Persuasion’ for the Academic
Prose Sub-Genres ............................................................................280
6.11 Dimension 5 ‘Abstract versus Non-Abstract Information’ for the
Academic Prose Sub-Genres............................................................281
6.12 Dimension 6 ‘On-Line Informational Elaboration’ for the Academic
Prose Sub-Genres ............................................................................283
Tables
2.1 Munby’s Communicative Needs Processor ...............................90
2.2 Texts, Categories and Numbers of Words in the LOB Corpus .101
2.3 Texts, Categories and Numbers of Words in the London-Lund
Corpus .......................................................................................104
2.4 Categories and Percentages in the British National Corpus....106
2.5 Conversation in the British National Corpus .........................107
2.6 Number of words and texts for Academic Prose and Fiction in the
Longman/Lancaster Corpus ......................................................108
3.1 Huddlestone’s Level of Science Texts ....................................126
3.2 Darian’s Level of Text and Audience .....................................126
3.3 Students’ Number of Years of Study of English.....................143
4.1 Analysis of 1995/96 Test Results by Item.............................166
4.2 Analysis of 1996/97 Test Results by Item.............................168
4.3 Analysis of 1997/98 Test Results by Item.............................171
5.1 Grolier Frequency and Range List.........................................185
5.2 Frequency and Range Results for Abstract Nouns and Adjectives
............................................................................................188
5.3 Normalised Frequencies from the Main Corpora compared to
Biber’s Academic Prose with Statistical Significance Values (chi-
square χ2) ..................................................................................209
5.4 The Physics Main Corpus: Significantly Higher and Lower
Results ......................................................................................212
5.5 The Chemistry Main Corpus: Significantly Higher and Lower
Results ......................................................................................212
5.6 Normalised Frequencies from the Sub-Corpora compared to
Biber’s Academic Prose with Statistical Significance Values (χ2)..215
5.7 The Physics Sub-Corpus: Significantly Higher and Lower
Results ......................................................................................218
5.8 The Chemistry Sub-Corpus: Significantly Higher and Lower
Results ......................................................................................219
6.1 Mean scores of each of the Dimensions compared with Biber’s
Academic Prose corpus results...................................................249
6.2 The main physics and chemistry corpora compared with Biber’s
Academic Prose sub-Genres .......................................................274
6.3 The physics and chemistry sub-corpora compared with Biber’s
Academic Prose Sub-Genres ......................................................275
Abbreviations used in this thesis:
MT Mother Tongue
Chapter 1 Introduction
1 For example, John Higgins’ and then later Martin Phillips’ work with the British Council Project to
develop software for language teaching, much of which was published in the 1980’s in collaboration with
Cambridge University Press, initially for the BBC computers and then for IBM compatible
microcomputers.
because they had incompatible operating systems. This situation was
made even more difficult as many academics had chosen the Apple
Macintosh computer as the most suitable for academic research.
However, at this stage, education policy was encouraging the use of
computers and the teaching of information technology, as the use and
application of computer technology came to be known². This meant that
towards the end of the 1980’s teacher training courses were beginning to
include CALL training (see Birnbaum 1987:19-20, Heppel 1987:20-21).
Being at the cutting edge of the technological revolution was seen to be of
prime importance not only to teachers (Dunn and Morgan 1987) but also
to governments who believed that their future economic success in the
world depended upon this change in education (see later 1.1 Science and
Technology Education).
The availability of computers and their more widespread use in
education also led to changes in the way that teachers prepared their
work. Initially the opportunities for word processing made a big change in
the preparation of materials and tests. Teachers began to write their own
materials directly through the computer rather than relying on the
support of a secretary or on the traditional cut and paste techniques
which followed widespread use of photocopying. Predictions were made at
this time that we were on the verge of the “paperless” office. It was
believed that the need for paper filing systems would disappear because
computer data was stored on floppy disks. It is now recognised that the
contrary is true: the use of computers has led to far more paper being
used, as people who once would not have written anything themselves
began to do so and documents can be revised much more easily, leading
to more and more printouts as changes are made in order to achieve
greater accuracy or to bring documents up-to-date. As in the example of
teachers preparing their own materials through word processing, it
became possible to produce much more specific, tailor-made material for
individual classes and so more and more materials have been produced.
Equally well, the mixture of computer operating systems has led people
to be much more careful of how they store their data. Paper is much more
accessible than a floppy disk which is the wrong size for the current
computer system or which cannot be read by the latest machine. Contrary
to popular belief some years ago (Jones and Fortescue 1987:129),
computer hardware has shown that it is prone to all kinds of mechanical
breakdowns and floppy disks often become corrupt at the most
inconvenient moments.
2 In Portugal the MINERVA Project to introduce Information Technology across the curriculum in state
schools finished its pilot phase in 1989 (José Moura Carvalho, 1991).
However, computers were also being used to investigate language
itself and huge projects were set up in universities, some on artificial
intelligence (particularly in America with the ELIZA project) and others on
lexicography and dictionary writing (such as the COBUILD Project in
Birmingham). These projects have given way to Natural Language
Processing, which is basically a sub-field of computer science directly
related to artificial intelligence, human-computer interaction, machine
translation and multimedia, and to the Bank of English, which is now used
for linguistic analysis of general language. They have been joined by such
projects as the European-funded ELRA (European Language Resources
Association), which aims to include such things as recorded speech
databases, lexicons, grammars, text corpora and terminological data in
collaboration with European countries. There has
also been a burgeoning of corpus work in many European countries
themselves. The Universities of Oporto and Lisbon in Portugal are cases
in point, carrying out translation studies and linguistic analyses of
corpora under the supervision of Profs. Belinda Maia and João Malaca
Casteleiro, respectively. It has now been seen that, with appropriate
software (Tribble and Jones 1990), teachers could also carry out
linguistic research themselves; because of this, teachers carried out
work on the analysis of students’ errors (for example, in Portugal,
Fordham 1997) and the use of concordancing for both teaching and
research developed.
Coupled with this interest in computers and computer-assisted
language learning and research was an interest in special or specific
language teaching. In 1986, I was required to design and teach a special
course for people working for the post office (then the CTT - Correios,
Telégrafos e Telefones, see Howcroft 1986). This particular course was for
personnel working with computers in the head office of the post office in
Coimbra and required reflection and research into the language of
computers together with consideration of an appropriate methodology for
a mixed ability group of foreign language learners dealing with
computers. This was a special situation in which the English language
was to be learnt through specific activities using a computer as part of
the process of learning. The use of computers was also the product or
goal of that language learning situation which made it particularly
stimulating. The fact that English is the main language of the computer
is important in any consideration of teaching students with or through
this technology. Stubbs (1992:203) says that the Cox Report (Department
of Education and Science 1989), on which he worked, also argued that
most interactions with computers were language experiences. (This is
taken up again later in 1.4 The Dominance of English in Science and
Technology.) However, those that design the programs and operating
systems of IBM computers are not linguists and they, therefore, cannot
be expected to have taken into consideration the fact that their audience,
the user, will often be a foreign language learner in a country far away.
The language of the computer is often idiosyncratic from a linguistic
point of view, but it has had and is increasingly having an effect on
English language usage in the modern world. The number of new terms
and concepts that are now employed because of the common use of
computers and electronic communications is legion. The latest
communications through the Internet and electronic mail have only
served to emphasise this state of affairs, as has the huge increase in the
amount of information available to computer users in the 1990’s through
the Internet and CD-ROMs containing the equivalent of whole
bookshelves of knowledge. Crystal (1998) describes the Internet as a
“semiotically sanitized medium” because of restricted turn taking and the
fact that messages are received in order, one by one, but he reminds us
that we are not at the end of the technological road and that other
technologies are still to come which may change all this. The language
student of today faces a much greater barrage of applications and specific
language, which educational policy makers believe the working population
should be capable of handling efficiently and rationally to maintain
economic advantage in the world. This language will almost certainly
be constantly changing as the technology itself changes. The types of
education policy that are relevant to university undergraduates will be
discussed in 1.1 Science and Technology Education and 1.2 Lifelong
Education.
In order to design a syllabus for a modern day undergraduate
student, whose first language is not English, all of these technological
innovations and changes in language usage will have to be taken into
account. Moreover, the sheer size and detail of the available knowledge
on any subject which the student has to cope with also requires new
learning strategies to be found. Education itself has had to change and
will have to continue changing under the weight of what is now known
and needs to be learnt by students today so that they can keep abreast of
their subject. Political changes that have taken place such as joining the
European Union have also had and are continuing to have an effect on
educational policies within the member states. Harmonisation of policies,
specific language training and cultural studies are deemed to be
important for the European citizen (van Ek 1990, Kubanek 1998, Byram
1999). The Committee of Ministers stressed the political importance of
intensifying and diversifying language learning as recently as 1998 when
they reviewed the Council’s earlier initiatives (Byram and Riagain 1999).
The attempt to standardise courses and the subsequent qualifications
obtained from them to allow recognition of academic qualifications
throughout the member states is also seen as an important aspect of
educational harmonisation. The whole issue of what skills and what
knowledge are needed by the people who will make up the workforce in
the future has led to changes in the perceptions of learning, as will be
discussed later in 1.2 Lifelong Education.
Although much research has been carried out in the past, the
changes that have taken place over the past two decades mean that the
present situation is often quite different from the ones those studies refer
to. The English of science and technology and the teaching/learning of
English as a foreign language by undergraduate students studying
languages as part of their courses have changed because of all the issues
discussed above. Whilst some of the findings from previous research will
remain valid, other findings will be called into question. This is especially
so because of the possibility of conducting research on specific situations
using up-to-date computers with their much greater memories, speed
and capabilities. In some cases, as will be argued later in 1.8
Appropriate Text Types, there has been little scientific rigour or little
information that would show the relevance of the work carried out for the
undergraduates on science and technology courses in Portugal. Swales
(1985:188) argues against overgeneralising and applying solutions found
to work in situation x to situations y and z, suggesting that “there are
rarely global solutions to local problems”. In other cases, there has been
no published research at all on the specific kind of English that the
undergraduates will come into contact with (see later 2.5 The Corpus
Analysis Approach).
The undergraduate of today is also in many ways a different entity
from undergraduates in the past. Many more students will arrive at the
university with expectations about and knowledge of computers and their
applications. This may be because of stimulating projects that were
carried out in their schools in different subject areas and through
computer clubs or because of the availability of computers in their homes
and visits to ‘cyber’ cafés. Not only has this aspect of education changed
the profile of the average undergraduate but also changes in language
studies and other subjects in schools will affect the undergraduates’
profiles. Moreover, the universities themselves are just as vulnerable as
the schools and the students to change from the impact of computers
and their applications. In fact, the universities lead in this change by
influencing the kind of training teachers are given (see 1.5 The Situation
in Portugal). New possibilities for different learning systems have been
opened up and explored by universities using modern communication
systems. What and how language should be learnt in universities by
undergraduates studying science and technology must therefore be
considered in the light of the above changes. Some form of quantification
of the changes that have taken place in both the students and their
subjects must be carried out to take principled decisions about the
syllabus that would be suitable for these particular undergraduates.
The need for this particular piece of research therefore arose with
the advent in the University of Aveiro of the combined first year for
students of science and technology. These students had English as one of
their core subjects in this first year and so it was felt that a syllabus had
to be designed to meet these students’ needs. As Swales (1985:188) says,
If those of us in ESP have thought long and hard about how best to serve
our students’ interests, it is simply because circumstances have tended to
make us do so. In circumstances of restricted educational opportunity we
have been forced to search out ways of providing maximum educational
value.
• What was the level of English of the students taking the course? In
other words, what did the students already know or what had they
already learnt prior to starting their undergraduate studies?
• What English did these students need to know? In other words, what
kind of interaction with the English language could these students be
expected to have and what was the nature of that English?
Considering the first of these aspects, the more general situation,
we can see that the world is changing ever more rapidly and general
educational priorities are also changing to prepare people for the modern
world. When, where and how people study are undergoing changes
brought about by changes in the technology of communication. These
changes will have a number of effects on undergraduate students who
are being prepared to take their places in society. The two strands of
education policy and technological change will have an effect on
university courses themselves and new curricula and courses are being
and will be started. Course content and even the structure of courses will
change. The level or length of the courses will change and different
systems, for example a modular system, and different academic
timetables are being and will be experimented with. Schemes to
understand and standardise different countries’ credit systems for
studies in order to allow movement across borders between different
countries’ education systems and recognition of educational
qualifications or part qualifications have started and will become more
usual and widespread. The materials published for use on university
courses will change and modern technology will have a profound effect on
those materials and even on the means by which they are delivered.
Contact with professors can now be by means of e-mail, and lectures and
notes to accompany coursework can be accessed through university
computer networks. Assignments carried out by students can be
produced and printed in sophisticated styles, transmitted electronically
to the professor and comments received through the same channel. The
actual information content of the work done by undergraduates can be
affected by the information that they can obtain ‘on-line’. The skills
undergraduates are expected to have and to develop through their
courses will therefore also undergo profound changes. Stubbs (1992:220)
says that
The evaluation of educational change due to the new technologies involves
the analysis of changed cognitive and social relations in the classroom. We
therefore need simple but powerful concepts to study the pedagogic and
cognitive logic of such situations.
risk of an increasing mismatch between the requirements of our new
environment and the capabilities of our people.” This report (ibid:5)
identifies particularly noticeable mismatches between intellectual
aptitudes in the areas of maths/sciences/technology and behavioural
aptitudes which lead to “professionalism, excellence, (and) distinctive
competitive edge.” Furthermore, this emphasis on learning and using
science and technology is seen as a problem. Allan Luke (1992), the
series editor of Critical Perspectives on Literacy and Education, writing in
the Introduction to Halliday and Martin’s Writing Science (1993), says that
the “very dependency on corporate science and technology expansion as a
means for the expansion of state power and legitimacy have translated
the crises of economies and cultures into the crises of sciences.”
Matthews (1994) regards this as the narrow ‘economistic’ view of
education which is designed merely to develop ‘human resources’ so that
countries can overcome their balance of payments deficit, or stay
competitive with other economies. He believes that there is a need for a
much more liberal science education which endeavours to develop
scientific literacy in students which includes the understanding of
concepts and learning about the nature of science both through its
historical and social dimensions.
Nevertheless, because science education is seen by politicians
particularly in this narrower economistic way, different countries are
dedicating time and money to research and development in science and
technology teaching. Mike Robinson (1994) from the University of
Nevada describes why American government funding is being directed
towards science and technology education and the training of teachers.
He says this is because “science, technology, engineering and
mathematics (STEM) are usually singled out as the most pressing
educational areas” as shown by the Carnegie Commission on Science,
Technology and Government Report (1991): In the National Interest: the
federal government in the reform of K-12 math and science education. New
York: Carnegie Corporation. This report (1991:7) also mentions the
problem of proficiency in the English language saying that in the year
2000 ‘one child in twelve will lack the English language proficiency
required for learning’. Robinson suggests that the national interest is
best protected by training “hi-tech” workers in order to “maintain the
diminishing US technological advantage in the world economy (Office of
Science and Technology Policy, 1992 Science and Technology.
Washington: Executive Office of the President.)”. The fact that the
economic situation of the USA has changed for the better since 1992
might make the authors of the report change their minds about whether
the US is in fact losing its technological advantage in the world, but the
concern is still to produce the right profile of a technologically competent
workforce. Amongst other things, in order to achieve this goal Robinson
places particular emphasis on the use of E-mail and the Internet in
science teaching. The combination of science and technology education
and the use of modern technology is a common theme amongst
educators (Laurillard 1993).
endeavor as a social enterprise that strongly influences – and is influenced
by – human thought and action; and to foster scientific ways of thinking.
which education must start at birth and continue throughout a lifetime.”
However, the means to achieve this objective often appear elusive. Luke
(1992) suggests that there is a serious ‘time-lag’ between the debate about
educational change and the actual ‘remaking of science education’. He
discusses the work done by Lingard, Porter and Knight (1992), who say
that the post-war human capital model of education³ in the USA and
Canada, the UK and Australia, has proved resilient and recyclable,
despite there being little evidence that it works. Gerald W. Bracey
(1997:52) suggests that “The biggest threat to the American educational
system may come not from within our schools but from the depth of our
divisions over what exactly they should accomplish and how best to get
them to accomplish it.” He goes on to point out that there has been an
enormous shift in policy which has led to 62 percent of high school
graduates being deemed capable of studying at a higher level and
therefore enrolling in college as opposed to 20 percent after the Second
World War. The massification of education is another factor that will
affect teaching and learning expectations and goals. These changes will
lead to a different, wider range of courses and qualifications becoming the
norm with shorter, less sophisticated and more modular courses being
given as and when required by the individual.
Educational policy emphasising more students enrolling in higher
education is also reflected throughout the European Union and its
member state Portugal. The proliferation of new universities and
polytechnics in Portugal in the last two decades is a clear example of the
increasing need to provide places for the greater number of students
leaving school who could benefit from further years of study and higher
qualifications4. The recent changes in the number of years of obligatory
3
This model of education is one which stresses universal education that is, education for all a nation’s
children without financial or other constraints.
4
The change that took place in Britain in 1995 when all the higher education institutes became known as
universities is also an example of this phenomenon.
17
schooling requiring students to complete the ninth year also reflects these
changes in educational policy. Despite the fact that Portugal is one of the
countries in Europe with the lowest rate of unemployment, further
education is seen as a means of achieving greater prosperity and
increasing the chances of finding a ‘good’ job. Universities themselves are
in competition with each other to provide more up-to-date courses which
give students the preparation necessary for the types of employment in
the Europe of tomorrow. Indeed, much forward planning has recently gone
into attracting new faculties or universities to cities in the interior of the
country, which bears witness to the importance attached to
higher education in developing the poorer, less industrialised ‘interior’
regions. Educational institutes are also seeking to forge links with
industry in order to work towards providing professionally oriented
training for the local workforce.
Countries throughout the world are constantly producing league
tables of the most advanced economies and comparing and examining the
ability of different educational systems to produce the best scores on tests
of mathematics and science subjects and to equate these findings with
spending on education, number of hours devoted to these subjects in
school, amount of homework set, and other parameters to try to discover
the most successful formula to produce the elite workforce deemed
necessary in the future. The world of education is fraught with insecurity
as to what constitutes a technocrat’s training and how to evaluate
‘quality’ in education. ‘Standards’ of education normally translate as the
results of tests. Recently, testing carried out for the Third International
Maths and Science Study (TIMSS, 1997) produced a league table of
nations, and the reactions to it showed the concern that many countries
feel about their results in such comparisons. Both Britain (England 25th in Maths and
10th in Science, Scotland 29th and 26th respectively) and America (28th
in Maths and 17th in Science) feel that they are doing “poorly”, and the
Economist (March 29th 1997) reports that:
In a television interview in December, the French president, Jacques
Chirac, described as “shameful” a decision by his education ministry to
pull out of an international study of adult literacy which was showing that
the French were doing badly. And in Britain last year Michael Heseltine,
the deputy prime minister, brushed aside objections from officials in the
Department for Education and Employment, and published the unflattering
results of a study he had commissioned comparing British workers with
those in France, America, Singapore and Germany – chosen as key
economic competitors.
The Germans, in turn, were shocked by their pupils’ mediocre
performance in the TIMSS tests. Their pupils did only slightly better than
the English at maths, coming 23rd out of 41 countries. In science, the
English surged ahead (though not the Scots) while the Germans were
beaten by, among others, the Dutch, the Russians – and even the
Americans. A television network ran a special report called “Education
Emergency in Germany”; industrialists accused politicians of ignoring
repeated warnings about declining standards in schools.
education as a means of achieving success in world markets. Success in
education and commercial success are seen as going hand-in-hand.
The American Secretary of Education, Richard W. Riley, trying to
justify the American results, (1997:60) says, “students’ proficiency in
science and math is up about one level compared to what it was a
decade ago. One reason we have been behind countries such as Japan is
because that nation’s public schools always have put extremely heavy
emphasis on science and math. We still have a long way to go.”
President Clinton, not surprisingly, in a speech to the National
Association of Black Journalists (July 17, 1997) put a much more
positive interpretation on the International Math and Science Test
results. He claimed that recent results for 4th and 8th graders showed
an improvement which in turn proved that his policies were working and
that therefore America could achieve “international excellence in
education”. Eight Goals have been identified by the National Educational
Goals Panel, which was set up in 1990. These are: Goal 1: Ready to
Learn; Goal 2: School Completion; Goal 3: Student Achievement and
Citizenship; Goal 4: Teacher Education and Professional Development;
Goal 5: Mathematics and Science; Goal 6: Adult Literacy and Lifelong
Learning; Goal 7: Safe and Disciplined, Alcohol and Drug-Free Schools;
Goal 8: Parental Participation. There is also a similar list of priorities
from the U.S. Department of Education (February 1997):
“All students should be able to:
1. Read independently by the end of the third grade.
2. Master challenging mathematics, including the foundations of algebra and
geometry, by the end of the eighth grade.
3. Be prepared for and be able to afford at least two years of college by age
18, and be able to pursue lifelong learning as adults.
4. Have a talented, dedicated, and well-prepared teacher in their classroom.
5. Have their classroom connected to the Internet by the year 2000 and be
technologically literate.
6. Learn in strong, safe, and drug-free schools.
7. Learn according to challenging and clear standards of achievement and
accountability.”
The difference between these two lists is that the Department of
Education specifies in more detail what is necessary in education for the
future and by when, for example the need to be able to afford further
education and the year 2000 as the target date for Internet
connection.
One of the policies introduced into the American education system,
as point 4 on the list of priorities shows, is the testing of both teachers
and pupils. A system similar to that found on the American National
Educational Goals Panel website (http://www.ed.gov/pubs/StratPln/priority.html), which
allows people to find out about different states and their educational
achievements, has been introduced into European educational systems,
where results are tested and graded from school to school in order to
compare those schools that are doing well with those that are
doing badly. The idea is that the better schools can be used to show
what should be done to achieve the desired test results and that
teachers can benefit from visiting those excellent schools to learn about
their methods which they can then apply in their own schools to improve
standards5. A similar analysis of which education systems
teach science and maths best is expected to explain what conditions are
necessary to promote effective learning.
The Organisation for Economic Co-operation and Development
(OECD) has collected data on how governments spend their combined $1
trillion annual education budgets and explains that their new studies
(launched December 1996) will compare how schools, colleges and
universities are run in each country and analyse the implications for
policy makers. The fact that some countries with low education budgets
achieve high scores on the TIMSS tests has caused politicians to seek
alternatives to more spending on education. Similarly class sizes vary
5
This is similar to the European LEONARDO project which encourages the movement of professionals
between countries in order to find and emulate excellence in teaching.
from country to country but results do not seem to support the
contention that only small classes achieve good results in science and
maths. The methodologies used to teach these subjects appear to be as
important as class sizes are.
As there is no consensus on what causes optimal learning,
research for practical application on undergraduate disciplines is, in the
light of the above concerns, an essential prerequisite to aid success in
science and technology education in a foreign language. One of the
implications of Lifelong Learning is that the emphasis in teaching should
be on the learning process itself and not the product of that learning, so
that people learn how to learn rather than learn a particular finite body
of knowledge.
Many other changes have taken place in the last three decades
which will also influence the teaching and learning situations of students
of Science and Technology. One of the principal changes is the advent of
the personal computer with sufficient memory and processing speed to
enable situation-specific research work to take place, but the implications
of the personal computer go much further than this. Students of English
and those involved in science will find their lives are surrounded by the
specific English of the computer, the associated word processor,
multimedia applications and increasingly the world of electronic mail and
the Internet. It is also the case that the software and computer
communication systems that dominate the world, such as the Internet,
originate from America and are therefore usually originally written and
manipulated through English.
Bucy describes the changes that have taken place in the concept
of what a computer is. He (1985:46) explains how the advent of the
silicon chip has made it possible to incorporate the computing power of a
mainframe computer into such products as microwave oven controls and
handheld calculators. He also suggests that even by 1985 computers
were thought of as relatively inexpensive machines that could be used in
“any number of activities, such as education, household data storage,
increased job productivity, or entertainment.”
Consequently this greater capacity for data storage has also opened
up the possibility of conducting individual, empirical research into the
language of science and technology in a manner that was unthinkable
only a few decades ago. New software has also been developed to allow
this kind of research to take place as a result of work on corpora (see 2.5
The Corpus Analysis Approach). The difficulty in obtaining
information through the Internet seems to lie much more in
finding out where data is available in such a vast
resource and in framing the right sort of question to obtain the desired
result. If the question posed is too general, the enquirer will be inundated
with information which will be hard to sift through in order to locate
appropriate data within a reasonable amount of time. If, however, the
questions posed are too precise, little or no information may be obtained.
Often the answers to questions surprise because the area the enquirer is
contemplating does not match the results of the search. For example, a
search for data on “bands” would produce results on both the musical
variety, e.g. brass bands, and the electronic forms, e.g. wave bands, either
of which would be inappropriate if the enquirer wished for information on
the other. The significance of the data obtained also has to be judged by
the enquirer to ascertain if it is of an appropriate level of sophistication
which in turn requires sufficient background knowledge of the subject by
the person requesting information.
The prominence and utility of modern technology therefore call for
the teaching of a foreign language that is somewhat different from that
taught in schools, although many schools are taking part in exciting
projects including distance communication and e-mail. The combination
of modern technology and language use has even created new styles of
language. This difference between the new style of language and other
styles will also have an effect on teaching methods. Teaching techniques
which according to Kelly (1969:120) were “in constant use in the
language classroom right through the history of language teaching” but
which recently have fallen into disuse because they appeared to be
‘contrived’ and inappropriate will have to be reappraised in the light of
the new genre being created. An example of a teaching technique that
has fallen into disuse is written dialogue, which was a common tool for
the presentation and practice of language in the structural model of
language teaching. Modern e-mail seems to be more like this, that is, it is
more like written dialogue than formal letter writing. Written dialogue
was challenged as being inauthentic in the 1970s and 1980s and,
therefore, not suitable as a model of actual usage but, through e-mail, it
now takes on renewed significance (Leech 1997). There are also to be found
on the Internet written lectures which reflect a little of both worlds, being
written text which is meant to be spoken and which therefore contains
comments and asides that have a specific listener in mind (Stubbs 1996,
McCarthy and Carter 1994). The manner in which students interact with
technology is the object of much research and often the results are
disappointing to those who believed that the technological revolution
would revolutionise teaching6. Most CALL specialists reached the
conclusion that computers are an aid to teaching which, rather like other
modern technologies such as the video, depend upon the ingenuity of the
6
Seymour Papert (1980) Mindstorms: Children, Computers and Powerful Ideas, Harvester Press is an
example of this view of the huge changes (and improvements) that the technological revolution would
bring to education. Robin Goodfellow of the Open University reports in “Language Learners’ I.T.
Strategies will they be the Death of CALL?” (1999) that, with university language learners in an open-access IT
environment, CALL is “vulnerable to the growth of IT sophistication in learners”. He therefore recommends
that teachers turn their attention to “the IT choices that learners make when they embark on self-
study”; otherwise carefully prepared CALL designs will be sidelined.
teacher to make them relevant and useful resources for students
(Kenning and Kenning 1990, Higgins 1988, Phillips 1985, Leech and
Candlin 1986). In 1987 Eastment predicted that computers would not be
found in computer rooms except for computer literacy courses but would
be located in normal classrooms as part of everyday teaching. He also
suggested that at the time of writing (1987:10) concordancing was limited
by the rather unreliable software that was available. This problem has
now been largely overcome but Eastment’s prediction that we would have
“pedagogical concordances” of varying levels and language types has still
to be realised. Warschauer (1999) argues that in education there is no
BALL (book-assisted language learning), PALL (pen-assisted language
learning) and no LALL (library-assisted language learning) because these
are such powerful technologies. Therefore, he argues, it is only with the
integration of computer technology into teacher education and language
learning that computers could be seen to have taken their place as a
natural and powerful part of the language learning process. The change
towards data-driven learning (DDL) is one of the more exciting new
trends which will be discussed later in relation to the use of corpora for
teaching purposes (Chapter 7).
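As a minimal sketch of what a concordance offers the teacher and the learner, the following Python fragment prints keyword-in-context (KWIC) lines for a single node word. The function name `kwic`, the sample sentences and the 20-character context width are illustrative assumptions and do not represent the software used in this study.

```python
import re


def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per occurrence of `keyword`."""
    lines = []
    # Match the keyword as a whole word, ignoring case.
    pattern = r"\b" + re.escape(keyword) + r"\b"
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the node words line up in a column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return lines


# Hypothetical sample text standing in for a textbook extract.
sample = (
    "The force acting on the body is constant. "
    "A net force produces an acceleration. "
    "Energy is conserved when no external force does work."
)

concordance = kwic(sample, "force", width=20)
for line in concordance:
    print(line)
```

Aligning the node word in a central column is what allows recurrent patterns to its left and right to be seen at a glance.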
The use of computers has also had an impact on linguistics and
the description of language through corpus studies and these discoveries
must be exploited and integrated into the curriculum (see 2.5 The Corpus
Analysis Approach). Particularly in the area of collocations, new
information is more easily obtained and is available for use by the
teacher and the learner. Research work on language acquisition has also
suggested that ‘chunks’ of language are used in natural language
acquisition (Hakuta 1974; Huang 1971; Brown 1973; Clark 1974;
Cruttenden 1981; Wong-Fillmore 1976; Newmark 1979; Peters 1983) and
so the use of materials derived from concordancing the target student
texts will provide one more tool to be added to the repertoire of teaching.
It is my contention that collocations are appropriate ‘chunks’ of language
that can and should be used to teach language learners (Tribble
and Jones 1990). The collocations can be obtained from the corpora
compiled from the textbooks on the bibliographies which the students are
meant to consult. In this way, the language that is being studied becomes
entirely appropriate for the purposes of the students. The specific lexical
semantics of science and technology is being presented rather than
general English. Furthermore, the language being studied could be
brought under the control of the student, thereby customising the
learner’s materials for study. These aspects will be taken up in more
detail later in 7.4 Data-Driven Learning.
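How collocations might be drawn from such a corpus can be illustrated with a short Python sketch that counts the words occurring within a fixed window around a node word. The function name `collocates`, the window size and the sample sentences are assumptions made for illustration. Raw co-occurrence counts are used here for simplicity, whereas corpus software would normally also apply a statistical association measure.

```python
from collections import Counter
import re


def collocates(text, node, window=2):
    """Count words found within `window` tokens either side of `node`."""
    # Crude tokenisation: lower-case alphabetic strings only, so
    # punctuation is dropped and windows may cross sentence boundaries.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


# Hypothetical sample standing in for a physics textbook corpus.
sample = (
    "Kinetic energy depends on speed. Potential energy depends on position. "
    "The kinetic energy of the body increases."
)

counts = collocates(sample, "energy", window=1)
print(counts.most_common(3))
```

Even in this toy sample, "kinetic" and "depends" emerge as the most frequent neighbours of "energy", which is the kind of ‘chunk’ that could be presented to learners.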
In Portugal, and in the University of Aveiro in particular, there are
now many homepages and interactive websites which the undergraduates
are encouraged to consult and even study from. Most higher education
institutes, like the University of Aveiro, have home pages for each of the
departments. The University has its Informatics Centre through which
students can gain access to the Internet, to say nothing of the facilities
the students have at home or contrive for themselves. There are also
Open University and distance learning courses for undergraduates
making use of the Internet. The first-year students therefore soon
become, if they are not already, quite sophisticated in their knowledge,
use and expectations of modern technology.
With the economic supremacy of America in the world now and for
most of the 20th century, the English language has also come to
dominate the world of science and technology. Most research work is now
published in English no matter where the research was carried out.
Kaplan (1993:156) claims that “something on the order of 85% of all the
scientific and technical information available in the world today is either
originally written in, or abstracted in English.” Furthermore, many of the
books used to teach science and technology are based on American
models.
The significance of this is that in most cases the English that
students encounter and the English that students therefore need will be
predominantly American and it will also be language that is not
specifically prepared for the student of English as a foreign language. It
will, however, be predominantly written language.7 Despite the fact that
all European languages are supposed to be equally important in the
European Union, some are seen to be more prevalent than others. Sheer
numbers of speakers have an obvious impact upon this so that the
Portuguese language is not one of the languages that the scientific
community sees as essential for the people who will run the businesses of
tomorrow in Europe. The European Commission 1997 Eurobarometer
reported the results of a survey conducted in 34 countries in Western,
Central and Eastern Europe in which Russian was the principal language
of 35% of the 555M people in these countries, English 28%, German
20%, French 17% and Italian 10% and suggests that the languages at the
upper end of this spectrum, that is Russian and English, appear to be
spreading whilst those at the lower end are declining. Crystal (1997:10)
puts forward the financial argument for using a lingua franca in
international bodies which is that the cost of translation can swallow up
to half the budget for such organisations. The European Union has yet to
come to terms with this problem.
7
Research carried out by Prof. Drª Ana Margarida Barros of the Department of Chemistry, University of
Aveiro, published in her (1998) report on the European Chemistry Thematic Network (ECTN) work on
Communication and Management Skills shows that reading and analysing texts in a foreign language
(usually English but possibly in French) is considered indispensable by 100% of those answering her
questionnaire from Universities in Portugal, and that this activity was classified as indispensable by 88%
of the Industries consulted and as very important by the other 12%.
Whilst it can be argued that Brazil, Mozambique and Angola are
very important markets outside Europe, the problems that these
countries are experiencing mean that this potential may not be realised
for some time to come, if at all, and that, therefore, the Portuguese
language will not be seen to be as important at the moment as it might be
in the future. Crystal (1997:7) argues that a language becomes a global
language because it is the language of power, both political and military,
which explains why Portuguese found its way into the Americas, Africa
and the Far East during the period of colonisation. However, with
Mozambique joining the Commonwealth countries there is a suggestion
that it feels drawn more towards countries that had a connection with
Britain and the English language. The proximity of South Africa,
Zimbabwe, Zambia, Malawi and Tanzania where English is either an
official language or retains some influence may also help to explain this.
Crystal (1997:61) suggests that whether English becomes a global
language in the twenty-first century depends upon what happens in
countries with the largest populations, notably China, Japan, Russia,
Indonesia and Brazil. University students who will be the leaders of
tomorrow will need to learn at least one of the dominant languages. From
the numbers of students studying English in Portuguese secondary
schools it can be seen that the language that is often being chosen is
English (Ferreira, Ramos and Braga da Silva 1999).
A significant factor in the dominance of the English language is the
overwhelming expansion in the use of computers in the world. Kubanek
(1998:202) points out that “the lingua franca function of English would
become obvious” to students using the Internet and contacting websites
for information. This technological revolution is having an enormous
effect on education and employment.
1.5 The Situation in Portugal
adequately for their future studies. Indeed it may even lead to confusion
and misunderstanding in their scientific and technological studies.
Whilst it cannot be denied that this preoccupation with
“multiculturalism” is important in the context of liberal education for
citizenship, it is less useful for the needs of the science and technology
students who require both this ‘liberal’ education and more or further
language support with their specific English needs.
Furthermore, the teachers, who form the bulk of EFL teachers in
Portuguese secondary education and are required to teach language to
students in schools, have followed typical humanities education courses
themselves. Several ESP theorists (for example, Widdowson 1979, Ewer
1975, Strevens 1978, Hutchinson and Waters 1987 and Kennedy 1983)
have pointed out the fact that those who are required to teach the
language of science and technology feel they are themselves ill-prepared,
and are therefore often reluctant to do so. Even in terms of technology,
teachers with a humanities background were seen to have an antipathy
to “machines”. However, most undergraduates these days in this and
other modern universities are positively encouraged to confront the latter
problem through educational technology disciplines on their courses and
by being expected to submit word processed assignments for other
disciplines. The use of computers in schools, however, for the most part
continues to be considered the province of the maths department (Moura
Carvalho 1991, Stubbs 1992). There is awareness of a need to change
this state of affairs but, as White (1988) highlights, it can be difficult to
achieve innovation in schools and the process takes a long time.
adopters, who have noted that the innovation produces no harmful effects,
take on the innovation. During the middle stage, the majority adopt
quickly, influenced mainly by the innovators. At a late stage, the laggards
or late adopters finally give in. A minority who never adopt lie outside the
curve.”
8
ESP also exists of course in institutes such as ISCAA here in Aveiro where there is English for
Accountancy.
use of staff, providing staff to student ratios which can cope with the
huge entry to university which is taking place in most developed
countries as mentioned earlier. Another advantage of this system lies in
what Laurillard (1993) describes as the need for undergraduates to
develop concepts rather than merely gather facts. The undergraduates
need to learn how to learn autonomously and need to be guided to that
end. The individual scientific concepts that can be found in any specific
science subject nevertheless present some differences from what students
have been taught before. The students are often unaware that this is the
case in university, and this in itself can lead to a lack of success and
indeed to considerable frustration if students who regarded themselves as
good students suddenly begin to get
unexpectedly low marks at university. In English language studies the students will
also have to undergo this transformation and recognise that what they
have learnt before is only part of the story and what may appear on the
surface to be the same may in fact be quite different in this new context.
The fact that the discipline has to be directed more generally to science
and technology can therefore be an advantage because the students can
become aware that the skills they need to acquire will stand them in good
stead no matter what their subject speciality is. Content knowledge will
also be acquired by the undergraduates along the course so that all of the
students have a similar lack of specific subject knowledge on entering the
university and can benefit from adopting certain strategies when faced
with new material, especially if this new material is in English.
The study which is presented here is focused on University of
Aveiro students and courses as a sample of the language needs and
teaching requirements of undergraduate university students matriculated
in a number of different courses preparing them for the future.
1.6 Science and Technology Undergraduates and English
with needs to be defined in order to be able to produce a syllabus which
makes the optimum use of the limited time9 and resources available to
this annual first year course.
The immediate short-term needs for these students can be
identified through the bibliographies they are asked to consult in their
first year science and technology courses. These include a number of
books in English on the core science subjects taught in the first year of
the University. As scientific literature is seen as becoming more and more
incomprehensible in the latter half of this century for all but a few
specialists (Hayes 1992:739-740), undergraduates will need help in
reading and understanding scientific texts.
The kind of language the students require to be able to read these
books successfully can be identified by detailed study of the textbooks.
The physics and chemistry textbooks have been studied in order to
identify the students’ needs in respect of the syllabus, and the findings will be presented
here. However, as the students come from over 25 different courses, there
is a need to provide a baseline corpus for comparison for the
comprehensive syllabus to be drawn up. In order to recognise what is
normal use in a particular genre a very large corpus has to be consulted,
the baseline corpus, in order to avoid generalising from what may be an
abnormal or exceptional example of language use found in one or a small
number of texts. As the study of these textbooks is based on variation
studies in order to see how far they differ from other genres or text-types,
some form of comparison needs to be made to highlight the differences
and to add scope to the syllabus. As was mentioned above, the science
and technology courses cover a much wider field than the books on the
bibliography alone can represent. What would be most appropriate, given
9
The discipline has one two-hour class per week across the two terms of the first year for most of the
students. Exceptions to this are the Licenciatura em Novas Tecnologias da Comunicação (NTC) which
has a term of four contact hours of English per week in the second year and the Licenciatura em Gestão e
Planeamento em Turismo (GPT) which has 5 hours of English per week for the first term in the second
year.
that the students’ physics and chemistry textbooks on their
bibliographies are overwhelmingly American publications, would be an
American general science textbook aimed at undergraduates. One
textbook would nevertheless not fulfil the criteria of a baseline corpus as
it would itself be liable to offer an exceptional or aberrant style for the
genre so a number of American general science textbooks would be
needed to analyse the genre. As such a number of suitable textbooks
could not be identified for this type of tertiary level student, a multimedia
encyclopaedia will be used. The advantages of such material are its wide
range and huge size, which meet the demands of generality in order to identify
linguistic trends and tendencies in the genre. The range would be more
than adequate to cover the basic science of all of the courses included in
the first year foundation course and the size runs to hundreds of millions
of words rather than the tens of thousands of words to be found in one
general science textbook. This will be taken up in more detail later in
1.10 CD-ROM Material.
Implicit in deciding what to include in the syllabus is what English
the students have already learnt or already know. In order to answer this
question the students were tested and their results analysed in order to
identify areas which need to be addressed by the syllabus. Chapter 4 Test
Results for New Students discusses the test used in each of the academic
years from 1993-1998 and the results found for new undergraduates in
those years.
10
It is admitted that English for Information Science is as relevant as the English contained in textbooks on
the bibliographies for Mathematics. However, the language in Information Science is English even if the
explanations for use (and pronunciation of the terminology) are given in Portuguese. There are also
appropriate glossaries that students can consult for this discipline. The language of Mathematics is
subsumed by the mathematics contained in the physics textbook analysed and has been shown to be a very
restricted genre (Biber 1988).
11
See Arroteia, Jorge Carvalho; Martins, António Maria (1997) Inserção Profissional do Diplomados pela
Universidade de Aveiro: Trajectórias Academicas e Profissionais, Aveiro: Universidade de Aveiro.
be made appropriate to the learning purpose of the students. Wilson
(1997:130) suggests that databases designed for use with language
students should contain texts that relate to students’ tasks and interests
in other disciplines in order to make the “students’ goals in the language
learning programme … coincide as far as possible with the students’
wider goals.” Despite this, she identifies the fact that her computer-based
materials were too general as a “disappointment” as they “had none of
the quality control for style and linguistic coverage that good CALL
demands”. In other words, Wilson reminds us that the materials that are
used with undergraduates need to be carefully selected so that they are
sophisticated enough and they must be tried out and improved upon or
abandoned if necessary should they prove to be unsuitable.
12
text here should be understood to include written and spoken language.
“inadequate sample”. Even Tarone et al (1981:191) indict themselves when they
say
The only study that they report having found that dealt with only one
field was Wingard’s (1981) work on medical texts. There has been an
increase in recent years in the numbers of students going on with post-
graduate studies, and it may well be that at a later stage those advanced
students’ specific language needs must be studied to see if further
language training is necessary.
This pattern of combining text-types, often including journalese or
popular science texts, has continued in many cases, even in some corpora
that are regularly used by researchers. Therefore, there is no suitable
study of undergraduate textbooks for science and technology students
that could usefully be used to identify the target language of these
undergraduates; hence the necessity to start from the beginning and
analyse specific texts appropriate for these students.
conclusions (for example, as mentioned earlier in 1.8 Appropriate Text-
Types, Tarone et al 1981 only used two texts). However, more detailed
studies of a smaller corpus may show features that would be lost in a
very large corpus (Robinson 1991). This will be taken up in more detail
later in 2.2.3 Scientific Specificity. For this reason two sub-corpora are
included in the analysis. Five corpora will be used: a large physics
corpus, a small physics sub-corpus, a large chemistry corpus, a small
chemistry sub-corpus and a corpus from a multimedia encyclopaedia for
strictly comparative purposes. Biber, Conrad and Reppen (1998:136) go
even further and suggest that studies based on very small corpora are
likely to be inaccurate and a ‘baseline’ is needed for comparison to
identify significant variation. Halliday (1993) suggests that the
development of the modern corpus means that “we can now for the first time
undertake serious quantitative work in the field of grammar” but he
points out that in order to be able to do this “Quantitative studies require
very large populations to work with.” The multimedia encyclopaedia will
provide that baseline for comparison as will the large physics and
chemistry corpora when used in comparison with the sub-corpora in
these same subject areas.
Halliday and Martin 1993). These linguists argue that encyclopaedia texts
are intended to be instructional and so are textbooks. Furthermore,
textbooks and encyclopaedias are seen to be of a similar level or standard.
They are also aimed at a similar reader, that is one who has knowledge of
the subject but is not a specialist and is in the process of learning more.
Although encyclopaedias may of course be used for more general
purposes, more in-depth information can be obtained if the user so
desires. The multimedia encyclopaedia chosen provides reading lists for
further study on any topic.
The CD-ROM encyclopaedia also shares a number of features that
are particularly relevant for our tertiary level students. First of all almost
all of the widely published CD-ROM multimedia encyclopaedias are in
American English. This is partly as a result of Microsoft’s dominance in
the computer market as mentioned earlier and their marketing strategy
of linking other products to the sale of their personal computers.
13
This is an example of an Open University for distance learning programmes which makes use of computer
technology only.
widen the gap between real life and education. Youngsters are growing up
in an informatics and media world: education should respond to their
cultural expectation pattern, use their language.” The CD-ROM
encyclopaedia fits this role, but maintains an educational rather than an
entertainment perspective. Young people all too often use games on CD-
ROM as their “language” and although motivation is extremely important
in teaching and learning, this thesis argues that education at tertiary
level should be both stimulating and demanding.
The number of CD-ROM encyclopaedias has increased in recent
years but in the early 1990s there were only a few widely available ones
such as the Grolier, Compton’s and Encarta by Microsoft. Compton’s was
not very user-friendly while Encarta tended towards a more
entertainment-oriented format including quizzes and games. For these
reasons the Grolier was chosen as it combines a suitably academic style
with user-friendliness. The report Educational Technology in Modern
Language Learning in the secondary, tertiary and vocational sectors
(March 1990), written by Jeremy Fox, Anne Matthews, Clive Matthews and
Arthur Rope (University of East Anglia and the Bell Educational Trust)
for the British Government Employment Department Group Training Agency
Learning Technology Unit, describes the GROLIER ELECTRONIC
ENCYCLOPEDIA, which will be used here, as “an excellent example”
(1990:26) of a CD-ROM encyclopaedia which “holds the equivalent of 20
bookshelf volumes plus an index of all the occurrences of every word in
the encyclopaedia.” The report grades this encyclopaedia for
“secondary, tertiary and vocational” levels, with emphasis on the latter
two, which it indicates by means of italics. Furthermore, the
report claims that the encyclopaedia is applicable to the areas of reading,
writing and vocabulary and can be used “in a hypertext-like way down a
track of cross-references”.
The appropriateness of hypertext in teaching Portuguese students
of English has been explored by Prof. Doctor António Moreira in his
doctoral thesis Desenvolvimento da flexibilidade cognitiva dos
alunos-futuros-professores: uma experiência em Didáctica do Inglês (1996,
University of Aveiro). He finds cognitive learning of this type to be
successful with students in Portugal, concluding (1996:x) that “hypertext
systems based on an approach that uses cases which are structured in
such a way that they offer multiple representations of knowledge which
in turn emphasise critical interconnections between different structural
and surface knowledge components can be superior in their effectiveness
for the preparation of students in their use of knowledge in new and in
novel situations”. This form of transfer is extremely important for the
students under study here who are attempting to use English in a ‘new
and novel’ situation - that of tertiary level study in science and
technology.
This study will attempt to examine and define just what the
appropriate specific English for these undergraduate science and
technology students is. This will be found through a linguistic analysis of
computer corpora, from the textbooks for physics and chemistry found
on the students’ bibliographies, contrasted with an analysis of the
students’ language needs as obtained from the results of the tests
described later in Chapter 4. The areas that must be addressed in
undergraduate English language studies will then be identified. The
results of this research are to be applied to the development of a syllabus
and teaching materials for the discipline appropriate for the entry
standard of English of the students taking the discipline and for their
overall course needs in terms of bibliography in English for science and
technology.
Many other considerations will have to be taken into account as
well such as the amount of contact hours available, the size of classes
and the heterogeneity of the students in those classes. All of these
features of the discipline will influence the syllabus that can be used with
these undergraduates. The fact that these are undergraduates just
starting their courses in university will also have to be taken into
consideration, as mentioned earlier, as they are going to have to adapt to
many new aspects of life as well as new aspects of learning in an entirely
different environment from the one they have been used to up to this
point. Simply adapting to the size and complexity of university life is a
major difficulty for many of the new students who may also be coping
with being away from home and family for the first time as well. Students
cannot be seen divorced from these different aspects of their lives which
will colour their learning and attitude to learning and which the teacher
and syllabus designer have to take into consideration in their work. The
syllabus then needs to address the state the learner is in at the beginning
of the course not only from the point of view of their level of knowledge,
which will vary from student to student, but also from their personal
situation with regard to university life. There will be a need to draw
together a number of strands to blend the classes into some form of co-
operative body where the differences of level in background knowledge,
both of their subject specialities and language level, together with other
more mundane problems of their new lifestyle will be addressed. Simply
getting the students into contact with each other and making friends is
important for the well-being of undergraduates and their success on their
courses (Tavares, Santiago, Lencestre, Soares 1996).
1.12 The Research
through testing, in order to identify mismatches with the English needed
by those students coming into the University of Aveiro to take up places
on Science and Technology courses.
1.13 Methodology
of such students. The role of modern technology in education is also
addressed in the learning strategies proposed for these students.
Chapter 2 Historical and Theoretical
Background to ESP
gone before can be of enormous help in defining what should be included
in a syllabus for university students of science and technology.
A methodology for the teaching of language can be traced right
back to Quintilian (Marcus Fabius Quintilianus, 35 - 95 A.D.) with his
Institutio Oratoria. He outlines the teaching of rhetoric or bene dicendi
scientia as being made up of the study of grammar which is sub-divided
into correct expression or recte loquendi scientia and interpretation of the
poets or poetarum enarratio which, in turn, requires the study of writing
and reading or scribendi legendique facultas. Quintilian was aiming to
produce the perfect orator through his system of linguistic studies and
states that the first requirement of an orator is that “he should be a good
man”. Quintilian’s methodology was that a second (or foreign) language
should be taught to children through total immersion in the target
language, although he also advocated adapting materials to suit different
types of learners and of motivating students to learn. This idea of
different types of learner requiring different types of materials is
fundamental to the modern study of languages for special purposes.
Similar to Quintilian’s ideas on motivation is the modern idea that
motivation is a necessary prerequisite to facilitate learning which is
advocated by those involved in special language training today (cf.
Hutchinson and Waters 1987).
From the seventeenth to the eighteenth and nineteenth centuries,
from Locke to Horne Tooke and Humboldt, theories about language led to
etymological studies and then to descriptions of languages being made. The
emergence of a method of ‘scientific’ study of language based on empirical
research continued into the twentieth century with work such as that of
Bloomfield on indigenous American Indian languages. All of these strands
of theoretical linguistics have had and still are having effects on syllabus
design, materials and the teaching and learning of special languages like
that of science and technology under study here.
2.1 English for Special Purposes
1
The use of the term English for Science and Technology (EST) is usually attributed to Ewer (1971),
although Trimble (1985) attributes it to Selinker.
general language. This was particularly the case when Latin stopped
being the lingua franca of scientific thought. The idea of language as fixed
is contrary to fact whether for scientific purposes or any others (White
1998 and see 5.1.5 Plurals from Latin and Greek) which suggests that
there will always be a need to analyse its use both diachronically and
synchronically.
Theories about language have not stopped being put forward either
and language studies for application in teaching have added more to the
understanding of specific varieties of English. For example Swales
(1985:x) sees EST as underpinning the development of ESP. He says that
“With one or two exceptions …English for Science and Technology has
always set and continues to set the trend in theoretical discussion, in
ways of analysing language, and in the variety of actual teaching
materials.”
2.1.1 Phrasebooks
language use or ‘specific purpose’ as opposed to ‘special language’
(Turner 1981) and is based upon the idea that there is an equivalent in
one language for an item found in another language. This approach takes
the view that one language corresponds directly to another language
although it is in code and ignores the idea that there are cultural
differences between languages which need to be coped with.
The implications for language teaching for foreign travel and
phrasebook language have been significant in that different emphasis was
placed in teaching on speaking and listening, although traditional
teaching methods in the past would have favoured pronunciation and
reading aloud of phrases. Theories about the reasons for
teaching English have changed the methodologies and materials used for
that teaching. The final example given above for mariners would be
entitled EOP, English for Occupational Purposes, today and might well
restrict itself to very elementary goals. Similarly, just as new dictionaries
continue to be produced, the study of English for Special Purposes
continues to this day and there have been four major, often overlapping
and interconnecting, schools of thought for the teaching of ESP and
science and technology this century. These can be described as: the
register analysis approach, the discourse analysis and variation studies
approach, the needs analysis approach and, most recently, the corpus
analysis approach2 which I shall go on to describe in order to show how
they influence the study of science and technology for university students
today.
Some of these approaches derived from the need to respond to a
practical crisis like the need during the Second World War for a means of
teaching/learning foreign languages quickly. Others have had the benefit
of taking up theoretical work done by linguists which has then been
applied to teaching/learning situations. The teaching of foreign languages
has benefited from studies in special language teaching/learning and
to a lesser extent special language teaching/learning has benefited from
general language teaching/learning (Robinson 1991, Swales 1985).
2
The latter also often being developed and used for the gathering and analysis of data for writing new
dictionaries.
Hutchinson and Waters (1987, Ch. 2) identify four stages of
development of special language analysis, with a fifth emerging. To a
greater or lesser extent, all are germane to my project. They are:
1. The concept of special language: register analysis,
2. Beyond the sentence: rhetorical or discourse analysis,
3. Target situation analysis,
4. Skills and strategies,
5. A learning-centred approach.
In this analysis the last two of these stages are considered to be
methodological approaches and not language research as such. Indeed,
Hutchinson and Waters themselves identify this division as one of “new
ideas about language and new ideas about learning” (ibid. 1987:14). The
methodological implications of learning English for science and
technology will be addressed later (Chapter 7 The Syllabus) after
presentation and discussion of the results of the research undertaken.
I will briefly discuss each approach before showing how they feed
in to my thesis.
The very first research material published for teaching was not
meant for special or specific purposes but rather for the general learner.
The idea that students could be helped to learn languages more easily
and quickly by having a select list of words came into vogue between the
wars. In terms of word lists or frequency counts, the earliest which was
used to provide a scientific foundation for teaching was developed in
America by Thorndike. Thorndike produced a list of 5,000 words for
teachers, culled from a corpus of four and a half million words, which he
published in 1921 as the Teacher’s Word Book. Subsequently, Horn
published 10,000 words taken from business and personal letters in
1926. The number of words published then multiplied to 20,000 in 1931
with Thorndike’s The Teacher’s Word Book of 20,000 words and in 1944
to 30,000 words with Thorndike and Lorge’s The Teacher’s Word Book of
30,000 words. These counts were used to decide on appropriate reading
materials for school children and are still a popular means of deciding on
appropriate materials for different school levels. In Britain, Michael West
published his A General Service List of English Words in 1953 which
contained 2,000 words and a supplementary list of scientific and
technical vocabulary. Similarly, the graded readers in English such as
those published by Penguin are meant to correspond to levels of reading
competence in EFL learners learning British English. West’s work, like
that of Palmer, was specifically addressed to EFL learners whereas the
American work by Thorndike and Lorge was for general reading in
mainstream education systems. All of these word lists were
based on written material; in France, however, Gougenheim was
working on a spoken corpus which was to provide the basis of Français
Fondamental, first published in 1954. The first revision of this was
published in 1959 and was composed of 1475 entries, 1222 of which
were lexical items and 253 grammatical words. As will be shown later
(2.5 The Corpus Analysis Approach) the whole concept of distinctions
between grammatical words and lexical items is called into question
through corpus research. Nevertheless lists can still be used to help
define a specific type of language or special language and to underpin
syllabi.
2.2.1 European Languages for Special Purposes
2.2.2 Methodologies
Threshold Level reflects the functional/notional approach popular in the
1970s because of the work of Wilkins, among others (Wilkins 1976, Notional
Syllabuses, Oxford: Oxford University Press). The lists produced based on
the functional/notional approach were not produced empirically and
many other possibilities of what to include at what stage in learning
exist. The variety of coursebooks produced to teach functions/notions in
the late 1970s and early 1980s reflects this, and authors such as Abbs at
the TESOL Conference in Lisbon in 1979 admitted that things were
beginning to get out of hand when students were being taught how to
react angrily to situations! So as a basis for syllabus design this
approach left something to be desired.
If an empirical approach to register analysis is to be adopted for
syllabus design, then there is more to the production of frequency counts
that needs to be taken into account. For example there is the question of
the definition of the words listed, reflected in the description of the
Français Fondamental list mentioned above. What is lexical and what
grammatical? The whole question of what a word is must also be
addressed. Moreover, the corpus from which the word lists were
produced must be clearly categorised so that the results can be seen to
be pertinent to the learners’ needs. These aspects must be made very
clear if any comparison between studies is to be made and
scientific results verified. Therefore, further consideration will be given to
these aspects later in examining the results obtained from the corpora.
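As a minimal sketch of why these definitions matter, the following Python fragment builds a frequency list under one explicit, and deliberately simple, definition of a word (a run of letters, optionally joined by internal apostrophes or hyphens, lower-cased). The tokenising rule, the tiny GRAMMATICAL_WORDS set and the sample sentences are illustrative assumptions, not the procedure applied to the corpora in this study; a different definition of the word would yield different counts.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of letters, optionally joined by internal
# apostrophes or hyphens. Other definitions would give other counts.
WORD_PATTERN = re.compile(r"[a-z]+(?:['-][a-z]+)*")

# Assumption: a tiny illustrative set of grammatical (function) words.
GRAMMATICAL_WORDS = {"the", "of", "a", "an", "in", "is", "and", "to", "by"}


def tokenize(text):
    """Return the lower-cased word tokens of text under WORD_PATTERN."""
    return WORD_PATTERN.findall(text.lower())


def frequency_list(text):
    """Return (word, count, category) tuples, most frequent first."""
    counts = Counter(tokenize(text))
    # Sort by descending count, then alphabetically to break ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (word, count, "grammatical" if word in GRAMMATICAL_WORDS else "lexical")
        for word, count in ranked
    ]


# Invented sample text, for illustration only.
sample = "The mass of the electron is small. The charge of an electron is negative."
for word, count, category in frequency_list(sample):
    print(f"{word:10} {count:2}  {category}")
```

Changing WORD_PATTERN (for instance, to split hyphenated forms) or the membership of GRAMMATICAL_WORDS changes the resulting list, which is why both choices need to be reported alongside any published count.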
argues that any published analysis should provide enough information
for another researcher to reproduce the results, as with any other piece of
scientific research. Halliday (1993:103-4) wants the research to make use
of the same theories and methods of analysis (understandably advocating
his own systemic functional grammar analysis here, above all others), so
that comparison with other studies is possible. Robinson (1991:31) wants
specific information on the materials used for the research to be
provided. She (ibid.) says:
“First, the fact that research exists on the same topic or subject matter that
the students are interested in is not sufficient to make that research useful.
We need to know the source of the material that has been researched: its
date and geographical origin. In addition, we need to know the level of the
material: does it represent specialist to specialist communication, or
specialist to non-specialist? What was the mode of the material? Was it
originally spoken or written, prepared or unprepared? All these alternatives
will have an effect on the language forms selected.
Second, we need to know the size of the corpus that has been researched
(...) Larger-scale studies may be able to arrive at reliable generalisations.
Smaller-scale studies, however, may be able to go into more explanatory
detail.”
3
See McCarthy and Carter 1994 pp. 4-16 for a discussion of modes and their features.
Stern (1983:131-2) argues that the study of lexis or vocabulary
has received little attention from English-speaking linguists4 because it
does not lend itself easily to structural and systematic treatment in the
way that syntax and phonology have done but that this is an area of
research which is very important for language teaching. More recent
studies (Sinclair 1991) would suggest that the learning of lexical items in
isolation does not reflect actual English usage where words and their
meanings are associated with particular structures and contexts.
Sinclair therefore suggests that words must be studied in context in
order to show their specific meanings and associated structural
restrictions.
Halliday put forward the idea that language shows variety in terms
of its use and not its user. For example, there is language specific to food
or cooking (tomato, apple, bread, butter) or to sport (referee, goalkeeper),
whereas the user can only show different dialects, which may be regional,
social and so on. This sets the course for discourse analysis which will be
described later but which Halliday (1993) still refers to as Register
Analysis. Investigation of varieties of English can then show what the
learner needs to cope with in a specific area.
Register analysis is based on the idea that nouns make distinctions
because they are used for concepts or principles (which is somewhat
similar to the ideas of Condillac in the eighteenth century, Philosophical
writings of Etienne Bonnot, Abbé de Condillac, Hillsdale, N.J.:Lawrence
Erlbaum, 1982). Consequently, if the register of, say, biology, as distinct
from other registers, can be identified, then a specific syllabus can be
drawn up which would be more limited in range and therefore would not
diffuse students’ learning energies. West describes this as the ‘surrender
value’ of the course of study. A high ‘surrender value’ would mean greater
efficiency in terms of meeting the student’s language requirements or, as
Swales (1985) puts it, getting ‘maximum educational value’ out of the
course.
4
Stern points out that research has been carried out by French and German linguists and reference is made
later (2.2.7 The Impact of Modern Technology on Register Analysis) to Hoffman who describes the
research done (on English) by German linguists in the GDR for teaching purposes.
White (1975) reached the conclusion that it was a “unique
constellation of features rather than any single characteristic” that made
one register distinctive from another. These features however have to be
identified from the kinds of materials that are likely to be used by the
students so that a pedagogical selection can be made for course design.
Mindt (1997:42) describes this process of designing a grammar for foreign
language learners as:
Sinclair and Renouf (1992) suggest that in general language courses “the
main focus of study should be on:
2.2.5 Publications and Coursebooks based on Register Analysis
Scientific English and Herbert’s (1965) The Structure of Technical English
demonstrate this approach. Swales (1985:18) comments that Herbert’s
book was still in print and still being used when he published his book
twenty years later. He (ibid.) attributes this to the fact that it “shows a
highly professional concern with the language of EST”. However, the
methodology used was rather dull and the connection
between the diagrams used and the accompanying text was often
obscure.
characterization have been “surprisingly careless” and that even though
Bloomfield wrote forty pages in 1938 on “Linguistic aspects of science”
only one explicit example is given throughout those forty pages. This
problem continues to the present day: for example, although Stubbs
(1996:152) argues for clarity on texts analysed, he is himself open to
criticism (Hoey 1993) for not giving sufficient information on the school
textbooks in his own analysis.6
The criticism was based on the fact that researchers appeared to be
claiming that features were unique to one type of text or that one feature
uniquely characterises a text. Once it was understood that the distinction
was much more one of degree, the objections were largely overcome. As White
(1975) said:
Firstly, it became clear that ... it is not possible to take the occurrence of
any specific feature as being criterial of one and only one particular
register. Secondly, it was obvious that what made one register distinctive in
comparison with another was a unique constellation of features rather than
any single characteristic.
5
Although Robinson’s comment on small scale specific studies, see above P 55 Scientific Specificity, could
still hold true.
6
First presented with Andrea Gerbig (1993) as “Human and Inhuman Geography: On the Computer-
Assisted Analysis of Long Texts.” In Hoey (ed.1993) Data, Description, Discourse. London: HarperCollins.
that is found in corpora is not distinctive7. This led to the idea that there
is a core of language that is common or sub-technical (cf. Robinson
1991). Trimble (1985:129) equates sub-technical vocabulary with “those
words that have the same meaning in several scientific or technical
disciplines” together with “those “common” words that occur with special
meanings in specific scientific and technical fields”. This approach goes
hand-in-hand with the methodological principle that only students of an
intermediate level of English competence could or should be exposed to
an English for Special Purposes course, as these students will already
have sufficient basic knowledge of the language to be able to appreciate
the difference between these common forms and the language that is
scientifically specific or purely technical. Trimble (1985:7) suggested that
students at the tertiary level are “assumed to be fairly advanced in
English” but, nevertheless, recognised that not all of the students could
be assumed to be equally accomplished in all of the language skills.
In the same line of argument, Hoffman (1981:114) claims that:
7
Recent large corpus studies by Sinclair (1991) demonstrate that there is a body of very frequently
occurring lexis and that if those items that only occur once are removed from the frequency list it shrinks
to half its size.
days strict adherence to this sort of approach would be deemed too
limiting for syllabus design, although it is generally accepted that it still
has a part to play in it.
Swales (1984:1) reported that although frequency analyses had found
little favour in British and American ESP work, a revival of this form of
study was taking place because “frequency analysis is
ideally suited to computerization”. He also predicted (1984:214) that ESP
would only come of age when computers and video recorders were used
and the processes of technical and sub-technical vocabulary acquisition
were properly investigated and not merely imagined. Computers, through
concordances, can already provide learners with much better
investigative tools and give access to real language use instead of
inventions by the course writer. Tribble and Jones (1990:15) find that the
results obtained from a concordance “will only be as interesting as the
raw material on which you put it to work” so that appropriate corpora
must be used to generate instances of language usage which match the
students’ needs.
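The kind of output a concordance gives can be illustrated with a short key-word-in-context (KWIC) routine. This is only a sketch under stated assumptions: the function name concordance, the width parameter and the invented sample text are hypothetical, and it does not reproduce any particular concordancing package.

```python
import re


def concordance(text, keyword, width=30):
    """Return one KWIC line per whole-word occurrence of keyword in text.

    Each line shows up to `width` characters of context on either side
    of the keyword; matching ignores case.
    """
    # Collapse line breaks and repeated spaces so the context reads as one line.
    text = " ".join(text.split())
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines


# Invented sample text, for illustration only.
sample = (
    "Energy is conserved in an isolated system. The kinetic energy of a body "
    "depends on its mass and velocity, and the potential energy on its position."
)
for line in concordance(sample, "energy"):
    print(line)
```

As Tribble and Jones's remark implies, the routine itself is trivial; what the learner sees depends entirely on the text supplied to it.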
Biber, Conrad and Reppen (1998:136) say that teachers must
understand the processes by which register is understood so that they
can facilitate its acquisition. However, they go on to suggest that the
ability to describe and understand the differences between registers has
proved to be very difficult without the use of corpus-based studies.
Furthermore, the features that distinguish one register from another are
rarely features unique to that register. Registers usually share many
linguistic features; it is the relative use of these features that usually
distinguishes one from another. Therefore what is needed is a
comparative quantifying approach in order to know whether one feature
in a register is rare or common.
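A comparative, quantifying approach of this kind can be sketched as follows. Because corpora differ in size, raw counts are converted to a rate per 1,000 word tokens before registers are compared; normalising in this way is one common convention, assumed here for illustration. The function name per_thousand, the two invented mini-texts and the choice of features are hypothetical and are not drawn from the corpora analysed in this study.

```python
import re
from collections import Counter


def per_thousand(text, features):
    """Return the rate per 1,000 word tokens of each feature word in text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    total = len(tokens)
    if total == 0:
        # An empty text has no meaningful rate; report zero for every feature.
        return {feature: 0.0 for feature in features}
    counts = Counter(tokens)
    return {feature: 1000 * counts[feature] / total for feature in features}


# Invented mini-texts standing in for two registers, for illustration only.
registers = {
    "textbook": "The force is measured and the result is recorded by the observer.",
    "conversation": "I think you know what I mean, so I will not say it again.",
}
features = ["the", "is", "i", "you"]

for name, text in registers.items():
    rates = per_thousand(text, features)
    summary = ", ".join(f"{f}: {rates[f]:.0f}" for f in features)
    print(f"{name:13} {summary}")
```

The same features occur in both texts or could do so; what differs is their relative rate, which is the point made above about registers sharing features.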
2.2.8 Variation Studies
and formal writing at the other. Nevertheless, academic prose was found
by Biber to be one of the most widely differing sub-genres in the analyses
he carried out. Biber’s work will be discussed in more detail later in 3.3
Biber’s Methodology of Variation Studies and Corpora Analyses when
making a classification of the textbooks on the undergraduate students’
bibliographies.
Discourse world
↕
Core generic function
↕
Genres
↕
Generic blends
↕
Registers
Where the Discourse world is divided into spoken and written; the Core
Generic Function is Reporting; the Genres are for example, the
Information Report, the Progress Report or the Weather Report; the
Generic Blends are for example, Reporting and Predicting, Reporting and
Recommending or Reporting and Evaluating; and the Registers are for
example, a Weather Forecast which can be even further sub-divided into
TV/Radio and Newspaper. These can then be linked to different
“Prototypical linguistic features”, such as “Past tense, passives, relational
processes”. McCarthy and Carter (1994:33) survey some of the
uncertainties that are found about definitions of genre and suggest that
the results are that “the notion of genre becomes as slippery as the
notion of register.” They ask if this distinction is necessary at all but they
note that it has had important implications for discourse analysis.
Biber (1988) uses genre to refer to “categorizations assigned on the
basis of external criteria” and text-type to refer to “groupings of texts that
are similar with respect to their linguistic form.” The term register is still
used most extensively by Halliday and the Australian school.
2.3 The Discourse Analysis Approach
2.3.1 Definition
8
Once again text here should be understood to mean both written and spoken language.
studies, their key principle is that language is being used poetically or
aesthetically when the expressive aspect is predominant.
To rehabilitate literacy in science teachers and students will have to work
towards a much clearer grasp of the function of language as technology in
building up a scientific picture of the world. Technical language has evolved
in order to classify, decompose and explain. The major scientific genres –
report, explanation and experiment – have evolved to structure texts which
document a scientist’s world view. The functionality of these genres and the
technicality they contain cannot be avoided; it has to be dealt with. To deal
with it teachers need an understanding of the structure of the genres and the
grammar of technicality. With this knowledge they can begin to tackle the
problem of science literacy ... Without it they will continue to focus on
content without taking language into account, probably with an increasing
emphasis on science activities rather than science texts. The linguistic
technology is the key – not just to science literacy but to understanding and
practising science itself. Ways must be devised to provide access to this
technology. And the answer must not involve watering the technology down.
medium (e.g. written, spoken, typed) and immediacy (e.g. face-to-face
or distant).
Halliday further defined these aspects of register (organisation of
content) through their metafunctions (organisation of language). The
choices for meaning are organised into the following metafunctions: Field
is associated with Ideational meaning (resources for building content);
the metafunction associated with Tenor is Interpersonal meaning
(resources for interacting) and the metafunction associated with Mode is
Textual meaning (resources for organizing texts).
Halliday argues that scientific texts are derived historically from
the need to condense information about previous scientific discoveries.
They are therefore characterised by dense nominalization as this is the
best means of conveying dense information. Here, once again, is the
notion that solid background (or underlying) scientific knowledge of the
subject being studied is necessary. Furthermore, this background
knowledge is assumed by authors of textbooks to exist in their readers.
This is in conformity with Labov’s (1972) suggestion that only by using
the concept of ‘shared knowledge’ can discourse be interpreted correctly.
However, Trimble (1985:114) says that his research “showed and
continues to show that the majority of non-native students lack the
cultural background that enables them to bring more than a very limited
amount of the presupposed information to their reading of EST
discourse”. Here he is referring to the information presupposed by the
writers of scientific and technical discourse to be ‘possessed’ by the
reader.
Horne Tooke (1778, 1786) argued back in the eighteenth century
that language contained ‘abbreviations’. Like Halliday’s ‘condensation’ of
knowledge he argued that abbreviations were necessary so that thoughts
could be expressed in real time and that abbreviations had been
developed over time so that an (empirical etymological) analysis of
language would show that abbreviations such as prepositions could be
traced to their historical (nominal) roots. In the Diversions of Purley: 9-15
he says:
Swales argued that these moves were more important than standard
English grammar. However, in his later work (ibid. 1984:213) he says
that genre-analysis has a “price to pay” in that by revealing something of
the “internal logic and external language of a conventionally-constrained
communicative event” it may “have little to say about other, apparently
quite similar, communicative events”. He gives the example of his own
work when he says that there is “no such thing as an Introduction in
academic writing” and explains that introductions “would appear to be
quite differently organized in different genres such as scholarly papers,
theses, projects and essays”. These findings suggest that analyses of text
organisation must be carried out on specific genres which are relevant for
students in a particular setting in order to develop specific teaching
materials that the students could use to develop their understanding and
scientific literacy in the Hallidayan sense given above in 2.3.5. McCarthy
and Carter (1994) give some suggestions about how this kind of text
organisation analysis can be carried out in the classroom using
frameworks.
There is however another practical application of this type of
analysis in specific fields such as business negotiations. Johns (1991)
reports that interest in Ulijn and Gorter’s (1990) work on
discourse/rhetorical moves in business negotiations has increased with
the enlargement of the European Union and the consequent enlargement
of different language contexts/interfaces that must be dealt with.
masked by a study of the whole text. She suggests that her research is,
nevertheless, in line with Selinker and Trimble’s (1976) recommendation
to work at a higher than sentence level.
The ‘whole text’ approach has gained more and more adherents in
recent years. Hoey (1991) argues for it as does Stubbs (1996). Much
published work however still concentrates on detailed analyses of small
fragments of texts. Stubbs (1996) recognises this fact but goes on to
argue for complementing the analysis of text fragments by the analysis of
long texts. McCarthy and Carter (1994:112) claim that “Matters
traditionally thought of as the domain of semantics and syntax can be
placed squarely at the heart of discourse analysis.” They also suggest
that (1994:106) a top-down approach can “assist the job of relating
higher order categories in the syllabus (such as text-type) to the micro-
syllabus elements of grammar and lexis”.
realizing that adjustments are often necessary. As a result they read ‘should’
with the meaning found most commonly in ESL/EFL grammars and so
assume that a choice is possible.”
2.3.9 Coursebooks based on Discourse Analysis
realisations of these are the same in the target language as in the mother
tongue. Langkilde (1981:517) found, for example, that for undergraduate
students of economics in Copenhagen long adverbials (in French)
interfere with the comprehension of sentences and “disturbs the well-
established patterns that the students are used to finding”. This
phenomenon may well operate in the opposite direction between English
and Portuguese academic prose as Portuguese corresponds more closely
to French in its use of long adverbials.⁹ Trimble (1985:131) believes that
one of the features of scientific or technical discourse that is a “special
problem for the majority of non-native students” is the use of noun
compounds or strings, which are Germanic in origin and so not natural in
many languages; this is certainly the case for native speakers of
Portuguese.
It is also a fallacy that there is a universal ‘scientific’ way of looking
at things and that everyone with adequate intellectual gifts thinks
‘scientifically’. One of the characteristics of science is that it needs to be
taught to people; it does not exist naturally. Moreover, many scientific
discoveries have been shown to be rather haphazard (or in a
hypothetico-deductive form rather than an inductive one¹⁰) and order and method
have only been imposed when the scientists concerned have written up
their work as a paper for other scientists to read.
Beaugrande (1997:44) offers an ideological critique of education
and scientific training. He argues that “In theory, all citizens have the
same basic human rights to freedom of speech, public education,
scientific training,” and yet he claims that “in practice, the great majority
are systematically excluded.” This he argues is because they cannot
9
However, Quirk (1995:127) finds a higher proportion of adverbials in speech than expected which he says
“runs counter to the widespread belief that written English is more complex syntactically than impromptu
speech and that the incidence of ‘adverbial clauses’ is a significant marker of relative syntactic
complexity.”
10
An example of this is the book by Watson (1968), The Double Helix, which describes his and Crick’s
discovery. Karl Popper (1972) in The Logic of Scientific Discovery argues that scientific method is
hypothetico-deductive and not, as many believe, inductive.
understand the discourse of science and therefore are not science
‘literate’ as Halliday and Martin (1993) term this phenomenon. When
looking into the future, Beaugrande (1997:59) claims that less attention
has been focused upon the ‘twin knowledge crisis and communication
crisis’. However, he believes that there is ‘an exploding body of knowledge
that is locked up in discourse accessible to only a few people
concentrated in centres of wealth and power’ which, he argues, needs to
be made available to everyone through the results of the analysis of
discourse being applied to teaching. So, for Beaugrande, discourse
analysis is to be seen as the key to unlock the door of scientific language.
This position seems reminiscent of the plain English group, whose aim is
to make bureaucratic jargon much more transparent for the average
person, so that people are not considered “functionally illiterate”, as the
Americans describe the inability to cope with filling in forms and other
such language manipulation activities which the average person can be
expected to meet in daily life. The language of science and
technology as it has developed over the last two centuries would,
however, seem to be a far cry from bureaucratic jargon: understanding it
demands sufficient background knowledge of the concepts concerned,
whereas bureaucratic jargon is marked by a certain legalistic hedging of
the terms used.
One aspect of teaching that Hutchinson and Waters advocate
strongly is motivating ESP/EAP students to learn. In universities in
Portugal, language studies are often seen as a necessary evil by both
students and staff in science and technology departments. They are
therefore often relegated to a minor position on the curriculum, which
cannot fail to reinforce the idea in some students that they are of little
importance to their overall studies, when in fact the ability to function
effectively in several languages will become increasingly important as far
as both their courses and later careers are concerned.
It is not the basic components of his (the author’s) language that differ, it is
the statistical properties of the mixture in which they occur, and the
intention, the purpose, behind their selection and use.
Thus there is no absolute division to be found between register (word and
sentence) and discourse (above the sentence level) analysis, and both
of these continue to be studied by linguists in order to define text types or
genres.
Eggins and Martin (1997:230) describe Register and Genre Theory
as “linguistic approaches to discourse which seek to theorise how
discourses, or texts, are like and unlike each other, and why.” They go on
to define the steps that need to be taken when applying such a theory.
The first step, they maintain, is to describe the linguistic patterns or
“words and structures” in the texts being analysed. The second step is to
try to explain the linguistic differences found between the texts being
studied. In short, Register and Genre Theory is a theory of functional
variation or how texts coincide or differ one from another for a particular
purpose.
Eggins and Martin (1997:251) define the terms register and genre
as ‘context of situation’ and ‘context of culture’, respectively and they say
these “identify the two main dimensions of variation between texts.”
Register is seen as lower level (bottom-up) realisations of variation and is
constituted by lexical, grammatical and semantic choices. This theory
brings together work from both Register and Discourse Analysis as
described above and will be called Variation Studies (after Biber 1988) in
this thesis as it is applied to the textbooks under study.
Eggins and Martin (1997) explain that genre can be seen in many
different ways. There is the conventional literary model of “types of
literary productions” including short stories, poems and novels. Then
there is the linguistic definition Bakhtin (1986) gives which broadens
genre to include everyday speech and writing with the literary genres.
Genre in linguistics is also defined functionally in terms of its social
purpose. Eggins and Martin (1997:236) summarise this saying “Thus,
different genres are different ways of using language to achieve different
culturally established tasks, and texts of different genres are texts which
are achieving different purposes in the cultures”, or what may more
simply be described as text and talk in context.
Similarly, the needs of students and course needs have to be
studied alongside these analyses and continue to be important for the
study of English for Special Purposes and syllabus development. Stubbs
(1996:19) criticises the work carried out on scientific research articles by
Swales (1990) because he failed to relate the linguistic features he found
to a theory of variation in English. Stubbs (ibid.) suggests that any study
of genres “must be located in a description of variation in the language
overall” and that Biber’s work is a good example of how wide a range of
variation there is within academic prose.
2.4.2 The Development of Needs Analysis
Stage 3 in the 1980’s ESP and general language teaching which covered
a range of analyses, target situation analysis, deficiency analysis,
strategy analysis, means analysis and language audits as exemplified by
Tarone and Yule (1989), Allwright (1982), Holliday and Cooke (1982),
Allwright and Allwright (1977), and Pilbeam (1979);
West (1993), naturally enough, sees this latter stage with computer-
based analyses as the future of needs analysis. The use of technology in
both analysing and selecting materials is purported to make the syllabus
more appropriate for learners’ needs.
2.4.3 Needs and Syllabus Design
0.0 Participant
0.1 Identity (Age/Sex/Nationality/Residence)
0.2 Language (L1/L2/Present level of L2/Other L2s known)
1.0 Purposive Domain
1.1 ESP classification (English for Occupational Purposes (EOP) or English for
Academic Purposes (EAP), if EOP, pre- or post-experience, if EAP, discipline
based or school subject)
1.2 Occupational purpose (specific job or post/central duty/other duties)
1.3 Educational purpose (specific discipline/central area of study/academic
design classification)
2.0 Setting
2.1 Physical setting: spatial (location/country/town/place of work/place of study)
2.2 Physical setting: temporal (point of time/duration/frequency)
2.3 Psychosocial setting (noisy, demanding, culturally different, aesthetic -
unfamiliar)
3.0 Interaction (with others)
3.1 Position (role relationships - dependent on purposive domain e.g. student)
3.2 Role-set (other interlocutors etc.)
3.3 Role-set identity (number/age/sex/nationality of interlocutors thus affecting
role relationship)
3.4 Social relationships (or role relationships e.g. superior-subordinate, peer-
peer, official-member of public, doctor-patient, teacher-learner)
4.0 Instrumentality
4.1 Medium (spoken or written)
4.2 Mode (monologue/dialogue)
4.3 Channel (e.g. face-to-face, text for silent reading, phone)
5.0 Dialect
5.1 Regional (and British English/American English, etc.)
5.2 Social class
5.3 Temporal
6.0 Target Level
6.1 Dimensions (size and complexity of utterance/material (text), range and
delicacy of forms and functions, speed and flexibility of communication)
6.2 Conditions (degree of tolerance of: 1. error, 2. repetition, 3. hesitation,
4. stylistic error, 5. reference)
7.0 Communicative Event (i.e. what the learner has to do, either/and productive and
receptive)
7.1 Main (macro activities e.g. waiter serving customer in restaurant, student
in university seminar)
7.2 Other (micro activities e.g. taking down customer’s order or student
introducing a new point)
8.0 Communicative Key (i.e. how the learner does the activities above determined by
1,2,3 - attitude factor).
Table 2.1 Munby’s Communicative Needs Processor
(a) Necessities which are ‘the type of need determined by the demands of
the target situation, that is, what the learner has to know in order to
function effectively in the target situation’ (Hutchinson and Waters,
1987:55). Identifying these necessities is often referred to as target-
situation analysis (see Chambers, 1980).
(b) Lacks. Analysis of what the learner already knows leads to recognition
of the gap which exists between this and the target situation in other
words the ‘learner’s lacks’ (Hutchinson and Waters, 1987:55-56).
(c) Wants. These wants are the learners’ perceived needs or subjective
needs. (Hutchinson and Waters, 1987:57). The learners’ needs
(subjective needs) may be in conflict with the needs analysis that has
been carried out and therefore may be in conflict with the aims of the
course, as determined by those responsible for the course. However, in
such a situation it may be possible to incorporate some of the generally
perceived (subjective) needs of the learners into the course. An example
of this would be to incorporate speaking tasks into courses which are
predominantly designed to aid reading.
Hutchinson and Waters’ criteria are highly relevant for the research
carried out here on undergraduate students, with (a) the necessities, what
the students need to know, being found from the corpora analyses, and
(b) the lacks, the gap between this and what the students already know,
being identified from the results of the language tests carried out on the
undergraduates joining the first-year science and technology courses.
Incorporating students’ wants would be a more difficult task given the
numbers involved and certain constraints such as the requirement to test
all of the students in the same way at the same time.
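The relationship between necessities and lacks described above can be sketched as a simple comparison. The following is only an illustrative sketch: the item names, the scores and the 0.5 threshold are hypothetical assumptions and are not taken from the corpora or the tests used in this study.

```python
# Hypothetical illustration: the items, scores and threshold are invented
# for the example and are not data from this study.

def find_lacks(necessities, test_scores, threshold=0.5):
    """Return the necessities on which learners scored below the threshold.

    An item that was never tested is treated as a lack (score 0.0).
    """
    return sorted(item for item in necessities
                  if test_scores.get(item, 0.0) < threshold)

# (a) Necessities: what the target texts require (assumed items).
necessities = {"passive voice", "nominalization", "noun compounds"}

# Proportion of correct answers per item on an entry test (assumed values).
test_scores = {"passive voice": 0.8, "nominalization": 0.3}

# (b) Lacks: the gap between the necessities and what is already known.
lacks = find_lacks(necessities, test_scores)
print(lacks)
```

In this sketch an untested item counts as a lack, which is one possible design choice; a syllabus designer might equally prefer to flag untested items separately.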
Sinclair (1991:1) suggests that traditional linguistics had been
limited by the amount that one person could experience and remember
and he equates the situation with that of the physical sciences 250 years
before. Halliday (1993:7) sees the start of corpus-based linguistics as
laying the foundations for a quantitative and qualitative breakthrough in
understanding linguistic systems, and dates this start to the 1960s,
with Randolph Quirk in Britain and Freeman Twaddell in the United
States.
11
Meij’s ‘spread’ could be equated with the computer corpus concept of ‘range’ of an item in frequency
studies.
or some kind of social characteristic tagging of the participants and
situational tagging); whether it contains complete texts or samples from
texts; and whether the selection of texts is made by convenience,
purposefully, at random within strata or by proportional random
sampling.
Leech does not, for example, consider translation studies which is an
enormous area of its own with very specific views on language although
this could be subsumed under computational linguistics in general and
is perhaps hinted at in machine translation. Nor does he make special
mention of collocations, an area of language learning and teaching that
has become extremely important in recent years (Sinclair 1991, Tribble
and Jones 1990). Collocations are now seen to be the building blocks of
language and can be used for vocabulary management, to disambiguate
similar terms and formulate or check hypotheses about language use, to
help learners to understand texts, for self-access outside the classroom
and to provide teachers with suitable teaching materials. However, this
list does show some of the areas within which computer corpora can be
applied to the study of the English of science and technology in applied
linguistic research and language teaching.
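As an illustration of how collocations can be retrieved from a computer corpus, the following minimal sketch counts the words that occur within a fixed span of a node word. The sample sentences, the function name and the span of one word on either side are assumptions made for the example; they do not reproduce the procedure of any of the studies cited.

```python
import re
from collections import Counter

def collocates(text, node, span=4):
    """Count the words occurring within `span` tokens of each occurrence of `node`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            # Words to the left and right of the node, excluding the node itself.
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts

# Invented sample text, used only to show the mechanics.
sample = ("The kinetic energy of the particle increases. "
          "Potential energy is converted into kinetic energy.")

result = collocates(sample, "energy", span=1)
print(result.most_common(3))
```

Raw co-occurrence counts of this kind are only a starting point; in practice they are usually weighed against the overall frequency of each word before a pairing is treated as a collocation.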
Biber (1994:180) suggests that recent debate centres around
whether to use large corpora as opposed to what are known as “balanced”
corpora (that is, made up of a number of similar sized texts possibly from
a wide range of registers) for the design of general purpose corpora. He
argues (1994:180) that “it is important to address the question of
whether the varieties represented match the intended uses of a corpus”
and he claims that studies of a single sub-language are “legitimately
based on corpora representing only that variety”. This is the view that
this study takes about corpora; they must represent the variety of
English that the students are expected to come into contact with and
need to understand for their studies in science and technology.
Biber (1994:11) lists the following areas of study in linguistics that
corpora can help with: “individual words, grammatical features, men’s
and women’s language, children’s acquisition of language, author style,
register patterns” and goes on to suggest that dialect and register
patterns could be investigated for sociolinguistic fields when looking at
the “complex co-occurrence patterns among features in different
registers” which would be difficult to do without recourse to computers
on a large scale. He also (1994:12) mentions the study of styles across
historical periods which could provide the opportunity of investigating the
development of registers over time and emphasises the role of corpora in
educational linguistics. With respect to the latter he says that “large-scale
studies of use are helpful in designing effective materials and activities
for classroom and work-place training, allowing us to help students with
the language that is actually used in different target settings.” He also
recommends corpora use in language testing, that is, “making tests
which conform to the actual language that students will be using on a
regular basis”. These conclusions form the basis of the working
presuppositions of this study.
The preliminary tests that were designed for the undergraduates,
described in Chapter 4, went some way towards this goal of conforming to
what was seen as the target language that the students would be coming
into contact with in science and technology. The tested items were from
the materials that were to be taught in the discipline. They were not,
however, derived from corpora developed for the purpose, which, in the
light of modern computer corpus methods, is a weakness of the testing.
The reason for this was that the testing had already got underway before
the corpora used in this study had been developed but now that they are
available there is no reason why they should not be used for this purpose
in the future.
based on what was then considered a huge corpus of 7.3 million words of
written and a smaller corpus of about 1 million words of spoken
language. The ‘main’ corpus was started in 1960 and subsequently
smaller ‘side’ corpora were developed, notably the Bank of English and a
corpus especially prepared for Teaching English as a Foreign Language
(TEFL) textbook writing (see Willis, 1989). Sinclair and Jones (1974)
report that “The first corpus, in 1961, was a mere 135,000 words”. This
reflects the changes that have taken place with regard to the gathering of
data. Initially every text had to be transcribed onto computer manually
and the original computer programs for handling the texts had to be
developed. Later, text which had already been transcribed on computer
through word processing became available, and later still the use of
optical scanners (with what is usually known as Optical Character
Recognition or OCR) simplified the transcription of text and speeded up
its conversion into electronic data.
Sinclair (1987:2) describes the criteria on which the ‘main’ 7.3
million word Cobuild corpus was developed to be relevant “for the needs
of an international user” and which the team defined as the following:
- written and spoken modes
- broadly general, rather than technical, language
- current usage, from 1960, and preferably very recent
- “naturally occurring” text, not drama
- prose, including fiction and excluding poetry
- adult language, 16 years or over
- ‘standard’ English, no regional dialects
- predominantly British English, with some American and other varieties.
book authorship - 75% male: 25% female
English language variety - 70% British: 20% American: 5% Other
language mode - 75% writing: 25% speech
“to provide objective evidence about the English that most people read, write,
speak and hear every day of their lives”.
Academic prose        80      160,000
General fiction       29       58,000
Mystery fiction       24       48,000
Science fiction        6       12,000
Adventure fiction     29       58,000
Romantic fiction      29       58,000
Humor                  9       18,000
TOTAL                500    1,000,000
Table 2.2 Texts, Categories and Numbers of Words in the LOB Corpus
The LOB Corpus is tagged and part of the LOB known as the
Lancaster Parsed Corpus contains 133,000 words that have been
syntactically analysed.
There is also now the Freiburg corpus with approximately 1 million
words of British English, parallel to the LOB corpus, but compiled from
material published in 1991. The fact that corpora are seen to be
becoming dated means that their authority to describe modern English
usage is also diminished and so many more of this type of up-to-date
corpora are being prepared to keep abreast of changes that are
constantly taking place in language usage. These more modern corpora,
when produced using similar criteria, can be used for diachronic and
other comparative studies. The other reason that more up-to-date
corpora are being produced is that the techniques now available and the
research carried out on machine-readable or electronic texts have brought
some of the original criteria into question. The insights gained from such
research now imply that more modern corpora can be obtained in many
more different states of tagging depending on the purpose to which they
are to be put. Biber’s research, which forms the basis of this study, drew
on some of the earlier LOB texts.
2.5.5 The Brown Corpus
Leenders have found. For the purposes of the research described here
these failings make the use of such corpora inappropriate. A corpus that
lacks textbooks and contains an overwhelming amount of journalese is not
suitable for analysing the language that undergraduates of science and
technology need to confront.
Furthermore, Minugh (1997:68), despite recognising that the
Brown and LOB corpora were “a revolution in their time”, describes the
difficulty of using such corpora for searches for neologisms because of
their date of development. Minugh (1997) recommends the use of British
and American Newspaper CD-ROMs for this sort of search. In other
words, these corpora are also becoming dated and are therefore not
suitable for finding representatives of colloquial or modern language
terminology or coinages. This limitation is particularly relevant for those
conducting research into speech and current news, because language of
this kind can change quickly and reflect fads and fashions. Some of those changes
will become part of the language but others will disappear almost as
quickly as they came. This is the heart of the problem that dictionaries
such as the Oxford have every time a new edition is published. Terms
which are regarded as fashionable or corruptions are often decried by
readers and reviewers as having no place in such an established
reference work on the English language.
12
Text here means a communicative event
spoken and half written material. Six major speech situations are
represented: private conversations, public conversations (including
interviews and panel discussions), telephone conversations, radio
broadcasts, spontaneous speeches, and prepared speeches divided up in
the following way:
spoken and written. Leech (1993:13) gives the following information on
the composition of this corpus:
Genre
Books (55-65%)
Periodicals (20-30%)
Miscellaneous (published) (5-10%)
Miscellaneous (unpublished) (5-10%)
To be spoken (2-7%)
The spoken, face-to-face conversation corpus is as follows:
emerged because of this, that of “ephemera” which includes any material
that people come into contact with unintentionally, such as unsolicited
mail and advertising.
The number of texts and number of words contained in the categories
Academic Prose and Fiction are as follows:
The samples are taken from many registers from the early 1900s to
the 1980s. It can be seen that this corpus also has limitations both for a
description of general English and for analysis of specific varieties of
English usage. It lacks the balance needed to provide what is
described as ‘general English’ by the Bank of English criteria mentioned
above (section 2.5.3). It also suffers from having ‘text fragments’, which
Sinclair (1991) regards as a failing of many corpora, and it covers too
wide a period of time for much research either on modern usage or for
diachronic study purposes.
Many projects on specific issues that researchers feel are absent from or
underrepresented in the established large-scale projects described in
more detail above are taking place in universities around the world. A
short summary of some of the main areas that these cover is given below
to demonstrate the trends in recent corpora studies.
An Australian corpus (ACE) produced at Macquarie University, New
South Wales and an International Corpus of English (ICE) produced at
University College London (http://www.ucl.ac.uk/~ucleseu/design.html)
have also recently been developed to address other types of Englishes in
the world. ACE contains one million words of Australian English
compiled along the same lines as the Brown Corpus for purposes of
comparison. ICE contains one million words from the English of
Australia, Canada, East Africa, Hong Kong, India, New Zealand, Jamaica,
Nigeria, Singapore and the Philippines. The Melbourne-Surrey Corpus
has 100,000 words from Australian newspapers.
The Kolhapur corpus contains 1 million words of written Indian
English from 1978. It uses the same categories as the Brown Corpus and
English from 1987. It uses the same categories as the Brown Corpus and
LOB Corpus.
A corpus of spoken American English (CSAE) is being constructed
at the University of California and is eventually expected to contain one
million words.
The Northern Ireland Transcribed Corpus has about 400,000 words
of spoken material from 42 locations and over three age groups.
The CHILDES Project (http://poppy.psy.cmu.edu/childes/database.html) is
developing a corpus of children’s spoken and written language. There is
also the Polytechnic of Wales (POW) corpus of 61,000 words of children’s
spoken language which has been parsed using Hallidayan
Systemic-Functional Grammar.
The increase in the number of corpora and such corpora as those
on language development will surely have an influence on teaching and
learning as they show what actually takes place rather than what some
small scale studies have suggested is the case in both language
acquisition and language diversity.
2.5.10 EFL Student Corpora
present. These variation studies across time as opposed to across genres
are developments which seem to be harking back to some of the other
traditional (now computer assisted) studies of variation in old
manuscripts.
2.5.12 Concordances
seen to be from novels from earlier periods like those of Jane Austen in
the early nineteenth century. Such examples can be regarded as dated
and often ‘unusual’ rather than reflections of modern-day English usage.
The biggest criticism of many of the first corpora produced is precisely
this, that they have already become dated and cannot be seen to be
representative of modern English usage any longer. They are already
caught in the trap of ‘historical’ rather than ‘current’ usage.
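The kind of concordance display discussed in this section can be illustrated with a minimal key-word-in-context (KWIC) sketch. The sample sentences, the function name and the context width of two words are assumptions made purely for illustration and do not correspond to any particular concordancing package.

```python
import re

def kwic(text, node, width=3):
    """Return (left context, node, right context) for each occurrence of `node`."""
    tokens = re.findall(r"\w+", text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == node.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

# Invented sample text, used only to show the layout of concordance lines.
sample = "The force acts on the body. A net force changes its momentum."

lines = kwic(sample, "force", width=2)
for left, keyword, right in lines:
    # Right-align the left context so that the node words line up vertically.
    print(f"{left:>20}  {keyword}  {right}")
```

Because every line is drawn from the corpus itself, the examples a learner sees are only as current as the texts the corpus contains.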
This dissertation will take up three major lines of research from the
register analysis, discourse analysis and corpus analysis mentioned
above which are essential to syllabus development.
First a register analysis will be carried out on the physics and
chemistry books from the students’ bibliographies, as register analysis
can be applied to syllabus design following Jones’ orientation (1991),
using frequency counts to identify what is lacking in any syllabus or
materials for specific learners. Consideration of cognates will be made in
order to fine tune these lists even further and to anticipate areas of
difficulty for Portuguese native speaker students.
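The use of frequency counts to identify what is lacking in a syllabus can be sketched as follows. This is a minimal sketch under stated assumptions: the sample text, the syllabus word list and the minimum frequency of two are hypothetical and are not the lists produced in this study.

```python
import re
from collections import Counter

def frequency_list(text):
    """Count how often each word form occurs in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def missing_from_syllabus(freqs, syllabus_words, min_freq=2):
    """Return the frequent words of the text that the syllabus does not cover."""
    return sorted(word for word, count in freqs.items()
                  if count >= min_freq and word not in syllabus_words)

# Invented sample text and syllabus list, for illustration only.
textbook = "The acid reacts with the base. The base neutralises the acid."
syllabus = {"the", "with", "acid"}

freqs = frequency_list(textbook)
gaps = missing_from_syllabus(freqs, syllabus)
print(freqs.most_common(3))
print(gaps)
```

A further step, not shown here, would be to mark which of the missing words are cognates in Portuguese, since these are likely to need less teaching time.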
This will be compared with a CD-ROM multimedia encyclopaedia in
order to bring out similarities and differences between texts that are of
the same academic level, according to Huddleston (1971) and Swales
(1985), and which will serve to reflect the moves in education towards the
use of this kind of technological resource for both student and
teacher-directed learning (see Guillot and Kenning 1995:365). Work with
interactive and multimedia resources such as those available on
CD-ROM and through the Internet is seen as being increasingly important
in education as discussed in the introduction to this study. This
comparison will also provide information on the range of the items, so
that the relevant context (and, therefore, specific use and meaning) of the
lexis can also be determined.
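The notion of range, that is, the number of texts or corpora in which an item occurs, can be illustrated with a short sketch. The three miniature 'texts' and their labels are invented for the example and stand in for the textbook corpora and the encyclopaedia only by analogy.

```python
import re

def word_range(texts):
    """For each word, count the number of texts in which it occurs at least once."""
    ranges = {}
    for text in texts.values():
        # A set is used so that each word is counted once per text.
        for word in set(re.findall(r"[a-z]+", text.lower())):
            ranges[word] = ranges.get(word, 0) + 1
    return ranges

# Invented miniature texts, for illustration only.
texts = {
    "physics": "Energy is conserved. Energy is transferred.",
    "chemistry": "The reaction releases energy.",
    "encyclopaedia": "The reaction is fast.",
}

ranges = word_range(texts)
print(sorted(ranges.items()))
```

A word with a high frequency but a range of one is likely to be specific to a single subject, whereas a word with a wide range is a candidate for the common core.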
The corpora from the physics and chemistry textbooks will be
explored using Biber’s (1988) methodology for variation studies, in order
to highlight the ways in which these conform to and differ from both
academic prose and general language use as he classifies them. Biber’s
methodology is explicitly defined so that it is possible to build this study
on his work in an accurate and principled manner. This is deemed to be
a prerequisite of any research in corpora studies so that a precise
description of the criteria used to produce the data is available which can
thus be evaluated in the light of the purpose to which it is to be put.
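One explicitly defined step in Biber’s (1988) methodology is the normalisation of raw feature counts to a rate per 1,000 words, so that texts of different lengths can be compared. The sketch below illustrates only this step; the feature names, counts and text length are hypothetical values chosen for the example.

```python
def per_thousand(count, text_length):
    """Normalise a raw feature count to a rate per 1,000 words of text."""
    if text_length <= 0:
        raise ValueError("text_length must be a positive number of words")
    return count * 1000 / text_length

# Hypothetical raw counts for one text sample of 2,400 words.
counts = {"nominalizations": 90, "passives": 36}
length = 2400

rates = {feature: per_thousand(count, length)
         for feature, count in counts.items()}
print(rates)
```

Only after normalisation of this kind can the counts from the physics and chemistry corpora be set beside those reported for other registers.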
The students’ language needs will also be ascertained by using the
results obtained from tests on entering the university to determine the
strengths and weaknesses which will need to be addressed by any
syllabus designed for these students. The results of the tests are
classified into grammatical categories that correspond to those used on
the corpora as far as possible and test items are exemplified to provide
clearer description especially for those tests that took place before the
advent of this study. By comparing and contrasting these categories it is
possible to reach some conclusions about the areas that need specific
attention in the syllabus designed for these students.
Finally the results of what can be seen as a data-driven description
will be brought together to suggest what should be included in any
syllabus adopted for these students. At this point the research findings
from other corpus-based studies have to be taken into account in the
grading of material through core or key patterns of usage. The
commonest forms of language use in these textbooks, and the
combinations or core patterns that these typically form, must be matched
with the abilities the undergraduates have shown in handling them on the
tests they have taken. Nevertheless, syllabus design calls into question many other
methodological aspects which must be addressed. What can feasibly be
achieved, despite the inherent constraints, is one of the major
considerations here. It is essential that innovation of the kind mentioned
above in terms of modern technology is incorporated into the syllabus
and that more of the same kind of teaching/learning is not simply carried out for
these very specific students with specific goals and requirements. The
wider educational implications of innovation in the syllabus proposed will
be addressed.
Chapter 3
Research Methodology
As described in Chapter Two, much work has been carried out this
century and in particular in the last thirty years to try to define exactly
what makes different styles of English different. Lexis, syntax, pragmatics
and discourse features have all been studied in order to discover
differences and many claims have been made, some on rather slim
evidence. For example, Tarone et al.’s study (1981), although very
professional, was based on only two Astrophysics articles which were eight
pages and seven pages long respectively (see section 2.2.3 for further
discussion of this point). Swales (1985) attributes this state of affairs
to the fact that ESP ‘practitioners’, that is, teachers who also produce
materials, design courses and conduct research, are usually working in
isolation and do not often look back to the work that has gone before, nor
do they learn from work that is being conducted in parallel to their own
and which might usefully contribute to their work. One thing has become
increasingly apparent and that is that each learning context needs to be
studied in order to provide an accurate picture if the results of such
research work are to have practical applications. Nevertheless, this
specific work must be related to other work in the field.
Halliday (1993:124) says:
“There are practical reasons for analyzing scientific texts. The most obvious is
educational: Students of all ages may find them hard to read, and we know
from various research reports that, in English at least, the difficulty is largely
a linguistic one. So if we want to do something about it we need to understand
how the language of these texts is organized.”
materials of this level, which suggests unfortunately that this situation is
likely to continue for the foreseeable future.
Researchers often recognise that different styles of English are more
prevalent at different academic levels. The difference between an
undergraduate and a post-graduate science student, for example, would
suggest widely different text types (from textbooks teaching the subject
matter of the course to journals reporting the latest research in very
specific branches of science) and, therefore, styles of English. Similarly,
some English for Science and Technology (EST) practitioners believe in
adopting much more popular and accessible texts which would also bring
with them a considerable difference in style and content from the average
science textbook. One example of this difference in style is shown by
research carried out by Darian (1981) into the manner in which
definitions are handled in such magazines as Popular Science or Time
magazine and the Journal of Astrophysics. Darian finds, not surprisingly,
that definitions are handled differently in popular magazines from those
used in specialist journals, and that these are different again from those
used in textbooks used to teach the subject. Although it can be argued
that the students may be more motivated by certain types of (more
popular) materials, these are not considered to be a suitable basis for an
analysis of syllabus design for tertiary level students. The assumption that
will be made in this analysis is that if students are taught to cope with the
kinds of scientific texts that appear in their undergraduate bibliographies,
they will be better able to cope later on, whether it be with the literature of
their specialisation (where incidentally lexical density has not been found
to be a barrier to the specialist) or in other professional outcomes of their
courses.1
1 Such as teaching; cf. Arroteia, Jorge Carvalho; Martins, António Maria (1997) Inserção Profissional do
Diplomados pela Universidade de Aveiro: Trajectórias Academicas e Profissionais, Aveiro: Universidade de
Aveiro.
3.1 Frequency and Range List
usefulness for teaching purposes and further specifying appropriate use.
Contrasting the three corpora will also give information about coverage,
that is, the number of things that can be expressed by any given item.
Coverage and range together will provide clearer evidence for which items
to include in the syllabus. Furthermore, examination of the frequency lists
allows prediction of areas of difficulty for students whose first language is
Portuguese.
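The two measures at issue here, frequency (how often an item occurs overall) and range (the number of texts in which it occurs), can be illustrated with a minimal sketch. It is not the procedure used to compile the lists in this study: the tokeniser, the function names and the three one-sentence "texts" are assumptions made purely for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple alphabetic word-forms."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def frequency_and_range(texts):
    """Return {word: (frequency, range)} for a dict of {text_name: text}.

    frequency = total occurrences across all texts
    range     = number of texts in which the word occurs at least once
    """
    freq = Counter()
    rng = Counter()
    for text in texts.values():
        tokens = tokenize(text)
        freq.update(tokens)        # every occurrence counts towards frequency
        rng.update(set(tokens))    # each text counts once towards range
    return {word: (freq[word], rng[word]) for word in freq}


# Invented sample sentences standing in for three corpora (illustration only).
sample_texts = {
    "physics": "The force on the particle is the product of mass and acceleration.",
    "chemistry": "The reaction rate depends on the mass of the catalyst.",
    "encyclopaedia": "A force can change the motion of a body.",
}

table = frequency_and_range(sample_texts)
for word, (f, r) in sorted(table.items(), key=lambda kv: (-kv[1][0], kv[0]))[:5]:
    print(f"{word:<12} frequency={f} range={r}")
```

An item with high frequency but a range of one is likely to be topic-specific, whereas an item with high values on both measures is a stronger candidate for inclusion in a syllabus.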
because of similar word formation or shared Latin roots, could be easier
(cognates) or, because of different roots, more difficult for Portuguese
undergraduates, or examples of false friends. For example the word
“abnormal” and its Portuguese equivalent anormal are sufficiently close to
suggest that positive transfer could take place and the students’ ‘guess’ or
semantic prediction would probably be accurate. However, words like
“able” and its Portuguese equivalent capaz are considered to be difficult,
although an alternative hábil could be used in some circumstances and
would be closer to the English form. The latter would be more accessible
provided that the students recognised the similar pronunciation of the two
words rather than their orthographic form. This contrastive analysis aims
to predict the learnability (Mackey 1965) of the language found in the
corpora and is incorporated into this study through examination of the
frequency lists from the corpora described in more detail in Chapter 5 in
order to identify cognates and thereby identify the areas of difficulty for
students.
More recently internationalisms have appeared, where the same (or a
very similar) word is used in many countries. An example of this
might be computer jargon like software/hardware, terms which are used in many
languages and have caused the European Union to fund projects to
produce terminology banks in various areas including that of information
technology.
3.1.2 Context
the meaning of a word but they warn “There are even more dangerous
traps when the overseas context that appears to correspond to the native
speaker’s context in fact differs.” They suggest that students should be
encouraged to pay particular attention to collocations. Sinclair (1994-
98:18-19) suggests that the text itself contains everything that the reader
needs but warns that there are restrictions which, with the help of the
computer, can be explored to provide “models which help the text to reveal
itself to us”. Johns (1994-98:103) holds that the text used
by students should reflect the target material the student needs to get to
grips with, but should not be treated in a manner that would lead students
to develop ‘bad’ reading strategies, and that any simplified text will only be
“used as a stepping stone to the real thing”.
3.1.3 Collocations
example of the word digital and says “it is impossible to decide whether the
term denotes a special technical quality or is just an element of general
language use. The denotative meaning of the word is determined by its
textual venue i.e. whether we encounter it in a technical statement on
computer operations or in a sales talk in a watch shop, where the word
might be in a familiar juxtaposition to the word watch.” Darian (1981)
suggests that “ultimately the fullest meaning of a word lies at the
discourse level, which allows for an extended definition and deeper
exploration.” Martin (1992:172) claims that “technical language both
compacts and changes the nature of everyday words.” Students need to
connect words learnt at school with new contextual meanings in more
specialised contexts to avoid a particular kind of “false friends” where
words change their meanings in these different contexts (Hoffman 1981).
Moon (1994-98:122-124) lists ‘fixed expressions’ from her analysis of a
newspaper editorial. She (1994-98:126-7) finds the most common
expressions in the lexicon as a whole to be ‘functional’ or ‘grammatical’ as
opposed to ‘lexical’ and that (ibid.:134) examining the fixed expressions in
text provides information on the message and the speaker/writer’s
presentation and how this relates to objective statement or subjective
interpretation.
Whilst recognising that the analysis of specialist corpora will not
always reveal what the researcher expected, Tribble and Jones (1990:35-
36) make the following comment in relation to the utility of concordancing
for teaching purposes:
Two generalizations can be made about applications of concordance output,
in spite of their diversity. Firstly, most of them favour discovery learning.
That is, they present language in a way that enables learners to discover new
knowledge for themselves, rather than being spoon-fed. Secondly, they do
this by providing examples of authentic language. The fact that the source
material for exercises is drawn from real life rather than concocted by
teachers increases motivation, as it gives learners immediate contact with
the target language in use.
English fits in neatly with the English requirements of undergraduate
students in the University of Aveiro. One extremely useful addition is that,
not only can a word frequency study be carried out, but a further
dimension can be added to the research and that is the context in which
each high frequency word is to be found. So, not only can useful
information about lexis be obtained, but also a clearly defined use of those
items in specific scientific texts and also the link between the word and its
discourse setting. The number of texts that high frequency words are
found in can also add to the information about which words students are
most likely to encounter and, therefore, need to learn.
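The idea of adding context to a frequency study can be sketched as a simple key-word-in-context (KWIC) display, in which each occurrence of a word is shown with a fixed window of text on either side. The function name, the window width and the sample sentences below are hypothetical and serve only to illustrate the principle; they do not reproduce the concordancing software used in this study.

```python
import re


def kwic(text, keyword, width=30):
    """Return one concordance line per occurrence of `keyword` in `text`.

    Each line shows up to `width` characters of left and right context,
    with the left context right-aligned so that the keywords line up.
    """
    pattern = r"\b%s\b" % re.escape(keyword)
    lines = []
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


# Invented sample text (illustration only).
sample = "A digital watch shows the time. The digital signal is sampled."
concordance = kwic(sample, "digital", width=20)
for line in concordance:
    print(line)
```

Reading down the aligned column makes the immediate collocates of the word visible at a glance, which is the kind of information a frequency count alone cannot supply.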
In addition, Rosenthal (1996:114) reports that introductory science
textbooks for further education in the United States have been getting
longer, broader and deeper in their coverage and reading complexity,
making many of them encyclopaedic.
Swales (1985) argues that this “‘level of brow’ is not as important as the
expected relationship between the author and reader”. He describes these
‘mid-brow’ texts as “essentially instructional”. Similarly, Darian (1981:29-
30) describes the relationship between Material and Type of Audience. His
division is as follows:
Material                                         Type of Audience
1. Popular magazines, newspapers                 Uneducated layman
2. Scientific American and popular books         A reader conversant in the general area
                                                 (e.g. business, social science)
3. High school text                              Layman - limited general knowledge and
                                                 technical background information
4. Introductory college text                     Layman - educated to college level of
                                                 general knowledge
5. Scholarly journal, specialized book-length    Specialist and advanced graduate student
   study (e.g. a volume on optics)
Table 3.2 Darian’s Level of Text and Audience
Darian claims that for each of his categories the writer assumes a different
level of “presupposition or background knowledge” on the part of the
reader. Glaser (1982:76-77) describes the difference in style between what
she calls “the academic scientific and technological style” addressed to
“‘insiders’ of a particular field of knowledge” and “the popular-scientific
style” used for “a general audience composed of non-specialists”. Glaser
describes the specific features of each of these being governed by the fact
that in the former “knowledge of the subject and the appropriate
terminology, the code of formulas and symbols and the various functions
of the syntactic patterns” is presupposed whereas the latter “show
entertaining deviations from the specialist’s topic for the purpose of
motivating the reader”. Furthermore, she distinguishes both of these
styles from a “didactic” style which attempts to make “a job-specific
problem (a scientific or technological subject) understandable to the
learner” found in textbooks, handbooks and other teaching material used
at schools and universities which are “subject to the didactic principle of
intelligibility of the text.” Similarly, Myers (1994-98: 189) finds that
different styles of research articles and popularizations construct different
views of science and that scientists “see their work as much more
tentative and mediated than does the public.” Myers (ibid.) found
differences in syntax, vocabulary and organisation between these two
types of ‘scientific text’ and he believes that teachers and students must
take these differences into account to “follow the entry of students into a
research community.”
This thesis contends that the appropriate material for
undergraduate students is the textbook, corresponding to Darian’s fourth
category above and Huddlestone’s second category of ‘mid-brow’. These
students are at a stage where their bibliographies reflect “instructional”
texts and therefore the encyclopaedia is an appropriate research tool as it
is also ‘essentially instructional’. This is because the students are in
transition from secondary to tertiary education and have yet to develop
greater knowledge of the subjects in their core disciplines on the Ano
Comum. Both of these text-types also fall into the category of educational
texts which will be reflected in their style.
describes the differences between the focus of scientific texts and
“popularisations” which are prepared for a more general audience. It is
only appropriate in variation studies for a variety of text-types to be
examined together, but what those texts are needs to be defined clearly.
Biber (1988:208-210) describes precisely which texts he included in his
variation studies on speech and writing. Many of the texts were taken
from the LOB and London-Lund corpora mentioned earlier in Chapter
Two. This level of specificity makes Biber’s analysis an appropriate tool for
this study.
Their ‘Problem 1’ refers to regular plurals: they suggest that ‘The pupil
who has mastered regular plurals will recognise monuments instantly if he
knows monument and vice versa. The difference is lexical not grammatical.’
2 Sinclair (1991) uses the term ‘word-form’ for this concept.
‘Problem 2’ is whether the word is a noun or a verb. Take, for example,
the word ‘play’. Is this to be regarded as two different words, once as a
verb and again as a noun? The Grolier encyclopaedia corpus does not
make any such distinction and so it is only through a more searching
analysis using concordancing that such distinctions can be resolved. This
is a very important issue, however, as Biber (1998:34-5) points out with his
finding that “deal/deals functioning as a verb is almost twice as common
as the noun use” in academic prose (from the Longman-Lancaster corpus)
whereas fiction (from the same corpus) shows the opposite with “the noun
use being considerably more common than the verb use”. This kind of
information is extremely important for ESP language learners and should
be brought out in the materials designed for their use.
“It is now possible to compare the usage patterns of, for example, all the forms
of a verb, and from this to conclude that they are often very different one from
another. There is a good case for arguing that each distinct form is potentially a
unique lexical unit, and that forms should only be conflated into lemmas when
their environments show a certain amount and type of similarity.”
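The difference between counting individual word-forms and conflating them into a lemma can be made concrete with a small sketch. The lemma mapping below is hand-written and hypothetical, as are the function name and the sample sentence; a real study would need a full lemmatiser and, as the quotation cautions, evidence that the forms behave similarly before conflating them.

```python
import re
from collections import Counter

# Hypothetical, hand-written mapping from word-forms to a lemma (illustration only).
LEMMA_OF = {"deal": "deal", "deals": "deal", "dealt": "deal", "dealing": "deal"}


def form_and_lemma_counts(text, lemma_of):
    """Count each listed word-form separately and again conflated by lemma."""
    tokens = re.findall(r"[a-z]+", text.lower())
    forms = Counter(token for token in tokens if token in lemma_of)
    lemmas = Counter()
    for form, count in forms.items():
        lemmas[lemma_of[form]] += count
    return forms, lemmas


# Invented sample sentence (illustration only).
sample_sentence = (
    "This chapter deals with force. We dealt with mass earlier "
    "and will deal with energy."
)
forms, lemmas = form_and_lemma_counts(sample_sentence, LEMMA_OF)
print(dict(forms))
print(dict(lemmas))
```

Keeping both tallies side by side allows the analyst to see whether one form dominates the lemma before deciding how the item should be presented to learners.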
case, in EST texts. Halliday (1993:71) also argues that it is impossible to
separate the grammar from the vocabulary and that it is ‘the total
effect of the wording -words and structures-’ that the reader responds to.
‘Problem 7’ is that of prefixes. Bright and McGregor claim that ‘any pupil
will be able to jump to the meaning of such items as ‘action - reaction’.
Whether or not this claim (and other similar ‘leaps’ in understanding by
students) is true would appear to depend to a certain extent on the
contact that students have had with English in their schools and their
understanding of discourse and shared scientific background knowledge
as discussed earlier. This will be taken up again later in the evaluation of
our students’ test results on entering the University (see Chapter 4).
‘Problem 9’ is concerned with what they term ‘form words’ such as ‘a, the,
and’. The Grolier encyclopaedia expressly excludes a number of such
words on the grounds that they are too common. A list of these very
“common” items is included at the end of each of the alphabetical lists as
they occur both in this chapter and in the Appendices.
‘Problem 10’ is phrasal verbs which are treated as separate items by the
Grolier encyclopaedia, that is to say, the verb and its particle appear
separately. This is a difficulty that can only be cleared up by examination
of the context of use of the main verbs found to be phrasal. Sinclair lists
the phrasal verbs that account for nearly 30% of all phrasal verbs in the
COBUILD corpus as “bring”, “come”, “get”, “go”, “put” and “take”. Separate
study of these would need to be made if this proved to be an area that the
students had particular difficulty with on their test results. The tests
(see Chapter 4) in fact produce mixed results, depending on the phrasal
verb being tested.
3.1.8 Other Features of the Text and Corpus
3.1.8.2 Abbreviations.
through exclamations like Ah and Ha which may be pronounced in very
much the same way for a Portuguese speaker but quite differently by a
native English speaker.
claims that higher education has not yet found a means of coping with
this other than through the tutorial question and answer system to
draw out where and when the misconceptions occur.
Strevens (1978:193) maintains that Latin and Greek roots and
affixes combine to form an extremely large number of words which are
‘science-specific’. He cites the roots aqua-, cyto-, hydro-, plasma-, pyro-,
and the prefixes ante-, anti-, poly-, post-, pre-, sub-, and suffixes -fer, -ite, -
logy, -valent. Strevens (ibid.) maintains that this scientific vocabulary
makes up a ‘normal part of the training of all scientists’. Portuguese
students are fortunate in that they have a Latinate language which may go
some way towards providing them with knowledge of and insight into the
scientific applications of Latin roots.
Quirk (1995) has shown that some words are preferred in certain
texts or registers even though there may well be a very similar synonym.
“Ancient” and “old” for example may exist in almost equal numbers of
texts (range) and frequencies in the corpora whereas “attempt” as opposed
to “try”, and “change” as opposed to “alter”, may exist in different
frequencies showing preference for one form over the other. Quirk (ibid.)
argues that these kinds of choices, although apparently arbitrary, can
indicate formality in texts and may therefore be representative of the
particular genre they are found in. Lemke (1998:92) suggests that choices
of lexis contribute to the “attitudinal stance of a text to its audience, to its
content, and to other text-embodied viewpoints”. McCarthy and Carter
(1994:104-5) suggest that vocabulary choice is just as discourse sensitive
as grammatical choices and that if language is to be considered as
discourse “vocabulary must be a concern as much as any other aspect of
language form”.
A similar position is adopted by Biber, Conrad and Reppen
(1998:43-54) who demonstrate that “big”, “large” and “great”, which are
often presented to students as synonyms, are usually used in quite
distinct patterns and with specific meanings. They find (1998:51) that
fiction and academic prose have different preferences for these words with
“big” being more common in fiction and “large” in academic prose. While
both registers use “great” with “deal” as a collocate, fiction uses many
more senses of “great” than does academic prose. They account for these
findings by suggesting that “fiction texts contain frequent physical
descriptions” and “more varied descriptions” whereas academic prose texts
“deals with size” and “specific measurements”. They go on to examine the
collocates associated with these words in the two registers. Similarly
(1998:98-99) they examine the preferential use of “begin” and “start” in
fiction and academic prose and discover that the intransitive use of start
is the most common in both registers but is more prevalent in academic
prose. In contrast “begin” is usually used intransitively but in fiction it is
used mainly with a to - clause. In other words, they argue that the
patterns of language use are not synonymous across registers. Thus the
vocabulary preferences found in the corpora are significant both as
representations of the style of the texts and as a means of demonstrating a
model of authentic usage to students.
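A preference for one near-synonym over another can only be compared across corpora of different sizes if the raw counts are normalised. The following minimal sketch computes occurrences per thousand words; the function name and the two one-sentence "registers" are invented for illustration and do not reproduce any figures from the studies cited above.

```python
import re
from collections import Counter


def per_thousand(text, words):
    """Return the frequency of each word in `words` per 1,000 running words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    return {word: 1000 * counts[word] / total for word in words}


# Invented sample texts standing in for two registers (illustration only).
fiction_sample = "He lived in a big house with a big garden and a great dog."
academic_sample = "A large sample and a large number of trials gave a great deal of data."

synonyms = ["big", "large", "great"]
fiction_rates = per_thousand(fiction_sample, synonyms)
academic_rates = per_thousand(academic_sample, synonyms)
for word in synonyms:
    print(f"{word:<6} fiction={fiction_rates[word]:.1f} academic={academic_rates[word]:.1f}")
```

Normalised rates of this kind show which form a register prefers; the collocates of each form would then have to be examined separately, for instance through concordance lines.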
headings. Any corpora searched will not provide these distinguishing
features so that an analysis of the texts themselves as published can often
reveal other interesting and important features of those texts (see later
5.2).
Laurillard (1993-97:27) describes academic knowledge at university
level as a process of ‘mediating learning’ because the students have to
learn what others have given insights into rather than what they can have
direct experience of. She suggests that because academic knowledge “has
this second-order character, it relies heavily on symbolic representation as
the medium through which it is known” and although the medium may be
language it may also be “mathematics symbols, diagrams, musical
notation, phonetics, or any symbol system that can represent a
description of the world.” Therefore, students in university have two
problems to overcome, the first that of handling the representation system
and the second the ideas which they represent. Some features that must
be taken into consideration therefore are the use of typographics, titles,
subtitles, summaries and conclusions, drawings and diagrams, and
formulae, numbers, equations and tables. These features should add to
the student’s understanding of the text, provided that the student is aware
of the conventions used and has been trained in recognising the
multimodality of texts. Lemke (1998:95) suggests that scientific text is not
meant to be read in a linear manner and for him it represents a “primitive
form of hypertext” where “footnotes represent an optional branch for
readers, so do figures and their captions, and the parenthetic or main-text
expressions such as ‘(Table 3)’ or ‘as seen in the first table’ which point to
them.” In contrast speech is linear in this respect. The number of
dimensions available to the reader is therefore much wider and access
to them is much more open: the reader can choose what information to
access from the different textual and visual information present in
scientific texts. Nevertheless, students have to have background
knowledge of the canonical forms used in science in order to be able to
understand and interpret the information available.
3.1.9.1 Typographics.
The use of a number of symbols through computers has taken this
further in modern materials and, if overused, these may serve to irritate
rather than encourage, as is the case with the ubiquitous, perfidious ‘smiley’
to indicate a joke or other attempt to be friendly or light-hearted. Attempts
to make materials for learners more attractive may require a clear
statement at the beginning of how these symbols will be used in any text.
If learners skip past these early explanations in the textbook, they may be
in danger of missing many of the connections the author intends to make.
Van Dijk (1997:10-11) claims that discourse topics define the overall
‘unity’ of discourse and are “typically expressed in such discourse
segments as headlines, summaries or conclusions.” He also claims that
they “also happen to be the information that we usually remember best of
a discourse,” which, if the case, means that these features are especially
important for study purposes.
went before. In this way they serve to prepare the learner for the task and
serve as a check on learning. These types of activities are known as ‘wrap
around exercises’ to assist with text processing and to enable the learner
to monitor progress on their own at home.
were meant to be read as part of the verbal text “in terms of semantics,
cohesion and frequently grammar”.
As with the use of other textual features, the use of diagrams and
drawings should enhance the understanding of the surrounding text,
provided that the referencing to these is understood. In general, visual
material in the text was seen as a form of redundancy as it reiterates what
is being discussed. However, Lemke (1998:104) disagrees with this
position and claims that visual figures and mathematical expressions add
important or necessary information and so complement or complete the
main text. Modern discourse analysis sees multi-modality in texts as an
essential feature of study in discourse semiotics. Kress, Leite-García and
van Leeuwen (1997:257) say that “producers of texts are making greater and
more deliberate use of a range of representational and communication
modes which co-occur within the one text” and that the reader has to take
these into account in order to “read texts reliably”.
Van Dijk (1997:6) suggests that “in these times of multi-media
communication ...an analysis of the visual dimension of discourse is
indispensable.” Van Dijk is much more interested in non-verbal signs or
semiotics but, nevertheless, the visual element in students’ textbooks
should be an aid to understanding the discourse of the text if the students
can interpret it accurately. Lemke (1998:87) argues that semiotic
systems such as language, tables, graphs, images and diagrams do not
just “add-on” meaning to a text but actually create new orders of meaning
thereby “multiplying meaning”. Furthermore, Kress, Leite-García and van
Leeuwen (1997) suggest that it is important to see visual images as
independent vehicles for meaning in their own right. If the students can
make the connections between visual images and text or ‘read’ visual
images in scientific texts accurately, this would help the students to
ascertain the meaning in those texts. The question of whether students
can do this successfully is taken up again later in 5.2.4.
Table 3.3 Students’ Number of Years of Study of English
Years of study   1993/4   1994/5   1995/6   1996/7   1997/8
0                4.7%     -        -        -        -
1                -        -        -        -        -
2                2.7%     2.8%     -        1%       0.32%
2.5              0.7%     -        -        -        -
3                18%      13%      14%      8%       5.73%
4                3.3%     0.6%     1%       1%       1.91%
5                38%      26.5%    29%      30%      23.25%
6                1.3%     4%       3%       2%       5.09%
7                28%      47.5%    47%      53%      54.78%
8                2.7%     4%       6%       4%       7.01%
9                -        -        -        1%       0.96%
10               0.7%     0.6%     -        -        -
11               0.7%     0.6%     -        -        -
15               -        -        -        -        0.64%
18               -        -        -        -        0.32%
(3.57% of those who took the test did not answer this question at all.)
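The broad bands used in the pie graphs for each academic year (under three years, three to five, six to seven, and over seven) can be derived from the individual rows of Table 3.3 by simple addition. The sketch below does this for the 1995/6 column; the band boundaries and the function name are assumptions made for illustration, and the percentages are those read from the table for that year.

```python
def band(years):
    """Assign a number of years of study to a broad band (assumed boundaries)."""
    if years < 3:
        return "< 3 yrs"
    if years <= 5:
        return "3-5 yrs"
    if years <= 7:
        return "6-7 yrs"
    return "> 7 yrs"


# Percentages for 1995/6 as read from Table 3.3 (years of study -> % of students).
column_1995_96 = {3: 14, 4: 1, 5: 29, 6: 3, 7: 47, 8: 6}

totals = {}
for years, percentage in column_1995_96.items():
    totals[band(years)] = totals.get(band(years), 0) + percentage

print(totals)  # {'3-5 yrs': 44, '6-7 yrs': 50, '> 7 yrs': 6}
```

The intermediate rows (four and six years) are absorbed into the neighbouring bands, which is why the three expected groupings stand out more clearly in the graphs than in the table.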
It was expected that the students would have studied English for an
average of three, five or seven years, with some time gap between the years
in which they studied English and university. What was most worrying for
the academic year 1993/4 was the percentage of students with no English
at all embarking on the course in conjunction with a significant number of
students who had studied English for seven or more years.
3 When a student who gave an answer like this was questioned about it later, she admitted that she had in fact
given information about her age and not her studies. However, this particular student justified her answer
by explaining that she had in fact spent most of her life in America so she felt that the English language had
indeed been part of her entire life.
The results for the academic year 1994/5 were a little more
encouraging in that they show no students with absolutely no English;
nevertheless, there is still a significant number with very little English.4
The figures for the academic year 1995/6 show much more clearly
the expected breakdown into three, five and seven years of English. Some
of the intermediate figures could be accounted for by students who have
had to repeat a year at school, which, if true, would suggest that those
percentages were students who could be considered weaker than others in
the same broad categories.5
The figures for the academic year 1997/98 show how there is a
general trend for students to be stronger in English than before, and the
answer ‘fifteen years of study’ reflects students who had been brought up
abroad in English speaking countries. Certain courses, such as Novas
Tecnologias e Comunicações - New Technologies and Communication
(NTC), appear to be attracting students who are generally stronger in
English which is not perhaps surprising given the nature of this course
which has a slightly more ‘humanities’ or ‘arts’ bias than the other Ano
Comum courses. These students also continue with their English studies
for a further year unlike most of the science and technology students in
the university.
4 Chatel (1999:246) records a similar change from 1988 to 1994 for sociology students in the University of
Coimbra.
5 Drª Maria Adelaide de Araújo Nunes of the University of Evora (1999:258) describes the “uncongenial
environment” for ESP with students who “have had only a few years of English at secondary school and/or
having systematically failed the subject there” and so “feel at a loss and are understandably reluctant to
study a subject that they hoped they would never encounter in their lives again”.
Figure 3.1 Pie Graph for the Academic Year 1993/94 showing the Students’ Number of
Years of English
(1993/94: 3-5 yrs 59%)
Figure 3.2 Pie Graph for the Academic Year 1994/95 showing the Students’ Number of
Years of English
(1994/95: < 3 yrs 3%; 3-5 yrs 40%; 6-7 yrs 52%; > 7 yrs 5%)
Figure 3.3 Pie Graph for the Academic Year 1995/96 showing the Students’ Number of
Years of English
(1995/96: 3-5 yrs 44%; 6-7 yrs 50%; > 7 yrs 6%)
Figure 3.4 Pie Graph for the Academic Year 1996/97 showing the Students’ Number of
Years of English
(1996/97: < 3 yrs 1%; 3-5 yrs 39%; 6-7 yrs 55%; > 7 yrs 5%)
Figure 3.5 Pie Graph for the Academic Year 1997/98 showing the Students’ Number of
Years of English
(1997/98: < 3 yrs 1%; 3-5 yrs 31%; 6-7 yrs 60%; > 7 yrs 8%)
One other factor could be affecting these figures and that is that in
the first year all the students took the test but in subsequent years
students were advised that they need not take the test if they felt that
their English was not of a high enough standard. However, in recent years
students have changed their attitude and appear to treat tests as a kind of
lottery where they hope that through some stroke of luck they will pick the
winning combination of answers. They perceive that at any rate they have
nothing to lose by trying. It may also reflect a change taking place in
schools where an increasing number of students opt for English as their
main foreign language at an earlier age and so feel that they are of a
higher level. There are also more and more private language schools
opening up all over Portugal and some of their students must now be
coming through to university in increasing numbers.
Over the years some of the students have felt compelled to add
comments to the question they were asked about how many years they
had studied English. Rather like the example above, some students gave
an explanation for their answers. This could be because they had repeated
a year, as surmised above, or that they had done all of their studies in
English in another country, for example, Australia, South Africa or
America. However, a small number of students gave value judgements
about the quality of the teaching they had received: one student replied
“três anos e pessimos” (“three years, and terrible ones”), whilst others
inadvertently showed the difficulties they had with English by answering
“I am three years” to this question.
Other students explained that their studies had taken place a number of
years before they had taken up their university place and so their English
was ‘rusty’ and, yet others, that they had studied both at school and at
private language schools, thus completing ‘double years’ or that they had
taken the Cambridge University examinations in English.
These figures suggest that assumptions made about the level of the
students’ English could be wildly inaccurate, although there is a general
trend for the students to have studied English for longer in the secondary
school6. One other consideration is that, although the students may have
studied English for seven years, most are unlikely to have studied it in the
final year of their secondary school course as they will have chosen to
follow science subjects and not the humanities. Given the kinds of
problems explained above and the increasing pressure on grades for
university entrance, it is also possible that some of the students had
gone more than two years without studying English because they had
repeated the final year to improve their grades.
The fact that students have studied English in secondary school
does not negate the fact that their knowledge of language is limited to
what was taught on a general English course which is likely to have
concentrated on spoken English and more ‘literary’ kinds of
comprehension and composition, and to have dealt little, if at all, with the
6
Drªs Ana Maria Ferreira, Dulce Ramos and Fátima Braga da Silva from the University of Porto in their
paper “Evaluation des Curricula de FLE au Portugal” (1999:333-337) show the numbers of students
studying French, English, German and Spanish in the central region of Portugal in the academic year
1994/95. The figures for English demonstrate clearly that a significant number of students continue their
language studies into the final three years before university (approximately half of those studying English in
the 3rd cycle - 7th to 9th years of school).
language of science and technology. Langkilde (1982:523) describes the
barriers students in the Copenhagen School of Economics were found to
have because “unless they are made aware of the necessity of developing a
particular method for dealing with specialized texts they will for a long
time go on treating an economic text in the same way as they treated a
chapter from Balzac or a scene from Molière in grammar school.” Tavares,
Valente and Roldão (1996:62) say that the English Programmes for
schools do mention types of texts but these are given as “dialogue,
interview and advertisement” (I:42) and discourse organisation as
“descriptive, narrative and argumentative”7 (I:48). These authors suggest
that cultural identity and understanding, within the general development
of the pupil as a responsible citizen, are the main concerns in the
programmes for modern languages in Portuguese schools at the moment.
They also point out that teachers need to be up-to-date with their training
if they are to be able to cope with the requirements of the programmes, an
issue that has been mentioned many times in relation to teaching science
and technology. The school teachers themselves usually come from a
‘literary’ or humanities background and are, therefore, unlikely to feel
comfortable teaching English for science and technology.
Research carried out with students of the fourth year of the teaching
degrees in Portuguese and English, and English and German,
demonstrates that these future teachers have difficulty with numbers in
English just as the students entering the university have (see Test Results
for New Students, Chapter 4). This situation would therefore seem to be
self-perpetuating as teachers are generally unwilling to teach something
that they themselves find difficulty with. Swales (1973:9) describes how
teachers found it “almost impossible to view their Science students’
interests as different from their own” and therefore assumed the students
would find boring what bored them.
7
My translation of “dialogo, entrevista, anúncio” and “descrição, narração, argumentação”
Overall these results would suggest that increasingly the students
could be expected to have an intermediate level of English but with no
science subject specialisation in English. The structure or form words
mentioned earlier should be quite well known to the students but, as will
be shown later in the test results, some discourse markers are less well
understood. For syllabus design, these findings complicate the question
of the level at which to pitch the instruction. The
needs of those students in the bottom 1% with less than three years of
English can hardly be met, and this will lead to their virtual exclusion
from most of the activities designed for the majority of 60% with six or
seven years of English. Equally well the top 8% may find the level pitched
beneath their capabilities and so lose motivation. These more able
students must be included in the activities carried out in such a way that
they feel stretched and that they are also making progress. It might be
possible to engage these students in helping their classmates to reach a
higher standard and incidentally help to create bonds between students in
this new environment which is seen to be necessary for successful
learning (Tavares et al. 1996).
The knowledge of science and technology that the students bring
with them to the first year of university is also variable. Some students
will have chosen to study physics in their final year at school and some
chemistry, some will have studied more mathematics than others and so
on. This implies that homogeneity of subject knowledge cannot be
guaranteed either among the students entering the foundation year
disciplines. This fact will have repercussions on all of the strategies and
skills that these students require in order to be able to perform well in
their studies of the subject matter in a foreign language.
The corpora from the Physics and Chemistry textbooks on the
students’ bibliographies will be examined using Biber’s (1988)
methodology of text variation to try to see what must be taught to the
students in the university that is specific to this text-type and, thereby, to
make the course relevant and to ‘fill in the gaps’ that the students bring
from their studies in school. Biber was conducting research into variation
between speech and writing but he provides a very explicit methodology
for the description of the linguistic characteristics of the range of genres in
English that he included in his study, which will allow comparison of the
physics and chemistry texts under study here with his results for
Academic Prose.
Biber’s goal was to include all the ‘potentially important linguistic
features’ of the different genres included in his study in order to identify
the ‘linguistic parameters along which genres vary, so that any individual
genre can be located within an ‘oral’ and ‘literate’ space, specifying both
the nature and the extent of the differences and similarities between the
genre and the range of other genres in English’. It is this identification of
the differences that needs to be studied in order to identify those areas to
be included in a syllabus for undergraduate science and technology
students. Biber claims that it is ‘bundles’ of linguistic features that occur
together in texts that ‘work together to mark some common underlying
function’.
Biber identifies 67 features from previous research, which can be
grouped into sixteen major grammatical categories: (A) tense and aspect
markers; (B) place and time adverbials; (C) pronouns and pro-verbs; (D)
questions; (E) nominal forms; (F) passives; (G) stative forms; (H)
subordination features; (I) prepositional phrases, adjectives and adverbs;
(J) lexical specificity; (K) lexical classes; (L) modals; (M) specialised verb
classes; (N) reduced forms and dispreferred structures; (O) coordination;
and (P) negation. He gives very precise definitions of each of these features
(see Appendix A) together with the functions that other researchers have
ascribed to these features. For example in Tense and Aspect Markers he
suggests that past tense forms are usually taken as the primary surface
marker of narrative; perfect aspect forms are associated with
narrative/descriptive texts and certain kinds of academic writing, and that
these co-occur with past tense forms as markers of narrative; and that
present tense verbs can be used in academic styles to focus on the
information being presented and remove focus from temporal sequencing.
By using Biber’s features it will be possible to analyse both the
register and discourse of some of the texts in the undergraduates’
bibliographies in order to apply the results to designing an appropriate
syllabus for these students.
Chapter 4 Test Results for New Students
In the first academic year of the Ano Comum and of the Preliminary
Test, that is, 1993-1994, there were approximately 1350 students
studying English in the Ano Comum, and in 1994-1995 there were
approximately 1200 new students entering the Ano Comum and about
180 repeating this discipline. These numbers have continued much the
same for the academic years 1995/96, 1996/97 and 1997/98.
As I showed earlier most of the students entering this discipline
could have studied English for either three, five or seven years in their
secondary schools. What the students have learned, have learned
incompletely or have not learned at all in their secondary schools is
crucial for syllabus design, so the results of the Preliminary Test were
analysed in order to decide what needs to be given particular attention
in the proposed syllabus.
already obtained the Cambridge University Certificate of Proficiency in
English which is a qualification regarded as a minimum English teaching
qualification by the Ministry of Education in Portugal. An innovative
decision was therefore taken to test all the potential students
immediately at the beginning of the academic year and give all those who
were deemed to have a sufficient knowledge already the opportunity of
being excused from the discipline altogether. This decision was
applauded by the student body who suggested unsuccessfully that they
would like it extended to other core disciplines. The effect of the decision
to innovate in such a way was to reduce class size a little in an attempt to
give the less proficient students more time and attention in class and to
permit those students with greater proficiency to concentrate their efforts
in other areas where they might not be so proficient. Many students
started learning English (mainly in private schools) whilst they were still
in primary school and this early teaching has also come to be seen as
beneficial in state education in Portugal. Changes have been introduced
in the curriculum to permit different schemes of foreign language study
often also extending this to the final years of secondary schooling for all
children no matter what their core curriculum. Innovation for
undergraduates on science and technology courses has also been the
focus in tertiary education. Students who were found to have great
competence in English were also considered to be likely to be demotivated
by being in a mixed ability class with over forty other students. In actual
fact some students who had expected to be excused from the discipline
were surprised to learn that it was their knowledge of science that let
them down in the test. The specific English being tested went beyond the
mundane day-to-day usage of children and required a more mature,
informed view from students.
As was mentioned earlier, all the new students coming in were
tested to see if some of them could be offered the chance of not taking
this discipline at all because their English was considered to be of a
sufficiently high level. This level would correspond to already knowing
enough English to be able to pass comfortably the kind of test that they
would be given at the end of the year after studying specific English for
science and technology in large mixed ability groups for two hours per
week for one academic year. In other words, a proficiency test was needed
to evaluate the students’ knowledge. It was decided that an adequate
initial standard of English would equate with a mark of fifteen or above
out of twenty.1
The test had to be one that could be administered and marked
easily, given the numbers involved. A multiple choice format was chosen
as an objective test and so that a template could be used for ease of
marking. A short paragraph was also included to verify the results of the
multiple choice test. This was changed to a reading comprehension test
in the fifth year of the test as it was felt that this area of competence
needed to be checked so that we could feel reasonably confident that the
students who passed the test well were capable of coping with the
reading that they would have to do in English on their courses. The
ability to write well in English was also considered to be less important to
the students’ immediate curricular needs. Four versions of the same test
were produced in order to avoid copying; this was later reduced to two
versions because this proved much easier for the writers of the test
and yet sacrificed nothing of the security of the testing. The test was
made up of both grammatical and vocabulary
items as both of these areas were deemed pertinent and specific to the
language of science and technology which the students would need to
cope with in their studies.
1
It was considered that the students would be disinclined to accept our offer if their mean mark was lower
than this, which would defeat the object in view of reducing class size and not wasting student study time.
The results of the test were analysed and those students who
obtained a grade of fifteen or more were duly informed that they need not
attend classes and indeed had already obtained a final mark for the
course (fifteen and above). This does not mean that all the students thus
informed (about 10% of those who took the test) decided to accept this
result. No bar was placed on these students attending the course, if they
so wished, and indeed some did choose to attend. The students could
also choose to take the examinations at the very end of the academic year
if they felt that they could do better than they had initially. Some
students felt that this was possible after they had had access to the
materials used in the discipline from which they could then study some
of the relevant scientific and technical English which they perhaps felt
they were unsure about initially.
This test was analysed and refined for use in future years but
results for the first years show that the major discriminators were
specific vocabulary and grammatical items such as the present perfect,
second conditional, gerunds, phrasal verbs and specific lexical items.
Increasingly items have been included in this preliminary test which
reflect the syllabus of the first year, items on pronunciation and
numerical knowledge for example.
The test that was used in the first academic year, 1993-1994,
consisted of fifty multiple choice questions and a short, 100 word,
paragraph on a given topic. The reason for this format was first and
foremost that it would be very time-consuming to administer any other
sort of test to such a large body of students and be able to publish the
results early enough so as not to take up too much of limited teaching
time. The written part of the test acted as confirmation of the result
obtained in the multiple choice test. The topics given on the first test
were:
i) The importance of computers for students at university.
As the students had to write only 100 words on one of these topics,
they had to be extremely concise. Writing such a short amount is often
more difficult than permitting the students to write as much as they
wish. Indeed, many students attempted to go beyond the specified limit
whilst others did not even attempt the written section at all. The
questions also required an expository or argumentative style of writing.
Although discrete item tests are not considered very valid, they do
have the advantage of being reliable. As Weir (1988) points out, the test
can also be made more valid by taking into account the needs of the
students on their individual courses. The different departments took the
optimal view that students would need all four skills of reading, writing,
speaking and listening in order to pursue an academic career (see later
4.2 Needs Analysis by University Department) but the constraints imposed
by the length of time available for the discipline meant that the goals
would have to be somewhat more short-term and reflect the arguably
more receptive skills of reading and listening. The latter could only
represent a small percentage of the whole syllabus even so. Therefore, the
syllabus that the students would subsequently pursue could not be
considered communicative in any modern sense of that term. Reading
and some listening would form the bulk of the syllabus and these would
be approached in a way that could give the students ‘enabling skills’ in
the hope that given time they might build on what could be
taught/learned in such a small space of time. In other words, the
methodology used would be as learner centred as possible in order to
meet the needs of the individual students as far as this could be
achieved. The test then had no reason to reflect other methodological
aims. A greater allocation of course time and resources would have
warranted a more comprehensive test.
The questions on this first test started with the simple present
tenses, negatives and question forms and went on to modal
constructions, conditionals, phrasal verbs and passive constructions. In
other words, the questions progressed from the accepted ‘easy structures’
to the more complex. Some
specific vocabulary questions were also included. The results showed that
approximately ten per cent of the students enrolled had achieved a mark
of fifteen or more and could then be released from the discipline.
However, evaluation of the test results also showed that ten per cent of
the students could not competently handle what are considered basic
structures. For example, present simple question forms, present
continuous and present perfect tested in the following way:
Questions of the type:
3. ____________________ coffee?
A Do she like B Likes she C Does she like D Like she
caused nine per cent of those tested to make an error and eleven point five
4. I ____________________ English.
A am study B studying C studies D am studying
Most of the students tested, that is 96%, could not answer the following
correctly:
Certain items like specific vocabulary, prepositions and the subjunctive
More than 50% of the students could not manage questions on the
second conditional (67%), indirect question forms (60%), the future perfect
(62%), “suggest” with a direct and indirect object (70%), the phrasal verbs
“get over” and “put in for” (54% and 61%) and a further five vocabulary
questions including such items as “traffic jam” and “experiment” (50% and
62%). With such generalised difficulty, the syllabus must obviously take
such language deficiencies into account, as teaching syllabi always
consider the average student. Students at the extremes, whether higher or
lower, are inevitably fewer in number, and so those who represent the
middle ground, the median or, more properly, those within one standard
deviation (+ or – 1) of the mean on the normal curve found from testing,
are always those taken as the ‘average’ students for whom any course is
designed.
This is contrary to many of the older systems, particularly of higher
education, which aimed to teach an elite group with all others falling by
the wayside. The number of students involved in modern education in
developed countries necessitates an attempt to raise the general level of
education of all of those involved in the education process and
necessitates new methodologies to achieve this aim. Therefore those test
results that show widespread difficulty but not almost total impossibility
for students are taken as items that need to be included in the syllabus
in order to raise the standard of English of the majority of the
undergraduates.
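The selection principle just described can be illustrated with a short sketch. It is offered only as an illustration and not as the procedure actually followed: the error percentages are those reported above for the first test, whereas the function name and the two cut-off values (50% for ‘widespread difficulty’ and 90% for ‘almost total impossibility’) are assumptions introduced here.

```python
# Error percentages reported above for the first (1993-1994) test.
first_test_errors = {
    "second conditional": 67,
    "indirect question forms": 60,
    "future perfect": 62,
    "'suggest' with direct and indirect object": 70,
    "phrasal verb 'get over'": 54,
    "phrasal verb 'put in for'": 61,
    "item missed by 96% of students (not named)": 96,
}


def syllabus_candidates(errors, lower=50, upper=90):
    """Return items whose error rate lies strictly between the cut-offs.

    Both cut-offs are hypothetical values chosen for this sketch:
    above `lower` counts as widespread difficulty, and at or above
    `upper` the item is treated as almost impossible for the students.
    """
    chosen = [item for item, pct in errors.items() if lower < pct < upper]
    # List the hardest items first.
    return sorted(chosen, key=lambda item: errors[item], reverse=True)


for item in syllabus_candidates(first_test_errors):
    print(f"{first_test_errors[item]:>3}%  {item}")
```

With these assumed cut-offs the six structures listed above are retained for the syllabus, while the item that 96% of the students could not answer is set aside.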
The test that was used in the second academic year, 1994-1995,
also consisted of fifty multiple choice questions and a short, 100 word,
paragraph on a given topic. The reason for keeping exactly the same
format as in the first year was that the numbers involved continued
to prohibit almost any other practical possibility. However, this time the
test items were altered to incorporate some of the items considered
fundamental to the course as it had been taught in the first year of
operation of the Ano Comum. Other items that were considered to be
inadequate discriminators, after the test results had been studied for this
purpose specifically, were eliminated. Thus further validation of the test
was incorporated without sacrificing either any of the reliability of the
test, its objectivity or, above all, its speed of administration and
correction.
Overall results were now also available about pass rates and grades
of the first year on this foundation course and these results also
validated the test in that the percentage for allowing students to choose
not to take the course at all equated well with that of all the students of
the year reaching a high grade, that is, 15 or more (approximately 12%).
The items considered fundamental that were now included in the
test covered both the specific vocabulary that had been taught during
1993-1994 and an attempt to assess the students’ awareness of
pronunciation. The results this time showed that approximately 13% of
the students had reached a level which was considered adequate, and
could be released. The proportionate increase in the number of students
released was most likely to be due to the fact that, when the answers to
the query about the numbers of years they had studied English were
collated, it was found that 57% of the students had studied six years or
more (43% had studied five years or fewer) whereas in 1993-94 33.4% of
the students had studied six years or more and 66.7% had studied five
years or fewer (see 3.2.1 The Students’ Level of English).
Nevertheless, when the answers to the multiple choice questions
were once again analysed, it was found that the students continued to
have difficulties with modals (97%), direct and indirect objects after
“suggest” (63%), phrasal verbs (71% and 83%), reciprocal pairs (63%) and
the subjunctive (89%).
It was perhaps less surprising to find more than 75% of the
students having problems with those more specific vocabulary items that
had been introduced. Questions like:
When light enters another medium it ________________ .
A reflects B absorbs C bends D glows
More than half of the students (52%) could not identify the sounds
of the alphabet in the following:
The letter “A” does not contain the same sound as __________ .
A “J” B “K” C “H” D “Q”
There would appear to be some strange discrepancies here. If
students find phrasal verbs as easy as (or easier than) telling the time or
the present perfect, and if the future perfect is easier than questions
using an auxiliary, something is apparently going wrong somewhere,
given that, in the first test, 62% of students had difficulty with the future
perfect question and 54% and 62% with the phrasal verb questions.
Although it is difficult to say what the exact cause of these phenomena is,
it may be attributable to the fact that more emphasis may have been
placed on what is considered difficult in previous teaching/learning
situations and so the students have fixed these items better. The
similarity or difference between English and Portuguese in some of these
structures may also explain why what is difficult for other students is
not necessarily so for Portuguese students: where the languages are
similar, as with the future perfect, the structure may be easier, and
vice versa where they differ, as with the present perfect, which is not
used in the same way in the two languages.
It is even possible that the idea of what is difficult for students to
learn is in fact incorrect. McDonough (1980:311) says
Nevertheless, in terms of course design it does indicate that certain
“basic” items cannot be ignored if 25% of the students cannot cope with
them adequately, nor should we assume that time has to be spent on
teaching lists of phrasal verbs when the students are, in the majority,
able to cope with the more common ones adequately. Indeed, work done
on corpus studies for the COBUILD project suggests that six common
verbs account for nearly 30% of the ‘most important’ phrasal verbs as
mentioned earlier. The use of corpus studies to decide many such
questions for course and materials design is becoming increasingly
important (see Wichmann, Fligelstone, McEnery, Knowles (eds. 1997),
Biber, Conrad and Reppen (1998), McCarthy and Carter (1994), Stubbs
(1996)).
Only two topics were offered, because when three topics had been offered,
one was invariably largely ignored. These topics required the students to
use either the past tense as in (i) or the future as in (ii).
The fifty multiple choice questions gave the following error percentages:
Question inversion        24%    Adjective + enough        27%
Reciprocal Verbs          25%    Future Perfect            17%
Present Perfect           22%    Subjunctive               91%
Telling the Time          21%    Passive                   24%
Present Perfect           11%    Pronunciation             81%
Simple Past               16%    Modal verb                60%
Comparative Adjective     17%    Pronunciation Alphabet    47%
Time Clause               25%    Conditional (advice)      32%
First Conditional         34%    Reciprocal Pairs          84%
Superlative               15%    Relative Pronoun          25%
Advice (had better)       94%    Preposition               83%
Direct Object (lack of)   74%    Phrasal Verb              50%
Modal                     21%    Conjunction               36%
Second Conditional        47%    Vocabulary                54%
Infinitive                56%    Vocabulary                84%
Indirect Question         36%    Vocabulary                67%
Modal (past)              32%    Possessive Pronoun        37%
Future Continuous         73%    Preposition               61%
Irregular Verbs           74%    Vocabulary                15%
Past Continuous           18%    Vocabulary                32%
Past Perfect              46%    Vocabulary                49%
Past Tense                39%    Reciprocal Pairs          59%
Phrasal Verb              67%    Reciprocal Pairs          44%
gerund), modals, and pronunciation recognition. Perhaps more surprising
on this test is that one of the questions on the present perfect proved to
be an inadequate discriminator in that it was correctly handled by almost
90% of the students taking the test. This particular question was the
following:
7. She ____________ to England
A. have never been B. has never be C. has never been D. have never be
Giving advice using “had better” was the question 94% of the students
got wrong. The difficulty here is almost certainly the fact that the full
form “had better” as opposed to the contracted “’d better” was given. This
allowed confusion between “would better” and “had better” in the
distractors. The question was the following:
The test for this academic year continued much as before with one
or two modifications such as the topics for the paragraphs which were
changed to:
i) The worst dangers of pollution.
ii) What the world will be like after the year 2000.
Table 4.2. Analysis of 1996/97 Test Results by Item.
Test Item                 Percentage Error    Test Item                 Percentage Error
Pronouns                   7%                 Conditional               38%
Past Tense                35%                 wish + past perfect       36%
Question inversion        27%                 Adjective + enough        25%
Reciprocal Verbs          29%                 Future Perfect            22%
Numbers in Words          78%                 Subjunctive               89%
Telling the Time          38%                 Numbers                   29%
Present Perfect           58%                 Pronunciation             87%
Simple Past               25%                 Modal verb                50%
Comparative Adjective     31%                 Pronunciation Alphabet    53%
Time Clause               23%                 Conditional (advice)      32%
First Conditional         33%                 Reciprocal Pairs          67%
Spelling                  19%                 Graeco-Latin Plural       69%
Advice (had better)       89%                 Preposition               92%
Direct Object (lack of)   76%                 Phrasal Verb              48%
Possessive Pronoun        15%                 Conjunction               28%
Second Conditional        43%                 Vocabulary                52%
Infinitive                50%                 Vocabulary                92%
Indirect Question         58%                 Vocabulary                74%
Modal (past)              26%                 Possessive Pronoun        39%
Future Continuous         70%                 Preposition               64%
Irregular Verbs           72%                 Vocabulary                53%
Conjunction               31%                 Vocabulary                24%
Past Perfect              44%                 Vocabulary                46%
Comparative               35%                 Reciprocal Pairs          64%
Phrasal Verb              56%                 Reciprocal Pairs          43%
Many of the results on this test continue to confirm what had been
found in previous tests but the inclusion of new items, such as numbers
in words, tested competence in other areas which pertain ever more
closely to the students’ future studies. These most likely were not taught
at all in school. The question on numbers in words includes the different
use of the comma and full stop between Portuguese and English. The
comma represents a decimal point and the full stop division into
thousands in Portuguese and vice versa in English.
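The contrast between the two conventions can be shown with a brief sketch. The function name is hypothetical and the example assumes a well-formed number written in the Portuguese convention; it simply exchanges the two separators.

```python
def portuguese_to_english_number(text: str) -> str:
    """Swap the separators of a number written in the Portuguese style.

    Portuguese: full stop groups thousands, comma marks the decimal.
    English:    comma groups thousands, full stop marks the decimal.
    """
    swap = str.maketrans({".": ",", ",": "."})
    return text.translate(swap)


print(portuguese_to_english_number("1.234.567,89"))  # 1,234,567.89
print(portuguese_to_english_number("3,5"))           # 3.5
```

A student who reads the English figure 1,234 as "one point two three four" is applying the Portuguese convention, which is the kind of error this test item was designed to reveal.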
The following question caused 74% of the students difficulty:
Similarly, the Graeco-Latin plurals were tested and showed that the
students had difficulty here too. Once again this is probably because
these plurals had not been specifically taught.
The following question caused 69% of the students difficulty:
37. Individual teachers may use different ________ when marking the test.
A. Criterions B. criteria C. criteriae D. criterii
(contrast)
Conjunction (reason)              15%    Graeco-Latin Plural            79%
Conjunction (reason)              48%    Relative Pronoun               17%
Conjunction (contrast)            23%    Vocabulary                     77%
Conjunction (cause and effect)    32%    Vocabulary                     75%
Pronunciation                     35%    Conditional                    26%
Vocabulary                        43%    Metric/Imperial Equivalence    69%
Conjunctions (cause and effect)   39%    Reciprocal Pairs               67%
Superlative                       52%    Conjunction (contrast)         17%
Adjectives                        71%    Translation (false friends)    51%
Pronunciation                     25%    Vocabulary                     85%
Comparative                       80%    Conjunction                    28%
subject areas. This shows the detrimental competitive nature of
educational systems, which can lead to emphasis often being given only to
the grade obtained, rather than to ability or performance. A change of
emphasis would focus on cognitive knowledge and would preclude
specific ‘teaching for the test’ aimed at achieving a certain test
score.
One other area that has obviously continued not to be given much
stress in school curricula is numbers2. As was the case in 1996 with 74%
of students mentioned earlier, 71% of the students in 1997 could not
handle the correspondence between a number and its form in words.
Of almost equal difficulty (69%) was metric to imperial
equivalences, in this case recognising the nearest equivalent to 100 yards
in metres. This, as was mentioned earlier, may have nothing at all to do
with the language involved but be much more a question of cultural
knowledge, recognising the difference between measurement systems in
different countries. Although students studying in the Ano Comum were
not expected to learn the equivalent measurements and conversions of
metric to imperial measurements and vice versa, because this
information is readily available in a good dictionary, students were
expected to have some idea of the relative sizes so that logical
assumptions could be drawn. To take an example: if a student were faced
with the sentence Scientists can calculate the distance of the earth to the
moon to within six inches, the student should recognise that six inches,
which is equivalent to approximately 15 cm, cannot be the distance from
the earth to the moon and that, therefore, this sentence must be a
statement about accuracy of measurement and not about the actual
measurement of the distance mentioned.
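The rough equivalences referred to above can be checked with a minimal sketch. It assumes only the exact definitions 1 inch = 2.54 cm and 1 yard = 0.9144 m; the function names are illustrative and are not part of the test described here.

```python
# Exact conversion factors (international inch and yard).
CM_PER_INCH = 2.54
M_PER_YARD = 0.9144


def inches_to_cm(inches: float) -> float:
    """Convert a length in inches to centimetres."""
    return inches * CM_PER_INCH


def yards_to_m(yards: float) -> float:
    """Convert a length in yards to metres."""
    return yards * M_PER_YARD


if __name__ == "__main__":
    # The two quantities mentioned in the text.
    print(f"6 inches  = {inches_to_cm(6):.2f} cm")
    print(f"100 yards = {yards_to_m(100):.2f} m")
```

Students are not expected to carry out such conversions from memory; the point is only that six inches is of the order of 15 cm and 100 yards is a little under 100 m.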
2
A similar test was tried on the students in the fourth year of their teaching degrees which showed that these
future teachers also had difficulty with numbers and were often unaware of the contrasts between the
English and Portuguese use of the comma and point in numbers. It is not surprising, therefore, that those
students who had only studied English in school should find this item difficult, a situation which is
unlikely to change significantly in the near future.
Similarly, 80% of the students could not distinguish between the
relative sizes of a British billion and an American billion (10¹² and 10⁹
respectively). Students need to have the ability to question such items
and not merely to assume that the same thing is meant by all those using
the same word, in this case the word billion. However, it is arguable
whether this item should in fact cause any difficulty for these
undergraduates, because in Portuguese um bilião stands for a thousand
million exactly like the American measurement and the aberration here is
the British measurement of a million million, which may be on the point
of fading out of use³.
Other frequent items like the adjective wide caused considerable
difficulty (71% of students got this item wrong). This item appears in the
frequency list 1876 times in 1515 articles and in both the physics and
chemistry corpora studied here, which would suggest that this adjective
is essential for undergraduates and has not been learned by almost three
quarters of those entering the university.
False friends (eventually - eventualmente) and specific vocabulary
(clerical work associated with offices not the clergy) items were the worst
overall items causing 90% and 85% respectively of the students to make
errors. Graeco-Latin plurals also caused considerable error (79%)
although these are often seen to be significant in scientific writing (see
later results 5.1.6 Plurals from Latin and Greek for a further discussion of
this).
Conditional sentences and question forms did not appear to cause
undue difficulty for most of the students who took this test with 26% and
20% making errors on these items. Some linking devices and deictic
pronouns caused more difficulty than others; this could be attributed to
3
Recently (BBC World Business Report 8/4/99) even British use (certainly in economics) has tended to
favour the American thousand million. The answer, aired on the BBC World Business Report programme,
to an e-mail inquiry from a viewer confirmed that the BBC were in fact using the American billion in their
reporting.
their being relatively unusual. Only 17% of the students had difficulty
with these but 55% had difficulty with thereby and whereas.
It cannot be taken for granted that the students do not know some
items included in ‘advanced’ grammar books, nor can it be assumed that
the basic structures have been learned soundly. What these results do
suggest however is that it would be of more use, for example, to include a
list of irregular verbs for the students to take away and learn rather than
a list of phrasal verbs. In other words, the students’ proficiency profile
must guide the choice made about what language should be taught in
the first year.
In addition, corpus analysis (on the LOB and the Brown corpora)
for a syllabus for use with students in Germany by Mindt (1997:40-50)
and Tesch (1990) has shown that, with careful grading, students can
learn a higher percentage of the most frequent and therefore important
irregular verbs (apart from be, have and do) even if they stop learning
after a short period of time. They contrast their list with alphabetical lists
of the kind normally presented for students to learn and show that after
learning five of the verbs (say, make, go, take, come) on their list the
students will be “familiar with 27 per cent of all irregular verb forms of
English”. The corresponding figure for the alphabetical list is 3.6 per cent
(beat, become, begin, bet, bite). After learning ten of the verbs on their list
(say, make, go, take, come, see, know, get, give, find) the student “has
mastered 45.6 per cent of the verb patterns of irregular verbs”. The
combination of what the student already knows and the results of
corpora analyses like the one described here can provide a much more
reasoned syllabus that aims to make maximum use of the students’
study time. Mindt (1997) and Tesch (1990), like Renouff (1984), also find
that their corpora analyses show the discrepancies that occur between
actual language use and what is presented in coursebooks. (Tesch
studied some and any, Mindt modal will and would and Renouff see).
These are taken up in more detail later in Chapter 7. Similar research on
the corpora shows that certain irregular verbs are more suitable for the
students of science and technology (see 6.3) and would cover the
difference in the transition from secondary to tertiary education for
undergraduates.
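The contrast between a frequency-graded list and an alphabetical list can be illustrated with a short sketch. The token counts below are hypothetical and are not Mindt's data; the function name `cumulative_coverage` is likewise an assumption made for illustration. The sketch only shows how a cumulative coverage figure of the kind quoted above might be computed.

```python
from typing import Dict, List


def cumulative_coverage(counts: Dict[str, int], order: List[str]) -> List[float]:
    """Return the cumulative share (per cent) of all irregular-verb tokens
    covered after learning each verb in `order`, one verb at a time."""
    total = sum(counts.values())
    covered = 0
    shares = []
    for verb in order:
        covered += counts.get(verb, 0)
        shares.append(100.0 * covered / total)
    return shares


# Hypothetical token counts for a handful of irregular verbs (NOT corpus data).
toy_counts = {"say": 400, "make": 250, "go": 200, "take": 100, "beat": 30, "bet": 20}

# Two possible learning orders: most frequent first, and alphabetical.
by_frequency = sorted(toy_counts, key=lambda v: toy_counts[v], reverse=True)
alphabetical = sorted(toy_counts)

if __name__ == "__main__":
    print("frequency order:", cumulative_coverage(toy_counts, by_frequency))
    print("alphabetical:   ", cumulative_coverage(toy_counts, alphabetical))
```

With any realistic, skewed distribution the frequency-ordered list reaches a high coverage after very few verbs, which is the effect the graded list is designed to exploit.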
In order to meet the demand for English language teaching for the
Ano Comum in accordance with what the departments whose students
are involved in the foundation year feel is necessary, a simple needs
analysis was requested from colleagues in other university departments.
Colleagues were asked to indicate their views as to why they thought it
necessary for the students to study English language for their courses.
This was the first stage in the needs analysis, the results of which are
given below.
Colleagues in other departments asked that the students be able to
speak fluently, read fluently and listen effectively. Below is a
representative sample of what our colleagues in other departments
perceive as the English needs of their students, taken from the replies
received to the initial simple needs analysis, which consisted of a letter
from the English Area Co-ordinator to the different departments involved
in the Ano Comum asking for their comments. Nine replies were received
relating to different courses included in the foundation year. Not all
departments replied. The replies received were from the co-ordinators
responsible for the following courses: Mathematics (Teaching of); Applied
Mathematics and Computation; Geological Engineering; Biology; Biology
and Geology (Teaching of); Ceramics and Glass Engineering; Materials
Engineering; Engineering and Industrial Management.
• Colloquial English
We consider that the English course should have as an objective, among others,
The third point given in this needs analysis, that of being able to deal
with bibliographies that were largely in English, was taken as particularly
pertinent to all of the students involved in the foundation year. The aims
put forward by these replies are obviously ideal and would stand the
students in good stead for their futures both in terms of further study
(possibly abroad) and in their careers. However, reaching the ideal is
limited by a number of constraints which few of these same respondents
cared to acknowledge in the curriculum they had created.
4.3 Constraints
4
The distance-learning courses are run by the University of Aveiro through the Internet and are open to all
students who must enrol and obtain a login to be able to access the relevant material. Working students
are obviously targeted by this scheme of work besides those repeating disciplines. Other systems, like
course tutor support schemes have been instigated to help students to plan their studies and cope with the
psychological strain of the move to a university environment which studies (Tavares et al. 1996)
mentioned earlier had shown to be a reason for lack of success in the first year.
The third constraint was that the students on the course were not
homogeneous, either in terms of the number of years they had studied
English (see earlier Chapter 3, 3.2.1 The Students’ Level of English) or in
terms of the subject which they had chosen to pursue for their degree.
The latter was again a change, as in the past some courses had had
English as a discipline and these groups had been homogeneous in terms
of their degree subject, for example, in Management and Electronics. This
meant that subject specific coursebooks such as English for Electronics
could be used for the appropriate group but that now this was no longer
possible.
The overall percentage of students with more than five years of study
of English has gradually been creeping up from the initial 67% with five
years or fewer of study and 33% with more than five years of study, to a
complete reversal of this situation in the academic year 1997/98 with
32% with five years or fewer of study and 68% with more than five years
of study as mentioned earlier in section 3.2.1. This trend could be
accounted for in a number of ways: the information about the foundation
course year may well have become better known among those students
who were hoping to avoid language study by opting to take up other
courses, in other places, although this seems unlikely; it may be that
a general trend in the secondary schools to teach more English has
filtered through to higher education; or it may simply be a reflection of
the fact that all the students entering the science and technology courses
have now decided to try their luck at the preliminary examination and so
these figures are a much more complete representation of the student
intake. A trend towards more years of English in secondary schools is a
positive development as it will only help students at university to come to
grips with studying through English language textbooks.
Nevertheless, the aim of the foundation year to give all of the student
intake into science and technology courses a sound basis on which to
build their studies in later years and to maximise their learning potential
appears to be being subverted. It seems to have become another hurdle
for students to jump so that strategies like evasion, gamesmanship and
cramming are encouraged. It is interesting to speculate whether students
would voluntarily study English if their departments recommended but
did not insist on it. It is possible that quite sophisticated translation
services would be set up on the periphery of the university if certain texts
were seen to be essential and were only available in English. Students
traditionally claim that they have too little time for their studies and this
would be one means to avoid having to study English at all. This line of
thought leads directly on to the subject of motivation in the students
once again. If the subject is only seen as a hurdle to be jumped, the
students’ focus will necessarily be on test results rather than on
achieving their maximum potential. This means that the course will have
to subversively achieve this end against the wishes of those students who
take this position. Students who are at the bottom of the scale with few
years of English may also be discouraged from the outset or may decide
to take further courses in English outside the university. One means of
persuading the students to focus on learning more English is first and
foremost by encouraging them to attend most of the classes. This has
been successfully achieved by linking evaluation with class attendance.
Students are offered different evaluation schemes depending on whether
they attend a minimum of two thirds of the classes given. This is quite a
normal procedure in the university for practical classes and those that
use continuous evaluation of students so there is no real difficulty in
applying continuous assessment to these classes. One other effect of this
scheme is that the need for hundreds of individual interviews,
traditionally taking twenty minutes each with all the practical
administrative difficulties that this implies in a short examination period,
is avoided.
The other means to try to engage the students is to make the work as
appropriate as possible for them. This can be approached through the
analysis of the language of science and technology and the language
requirements of the books in English on the bibliographies which they
have to deal with. Chapter five will take up this aspect of the students’
language needs.
Chapter 5 Scientific English for
Undergraduate Learners
Analysis of Results
1
Minugh (1997:79) in his use of Newspaper CD-ROMs for teaching at the University of Stockholm also
mentions this lack of information about the actual size of published CD-ROM material to be one of the
disadvantages, although he hopes that the companies involved in producing them can be persuaded to
incorporate this information in the future. He further laments the fact that “the most frequent words are
classified as “noise” and cannot be searched for.”
2
These are sometimes referred to as structure words of which it is estimated that there are about two
hundred (see Bowen, Madsen and Hilferty (1985:194)), form words (see earlier 3.1.7, Bright and
McGregor (1970)) or function words (see Biber et al. (1998:29)). However, it is clear that the Grolier
includes any very common word and not only words such as articles, prepositions, and pronouns.
3
Sinclair recommends that the same areas of books are not studied in case they demonstrate only one
specific variety of English, for example the English associated with introductions or first chapters. See
later bibliography for a similar justification of Sinclair’s hypothesis.
When an item is to be found in these texts the word is marked with
the text which contains it. For example, able occurs in both the
chemistry and the physics corpora and so this word is marked with the
letters C and P after the range and frequency figures given in the
multimedia encyclopaedia listing.
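The marking procedure described above can be sketched as follows. This is a minimal illustration and not the procedure actually used for the study: the two sample texts are invented, the helper names `word_set` and `mark_entry` are hypothetical, and tokenisation is reduced to runs of unaccented letters.

```python
import re
from typing import Set


def word_set(text: str) -> Set[str]:
    """Return the lower-cased word forms occurring in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def mark_entry(word: str, rng: int, freq: int,
               chemistry_words: Set[str], physics_words: Set[str]) -> str:
    """Format an entry as 'word range/frequency', followed by C and/or P
    when the word occurs in the chemistry and/or physics corpus."""
    marks = ""
    if word in chemistry_words:
        marks += "C"
    if word in physics_words:
        marks += "P"
    return f"{word} {rng}/{freq}{marks}"


# Invented stand-ins for the two corpora (NOT the textbook texts).
chemistry = word_set("These reactions allow the gas to escape at high altitudes.")
physics = word_set("Measure the angle; the mirrors allow light to pass.")

if __name__ == "__main__":
    # Range/frequency figures copied from the listing; markers computed from the toy texts.
    print(mark_entry("allow", 362, 411, chemistry, physics))
    print(mark_entry("angle", 190, 357, chemistry, physics))
    print(mark_entry("alive", 132, 140, chemistry, physics))
```

A word found in neither corpus is simply left unmarked, as with alive in the listing.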
Some interesting anomalies occur with the words listed including
those given in the Grolier encyclopaedia as ‘too common’. Under the letter
‘a’, across and around do not appear in the 50,000 words of the chemistry
text at all; under ‘b’, be, became, become, been, begun, being, and
bibliography do not appear in either of the corpora and only by appears in
both corpora. It is not surprising that the word bibliography should not
appear, as the bibliographies were excluded from the corpora: they
traditionally occupy a position at the end of the textbook which was not
taken to be representative of either physics or chemistry texts per se.
The Grolier’s idiosyncratic pronunciation scheme is also to be found
in the list. Each of these items is identified by the abbreviation (pronun),
for pronunciation, immediately after the entry.
airfields 177/186 *animals 1122/2441CP *architectural 392/607
*annexed 186/207 *architecture 1028/2856
airplane 115/182CP announced 233/277 *archive 978/2752
airport 153/206P *annual 1033/1512C *area 2756/4964CP
*album 118/191 *annually 410/474C *areas 1820/3630CP
*alcohol 221/440C another 2209/3383C argued 327/394
*algae 152/340 *anthology 113/134 argument 111/148P
alive 132/140 *anthropology 102/218 arid 201/298
alleged 123/134 *anti 448/599 arise 183/205P
*allegorical 101/132 *antibiotics 119/197 *aristocracy 123/145
*alliance 316/472 *anticipated 133/143 *aristocratic 157/184
allied 407/685 *antiquity 124/136 arm 215/270P
allies 282/465 anything 114/121 armed 311/413
allow 362/411CP apart 320/368CP armies 249/454
allowed 589/708C apparatus 117/152CP arms 362/635
allowing 231/253CP *apparent 368/431 army 1255/2496P
allows 280/319CP apparently 429/488P arose 276/318
alluvial 108/149 appeal 210/261 aroused 154/159
almost 1735/2419CP appeals 106/134 arranged 361/410C
alone 421/471C appear 697/872CP arrangement 269/335C
*alphabet 101/247P *appearance 647/777C arrangements 180/217CP
already 546/653CP *appearances 102/107 array 155/181
alter 111/123CP appeared 1049/1335C arrest 112/139
*altered 198/223C *appearing 125/129C arrested 190/204
alternate 126/137 appears 553/643C arrival 233/261
alternating 145/190 *application 362/470C arrived 338/420CP
*alternative 230/276CP *applications 359/553CP *art 2573/7631CP
*altitude 239/359CP applied 1011/1315CP article 183/278
*altitudes 116/143C applies 103/107CP articles 314/421
aluminum 287/495CP apply 159/183CP *artifacts 155/192
always 783/959CP applying 147/158CP *artificial 456/672
amateur 146/227C appointed 920/1078 *artisans 117/141
*ambassador 149/181 appointment 168/187 *artist 683/
*ambitious 157/172 approach 551/783CP *artists 665/1236
amendment 251/691 approaches 205/265CP *arts 1051/1704
amino 103/286 appropriate 327/380CP ash 136/185
amount 657/1036CP *approval 150/194 aside 152/169
amounts 490/690CP *approved 246/296 asked 152/162CP
*analysis 572/894CP *approximately 842/1073P *aspect 224/255P
*analytical 106/129C *aspects 498/617CP
analyzed 124/144C Apr 1651/1796 *assassinated 183/217
*anatomy 199/306 April 508/678 *assassination 177/234
ancestor 152/173 *aquatic 170/259 assembled 106/122C
ancestors 154/192 arc 170/314P *assembly 529/862
*ancestral 105/136 arch 112/209 asserted 116/134
ancestry 120/144 *archaeological 263/349 assigned 211/230C
ancient 1900/3090 *archaeologists 117/152 assist 114/121P
angle 190/357P *archaeology 189/368 assistance 237/326P
angles 165/264P *archbishop 147/201 assistant 259/289P
*angular 126/188 *architect 508/761 assisted 157/165
*animal 1002/1793CP *architects 236/382 *associate 182/198
*associated 1197/1526CP attempted 560/658P *autobiographical 217/264
*association 763/1080 attempting 150/157P *autobiography 496/562
*associations 150/193 attempts 613/723P *automatic 133/219
assume 142/151CP attend 113/137 *automatically 115/139
assumed 548/646CP attended 327/343 automobile 289/420CP
assumption 123/154CP *attention 689/842 automobiles 186/237C
*astronomer 185/253P *attitude 116/155 *autonomous 190/252
*astronomical 193/310P *attitudes 177/230 autumn 153/199
*astronomy 302/633P *attorney 209/280 availability 100/108
athletic 104/151 *attract 205/242C available 825/1150CP
*atlas 171/281 *attracted 397/436 avant 176/217
*atmosphere 469/977C *attraction 161/187C average 975/1647CP
*atmospheric 241/377C *attractions 101/109C averages 226/250
*atom 214/594CP *attractive 173/204C averaging 116/130
*atomic 440/1037CP *attributed 271/296 avoid 228/266CP
*atoms 342/971CP audience 266/394 award 468/655
attached 461/606CP audiences 161/227 awarded 411/488
attack 526/754C Aug 1710/1875 awards 214/291
attacked 339/403 August 562/799CP aware 126/138CP
attacking 109/124 *author 732/909 awareness 138/155
attacks 328/425 *authorities 310/366 away 670/825CP
attain 126/130CP *authority 646/1000 axis 230/454CP
attained 172/187 *authorized 147/169 ay (pronun) 250/256
attempt 703/847C *authors 186/254
The lists obtained for the rest of the alphabet are given in Appendix
B with these Portuguese cognates, as defined above, removed. The words
not included, on the grounds that they are too common, have been shown
in italics at the end of the respective list for each letter of the alphabet.
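Entries in the listing above share a regular shape: an optional asterisk, the word, an optional (pronun) tag, the range and frequency figures separated by a slash, and optional C and P markers. A minimal sketch of how such entries might be read into a structured form is given below. The field names and the regular expression are illustrative assumptions, and the sketch records the asterisk without interpreting it.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    word: str
    starred: bool       # entry carries a leading asterisk in the listing
    pronun: bool        # entry is tagged (pronun)
    articles: int       # range: number of articles containing the word
    occurrences: int    # frequency: total number of occurrences
    in_chemistry: bool  # marked C
    in_physics: bool    # marked P


ENTRY_RE = re.compile(r"^(\*?)([A-Za-z]+)( \(pronun\))? (\d+)/(\d+)(C?)(P?)$")


def parse_entry(text: str) -> Optional[Entry]:
    """Parse one listing entry; return None if it does not fit the pattern
    (for instance, an entry whose frequency figure is missing)."""
    match = ENTRY_RE.match(text.strip())
    if match is None:
        return None
    star, word, pronun, rng, freq, c_mark, p_mark = match.groups()
    return Entry(word, star == "*", pronun is not None,
                 int(rng), int(freq), c_mark == "C", p_mark == "P")


if __name__ == "__main__":
    for raw in ["*animals 1122/2441CP", "arc 170/314P", "ay (pronun) 250/256"]:
        print(parse_entry(raw))
```

Returning None for incomplete entries keeps damaged lines visible for manual checking instead of silently guessing the missing figure.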
Table 5.2 Frequency and Range Results for Abstract Nouns and Adjectives.
Length                                 Long
1133 Articles - 1573 Occurrences       4041 Articles - 6701 Occurrences
20 - measurement                       46 - plant
17 - metric system                     45 - mammal
13 - lens                              40 - bird
11 - fish                              24 - flower

Width                                  Wide
175 Articles - 216 Occurrences         1515 Articles - 1876 Occurrences
7 - dendrochronology                   10 - sound recording and reproduction
5 - river and stream                   9 - ice hockey
                                       8 - television
                                       7 - mammal

Breadth                                Broad
38 Articles - 38 Occurrences           651 Articles - 802 Occurrences
1 - anthropology                       11 - plant
1 - dimension (mathematics)            7 - antibiotics
                                       5 - Antarctica
                                       5 - mammals

Depth                                  Deep
350 Articles - 496 Occurrences         829 Articles - 1177 Occurrences
13 - depth charge                      36 - deep sea life
10 - perception                        13 - ocean and sea
8 - water wave                         13 - syntax
7 - gulf and bay                       9 - geosyncline
At least two of these areas will be unknown to the average
humanities-trained teacher: dendrochronology and geosyncline.
Dendrochronology is the science of using tree rings to date structures and
events or to reconstruct past environmental conditions. A geosyncline is a
large, usually elongate depression in the crust of the earth which during
subsidence has accumulated very great thicknesses (thousands of meters)
of sedimentary, and usually also volcanic, rocks. The latter is therefore a
part of geology.
From these results it is possible to conclude that the presentation
and practice of these adjectives and nouns in teaching materials would
have to be through different contexts or settings if a ‘natural’ use of such
items was to be given. Length should, from this perspective, be presented
in a physics context for example, whilst long would be more ‘naturally’
presented in a context of biology. The noun breadth has such a low
frequency that it could be ignored but broad should be in a biology context
while wide seems more ‘at home’ in an electronics context. The results for
tall and high as ‘corresponding’ adjectives for height once again show that
different scientific settings are used with each; biology for tall and height
and physics and electronics for high, although height could also be
included in physics, chemistry or maths.
It would be possible for any item or sets of items to be examined in
this way to contextualise the setting of any of the vocabulary or syntax
that has been identified for the syllabus. This would then show the use
and meaning of these items in scientific settings, which, as was discussed
earlier in Chapter 2, does not necessarily correspond to the use or
meaning in everyday contexts.
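An examination of this kind amounts to counting, for a given word, the number of articles in which it occurs (range), the total number of occurrences (frequency) and the articles in which it occurs most often. A minimal sketch under stated assumptions follows: the three sample articles are invented, the function name `range_and_frequency` is hypothetical, and matching is limited to the exact word form.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple


def range_and_frequency(word: str, articles: Dict[str, str]) -> Tuple[int, int, List[Tuple[str, int]]]:
    """Return (number of articles containing `word`, total occurrences,
    per-article counts sorted from most to least frequent)."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    hits: Counter = Counter()
    for title, text in articles.items():
        count = len(pattern.findall(text))
        if count:
            hits[title] = count
    return len(hits), sum(hits.values()), hits.most_common()


# Invented miniature 'encyclopaedia' (NOT the Grolier text).
toy_articles = {
    "river and stream": "The width of a river varies. Width also affects flow.",
    "television": "A wide screen is now common.",
    "plant": "Leaves differ in shape.",
}

if __name__ == "__main__":
    for item in ("width", "wide"):
        print(item, range_and_frequency(item, toy_articles))
```

The article titles returned at the top of the ranking are what suggest the scientific setting in which an item might most naturally be presented.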
5.1.4 Abbreviations.
Three letter abbreviations for the months of the year appear much
more often than the full words for these in the frequency lists. Such
findings would obviously not imply that the students should only be
taught these abbreviations but they do imply that these features are the
natural ones to be included in the students’ study materials. An obvious
application here is in Tables, Figures or Graphs where abbreviations of
this type are to be expected.
5.1.5 Pronunciation Conventions.
4
It is interesting to note that phonetic transcription is not usually included in bilingual and Portuguese
dictionaries which may be a reflection of the fact that there is a general idea that Portuguese is a ‘phonetic’
language and that, therefore, transcription is not necessary. This is an increasingly debatable proposition
especially with new spelling conventions coming into being.
forms in the Grolier encyclopaedia reveals that there is usually a
substantial difference between the frequency of either the singular or the
plural of Latin and Greek root words. Some simply fail to appear at all:
parentheses (10) is a case in point, where the singular parenthesis does not
appear at all in the corpus. Occasionally there is a regular ending applied
to a Latin or Greek root word, such as indexes together with indices, with
almost the same frequency (36 and 31 respectively). However, in the case
of formulae and formulas the latter is much more frequent than the former
(4 and 88 respectively). There are very few singulars and plurals that
appear in almost equal numbers: nova (10), novae (10) and novas (10), and
stimulus (146) and stimuli (122). All the others encountered show a marked
preference for one or the other form.
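The comparison of competing forms can be made explicit with a small sketch. The counts are those quoted above; the function name `preference` and the threshold ratio of 2.0 are arbitrary assumptions chosen only for illustration, not a criterion used in this study.

```python
from typing import Dict, Tuple


def preference(count_a: int, count_b: int, ratio: float = 2.0) -> str:
    """Classify a pair of form counts. `ratio` is an illustrative threshold:
    if the larger count is at least `ratio` times the smaller, the pair is
    said to show a marked preference."""
    low, high = sorted((count_a, count_b))
    if high == 0:
        return "absent"
    if low == 0:
        return "one form only"
    return "marked preference" if high / low >= ratio else "roughly equal"


# Counts as reported in the text for the encyclopaedia corpus.
pairs: Dict[Tuple[str, str], Tuple[int, int]] = {
    ("indexes", "indices"): (36, 31),
    ("formulae", "formulas"): (4, 88),
    ("stimulus", "stimuli"): (146, 122),
    ("parenthesis", "parentheses"): (0, 10),
}

if __name__ == "__main__":
    for forms, counts in pairs.items():
        print(forms, counts, preference(*counts))
```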
Sinclair (1991:67ff) demonstrates that the singular and plural
forms of nouns are not equivalent, by documenting the different
patterning of eye and eyes. He finds that “There is hardly any common
environment” between the two word forms and they “do not normally have
the capacity to replace each other”. The plural co-occurs with adjectives
such as blue, brown, covetous, manic. The singular hardly ever refers to the
anatomical object, except when talking about injury or handicap. The
singular and plural also occur in different sets of fixed phrases (all eyes
will be on, rolling their eyes, turn a blind eye, keep an eye on). It is this sort
of analysis which highlights the fact that lexis and syntax are totally
interdependent.
Many authors have recognised the importance of Latin and Greek
terms in scientific texts but these have tended to be taught in lists of
singular and plural forms rather than in specific contexts where either the
singular or the plural would be most appropriate. The kind of language
manipulation exercise where students produce the singular or plural of a
Graeco-Latin word is generally considered unsuccessful as a
teaching/learning strategy and would ignore the semantic differences
inherent in scientific contexts. For example, in the Physics and Chemistry
corpora only two Latin and Greek singular and plural forms exist together.
These are: bacteria - bacterium and axis - axes. Formula - formulas are to be
found in the Chemistry corpus but no formulae. All other Graeco-Latin
words are found either only in the singular or only in the plural, for
example, analysis, apparatus, appendix, criterion, data, parentheses.
The Latin and Greek roots and affixes described by Strevens
(1978:193) given earlier in 3.1.8.4 are not found in their entirety in the
Physics and Chemistry corpora either. His cyto, plasma, pyro, ante, and
post, are found in neither the physics nor the chemistry corpora and many
of the other prefixes like hydro and poly are in combinations such as
hydrogen and polygon which are cognates with Portuguese. Similarly with
his suffixes, -ite is most often found in words such as white and write and -
valent in equivalent. The Grolier does present examples of all of the
prefixes and suffixes mentioned by Strevens but once again closer
examination shows that the majority of entries for the prefixes and
suffixes are not of the type anticipated by him. For example under the
prefixes hydro- and anti- the largest number of entries refer to hydrogen
and antiques. It would seem therefore from these results that these items
need not be heavily focused upon in the syllabus.
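The difficulty with searching for affixes as strings of letters can be shown with a minimal sketch. The word list is made up of examples mentioned above; the function name `words_with_affix` and the convention of writing suffixes as '-ite' and prefixes as 'hydro-' are assumptions made for illustration.

```python
from typing import Iterable, List


def words_with_affix(words: Iterable[str], affix: str) -> List[str]:
    """Return the words matching an affix purely orthographically.
    '-ite' is treated as a suffix and 'hydro-' as a prefix."""
    if affix.startswith("-"):
        return [w for w in words if w.endswith(affix[1:])]
    if affix.endswith("-"):
        return [w for w in words if w.startswith(affix[:-1])]
    raise ValueError("write suffixes as '-xxx' and prefixes as 'xxx-'")


# Word forms taken from the examples discussed in the text.
sample = ["white", "write", "equivalent", "hydrogen", "polygon", "antiques"]

if __name__ == "__main__":
    for affix in ("-ite", "-valent", "hydro-", "anti-"):
        print(affix, words_with_affix(sample, affix))
```

A purely orthographic search of this kind returns forms such as white and write for -ite, which is why each match has to be inspected before it is counted as an instance of the scientific affix.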
White (1998:275) argues for a distinction to be made between
science and technology texts. He (ibid.) says that science texts show
preference for non-vernacular Latin/Greek borrowings, whereas
technology texts prefer elaborated nominals where all the elements are of
vernacular derivation together with acronyms or provisional or ‘proto-
nouns’. A proto-noun is a word that is now commonly used as a noun but
was originally an acronym such as scuba (self-contained underwater
breathing apparatus) or laser (light amplification by stimulated emission of
radiation). However, White (1998:285) claims that the “fact that classical
scholarship is no longer so widespread may offer a part explanation as to
why Greek/Latin coinings have declined” in science, even though “they
remain the norm”. The textbooks studied show a combination of both the
science and technology features that White found with the Latin/Greek
borrowings mentioned above, together with vernacular and proto-nouns
such as scuba.
only three could be considered sufficiently different from Portuguese to
warrant attention (amateur, apparatus and available). It would seem from
this small comparison that it is true that the results found through
intuition and those found in reality through empirical research are
different. However, it could be that the genre studied here differs
considerably from that used by Bright and McGregor about which,
incidentally, they give no specific information.
As mentioned earlier, Hoffman (1981), Darian (1981), Weber (1981)
and White (1998) all argue that even the same lexical item can (and
usually does) take on new meaning when used in a scientific text. White
(1998:285) further suggests that the use of acronyms in technological
texts is because these are “eminently well equipped to meet the constant
need for new lexis to map the ever-unfolding reality of technological
development”. The fact that only part of the lexis of science and technology
can be seen to be in any way stable suggests that any study such as this
will only be representative as long as the textbooks are considered
sufficiently up-to-date for use with undergraduates. The textbooks that
have been selected for use in this study continue to be in the students’
bibliographies. Despite the fact that newer editions have recently been
published, these have not undergone any significant changes.
5.2.1 Typographics.
can be constructed ……
See Section 1.7 to review the definition of a newton
In SI units, pressure is measured in pascals (Pa), defined as
one newton per square meter:
“Effusion
Whereas diffusion is a process by which one gas gradually mixes with another, effusion is
the process by which a gas under pressure escapes from one compartment of a container to
another by passing through a small opening. Figure 5.20 shows the …..”
5.2.2 Titles, Subtitles, Summaries and Conclusions
for in-class assignments whilst making smooth transitions from topic to
topic.
must go beyond the simplest interpretation of these features. Tarone et al.
(1981:201) find in the astrophysics journals that they studied that:
One of the most striking characteristics of the sentences used in these papers is
the fact that lengthy equations are embedded within them, and must be
arranged in such a way as not to interfere with the reader’s processing of the
basic grammar of the sentence. Because of end-weight such equations are often
placed at the end of clauses, and the use of active or passive verb forms is often
conditioned by this requirement.
If we had used the empirical formula HO for the calculation, we would have written
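\[ \%\mathrm{H} = \frac{1.008\ \mathrm{g}}{17.01\ \mathrm{g}} \times 100\% = 5.926\% \]
\[ \%\mathrm{O} = \frac{16.00\ \mathrm{g}}{17.01\ \mathrm{g}} \times 100\% = 94.06\% \]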
In Serway there is a similar situation with an entire paragraph
containing equations which must be read as part of the sentence
grammar:
\[ v = \int a \, dt + C_1 \]
where $C_1$ is a constant of integration. For the special case where the acceleration $a$ is a
constant, this reduces to
\[ v = at + C_1 \]
the undergraduate physics and chemistry textbooks being studied here
which often integrate graphics and equations. The example from Serway
given above ends with a graph showing the velocity versus time curve for a
particle moving with a velocity that is proportional to the time which was
represented by the formulae contained in the paragraph quoted and is
followed by a further paragraph extrapolating further and containing
formulae as part of the sentence grammar. Lemke (ibid.) suggests that
experimental-empirical reports tend to have more graphics, whilst
theoretical analyses have more equations. This might explain why the
textbooks have both together in order to survey the experimental work
which has been carried out and also to represent the theoretical
contribution made to the field by that work.
Tarone et al. only considered active and passive voice in the
combination of clauses and equations but in these textbooks many other
structures were found such as: conditionals (see example above),
contrasts and comparisons (Similarly,..., are compared as follows, however,
not etc.), alternative either...or constructions (see example above),
exemplifications (that is, in other words, is given by, as follows etc.), logical
conclusions (we can now write ..., we can write this as ..., we obtain ..., we
would thus write) and the expression of laws as formulae (This is called the
associative law of addition: A + (B +C) = (A + B) + C, and is known as the
commutative law of addition: A + B = B + A, Charles’ law: V ∝ T...etc.). The
syllabus for the undergraduates coming into contact with these types of
texts must take account of the integration of grammar and formulae and
equations, exploring the different types of sentence structure that they are
usually found and the means by which these are expressed in words.
Recognition of the oral expression of formulae and equations is important
for understanding lectures given in English and also for note taking.
5.2.4 Diagrams and Drawings
1st stroke: induction stroke: while the inlet valve is open, the descending piston draws
fresh petrol-and-air mixture into the cylinder.
2nd stroke: compression stroke: while the valves are closed, the rising piston
compresses the mixture to a pressure of about 7-8 atm.; the mixture is then
ignited by the sparking plug.
3rd stroke: power stroke: while the valves are closed, the pressure of the gases of
combustion forces the piston downwards.
4th stroke: exhaust stroke: the exhaust valve is open and the rising piston discharges the
spent gases from the cylinder.
The diagrams were presented in the order shown below:
The students could have adopted a number of strategies to find the
correct sequence. One of the obvious features was the fact that the
labelling had been retained in image three, which would therefore,
conventionally, make this the first diagram. The answer to this problem as
given by the encyclopaedia in the original was, stroke 4, stroke 3, stroke 1,
stroke 2. The discussion of this order invariably revolved around the fact
that there was inconsistency between the second image and the fourth
image. The second image does not conform to “forcing the piston
downwards” whereas the fourth image does. The students often failed to
take note of the ‘spark’ that image four displays but which is difficult to
detect in a black and white image like those shown above.
In other words, the relationship between diagrams and the
supporting text is not as self-explanatory as may at first be thought. This
suggests that the syllabus must reflect this difficulty by presenting a
number of different types of relationship between visuals and the text. As
was mentioned earlier in 5.2.3, this could include equations and visuals
being read as part of a paragraph in the physics textbook, with which the
students would need practice in order to develop strategies to
overcome some of the difficulties encountered, like the mismatch described
above. This area is increasingly important given the change
towards much more use of visual representation in modern life, both in
textbooks and on computers, and specific discourse strategies which
encourage students to explore the relationships set up by visual
representation are increasingly important in education (Carter, Goddard,
Reah, Sanger and Bowring 1997).
Analysis of the Physics and Chemistry textbooks shows that
photographs, diagrams, graphs and drawings of apparatus are all found
together with formulae and examples in highlighted boxes. The relative
composition between pictures and text on each page is, on average, one
third visual to two thirds text, although these values can range quite
widely, from 70% pictures and 30% text to 90% text and
10% diagrams.5
Chang (1991:xxiii) says that their use of a “5-color design” will help
students “to visualise the appearance of compounds and various chemical
processes”. They also mention that in this edition they have added “many
full-color photographs and line drawings” and have introduced “A number
of marginal arts” to “enhance discussion and to accompany worked
examples”. Moreover, they have attempted to be consistent in their use of
colour to illustrate similar concepts “wherever appropriate”. In other
words, a principled approach has been taken by the textbook editors,
which the students need to be aware of, in order to benefit from the
insights these features should bring.
5
It is interesting to note that, even in some translations of textbooks into Portuguese, the labelling of
diagrams remains the same.
solid pedagogic principles. Chang (1991:xxii) claims to use a “Problem-
Solving Pedagogy” where students are “asked to examine the
reasonableness of the answer” they give to problems in the end-of-chapter
exercises. Students are expected to explain the “why” of chemistry through
the review questions, the “how” of chemistry through the problems and to
identify the “concept, topic or technique to be applied” through the
miscellaneous problems.
The way that these particular textbooks have attempted to overcome
Halliday and Martin’s ascription of pedagogic limitations is by dealing with
topical issues or everyday situations. Each of the chapters in this physics
textbook contains an essay on a topic of more general interest, but with a
very specific physics focus, and each chapter is followed by a number of
questions or problems based on laws dealt with in the chapter, but which
attempt to put the scientific theories into more popular situations such as
those of travel, sports, nature and so on. Chang (1991:xxii) explains that
“to define complex terms in a clear manner, and to explain difficult
concepts carefully” use is made of analogy, for example downhill skiing
and dynamic chemical equilibrium are used to introduce a chemical
concept. Similarly, the chemistry textbook claims in the preface
(1991:xxii):
and that this process of imagining concrete analogies is not “a reliable way
of gaining access to the experience of academic knowledge”. She (1993:59)
also claims that “Physics is notorious for alluring concrete analogies that
lead you falsely” and even suggests that teachers themselves use
inappropriate analogies in their teaching. Whilst it is not being suggested
that these textbooks lead students astray with their extensive use of
analogies, they may be falling into the trap that Laurillard warns about,
because the students are not capable of thinking scientifically and may
therefore draw the wrong conclusions from the analogies given. It is in this
way that even teachers can fall into traps as Martins and Veiga (1999)
explore in their study on training primary school teachers to teach science
to pupils by exploring contexts from their daily lives. They found that the
teachers themselves often needed to learn how to think scientifically in
order to overcome misconceptions before they could help their classes to
do the same.
Some of the titles of the real world or “Chemistry in Action”
analogies in the chemistry textbook show the diversity of the subjects the
students will encounter, for example these range from “The Scientific
Method and the Extinction of the Dinosaurs”, “Salvaging the Recorder
Tape from the Challenger”, “Black and White Photography”, “Breath
Analyzer”, “Scuba Diving and the Gas Laws”, “Fuel Values of Foods and
Other Substances”, “How a Bombardier Beetle Defends Itself”,
“Determining the Age of the Shroud of Turin”, to “The Thermodynamics of
a Rubber Band” and many more. The scope that these provide for
misconceptions is therefore quite broad depending on the ideas the
students already have on these diverse topics.
In addition, this textbook details nine supplements for use with it,
including video and computer programs. These are: Student Solutions
Manual, Microscale and Macroscale Experiments for General Chemistry,
Study Guide, Instructor’s Manual, Test Bank, R-H Test, Overhead
Transparencies, Chemistry at Work Videodisc, Micro Guide. These
supplements are not available to the undergraduates through the library,
which contains multiple copies of the textbooks themselves in English.
The latest edition of Serway references a World Wide Web site at the
University of Texas which will provide answers to students’ questions. It is
likely that this site was intended for those hundreds of colleges and
universities in the United States which he claims successfully use the
textbook. The tendency to provide more and more support material
through computers (Serway also has accompanying computer simulations
and spreadsheets) cannot be ignored. The syllabus for these
undergraduates will have to encompass computer literacy in order to keep
in step with the developments that are taking place in educational
materials for science and technology students.
Halliday (1991) makes a case for the validity of examining not only
lexical frequency in text but also grammatical frequency. He claims that
grammatical frequency is even more powerful than lexical frequency
because the system is closed and the number of choices is small, so that
significant probabilities can be calculated.
Biber standardised or “normalized” grammatical features found in
texts, that is, he standardised the raw data to reflect the frequency in a
thousand word extract by dividing the number of occurrences of a certain
grammatical feature by the total number of words in the text and then
multiplying by one thousand6. By observing this same level of scientific
rigour any text can be compared with Biber’s results in order to draw
conclusions about its position on the continuum of variation and thereby,
for the purposes of this research, to draw conclusions about the nature of
the text concerned and its consequent difficulty for students. The specific
definitions of each of the sixty-five algorithms used by Biber and in this
research can be found in Appendix A.
6
In Biber, Conrad and Reppen (1998:33) this process is referred to as a normed count.
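The normalisation step described above (occurrences divided by the total number of words, multiplied by one thousand) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name and the example counts are hypothetical and are not taken from the corpora analysed here.

```python
def normalised_frequency(occurrences: int, total_words: int, basis: int = 1000) -> float:
    """Return the frequency of a feature per `basis` words of running text."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return occurrences / total_words * basis


# Hypothetical counts, for illustration only:
# 9 occurrences of a feature in a 450-word text.
print(round(normalised_frequency(9, 450), 1))  # 20.0
```

Because the result is expressed per thousand words, texts of very different lengths can be set side by side with Biber's published means.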
The values presented in the table below include normalised
frequency values and the chi-square test (χ²) to show which values are
significant. There is one degree of freedom (df = 1) and, at the five percent
level, the critical value is 3.84. Yates’ correction was used on all values of
less than five.
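As a rough guide to how a single value can be tested against Biber's mean, the following Python sketch computes a one-cell chi-square statistic, (O − E)² / E, and compares it with the critical value of 3.84. Several assumptions are made here and are not stated in the text: that Biber's mean is treated as the expected value, that a minus sign simply marks a frequency below that mean, and that the continuity correction takes its standard textbook form (|O − E| − 0.5)² / E. The function names are hypothetical, and the corrected variant may not reproduce every tabulated figure.

```python
CRITICAL_VALUE_DF1_5PCT = 3.84  # critical value quoted in the text (df = 1, 5% level)


def signed_chi_square(observed: float, expected: float, yates: bool = False) -> float:
    """One-cell chi-square, signed by the direction of the difference (assumed convention)."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    diff = observed - expected
    magnitude = abs(diff)
    if yates:
        # Standard continuity correction; an assumption about the variant used.
        magnitude = max(0.0, magnitude - 0.5)
    statistic = magnitude ** 2 / expected
    return statistic if diff >= 0 else -statistic


def is_significant(statistic: float) -> bool:
    """True when the unsigned statistic exceeds the 5% critical value."""
    return abs(statistic) > CRITICAL_VALUE_DF1_5PCT


# Present Tense, physics corpus (44.2) against Biber's academic prose (63.7).
present = signed_chi_square(44.2, 63.7)
print(round(present, 2), is_significant(present))  # -5.97 True

# Nominalizations, physics corpus (51.8) against Biber's academic prose (35.8).
nominal = signed_chi_square(51.8, 35.8)
print(round(nominal, 2), is_significant(nominal))  # 7.15 True
```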
Table 5.3 Normalised Frequencies from the Main Corpora compared to Biber’s Academic Prose with Statistical Significance Values (chi-square χ²)
Linguistic Feature | Physics Text | χ² | Chemistry Text | χ² | Biber’s Academic Prose
Past Tense | 1.5 | -19.95 | 1.3 | -20.33 | 21.9
Perfect Aspect | 0.9 | -4.13 | 1.0 | -3.95 | 4.9
Present Tense | 44.2 | -5.97 | 69.2 | 0.47 | 63.7
Place Adverbials | 3.7 | 0.26 | 1.3 | -1.06 | 2.4
Time Adverbials | 1.8 | -0.8 | 1.4 | -1.29 | 2.8
First Person Pronouns | 7.6 | 0.63 | 8.7 | 1.58 | 5.7
Second Person Pronouns | 2.5 | 16.2 | 0.9 | 0.2 | 0.2
Third Person Personal Pronouns | 4.5 | -4.89 | 2.6 | -7.68 | 11.5
Pronoun it | 4.5 | -0.61 | 3.8 | -1.14 | 5.9
Demonstrative Pronouns | 8.4 | 13.92 | 5.3 | 3.14 | 2.5
Indefinite Pronouns | 0.1 | -1.8 | 0 | -2.45 | 0.2
Pro-verb DO | 0.1 | -1.73 | 0.1 | -1.73 | 0.7
Direct WH-questions | 4.3 | 14.44 | 3.3 | 7.84 | 0
Nominalizations | 51.8 | 7.15 | 49.8 | 5.47 | 35.8
Gerunds | 6.8 | -0.34 | 7.0 | -0.26 | 8.5
Total Other Nouns | 155.7 | -5.58 | 159.1 | -4.47 | 188.1
Agentless Passives | 19.5 | 0.36 | 11.9 | 1.53 | 17.0
By - Passives | 3.6 | 0.61 | 2.8 | 0.05 | 2.0
BE as Main Verb | 20.4 | -0.48 | 19.5 | -0.78 | 23.8
Existential there | 0.5 | -5.04 | 1.0 | -0.94 | 1.8
that Verb Complements | 4.7 | 0.31 | 3.6 | -0.003 | 3.2
that Adjective Complements | 0.1 | -1.6 | 0.1 | -1.6 | 0.4
WH - Clauses | 0.3 | -0.83 | 0.1 | -1.63 | 0.3
Infinitives | 0.1 | -13.61 | 9.2 | -1.01 | 12.8
Present Participial Clauses | 2.3 | 0.19 | 1.2 | -0.28 | 1.3
Past Participial Clauses | 0.1 | -1.6 | 0 | -0.85 | 0.4
Past Participial WHIZ Deletion Relatives | 3.2 | -1.5 | 0.6 | -5.4 | 5.6
Present Participial WHIZ Deletion Relatives | 2.8 | -0.02 | 1.3 | -1.16 | 2.5
that Relative Clauses on Subject Position | 1.5 | 3.2 | 2.0 | 8.45 | 0.2
that Relative Clauses on Object Position | 0.5 | -0.8 | 0.4 | -1.01 | 0.8
WH Relative Clauses on Subject Position | 1.5 | -0.98 | 1.5 | -0.98 | 2.6
WH Relative Clauses on Object Position | 2.1 | -0.08 | 1.1 | -0.98 | 2.0
Pied-piping Relative Clauses | 0.8 | -0.77 | 0.5 | -1.3 | 1.3
Sentence Relatives | 0 |  | 0 |  | 0
Causative Adverbial Subordinators | 0.7 | 0.03 | 1.2 | 0.53 | 0.3
Concessive Adverbial Subordinators | 0.3 | -0.98 | 0.4 | -0.72 | 0.5
Conditional Adverbial Subordinators | 4.2 | 1.22 | 2.2 | -0.08 | 2.1
Other Adverbial Subordinators | 2.2 | -0.01 | 1.5 | -0.35 | 1.8
Total Prepositional Phrases | 127.5 | -1.03 | 125.4 | -1.43 | 139.5
Attributive Adjectives | 29.8 | -28.84 | 36.6 | -21.12 | 76.9
Predicative Adjectives | 6.1 | 0.24 | 3.1 | -1.15 | 5.0
Total Adverbs | 6.4 | -39.79 | 8.5 | -36.19 | 51.8
Type/Token Ratio | 51.9 | 0.03 | 38.3 | -2.99 | 50.6
Word Length | 5.8 | 0.05 | 6.0 | 0.1 | 4.8
Conjuncts | 2.4 | -0.4 | 4.5 | 0.33 | 3.0
Downtoners | 0.9 | -1.76 | 1.7 | -0.68 | 2.5
Hedges | 0.1 | -1.8 | 0.1 | -1.8 | 0.2
Amplifiers | 1.2 | 0.35 | 1.6 | 0.06 | 1.4
Emphatics | 2.2 | -1.0 | 2.2 | -1.0 | 3.6
Discourse Particles | 0.3 | -0.04 | 0.1 | -0.16 | 0
Demonstratives | 7.2 | -1.55 | 5.8 | -2.75 | 11.4
Possibility Modals | 5.5 | -0.001 | 5.1 | -0.04 | 5.6
Necessity Modals | 1.6 | -0.22 | 1.5 | -0.65 | 2.2
Predictive Modals | 3.8 | -0.04 | 2.8 | -0.53 | 3.7
Public Verbs | 1.8 | -3.4 | 2.9 | -1.91 | 5.7
Private Verbs | 11.5 | -0.08 | 7.2 | -2.25 | 12.5
Suasive Verbs | 0.3 | -4.41 | 0.3 | -4.41 | 4.0
Seem/appear | 0.1 | -2.96 | 0.5 | -1 | 1.0
Contractions | 0.2 | -1.6 | 0 | -3.6 | 0.1
Subordinator -that Deletion | 0.2 | -1.23 | 0.1 | -1.2 | 0.4
Stranded Prepositions | 0 | -2.33 | 0 | -2.33 | 1.1
Split Infinitives | 0.1 | -0.16 | 0.1 | -0.16 | 0
Split Auxiliaries | 0.4 | -5.59 | 1.0 | -4.31 | 5.8
Phrasal Coordination | 22.5 | 75.43 | 20.1 | 56.46 | 4.2
The algorithms that are found to differ significantly from Biber’s
Academic prose for each of the corpora are the following:
Table 5.4 The Physics Main Corpus: Significantly Higher and Lower Results
Table 5.5 The Chemistry Main Corpus: Significantly Higher and Lower Results
Taking the two main corpora together, eleven of the sixty-five
algorithms examined by Biber are found to be significantly lower or higher
in both the physics and the chemistry corpora. These are:
Significantly Lower in both Main Corpora: Past Tense, Perfect Aspect, Third
Person Personal Pronouns, Total Other Nouns, Attributive Adjectives,
Total Adverbs, Suasive Verbs, Split Auxiliaries.
Significantly Higher in both Main Corpora: Direct WH-questions,
Nominalizations, Phrasal Coordination.
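The selection of features that are significant in both main corpora can be illustrated with a short Python sketch. It uses a handful of the χ² values from Table 5.3 and assumes, as an interpretation rather than a stated rule, that a negative sign marks a frequency below Biber's mean; the function and variable names are hypothetical.

```python
CRITICAL = 3.84  # five percent level, one degree of freedom

# (physics chi-square, chemistry chi-square) for a few rows of Table 5.3
chi_square = {
    "Past Tense": (-19.95, -20.33),
    "Present Tense": (-5.97, 0.47),
    "Nominalizations": (7.15, 5.47),
    "Gerunds": (-0.34, -0.26),
    "Phrasal Coordination": (75.43, 56.46),
    "Suasive Verbs": (-4.41, -4.41),
}


def shared_significant(table, critical=CRITICAL):
    """Split features into those significantly lower / higher in BOTH corpora."""
    lower, higher = [], []
    for feature, (physics, chemistry) in table.items():
        if physics < -critical and chemistry < -critical:
            lower.append(feature)
        elif physics > critical and chemistry > critical:
            higher.append(feature)
    return lower, higher


lower, higher = shared_significant(chi_square)
print(lower)   # ['Past Tense', 'Suasive Verbs']
print(higher)  # ['Nominalizations', 'Phrasal Coordination']
```

Present Tense drops out because it is significant in the physics corpus only, which is why it does not appear in either list above.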
long essay per chapter in the Physics textbook. The essay which has been
used for comparative study from the Chemistry textbook is entitled
Salvaging the Recorder Tape from the Challenger and it appears in the
Chemistry in Action section of the third chapter of the textbook and is
thus part of the overall corpus taken from this textbook. There are two
other Chemistry in Action essays in this chapter illustrating the chemical
reactions which have been described in the preceding chapter, but these
are extremely short, one consisting largely of equations and the other of
photographs. These essays were therefore rejected as being too limited in
scope for the purpose of comparison. This particular essay is concerned
with the crash of the space shuttle Challenger in 1986, the
subsequent recovery of the tape of the flight and the chemistry used in
order to be able to listen to the seawater-damaged recording that had been
made of the fateful flight. The essay is only about 450 words long and
presents a type/token ratio of 51.1, which is considerably higher than that
of the chemistry corpus as a whole.
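For readers unfamiliar with the measure, a type/token ratio can be sketched as follows. The sketch assumes that the ratio is expressed as the number of different word forms per hundred running words and that it is taken over a fixed opening window of 400 words, which is how Biber's (1988) procedure is understood here; the tokeniser is deliberately simple and the function name is hypothetical.

```python
import re


def type_token_ratio(text: str, window: int = 400) -> float:
    """Different word forms per 100 running words in the first `window` tokens."""
    # Very simple tokeniser: lower-cased alphabetic strings, optional apostrophe part.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    sample = tokens[:window]
    if not sample:
        raise ValueError("text contains no word tokens")
    return len(set(sample)) / len(sample) * 100


# 8 tokens, 6 types ("the" occurs three times).
print(type_token_ratio("The piston draws the mixture into the cylinder"))  # 75.0
```

A fixed window matters because the ratio falls as a text gets longer, so a 450-word essay and a whole textbook are only comparable when measured over samples of the same size.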
The results for each of the linguistic features as described in
Appendix A for these two essays are given below:
Table 5.6 Normalised Frequencies from the Sub-Corpora compared to Biber’s Academic Prose with Statistical Significance Values (χ²)
Linguistic Feature | Physics Essay | χ² | Chemistry Essay | χ² | Biber’s Academic Prose
Past Tense | 7.6 | -9.34 | 20.4 | -0.1 | 21.9
Perfect Aspect | 4.9 | -0.05 | 0 | -5.95 | 4.9
Present Tense | 35.0 | -12.93 | 24.9 | -23.63 | 63.7
Place Adverbials | 1.7 | -0.6 | 0 | -3.5 | 2.4
Third Person Personal Pronouns | 16.3 | 2.0 | 6.8 | -1.92 | 11.5
Pronoun it | 8.3 | -5.32 | 4.5 | -0.61 | 5.9
Demonstrative Pronouns | 2.1 | -0.32 | 4.5 | 0.9 | 2.5
Indefinite Pronouns | 0 | -2.45 | 0 | -2.45 | 0.2
Pro-verb DO | 0.7 | -0.36 | 0 | 0.06 | 0.7
Direct WH-questions | 0.7 | 0.04 | 0 |  | 0
Nominalizations | 13.9 | -13.4 | 38.5 | 0.2 | 35.8
Gerunds | 5.9 | -0.8 | 6.8 | -0.34 | 8.5
Total Other Nouns | 164.8 | -2.89 | 226.2 | 7.72 | 188.1
Agentless Passives | 10.4 | -2.56 | 22.6 | 1.85 | 17.0
By - Passives | 2.8 | 0.05 | 4.5 | 2.0 | 2.0
BE as Main Verb | 23.2 | -0.02 | 18.1 | -1.37 | 23.8
Existential there | 1.0 | -0.94 | 2.3 | 0 | 1.8
that Verb Complements | 3.8 | 0.003 | 6.8 | 3.0 | 3.2
that Adjective Complements | 1.0 | 0.03 | 0 | -2.03 | 0.4
WH - Clauses | 3.8 | 30.0 | 0 | -2.67 | 0.3
Infinitives | 7.6 | -2.11 | 20.4 | 4.51 | 12.8
Present Participial Clauses | 1.0 | -0.49 | 2.3 | 0.192 | 1.3
Past Participial Clauses | 0 | -2.03 | 0 | -2.03 | 0.4
Past Participial WHIZ Deletion Relatives | 1.7 | -3.46 | 11.3 | 5.8 | 5.6
Present Participial WHIZ Deletion Relatives | 1.4 | -1.02 | 2.3 | -0.2 | 2.5
that Relative Clauses on Subject Position | 0.7 | 0 | 2.3 | 12.8 | 0.2
that Relative Clauses on Object Position | 1.4 | 0.01 | 0 | -2.11 | 0.8
WH Relative Clauses on Subject Position | 4.5 | 0.75 | 0 | -3.7 | 2.6
WH Relative Clauses on Object Position | 2.1 | -0.08 | 9.1 | 21.78 | 2.0
Pied-piping Relative Clauses | 0.3 | -1.73 | 0 | -2.49 | 1.3
Sentence Relatives | 0.7 | 0.04 | 0 | 0 | 0
Causative Adverbial Subordinators | 1.7 | 2.7 | 0 | -2.13 | 0.3
Concessive Adverbial Subordinators | 0 | -2.0 | 0 | -2.0 | 0.5
Conditional Adverbial Subordinators | 3.5 | 0.39 | 0 | -3.22 | 2.1
Other Adverbial Subordinators | 1.4 | -0.45 | 0 | -2.94 | 1.8
Total Prepositional Phrases | 112.4 | -5.26 | 131.2 | -0.49 | 139.5
Attributive Adjectives | 30.2 | -28.36 | 76.9 | 0 | 76.9
Predicative Adjectives | 14.9 | 19.6 | 6.8 | 0.65 | 5.0
Total Adverbs | 14.9 | -26.29 | 15.8 | -25.02 | 51.8
Type/Token Ratio | 55.4 | 0.46 | 51.1 | 0.01 | 50.6
Word Length | 5.6 | 0.02 | 6.0 | 0.1 | 4.8
Conjuncts | 5.9 | 1.92 | 9.1 | 10.45 | 3.0
Downtoners | 1.7 | -0.68 | 2.3 | -0.2 | 2.5
Hedges | 0.3 | -0.8 | 0 | -2.45 | 0.2
Amplifiers | 6.6 | 19.31 | 2.3 | 0.29 | 1.4
Emphatics | 6.6 | 1.74 | 2.3 | -0.9 | 3.6
Discourse Particles | 0.3 | 0.04 | 0 |  | 0
Demonstratives | 5.9 | -2.65 | 4.5 | -4.18 | 11.4
Possibility Modals | 10.4 | 4.11 | 4.5 | -0.46 | 5.6
Necessity Modals | 4.2 | 1.02 | 0 | -3.31 | 2.2
Predictive Modals | 8.3 | 4.54 | 2.3 | -0.98 | 3.7
Public Verbs | 3.1 | -1.69 | 0 | -6.74 | 5.7
Private Verbs | 7.3 | -2.16 | 2.3 | -9.16 | 12.5
Suasive Verbs | 0.7 | -3.61 | 0 | -5.06 | 4.0
Seem/appear | 0.3 | -1.44 | 0 | -2.25 | 1.0
Contractions | 1.0 | 1.6 | 0 | -3.6 | 0.1
Subordinator -that Deletion | 1.0 | 0.03 | 0 | -2.03 | 0.4
Stranded Prepositions | 0 | -2.33 | 0 | -2.33 | 1.1
Split Infinitives | 0 |  | 2.3 | 3.24 | 0
Split Auxiliaries | 1.4 | -4.14 | 6.8 | 0.17 | 5.8
Phrasal Coordination | 17.0 | 36.02 | 0 | -5.26 | 4.2
Independent Clause Coordination | 1.4 | -0.53 | 22.6 | 214.76 | 1.9
Synthetic Negation | 2.8 | 0.77 | 4.5 | 5.61 | 1.3
Analytic Negation | 8.0 | 2.38 | 0 | -5.36 | 4.3
Table 5.7 The Physics Sub-Corpus: Significantly Higher and Lower Results
As the object of the study of these sub-corpora is to see how far they
differ from the main corpora (and Biber’s findings), the results which show
a significant difference from both the main corpus and Biber’s findings are
those of interest. The algorithms which differ from Biber’s findings in both
the main physics corpus and the physics sub-corpus are the following:
Nevertheless, the fact that the sub-corpus shows a greater
occurrence of certain features makes the essay significant in terms of
syllabus design, where certain features should be included in the syllabus
because of their presence in typical materials that the students will come
across in their studies. McCarthy and Carter (1994:112) say that
“whatever aspects of lexico-grammar we choose to look at, we cannot
really separate them from the concerns of creating discourse”. In other
words, these features make up the whole and cannot be taken out of
context without misrepresenting natural language use, in this case the
style of the science textbook in exemplifying real-world situations. If we
want students to cope with these kinds of texts, we must bring the
students into contact with the specifics of those texts.
The features that differ in the physics sub-corpus need not be
compared to those found for the chemistry sub-corpus as they are, in
themselves, a deviation from the norm of the main (physics) corpus and so
deserve study in their own right.
The results for the chemistry sub-corpus are given below in Table
5.8.
Table 5.8 The Chemistry Sub-Corpus: Significantly Higher and Lower Results
The chemistry sub-corpus differs from the chemistry main corpus in
one crucial way: it contains many more significantly high features than
the main corpus. That is to say, it contains many more examples of
features that are not so prevalent in either the main corpus or Biber’s
findings for Academic Prose.
The algorithms that the main and sub corpora share are the
following:
Although the points in common are few, this finding is even more
significant as it shows the wide degree of difference between the text as a
whole and the essay studied here. This can only reinforce the conviction
that this kind of differentiation in the text will cause some students to
have greater difficulty than ever with the attempt by the author to
exemplify what is being studied through ‘real world’ situations. That is to
say, what is intended by the author to provide pedagogical enlightenment
can prove to be linguistic obfuscation for the non-native speaker learner.
Like Sinclair, Biber (1988:238-9) comments on the fact that the
longer the text, the fewer new word types there are to be found, so that if
the entire length of a text is considered, as in the figures calculated for the
main corpora above, the resulting figure does not give such an accurate
description of the difficulty of a particular text, especially for
comparative purposes.
Furthermore, Biber himself (1988:48) suggests that “academic prose is
contextualized in that it crucially depends on shared (academic)
background knowledge for understanding”. However, as Bloor and Bloor
(1991:2) point out there is a “false expectation that educational structures
and systems do not differ internationally”, which means that we would do
well to anticipate differences in the students’ academic background from
the background assumed in the textbook being examined. Halliday and
Martin (1993:2) suggest that native speaker students of science are
“alienated” by the language of science. If all of these conclusions are true,
how much more alienated will the foreign language learner be by the
combination of the (foreign) language of science and the lack of a shared
academic background to the subject? This is especially the case with the
essays under discussion, with their dense text and exophoric appeal to
native speaker background understanding in scientific, general and
literary knowledge.
7
The figures given in brackets refer to the number of times that each word appears in the text, that is, their
frequency.
of biology by mistake. The questions which immediately follow this essay
also continue in the same vein and hummingbird, elephant and guinea pig
are used in the follow-up work.
The fact that certain items are used many times over in a text leads
to a lower type/token ratio, despite the fact that they may be difficult in
themselves, especially for non-native science students8. This is further
complicated when these human biological structures are compared to the
structures of buildings, and braces, columns and cables are encountered
in comparison with muscles and tendons, with an earlier analogy being
drawn with the strength of a wire or a rope. The collocations for braces
are as follows:
the skeleton - supported by various braces and cables which are muscles and tendons
the strength of his columns and braces is proportional to their cross-sections
The specific vocabulary and specific grammar of texts are now seen
to be inseparable. Halliday and Martin (1991:4) point out that
8
It is debatable if many native speakers would be able to draw an adequate distinction between gazelle and
deer. The Oxford Advanced Learner’s Dictionary gives the following definitions: “gazelle: small, graceful
antelope; deer: any of several types of graceful, quick-running, ruminant animal, the male of which has
antlers”.
“it does not make sense to condone relative frequency in lexis but deny its
validity in grammar (...) the concept of the relative frequency of positive:
negative, or of active: passive is no more suspect than the concept of the
relative frequency of a set of lexical items. It is, on the other hand, considerably
more powerful, because the relative frequencies of the terms in a grammatical
system, where the system is closed and the number of choices is very small
(typically just two or three), can be interpreted directly as probabilities having a
significance for the language as a whole.”
aberration in this respect “showing a topical concern for concrete events
and participants”). Lemke (1990:440) disagrees and claims that
seen to indicate the ‘absence of reciprocity’ of senders and receivers. As
the examples of I are both from quotations, it would be wrong to suggest
that this shows Philip Morrison’s ego involvement in the text, but the
unusual presence of 30 occurrences of we, eleven examples of our and
four examples of us in a scientific text should alert us to what this implies in
terms of involvement. This text shows features that would more usually be
associated with other genres than the one being studied and suggests a
greater degree of informality in the text.
There are seven examples of you, which as Biber suggests requires a
specific addressee and indicates a high degree of involvement with that
addressee. This is perhaps less surprising in a text that is meant to be
instructional. Biber has used second person pronouns as a marker of
register differences and so once again there is evidence of involvement in
this text. There is one example of one as a pronoun, If one wishes.
McCarthy and Carter (1994:15) find the pronoun ‘one’ to be a marker of
‘absence of intimacy’ in either the spoken or the written mode.
There are sixteen examples of it. In a previous study Biber (1986)
suggested that a high frequency of this pronoun marked a relatively
inexplicit lexical content due to strict time constraints and showed a
non-informational focus; in other words, it occurs with high frequency in
text-types like telephone conversations, face-to-face conversations, personal
letters, spontaneous speeches and interviews. Others (Kroch and Hindle
1982) have also associated greater use of this pronoun with spoken
situations, which is clearly not the case here. This text breaks away from
the general situation in the textbook being studied, which suggests that
provision must be made in the syllabus for a sufficiently wide range
of text-types so that these features can be studied in an appropriate
context, which would nevertheless not be face-to-face conversation.
However, McCarthy and Carter (1994) recommend finding texts that
combine the discourse features that students need to study even if these
are in another text-type which can be used as an appropriate vehicle for
studying those particular discourse features. Halliday and Martin (1993)
say that there is a lot of discussion in the science classroom to clarify the
language of the scientific textbook being studied even though the students
usually only write short sentences and definitions. The authors of the
textbooks being analysed here are anticipating that their books will be
used mostly in classes with teachers. Non-native speaker undergraduates,
however, are expected to have to read these textbooks alone. It may
therefore be entirely appropriate for texts from these textbooks to be used
in language classrooms where discussion of the texts can take place with
a language teacher rather than a science teacher. This would go some way
towards reproducing the expected mode of use of the textbooks and
perhaps thereby help the students to examine both the language and the
scientific discourse they need to cope with.
There are 21 examples of his and 9 examples of their. Biber reports
that third person personal pronouns mark relatively inexact reference to
persons outside the immediate interaction and, in previous studies (1986),
has found that they co-occur frequently with past tense and perfect aspect
forms as a marker of narrative, reported (as opposed to immediate), styles.
There are no examples at all of either she or her. The gender deficiency
found in this work confirms the findings of linguists like Halliday, Martin
and Beaugrande who claim that the language of science is the domain of
white, middle-class, adult males. However, the use of one and you in the
same essay points to a certain confusion of usage of pronouns and the use
of one as a pronoun is not included in Biber’s analysis at all. Once again
this essay would appear to be atypical of its genre according to Biber’s
findings. Serway suggests that he uses informal language in order to make
his work clear and penetrable for students but this confusion does not
support his proposition.
The relevant normalizations for pronouns in this essay are as
follows:
First Person Pronouns 17.3
Second Person Pronouns 5.2
Third Person Pronouns 16.3
Pronoun it 8.3
Biber’s averages for these features were 5.7, 0.2, 11.5 and 5.9 respectively.
Examination of Biber’s findings reveals that for first person
pronouns the Gulliver’s Travels text is closer to the averages Biber found
for Religion (16.6), Biographies (22.1), and Science Fiction (22.2). The use of
biographical data on the scientists whose theories are discussed is
common in scientific textbooks and, given the subject of Gulliver’s Travels,
fiction is also included to some extent in this essay.
For second person pronouns the closest averages are those for Prepared
Speeches (5.2) and Hobbies (4.2), and for third person pronouns that for
Hobbies (14.1). For the pronoun it the closest averages are those for Humor
(8.2), Prepared Speeches (8.9) and Press Reviews (7.9).
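The comparison made above, in which each normalised value from the essay is set against Biber's register means, amounts to finding the register whose mean lies nearest to the observed value. A minimal Python sketch, using only the figures quoted in this section and a hypothetical function name, might look like this:

```python
def closest_register(value: float, register_means: dict) -> str:
    """Return the register whose mean is nearest to the observed value."""
    return min(register_means, key=lambda name: abs(register_means[name] - value))


# Means quoted in the text for first person pronouns.
first_person = {
    "Academic Prose": 5.7,
    "Religion": 16.6,
    "Biographies": 22.1,
    "Science Fiction": 22.2,
}
print(closest_register(17.3, first_person))  # Religion

# Means quoted in the text for the pronoun it.
pronoun_it = {
    "Academic Prose": 5.9,
    "Humor": 8.2,
    "Prepared Speeches": 8.9,
    "Press Reviews": 7.9,
}
print(closest_register(8.3, pronoun_it))  # Humor
```

Only the registers named in this section are included, so the result says nothing about registers whose means are not quoted here.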
If these texts provide an accurate picture of the use of these
features, it would be possible to widen the scope of the materials used
with students by including some of these as alternatives and making the
work more varied and interesting whilst sacrificing none of the relevancy.
Biber, Conrad and Reppen (1998:61) find that “academic prose uses
nominalizations to treat actions and processes as abstract objects
separated from human participants.” and that academic prose “more often
refers to a process with a stative nominalization, where fiction and the
spoken corpus describe a specific person’s action with a verb or adjective.”
In other words (1998:75) academic prose shows “a preference for static
rather than dynamic packaging of information.” They find that six different
nominalised words are very common in academic prose, that is with
frequencies of over 500 per million words. These are movement, activity,
information, development, relation and equation. In contrast, no
nominalisations were found to occur in fiction or speech this frequently.
All of the nominalisations mentioned above are found in both the physics
and the chemistry corpora studied here.
Abstraction is seen by Halliday and Martin (1993) to be one of the
reasons that science writing is so different from other writing that
students come into contact with and it is this abstraction factor that leads
to the difficulty experienced with science texts9. However, Biber’s
definition of nominalisations is somewhat different from Halliday and
Martin’s. Biber includes only words ending in -tion#, -ment#, -ness# or -ity#
(plus plural forms), whereas Halliday and Martin allow anything which
can function as an element in another clause. Halliday and Martin
(1993:15) say
9
Halliday and Martin argue that students actually enjoy the technical terminology of science texts and do not
have difficulty with it as long as it is presented systematically.
as a noun means “actual physical fighting” and as a verb “means
something like ‘struggle against’”.
Other features identified by Biber as representative of
academic prose are passives and the use of the past tense and, as
mentioned earlier, it is the co-occurrence of some of these features,
(for example third person pronouns plus past tense plus perfect
aspect forms), that is important in positioning the text on the
continuum of the genre. In order to compare these factors with those
Biber obtained, it is necessary to normalise the text to a standard
1,000 words as Biber did in his research. The figures obtained from
this normalisation process are as follows:
Past Tense = 7.6 Biber found a mean of 21.9 for this feature.
This feature is only a third as frequent as in the Biber findings
putting it more on a par with Professional Letters (10.1) in Biber’s study.
Biber comments on the fact that his category of academic prose contains
wide variations and he suggests that Humanities prose shows a high score
because of its ‘topical concern for concrete events and participants’ while
engineering/technology prose reflects ‘concern with abstract concepts and
findings rather than events in the past’ and therefore has a low score. This
is borne out by the results in the main physics corpus (1.5) which is even
lower and not matched by any of Biber’s categories. It is interesting to
note that the Chemistry sub-corpus is very close to Biber’s finding (20.4 to
21.9 respectively) but that overall the Chemistry main corpus is even lower
(1.3) than the physics main corpus.
Agentless Passives = 10.4 Biber found a mean of 17 for these
Passive Voice = 2.8 Biber found a mean of 2 for this feature
Passives are taken as characteristic of writing and when the agent is
dropped there is a static, more abstract presentation of information. In the
case of agentless passives this text is more on a par with Press Reportage
and Popular Lore in Biber’s study (11 and 10.6 respectively). Svartvik
(1966) calculated the number of passive clauses per 1,000 words of
running text for his 320,000 word corpus of eight text types. His results
showed an average of 11.3 and a range of 3.0 in advertising to 23.0 in
science. A comparable average in the physics text under inspection would
be 13.2, once again considerably lower than that referred to by other
investigators. It could be argued that it is the attempt by the authors to
reach (involve) the readers (students) that causes this finding.
Perfect Aspect Verbs = 4.9 Biber found a mean of 4.9 for this feature.
Biber notes that these verbs have been associated with
narrative/descriptive texts and with certain types of academic writing. It is
interesting that this text is exactly the same as Biber’s finding for
academic prose, whereas the main physics corpus is only 0.9.
Nevertheless, this is the lowest mean score for this feature found in any of
the text types examined by Biber which makes academic prose in a
category of its own as regards the use of the perfect. For syllabus purposes
this is particularly significant and must be explored.
Nominalizations = 13.9 Biber found a mean of 35.8 for these.
This score is almost matched by that for Hobbies in Biber’s study (13.1),
closely followed by Science Fiction (14.0) and then Humor (12.1) and
General Fiction (10). Perhaps this finding is not so unexpected, given that
the discussion contained in the text examined here deals with the science
in Gulliver’s Travels.
Nouns = 164.8 Biber found a mean of 188.1 for this feature.
In Biber’s study Adventure Fiction most closely matches this mean score
(165.6) followed by Mystery Fiction (165.7) and General Fiction (160.7). This
may reflect the nature of the subject matter once again or the attempt to
make this physics text more amenable to its audience.
Prepositions = 114.4 Biber found a mean of 139.5 for these
The prepositions examined by Biber are taken from Quirk et al. (1985:665-
7). Biber finds that prepositions tend to co-occur frequently with
nominalizations and passives in academic prose and other informational
types of written discourse. The closest mean score for this feature in
Biber’s study is for Hobbies (114.6) and Popular Lore (114.8).
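The procedure behind the figures above (counting a feature, normalising the count to 1,000 words and looking for the closest mean score in Biber’s study) can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the software used in this study: the function names, the sample sentence and the raw count of 38 in a 5,000-word text are hypothetical. The suffix pattern follows Biber’s definition of nominalisations quoted earlier (words ending in -tion, -ment, -ness or -ity, plus plural forms) without any manual checking, so words such as ‘city’ or ‘moment’ would be over-counted. The reference means are those quoted above for prepositions.

```python
import re

# Suffix-based pattern after Biber's definition of nominalisations:
# words ending in -tion, -ment, -ness or -ity, plus plural forms.
NOMINALISATION = re.compile(
    r"\b[a-z]+(?:tions?|ments?|ness(?:es)?|ity|ities)\b", re.IGNORECASE
)


def count_nominalisations(text: str) -> int:
    """Count suffix-defined nominalisations (no manual disambiguation)."""
    return len(NOMINALISATION.findall(text))


def per_thousand(raw_count: int, total_words: int) -> float:
    """Normalise a raw count to a frequency per 1,000 words of text."""
    return round(raw_count * 1000 / total_words, 1)


def closest_text_types(score: float, reference_means: dict, n: int = 2) -> list:
    """Return the n text types whose mean score lies nearest to `score`."""
    ranked = sorted(reference_means, key=lambda name: abs(reference_means[name] - score))
    return ranked[:n]


# Hypothetical sample sentence built from the six nominalised words cited above.
SAMPLE = "The movement and development of the equation show no relation to activity."

# Mean scores for prepositions as quoted in the discussion above.
PREPOSITION_MEANS = {"Hobbies": 114.6, "Popular Lore": 114.8, "Academic Prose": 139.5}

print(count_nominalisations(SAMPLE))                 # 5
print(per_thousand(38, 5000))                        # 7.6 (hypothetical raw count)
print(closest_text_types(114.4, PREPOSITION_MEANS))  # ['Hobbies', 'Popular Lore']
```

Normalising before comparing is what allows a short essay and a long corpus to be placed on the same scale; without it the raw counts would simply reflect text length.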
These results would place this text in rather different company from
that given in Biber’s results for academic prose. However, these are mean
scores, and considerable variation has to some extent been integrated into
Biber’s study by including all kinds of academic prose and not only the
academic prose of science and technology, which is of prime interest for
the students who have to study such texts in the University of Aveiro.
Nevertheless, the precise description of the study undertaken by Biber has
allowed a number of features in this physics text to be compared with his
findings, and so allows a scientific comparison and interpretation to be
made. This in turn can be the basis for a reasoned approach to the
relative difficulty of such study material for our students and,
consequently, a clearer definition of the approach that needs to be
adopted in teaching such students to cope with their textbooks. In this
case, some of the features of abstraction, as defined above, are not
present to any significant degree, but the attempt to be more accessible
will, in fact, lead to even greater difficulty for the foreign language
student of physics at university level.
Challenger
Second Person Pronouns    2.3      Broadcasts (2.7) and Religion (2.9)
Third Person Pronouns     6.8      Professional Letters (8.7)
Pronoun it                4.5      Press Reportage (5.8) and Academic Prose (5.9)
Past Tense                20.4     Academic Prose (21.9) and Broadcasts (18.5)
Agentless Passives        22.6     Official Documents (18.6) and Academic Prose (17)
Passive Voice             4.5      Official Documents (2.1) and Academic Prose (2.0)
Perfect Aspect Verbs      0        Academic Prose (4.9)
Nominalizations           38.5     Official Documents (39.8) and Academic Prose (35.8)
Nouns                     226.2    Broadcasts (229.8) and Press Reportage (220.5)
Prepositions              131.2    Biographies (122.6) and Broadcasts (118.0)
highlighting the even greater difference found in the physics corpora
studied here.
One of the features that have to be taken into consideration in this
sub-corpus is the cultural and historical aspect of the crash of the
Challenger space shuttle. Americans could be expected to ‘remember’ this
event as it was a tragedy for a nation that has traditionally found failure
difficult to accept. Foreign students could not be expected to share such
a collective consciousness on this topic and, indeed, Chang’s essay does
not appear in a Portuguese translation.
5.4 Mathematics
content of the text they are associated with and the same is usually true of
numbers and formulae included in the texts; they reiterate the
commentary of the text, exemplify or complement the meaning in some
way. However, English and Portuguese do not set out the working of
mathematical problems in the same way. A simple example will illustrate this difference.
If one number is divided by another the ‘working’ of the calculation will be
different even though the result should turn out the same. Take, for
example, 1526 divided by 32. In English this would appear as follows:
           47.6875
    32 ) 1526
         128
          246         4/128
          224         7/224
           220        6/192
           192        8/256
            280       5/160
            256
             240
             224
              160
              160
              000
As can be seen from the above, the answer to this problem, the
quotient, is given above the line at the very top of the calculation, 47.6875,
the divisor is on the left, 32, and the number to be divided inside the
frame to the right, 1526. Each subtraction is shown below the number to
be divided in a series of steps. The indication that the result is a decimal
is given by the punctuation ‘full stop’ between the whole numbers and the
decimals and the necessary ‘working’ is given to one side of the calculation
itself (in this case on the right although there is no hard and fast rule
about this positioning).
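The steps of the written working described above can be reproduced with a short Python sketch. It is an illustration only: the function names and the `decimal_mark` parameter are hypothetical and introduced here for the example. The function records each number that is subtracted in the working and lets the quotient be printed either with the English ‘full stop’ or with a decimal comma.

```python
def long_division(dividend: int, divisor: int, max_decimals: int = 6):
    """Return (whole_part, decimal_part, products) for dividend / divisor.

    `products` lists, in order, the numbers subtracted at each step of the
    written working (steps with a quotient digit of 0 subtract nothing).
    """
    whole, decimals, products = "", "", []
    remainder = 0
    # Whole-number part: bring down one digit of the dividend at a time.
    for ch in str(dividend):
        partial = remainder * 10 + int(ch)
        digit, remainder = divmod(partial, divisor)
        if whole or digit:          # skip leading zeros of the quotient
            whole += str(digit)
        if digit:
            products.append(digit * divisor)
    # Decimal part: bring down zeros until the remainder is exhausted.
    while remainder and len(decimals) < max_decimals:
        digit, remainder = divmod(remainder * 10, divisor)
        decimals += str(digit)
        if digit:
            products.append(digit * divisor)
    return whole or "0", decimals, products


def format_quotient(whole: str, decimals: str, decimal_mark: str = ".") -> str:
    """Join the two parts of the quotient with the chosen decimal mark."""
    return whole + (decimal_mark + decimals if decimals else "")


whole, decimals, products = long_division(1526, 32)
print(format_quotient(whole, decimals))                    # 47.6875
print(format_quotient(whole, decimals, decimal_mark=","))  # 47,6875
print(products)  # [128, 224, 192, 256, 224, 160]
```

The list of products matches the numbers subtracted in the layout above, which shows that only the arrangement on the page differs between conventions, not the arithmetic.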
In Portuguese this calculation would look something like the
following:
    1526.000 ) 32
     246       47,6875
      220
       280
        240
         160
          00
because America still adheres to the Imperial system. Despite the fact that
Britain is now almost completely metric, other differences still remain
between British and American measurements: an American “ton” is lighter
than a British “ton” and is, of course, different again to its metric
equivalent “tonne”, and a British gallon is more than an American gallon.
International scientific convention, and European Union regulations,
would require metric measurements to be given for everything but, as was
described above for the corpora, the fact that the authors have attempted
to bring their observations to bear on the everyday world and common
American pursuits invites the use of imperial measurements which make
up part of that world. The words “foot” and “feet” and “inches”, and “miles”
and “acres” and “pounds” and “tons” are indeed found in the corpora and
will probably cause difficulty as with the confusing billion and ton which
are also there. Serway says that he uses metric measurements in all but
the engineering sections which he nevertheless keeps to a minimum.
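The size of the differences between the units mentioned above can be made concrete with a small Python sketch. The conversion factors are the standard definitions (1 pound = 0.45359237 kg; short ton = 2,000 lb; long ton = 2,240 lb; US gallon = 3.785411784 litres; imperial gallon = 4.54609 litres); the dictionary and function names are hypothetical and serve only this illustration.

```python
POUND_KG = 0.45359237  # international avoirdupois pound in kilograms

# Mass of one "ton" in kilograms under each convention.
TON_KG = {
    "US (short) ton": 2000 * POUND_KG,
    "British (long) ton": 2240 * POUND_KG,
    "metric tonne": 1000.0,
}

# Volume of one gallon in litres under each convention.
GALLON_LITRES = {
    "US gallon": 3.785411784,
    "British (imperial) gallon": 4.54609,
}


def to_kg(amount: float, unit: str) -> float:
    """Convert an amount of tons (of the named kind) to kilograms."""
    return amount * TON_KG[unit]


for name, kg in TON_KG.items():
    print(f"1 {name} = {kg:.1f} kg")
for name, litres in GALLON_LITRES.items():
    print(f"1 {name} = {litres:.3f} litres")
```

A student who reads “ton” in an American textbook as the metric tonne is therefore out by roughly nine per cent, which is enough to matter in a worked problem.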
In other words, the breaking strength of a wire or rope is proportional to its area of cross-
section, or to the square of its diameter
Because the strength of his columns and braces is proportional to their cross-sectional area
and thus to the square of their linear dimension
Chapter 6 Discussion of the Results
“There is a widespread consensus that language is never neutral and texts are
never innocent. Things can always be formulated differently, any linguistic
expression of the facts chooses some aspects of reality and downplays others,
and all choices are political (Martin, 1985). Representations are always from
a point of view, and express group interests. Such points of view are not
usually explicit, are often denied and may not be directly observable, because
they are often a matter not of individual words, but of patterns of distribution
and frequency.” Stubbs (1996:235)
The findings from the frequency analyses and corpora studies must
therefore be examined in order to suggest what implications they bring to
the teaching of undergraduates at university.
1
Biber et al. (1998:136) give the example of ‘balls’ and ‘strikes’ being used as countable nouns only in
broadcasts of baseball games, as the exceptional, rare situation where these features are found only in that
one register rather than being shared with other registers to a greater or lesser degree.
teaching materials should reflect this difference rather than being
prescriptive and suggesting that only one plural is ‘correct’ usage when
the corpora suggest that actual usage is other than this2. The advantage
of having access through the corpora to the context of these forms and
their range across texts also provides information on the most useful
items to be used in each particular situation. The differences found
between Physics and Chemistry on words that would have been predicted
to be essential for science provide clear guidelines for the context that
these should be presented and studied in as was described in 5.1.2.
Stubbs (1996:40) reaches a number of conclusions about work on
lexico-grammar that are relevant here. He makes the following points:
1. Any grammatical structure restricts the lexis that occurs in it, and
conversely, any lexical item can be specified in terms of the structures in
which it occurs.
2. Such restrictions are typically not absolute, but clear tendencies:
grammar is inherently probabilistic.
3. Meaning is not constant across the inflected forms of a lemma.
4. Every sense or meaning of a word has its own grammar and each
meaning is associated with a distinct formal patterning. Form and meaning
are inseparable.
5. Words are systematically co-selected: the normal use of language is to
select more than one word at a time.
6. Since paradigmatic choices are not made independently of position in
syntagmatic chain, the relation between paradigmatic and syntagmatic has to
be rethought.
7. Traditional word-classes and syntactic units also have to be rethought.
Native speakers have only limited intuitions about such statistical
tendencies. Grammars based on intuitive data will imply more freedom of
combination than is in fact possible. Grammar is corpus-driven in the sense
that the corpus tells us what the facts are. Some of these facts may seem
intuitively obvious in retrospect. But they cannot be predicted in advance
and they certainly cannot be exhaustively documented from intuition.
2
Peters (1998:6-12) reports on the Langscape Project of Macquarie University, Sydney, Australia on the
Langscape 1 questionnaire on spelling by age group and nationality. The Langscape 4 questionnaire is
investigating the issue of the preferences for the plurals from Latin.
application of technology and ‘general’ science giving way to discovery
learning then moving on to a much more elitist ‘pure’ form of science
study and finally recently to a more liberal study of science which
includes the history of science and discussion of the moral, social,
cultural and ethical aspects of the application of science. Matthews also
describes how the pedagogic aspects of the pure science curriculum were
not taken into account and teachers were not involved in the design of
the school curricula which were dictated by scientists alone. This was
especially the case after America felt that it lagged behind the USSR
when Sputnik was launched in 1957. However, this has since been
superseded, as mentioned above, and there is no longer such a
centralised curriculum as prevailed at that time and the twin technology
(applied science) and pure science elements are now integrated in the
modern curriculum.
Furthermore, White (1998:276) argues that technology extends the
everyday sense of terms, which is possible because “the polysemous
nature of much vernacular lexis means that different phenomena may be
referenced by the same lexical item”. Because polysemy is used to extend
the sense of lexis in this way, it is an important area to concentrate on
in teaching, so as to sensitise the students to this phenomenon in their
reading of such textbooks as these.
Results.
Dimension 6 distinguishes discourse that is informational but is
produced under real-time conditions so that it displays fragmented
presentation of information with tacking on of clauses rather than
carefully integrated presentation of information.
Biber (1988:94) explains how he standardised the frequencies of
the features in each factor so that those features that occurred with great
frequency would not have an inordinate influence on the factor score.
Applying the same calculations to the results obtained from the physics
and chemistry main and sub-corpora, these texts can be compared with
the corpus as a whole which Biber examined. All of the features were
standardised to a mean of 0.0 and a standard deviation of 1.0. The
results are as follows:
Table 6.1. Mean scores of each of the Dimensions compared with Biber’s Academic Prose
corpus results
              Physics    Chemistry    Physics       Chemistry     Biber’s
              Main       Main         Sub-corpus    Sub-corpus    Academic
              Corpus     Corpus                                   Prose
Dimension 1   -7.65      -8.03        -1.8          -5.96         -14.9
Dimension 2   -5.06      -6.02        -3.23         -2.41         -2.6
Dimension 3   5.34       2.75         1.89          -1.07         4.2
Dimension 4   -5.42      -4.79        1.56          -2.37         -0.5
Dimension 5   5.72       3.53         4.45          11.44         5.5
Dimension 6   -0.73      -1.57        0.94          -1.31         0.5
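The standardisation described above can be sketched as follows. Each feature frequency is converted to a standard score using the mean and standard deviation of the reference corpus, so that very frequent features do not dominate. The sketch then assumes the usual description of Biber’s factor scores, namely that the standardised scores of the positively weighted features are summed and those of the negatively weighted features subtracted; this step is stated here as an assumption, and the function names are hypothetical. The reference means and standard deviations below are placeholder values chosen for easy arithmetic, not Biber’s published figures.

```python
def standardise(frequency: float, mean: float, sd: float) -> float:
    """Convert a normalised frequency to a standard score (mean 0.0, sd 1.0)."""
    return (frequency - mean) / sd


def dimension_score(frequencies: dict, reference: dict, positive, negative=()) -> float:
    """Sum standard scores of positive features and subtract negative ones.

    `reference` maps each feature to a (mean, sd) pair for the whole corpus.
    """
    z = {
        feature: standardise(frequencies[feature], *reference[feature])
        for feature in (*positive, *negative)
    }
    return sum(z[f] for f in positive) - sum(z[f] for f in negative)


# Placeholder reference values (mean, sd): NOT Biber's published figures.
REFERENCE = {
    "past tense": (40.0, 30.0),
    "third person pronouns": (30.0, 20.0),
    "perfect aspect verbs": (8.0, 5.0),
}

# Hypothetical frequencies per 1,000 words for one text.
TEXT = {"past tense": 10.0, "third person pronouns": 10.0, "perfect aspect verbs": 3.0}

score = dimension_score(TEXT, REFERENCE, positive=list(REFERENCE))
print(score)  # -3.0: each feature lies one standard deviation below the mean
```

Because every feature is expressed in standard deviations from the reference mean, a score such as those in Table 6.1 indicates how far a text departs from the corpus as a whole rather than how frequent any single feature is.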
The following figures show the results of the corpora examined in this
study compared with Biber’s main texts mean scores for each of the six
dimensions.
6.3.1 Discussion of Dimension 1 ‘Involved versus Informational
Production’
nouns,
word length,
type/token ratio, and
attributive adjectives.
Biber explains that the negative features on this factor are all
associated with careful, precise presentation of informational content,
which is not usually a characteristic of speech, whereas the positive
features are characteristic of “on-line” information, that is to say,
information that is produced immediately, or what teachers usually refer
to as ‘thinking on one’s feet’, and show involvement and interactive or
affective purpose. Biber (1988:132) does not, however, see this dimension
as a distinction between speech and writing per se but rather as “the
interpretation of involved real-time production versus informational,
edited production”. White (1998:289) argues that ‘hedges’ mark one of
the differences in the lexico-grammar of the scientific and vernacular
technological systems of valeur which would suggest a much more subtle
refinement would have to be made between the corpora included in
Biber’s study in order to separate the scientific from the technological.
Figure 6.1 shows that the results for both the main and the sub-
corpora are generally in the same direction as Biber’s findings. This is not
surprising as the texts are highly informational. There are some features
included in this Dimension however, that might help to explain why all of
the results are higher than those Biber found for academic prose. For
example, one of the factors was WH questions which were found to be
significantly higher in both of the main corpora (see Chapter 5, Tables 5.4
and 5.5). This is one of the features found in large numbers in both of the
textbooks for undergraduates studied here, which contain several pages
of problems for the students to solve at the end of each of the chapters.
The Physics sub-corpus shows a significantly higher frequency of
pronoun it and analysis done by McCarthy (1994-98:275) provides a
tentative conclusion on the uses of it, this and that in texts, seen in this
dimension as demonstrative pronouns and pronoun it:
(1) It is used for unmarked reference within a current entity or focus of attention.
(2) This signals a shift of entity or focus of attention to a new focus.
(3) That refers across from the current focus to entities or foci that are non-current,
non-central, marginalizable or other-attributed.
McCarthy (ibid.) sees this kind of finding as raising fundamental
questions about “how writers (and speakers) structure their arguments,
create foci of attention in texts and signal desired interpretations.” The
tentative interpretation I would make here is that the physics sub-corpus
displays different argument structures than the other corpora.
In three of the corpora, namely the Physics main corpus and the physics
and chemistry sub-corpora, second person pronouns are also found to be
significantly higher than in Biber’s findings (see Chapter 5, Tables 5.3,
5.6 and 5.7). This is because of the essays that are used to demonstrate
real-world applications of the theories discussed in the chapters (see
5.3.4 and 5.3.6 for discussion of the essays used in the sub-corpora). The
intention of the authors is more ‘involved’ and ‘affective’ in order to teach
the reader. Biber, Conrad and Reppen (1998:149-150) suggest that “first-
and second person pronouns, wh-questions, emphatics, amplifiers, and
sentence relatives can all be interpreted as reflecting interpersonal
interaction and the involved expression of personal feelings and
concerns.” Glaser (1982:78) found that “emotive features and figures of
speech alongside with the visual code are predictable characteristics” of
the ESP style of using analogies from the learner’s everyday experience.
On the other hand, factors such as nouns and attributive
adjectives, which were seen as negative factors for this dimension, were
significantly lower in both of the physics and chemistry main corpora
with the exception of both the sub-corpora for nouns and the chemistry
sub-corpus for attributive adjectives. The effect of this would be to raise
the result more towards the centre of the scale, as can be verified in
Figure 6.1. The physics sub-corpus then shows affinity with fiction rather
than academic prose which is understandable given the topic of Gulliver’s
Travels as mentioned earlier.
The implication of these findings for teaching and syllabus design
is to reconsider whether some other text types and cultural topics should
not be included in science and technology courses both for the subjects
covered and for the textual attributes that pertain to them. Sports are
used consistently as a means to involve the (student) reader, as can be
seen from the following extracts from the corpora. In the physics corpora
there are references to American Football, Golf and Baseball as in:
Physics Text
PROBLEMS
34. A quarterback takes the ball from the line of scrimmage, runs backward
for 10 yards, then sideways parallel to the line of scrimmage for 15 yards.
At this point, he throws a 50-yard forward pass straight downfield
perpendicular to the line of scrimmage. What is the magnitude of the
football's resultant displacement?
36. A novice golfer on the green takes three strokes to sink the ball. The
successive displacements are 4 m due north, 2 m northeast, and 1 m 30º
west of south. Starting at the same initial point, an expert golfer could
make the hole in what single vector displacement?
4. A golf ball is hit off a tee at the edge of a cliff. Its x and y coordinates
versus time are given by the following expressions:
In addition, the spin of a projectile, such as a baseball, can give rise to
some very interesting effects associated with aerodynamic forces (for
example, a curve thrown by a pitcher).
5.28 A dented (but not punctured) Ping-Pong ball can often be restored to
its original shape by immersing it in very hot water. Why?
5.29 Discuss the following phenomena in terms of the gas laws: (a) the
pressure in an automobile tire increasing on a hot day, (b) the "popping" of
a paper bag, (c) the expansion of a weather balloon as it rises in the air, (d)
the loud noise heard when a light bulb shatters.
5.30 Nitric oxide (NO) reacts with molecular oxygen as follows The heat
generated in this reaction helps melt away obstructions such as grease, and
the hydrogen gas released stirs up the solids clogging the drain.
6.3.2 Discussion of Dimension 2 ‘Narrative versus Non-
Narrative Concerns’3
The results for Dimension 2 are once again in keeping with the
general tendency for academic prose, although the results for the main
physics and chemistry corpora are an exaggeration of the tendency
towards non-narrative concerns as Figure 6.2 shows. The features that
Biber grouped under the heading ‘Narrative versus Non-narrative
Concerns’ were:
past tense verbs,
third person pronouns,
perfect aspect verbs,
public verbs,
synthetic negation and
present participial clauses.
3
In Biber, Conrad and Reppen (1998:148) this dimension is relabelled “Narrative versus non-narrative
discourse”.
4
The factors for these two were eliminated by Biber because he included each feature on only one factor
score in order to maintain their independence although he (1988:89) found them to have factorial scores
of -.47 for present tense verbs and -.41 for attributive adjectives which he regarded as salient in his
calculations.
corpora for attributive adjectives and the physics and chemistry sub-
corpora for present tense would have the effect of emphasising the
tendency towards lower results than those Biber found. This implies that
the effect of including these features in the calculations on this
dimension would have been to produce negative weightings on this
dimension with the result that the corpora would have shown an even
more extreme negative trend and would have increased the distance from
any of Biber’s findings even further.
Biber (1988:137-8) sees non-narrative purposes as
“(1) the presentation of expository information, which has few verbs and
few animate referents; (2) the presentation of procedural information,
which uses many imperative and infinitival verb forms to give a step-by-
step description of what to do, rather than what somebody else has done,
and (3) description of actions actually in progress.”
Trimble (1985:123-4) suggests that there are three areas where the non-
temporal use of tense regularly occurs in written EST discourse and
these are: 1. when apparatus is described, 2. when reference is made to a
visual aid, and 3. when previously published research is referred to.
Points 1 and 2, describing apparatus and making reference to a
visual aid are significant in the corpora studied with such exhortations in
the physics corpus as:
Figure 6.3 Dimension 3 ‘Explicit versus Situation-Dependent
Reference’
| official documents
7 |
| professional letters
|
6 |
|
| physics main corpus
5 |
|
| press reviews; academic prose
4 |
| religion
|
3 |
| chemistry main corpus
| popular lore
2 |
| editorials; biographies; physics sub-corpus
| spontaneous speeches
1 |
|
| prepared speeches; hobbies
0 |
| press reportage; interviews
| humor
-1 | chemistry sub-corpus
| science fiction
|
-2 |
|
|
-3 |
| general fiction
| personal letters; mystery & adventure fiction
-4 |
|
|
-5 |
| telephone conversations
|
-6 |
|
|
-7 |
|
|
-8 |
|
|
-9 | broadcasts
6.3.3 Discussion of Dimension 3 ‘Explicit versus Situation-
Dependent Reference’5
time adverbials,
place adverbials and
adverbs.
5
In Biber, Conrad and Reppen (1998:148) this dimension is relabelled “Elaborated versus situation-
dependent reference” because it is characterised by “highly explicit, context-independent reference versus
situation-dependent reference”.
Biber (1988:110) says that WH relative clauses together with
phrasal co-ordination and nominalization show referentially explicit
discourse which is usually integrated and informational. He (ibid.)
suggests that this dimension distinguishes between endophoric and
exophoric reference (Halliday and Hasan 1976). This would place the
chemistry sub-corpus in a different category from all the other academic
prose categories and from the main corpora.
Biber, Conrad and Reppen (1998:153) describe the use of wh-
relative clauses (including pied-piping constructions) as specifying “the
identity of referents within a text in an explicit and elaborated manner”
whereas time and place adverbials “are used for text-external references
to the physical context of the discourse”. The following extracts are taken
from the beginning and the end of the sub-corpus essay to demonstrate
these features:
When the space shuttle Challenger exploded in flight on January 28, 1986,
the crew cabin separated from the rest of the orbiter and broke up when it
hit the water. The cabin was equipped with tape recorders to collect shuttle
data and record conversations among the crew. However, there was no
"black box" to protect the tapes as is used in airplanes. Thus, when the
tapes were found six weeks later in 90 feet of water they were considerably
damaged by exposure to seawater and resultant chemical reactions. The
tapes were described as "a foaming, concretelike mess, all glued together."
The recording showed that at least some of the crew members were aware
in the final seconds that the shuttle was in trouble. The impressive fact
about this tape-salvaging project is that the principle involved is no more
complex than what you would encounter in an introductory chemistry
experiment!
Figure 6.4 Dimension 4 ‘Overt Expression of Persuasion’
4 |
|
|
| professional letters
|
|
3 | editorials
|
|
|
|
|
2 | romantic fiction
| hobbies
| personal letters
| physics sub-corpus
|
|
1 | interviews; general fiction
|
|
| telephone conversations; prepared speeches
| spontaneous speeches; religion
|
0 | official documents
| face-to-face conversations; humor; popular lore
| academic prose
| biographies; mystery and science fiction; press reportage
|
|
-1 |
| adventure fiction
|
|
|
|
-2 |
|
| chemistry sub-corpus
|
|
| press reviews
-3 |
|
|
|
|
|
-4 |
|
| broadcasts
|
|
| chemistry main corpus
-5 |
|
| physics main corpus
|
6.3.4 Discussion of Dimension 4 ‘Overt Expression of
Persuasion’6
prediction modals,
necessity modals,
possibility modals,
conditional clauses,
suasive verbs,
infinitives and
split auxiliaries.
6
In Biber, Conrad and Reppen (1998:148) this dimension is relabelled “Overt Expression of
Argumentation” because (1998:155) this dimension “marks the degree to which persuasion is marked
overtly, whether marking the speaker’s point of view, or the speaker’s attempt to persuade the addressee.”
Predictive modals are used to refer to the future and consider
events that will or will not occur (e.g., what changes will occur, won’t fly);
possibility modals and conditional clauses are used to consider different
perspectives on the issue (e.g., If we design a new large object, may enter,
We cannot just scale up and down blindly, we can sometimes foresee, In
this way we can employ scaling).
In contrast, the other corpora are all to be seen as not involving
opinion or argumentation at all. The fact that science texts neither show
doubts nor allow alternative points of view or argumentation of the facts
presented to the reader may be one of the reasons that they are said to
exclude (see Halliday and Martin 1993). On the other hand, once again
(see Glaser above) the physics sub-corpus is an example of the author’s
attempt to be open-ended and include the student reader in the
discourse.
The problems that need to be dealt with as seen from the main
corpora are therefore similar to the problems that native speakers would
have with scientific texts, which is understanding the concepts developed
by the authors or as Laurillard (1993-7:27) says “the problems stem from
the fact that the two worlds, of everyday knowledge and academic
knowledge, are not as synergistic and inseparable as Vygotsky suggested,
but are contrasting and separate.” Students need to learn what experts
are telling them rather than what they can observe from everyday
experience, thus they need to develop academic knowledge of the world.
6.3.5 Discussion of Dimension 5 ‘Abstract versus Non-Abstract
Information’7
conjuncts,
agentless passives,
adverbial past participial clauses,
by-passives,
past participial WHIZ deletions and
other adverbial subordinators.
7
In Biber, Conrad and Reppen (1998:148) this dimension is relabelled “Impersonal versus non-impersonal
style” because this dimension marks “informational discourse that is impersonal, technical, and formal in
style versus other types of discourse.”
and processes” rather than “the descriptions of specific people performing
actions” which will be found in fiction and conversation.
As was mentioned above for overt expression of persuasion, the
problem to be approached here is the same as that found in native
speaker students in higher education having to learn the academic
representations of the world which are presented as facts rather than
negotiable concepts where argumentation is possible. Many teaching
materials produced for students of science and technology have treated
the use of the passive as a transformation of the active voice in
sentences, which is to misrepresent the meaning of the passive voice.
Discourse analysis has shown how power and authority are confirmed
through the use of the passive and foregrounding of information in texts
(see van Dijk 1997, Stubbs 1996). The use of the passive is deliberate in
science and technology texts to achieve this authoritarian position vis-à-
vis the reader, who is therefore not permitted to question the “laws” put
forward in the text.
Figure 6.6 Dimension 6 ‘On-Line Informational Elaboration’
(approximate mean scores read from the scale of the original plot)
 3.4 | prepared speeches
 3.1 | interviews
 2.6 | spontaneous speeches
 1.5 | press editorials; professional letters
 1.0 | religion
 0.9 | physics sub-corpus
 0.5 | academic prose
 0.3 | face-to-face conversations
-0.3 | biographies
-0.7 | hobbies; physics main corpus
-0.8 | popular lore
-0.9 | press reportage; official documents; telephone conversations
-1.0 | press reviews
-1.2 | romantic fiction
-1.3 | broadcasts; chemistry sub-corpus
-1.4 | personal letters
-1.5 | humor
-1.6 | general fiction and science fiction; chemistry main corpus
-1.9 | mystery and adventure fiction
6.3.6 Discussion of Dimension 6 ‘On-Line Informational Elaboration’
The features that have high positive weights on this dimension are:
back two or three dogs of his own size; but I believe that a horse could not
carry even one of his own size.
If we go far enough toward the very small, surfaces no longer appear
smooth, but are so rough that we have difficulty in defining a surface.
Other descriptions must be used. In any case, it will not come as a complete
surprise that in the domain of the atom, the very small, scale factors
demonstrate that the dominant pull is one which is not easily observed in
everyday experience. Such arguments as these run through all of physics.
Dimension 3     5.34          5.1 Soc. Sc.    2.75          2.7 Nat. Sc.
Dimension 4    -5.42         -2.1 Nat. Sc.   -4.79         -2.1 Nat. Sc.
Dimension 5     5.72          7.3 Med.        3.53          3.4 Soc. Sc.
Dimension 6    -0.73         -0.8 Nat. Sc.   -1.57         -0.8 Nat. Sc.
Table 6.3 The physics and chemistry sub-corpora compared with Biber’s academic prose sub-genres

               Physics       Biber’s         Chemistry     Biber’s
               Sub-Corpus    sub-genre       Sub-Corpus    sub-genre
Dimension 1    -1.8          -4.4 Maths.     -5.96         -4.4 Maths.
Dimension 2    -3.23         -3.1 Maths.     -2.41         -2.6 Nat. Sc.
Dimension 3     1.89          2.7 Nat. Sc.   -1.07         ---
Dimension 4     1.56          2.6 Pol./Ed.   -2.37         -2.1 Nat. Sc.
Dimension 5     4.45          3.7 Pol./Ed.   11.44          9.7 Tec./Eng.
Dimension 6     0.94          0.9 Pol./Ed.   -1.31         -0.8 Nat. Sc.
The following figures show the results of the corpora examined in this
study compared with Biber’s mean scores for the academic sub-genres
for each of the six dimensions.
Figure 6.8. Dimension 2 ‘Narrative versus Non-Narrative Concerns’ for the Academic Prose Sub-Genres
(approximate mean scores read from the scale of the original plot)
-1.2 | Medical
-1.4 | Humanities
-2.4 | chemistry sub-corpus
-2.6 | Natural Science
-2.8 | Social Science; Politics/Education
-3.0 | Mathematics
-3.2 | physics sub-corpus
-4.0 | Technology/Engineering
-5.0 | physics main corpus
-6.0 | chemistry main corpus
Natural Science which shows situation-dependent reference rather than
inexplicit reference. Biber (1988:193) argues that texts taken from
disciplines such as geology, meteorology, and biology deal with “specific
aspects of the physical environment and thus make extensive reference
to that environment.” The main corpora studied here divide equally
between these two positions, the physics main corpus being on a par
with the Technology/Engineering prose and the chemistry main corpus
on a par with Natural Science. This may not be at all surprising but it
will have to be taken into consideration in developing the syllabus. The
subject matter of the Physics sub-corpus Gulliver’s Travels was shown
earlier (5.3.4) to have extensive biological referencing in it which may
explain why it is even further away from the Technology/Engineering
sub-genre and closer to Natural Science than the main physics corpus.
6.4.5 Discussion of Dimension 5 ‘Abstract versus Non-Abstract Information’
6.4.6 Discussion of Dimension 6 ‘On-Line Informational Elaboration’
Two clear trends are apparent from the test results obtained
for the new students in the first year of the university. First, it seems that
the students are studying more English at school. This result is
confirmed by other factors such as the increase in the number of
students studying the English language in schools particularly in the
fifth and sixth years, despite the changes in demographics that the
schools are undergoing. The second feature that can be observed is that
the more years of English the students have had at school the better their
results are on the preliminary test. This may seem obvious; however, the
view that the students are not learning anything, and that the same
material is therefore repeated over and over again in each school year, is
often bemoaned at conferences on teaching and meets with widespread
sympathy from teachers. These results run counter to that
general idea. The students are indeed learning more English when they
have more time dedicated to the study of the language. This is not to say
that the students have learnt the specific language of science and
technology but it does suggest that strategies for dealing with the
comprehension of texts have been developed and that these strategies
can be applied by the more advanced students in other situations.
Rosenthal (1996:19), reviewing the research on second language students
and academic success in further education, finds that the level of
language proficiency necessary to ensure academic success takes five to
seven years to develop. She is discussing the American system which has
both immigrants whose first language is not English and foreign students
in higher education and notes that there are many different systems in
operation within the universities and colleges in the United States to
teach English to such students. However, she also points out that this
time factor cannot be overlooked and that it is a considerably longer period
than that allotted to any of the language programmes in the colleges
and universities. Added to this, she recognises that faculty members in
the other academic disciplines often have no idea how students learn
English, which leads to a separation of English and the study of the
academic subject matter. She argues that this situation is no longer
appropriate, because the acquisition of English occurs best when
students are using the new language purposefully, and notes that many
mainstream faculty members have unrealistic expectations of what
students can achieve within the confines of the language classroom.
Despite the very different circumstances of the American system she is
describing and the Portuguese system examined here, this unrealistic
expectation holds true here as well. In other words, time is of the essence
but so is contact with the language in purposeful contexts.
The results that show that the students have a greater ability with
some of the structures traditionally regarded as difficult or more
advanced and struggle with those traditionally considered simple call into
question these distinctions. Examination of the irregular verbs in the
frequency lists for the chemistry and physics corpora reveals that the
results of Mindt’s corpora analyses (1997:47-49) are not duplicated in the
results of this study. Mindt’s top ten irregular verbs (apart from do, have
and be) are say, make, go, take, come, see, know, get, give and find. Of
these say, go, and come are not found with any frequency in either of the
main corpora and take and get are only found in one of them. On the
other hand, show and write are extremely common in these corpora.
Halliday (1993:19) working on the original COBUILD corpus provides a
list of the first 25 most common verbs, once again without the most
frequent ones of all be, have and do. This list does not coincide with
Mindt’s list entirely either. The most common irregular verb forms as
given by Halliday (ibid.) include think and tell before find and give. Added
to this is the problem of how many of these verbs are actually found in
the past tense forms. Halliday (1993:21) finds that the use of the present
and the past tenses is approximately the same; however, the corpora
studied show a preference for the present tense. These findings suggest
that the syllabus must be built on the evidence contained in the
frequency listings of the corpora used here, rather than on any other
arbitrary corpus, or indeed other studies conducted for other purposes
on other material, in order to be both useful and relevant for the
students (see McCarthy and Carter 1994:20).
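The kind of comparison described above can be illustrated with a short sketch. It assumes a plain-text corpus and a hand-made list of inflected forms for Mindt’s ten verbs plus show and write; the function names, the form lists and the treatment of ambiguous forms are illustrative assumptions, not the procedure or the software used in this study.

```python
import re
from collections import Counter

# Hand-listed inflected forms (an assumption for illustration): Mindt's ten
# verbs plus "show" and "write". Ambiguous forms such as "saw" or "found"
# would need part-of-speech tagging in a real study.
VERB_FORMS = {
    "say": ["say", "says", "said", "saying"],
    "make": ["make", "makes", "made", "making"],
    "go": ["go", "goes", "went", "gone", "going"],
    "take": ["take", "takes", "took", "taken", "taking"],
    "come": ["come", "comes", "came", "coming"],
    "see": ["see", "sees", "saw", "seen", "seeing"],
    "know": ["know", "knows", "knew", "known", "knowing"],
    "get": ["get", "gets", "got", "gotten", "getting"],
    "give": ["give", "gives", "gave", "given", "giving"],
    "find": ["find", "finds", "found", "finding"],
    "show": ["show", "shows", "showed", "shown", "showing"],
    "write": ["write", "writes", "wrote", "written", "writing"],
}


def verb_frequencies(text, verb_forms=VERB_FORMS):
    """Count every listed form in the text and total the counts per verb."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return {verb: sum(counts[form] for form in forms)
            for verb, forms in verb_forms.items()}


def absent_verbs(frequencies):
    """Return, in alphabetical order, the verbs that never occur in the text."""
    return sorted(verb for verb, n in frequencies.items() if n == 0)
```

Run over each main corpus in turn, a listing of this kind would show at a glance which of the verbs from a general-purpose list are missing from the students’ textbooks and which other verbs take their place.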
Similarly, the grammatical structures taught should appear in
actual contexts of use so that they reflect the meanings and common
usage of these kinds of textbooks. Biber, Conrad and Reppen (1998)
mentioned above, Trimble (1985), Stubbs (1996) and McCarthy and
Carter (1994) (among others) have all demonstrated how both the
grammatical choices and their meanings differ between genres and text-
types. The students were not specifically tested to see if they could
distinguish between general English usage and the specific science and
technology usage of certain grammatical items. Nevertheless, their test
results do show that many of the students have not grasped the general
usages they should have learnt in school.8 Of particular importance is
the specific use of modals, conditionals and irregular verbs and their
meanings in scientific texts.
8
Mindt (1997:43-45) argues that the grammars used in schools (in Sweden) do not show the most common
forms of usage of some and any (amongst others) and so misrepresent the language as used by native
speakers in any case. Therefore he argues for a new approach to didactic grammar based on corpus
analysis which would show the most frequent usages for students to learn in a graded manner.
Chapter 7 The Syllabus
ESP can be broken down into other sub-divisions like EAP, which
is broken down further by Jordan (1997:3) into ESAP (English for
Specific Academic Purposes) and EGAP (English for General Academic
Purposes), although he reports that the more usual model in the USA is
to break EAP down into EAP and EST. EAP is seen by Jordan to cover
both the more specific focus of a subject such as engineering and also the
skills and proficiency in formal academic style and register that the
students need for study purposes. In either division, EAP can be taken as
the short term objective of the undergraduates studying on the science
and technology courses analysed here. This is particularly relevant as a
teaching objective given the constraints imposed by the size of the
classes, their heterogeneous nature and the paucity of time available. The
objective of the discipline would be to cope with the students’ immediate
course concerns. However, the medium and longer term needs of the
students cannot be ignored, and a process whereby the students are
given the skills to continue learning (the “learning to learn” dimension)
must also be included.
to students from a different culture and academic environment and the
fact that most of the students on the first year discipline are also in the
university environment for the first time.
Laurillard (1993:2) argues that at undergraduate level it is
unrealistic to expect students to take control of their own learning, but
goes on to show how she sees the student developing academic
knowledge through mediated learning. Waters and Waters (1992:264) say
that in their experience “what students frequently lack is not only a
knowledge of study skills, but, more fundamentally, the underlying
competence necessary for successful study - self-confidence, self-
awareness, the ability to think critically and creatively, independence of
mind and so on.” The general development that is taking place in
undergraduates who have just made the transition from school to
university must also be addressed together with the more straightforward
aspect of the lack of study skills. One means of achieving this
development in students is through different methodologies in the
classroom which encourage increasing confidence through working on
different levels of material with success and interacting with classmates
in pairs or small groups to avoid shyness and ridicule and to encourage
the sharing of knowledge between students. More general discussion
encouraging students to explore their own ideas and opinions on topics
which are given proper consideration and treated with respect by the
teacher and other students can aid self-confidence and creativity.
Different schemes which allow students to interact with the lecturer on a
more personal basis, such as attendance hours outside the classroom
and different means of directing questions, which may be personal
(through e-mail) or taken up with the whole group by the lecturer if they
are found to be more generalised, can also lead to success. They have the
added advantage that the lecturer reflects upon what has been
successfully achieved and what has not been grasped by the students.
Rosenthal (1996:150-174) describes some approaches adopted in the
United States which encourage lecturers to evaluate their own
performance and laments the fact that often the principles that guide
scientific research are not applied to science teaching. She (1996:178)
sums this up thus:
ii) Both American and British English will have to be included especially
for comparative purposes as students may have come from different
educational backgrounds where one or the other of these Englishes will
probably have been taught.
iii) Basic scientific reading texts must be used as a core for the discipline
and text attack strategies will have to be taught because the students
are also coming to grips with the more advanced scientific subject
matter (even in Portuguese). Discourse and text analysis - study of
such aspects of scientific texts as cohesion, pronoun referencing,
deixis, linking words, cause and effect, definition and classification. It
would be possible to use corresponding texts in different subject areas
with respect to these (McCarthy and Carter 1994) or even to conduct
analyses of Portuguese and English texts on the same subject (Leech
1997).
iv) Mathematical symbols, formulae and numbers in British and
American English will have to be taught, including contrasting
Portuguese and English use, as these are likely to have been omitted
on school courses but are extremely important to the understanding of
the scientific texts on the students’ bibliographies. Formulae would
also involve the revision of the alphabet and learning how equations
are put into words or form part of the text being studied. The patterns
created by the use of mathematical functions, graphs and tables are
considered by Lemke (1998:102) to be “important in the value-scheme
of natural science” which the students must be able to perceive in
order to understand the relationship between the patterns and
assumptions made in scientific theory. Weights and measures in both
the metric and imperial systems must be examined including British
and American differences. Consciousness raising and alerting the
students to areas that are different will include looking at some
cultural differences between the two languages as mentioned earlier.
1
Many of the Departments in the University of Aveiro promote conferences aimed at undergraduates, usually in the
final years of their courses, which may have international speakers. The Departments of Management and Tourism
(amongst others) also often have visiting scholars some of whom were American and lecture undergraduates in
English.
v) Note-taking and summarising should be included in the course as
these are skills that will be necessary throughout the students’
courses. The summaries and notes produced by students might well be
in Portuguese if they are for personal use for understanding and
storing information for use in other disciplines. The study of
comparative texts mentioned earlier might help here to highlight
signalling devices in discourse that the students must be conscious of
to organise their notes.
vi) Reference skills such as dictionary work will have to be included to
provide knowledge of sources of information for the students to further
their studies independently but increasingly these reference skills will
have to be extended to the use of CD-ROM material and the Internet.
Dictionary work focusing on abbreviations, countable and uncountable
nouns, spelling and pronunciation is one simple means of providing
the students with a means of discovering more about the language
when they need to. An attempt to equip the students with the means to
proceed further in their studies on their own initiative can be
approached through more detailed study of types of definitions; both
those used in reference materials and those to be found in the
textbooks on the bibliography of both an overt and covert type (Darian
1981) with the corresponding text-type signalling.
vii) The students will have to be taught to interpret graphs, tables and
diagrams in English and to recognise the referencing to these in the
text and the means by which they complement or add information to
the main text. Listening activities where the students have to complete
graphs and tables and interpret them into another form must be
included. These activities could usefully be carried out in a language
laboratory. Laurillard (1993-97:112) rates the combination of audio
and visual material as one of the most productive because it gives
greater control to the student and could be used to set tasks that
“enhance and interpret students’ experience of the world”. She (ibid.)
suggests that the visual part need not necessarily be printed material
(which is however the most flexible medium), but may be an object
which the student has to observe or a situation could be created
whereby the audio material guides the student to perform some other
operation, such as work on a computer. Lemke (1998:93) argues that
the juxtaposition and combination of visuals in texts will multiply the
meanings so that “we can mean more, mean new kinds of meanings
never before meant and not otherwise mean-able.” The way in which
this is achieved has to be explored as “the user must integrate visual
and verbal realisations of objects, concepts, relations and processes in
the joint interpretation of text and figure” (Lemke 1998:110).
viii) Appropriate specific vocabulary will have to be taught in
appropriate contexts with their usual collocations taken from the
corpora, together with the pronunciation of these and the semantic
variation that science and technology texts cause in lexis and their
associated grammatical structures. Specific grammar will have to be
taught with its usual realisations in scientific texts as necessary for the
effective realisation of the tasks undertaken. The development of more
student autonomy in using the corpora for their own difficulties with
It would appear from the list given above that some of the items
would already have been taught in MT. However, although teachers can
most definitely appeal to frameworks taught in school, it is not a sound
notion that everyone thinks in the same "scientific way". An example of
this is the "choke" on a car. In Portuguese, in older models of cars, the
driver abre o ar, literally translated as "opens the air", on a cold morning
in order to start the car. In English exactly the opposite is done: the
"choke" is pulled out, thereby cutting off (choking off) the air to the petrol
mixture. The basic scientific principle, of enriching the petrol mixture, is
the same but it is not expressed in the same way. In other words, on the
surface of this expression there is a different scientific explanation of
what takes place. Halliday and Martin (1993:16) argue that there are
some minor variations among different languages of how grammar
construes phenomena into a scientific theory. They (ibid.) suggest that
English and French are different not so much because the grammar of
scientific theory is different between them, but because the English
language constructs reality more along empiricist lines whilst the French
language constructs reality along rationalist lines in scientific theory. Dr
Catherine Middlecamp, Director of Chemistry at the University of
Wisconsin-Madison (whose report is included in Rosenthal’s survey of
science teaching for language minority students in the USA, 1996) argues
that Western scientists are more inclined to use categories than others.
She explains that although the categories into which chemistry is usually
broken down, such as organic, inorganic, analytic, biophysical, etc., appear
to be culture-free, in reality they are not. Furthermore, she argues that
even if two cultures are similar in their tendencies to categorise the world
“there is no guarantee that the lines will be drawn in the same places”.
The question needs to be raised as to whether there is a Portuguese
‘scientific way’ and if so, what this is. Kaplan (1966:15) describes
different scientific discourse patterns in paragraph development
employed by different linguistic systems. He finds that there is a
difference between English and the Romance languages (which would
include Portuguese) because Romance languages allow digressions and
the inclusion of extraneous information. The differences that exist between
different linguistic systems have significance for Portuguese students who
may well have recourse to many of these (scientific) systems. The
students may be unaware that there are distinctions between scientific
approaches when they are consulting books in different languages. The
naming of processes and theories is also often different between the
languages of science: the French, for example, have not always adhered to
the International System of Units (SI), preferring to coin their own
terms.
Equally well, the fact that reading skills are transferable should be
used to help to get the students to read effectively in English. However, a
number of the students on this course undoubtedly opted for science and
technology because they did not like, or demonstrated less aptitude for,
foreign languages. This being the case, it will be an uphill struggle to
create the conditions necessary for successful transfer of skills. Halliday
and Martin (1997:49) discuss the “ongoing apprenticeship of students
into science discourse” which implies that what has gone before has to be
taken into account to decide what will follow. An attempt to motivate
students, who may well have an active dislike of English, must be made
to encourage them to engage with and enjoy the study of science and
technology through English. As Hutchinson and Waters (1987:141) say,
1. interlocking definitions
2. technical taxonomies
3. special expressions
4. lexical density
5. syntactic ambiguity
6. grammatical metaphor
7. semantic discontinuity
2
Even in language acquisition there has been a move towards the use of corpora for recent studies. Biber,
Conrad and Reppen (1998:172-202) report on analyses of 8-12 year olds using the CHILDES corpora as
they found that previous research was often limited in scope, used only one or two subjects and focused
on a small number of linguistic features and often a single register.
3
The word is read in combination *mi+sle+ d /mizld/ rather than two parts miss + led /mis'led/. The
confusion arises because of the overgeneralisation of the -ed ending being seen as a past tense suffix
added to the verb which would then be *misle.
the “existence of a lexically minimal term - a single word form - to
reference a given category is generally seen as evidence that the category
is stable and salient within its ideational domain.”
Nattinger and DeCarrico (1992) suggest that, although much of the
research done was concerned with language acquisition in children, there
is no reason to believe that adults would go about the language-learning
task any differently, and indeed misunderstandings like those mentioned
above confirm this. Nattinger and DeCarrico (1992) go even further and
suggest that “It is our ability to use lexical phrases, in other words, that
helps us to speak with fluency.”
On the other hand, there is increasing evidence from computer
corpus-based research that language itself occurs in a largely predictable
way. That is, the commonest forms of language occur in overwhelmingly
high frequencies and collocations. Collocation here means the co-
occurrence of certain words within a short space of each other in a text. A
certain word or ‘node’ is the focus of attention and the words to either the
left or the right of it are studied; these are called the collocates. The use
of a concordance which focuses on the node can reveal important
language patterns in texts. Often the position in the sentence can also be
revealed by this type of concordance study of language patterns which is
an important piece of evidence for students to observe for their own use
in writing in English. One of the most important aspects of this
computer-based research is that it is reflecting natural language use,
that is, it is descriptive and not prescriptive and examples of use are not
invented ones which can be, as Sinclair (1991) points out, “extremely
unlikely to occur in speech or writing”. Researchers in these areas report
that the commonest forms are in the majority in the frequency and
collocation studies they have done, no matter how large the corpus they
are using. Sinclair says that if those words that occur only once in a
corpus were removed, its vocabulary would be reduced by about half. He also
suggests that “grammatical and lexical distinctions may be closer
together than is normally allowed”.
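The notions of node, collocate and concordance discussed above can be made concrete with a minimal sketch. It assumes a plain-text corpus, a very simple tokeniser and a window of four words on either side of the node; the function names and the window size are illustrative choices, not those of the concordancing software used in this study. The last function reports the share of words occurring only once, both as a proportion of the different word forms and as a proportion of the running words, since the two figures differ considerably.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def concordance(tokens, node, span=4):
    """Return (left context, node, right context) for each occurrence of node."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            lines.append((" ".join(left), tok, " ".join(right)))
    return lines


def collocates(tokens, node, span=4):
    """Count the words found within span words to the left or right of node."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            counts.update(tokens[max(0, i - span):i])
            counts.update(tokens[i + 1:i + 1 + span])
    return counts


def hapax_share(tokens):
    """Return the share of once-only words among word forms and among tokens."""
    counts = Counter(tokens)
    hapaxes = [word for word, n in counts.items() if n == 1]
    return len(hapaxes) / len(counts), len(hapaxes) / len(tokens)
```

Students could be given the output of concordance() for a node such as force, aligned on the node, and asked to describe what regularly appears to its left and right before any rule is presented.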
What are these common forms and what does this research imply
in terms of language learning and teaching? This research does indeed
suggest that language is a much more finite system than has hitherto
been believed. The commonest language is used most of the time in
predictable or “prefabricated” chunks, and it is this language that
students should be provided with, in order to give them a rapid, fairly
comprehensive grasp of naturally-occurring language in the shortest
possible time scale. The idea of teaching what is most frequent has been
around for a long time (at least since West in the 1920’s); the only danger
is that what is most frequent today will not be the same as what is most
frequent tomorrow and decisions about what needs to be taught should
be based on the most up-to-date data from corpora that are made up
from that specific language that is the students’ target.
Some of the findings from computer corpus-based research run
counter to what intuition about language would suggest and, more
importantly, run counter to what coursebook writers believe to be the
case. This is true of meaning, form and usage, as the following
examples demonstrate. A list of the commonest meanings of the verb ‘see’
would include ‘using the eyes’, ‘looking at’, ‘meeting’, ‘grasping with the
mind or imagination’, ‘discovering or checking’, ‘experiencing or
witnessing’, ‘other meanings e.g. accompany or escort’ and phrasal verbs
(taken from the Oxford Advanced Learner’s Dictionary in that order).
The actual findings in percentages, however, show that the most
common examples (53% of occurrences in the Birmingham corpus) are in
the sense of ‘I see’ and ‘you see’4. When coursebooks were examined these
were found to account for only 10% of occurrences. Biber, Conrad and Reppen
(1998:80-82) describe a similar misrepresentation in ESL textbooks in
their treatment of subject position that-clauses. Biber et al. (ibid.:77)
find that that-clauses in subject position are rare in all genres (only 5-10
occurrences per million words) and that they are virtually non-existent
in the spoken corpus they examined. One of the ESL textbooks examined,
however, had two exercises for the students to use subject position
that-clauses orally. Biber et al. (ibid.:81) conclude that the results from
corpus analyses could improve textbooks in two ways:
4
Similarly, Brown (1994:61-79) examines the inter-relationships between the sense of a verb and the
various syntactic patterns in which it can be found and which are often absent in the Oxford Advanced
Learner’s Dictionary. Nevertheless, he (1994:77) regards understanding “the kinds of mechanisms that
can be employed in texts to convey more than is explicitly asserted” as essential for advanced students.
“From a lexical point of view, it is not always desirable to imply that there is
an identity between the forms of a word. ...But often, particularly with the
commoner words of the language, the individual word forms are so different
from each other in their primary meanings and central patterns of behaviour
(including the pragmatic and stylistic dimensions), that they are essentially
different ‘words’, and really warrant separate treatment on a language course.”
“selects the indefinite article a and most emphatically rejects the definite
article the. When in predicative position, it attracts strongly a modifier such
as very, pretty, extremely. When attributive, it is commonly found with
another adjective with which it combines in meaning, so that a nice relaxing
time is nice because it is relaxing. Where nice immediately precedes a noun,
and has no modifier itself, the nouns it goes with seem to be frequently
selected from a few short lists – day, evening, etc., boys, girls, etc., and
surprise. Often there are set phrases.
but they are not always distinguishable grammatically. Hopper (1997:93)
argues that there is little terminology that the modern linguist uses that
would have been unfamiliar to Quintilian and that, because of this, some
integral parts of the language which, as Stubbs (1993:17) says, “lie
somewhere between word and group … are missed both by current
grammatical descriptions and also by conventional definitions of
collocation”. Hopper (ibid.) suggests that this situation is also the case
with the English verbal expression. He (1997:94) uses Firth’s sentence
“She kept on popping in and out of the office all the afternoon” as an
example of the difficulty of identifying the verb in such sentences. He
(1997:99) concludes that corpus linguistics is showing that the “category
of Verb itself might be more in the nature of a cluster or family-
resemblance category rather than a simple word class” or “folk category”.
He (1997:101) recommends the use of discourse as a data source so that
this can be made evident.
According to Sinclair (1997:37), rather than making the language
limited, the regular linking of grammar or form and meaning will
not only cut down on the load the learner has to cope with but will also
make the curriculum more interesting and allow the learner to
‘develop unique and personal utterances which are almost guaranteed to
be acceptable’. The example that he gives here is the structure
‘a(n) X of Y’
where X can be measures such as pint, yard, ounce, etc.; informal
portions blob, dash, lump, shred, etc.; shapes shaft, stick, tuft, etc.; flows
of liquid dribble, jet, spurt, etc.; containers bag, bucket, tank, tub, etc.;
formal collectives herd, flock, team, etc.; and informal collectives bunch,
clump, group, etc. If -ful is added to some things which are not normally
seen as containers such as bag to become bagful then almost anything
can become a container - a skirtful, a houseful, a shipful, etc. Sinclair
argues that this is what language is like and therefore, this is what
should be taught. He provides the following checklist for the language
teacher:
“lexical items should not be taught and learnt in isolation but only in their
proper contexts. This means shifting the emphasis from individual words to
the collocations in which they normally occur.
It is only when the student has acquired a good command of a very
considerable number of collocations that the creative element can be relied
on to produce phrases that are acceptable and natural to the native
speakers.”
Mindt (1997:44) points out that although ‘any 1’, as defined above, is the
most frequent form of ‘any’, it is rarely mentioned either in teaching
materials or in grammars of contemporary English. He (ibid.) notes,
however, that in the English textbooks he examined this usage occurred
with the same frequency but was never explicitly taught in any of
the exercises on ‘any’, which restricted the teaching to types 2 and 3.
Based on these findings Tesch (1990:345f) proposes a new approach
to the teaching of some and any. The grading she suggests is not
assumed to take place within one lesson but would normally be spread over
several teaching units, and it would include the use of the ‘missing’
meaning of any, which is the most frequent, and where:
all cases of would: certainty/prediction, volition/intention, possibility/
high probability, hypothetical event or result, and habit.
The results were as follows:
                                 will    would
certainty/prediction:            71%     31%
volition/intention:              16%
possibility/high probability:    10%     33%
hypothetical event or result:            18%
habit:                                   13%
Mindt argues from these results that because of their different semantic
profiles will and would should be treated separately in teaching materials.
Similar work has been carried out in Portugal by Prof. Casanova
from the University of Lisbon, who argues (1995:100) that most English
grammars (and therefore language teachers) give inaccurate explanations
of English grammar, which makes them inadequate or unusable. In the
case of the present perfect, grammars normally emphasise that the verb
tense represents an incomplete action which was started in the past
but, as Prof. Casanova shows, this is simply not correct and means that
many exceptions have to be cited. One of the
examples Prof. Casanova uses to demonstrate the inadequacy of this
explanation is the difference between John has lived in Paris and John
has lived in Paris for ten years. In the former case he no longer lives
in Paris, whereas in the latter he still does. It is the adverb of time
for ten years, rather than the verb tense, that indicates that the
action is incomplete.
Mindt (1997:46) suggests that his research work emphasises the
importance of distributional data in grammars for teaching purposes.
Without distributional data there can be no informed grading of the
functions of a grammatical form in a language course. The absence of
distributional data in almost all preceding grammars results in a grading
that is based on intuition rather than on empirical evidence and very
often does not reflect the actual use of English. Halliday (1993:1) argues
that it is only with the development of the modern corpus that “serious
quantitative work in the field of grammar” can take place, the results of
which can show the probabilities of one grammatical pattern occurring
rather than another. The results that are obtained from such quantitative
research, Halliday (1993:6) suggests, are important for “learning and
teaching languages”. Through his work on the COBUILD corpus, Halliday
(1993:20-21) argues that positive and negative occur in English in a ratio
of 9:1 and that the 25 most frequent forms mostly occur only as verbs
whereas in the next 25 a large number of the forms function both as
noun and as verb. Francis (1991:145), working on the same University of
Birmingham COBUILD corpora, finds that “different senses of a noun
display different grammatical behaviour”. Todaka (1996:13) working from
the UCLA Oral Corpus and the Brown University Corpus finds that the
difference in usage of between and among can better be explained by
regarding their difference as a “distinction between ‘individual’ and
‘collective’”, that is, if the items in the NP objects are seen individually,
between is used; if not, among is used. In addition, the sentence
construction most often used with between is between A+B+(C...), whereas
that with among is most often among + plural noun. He notes, however, that
when either of these could be used the preference for one or the other
depended upon the discourse register (formality) and the prescriptive
rules. He (1996:13) suggests that learners of English can apply his
findings to “everyday uses of these prepositions”. Despite all these
studies, there is, as yet, no work that is available for either teachers or
learners that describes English language usage comprehensively.
Minugh (1997) argues that, whenever school grammars use the
words ‘usually’ or ‘often’, students should be encouraged to go to a
corpus and examine a series of instances. In this way, he says they could
gain insight into the fact that the rules in school grammars are
‘necessarily overly simplistic and categorical’.
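Minugh's suggestion of sending students to a corpus to examine a series of instances can be illustrated with a small keyword-in-context (KWIC) concordancer. What follows is only a minimal sketch in Python and is not a tool used in this study: the function name kwic, the context width and the invented sample sentences are all assumptions made for the purpose of illustration.

```python
import re


def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word match of `keyword`."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        # Take up to `width` characters of context on each side of the match.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so that the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


# Invented sentences standing in for a real corpus (hypothetical data).
sample = (
    "You can take any bus from here. Is there any water left? "
    "There isn't any evidence of this. Any student may apply."
)

hits = kwic(sample, "any")
for line in hits:
    print(line)
```

Students could read down the aligned column of hits and judge for themselves how far a rule containing ‘usually’ or ‘often’ holds in the instances retrieved.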
Johns (1997:102) says that working with data leads not only to “a
radical revision of preconceived ideas about what one should be teaching”
but also to a revision of ideas about “how one might teach it.”
a) The simple principle ‘It is probably not worth teaching anything that
does not occur at least x times in a corpus of y million words’ (x and y
being redefinable taking into consideration the level of the learners) makes
it possible to exclude immediately much that is traditionally enshrined in
classroom tradition.
b) Pari passu the work suggests ways of dealing with areas of language
which have traditionally been poorly taught or regarded as unteachable
(e.g. article usage) and reveals areas of language structure (e.g. the
contextual patterning of nouns) that have been neglected both descriptively
and pedagogically.
c) The data controls not only which features of the language are taught, but
which exponents are presented and which meanings are taken as primary
(e.g. in Academic English, may, showing an estimate of probability based
on ‘experience’).
d) More fundamentally, the traditional division between independent
‘levels’ of language (e.g. lexis-syntax-discourse) appears increasingly
untenable once one starts to place at the centre of one’s concern the ways in
which words behave in context. As a result, although the materials have for
the most part a syntactic/functional starting point they could (as the
students themselves have observed) as well be labelled ‘Remedial
Vocabulary’ as ‘Remedial Grammar’.
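The frequency principle in (a) above can be expressed as a simple filter over a word count. The following Python sketch is an illustration only: the function name teachable_items, the toy list of tokens and the threshold values are hypothetical, and x and y_million simply stand for Johns's redefinable x and y.

```python
from collections import Counter


def teachable_items(tokens, x, y_million):
    """Keep items occurring at least x times per y_million million words."""
    counts = Counter(tokens)
    corpus_size = len(tokens)
    kept = {}
    for word, count in counts.items():
        # count / corpus_size >= x / (y_million * 1_000_000),
        # rearranged so that no division is needed.
        if count * y_million * 1_000_000 >= x * corpus_size:
            kept[word] = count
    return kept


# Toy "corpus" of ten tokens; a real corpus would hold millions of words.
tokens = ["the"] * 6 + ["force"] * 3 + ["whilst"]

# Hypothetical threshold of 200,000 occurrences per 1 million words,
# chosen only so that this tiny example separates the items.
selected = teachable_items(tokens, x=200_000, y_million=1)
print(selected)
```

Raising or lowering x for a given level of learner changes which items survive the filter, which is the sense in which the two values are redefinable.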
Work done by Phillips (1985) and later by Hoey (1991) suggests that
discourse can also be explained better by means of lexical phrases.
Nattinger and DeCarrico (1992) say “Lexical phrases are parts of
language that have clearly defined roles in guiding the overall discourse.
In particular, they are the primary markers which signal the direction of
discourse, whether spoken or written.” Although corpus-based research
can help teachers to see what natural language use is, they must be
careful not only to bear in mind the date of the corpus but also to make a
clear distinction between spoken and written language. Much of the work
done has shown a contrast between the two and has even gone as far as
noting differences between different age groups. There are dictionaries
based on computer corpora which clearly demonstrate the most frequent
meanings and collocations of words together with an explanation of
differences between spoken and (general) written language use
(Longman, Collins, Cambridge, etc.). The Longman Dictionary of
Contemporary English (LDOCE) claims to have 25,000 fixed phrases and
collocations. The editors say:
that students are aware of the discourse features that are associated with
them, which may include a transition of the tense used as Biber et al.
(1998:128) have shown in their study on research articles. McCarthy and
Carter (1994:58) suggest that students might be encouraged to produce
text frames which map the article or text being studied.
Hoey’s work (1991) demonstrates that students should be taught to
recognise cohesive devices in order to understand texts and that in order
to write more natural texts in English they should be aware of and use
different forms of repetition. Nattinger and DeCarrico (1992:60) say that
lexical phrases “signal the direction of discourse” whether the
information to follow is in contrast to, is in addition to, or is an example
of information that has preceded and, therefore, students should
recognise and practise this. An obvious way of doing this is through the
use of Cloze exercises which highlight the fact that only certain words are
possible and reflect the limited nature of most of the language that native
speakers use (with the exception of poetry and other forms of imaginative
creative writing which deliberately extend or break the rules). Pronoun
referencing and deictic features of text are very specific to academic prose,
as Biber et al. (1998) and McCarthy and Carter (1994) show, and their
specific use should be studied, once again, by taking actual examples from
the corpora and getting students to work on them in a number of ways.
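A fixed-ratio Cloze exercise of the kind mentioned above can be generated automatically from any sentence taken from a corpus. The Python sketch below is an illustration under stated assumptions rather than a procedure used in this study: the function name make_cloze, the deletion of every fifth word and the sample sentence are all hypothetical choices.

```python
import re

BLANK = "______"


def make_cloze(text, n=5, start=2):
    """Blank every n-th word, beginning with the word at index `start`."""
    words = text.split()
    answers = []
    for i in range(start, len(words), n):
        # Separate the word from any trailing punctuation so that the
        # punctuation stays visible in the exercise.
        m = re.fullmatch(r"([A-Za-z'-]+)([.,;:!?]*)", words[i])
        if m is None:
            # Skip numbers, symbols and anything else that is not a plain word.
            continue
        answers.append(m.group(1))
        words[i] = f"({len(answers)}){BLANK}{m.group(2)}"
    return " ".join(words), answers


# Invented sentence standing in for a line from a textbook corpus.
sentence = (
    "When a gas is heated at constant volume, its pressure increases "
    "because the molecules move faster."
)

cloze_text, answers = make_cloze(sentence)
print(cloze_text)
print(answers)
```

The list of deleted words serves as the answer key, and changing n or start yields a different exercise from the same corpus sentence.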
Sinclair and Renouf (1987) suggest that “the main focus of study
should be on:
All these are available through computer corpora but even Sinclair
and Renouf allow that the use of a grammatical table “may improve the
learning process” by shedding “light from a different angle” and support
an ‘eclectic’ position.
Three final principles that can be deduced from the research
described are:
This first idea has also been backed up by psycholinguistic research into
the manner in which learners remember vocabulary. It is suggested that
schemata are used and that these schemata or word groupings are referred
to in order to enlarge upon and refine understanding. The second
principle is that recycling of items can lead the student to extend the
range of the word and gain insight into its use and facets, thereby
refining the meaning of the word in specific contexts. The third principle,
that is, repetition of items, is something which coursebook writers often
fail to do or fail to do consistently and which teachers must make an
effort to remedy. The better a teacher knows the materials that are being
used on the course, the easier this is. Students often take the stance that
work done in an earlier part of a course is no longer relevant later in the
course. This may be a response which has been produced from school
activities which divide up the material to be taught into convenient
sections which are then tested (and forgotten?) and not referred to
directly again later in the course. The detrimental effect that testing can
have on teaching, leading to ‘teaching for the test’, has to be avoided,
especially at this tertiary level, where the students are learning ever more
detailed information in fewer subject areas and so cannot afford to ‘forget’
the earlier concepts on which the more specialised work is based. This
position cannot be applied to language learning either because the
process is clearly cumulative.
Finally, Portuguese corpora are being produced and, when these are
available, they should provide an even more valuable tool to help teaching.
Bahns (1993:56) describes work carried out on lexical collocations
between German and English. Through contrastive analysis of “tens of
thousands” of lexical collocations the students are helped to identify
equivalent phrases and observe where differences occur so that they can
avoid errors in English. There is a need to reduce the learning load for
students through analysing and isolating the differences and similarities
between the two languages so that the students can be helped to produce
natural language and to avoid specific types of errors. It might also be
possible to find parallel texts (from European sources) in both English
and Portuguese which would be useful for examining differences and
similarities in scientific and technological discourse. Leech (1997:21)
describes such texts being developed through the C.R.A.T.E.R. (McEnery
et al. 1994) and Multext (Ide and Véronis 1994) research projects. He
suggests that the fact that these texts are often highly specialised and
technical is a drawback but this may be a positive aspect for our
undergraduate students.
Using parallel texts would also acknowledge the fact that students
would often be producing some kind of summary or translation from English
into Portuguese for their own use. Halliday (1993:125) suggests that
when texts are translated the translator does not normally alter the
discourse structure of the text that is being translated so this can help
the students to analyse scientific and technological discourses. Carter
(1993:146) argues that this kind of contrastive analysis can help to
produce awareness of socio-cultural meaning which is an extremely
important need if the textbooks continue to be based on the American
models as studied here. He (1993:147) goes further than this however,
and suggests that greater language awareness of this kind increases
learner autonomy and gives learners greater control over their learning,
which, for university undergraduates, is an essential part of the
educational process. Aitchison (1994:95) suggests that understanding
words is “not just a case of sorting out the meanings of individual lexical
items” but that, to understand something fully “involves understanding
the mental models of a culture.” Adams, Heaton and Howarth (1991:11)
suggest that understanding “how cross cultural problems arise can help
the course designer, the teacher and the student to make reasoned
choices at the rhetorical and stylistic levels.” It has been argued that
recognising the different meanings of technical words in scientific
discourse is one of the basic skills that the reader needs in order to
understand that discourse, but it is obvious from the research mentioned
above that this is a somewhat simplistic view and that the respective
culture that underlies the text must also be taken into account. Brumfit (1994:32)
suggests that the emphasis must be on knowledge as a process rather
than as static information and that it is essential for teachers to be
sensitive to the different understandings developed by particular cultural
and linguistic groups in order to be able to help students with their
individual needs.
Computer corpora can be used in at least three ways in teaching.
Fligelstone (1993:98) identifies these as:
• Teaching to establish resources (i.e. designing and creating the corpus)
The first of these principles will take some time to develop and might
more easily be used on mainstream language courses with students who
have more time to focus on language itself. As with the debate about
teaching students the phonetic alphabet, time is the principal constraint
in teaching about language. The second and third of these principles
would require the expertise of the lecturer working with the students on
the corpora and are feasible now that the corpora have been produced. The
fourth principle is something that could be applied if students had
specific areas of their studies which they felt needed addressing so that
the corpora could be built up or driven by what the students perceived
they needed to work on.
In conclusion, the data gathered through this study can provide
examples of the natural language of explanation and exposition found in
science and technology textbooks, together with the means to present
actual language use in this medium to students of the English discipline
in the first year of university. Initially the corpora would be exploited for
teaching (and testing) resources but at a later stage, with adequate
resources available, they could be exploited by the students
themselves on an individual basis to solve their individual language
problems or difficulties.
the broadest sense … it can enhance learning ‘through the language’
about the cultures and ideologies which inform the target language and
its uses.” There are some difficulties attached to activities like the use of
e-mail, however, as Johansson (1991:307) points out: it is a medium
somewhere between speaking and writing and for this reason it is more
prone to error than more studied, revised writing. He (ibid.) suggests that
it is also more “playful and creative, less bound by conventions” so that it
would be a means for students to feel less inhibited in their use of written
language but it would not be a suitable means for encouraging accuracy
in language use.
The repetitive aspect of e-mail “conversations” through computers
also requires some reading skill techniques like scanning and skimming.
However, these would be conducted in an even more interesting,
on-screen situation where the text scrolls up and down. The replies given
and further discussion of points raised have to be picked out from the
repeated material and signings on and off in the electronic conversations
that take place on chat pages.
Computers change the roles that normally pertain, in that the
reader may become the writer or editor of the text and can control the
amount and type of information that is displayed on the monitor (Gill and
Whedbee 1997:160). The Internet is another source of material which
students can access and which they could then edit for their own
purposes. Substantial editing is necessary especially if information
gained through e-mail or the Internet is to be incorporated in the
students’ own documents, and practice in doing this would be required.
With more of these activities taking place there would be a need to revise
the strategies used and to feed the insights gained through such
activities back into teaching materials, thus keeping flexibility and
openness to change a basic requirement of the syllabus.
communication skills, where possible linking autonomous learning to
institutionalised learning.
Chapter 8 Conclusion
sufficient to assess the students’ future success in their first year as
undergraduates. The number of different combinations of circumstances
that the students present on entering the university is vast, and
knowledge about those competencies which are seen as core competencies
in science, but which many of the students lack, would also aid in
targeting the syllabus for these students.
The wide variety in levels between the strongest and the weakest
students suggests that new strategies must be found to cope with large
groups of such a heterogeneous nature. The suggestion that is put
forward here is that these new strategies should be based on materials
and evidence obtained from the frequency counts and variation studies
carried out on the undergraduate textbooks. The teachers would then
have appropriate and relevant materials, and good information, so that
the focus would be on the items that would be most useful to students.
The use of computers and corpus analysis in the discipline would allow
the students themselves to approach their individual problems with the
language of science and technology and would eventually allow self-
access and distance and continuous learning to take place by means of
the university computer network.
In addition, the opportunity to work with colleagues from the
departments which teach the first year undergraduates in an
interdisciplinary manner would help to reinforce the teaching at this level
and provide a coherent framework for students to appreciate the
relevance of the work being done in language classes. From this co-
operation it would be possible to develop projects where the English
language needed by students for their project work could be analysed
and formulated from the language classes. This interdisciplinarity would
also serve to motivate those students who find it difficult to perceive the
relevance of their language studies to their courses. In other words,
content-based EFL will provide a focus and goal for the students.
The testing of the students would then have to change. The present
system does not take into account the target language of science and
technology and so the corpora produced here should be exploited for
testing of students, both at the preliminary stage and for the normal
university evaluation tests throughout the year. In this way, it would be
possible to see if the students who were released from the discipline did
indeed cope with actual language from the corpora of undergraduate
textbooks rather than that perceived by their teachers to be relevant.
Added to this, through the use of tests available through computers, it
would be perfectly possible to design a novel testing procedure which
could in itself be more flexible, allowing students to attempt certain tests
when they felt that they were ready. Incidentally, it should not prove too
difficult to improve the speed of marking and feedback to students by
having a computer-marking system, releasing the teacher for other
valuable activities.
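The kind of computer marking envisaged here can be very simple for gap-fill items. The following Python sketch is purely illustrative and does not describe an existing system at the university: the function name mark, the answer key and the sample responses are hypothetical, and it assumes that each gap has exactly one acceptable answer.

```python
def mark(answer_key, responses):
    """Compare responses with the key; return (score, per-item feedback)."""
    feedback = []
    for index, expected in enumerate(answer_key):
        # A missing response is treated as an empty (incorrect) answer.
        given = responses[index] if index < len(responses) else ""
        # Ignore capitalisation and surrounding spaces when comparing.
        correct = given.strip().lower() == expected.lower()
        feedback.append((index + 1, given, expected, correct))
    score = sum(1 for item in feedback if item[3])
    return score, feedback


# Hypothetical key and one student's responses (the third gap left blank).
key = ["gas", "volume", "the"]
score, feedback = mark(key, ["Gas ", "pressure"])

print(f"{score}/{len(key)}")
for number, given, expected, correct in feedback:
    print(number, "correct" if correct else f"expected '{expected}'")
```

Because the feedback is produced item by item, it could be returned to the student immediately after the test is submitted.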
The possibility of developing further teaching materials through
analyses developed by students themselves from their own interests and
needs is feasible provided that the necessary resources are available.
These would comprise not only up-to-date computers with network
connections but also teaching staff who are confident with both the
technology and corpus-based techniques. This latter knowledge would to
a great extent avoid the complaint that language lecturers do not feel
confident with the subject matter of the materials they are trying to use
with students of science and technology, as their focus would be entirely
on the evidence presented from the corpora in a linguistic analysis. In
other words, teachers would be focusing on the language and not on
science and technology per se.
The role of the lecturer would also undergo a change in the type of
contact and interaction with students. The use of e-mail would allow a
much closer one-to-one interaction between student and tutor and might
develop a different relationship from that enjoyed in a large group of
students meeting for a limited amount of time. The use of e-mail itself
would be a means of moving forwards into the modern world of
communications and language use itself, although the emphasis would
be on individual support from the teacher for students. Experience from
other universities (Motteram, University of Manchester 1998) which have a
highly developed system of tutoring through e-mail would suggest that
tutors would eventually develop a series of frequently asked questions
(FAQs) which could be made available for students to consult and
thereby save some of the tutors’ time answering the same questions over
and over again. Similarly, support material could be provided on-line for
students to work on their own.
More and more corpora are becoming available on-line and on CD-
ROM and with a small investment of time and money many other
resources could be exploited in the language class. As was mentioned at
the beginning of this thesis (see 1.5 The Situation in Portugal), the
undergraduates in the first year are on numerous, different engineering
or degree courses and the use of different corpora in this way would allow
for a diversity of interests which might only become apparent at a later
stage in the students’ courses. Use of the European Union terminological
database EURODICAUTOM on-line would be one means of addressing
the diverse engineering needs of the students, who should be encouraged
to focus on precision in language for the communication of scientific and
technical data and perhaps even to make or perfect comparable
terminology in Portuguese where this is lacking. Allowing for this kind of
subject flexibility would also lay the groundwork for the students to take
up a means of continuing their language studies beyond the end of the
first year and adapting the materials they use to their actual needs.
What is missing from this work is a comprehensive analysis of the
use of lectures and other spoken communication for students in the first
year (and subsequent years) of their courses. The spoken corpus would
also vary widely after the first year and would require other multimedia
resources. Video in particular should be exploited more to present and
practise listening comprehension and note-taking. The actual kinds of
lectures (or spoken communication such as papers at conferences) that
the students could be expected to come into contact with should also be
gathered into a database of materials for both self-access and class use.
Interdisciplinary work with the other departments could allow videoing of
actual lectures or parts of lectures delivered in the university in English.
These lectures, or excerpts from lectures, could then form the basis for
language study materials. There are examples of university lectures in
science and technology available through the Internet from some
American universities. The reason that these lectures are available on-
line is for the students on those courses to study from and then to
contact the tutor via e-mail with any queries and to deliver their
assignments. A similar system could be experimented with in the way
described above.
In conclusion, the corpora produced here could be exploited for use
with students in the first year in collaboration with other disciplines to
focus more closely on those areas identified by colleagues in other
disciplines to be central to the first year students’ needs. Further corpora
are needed to include spoken language from science and technology. The
testing of the students also needs to take into account the competencies
the first year students require to cope with English science and
technology texts. The use of information technology needs to be
reinforced to provide the students with more resources, support and
individual contact with their tutors, as well as to prepare the students for
their future professional lives and as life-long learners.
Many of the recommendations made here can be realised in the
short or medium term in this university with its sophisticated resources
and in other Portuguese universities which want to adopt common-core
courses and modern technology. What would be needed to
maintain the relevance and utility of the language taught/learned by
these undergraduates is the extension of the English discipline into other
years of the courses. Language classes might be provided in parallel with
courses to be taken on an ad hoc basis as students saw fit. This would
necessitate a reappraisal of the language needs of the students at later
stages in their courses and the development of an appropriate syllabus,
methodology and materials. The suggestion (see 7.5 Methodological
Implications) that students themselves could be encouraged to provide the
materials that they need to work on would be relevant in this case: these
materials could then be turned into an electronic corpus which would form
the basis for the language studies carried out by the students.
There is evidence (see Chapter 4) that in the first year,
approximately 10% of the new students (those with fewer than five years
of English studies at school) would benefit from more hours of study to
bring them up to the level of the other students and to make their science
studies through English a feasible proposition. Increasing the number of
hours devoted to English only for these students and including provision
for them to work extensively through self-access material would equip
them better for their future studies. In addition, the results of the
research carried out in this thesis can form the basis for specific
materials for students with different language competencies by drawing on
parallel texts which nevertheless demonstrate the relevant discourse
features displayed in the main corpora. In this way these students could
be brought closer to understanding the texts that they are encouraged to
consult through understanding of the characteristics of those text-types.
Steps are already being taken to exploit computer resources with
students and to provide on-line English texts for students to work on in a
variety of ways (including pronunciation of new vocabulary). Using
suitable materials in the language laboratory which accurately reflect the
students’ needs is also being attempted rather than continuing the
tradition of decontextualised drills and pronunciation work. The results
of this analysis have alerted lecturers to the need to make their materials
reflect the target material for these undergraduates and to place emphasis on
interpretation of visual materials together with texts. All of these different
facets are being brought together into a syllabus which recognises that
much of the work has to be carried out outside the classroom by the
students on their own and gives weight to oral classroom interaction in
order to make the most of the contact time available. Constant
reappraisal of the syllabus has always been a feature of the discipline.
New insights into the students’ competence and needs, new
research findings and the materials used, taking into account those
that have worked successfully with students, are fed back into the
syllabus for the first year students. The corpora will go further than this
however, as they will serve as a guide and object of study for the lecturers
themselves to use to inform their ideas and judgements of what scientific
English is and more importantly is not.
Bibliography
Adams, P., Heaton, B., Howarth, P. (eds. 1991) Socio Cultural Issues in
English for Academic Purposes, Macmillan, London.
Allen, J.P.B. & Widdowson, H.G. (eds. 1974) English in Focus, Oxford,
Oxford University Press
Allen, J.P.B. & Widdowson, H.G. (1978) Teaching the Communicative Use of
English, in Mackay & Mountford (eds. 1978) English for Specific Purposes,
London, Longman.
Astor, C., (ed. 1997) “Voices in Education” in Education in the United
States: Continuity and Change, U.S. Society & Values Electronic Journals of
the U.S. Information Agency, Vol. 2 Nº 5 December 1997 pp.37-39
Bakhtin, M. M. (1986) Speech Genres and Other Late Essays. Trans. Vern
McGee. Eds. C. Emerson and M. Holquist. Austin: University of Texas
Press.
Birnbaum, I. (1987) “IT for Better Teachers” Educational Computing, Sept.
1987 Vol. 8, Issue 6. Pp. 19-21.
Brown, G., Malmkjaer, K., Pollit, A., Williams, J. (eds.1994) Language and
Understanding, Oxford: Oxford University Press.
Butler, C. (1985) Statistics in Linguistics, Oxford: Blackwell.
Clinton, W.J. (1997) in Astor, C., (ed. 1997) “Voices in Education” in
Education in the United States: Continuity and Change, U.S. Society &
Values Electronic Journals of the U.S. Information Agency, Vol. 2 Nº 5
December 1997 pp.37-39
Crystal, D. (1998) “To surf or not to surf: that is the question” in Network:
A Journal for English Language Teacher Education Vol. 1 Number 1 December
1998 The British Council.
Danesi, M. and Di Pietro, R. (1990) Contrastive Analysis for the
Contemporary Second Language Classroom, Ontario: Ontario Institute for
Studies in Education.
“Education and the wealth of nations” in The Economist March 29th 1997
Pp.15-16.
Eggins and Martin (1997) “Genres and Registers of Discourse” in van Dijk,
T. A., (ed. 1997) Discourse as Structure and Process, London: SAGE
Publications.
Evelyn Ng, K.L. and Olivier, W.P (1987) “Computer Assisted Language
Learning: An Investigation on some Design and Implementation Issues” in
System, Vol. 15, No. 1. Pp. 1-17.
Ewer, J. and Hughes-Davies, E. (1971-72) Further notes on developing an
English language programme for students of science and technology,
English Language Teaching Journal XXVI/1 and 3
Friel, M. (1978) A verb frequency count in legal English, ESPEMA, 10,
Spring 1978
Gill, A. M. and Whedbee, K. (1997) “Rhetoric” in van Dijk, T. A., (ed. 1997)
Discourse as Structure and Process, London: SAGE Publications.
Guillot, M-N. & Kenning M-M. (1995) Exploiting the Potential of CD-ROM
Databases: Staff Induction at the University of East Anglia in Computer
Assisted Language Learning 1995 Vol.8 No.4 pp. 365-381.
Hakuta, K. (1976) ‘Becoming bilingual: a case study of a Japanese child
learning English’. Language Learning 26: 321-51.
Halliday, M., McIntosh, A. & Strevens, P. (1964) The linguistic sciences and
language teaching. London: Longman.
Halliday, M.A.K. and Martin, J.R. (1993) Writing Science. Literacy and
Discursive Power London: The Falmer Press.
Heslot, J. (1982) Tense and other Indexical Markers in the Typology of
Scientific Texts in English, in Hoedt, J., Lundquist, L., Picht, H. &
Qvistgaard, J. (eds. 1982) Proceedings of the 3rd European Symposium
on LSP, Copenhagen August 1981, The Copenhagen School of Economics.
Hoedt, J. & Turner, R. (eds. 1981) New Bearings in LSP, The Copenhagen
School of Economics.
Hopper, P. J. (1997) “Discourse and the category ‘Verb’ in English” in
Language and Communication, Vol. 17, Nº 2: Pergamon. Pp.93-102.
Hymes, D. (1971) On Communicative Competence. Philadelphia: University
of Pennsylvania Press.
Johns, A.M. (1986) Coherence and Academic Writing: some Definitions and
Suggestions for Teaching, in TESOL Quarterly Vol. 20 Nº 2 1986 Pp. 247-265.
Johns, T (1994-98) “The text and its message” in Coulthard, M. (ed. 1994 -
1998) Advances in Written Text Analysis, London: Routledge. Pp. 102-116.
Jones, C. (1991) An integrated model for ESP syllabus design. English for
Specific Purposes, 10, 3, Pp. 155-72.
Kachru, B.B. (1990) “World Englishes and Applied Linguistics” in Halliday,
M.A.K., Gibbons, J., Nicholas, H. (eds. 1990) Learning Keeping and Using
Language Volume II Selected Papers from the 8th World Congress of
Applied Linguistics, Sydney, 16-21 August 1987. Amsterdam/
Philadelphia: John Benjamins Publishing. Pp.203-229.
Kroch, A.S. and Hindle, D.M. (1982) A Quantitative Study of the Syntax of
Speech and Writing, Final Report to the National Institute of Education.
Labov, W. (1972) Rules for Ritual Insults, in Sudnow, D. (ed. 1972) Studies
in Social Interaction, New York: The Free Press.
Lemke, J. L. (1990) “Technical Discourse and Technocratic Ideology” in
Halliday, M.A.K., Gibbons, J., Nicholas, H. (eds. 1990) Learning Keeping
and Using Language Volume II Selected Papers from the 8th World
Congress of Applied Linguistics, Sydney, 16-21 August 1987.
Amsterdam/Philadelphia: John Benjamins Publishing. Pp.435-460.
Mettinger, A. (1994) Aspects of Semantic Opposition in English, Clarendon
Press Oxford, Oxford University Press.
Minugh, D. (1997) “All the Language that’s Fit to Print: Using British and
American Newspaper CD-ROMs as Corpora,” in Wichmann, A.,
Fligelstone, S., McEnery, T., Knowles, G. (eds. 1997) Teaching and
Language Corpora, London: Longman. Pp. 67-82.
National Education Goals Report: Building a Nation of Learners (1997)
Washington: U.S. Department of Education. National Education Goals
Panel. (http://www.negp.gov/).
Phillips, M. (1987) Communicative Language Learning and the
Microcomputer, London: The British Council.
Pilbeam, A. (1979) “The language audit.” Language Training, 1,2, Pp. 4-5.
Richterich, R. (1971) Analytical classification of the categories of adults
needing to learn foreign languages. Reprinted in Trim et al.(1973/1980)
Strasbourg: Council of Europe/Oxford: Pergamon.
Sedelow, S. Yeates and Sedelow, W.A. Jr (1994) “A Topologic Model of the
English Semantic Code and its Role in Automatic Disambiguation for
Discourse Analysis” in Hockey, S. and Ide, N. (eds 1994) Research in
Humanities Computing, Oxford: Clarendon Press.
Selinker, L. & Trimble, L. (1976) Scientific and technological writing: the
choice of the tense, Forum (English Teaching Forum), 14 (4), 22-26.
Strevens, P. (1973) Technical, technological and scientific English, English
Language Teaching Journal XXVII/3
Sudnow, D. (ed. 1972) Studies in Social Interaction, New York: The Free
Press.
Swales, J. (1978) Writing “Writing Scientific English”, in Mackay &
Mountford (eds 1978) English for Specific Purposes, London: Longman. Pp.
43-55.
Swales, J. (1985), Episodes in ESP, Oxford: Pergamon Press Ltd.
Tarone, E. & Yule, G. (1989) Focus on the language learner. Oxford: Oxford
University Press.
Trim, J. L. M, Richterich, R., Van Ek, J. & Wilkins, D. (1973/80). Systems
development in adult language learning. Strasbourg: Council of
Europe/Oxford: Pergamon.
van Dijk, T. A., (ed. 1997) Discourse as Structure and Process, London:
SAGE Publications.
van Dijk, T. A., (ed. 1997) Discourse as Social Interaction, London: SAGE
Publications.
Van Ek, J. A. (1976) The Threshold Level for Modern Language Learning in
Schools, Strasbourg: Council of Europe; Longman 1977.
Weber, H. (1982) Language for Specific Purposes, Text Typology, and Text
Analysis: Aspects of a Pragmatic-Functional Approach in Hoedt, J.,
Lundquist, L., Picht, H. & Qvistgaard, J. (eds. 1982) Proceedings of the
3rd European Symposium on LSP, Copenhagen August 1981, The
Copenhagen School of Economics. Pp. 219- 234.
Wilks, Y.A., Slator, B.M. & Guthrie, L.M. (1996) Electric Words: Dictionaries,
Computers, and Meanings, Cambridge, Mass.: MIT Press.
Wilson, E. (1997) “The Automatic Generation of CALL Exercises from
General Corpora” in Wichmann, A., Fligelstone, S., McEnery, T., Knowles,
G. (eds. 1997) Teaching and Language Corpora, London: Longman. Pp.116-130.
Wingard, P. (1981) Some verb forms and functions in six medical texts, in
Selinker (ed. 1981) London, Newbury House.
“World Education League: Who’s top?” The Economist March 29th 1997
Pp.21-25.
Appendices
Appendix A
(C3) PRO-VERBS
12. pro-verb do (e.g., the cat did it)
DO when NOT in the following constructions:
DO + (ADV) + V (DO as auxiliary)
ALL-P/T#/WHP+DO (DO as question)
This feature was included in Biber (1986a) as a marker of register
differences. Do as pro-verb substitutes for an entire clause, reducing the
informational density of a text and indicating a lesser informational focus, due to
processing constraints or a higher concern with interpersonal matters.
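The exclusion rules for feature 12 can be illustrated with a minimal sketch. It assumes the text has already been tagged as (word, tag) pairs; the tag names ('V', 'ADV', 'PUNCT', 'T#', 'WHP') and the function name count_pro_verb_do are illustrative assumptions, not Biber's actual tag set or program. Treating the start of the text as a boundary is likewise an assumption.

```python
# Hypothetical sketch of the pro-verb DO count (feature 12).
DO_FORMS = {"do", "does", "did", "doing", "done"}
BOUNDARY_TAGS = {"PUNCT", "T#", "WHP"}  # ALL-P / T# / WHP in the notation above


def count_pro_verb_do(tagged):
    """Count DO forms that are neither auxiliaries nor question DO.

    tagged: list of (word, tag) pairs using the simplified tags assumed here.
    """
    count = 0
    for i, (word, _tag) in enumerate(tagged):
        if word.lower() not in DO_FORMS:
            continue
        # Excluded: ALL-P/T#/WHP + DO (DO as question).
        # Assumption: the start of the text counts as a boundary.
        prev_tag = tagged[i - 1][1] if i > 0 else "PUNCT"
        if prev_tag in BOUNDARY_TAGS:
            continue
        # Excluded: DO + (ADV) + V (DO as auxiliary).
        j = i + 1
        if j < len(tagged) and tagged[j][1] == "ADV":
            j += 1
        if j < len(tagged) and tagged[j][1] == "V":
            continue
        count += 1
    return count


example = [("the", "DET"), ("cat", "N"), ("did", "V"), ("it", "PRON")]
print(count_pro_verb_do(example))
```

Any DO form that survives both exclusions is counted, which mirrors the "DO when NOT in the following constructions" wording of the definition.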
Appendix A Biber’s Algorithms and Functions
Nominalizations have been used in many register studies. Chafe (1982, 1985,
and Danielewicz 1986) focuses on their use to expand idea units and integrate
information into fewer words. Biber (1986a) finds that they tend to co-occur with
passive constructions and prepositions and thus interprets their function as
conveying highly abstract (as opposed to situated) information. Janda (1985)
shows that nominalizations are used during note-taking to reduce full sentences
to more compact and efficient series of noun phrases.
15. gerunds
All participle forms serving nominal functions - these are edited by hand.
Gerunds (or verbal nouns) are verbal forms serving nominal functions. As
such, they are closely related to nominalizations in their functions. Some
researchers (e.g., Chafe 1982) do not distinguish among the different participial
functions, treating gerunds, participial adjectives (nos. 40-1), and participial
clauses (nos. 25-8) as a single feature. In the present study, these functions are
treated separately.
16. total other nouns
All nouns included in the dictionary, excluding those forms counted as
nominalizations or gerunds.
This count provides an overall nominal assessment of a text. Nominalizations
and gerunds are excluded from the total noun count so that the three features will
be statistically independent.
evaluation (commitment, etc.) being given in the main clause and the
propositional information in the that-clause.
Some verb complements do not have an overt complementizer (e.g., I
think he went); these are counted as a separate feature (no. 60).
22. that adjective complements (e.g., I'm glad that you like it)
ADJ + (T#) + that
(complements across intonation boundaries were edited by hand)
Most studies of that-clauses consider only verb complements. Winter
(1982) points out, however, that verb and adjective complements seem to have
similar discourse functions, and so both should be important for register
comparisons. Because there is no a priori way to know if that verb and adjective
complements are distributed in the same way among genres, they are included as
separate features here. Householder (1964) has compiled a list of adjectives that
occur before that-clauses; Quirk et al. (1985:1222-5) give a grammatical and
discourse description of these constructions.
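The pattern ADJ + (T#) + that given for feature 22 can be sketched as a simple scan over tagged tokens. This is a minimal illustration only: the (word, tag) input format, the tag labels 'ADJ' and 'T#', and the function name are assumptions, and the hand-editing of complements across intonation boundaries is not reproduced.

```python
# Hypothetical sketch of the "that adjective complement" pattern (feature 22).
def count_that_adj_complements(tagged):
    """Count adjectives followed, optionally across one T# marker, by 'that'.

    tagged: list of (word, tag) pairs; 'ADJ' marks adjectives and 'T#' marks
    a tone-unit boundary (both labels are assumptions of this sketch).
    """
    count = 0
    for i, (_word, tag) in enumerate(tagged):
        if tag != "ADJ":
            continue
        j = i + 1
        # Skip a single optional tone-unit boundary.
        if j < len(tagged) and tagged[j][1] == "T#":
            j += 1
        if j < len(tagged) and tagged[j][0].lower() == "that":
            count += 1
    return count


sentence = [("I", "PRON"), ("'m", "V"), ("glad", "ADJ"), ("that", "COMP"),
            ("you", "PRON"), ("like", "V"), ("it", "PRON")]
print(count_that_adj_complements(sentence))
```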
more causative subordination overall in speech, the form as is used as a causative
subordinator more in writing.
36. concessive adverbial subordinators: although, though
Following a general pattern for adverbial clauses, concessive adverbials can also
be used for framing purposes or to introduce background information, and they
have different functions in pre- and post-posed positions (McClure and Geva
1983; Altenberg 1986). Both Altenberg and Tottie (1986) find more concessive
subordination overall in writing.
37. conditional adverbial subordinators: if, unless
Conditional clauses are also used for discourse framing and have differing
functions when they are in pre- or post-posed position (Ford and Thompson
1986). Finegan (1982) finds a very frequent use of conditional clauses in legal
wills, due to the focus on the possible conditions existing when the will is
executed. Several researchers have found more conditional clauses in speech than
in writing (Beaman 1984; Tottie 1986; Biber 1986a; Ford and Thompson 1986),
but the functional reasons for this distribution are not clear.
38. other adverbial subordinators: (having multiple functions)
since, while, whilst, whereupon, whereas, whereby, such that, so that xxx, such
that xxx, inasmuch as, forasmuch as, insofar as, insomuch as, as long as, as soon
as
(where xxx is NOT: N/ADJ)
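The word lists for features 36 to 38 lend themselves to a lookup-based count, sketched below under stated assumptions. The input is taken to be (word, tag) pairs with 'N' and 'ADJ' as noun and adjective tags; these labels and all names in the code are illustrative. Because the list gives "such that" both with and without the xxx restriction, this sketch applies the restriction to both "so that" and "such that", which is an interpretive choice rather than something the list settles.

```python
# Hypothetical sketch of the adverbial subordinator counts (features 36-38).
CONCESSIVE = {"although", "though"}
CONDITIONAL = {"if", "unless"}
OTHER_SINGLE = {"since", "while", "whilst", "whereupon", "whereas", "whereby"}
OTHER_MULTI = [
    ("inasmuch", "as"), ("forasmuch", "as"), ("insofar", "as"),
    ("insomuch", "as"), ("as", "long", "as"), ("as", "soon", "as"),
]
# Counted only when the following word (xxx) is NOT a noun or adjective.
RESTRICTED = {("so", "that"), ("such", "that")}


def count_subordinators(tagged):
    """Return counts for concessive, conditional and other subordinators."""
    words = [w.lower() for w, _ in tagged]
    tags = [t for _, t in tagged]
    counts = {"concessive": 0, "conditional": 0, "other": 0}
    i, n = 0, len(words)
    while i < n:
        # Multi-word subordinators are matched first so "as" is not rescanned.
        phrase = next(
            (p for p in OTHER_MULTI if tuple(words[i:i + len(p)]) == p), None
        )
        if phrase is not None:
            counts["other"] += 1
            i += len(phrase)
            continue
        if tuple(words[i:i + 2]) in RESTRICTED:
            following = tags[i + 2] if i + 2 < n else None
            if following not in {"N", "ADJ"}:
                counts["other"] += 1
            i += 2
            continue
        if words[i] in CONCESSIVE:
            counts["concessive"] += 1
        elif words[i] in CONDITIONAL:
            counts["conditional"] += 1
        elif words[i] in OTHER_SINGLE:
            counts["other"] += 1
        i += 1
    return counts


sample = [("Although", "SUB"), ("it", "PRON"), ("rained", "V"), (",", "PUNCT"),
          ("we", "PRON"), ("stayed", "V"), ("as", "ADV"), ("long", "ADJ"),
          ("as", "SUB"), ("we", "PRON"), ("could", "V"), ("so", "ADV"),
          ("that", "SUB"), ("he", "PRON"), ("could", "V"), ("rest", "V"),
          ("if", "SUB"), ("needed", "V")]
print(count_subordinators(sample))
```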
Discourse particles are used to maintain conversational coherence
(Schiffrin 1982, 1985a). Chafe (1982, 1985) describes their role as 'monitoring
the information flow' in involved discourse. They are very generalized in their
functions and rare outside of the conversational genres.
51. demonstratives that/this/these/those
(This count excludes demonstrative pronouns (no. 10) and that as relative,
complementizer, or subordinator.)
Demonstratives are used for both text-internal deixis (Kurzon 1985) and
for exophoric, text-external, reference. They are an important device for marking
referential cohesion in a text (Halliday and Hasan 1976). Ochs (1979) notes that
demonstratives are preferred to articles in unplanned discourse.
56. private verbs
(e.g., anticipate, assume, believe, conclude, decide, demonstrate, determine,
discover, doubt, estimate, fear, feel, find, forget, guess, hear, hope, imagine,
imply, indicate, infer, know, learn, mean, notice, prove, realize, recognize,
remember, reveal, see, show, suppose, think, understand)
This class of verbs is taken from Quirk et al. (1985:1181-2).
57. suasive verbs
(e.g., agree, arrange, ask, beg, command, decide, demand, grant, insist, instruct,
ordain, pledge, pronounce, propose, recommend, request, stipulate, suggest,
urge)
This class of verbs is taken from Quirk et al. (1985:1182-3).
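The two verb classes listed above can be counted with a plain set lookup, as in the following sketch. It assumes the input is a list of lemmatized tokens (so that inflected forms have already been reduced to base forms) and reports a rate per 1,000 tokens; the lemmatized input, the normalization basis, and the function name are assumptions of this sketch, not details given in the feature descriptions. The sets hold only the example verbs quoted above, and decide is counted in both classes because it appears in both lists.

```python
# Hypothetical sketch of private/suasive verb rates (features 56 and 57).
PRIVATE = {
    "anticipate", "assume", "believe", "conclude", "decide", "demonstrate",
    "determine", "discover", "doubt", "estimate", "fear", "feel", "find",
    "forget", "guess", "hear", "hope", "imagine", "imply", "indicate",
    "infer", "know", "learn", "mean", "notice", "prove", "realize",
    "recognize", "remember", "reveal", "see", "show", "suppose", "think",
    "understand",
}
SUASIVE = {
    "agree", "arrange", "ask", "beg", "command", "decide", "demand", "grant",
    "insist", "instruct", "ordain", "pledge", "pronounce", "propose",
    "recommend", "request", "stipulate", "suggest", "urge",
}


def verb_class_rates(lemmas, per=1000):
    """Return private and suasive verb frequencies per `per` tokens."""
    n = len(lemmas)
    if n == 0:
        return {"private": 0.0, "suasive": 0.0}
    private = sum(1 for w in lemmas if w.lower() in PRIVATE)
    suasive = sum(1 for w in lemmas if w.lower() in SUASIVE)
    return {"private": private * per / n, "suasive": suasive * per / n}


lemmas = ["we", "assume", "the", "result", "and",
          "suggest", "a", "test", "then", "decide"]
print(verb_class_rates(lemmas))
```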
58. seem/appear
These are 'perception' verbs (Quirk et al. 1985:1033, 1183). They can be used to
mark evidentiality with respect to the reasoning process (Chafe 1985), and they
represent another strategy used for academic hedging (see the discussion of
downtoners - no. 46).
Appendix B
Appendix B Alphabetical Frequency List