Assessing Speaking


ASSESSING SPEAKING

PAPER

By:

FIRDAUS NUR HABIBA (201910560211002)

IKA YULIANA (201910560211011)

ENGLISH LANGUAGE EDUCATION DEPARTMENT


FACULTY OF MAGISTER ENGLISH LANGUAGE EDUCATION
UNIVERSITY OF MUHAMMADIYAH MALANG
2020

A. Basic Types of Speaking
1. Imitative

The first type of speaking performance is the ability to simply imitate a word or
phrase or possibly a sentence. This type is concerned only with pronunciation; no
inferences are made about the test-taker’s ability to understand or convey meaning or
to participate in an interactive conversation. The only role of listening here is in
the short-term storage of a prompt, just long enough to allow the speaker to retain
the short stretch of language that must be imitated.

2. Intensive

A second type of speaking frequently employed in assessment contexts is the
production of short stretches of oral language designed to demonstrate competence in
a narrow band of grammatical, phrasal, lexical, or phonological relationships. The
speaker must be aware of semantic properties in order to be able to respond.

3. Responsive

Responsive assessment tasks include interaction and test comprehension but
at the somewhat limited level of very short conversations, standard greetings and
small talk, simple requests and comments, and the like. The stimulus is almost
always a spoken prompt.

4. Interactive

The difference between responsive and interactive speaking is in the length
and complexity of the interaction, which sometimes includes multiple exchanges
and/or multiple participants. Interaction can take two forms: transactional
language, which has the purpose of exchanging specific information, or interpersonal
exchanges, which have the purpose of maintaining social relationships.

5. Extensive

Extensive oral production tasks include speeches, oral presentations, and
story-telling, during which the opportunity for oral interaction from listeners is
either highly limited or ruled out altogether.

B. Micro and Macro-skills of Speaking

There is such an array of oral production tasks that a complete treatment is
almost impossible within the confines of a single chapter. Below is a
consideration of the most common techniques, with brief allusions to related tasks.
Three issues should be kept in mind when designing tasks:

a. No speaking task is capable of isolating the single skill of oral production.
b. Eliciting the specific criterion you have designated for a task can be
tricky because, beyond the word level, spoken language offers a number
of productive options to test-takers.
c. It is important to carefully specify scoring procedures for a response so
that ultimately you achieve as high a reliability index as possible.
C. Designing Assessment Tasks: Imitative Speaking

An occasional phonologically focused repetition task is warranted as long
as repetition tasks are not allowed to occupy a dominant role in an overall oral
production assessment, and as long as you artfully avoid a negative washback effect.
In a simple repetition task, test-takers repeat the stimulus, whether it is a pair of
words, a sentence, or perhaps a question.

Scoring specifications must be clear in order to avoid reliability
breakdowns. A common form of scoring simply uses a two- or three-point system for
each response.
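
The point system described above can be sketched in a few lines of Python. This is
only a minimal illustration under stated assumptions: the three scale labels, the
function names score_repetition_task and exact_agreement, and the use of simple exact
agreement between two raters as a rough reliability check are all hypothetical choices
made for this example and are not taken from any published test.

```python
from statistics import mean

# Hypothetical three-point scale for a repetition task. The labels are
# illustrative assumptions, not a published rubric.
SCALE = {
    2: "acceptable pronunciation",
    1: "comprehensible, partially correct pronunciation",
    0: "silence or seriously incorrect pronunciation",
}


def score_repetition_task(ratings):
    """Return the total, maximum possible, and mean score for one test-taker."""
    if not ratings:
        raise ValueError("at least one rated response is required")
    for rating in ratings:
        if rating not in SCALE:
            raise ValueError(f"rating {rating!r} is not on the scale {sorted(SCALE)}")
    return {
        "total": sum(ratings),
        "max": max(SCALE) * len(ratings),
        "mean": mean(ratings),
    }


def exact_agreement(rater_a, rater_b):
    """Proportion of responses on which two raters gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of responses")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


if __name__ == "__main__":
    first_rater = [2, 1, 0, 2]
    second_rater = [2, 1, 1, 2]
    print(score_repetition_task(first_rater))
    print(exact_agreement(first_rater, second_rater))
```

Keeping the scale in one small table means every response is rated against the same
explicit descriptors, which is the kind of clarity in scoring specifications that helps
avoid reliability breakdowns.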

1. Phone-pass Test

An example of a popular test that uses imitative (as well as intensive)
production tasks is Phone-pass, a widely used, commercially available speaking test
in many countries. Among a number of speaking tasks on the test, repetition of
sentences (of 8 to 12 words) occupies a prominent role. The Phone-pass test elicits
computer-assisted oral production over a telephone. Test-takers read aloud, repeat
sentences, say words, and answer questions.

Scores for the Phone-pass test are calculated by a computerized scoring
template and reported back to the test-taker within minutes. Six scores are given: an
overall score between 20 and 80 and five sub-scores on the same scale that rate
pronunciation, reading fluency, repeat accuracy, repeat fluency, and listening
vocabulary.

The tasks on Parts A and B of the Phone-pass test do not extend beyond the
level of oral reading and imitation. Parts C and D represent intensive speaking.
Section E is used only for experimental data gathering and does not figure into the
scoring. The scoring procedure has been validated against human scoring with
extraordinarily high reliabilities and correlation statistics (.94 overall).

D. Designing Assessment Tasks: Intensive Speaking

Intensive tasks may also be described as limited-response tasks (Madsen,
1983), or mechanical tasks (Underhill, 1987), or what classroom pedagogy would
label as controlled responses.

1. Directed Response Tasks

In this type of task, the test administrator elicits a particular grammatical form
or a transformation of a sentence. Such tasks are clearly mechanical and not
communicative, but they do require minimal processing of meaning in order to
produce the correct grammatical output.

2. Read-Aloud Tasks

Intensive read-aloud tasks include reading beyond the sentence level up to a
paragraph or two. This technique is easily administered by selecting a passage that
incorporates test specs and by recording the test-taker’s output; the scoring is
relatively easy because all of the test-taker’s oral production is controlled. An
earlier form of the Test of Spoken English (TSE) incorporated one read-aloud passage
of about 120-130 words with a rating scale for pronunciation and fluency. The
following passage is typical:

3. Sentence/ Dialogue Completion Tasks and Oral Questionnaires

Another technique for targeting intensive aspects of language requires
test-takers to read a dialogue in which one speaker’s lines have been omitted.
Test-takers are first given time to read through the dialogue to get its gist and to
think about appropriate lines to fill in. Then, as the tape, teacher, or test
administrator produces one part orally, the test-taker responds. Here’s an example.

An advantage of this technique lies in its moderate control of the output of the
test-taker. While individual variations in responses are accepted, the technique taps
into a learner’s ability to discern expectancies in a conversation and to produce
sociolinguistically correct language. One disadvantage of this technique is its
reliance on literacy and an ability to transfer easily from written to spoken English.
Another disadvantage is the contrived, inauthentic nature of this task.

4. Picture-Cued Tasks

One of the more popular ways to elicit oral language performance at both
intensive and extensive levels is a picture-cued stimulus that requires a description
from the test-taker. Pictures may be very simple, designed to elicit a word or a
phrase. Here is an example of a picture-cued elicitation of the production of a simple
minimal pair.

5. Translation (of Limited Stretches of Discourse)

Translation is a part of our tradition in language teaching that we tend to
discount or disdain, if only because our current pedagogical stance plays down its
importance. Translation methods of teaching are certainly passé in an era of direct
approaches to creating communicative classrooms. Even so, translation is a
well-proven communication strategy for learners of a second language.

Conditions may vary from expecting an instant translation of an orally elicited
linguistic target to allowing more thinking time before producing a translation of
a somewhat longer text, which may optionally be offered to the test-taker in written
form. The advantage of translation lies in its control of the output of the
test-taker, which of course means that scoring is more easily specified.

E. Designing Assessment Tasks: Responsive Speaking


1. Question and Answer

Question and answer tasks can consist of one or two questions from an
interviewer, or they can make up a portion of a whole battery of questions and
prompts in an oral interview. They can vary from simple questions like “What is this
called in English?” to complex questions like “What are the steps governments
should take?” The first question is intensive in its purpose: it is a display question
intended to elicit a predetermined correct response. Questions at the responsive level
tend to be genuine referential questions in which the test-taker is given more
opportunity to produce meaningful language in response.

Notice that question number 5 has five situationally linked questions that may
vary slightly depending on the test-taker’s response to a previous question.

A potentially tricky form of oral production assessment involves more than
one test-taker with an interviewer. With two students in an interview context, both
test-takers can ask questions of each other.

2. Giving Instructions and Directions

Using such a stimulus in an assessment context provides an opportunity for
the test-taker to engage in a relatively extended stretch of discourse, to be very clear
and specific, and to use appropriate discourse markers and connectors. The technique
is simple: the administrator poses the problem, and the test-taker responds. Scoring is
based primarily on comprehensibility and secondarily on other specified grammatical
or discourse categories. Here are some possibilities.

3. Paraphrasing

Another type of assessment task that can be categorized as responsive asks
the test-taker to read or hear a limited number of sentences (perhaps two to five) and
produce a paraphrase of them. For example:

A more authentic context for paraphrase is aurally receiving and orally
relaying a message. In the example below, the test-taker must relay information from
a telephone call to an office colleague named Jeff.

The advantages of such tasks are that they elicit short stretches of output and
perhaps tap into test-takers’ ability to practice the conversational art of
conciseness by reducing the output/input ratio.

F. Test of Spoken English (TSE)

The Test of Spoken English (TSE) is a 20-minute audiotaped test of oral
language ability within an academic or professional environment. TSE scores are used
for selecting and certifying health professionals such as physicians, nurses,
pharmacists, physical therapists, and veterinarians.

The tasks on the TSE are designed to elicit oral production in various
discourse categories rather than in selected phonological, grammatical, or lexical
targets. The following content specifications for the TSE represent the discourse and
pragmatic contexts assessed in each administration:

1. Describe something physical
2. Narrate from presented material
3. Summarize information of the speaker’s own choice
4. Give directions based on visual materials
5. Give instructions
6. Give an opinion
7. Support an opinion
8. Compare/contrast
9. Hypothesize
10. Function interactively
11. Define

Using these specifications, Lazaraton and Wagner (1996) examined 15 different
specific tasks in collecting background data from native and non-native speakers of
English.

1. Giving a personal description
2. Describing a daily routine
3. Suggesting a gift and supporting one’s choice
4. Recommending a place to visit and supporting one’s choice
5. Giving directions
6. Describing a favourite movie and supporting one’s choice
7. Telling a story from pictures
8. Hypothesizing about future action
9. Hypothesizing about preventative action
10. Making a telephone call to the dry cleaner
11. Describing an important news event
12. Giving an opinion about animals in the zoo
13. Defining a technical term
14. Describing information in a graph and speculating about its implications
15. Giving details about a trip schedule

Following is a set of sample items as they appear in the TSE Manual, which
is downloadable from the TOEFL website.

Holistic scoring taxonomies such as these imply a number of abilities that
comprise “effective” communication and “competent” performance of the task. The
original version of the TSE (1987) specified three contributing factors to a final
score on “overall comprehensibility”: pronunciation, grammar, and fluency. The current
scoring scale of 20 to 60 listed above incorporates task performance, function,
appropriateness, and coherence as well as the form-focused factors.

Following is a summary of information on the TSE:

G. Designing Assessment Tasks: Interactive Speaking
1. Interview

Interviews can vary in length from perhaps five to forty-five minutes,
depending on their purpose and context. Placement interviews, designed to get a
quick spoken sample from a student in order to verify placement into a course, may
need only five minutes if the interviewer is trained to evaluate the output accurately.

An interview typically moves through four stages:

a. Warm-up: in a minute or so of preliminary small talk, the interviewer
directs mutual introductions, helps the test-taker become comfortable with
the situation, apprises the test-taker of the format, and allays anxieties. No
scoring of this phase takes place.
b. Level check: through a series of pre-planned questions, the interviewer
stimulates the test-taker to respond using expected or predicted forms and
functions.
c. Probe: probe questions and prompts challenge test-takers to go to the heights
of their ability, to extend beyond the limits of the interviewer’s
expectation through increasingly difficult questions.
d. Wind-down: the final phase of the interview is simply a short period of time
during which the interviewer encourages the test-taker to relax with some
easy questions, sets the test-taker’s mind at ease, and provides
information about when and where to obtain the result of the interview.
This part is not scored.

The success of an oral interview will depend on:

• Clearly specifying administrative procedures of the assessment (practicality)
• Focusing the questions and probes on the purpose of the assessment (validity)
• Appropriately eliciting an optimal amount and quality of oral production
from the test-taker (biased for best performance)
• Creating a consistent, workable scoring system (reliability)

2. Role Play

Role playing is a popular pedagogical activity in communicative language
teaching classes. Within constraints set forth by the guidelines, it frees students to
be somewhat creative in their linguistic output. In some versions, role play allows
some rehearsal time so that students can map out what they are going to say. The test
administrator must determine the assessment objectives of the role play, then devise a
scoring technique that appropriately pinpoints those objectives.

3. Discussions and Conversations

As formal assessment devices, discussions and conversations with and among
students are difficult to specify and even more difficult to score. Discussions may be
especially appropriate tasks through which to elicit and observe such abilities as:

• Topic nomination, maintenance, and termination
• Attention getting, interrupting, floor holding, control
• Clarifying, questioning, paraphrasing
• Comprehension signals
• Negotiating meaning
• Intonation patterns for pragmatic effect
• Kinesics, eye contact, proxemics, body language
• Politeness, formality, and other sociolinguistic factors

Scores or checklists for assessing the performance of participants should
be carefully designed to suit the objectives of the discussion.

4. Games

Among informal assessment devices are a variety of games that directly
involve language production. Consider the following types:

When games are used as assessments, the key is to specify a set of criteria and a
reasonably practical and reliable scoring method. The benefit of such an informal
assessment may not be as much in summative evaluation as in its formative nature,
with washback for the students.

H. Oral Proficiency Interview (OPI)

The best-known oral interview format is one that has gone through a
considerable metamorphosis over the last half-century, the Oral Proficiency
Interview (OPI). Originally known as the Foreign Service Institute (FSI) test, the OPI
is the result of a historical progression of revisions under the auspices of several
agencies, including the Educational Testing Service and the American Council on the
Teaching of Foreign Languages (ACTFL).

First, the ACTFL guidelines are more reflective of a unitary definition of ability.
Instead of focusing on separate abilities in grammar, vocabulary, comprehension,
fluency, and pronunciation, they focus more strongly on the overall task and on the
discourse ability needed to accomplish the goals of the task. Second, for classroom
assessment purposes, the six FSI categories more appropriately describe the components
of oral ability than do the ACTFL holistic scores, and therefore offer better
washback. Third, the ACTFL requirement for specialized training renders the OPI less
useful for classroom adaptation.

Here is a summary of the ACTFL OPI:

I. Designing Assessment: Extensive Speaking


1. Oral Presentations

A summary of oral assessment techniques would be incomplete
without some consideration of extensive speaking tasks. Once again the rules for
effective assessment must be invoked: (a) specify the criterion, (b) set appropriate
tasks, (c) elicit optimal output, and (d) establish practical, reliable scoring
procedures. Following is an example of a checklist for a prepared oral presentation at
the intermediate or advanced level of English.

The washback effect of such a checklist will be enhanced by written
comments from the teacher, a conference with the teacher, peer evaluations using the
same form, and self-assessment.

2. Picture-Cued Story-Telling

One of the most common techniques for eliciting oral production is through
visual stimuli: pictures, photographs, diagrams, and charts. Consider the following
set of pictures.

Your criteria for scoring need to be clear about what it is you are hoping to
assess. Refer back to some of the guidelines suggested under the section on oral
interviews, above, or to the OPI for some general suggestions on scoring such a
narrative.

3. Retelling a Story, News Event

In this type of task, test-takers hear or read a story or news event that they
are asked to retell. The objectives in assigning such a task vary from listening
comprehension of the original to production of a number of oral discourse features
(communicating sequences and relationships of events, stress and emphasis patterns,
expression in the case of a dramatic story), fluency, and interaction with the hearer.
Scoring should of course meet the intended criteria.

4. Translation (of Extended Prose)

Translation of words, phrases, or short sentences was mentioned under the
category of intensive speaking. Here, longer texts are presented for translation.
Those texts could come in many forms: dialogue, directions for assembly of a product,
a synopsis of a story or play or movie, directions on how to find something on a map,
and other genres. The advantage of translation lies in the control of the content,
vocabulary, and, to some extent, the grammatical and discourse features. The
disadvantage is that translation of longer texts is a highly specialized skill for
which some individuals obtain post-baccalaureate degrees. Criteria for scoring should
therefore take into account not only the purpose in stimulating a translation but also
the possibility of errors that are unrelated to oral productive ability.

REFERENCE

Brown, H. Douglas. (2004). Language Assessment: Principles and Classroom Practices.
Pearson Longman.
