University of Tsukuba, Graduate School of Humanities and Social Sciences, Modern Languages and Cultures, Akiyo Hirai Laboratory



Academic Year 2020  English Language Education VII

 

Chapter 12 Testing Listening

 

R.R

Introduction

·         Although listening is often paired with speaking, there are many advantages to testing it separately. This is mainly because there are many situations in daily life where listening is exercised alone, such as listening to the radio, lectures, online talks, etc.

·         Because listening, like reading, is a receptive skill, there are many similarities in the way the two are tested.

·         However, because spoken language is transient, testing listening poses some problems: candidates cannot move back and forth through the audio as they would through a written text.

 

Specifying what the candidate should be able to do

 

Content

1.      Global operations:

·         obtain the gist

·         follow an argument

·         recognize the attitude of a speaker

 

2.      Informational operations:

·         obtain factual information

·         follow instructions

·         understand expressions of need

 

3.      Interactional operations:

·         understand greetings and introductions

·         understand expressions of agreement and disagreement

·         recognize speaker’s purpose

 

Problems encountered while testing lower-level listening skills:

·         discriminate between vowel and consonant phonemes

·         interpret intonation patterns (sarcasm, questions, etc.)

·         interpret non-verbal information (facial expressions and gestures)

 

Texts

·         Text type: covers the number of speakers (monologue, dialogue, etc.) as well as the kind of text (conversation, announcement, lecture, etc.).

 

·         Text form: description, exposition, argumentation, instruction, and narration.

 

·         Length

 

·         Speed of speech: expressed as words per minute (wpm) or syllables per second (sps).

Average speeds for British English are reported by Tauroza and Allison (1990).

 

·         Accents: regional or non-regional
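
The two speed measures listed above (wpm and sps) can be computed from a transcript count and the recording length. The following is a minimal sketch in Python; the function names and the example figures are assumptions made for illustration and are not taken from Tauroza and Allison (1990).

```python
def words_per_minute(word_count: int, duration_seconds: float) -> float:
    """Speech rate in words per minute (wpm)."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return word_count * 60 / duration_seconds


def syllables_per_second(syllable_count: int, duration_seconds: float) -> float:
    """Speech rate in syllables per second (sps)."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return syllable_count / duration_seconds


# Hypothetical recording: 450 words and 720 syllables in 3 minutes (180 s).
example_wpm = words_per_minute(450, 180)      # 150.0
example_sps = syllables_per_second(720, 180)  # 4.0
```

Word and syllable counts are passed in rather than computed, since counting syllables automatically is itself error-prone and is best checked by hand for a test recording.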

 

Setting criterial levels of performance

As with testing reading, “if the test is set at an appropriate level, then a near perfect set of responses may be required for a ‘pass’”.

 

Setting the tasks

Selecting samples of speech (texts)

·         Keep test specifications in mind when selecting texts. For example, radio, online lectures, television, etc. are good sources of authentic speech when testing how well candidates can understand audio intended for expert speakers.

·         Recordings made specifically for non-expert speakers can be good options. However, because such recordings often adjust the natural attributes of speech in an effort to be accessible, they can introduce unreliability by creating difficulties unrelated to speech comprehension.

·         Thus, recordings made specifically for testing must be created with care to be as natural as possible.

 

Writing items

·         When writing items for a listening test, it’s useful to listen to the text and take note of what candidates should be able to do with it.

·         Key words should be pointed out to the candidates, and they should be given ample time to familiarize themselves with the items in the written handout before they listen to the text.

 

Possible techniques

·         Multiple choice

o   Choices should be kept short and simple.

·         Short answer

o   Make sure the questions are short, straightforward, and obvious.

·         Gap filling

o   Keep the questions short and avoid unique answers.

·         Information transfer

o   Also useful for testing reading. Makes minimal demands on productive skills and can involve a variety of activities such as labeling diagrams or pictures, completing forms, showing routes on a map, etc.

·         Note taking

o   Instead of scoring candidates’ notes directly, it’s better to have candidates use their notes to answer questions after a lecture. The more straightforward the questions, the more reliable the results of the test will be.

·         Partial dictation

o   Traditional dictation is often very difficult to score reliably. Partial dictation is more reliable, as long as listening remains the focus and other factors, such as correct spelling, are not scored.

·         Transcription

·         Moderating the items

 

Presenting the texts (live or recorded?)

·         The benefit of recordings is that the uniformity of the text allows for increased reliability. However, they need to be presented in a room with good acoustics or in a language lab so that the audio is equally clear in all parts of the room.

·         Otherwise, live performances of the text can be used as long as one reliable, responsible, and trustworthy speaker, who has a good command of the language, is used for each test room.

 

Scoring the listening test

·         As previously mentioned, when scoring a listening test, deducting points for grammar or spelling is not appropriate. As long as the intended response is clearly understood, it should be marked as correct.
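
The scoring principle above can be partly supported in software when short answers are collected electronically. The following is a minimal sketch in Python, assuming a hypothetical answer key and a hypothetical similarity threshold of 0.8; it only flags responses that are probably the intended answer despite a spelling slip, and a human marker should still make the final decision.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def is_probably_intended(response: str, accepted_answers: list[str],
                         threshold: float = 0.8) -> bool:
    """Return True if the response closely resembles any accepted answer.

    The threshold is an assumption chosen for illustration, not a
    recommended value; it should be tuned against hand-marked scripts.
    """
    cleaned = normalize(response)
    return any(
        SequenceMatcher(None, cleaned, normalize(answer)).ratio() >= threshold
        for answer in accepted_answers
    )


# Hypothetical item whose key is "library": a misspelling is still accepted.
misspelled_ok = is_probably_intended("libary", ["library"])    # True
unrelated_ok = is_probably_intended("station", ["library"])    # False
```

String similarity cannot judge whether "the intended response is clearly understood", so borderline cases near the threshold are best set aside for manual review rather than scored automatically.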

 

Reader Activities

1.      a. Choose an online video lecture that would be appropriate for a group of students with whom you are familiar. Play a five-minute stretch to yourself and take notes. On the basis of the notes, construct eight short-answer items. Ask colleagues to take the test and comment on it. Amend the test as necessary and administer it without the video (audio only) to half of the group of students you had in mind. Analyze the results.

 

Video: Green with Happiness: Meet the Green Lady of Brooklyn https://www.youtube.com/watch?v=pE5h2kk0NTI
Questions (the number of questions has been adjusted from eight for a five-minute video to five for a three-minute video):

1.      Where did the Green Lady grow up?

2.      What did she want to be when she grew up?

3.      Why was she excited to visit her father in Florida?

4.      How long has she been wearing green?

5.      Why does she continue to wear green?

 

b. Administer the same test to the other half of the group, showing them the video as well as the audio. What differences do you notice between the performance of the two groups of students? Go through the test item by item with the students and ask for their comments. How far, and how well, is each item testing what you thought it would test?

 

Comments:

Many students were interested in the content of the video and found that, because it was short (three minutes), it was easy to follow and to find answers to the questions. Some students felt that five questions were a lot for three minutes, even though I had reduced the number from eight for a five-minute video to five for a three-minute video.

Students were able to do well with specific questions such as “Where”, “What”, and “How long”, but had a harder time answering deeper questions starting with “Why”.
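
One way to compare the audio-only and video groups item by item is to compute the proportion of students who answered each item correctly in each group. The following is a minimal sketch in Python; the score tables are invented placeholder data (1 = correct, 0 = incorrect) and are not the results of the activity described above.

```python
def proportion_correct(scores: list[list[int]]) -> list[float]:
    """Proportion of students answering each item correctly.

    `scores` holds one row per student and one 0/1 entry per item.
    """
    if not scores:
        raise ValueError("scores must contain at least one student")
    n_items = len(scores[0])
    return [sum(row[i] for row in scores) / len(scores) for i in range(n_items)]


# Invented placeholder data: four students per group, five items each.
audio_only = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 0],
    [0, 1, 0, 1, 1],
]
with_video = [
    [1, 1, 1, 1, 0],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1],
]

audio_rates = proportion_correct(audio_only)  # [0.75, 0.75, 0.25, 1.0, 0.25]
video_rates = proportion_correct(with_video)  # [1.0, 1.0, 0.75, 0.75, 0.75]

# Positive values mean the item was answered correctly more often with video.
differences = [v - a for a, v in zip(audio_rates, video_rates)]
```

With groups this small, differences between items are only a starting point for the item-by-item discussion with students, not evidence on their own.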

 

2.      Design short items that attempt to discover whether candidates can recognize: sarcasm, surprise, boredom, elation. Try these on colleagues and students.

 

1.      You’re incredible!

2.      I had no idea.

3.      Wow!

 

Comments:

This was fun for students to try out. Sarcasm and boredom have similar tones at times, so it was a bit difficult for some students to distinguish between the two.

 

3.      Design a test that requires candidates to draw (or complete) simple pictures. Decide exactly what the test is measuring. Think what other things could be measured using this or similar techniques. Administer the test and see if the students agree with you about what is being measured.

 

Draw the face as described.

1.      He has curly hair.

2.      He has a beard.

3.      He is angry.

          

(From left to right: original unfinished drawing and two complete drawings from students.)

 

Comments:

Scoring a test on drawing alone would be very difficult and unreliable, as results would vary with drawing ability. Thus, I think it would be necessary to include an “image bank” of options to draw in order to complete the test. This kind of test could measure informational operations, but also interactional operations, by including tasks such as “He said, ‘What are you doing?’” and asking the participants to draw the facial expression that best reflects the tone of the question or statement.

 

Discussion Questions

1.      Sarcasm was mentioned multiple times in this text as a difficult form of speech to understand. What makes sarcasm so difficult to understand? Please give your reasons.

 

 

2.      Have you encountered sarcasm (in conversation or through listening materials such as television or movies)? How did you react?