Academic Year 2018: Evaluation in Intercultural Language Education (異文化言語教育評価論)
The Summary of the Second Paper I Read
Title: Distinguishing features in scoring L2 Chinese speaking performance: How do they work?
1. Introduction
Distinguishing features have been widely used
in the construction of rating scales to score L2 performances. International
English tests such as TOEFL and IELTS use distinguishing features to rate
L2 learners’ speaking performance. However, some researchers have argued
that using distinguishing features to judge L2 learners’ speaking ability may
not be suitable.
In order to validate the
distinguishing features widely used in scoring L2 learners’ speaking ability,
this paper examines the relation between L2 learners’ speaking scores and
distinguishing speaking features.
2. Literature Review
Although
previous L2 English studies (Brown et al., 2005; Brown, 2006a; Iwashita et al.,
2008) suggested that there is a link between distinguishing features and
scores, this issue has not been clearly examined in L2 Chinese studies.
Previous L2 Chinese studies (Wang, 2002; Zhu, 2009) mainly illustrate that
there is a relationship between distinguishing features and L2 learners’ language
proficiency. They found that distinguishing features differed significantly
among L2 learners’ performances at different proficiency levels. These
studies, however, did not clearly examine how these distinguishing features
relate to L2 learners’ speaking scores. This paper aims to
fill this gap by investigating how distinguishing features contribute to L2
Chinese learners’ speaking scores.
3. Identification of distinguishing features
In this study, distinguishing features were
identified by analyzing four sets of influential documents on L2 Chinese
teaching and testing syllabi that focus on speaking proficiency. The titles of
the documents used are listed as follows:
1. Chinese Proficiency Scales and Syllabus of Graded Grammar
2. Chinese Language Proficiency Scales for Speakers of Other Languages
3. Five-Band Holistic Scoring Standards for L2 Chinese Speaking in Test Syllabus for HSK-Advanced Level
4. Spoken Chinese Proficiency Grading Standards and Testing Guidelines
By analyzing the documents, seven distinguishing
features were identified: target-like
syllables, speech rate, pause time, word tokens, word types, grammatical
accuracy and grammatical complexity. These were the
distinguishing features analyzed in this study.
4. Research Questions
In this study, two research questions were
addressed:
1. In what way does each of the distinguishing features relate to the scores?
2. How do the distinguishing features contribute to the scores?
5. Method
5.1 Measurement of Distinguishing Features
5.1.1 Pronunciation
In examining L2 Chinese learners’
pronunciation, the syllable has been widely used as the basic unit of analysis. A
syllable mainly consists of three units: an initial, a final and a tone.
Therefore, in this study, a syllable was regarded as target-like if the learner
clearly pronounced all three units.
5.1.2 Fluency
Two distinguishing features (speech rate and
pause time) fell under the category of fluency. In this study, speech rate was
calculated by dividing the number of syllables by the total speech duration in
seconds. Pause time was calculated by dividing the total time of
unfilled pauses by the total speech duration in seconds.
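The two fluency measures described above can be sketched in Python as follows. This is only an illustrative sketch: the function names and the sample values are hypothetical and are not taken from Jin and Mak (2012).

```python
def speech_rate(num_syllables: int, total_seconds: float) -> float:
    """Syllables produced per second of total speech duration."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return num_syllables / total_seconds


def pause_ratio(unfilled_pause_seconds: float, total_seconds: float) -> float:
    """Share of the total speech duration taken up by unfilled pauses."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return unfilled_pause_seconds / total_seconds


# Hypothetical response: 180 syllables in 60 seconds,
# of which 12 seconds were unfilled pauses.
rate = speech_rate(180, 60.0)
pauses = pause_ratio(12.0, 60.0)
```

Both measures are normalized by the total speech duration, so responses of different lengths can be compared.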
5.1.3 Vocabulary
With regard to vocabulary, word tokens and word
types have been widely accepted as effective measures for evaluating L2
learners’ vocabulary. In this study, word tokens and word types were counted in
terms of segmented words, as presented in (1):
(1)
Example (1) shows segmented Chinese
words. In the example, there are 5 word tokens and 4 word types.
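The difference between word tokens and word types can be sketched in Python as below. The segmented utterance is a hypothetical one made up for this summary (it is not the paper's example (1)), chosen so that it also contains 5 tokens and 4 types; the function name is likewise an assumption.

```python
def count_tokens_and_types(segmented_words):
    """Return (number of word tokens, number of word types).

    Tokens are all running words; types are the distinct words.
    """
    tokens = len(segmented_words)
    types = len(set(segmented_words))
    return tokens, types


# Hypothetical segmented utterance: "我 / 喜欢 / 我 / 的 / 老师".
# The word "我" occurs twice, so it adds two tokens but only one type.
words = ["我", "喜欢", "我", "的", "老师"]
n_tokens, n_types = count_tokens_and_types(words)
```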
5.1.4 Grammar
In Chinese, the ‘sentence’ is taken as
the basic unit for analyzing grammatical accuracy and grammatical complexity.
Accordingly, this study measured grammatical accuracy as the percentage of
error-free sentences and grammatical complexity as the mean
length (in number of syllables) of the sentences that L2 learners produced.
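A minimal sketch of these two grammar measures is given below. The data layout (one dictionary per sentence with `syllables` and `errors` keys), the function names and the sample values are all assumptions made for illustration.

```python
def grammatical_accuracy(sentences):
    """Percentage of sentences that contain no errors."""
    if not sentences:
        raise ValueError("at least one sentence is required")
    error_free = sum(1 for s in sentences if s["errors"] == 0)
    return 100.0 * error_free / len(sentences)


def grammatical_complexity(sentences):
    """Mean sentence length, counted in syllables."""
    if not sentences:
        raise ValueError("at least one sentence is required")
    return sum(s["syllables"] for s in sentences) / len(sentences)


# Hypothetical response with four sentences, one of which has an error.
sample = [
    {"syllables": 8, "errors": 0},
    {"syllables": 12, "errors": 1},
    {"syllables": 10, "errors": 0},
    {"syllables": 14, "errors": 0},
]
accuracy = grammatical_accuracy(sample)
complexity = grammatical_complexity(sample)
```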
5.2 Tasks
This study used two
integrated tasks and one independent task to measure learners’ speaking
performance.
5.3 Participants
The participants were 66 students from L2
Chinese speaking courses for advanced-level learners at a comprehensive
university in Shanghai. Their speaking performances were all recorded for
further analysis. In addition, two raters participated in the scoring of the
speech data.
5.4 Scoring method
A five-level scoring rubric was
developed for the scoring.
Raters used holistic scoring, assigning a
single whole level to each L2 learner’s performance on all three tasks.
6. Results
6.1 In what way does each of the distinguishing features relate to the scores?
The results first showed that the seven distinguishing
features were all correlated with the scores, as can be seen in Table 1:
Table 1 Correlation Matrix
Table 1 shows that speech rate, word tokens and word types had
large effect sizes (|r| ≥ 0.50) in their correlations with the scores. Target-like
syllables, pause time, grammatical accuracy and grammatical complexity
had medium effect sizes (0.30 ≤ |r| ≤ 0.49).
These results indicate that the distinguishing features had strong or moderate
associations with the scores.
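The effect-size thresholds above can be illustrated with a short Python sketch that computes Pearson's r and labels its size. The data are hypothetical, the function names are assumptions, and the label for values below 0.30 is my own addition because the summary only mentions the medium and large bands.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def effect_size_label(r):
    """Label |r| using the thresholds quoted in the text."""
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    return "below medium"  # assumption: band not described in the summary


# Hypothetical speech rates (syllables per second) and holistic scores.
speech_rates = [2.1, 2.8, 3.0, 3.5, 4.0]
scores = [1, 2, 3, 4, 5]
r = pearson_r(speech_rates, scores)
label = effect_size_label(r)
```

The absolute value is used because a feature such as pause time could plausibly correlate negatively with the scores while still having a medium or large effect size.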
6.2 How do the distinguishing features contribute to the scores?
In order to answer this question,
the researcher carried out two standard multiple regressions in which the scores
were regressed on six distinguishing features (incorporating word tokens and
word types separately[1]). By doing so, the researcher could examine the contribution of
each feature to the scores. The results can be seen in Tables 2 and 3:
Table 2 Regression 1 (including word tokens)
Table 3 Regression 2 (including word types)
The tables show that target-like syllables, word tokens, word types and
grammatical accuracy made significant contributions to L2 learners’ scores.
These features also represent three crucial categories in assessing L2
learners’ speaking ability: pronunciation, vocabulary and grammar.
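The idea of running two separate standard multiple regressions (all predictors entered at the same time, with word tokens in one model and word types in the other) can be sketched as follows. This is a simplified illustration under several assumptions: the data are artificial, the `fit_ols` helper is written for this summary, only one shared predictor (grammatical accuracy) is used instead of five, and no significance tests are computed.

```python
def fit_ols(predictors, outcome):
    """Ordinary least squares with an intercept.

    predictors: one row of feature values per learner.
    Returns [intercept, b1, b2, ...].
    """
    rows = [[1.0] + list(row) for row in predictors]
    k = len(rows[0])
    # Build the normal equations (X'X) b = X'y as an augmented matrix.
    aug = [
        [sum(row[i] * row[j] for row in rows) for j in range(k)]
        + [sum(row[i] * y for row, y in zip(rows, outcome))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(aug[idx][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for idx in range(k):
            if idx != col:
                factor = aug[idx][col] / aug[col][col]
                aug[idx] = [a - factor * b for a, b in zip(aug[idx], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Artificial data for six hypothetical learners.
accuracy = [60, 70, 75, 80, 90, 95]        # % error-free sentences
tokens = [110, 150, 140, 200, 210, 260]    # word tokens
types = [60, 75, 72, 95, 100, 118]         # word types
# Scores are generated from a known linear formula so the fit can be checked.
scores = [1.0 + 0.02 * a + 0.01 * t for a, t in zip(accuracy, tokens)]

# Regression 1 uses word tokens; Regression 2 uses word types instead.
model_tokens = fit_ols(list(zip(accuracy, tokens)), scores)
model_types = fit_ols(list(zip(accuracy, types)), scores)
```

Fitting the two vocabulary measures in separate models avoids entering two highly correlated predictors together, which would make their individual coefficients hard to interpret.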
7. Conclusion and Implication
This study showed that there
was a strong relation between distinguishing features and L2 learners’ performance.
This suggests that distinguishing features can provide evidence for
constructing valid rating scales. In addition, the results
offer empirical evidence for developing automated scoring of L2 Chinese
speaking in the future.
My thoughts from the two papers I read
Among the four language skills (reading, writing, listening and speaking),
speaking is considered the most difficult to evaluate. I was eager
to know what kinds of features should be taken into consideration when
assessing students’ speaking ability.
By reading the journal articles, I learned that distinguishing features can
contribute to L2 learners’ scores. The results clearly pointed out critical
features that should be included when evaluating L2 learners’ speaking ability.
Both of the studies I read showed that word tokens and word types can
be significant features in assessing learners’ speaking ability. This suggests
that learners’ vocabulary knowledge can be an important factor influencing
their speaking ability. In addition, the studies demonstrated that grammatical
accuracy could be a crucial factor, which indicates that learners’ grammar
knowledge can strongly affect their speaking performance. Taken together,
vocabulary and grammar knowledge appear to be crucial factors in
speaking performance.
However, with regard to pronunciation, the two studies showed different
outcomes. Iwashita et al. (2008) showed that this feature did not strongly
influence learners’ speaking performance, whereas Jin and Mak (2012)
demonstrated that it could greatly influence learners’ speaking performance.
For pause time and speech rate, the studies also had different results.
Iwashita et al. (2008) found that pause time and speech rate could
significantly affect learners’ speaking performance, while Jin and Mak (2012)
did not find such an effect.
In this respect, more studies examining speaking fluency and
pronunciation are necessary. The main results from these studies are presented
in the following table:
Table 4 The main results from the two studies
| Category | Features | Effects: Yes | Effects: No |
|---|---|---|---|
| Pronunciation | Target-like syllables | 〇 | 〇 |
| Fluency | Speech rate | 〇 | 〇 |
| Fluency | Pause time | 〇 | 〇 |
| Vocabulary | Word tokens | 〇 | |
| Vocabulary | Word types | 〇 | |
| Grammar | Grammatical accuracy | 〇 | |
Given these results, I consider that
vocabulary and grammar features should be taken into consideration when
assessing L2 learners’ speaking ability.
However, although these studies did point out crucial features for
evaluating learners’ speaking performance, they also have some limitations.
First, these studies did not clearly distinguish between adjacent
levels. Their results do not show how level 1 and level
2 learners perform differently in the speaking tests. I think that future
studies should find other ways to resolve this problem. In addition, neither
study took accent into
consideration. According to Derwing and Munro (2001), accent can be a crucial
factor that may influence learners’ language performance. In this regard, I
consider that future research could also include accent as a feature and observe
whether it influences learners’ speaking scores.
In sum, from these studies on assessing speaking, I learned
that vocabulary and grammar features can be good measures of learners’
speaking ability. I also learned to interpret the results of different
statistical analyses such as Pearson’s r correlation coefficients and standard
multiple regression. In this way, I not only gained more knowledge
of how to assess speaking, but also learned more ways to carry out and
interpret statistical analyses.
References
Brown, A. (2006a). An examination of the rating process in the revised IELTS Speaking Test. In P. McGovern & S. Walsh (Eds.), IELTS research reports 2006:41–70. Canberra & Manchester: IELTS Australia and British Council.
Brown, A., & Taylor, L. (2006). A world survey of examiners’ views and experience of the revised IELTS Speaking Test. Cambridge ESOL: Research Notes 26:14–18.
Derwing, T., & Munro, M. (2001). What speaking rates do non-native speakers prefer. Applied Linguistics 22.3:324–337.
Iwashita, N., Brown, A., McNamara, T., & O’Hagan, S. (2008). Assessed levels of second language speaking proficiency: How distinct? Applied Linguistics 29.1:24–49.
Jin, T., & Mak, B. (2012). Distinguishing features in scoring L2 Chinese speaking performance: How do they work? Language Testing 30.1:23–47.
Wang, J. (2002). A study of the scoring of three types of oral test items (in Chinese). Chinese Teaching in the World 4:63–77.
Zhu, S. (2009). A study on the dynamic oral text of Korean learners of Chinese (in Chinese). Beijing: China Master Theses Full-text Database.
[1] The
results also showed that there was a high correlation between word tokens and
word types. Because the two features were highly correlated with each other,
the researcher entered them separately in order to examine the contribution
of each to the scores. Two standard multiple
regressions were thus conducted.