2018 Academic Year: Evaluation in Intercultural Language Education
Final Report
Speaking assessment is a popular topic, since speaking proficiency is valued so highly in the modern world. In order to prepare the right assessment test, it is important to make sure that the test is valid, reliable, and appropriate for the context in which it is taken.
One of the most trusted and influential approaches to test validation is the argument-based approach commonly known as the Assessment Use Argument framework, initially proposed by Bachman (2003, 2005) and later refined by Bachman and Palmer (2010). The Assessment Use Argument (AUA) framework is a system of logical arguments, each consisting of a claim, data, a warrant, backing, a rebuttal, and rebuttal data, which together connect test performance to decision making and support the validity of a particular test construct. In other words, applying the AUA framework to a test provides a systematic chain of arguments, relevant to the test's stakeholders and specific context, that offers evidence of the test's validity. Bachman offers a simple illustration of how the framework is structured:
Data: Mark was born in the USA.
Claim: Mark is a U.S. citizen.
Warrant: All individuals born in the U.S. are U.S. citizens.
Backing: According to the U.S. Constitution, anyone born in the U.S. is a U.S. citizen.
Rebuttal: Mark has renounced his U.S. citizenship.
Rebuttal data: Mark's affidavit renouncing his U.S. citizenship.
In this situation, given the clear evidence against the claim (the rebuttal data), we conclude that Mark is not a U.S. citizen. (Bachman, 2005)
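To make the structure of this chain concrete, the sketch below models it as a small Python data structure. This is only an illustration of the Toulmin-style pattern that the AUA adapts; the class and field names are my own, not part of Bachman's formulation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Argument:
    """One link in an AUA-style chain: data -> claim, licensed by a warrant."""
    data: str                             # observed evidence
    claim: str                            # conclusion drawn from the data
    warrant: str                          # general rule connecting data to claim
    backing: str                          # support for the warrant itself
    rebuttal: Optional[str] = None        # condition under which the claim fails
    rebuttal_data: Optional[str] = None   # evidence supporting the rebuttal

    def holds(self) -> bool:
        # The claim stands only if no rebuttal is supported by evidence.
        return not (self.rebuttal and self.rebuttal_data)

citizenship = Argument(
    data="Mark was born in the USA.",
    claim="Mark is a U.S. citizen.",
    warrant="All individuals born in the U.S. are U.S. citizens.",
    backing="According to the U.S. Constitution, anyone born in the U.S. is a U.S. citizen.",
    rebuttal="Mark has renounced his U.S. citizenship.",
    rebuttal_data="Mark's affidavit renouncing his U.S. citizenship.",
)

print(citizenship.holds())  # False: the rebuttal is backed, so the claim is rejected
```

In an actual validation study the same pattern repeats at each step, with test scores as data and the chain ending in a claim about the decisions the scores should support.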
This example is very basic and not related to language assessment. However, the AUA framework is commonly used to build evidence regarding a given test's validity. One example is Long, Shin, Geeslin and Willis's (2018) evaluation of a Spanish placement test: the group created a placement test from scratch, administered it to 2,201 participants, followed up with statistical analysis of the results, built an AUA, and used the statistical analyses as evidence for backings and rebuttals. The study concluded that, based on the AUA framework, the test was indeed valid and reliable.
Another example is the use of an AUA to validate the TOEIC test. Schmidgall (2017) implements the argument-based approach and offers evidence from the test design process, statistical and procedural monitoring, and research data to back the arguments. In fact, the AUA claims are available on the TOEIC website to communicate them to the major stakeholders (ETS.org, n.d.). Schmidgall (2017) additionally states that since the TOEIC is designed for a variety of uses, it is also reasonable to build individual AUAs to offer context-specific claims and evidence.
Additionally, the new speaking assessment system interact, introduced in New Zealand in 2012-2013, was also subjected to an AUA evaluation. In this case, interact was assessed along six dimensions to test the usefulness of the method: construct validity, reliability, interactiveness, impact, practicality, and authenticity, using evidence from surveys and interviews of the stakeholders (East, 2016). The results reflected the overall usefulness of interact as a form of assessment, although the perceived usefulness was communicated largely from the teachers' point of view.
Im, Shin and Cheng (2019) mention another 33 journal articles and dissertations, and provide an analysis of eight that utilize traditional as well as more creative approaches to argument-based validation of testing systems. They provide a thorough review of the AUA framework, along with other validation methods, concluding that while the AUA framework is a modern and widely accepted method, it requires testing organizations to conduct ongoing validation studies to accommodate the continuous shift in social, political, and cultural contexts, which inevitably influences AUA arguments. Additionally, these studies need to consider the perspectives of stakeholders to accurately shape intended scores and their application in these specific contexts (Im, Shin & Cheng, 2019).
All these examples illustrate how the argument-based approach can validate specific claims (whether about validity, reliability, usefulness, etc.) within specific tests. When it comes to speaking assessment, the rules remain the same. Test developers can design a variation of a speaking activity with an evaluation matrix, ask the students to participate, and, based on the results, evaluate the speaking test using the AUA framework. However, while evaluating reading or listening skills is more straightforward, and can even be done automatically with a computer algorithm, speaking carries additional layers related to social status, cultural background, gender, etc., which a computer cannot always recognize given the present but still slow development of speech-to-text recognition software. These additional layers influence the way we speak, resulting in different accents, lexicons, proficiency levels, etc. And since testing organizations are moving toward evaluating speaking proficiency automatically, the issue of validity is urgently raised.
In order to address the issue of validity in automated assessment of speaking proficiency, the assessment software has to contain corpora with a large variety of speaking patterns, covering accents, intonation, lexicon, syntactic patterns, etc., in order to recognize and evaluate speech. Of course, given current technological developments, such a large task is still far-reaching. However, an alternative strategy could be a narrowly focused testing system that has a limited database but is relevant to the context of the testing environment. For example, if a company of IT developers in Russia needs its staff to talk about their projects in English, the company can develop speaking assessment software that is limited to testing IT jargon, can recognize the phonetic peculiarities of English infused with Russian phonetic patterns, and perhaps contains the syntactic and idiomatic variations of English common among Russian speakers, since it is common for students with lower proficiency to translate directly from their first language, disrupting the familiar SVO structure of English and rendering Russian idioms literally. In the end, this new assessment system could be evaluated with a newly created AUA, fitted to this specific context, in order to confirm that this particular assessment system offers the results expected from the test.
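As a rough sketch of what such a narrowly scoped scorer might look like, the Python example below checks a speech-to-text transcript against a small domain lexicon and a short list of transfer patterns. Everything in it is hypothetical: the word list, the patterns, and the scoring weights are invented placeholders standing in for what a real system would derive from a curated learner corpus.

```python
import re

# Hypothetical domain lexicon: IT terms the test expects candidates to use.
IT_LEXICON = {"deploy", "backend", "refactor", "sprint", "repository", "merge"}

# Hypothetical transfer patterns: literal calques and word-order transfer
# that a context-aware scorer could flag (placeholders, not corpus-derived).
TRANSFER_PATTERNS = [
    r"\bhow to say\b",    # calque of a common Russian filler phrase
    r"\bI very like\b",   # word-for-word transfer of Russian word order
]

def score_transcript(transcript: str) -> dict:
    """Score an ASR transcript for domain coverage and transfer interference."""
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    coverage = len(words & IT_LEXICON) / len(IT_LEXICON)
    interference = sum(
        1 for pattern in TRANSFER_PATTERNS
        if re.search(pattern, transcript, re.IGNORECASE)
    )
    # Toy weighting: reward domain vocabulary, penalize transfer patterns.
    return {
        "domain_coverage": round(coverage, 2),
        "transfer_hits": interference,
        "raw_score": round(max(coverage - 0.1 * interference, 0.0), 2),
    }

print(score_transcript(
    "In last sprint we refactor the backend and deploy it to the repository."
))
```

An AUA built for this context would then treat such scores as data, cite the curated lexicon and learner corpus as backing, and keep mis-recognition of accented speech as a standing rebuttal to investigate.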
In conclusion, it is important to check the validity and reliability of tests in order to provide speaking assessment that is accurate to the context in which the test is taken. The AUA framework can be a very useful tool, even when preparing an assessment instrument for evaluating speaking proficiency.
References
Bachman, L. F. (2003). Constructing an assessment use argument and supporting claims about test taker-assessment task interactions in evidence-centered assessment design. Measurement: Interdisciplinary Research and Perspectives, 1, 63–65. https://doi.org/10.1207/S15366359MEA0101_03
Bachman, L. F. (2005). Building and supporting a case for test use. Language Assessment Quarterly, 2, 1–34. https://doi.org/10.1207/s15434311laq0201_1
Bachman, L. F., & Palmer, A. S. (2010). Language assessment in practice: Developing language assessments and justifying their use in the real world. Oxford: Oxford University Press.
East, M. (2016). Assessing foreign language students' spoken proficiency: Stakeholder perspectives on assessment innovation. Springer.
ETS.org. (n.d.). The theory behind the TOEIC® program. https://www.ets.org/toeic/organizations/research/theory/
Im, G.-H., Shin, D., & Cheng, L. (2019). Critical review of validation models and practices in language testing: Their limitations and future directions for validation research. Language Testing in Asia, 9(14). https://doi.org/10.1186/s40468-019-0089-4
Long, A. Y., Shin, S.-Y., Geeslin, K., & Willis, E. W. (2018). Does the test work? Evaluating a web-based language placement test. Language Learning & Technology, 22(1), 137–156. https://dx.doi.org/10125/44585
Schmidgall, J. E. (2017). Articulating and evaluating validity arguments for the TOEIC tests. ETS Research Report Series, 2017(1), 1–9. https://doi.org/10.1002/ets2.12182