The problem of ensuring authenticity and reliability of LSP tests within the framework of an integrative validation model: A thematic review
Moscow State University of Psychology and Education, Moscow, Russian Federation.
Introduction. This article reviews the scholarly literature on ensuring the authenticity and reliability of Language for Specific Purposes (LSP) tests within the framework of the integrative validation model. It examines current issues in LSP test validation, identifies the main problems associated with the low validity of such tests, and emphasizes the importance of integrating the requirements of the professional context into the testing process. The purpose of the article is to systematize the theoretical and methodological foundations of the integrative validation model and to identify the conditions for overcoming the key contradiction between the authenticity and the reliability of LSP tests.
Materials and Methods. The study is a review based on a systematic analysis of Russian and international works on the validation of LSP tests. It uses comparative analysis, synthesis, and generalization of data.
Results. The study yields three main outcomes. First, it establishes the basic principles of the integrative approach to the validation of LSP tests: the contextual embeddedness of validity, the socially determined nature of the test construct, and the continuity of the validation process. Second, it identifies the key advantage of the integrative approach, namely its flexibility and adaptability, which make it possible to take the dynamics of the professional environment into account. Third, it identifies the limitations of the approach: vagueness of the construct, subjectivity of assessment, and high resource intensity. To overcome the contradiction between authenticity and reliability, the following methodological conditions are proposed: a clear definition of the boundaries of the construct, the development of empirically substantiated assessment criteria, and standardization of the assessment procedure that takes into account the requirement of adaptability to the professional context. The authors note that unifying an empirically confirmed system of descriptors within the context-oriented approach creates conditions for overcoming the contradiction between the requirements of authenticity and stability of measurements.
Conclusions. The study concludes that the integrative validation model provides a theoretical and methodological basis for creating LSP tests that combine high reliability with authenticity. This opens up new prospects for developing assessment tools that meet the modern requirements of professional communication.
Language for Specific Purposes tests; Integrative validation model; Testing authenticity; Language test reliability; Standardization of assessment criteria; Professionally oriented communication; Measurement tools; Language testing methodology