Science for Education Today, 2025, vol. 15, no. 4, pp. 113–135
UDC: 37.031+37.015.31+51-77

Effectiveness of prompt engineering strategies in generating mathematics educational content: An experimental study

Danilov A. V. 1 (Kazan, Russian Federation), Zaripova R. R. 1 (Kazan, Russian Federation), Lukoyanova M. A. 1 (Kazan, Russian Federation), Batrova N. I. 1 (Kazan, Russian Federation), Salekhova L. L. 1 (Kazan, Russian Federation)
1 Kazan Federal University
Abstract: 

Introduction. The article presents the results of a study on generating high-quality educational content in mathematical literacy for 5th-grade students using generative AI. The problem stems from the lack of adaptive assignments that meet educational standards and the limitations of AI (hallucinations, non-reproducibility). The aim of the study is to develop, test, and evaluate the effectiveness of an original prompt-engineering strategy for generating pedagogically relevant and age-appropriate math problems.
Materials and Methods. The study employs systemic and activity-based approaches. Methods include analysis of AI applications in education, experimental task generation with ChatGPT-4o using a hybrid prompt-engineering strategy (Few-Shot Learning + Chain-of-Thought + Role Prompting), expert evaluation (10 mathematics teachers with ≥12 years of experience), and statistical data processing (Cohen’s κ, mean values µ). For verification, tasks were generated in a new context (airports) and assessed against the criteria of adequacy, student-appropriateness, and complexity.
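As an illustration of how the three techniques named above can be combined in a single prompt, the following is a minimal sketch. The function name `build_hybrid_prompt`, its parameters, and all prompt wording are hypothetical; the authors' actual prompt text is not given in this abstract, and no model API is called here.

```python
def build_hybrid_prompt(role, examples, context, n_tasks=1):
    """Assemble one prompt string from three parts (all wording is illustrative).

    role     -- persona for Role Prompting, e.g. a teacher description
    examples -- list of dicts with 'task' and 'steps' keys (Few-Shot Learning)
    context  -- real-world setting for the new tasks, e.g. "airports"
    n_tasks  -- how many new tasks to request
    """
    # Role Prompting: tell the model which persona to adopt.
    parts = [f"You are {role}."]

    # Few-Shot Learning: show worked examples the model should imitate.
    for i, ex in enumerate(examples, start=1):
        parts.append(
            f"Example {i}:\nTask: {ex['task']}\nSolution steps: {ex['steps']}"
        )

    # Chain-of-Thought: ask for explicit intermediate reasoning.
    parts.append(
        "Think step by step: list the solution steps before stating the final task."
    )

    # The actual request, placed last so it follows the examples.
    parts.append(f"Now write {n_tasks} new task(s) set in this context: {context}.")
    return "\n\n".join(parts)


if __name__ == "__main__":
    sample = [{"task": "A ticket costs 40 rubles. How much do 3 tickets cost?",
               "steps": "1) One ticket is 40. 2) 3 x 40 = 120."}]
    print(build_hybrid_prompt("an experienced 5th-grade mathematics teacher",
                              sample, "airports"))
```

Keeping the three parts as separate blocks makes it straightforward to drop one of them (for example, the step-by-step instruction) when comparing strategies.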
Results. Key findings demonstrate the successful implementation of the strategy, enabling the generation of structurally consistent tasks (κ = 0.82). The critical role of Chain-of-Thought prompting in creating multi-step problems is emphasized. The authors highlight the dual functionality of tasks (learning and assessment). The experiment confirmed high expert ratings for adequacy (µ = 4.81), format compliance (µ = 4.77), and descriptive completeness (µ = 4.82). A limitation in terminology complexity for some tasks was identified.
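For readers unfamiliar with the agreement statistic reported above, the two-rater case can be sketched as follows. The code follows Cohen's definition, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance. How the ratings of the ten experts were paired or aggregated is not stated in this abstract, so the function `cohens_kappa` and the sample labels are assumptions for illustration only.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning nominal labels to the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)

    # p_o: share of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # p_e: chance agreement from each rater's marginal label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used one and the same single label: kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical labels: does a generated task follow the required structure?
    a = ["fits", "fits", "off", "off"]
    b = ["fits", "off", "off", "off"]
    print(cohens_kappa(a, b))  # 0.5
```

In the sample, the raters agree on 3 of 4 items (p_o = 0.75) and chance agreement is p_e = 0.5, which gives κ = 0.5.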
Conclusions. The study concludes that the combined prompt-engineering strategy is highly effective for generating standards-aligned tasks and has strong potential for integration into digital learning platforms. Further optimization of linguistic adaptation and the development of a validation pipeline are required for implementation.

Keywords: 

Prompt engineering; Educational task generation; Mathematical literacy; Generative AI; Chain-of-Thought; Role prompting.

For citation:
Danilov A. V., Zaripova R. R., Lukoyanova M. A., Batrova N. I., Salekhova L. L. Effectiveness of prompt engineering strategies in generating mathematics educational content: An experimental study. Science for Education Today, 2025, vol. 15, no. 4, pp. 113–135. DOI: http://dx.doi.org/10.15293/2658-6762.2504.05
References: 
  1. Kasymova T. D., Sydykova M. B., Zhaparova Z. A. The use of artificial intelligence in mathematics: Scientific and social aspects. Bulletin of Science and Practice, 2023, vol. 9 (6), pp. 32-37. (In Russian) URL: https://elibrary.ru/item.asp?id=54034992 DOI: https://doi.org/10.33619/2414-2948/91/03
  2. Prokopenkova E. G. The using of neural networks to create game-based math activities. MSPU Bulletin. Series: Informatics and Informatization of Education, 2023, no. 4, pp. 167-171. (In Russian) URL: https://elibrary.ru/item.asp?id=60001807 DOI: https://doi.org/10.25688/2072-9014.2023.66.4.13
  3. Moshenina E. D. The use of artificial intelligence to compile and solve problems aimed at the formation of mathematical literacy. Science in a Megapolis, 2024, no. 7, pp. (In Russian) URL: https://elibrary.ru/item.asp?id=67859517
  4. Schorcht S., Buchholtz N., Baumanns L. Prompt the problem - investigating the mathematics educational quality of AI-supported problem solving by comparing prompt techniques. Frontiers in Education, 2024, vol. 9, pp. 1-15. DOI: https://doi.org/10.3389/feduc.2024.1386075
  5. Wanner J., Herm L. V., Heinrich K. et al. The effect of transparency and trust on intelligent system acceptance: Evidence from a user-based study. Electronic Markets, 2022, vol. 32 (4), pp. 2079-2102. DOI: https://doi.org/10.1007/s12525-022-00593-5
  6. Navigli R., Conia S., Ross B. Biases in large language models: Origins, inventory and discussion. Journal of Data and Information Quality, 2023, vol. 15 (2), pp. 1-21. DOI: https://doi.org/10.1145/3597307
  7. Rodriguez-Torrealba R., Garcia-Lopez E., Garcia-Cabot A. End-to-end generation of multiple-choice questions using text-to-text transfer transformer models. Expert Systems with Applications, 2022, vol. 208, pp. 118258. DOI: https://doi.org/10.1016/j.eswa.2022.118258
  8. Küchemann S., Rau M., Neumann K., Kuhn J. Editorial: ChatGPT and other generative AI tools. Frontiers in Psychology, 2025, vol. 16, pp. 1535128. DOI: https://doi.org/10.3389/fpsyg.2025.1535128
  9. Denishcheva L. O., Krasnyanskaya K. A., Rydze O. A. Approaches to drafting assignments for mathematical literacy of 5th-6th grade students. Domestic and Foreign Pedagogy, 2020, vol. 2 (2), pp. 181-201. (In Russian) URL: https://elibrary.ru/item.asp?id=44358182
  10. Steiner G. Mathematical literacy for the non-mathematician. Nature, 1973, vol. 243, pp. 65-67. DOI: https://doi.org/10.1038/243065a0
  11. Roslova L. O., Krasnyanskaya K. A., Kvitko E. S. Conceptual bases of formation and assessment of mathematical literacy. Domestic and Foreign Pedagogy, 2019, vol. 1 (4), pp. 58-79. (In Russian) URL: https://elibrary.ru/item.asp?id=39249304
  12. Roslova L. O., Kvitko E. S., Denishcheva L. O., Karamova I. I. The problem of forming the ability to “apply mathematics” in the context of levels of mathematical literacy. Domestic and Foreign Pedagogy, 2020, vol. 2 (2), pp. 74-99. (In Russian) URL: https://elibrary.ru/item.asp?id=44358177
  13. Denischeva L. O., Savintseva N. V., Safuanov I. S., Ushakov A. V., Chugunov V. A., Semenyachenko Yu. A. Peculiarities of formation and assessment of schoolchildren’s mathematical literacy. Science for Education Today, 2021, vol. 11 (4), pp. 113-135. (In Russian) DOI: http://dx.doi.org/10.15293/2658-6762 URL: https://elibrary.ru/item.asp?id=46513828
  14. Argunova N. V., Sotnikova N. V. Formation of mathematical literacy of 8th grade students based on solving practice-oriented tasks. Modern High Technologies, 2023, no. 9, pp. 80-84. (In Russian) URL: https://elibrary.ru/item.asp?id=54630263 DOI: https://doi.org/10.17513/snt.39764
  15. Cohen J. A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 1960, vol. 20 (1), pp. 37-46. DOI: https://doi.org/10.1177/001316446002000104
  16. Krathwohl D. R. A revision of Bloom’s taxonomy: An overview. Theory Into Practice, 2002, vol. 41 (4), pp. 212-218. DOI: https://doi.org/10.1207/s15430421tip4104_2
  17. Ivanov V., Solnyshkina M., Solovyev V. Assessment of reading difficulty levels in Russian academic texts: Approaches and metrics. Journal of Intelligent & Fuzzy Systems, 2018, vol. 34 (5), pp. 3049-3058. URL: https://elibrary.ru/item.asp?id=38618046 DOI: https://doi.org/10.3233/JIFS-169489


Date of publication: 31.08.2025