Research on Generalizability of Student Grade Predictive Model in Blended Courses - Tsinghua Journal of Education

Article Info

Title:: Research on Generalizability of Student Grade Predictive Model in Blended Courses

Author(s):: LUO Yang-yang (罗杨洋)¹, HAN Xi-bin (韩锡斌)², SONG Yu-qiang (宋玉强)³; 1. Institute of Higher Education, Lanzhou University; 2. Institute of Education, Tsinghua University; 3. School of Materials Science and Engineering, China University of Petroleum (East China)

Keywords:: student performance prediction; blended course; predictive model generalization; machine learning; online learning behavior

CLC number:: G434

Document code:: A

Abstract:: Collecting data on students' learning processes in blended courses and building grade prediction models can help instructors adjust their teaching strategies dynamically. One bottleneck that keeps grade prediction research in blended course settings from being applied in practice is that predictive models are difficult to transfer from the context in which they were built to other contexts. This paper reviews the factors that affect prediction accuracy when a blended-course grade prediction model is transferred: the accuracy of the model as built, the characteristics of the blended-course training samples in the specific context, the data processing method of the machine learning algorithm, and the characteristics of the target course. A grade prediction model was built using data from all blended courses over two semesters at University A and then applied to case courses at University B. After three consecutive academic years of observation, the findings indicate that: (1) models trained on data with larger sample sizes, more complete feature values, and greater data variability are more generalizable, and “High Active” blended courses exhibit these characteristics; (2) both incremental and batch learning methods can construct grade prediction models with high accuracy (over 70%), and incremental learning is more conducive to transferring these models; (3) when a model is transferred, models built with incremental learning data processing algorithms achieve higher prediction accuracy, and accuracy is higher when the distribution of students' online behavioral data in the target course resembles that of “High Active” blended courses. These findings provide an empirical foundation and a basic approach for transferring grade prediction models in blended course settings.

Tsinghua Journal of Education [ISSN: 1001-4519 / CN: 11-1610/G4]
