Assessment for Learning MOOC’s Updates

Discussion Forum: Optional Update #6

Photo: Brooks, Matt. Wikimedia Commons https://commons.wikimedia.org/wiki/File:Data_Mining_(30208)_-_The_Noun_Project.svg

Educational Data Mining

Educational Data Mining can analyze the vast trove of data students produce to find useful patterns of behavior and learning, helping teachers with curriculum design, teaching methodology, and identifying exceptional or underperforming students. Algorithms can look for common behaviors across students that correlate with desired or unwanted outcomes. For example, by looking at time spent on an activity or at discussions on course forums, a teacher can check whether certain topics or activities created engagement among students, positive or negative. If an activity leads to better outcomes, the instructor can keep it; if it produces negative engagement, the instructor can revise or abandon it.
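To make the time-on-activity idea concrete, here is a minimal sketch of the kind of engagement summary an instructor might compute. The activity names, minute counts, and forum-post counts are all invented for illustration, not real course data:

```python
from statistics import mean

# Hypothetical activity log: minutes each of five students spent on two
# course activities, plus the number of forum posts made about each one.
activity_log = {
    "simulation_lab": {"minutes": [42, 55, 38, 61, 47], "forum_posts": 14},
    "reading_quiz":   {"minutes": [9, 6, 11, 7, 8],     "forum_posts": 1},
}

def engagement_summary(log):
    """Summarize average time-on-task and forum activity per course activity."""
    return {
        name: {
            "avg_minutes": round(mean(data["minutes"]), 1),
            "forum_posts": data["forum_posts"],
        }
        for name, data in log.items()
    }

summary = engagement_summary(activity_log)

# A simple flag: activities well below the course-wide average time-on-task
# may warrant redesign; well above may signal strong engagement (or confusion).
course_avg = mean(s["avg_minutes"] for s in summary.values())
for name, s in summary.items():
    status = "above" if s["avg_minutes"] > course_avg else "below"
    print(f"{name}: {s['avg_minutes']} min avg, "
          f"{s['forum_posts']} posts ({status} course average)")
```

Of course, a raw time average cannot distinguish productive engagement from students being stuck, which is exactly the interpretation problem discussed next.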

I see two big challenges with the use of data mining. The first is implementation and interpretation. Creating a formula to find significant patterns is a two-pronged problem: you need to write algorithms to find the patterns, and then you need to verify that the patterns are valid once you find them; as the adage goes, "Correlation does not imply causation." The other challenge is privacy. Although the data is, or can be, anonymized, it may still be possible to identify specific students or groups of students from the remaining information. If you take the data and match it against age, gender, race, location of access, current and past courses, and other data points, you may be able to pinpoint students with a fair amount of accuracy. That power could easily be put to nefarious purposes.
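A tiny sketch of how such re-identification can work, using entirely made-up records: an "anonymized" gradebook still carries quasi-identifiers (here age, gender, and zip code) that can be joined against an outside roster to recover names.

```python
# Hypothetical "anonymized" grade records: names removed, but
# quasi-identifying attributes left in.
anonymized_grades = [
    {"age": 19, "gender": "F", "zip": "48823", "score": 91},
    {"age": 22, "gender": "M", "zip": "10001", "score": 64},
    {"age": 19, "gender": "F", "zip": "94305", "score": 78},
]

# Hypothetical public roster (e.g., a club membership list).
public_roster = [
    {"name": "Alice", "age": 19, "gender": "F", "zip": "48823"},
    {"name": "Bob",   "age": 22, "gender": "M", "zip": "10001"},
    {"name": "Carol", "age": 19, "gender": "F", "zip": "94305"},
]

QUASI_IDENTIFIERS = ("age", "gender", "zip")

def reidentify(grades, roster):
    """Match anonymized records to named people on shared quasi-identifiers."""
    matches = {}
    for record in grades:
        key = tuple(record[q] for q in QUASI_IDENTIFIERS)
        candidates = [p["name"] for p in roster
                      if tuple(p[q] for q in QUASI_IDENTIFIERS) == key]
        if len(candidates) == 1:  # a unique match recovers the identity
            matches[candidates[0]] = record["score"]
    return matches

print(reidentify(anonymized_grades, public_roster))
```

In this toy example every record is uniquely identifiable from just three attributes; real educational datasets with course histories and access locations offer far more join keys.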

Efficient Use of Time on High-Stakes Assessments Prediction

Levin (2021) investigated an educational data mining competition that pitted researchers against each other to predict efficient time use from the same dataset: scores from a standardized math test administered in 2019. The competing teams wrote various algorithms to engineer features that would identify which students used their test-taking time efficiently and which did not. The features they investigated included time spent on a question, the number of times students opened various online calculation tools, and how much time was spent moving from one question to the next. He found that the most accurate models were significantly better than random chance at identifying students who used their time efficiently (over 20% more accurate). However, none of the competing groups produced a model that was at least 40% more accurate than random guessing, the gold standard.

This result is promising because it opens the possibility for computerized testing systems or instructors to identify students who are managing their time poorly while a test is in progress, so an intervention can happen before they finish; such students could also be flagged for later tutoring in test-taking strategies. On the other hand, the accuracy and validity of the prediction depend on the features engineered into the algorithm's design. A poorly designed algorithm predicts no better, and possibly worse, than random guessing, which on a balanced yes/no question is correct about 50% of the time.
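A minimal sketch of the evaluation logic described above, using invented process-log features in the spirit of those Levin describes (this is not the competition's actual data or any entrant's model, just a deliberately simple one-feature rule compared against the 50% chance baseline):

```python
from statistics import mean

# Hypothetical process-log features: average seconds per question,
# calculator-tool opens, and seconds spent between questions.
# "efficient" marks whether the student used test time efficiently (1) or not (0).
students = [
    {"sec_per_q": 45,  "tool_opens": 3, "transition_sec": 4,  "efficient": 1},
    {"sec_per_q": 180, "tool_opens": 0, "transition_sec": 25, "efficient": 0},
    {"sec_per_q": 50,  "tool_opens": 2, "transition_sec": 6,  "efficient": 1},
    {"sec_per_q": 160, "tool_opens": 1, "transition_sec": 30, "efficient": 0},
    {"sec_per_q": 60,  "tool_opens": 4, "transition_sec": 5,  "efficient": 1},
    {"sec_per_q": 80,  "tool_opens": 0, "transition_sec": 28, "efficient": 0},
]

def predict(student, sec_cutoff=100):
    """A crude one-feature rule: answering quickly -> predicted efficient."""
    return 1 if student["sec_per_q"] < sec_cutoff else 0

def accuracy(data, model):
    """Fraction of students whose label the model predicts correctly."""
    return mean(1 if model(s) == s["efficient"] else 0 for s in data)

acc = accuracy(students, predict)
chance = 0.5  # random guessing on a balanced binary label
print(f"model accuracy: {acc:.0%}, improvement over chance: {acc - chance:+.0%}")
```

Note the last student: fast per-question times but no real engagement, so the crude rule misclassifies them. This illustrates the feature-engineering point: whether a model beats chance by 20%, 40%, or not at all hinges on which features are chosen and how well they capture the behavior.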

Reference

Levin, N. A. (2021). Process mining combined with expert feature engineering to predict efficient use of time on high-stakes assessments. Journal of Educational Data Mining, 13(2), 1-15. https://doi.org/10.5281/zenodo.5275311