Publication: Data Science and AI Techniques for Competency Assessment through Serious Games. Case study: Shadowspect
Authors
Ana María Aguilar Igualada
Facultad de Informática
Directors
Manuel Jesús Gómez Moratilla ; José Antonio Ruipérez Valiente
Type
info:eu-repo/semantics/bachelorThesis
Description
Abstract
Technology has revolutionized many aspects of everyday life, including education. The integration of digital tools into learning and assessment processes has emerged as an innovative approach to engaging students and improving educational outcomes. Game Based Assessment (GBA) and Serious Games (SG) stand out in this context, offering interactive and engaging environments that not only make learning more enjoyable but also provide real-time feedback and deeper insight into students' cognitive processes (Eseryel et al. [2014]).
The importance of studying Game Based Learning (GBL) lies in its potential to transform traditional teaching methods. Traditional assessment techniques often fail to capture the dynamic, interactive nature of learning (Shute and Ke [2012]). GBA, by contrast, offers a more holistic approach by embedding assessment into engaging, interactive environments (Shute et al. [2009]). A significant benefit of SG and GBL, beyond their educational value, is that digital games enable the collection of large amounts of data, facilitating the analysis of behaviors and learning outcomes and enabling more personalized and effective educational strategies (Kim and Rosenheck [2018]).
The possibility of building complex Machine Learning (ML) and Deep Learning (DL) models from these data offers a powerful tool for assessing complex skills that are difficult to measure with conventional methods (Géron [2022]).
Spatial reasoning is a critical skill that involves the ability to mentally represent and transform objects and their relationships (Lowrie et al. [2020]). This skill plays a fundamental role in geometry. Research has shown that improving spatial reasoning can improve overall mathematical performance (Lowrie et al. [2020]). However, conventional spatial reasoning tests tend to measure isolated aspects of this competency (Atit et al. [2020]), which motivates the use of ML techniques in the context of GBA to measure the cognitive processes underlying this construct.
However, developing a robust ML model for assessing spatial reasoning requires a consistently labeled dataset. Wang et al. [2020] illustrate the difficulty of finding labeled datasets from real-world scenarios that are free of the inconsistencies that arise when multiple experts label data without a shared criterion.
To address this problem, this Bachelor's Thesis (TFG) proposes a manual labeling rubric and an ML model for assessing spatial reasoning, using data collected through the game Shadowspect. This game, developed by the MIT Playful Journey Lab and Education Arcade, consists of a series of geometric puzzles in which the user must construct a figure that matches a given set of views out of simpler shapes.
To this end, the following objectives were set:
• Literature review. Define the constructs that model spatial reasoning, drawing on psychological studies and existing GBA tools.
• Rubric development. Create a detailed rubric for the manual labeling of spatial reasoning in the game Shadowspect.
• ML model development. Train machine learning models to predict students' spatial reasoning performance from the manually labeled data.
• Use case. Demonstrate the practical application of the best-performing model on unlabeled data.
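The labeling-consistency problem raised above is the reason the rubric was validated with Cohen's kappa, which measures agreement between two annotators beyond what chance would produce. A minimal pure-Python sketch with hypothetical binary labels (1 = evidence of spatial reasoning, 0 = no evidence; the data are illustrative, not from the thesis):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: inter-rater agreement corrected for chance."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items both raters labeled the same.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical labels from two annotators applying the rubric:
a = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]
b = [1, 1, 0, 1, 1, 1, 0, 0, 0, 1]
print(round(cohens_kappa(a, b), 3))  # → 0.583
```

Values near 1 indicate strong agreement; iterating on the rubric until kappa stabilizes at a high value is the usual workflow. The same statistic is available as `sklearn.metrics.cohen_kappa_score`.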
Technology has transformed education by integrating digital tools into learning and assessment processes, making them more engaging and effective. Game Based Assessment (GBA) and Serious Games (SG) provide interactive environments that not only make learning enjoyable but also offer real-time feedback and deeper insights into students' cognitive processes. The significance of studying Game Based Learning (GBL) lies in its potential to transform traditional teaching methods, which often fail to capture the dynamic nature of learning [1]. GBA presents a holistic approach to evaluation, enabling the collection of extensive data for personalized and effective educational strategies [2, 3]. Spatial reasoning is a critical cognitive skill involving the mental representation and manipulation of objects and their relationships [4]. It plays a fundamental role in geometry and overall mathematical performance. Traditional tests for spatial reasoning often isolate specific aspects of this skill [5], motivating the use of Machine Learning (ML) techniques in GBA to measure the cognitive processes involved comprehensively. However, developing a robust ML model for assessing spatial reasoning requires a consistently labeled dataset. This work addresses the challenge of inconsistent labeling by proposing a manual labeling rubric and an ML model for evaluating spatial reasoning using data from the Shadowspect game. Shadowspect, developed by the Massachusetts Institute of Technology (MIT) Playful Journey Lab and Education Arcade, involves solving geometric puzzles by constructing figures from simpler shapes.
The objectives of this work are:
• Literature review: Define constructs modeling spatial reasoning from psychological studies and existing GBA tools.
• Rubric development: Create a detailed rubric for manual labeling of spatial reasoning in Shadowspect.
• ML model development: Train ML models to predict students' spatial reasoning performance based on manually labeled data.
• Use case demonstration: Show the practical application of the best-performing model on unlabeled data.
To model spatial reasoning, we reviewed the literature to define relevant constructs for Shadowspect, aligning them with the game's design and characteristics. The selected constructs are mental rotation, spatial orientation, spatial structuring, and spatial visualization. We identified the aspects of Shadowspect related to these constructs and developed a rubric for manual labeling, iteratively refined through expert annotation and validated using Cohen's Kappa [6]. The rubric was applied to a selected set of intermediate-level puzzles in Shadowspect, chosen for their balance of data quantity and solution complexity. The rubric defined a scoring system for evaluating the constructs; we combined these scores into an overall spatial reasoning score and labeled a dataset of Shadowspect replays using an open-source web tool designed to streamline game data annotation [7].
For ML model development, we extracted features from the replays and combined them with the labeled data. We aimed to classify player performance as either "Shows spatial reasoning evidence" or "Does not show spatial reasoning evidence," using both classification and regression models. The models included decision trees, K-Nearest Neighbours (KNN), Support Vector Machine (SVM), linear regression, and Gaussian Naive Bayes (GaussianNB), with data preprocessing involving standardization and one-hot encoding. We evaluated model performance using metrics such as accuracy and balanced accuracy, finding that the decision tree regression model achieved the best results. A use case demonstrated the model's application by predicting the outcome of an unlabeled replay, highlighting its practical relevance.
In conclusion, we met our objectives by developing a robust rubric, annotating Shadowspect data, and designing effective ML models, achieving promising performance metrics.
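Since a regression model won, its continuous output has to be mapped to the two evidence classes before computing balanced accuracy (the mean of per-class recall, which is robust to class imbalance). The sketch below illustrates that binarization and the metric in pure Python; the scores, the 0.5 threshold, and the labels are assumptions for illustration, not values from the thesis:

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; unlike plain accuracy, not dominated
    by the majority class."""
    recalls = []
    for c in set(y_true):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

# Hypothetical regression outputs (overall spatial reasoning scores),
# binarized at an assumed 0.5 threshold:
# 1 = "Shows spatial reasoning evidence", 0 = "Does not show ...".
scores = [0.82, 0.41, 0.67, 0.12, 0.55, 0.30]
y_pred = [int(s >= 0.5) for s in scores]   # → [1, 0, 1, 0, 1, 0]
y_true = [1, 0, 1, 0, 0, 0]
print(balanced_accuracy(y_true, y_pred))   # → 0.875
```

The function matches `sklearn.metrics.balanced_accuracy_score`; in a full pipeline the standardization and one-hot encoding mentioned above would be fitted on the training split only, before the regressor.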
Future work will focus on increasing the dataset size, improving model performance with more complex algorithms, and externally validating models with conventional spatial reasoning tests.
This item is subject to a Creative Commons license. http://creativecommons.org/licenses/by-nc-nd/4.0/