Computational approaches to detect experts in distributed online communities: a case study on Reddit

Strukova, Sofia; Ruipérez Valiente, José A.; Gómez Mármol, Félix

Please use this identifier to cite or link to this item: https://doi.org/10.1007/s10586-023-04076-w

Full metadata record

DC Field	Value	Language
dc.contributor.author	Strukova, Sofia	-
dc.contributor.author	Ruipérez Valiente, José A.	-
dc.contributor.author	Gómez Mármol, Félix	-
dc.date.accessioned	2025-01-21T09:39:35Z	-
dc.date.available	2025-01-21T09:39:35Z	-
dc.date.issued	2024-04	-
dc.identifier.citation	Cluster Computing, 2024, Vol. 27, pp. 2181–2201	es
dc.identifier.issn	Print: 1386-7857	-
dc.identifier.issn	Electronic: 1573-7543	-
dc.identifier.uri	http://hdl.handle.net/10201/148894	-
dc.description	© The Author(s) 2023. This manuscript version is made available under the CC-BY 4.0 license http://creativecommons.org/licenses/by/4.0/ This document is the Published Manuscript version of a Published Work that appeared in final form in Cluster Computing. To access the final edited and published work see https://doi.org/10.1007/s10586-023-04076-w	-
dc.description.abstract	The irreplaceable key to the triumph of Question & Answer (Q & A) platforms is their users providing high-quality answers to the challenging questions posted across various topics of interest. For more than a decade, the expert finding problem has attracted much attention in information retrieval research. Based on the gaps encountered in expert identification across several Q & A portals, we inspect the feasibility of identifying data science experts in Reddit. Our method is based on manual coding results in which two data science experts labelled not only expert and non-expert comments, but also out-of-scope comments, which is a novel contribution to the literature, enabling the identification of more groups of comments across web portals. We present a semi-supervised approach which combines 1113 labelled comments with 100,226 unlabelled comments during training. We proved that it is possible to develop models that can identify expert, non-expert and out-of-scope comments, with the AUC score peaking at 0.93, accuracy at 0.83, MAE at 0.15 degrees and R2 score at 0.69. The proposed model uses the activity behaviour of every user, including Natural Language Processing (NLP), crowdsourced and user feature sets. We conclude that the NLP and user feature sets contribute the most to the better identification of these three classes. This means that the method can generalise well within the domain. Finally, we make a novel contribution by presenting different types of users in Reddit, which opens many future research directions.	es
dc.format	application/pdf	es
dc.format.extent	21	es
dc.language	eng	es
dc.publisher	Springer	es
dc.relation	This work was partially supported by REASSESS project (grant 21948/JLI/22), funded by the Call for Projects to Generate New Scientific Leadership, included in the Regional Program for the Promotion of Scientific and Technical Excellence Research (2022 Action Plan) of the Seneca Foundation, Science and Technology Agency of the Region of Murcia.	es
dc.rights	info:eu-repo/semantics/openAccess	es
dc.rights	Attribution 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/	*
dc.subject	Reddit	es
dc.subject	User expertise	es
dc.subject	Computational social science	es
dc.subject	Data driven evaluation	es
dc.subject	Data mining	es
dc.title	Computational approaches to detect experts in distributed online communities: a case study on Reddit	es
dc.type	info:eu-repo/semantics/article	es
dc.relation.publisherversion	https://link.springer.com/article/10.1007/s10586-023-04076-w	es
dc.identifier.doi	https://doi.org/10.1007/s10586-023-04076-w	-
dc.contributor.department	Departamento de Ingeniería de la Información y las Comunicaciones	-
Appears in collections:	Articles

Files in this item:

File	Description	Size	Format
s10586-023-04076-w.pdf		1.04 MB	Adobe PDF

This item is licensed under a Creative Commons license.