Multivariate feature ranking with high-dimensional data for classification tasks

Jimenez Barrionuevo, F.; Sanchez Carpena, G.; Palma Méndez, José Tomás; Miralles Pechuan, L.; Botia Blaya, J. A.

Please use this identifier to cite or link to this item: 10.1109/ACCESS.2022.3180773

Full metadata record

DC Field	Value	Language
dc.contributor.author	Jimenez Barrionuevo, F.	-
dc.contributor.author	Sanchez Carpena, G.	-
dc.contributor.author	Palma Méndez, José Tomás	-
dc.contributor.author	Miralles Pechuan, L.	-
dc.contributor.author	Botia Blaya, J. A.	-
dc.date.accessioned	2022-06-10T22:10:13Z	-
dc.date.available	2022-06-10T22:10:13Z	-
dc.date.issued	2022-06-08	-
dc.identifier.citation	IEEE Access	es
dc.identifier.citation	https://ieeeaccess.ieee.org/about-ieee-access/learn-more-about-ieee-access/	es
dc.identifier.issn	2169-3536	-
dc.identifier.uri	http://hdl.handle.net/10201/121146	-
dc.description	©2022. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/ This document is the Accepted Manuscript version of a Published Work that appeared in final form in IEEE Access. To access the final edited and published work see DOI 10.1109/ACCESS.2022.3180773	-
dc.description.abstract	In many machine learning classification problems, datasets are usually of high dimensionality and therefore require efficient and effective methods for identifying the relative importance of their attributes, eliminating the redundant and irrelevant ones. Due to the huge size of the search space of the possible solutions, the attribute subset evaluation feature selection methods are not very suitable, so in these scenarios feature ranking methods are used. Most of the feature ranking methods described in the literature are univariate methods, which do not detect interactions between factors. In this paper, we propose two new multivariate feature ranking methods based on pairwise correlation and pairwise consistency, which have been applied to cancer gene expression and genotype-tissue expression classification tasks using public datasets. We statistically proved that the proposed methods outperform the state-of-the-art feature ranking methods Clustering Variation, Chi Squared, Correlation, Information Gain, ReliefF and Significance, as well as other feature selection methods for attribute subset evaluation based on correlation and consistency with the multi-objective evolutionary search strategy, and with the embedded feature selection methods C4.5 and LASSO. The proposed methods have been implemented on the WEKA platform for public use, making all the results reported in this paper repeatable and replicable.	-
dc.format	application/pdf	es
dc.language	eng	es
dc.relation.isreferencedby	ED_IDENTRADA=1082	-
dc.rights	info:eu-repo/semantics/openAccess	-
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	-
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	-
dc.subject	Artificial intelligence	es
dc.subject	Feature Selection	es
dc.subject	Machine learning	es
dc.subject	rankers	es
dc.title	Multivariate feature ranking with high-dimensional data for classification tasks	es
dc.type	info:eu-repo/semantics/article	es
dc.identifier.doi	10.1109/ACCESS.2022.3180773	-
dc.contributor.department	Ingeniería de la Información y las Comunicaciones	-
Appears in collections:	Articles

Files in this item:

File	Description	Size	Format
Multivaria..s.pdf		1.27 MB	Adobe PDF

This item is subject to a Creative Commons license.