S3 Dataset

Espín López, Juan Manuel; Huertas Celdrán, Alberto; Marín-Blázquez, Javier G.; Esquembre, Francisco; Martínez Pérez, Gregorio

Please use this identifier to cite or link to this item: http://hdl.handle.net/10201/138858


Title:	S3 Dataset
Publication date:	13-Apr-2021
Related subjects:	CDU::6 - Applied sciences
Keywords:	Continuous authentication; Smartphone; Sensors; Applications usage; Speaker recognition
Abstract:	The S3 dataset contains the behavior (sensors, application usage statistics, and voice) of 21 volunteers interacting with their smartphones for more than 60 days. The users are diverse: males and females aged 18 to 70 were considered when generating the dataset. The wide age range is a key aspect, because age affects how smartphones are used. To generate the dataset, the volunteers installed a prototype of the smartphone application on their Android mobile phones. All attributes of each kind of data are written to a vector. The dataset contains the following vectors:

Sensors: This type of vector contains data from the smartphone sensors (accelerometer and gyroscope) acquired in a given time window. A vector is obtained every 20 seconds, and the monitored features are:
- Average of accelerometer and gyroscope values.
- Maximum and minimum of accelerometer and gyroscope values.
- Variance of accelerometer and gyroscope values.
- Peak-to-peak (max-min) of X, Y, Z coordinates.
- Magnitude for gyroscope and accelerometer.

Statistics: These vectors contain data about the applications the user has used recently. A statistics vector is calculated every 60 seconds and contains:
- Foreground application counters (number of different and total apps) for the last minute and the last day.
- Most common app ID and the number of usages in the last minute and the last day.
- ID of the currently active app.
- ID of the last active app prior to the current one.
- ID of the application most frequently utilized prior to the current application.
- Bytes transmitted and received through the network interfaces.

Voice: This kind of vector is generated when the microphone is active in a call or voice note. The speaker vector is an embedding extracted from the audio, and it contains information about the user's identity. This vector is usually named "x-vector" in the Speaker Recognition field, and it is calculated following the steps detailed in "egs/sitw/v2" for the Kaldi library, using the models available for extracting the embedding.

Summary of the collected database:
- Users: 21
- Sensors vectors: 417,128
- Statistics app's usage vectors: 151,034
- Speaker vectors: 2,720
- Call recordings: 629
- Voice messages: 2,091
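The sensor features listed in the abstract (average, maximum, minimum, variance, peak-to-peak, and magnitude over a time window) can be illustrated with a minimal sketch. This record does not give the authors' exact definitions, so several choices here are assumptions: the function name `window_features`, the feature key names, the use of population variance, and the reading of "magnitude" as the mean Euclidean norm of the samples in the window. The input is taken to be the raw (x, y, z) samples of one sensor collected during one 20-second window.

```python
import math
from statistics import mean, pvariance


def window_features(samples):
    """Summarise one window of (x, y, z) samples from a single sensor.

    `samples` is a non-empty list of (x, y, z) tuples, for example all
    accelerometer readings captured during one 20-second window.
    Key names and formulas are assumptions of this sketch, not the
    dataset's documented layout.
    """
    features = {}
    for index, axis in enumerate("xyz"):
        values = [sample[index] for sample in samples]
        features[f"mean_{axis}"] = mean(values)
        features[f"max_{axis}"] = max(values)
        features[f"min_{axis}"] = min(values)
        # Population variance is an assumption; the sample variance
        # would be an equally plausible reading of "variance".
        features[f"var_{axis}"] = pvariance(values)
        # Peak-to-peak is max minus min, as stated in the abstract.
        features[f"p2p_{axis}"] = max(values) - min(values)
    # Assumed definition of "magnitude": mean Euclidean norm per window.
    norms = [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]
    features["magnitude"] = mean(norms)
    return features
```

Calling the function once for the accelerometer window and once for the gyroscope window, then concatenating the two results, would give one feature vector per window in the spirit of the description above.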
Main author(s):	Espín López, Juan Manuel; Huertas Celdrán, Alberto; Marín-Blázquez, Javier G.; Esquembre, Francisco; Martínez Pérez, Gregorio
URI:	http://hdl.handle.net/10201/138858
Document type:	info:eu-repo/semantics/dataset
Rights:	info:eu-repo/semantics/openAccess; Attribution-NonCommercial-NoDerivatives 4.0 International
Appears in collections:	Research data

Files in this item:

File	Description	Size	Format
S3Dataset.zip	ZIP with the data	47.81 MB	ZIP


This item is subject to a Creative Commons license.