Repositorio Institucional de la Universidad de Murcia

Browsing by Subject "Machine learning"

Now showing 1 - 20 of 40
  • Publication
    Open Access
    A data Science approach to cost estimation decision making - Big Data and Machine Learning
    (Universidad de Murcia, Servicio de Publicaciones, 2022) Fernández-Revuelta Pérez, Luis; Romero Blasco, Álvaro
    Cost estimation may become increasingly difficult, slow, and resource-consuming when it cannot be performed analytically. If traditional cost estimation techniques are usable at all under those circumstances, they have important limitations. This article analyses the potential applications of data science to management accounting, through the case of a cost estimation task posted on Kaggle, a Google data science and machine learning website. When extensive data exist, machine learning techniques can overcome some of those limitations. Applying machine learning to the data reveals non-obvious patterns and relationships that can be used to predict costs of new assemblies with acceptable accuracy. This article discusses the advantages and limitations of this approach and its potential to transform cost estimation, and more widely management accounting. The multinational company Caterpillar posted a contest on Kaggle to estimate the price that a supplier would quote for manufacturing a number of industrial assemblies, given historical quotes for similar assemblies. Hitherto, this problem would have required reverse-engineering the supplier’s accounting structure to establish the cost structure of each assembly, identifying non-obvious relationships among variables. This complex and tedious task is usually performed by human experts, adding subjectivity to the process.
  • Publication
    Open Access
    A mapping study of ensemble classification methods in lung cancer decision support systems
    (Springer Nature, 2020-07-03) Mohamed Hosni; García Mateos, Ginés; Carrillo de Gea, Juan Manuel; Ali Idri; Fernández Alemán, José Luis; Informática y Sistemas; Facultad de Informática
    Achieving a high level of classification accuracy in medical datasets is a key requirement for researchers seeking to provide effective decision systems that assist doctors in their work. In many domains of artificial intelligence, ensemble classification methods are able to improve the performance of single classifiers. This paper reports the state of the art of ensemble classification methods in lung cancer detection. We have performed a systematic mapping study to identify the most interesting papers concerning this topic. A total of 65 papers published between 2000 and 2018 were selected after an automatic search in four digital libraries and a careful selection process. As a result, it was observed that diagnosis was the task most commonly studied; homogeneous ensembles and decision trees were the most frequently adopted for constructing ensembles; and the majority voting rule was the predominant combination rule. Few studies considered parameter tuning of the techniques used. These findings open several perspectives for researchers to enhance lung cancer research by addressing the identified gaps, such as investigating different classification methods, proposing other heterogeneous ensemble methods, and using new combination rules.
  • Publication
    Embargo
    A novel Machine Learning-based approach for the detection of SSH botnet infection
    (2021-02) Martínez Garre, José Tomás; Gil Pérez, Manuel; Ruiz-Martínez, Antonio; Ingeniería de la Información y las Comunicaciones
    Botnets are causing severe damages to users, companies, and governments through information theft, abuse of online services, DDoS attacks, etc. Although significant research is being made to detect them and mitigate their effect, they are exponentially increasing due to new zero-day attacks, a variation of their behavior, and obfuscation techniques. High Interaction Honeypots (HIH) are the only honeypots able to capture attacks and log all the information generated by attackers when setting up a botnet. The data generated is being processed using Machine Learning (ML) techniques for detection since they can detect hidden patterns. However, so far, research has been focused on intermediate phases of the botnet’s life cycle during operation, underestimating the initial phase of infection. To the best of our knowledge, this is the first solution in the infection phase of SSH-based botnets. Therefore, we have designed an approach based on an SSH-based HIH to generate a dataset consisting of executed commands and network information. Herein, we have applied ML techniques for the development of a real-time detection model. This approach reached a very high level of prediction and zero false negatives. Indeed, our system detected all known and unknown SSH sessions intended to infect our honeypots. Thus, our research has demonstrated that new SSH infections can be detected through ML techniques.
  • Publication
    Open Access
    A programmable web platform for distributed access, analysis, and visualization of data
    (Elsevier, 2023-10-26) Esquembre, Francisco; Chacón, Jesús; Saenz, Jacobo; Vega, Jesús; Dormido-Canto, Sebastián; Matemáticas
    Daily work of Fusion Data Research (FDR) scientists faces three practical challenges: (i) getting access to vast amounts of validated, curated, and (ideally) annotated discharge data, (ii) applying a wide variety of standard, domain-specific, and home-made analysis and visualization software libraries and routines, and (iii) using fast, specialized, and not easy to obtain hardware and software installations. This paper introduces a novel web platform that addresses these three challenges in a federated way. Based on a client–server architecture, the new platform allows for easy use and exchange of curated data, validated analysis and visualization routines, and even networked hardware and software installations among the FDR community. This exchange goes beyond the mere use of a code repository, but facilitates the creation of an actual ready-to-use network of computers which can be used remotely to configure and perform data analysis. The network functions in a federated way, in which each member of the community contributes, using the same web platform, with its data, programming experience, and hardware and software availability. The platform is open source.
  • Publication
    Open Access
    A study on LIWC categories for opinion mining in Spanish reviews
    (SAGE Publications, 2014-08-26) Salas Zárate, María del Pilar; López López, Estanislao; Valencia García, Rafael; Aussenac Gilles, Natalie; Almela, Ángela; Alor Hernández, Giner; Filología Inglesa
    With the exponential growth of social media, that is, blogs and social networks, organizations and individual persons are increasingly using the number of reviews of these media for decision-making about a product or service. Opinion mining detects whether the emotions of an opinion expressed by a user on Web platforms in natural language are positive or negative. This paper presents extensive experiments to study the effectiveness of the classification of Spanish opinions in five categories: highly positive, highly negative, positive, negative and neutral, using the combination of the psychological and linguistic features of LIWC (Linguistic Inquiry and Word Count). LIWC is a text analysis software that enables the extraction of different psychological and linguistic features from natural language text. For this study, two corpora have been used, one about movies and one about technological products. Furthermore, we conducted a comparative assessment of the performance of various classification techniques, J48, SMO and BayesNet, using precision, recall and F-measure metrics. The findings revealed that the positive and negative categories provide better results than the other categories. Finally, experiments on both corpora indicated that SMO produces better results than BayesNet and J48 algorithms, obtaining an F-measure of 90.4 and 87.2% in each domain.
  • Publication
    Open Access
    An interpretable semi‐supervised system for detecting cyberattacks using anomaly detection in industrial scenarios
    (Wiley Open Access, 2023-05-09) Perales Gómez, Ángel Luis; Fernández Maimó, Lorenzo; García Clemente, Félix J.; Huertas Celdrán, Alberto; Ingeniería y Tecnología de Computadores
    When detecting cyberattacks in industrial settings, it is not sufficient to determine whether the system is suffering a cyberattack. It is also fundamental to explain why the system is under a cyberattack and which assets are affected. In this context, anomaly detection based on Machine Learning (ML) and Deep Learning (DL) techniques has shown great performance when detecting cyberattacks in industrial scenarios. However, two main limitations hinder using them in a real environment. Firstly, most solutions are trained using a supervised approach, which is impractical in the real industrial world. Secondly, the use of black-box ML and DL techniques makes it impossible to interpret the decision made by the model. This article proposes an interpretable and semi-supervised system to detect cyberattacks in industrial settings. Our proposal was validated using data collected from the Tennessee Eastman Process. To the best of our knowledge, this system is the only one that offers interpretability together with a semi-supervised approach in an industrial setting. Our system discriminates between causes and effects of anomalies and also achieved the best performance for 11 types of anomalies out of 20, with an overall recall of 0.9577, a precision of 0.9977, and an F1-score of 0.9711.
  • Publication
    Open Access
    Analysis of the hyperparameter optimisation of four machine learning satellite imagery classification methods
    (Springer, 2024-04-05) Alonso Sarría, Francisco; Valdivieso Ros, Carmen; Gomariz Castillo, Francisco; Geografía
    The classification of land use and land cover (LULC) from remotely sensed imagery in semi-arid Mediterranean areas is a challenging task due to the fragmentation of the landscape and the diversity of spatial patterns. Recently, the use of deep learning (DL) for image analysis has increased compared to commonly used machine learning (ML) methods. This paper compares the performance of four algorithms, Random Forest (RF), Support Vector Machine (SVM), Multilayer Perceptron (MLP) and Convolutional Network (CNN), using multi-source data and applying an exhaustive optimisation process of the hyperparameters. The usual approach in the optimisation process of a LULC classification model is to keep the best model in terms of accuracy without analysing the rest of the results. In this study, we have analysed such results, discovering noteworthy patterns in a space defined by the mean and standard deviation of the validation accuracy estimated in a 10-fold cross validation (CV). The point distributions in such a space do not appear to be completely random, but show clusters of points that facilitate the discovery of hyperparameter values that tend to increase the mean accuracy and decrease its standard deviation. RF is not the most accurate model, but it is the least sensitive to changes in hyperparameters. Neural networks tend to increase commission and omission errors of the less represented classes because their optimisation leads the model to learn the most frequent classes better. On the other hand, RF and MLP prediction layers are the most accurate from a general qualitative point of view.
  • Publication
    Open Access
    Application of machine learning for data analysis in paediatric dentistry: a systematic review
    (SIOI, Italian Society of Paediatric Dentistry, 2025-05) Gómez Ríos, Inmaculada; Saura López, Virginia; Pérez Silva, Amparo; Serna Muñoz, Clara; Ortiz Ruiz, Antonio José; Dermatología, Estomatología, Radiología y Medicina Física; Facultad de Medicina
    Aim: The study aims to assess whether the application of machine learning (ML) for database analysis enhances the approach to oral diseases in the paediatric population. Materials: Dental caries affects 514 million children worldwide. Artificial intelligence (AI), particularly ML, has seen increased utilisation in medicine and dentistry, handling data beyond human capacity to discern patterns and make predictions. PubMed, Web of Science, Scopus, and Lilacs databases were searched. Topics covered include the impact of oral health on adolescents' quality of life, predictors of early childhood caries and of the need of second treatment under deep sedation, and the effectiveness of preventive dental services. Methods: Twenty articles meeting eligibility criteria were analyzed for quality using the QUADAS-2 scale. The systematic review adhered to the PRISMA statement, yielding 20 articles out of 1945 initially screened. Fourteen articles focused on caries prediction, highlighting socio-demographic, behavioural, and biological predictors. ML analysis revealed that children with early caries lesions incur higher costs for insurers, with those receiving sealants and fluoride demonstrating greater cost savings. Conclusion: ML algorithms can identify patterns in large datasets, enhancing approaches to paediatric oral diseases. Their integration into research and educational programs is recommended. Methodological guidelines and quality scales specific to such studies are necessary for improved scientific evidence.
  • Publication
    Open Access
    Assessing the effects of compound events of temperature and air pollution on weekly mortality in Spain using random forests
    (Elsevier, 2025-10-18) Garnés-Morales, Ginés; Tortosa, Javier; Jiménez Guerrero, Pedro; Gil Guirado, Salvador; García Fernández, Esther; Montávez, Juan Pedro; Física
  • Publication
    Restricted
    AuthCODE: a privacy-preserving and multi-device continuous authentication architecture based on machine and deep learning
    (Elsevier, 2021-01-04) Sánchez Sánchez, Pedro Miguel; Fernández Maimó, Lorenzo; Martínez Pérez, Gregorio; Huertas Celdrán, Alberto; Ingeniería y Tecnología de Computadores
    The authentication field is evolving towards mechanisms able to keep users continuously authenticated without the necessity of remembering or possessing authentication credentials. While relevant limitations of continuous authentication systems, namely high false positive rates (FPR) and difficulty in detecting behaviour changes, have been demonstrated in realistic single-device scenarios, the Internet of Things and the next generation of mobile networks (5G) are enabling novel multi-device scenarios, such as Smart Offices, that can help to reduce or address the previous challenges. The paper at hand presents an AI-based, privacy-preserving and multi-device continuous authentication architecture called AuthCODE. AuthCODE seeks to overcome the limitations of single-device solutions by considering additional behavioural data coming from heterogeneous devices. AuthCODE proposes a novel set of features that combine the interactions of users with different devices. The relevance of these features has been demonstrated in a realistic Smart Office scenario with several users that interact with their mobile devices and personal computers. In this context, a set of single- and multi-device datasets has been generated and published to compare the performance of our multi-device solution against single-device approaches. A pool of experiments with machine and deep learning classifiers measured the impact of time on authentication accuracy and improved the results of single-device approaches by considering multi-device behaviour profiles. Specifically, the multi-device approach using XGBoost with a 1-minute window of aggregated features achieved a 69.33%, 59.65% and 89.35% improvement in the FPR when compared to the single-device approach for computer, mobile applications and mobile sensors respectively. Finally, temporal information classified by a Long Short-Term Memory network allowed the identification of additional complex behaviour patterns.
  • Publication
    Open Access
    Behavioral fingerprinting to detect ransomware in resource-constrained devices
    (Elsevier, 2023-12) Sánchez Sánchez, Pedro Miguel; Von der Assen, Jan; Shushack, Dennis; Perales Gómez, Ángel Luis; Bovet, Gérôme; Martínez Pérez, Gregorio; Stiller, Burkhard; Huertas Celdrán, Alberto; Ingeniería y Tecnología de Computadores
    The Internet of Things (IoT), a network of interconnected devices, has grown and gained traction over the last few years. This paradigm can impact our lives while also providing significant economic benefits. However, although resource-constrained IoT devices offer numerous advantages, they are also vulnerable to cyberattacks. As a result, ransomware severely threatens IoT devices managing sensitive and relevant information. Solutions based on Machine and Deep Learning (ML/DL) that consider behavioral data have been identified as promising. However, most detection solutions have been developed for Windows-based systems, which generally have more resources than IoT devices. As a result, these solutions are not suitable for resource-constrained components. In addition, no solution compares the pros and cons of different behavioral dimensions of resource-constrained devices. Thus, this work presents a framework that combines three different behavioral sources with supervised and unsupervised ML/DL algorithms to detect and classify heterogeneous ransomware impacting resource-constrained spectrum sensors. A pool of experiments has demonstrated the suitability of the proposed solution and compared its performance with a rule-based system. In conclusion, the usage of resources combined with local outlier factor and decision tree are the most promising combinations to detect anomalies and classify ransomware while consuming CPU, RAM, and time of devices in a reduced manner.
  • Publication
    Open Access
    Comparison of manual and automated digital image analysis systems for quantification of cellular protein expression
    (Universidad de Murcia, Departamento de Biologia Celular e Histiologia, 2022) Jagomast, T.; Idel, C.; Klapper, L.; Kuppler, P.; Proppe, L.; Beume, S.; Falougy, M.; Steller, D.; Hakim, S.G.; Offermann, A.; Roesch, M.C.; Bruchhage, K.L.; Perner, S.; Ribbat Idel, J.
    Objective. Quantifying protein expression in immunohistochemically stained histological slides is an important tool for oncologic research. The use of computer-aided evaluation of IHC-stained slides significantly contributes to objectifying measurements. Manual digital image analysis (mDIA) requires a user-dependent annotation of the region of interest (ROI). Other systems, with automated digital image analysis (aDIA), have built-in machine learning algorithms and can detect the ROIs automatically. We aimed to investigate the agreement between the results obtained by aDIA and those derived from mDIA systems. Methods. We quantified chromogenic intensity (CI) and calculated the positive index (PI) in cohorts of tissue microarrays (TMA) using mDIA and aDIA. To account for the different distributions of staining within cellular subcompartments and for different tumor architectures, our study encompassed nuclear and cytoplasmic stainings in adenocarcinomas and squamous cell carcinomas. Results. Within all cohorts, we were able to show a high correlation between mDIA and aDIA for the CI (p<0.001) along with high agreement for the PI. Moreover, we were able to show that the cell detections of the programs were comparable as well, and both proved to be reliable when compared to manual counting. Conclusion. mDIA and aDIA show a high correlation in acquired IHC data. Both proved to be suitable to stratify patients for evaluation with clinical data. As both produce the same level of information, aDIA might be preferable as it is time-saving, can easily be reproduced, and enables regular and efficient output in large studies in a reasonable time period.
  • Publication
    Open Access
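    Las contribuciones de los estudiantes a Wikipedia como evidencia de aprendizaje y de desarrollo de competencias en educación a distancia [Student contributions to Wikipedia as evidence of learning and competence development in distance education]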
    (Universidad de Murcia, Servicio de Publicaciones, 2025-07-30) Obregón Sierra, Ángel; Maina Patrás, Marcelo Fabián; Sin departamento asociado
    This study explores how students in an online master's programme developed disciplinary concepts through a collaborative writing strategy on Wikipedia. The strategy consisted of searching for information and producing content in line with the online encyclopedia's standards of rigour and quality over a total of eight semesters. With the help of a web service that uses machine learning to evaluate Wikipedia edits, the participation of 1779 students, with a total of 57560 edits, was analysed. The results provided evidence of student learning: students edited correctly in the encyclopedia's group workshop as a step prior to final publication of the article. A steady improvement in contributions was observed, evidenced by an increase in the degree of "good faith" and a significant decrease in "damage". Implementing this collaborative writing strategy has not only allowed students to develop competences and knowledge specific to the subject, but has also fostered critical thinking, reflection, teamwork and digital competences. Supervision by teaching staff has been essential to ensure the quality and rigour of the contributions, demonstrating that this can be an effective methodology for learning in higher education settings.
  • Publication
    Open Access
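    Desempeño logístico en entidades turísticas cubanas de la cadena de suministro: Estudio comparativo mediante machine learning [Logistics performance of Cuban tourism entities in the supply chain: a comparative study using machine learning]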
    (Universidad de Murcia: Escuela Universitaria Turismo de Murcia, 2025) Guerra Castellón, Emilio Enrique; Vázquez Alfonso, Yasser; Núñez Torres, Edgar; Departamentos
    This research evaluates the logistics performance of thirteen Cuban tourism entities in the supply chain, identifying patterns and areas for improvement using advanced machine learning techniques. The reference model of logistics excellence was used for the evaluation, and machine learning techniques such as multiple linear regression, K-means clustering and Random Forest were applied. The results showed an average performance of 3.25 on a scale of 1–5, with ITH Ciego de Ávila leading (3.92) and ITH Base de Transporte in last place (2.46). The weakest modules were Information Technologies, Software System, and Barriers and Risks. The analysis identified four groups of entities with similar profiles and highlighted the relevance of transport and technologies for improving performance. It is concluded that specific interventions in technology and risk management are needed to optimise the Cuban tourism supply chain.
  • Publication
    Open Access
    Detecting flooded areas using Sentinel-1 SAR imagery
    (MDPI, 2025-04-11) Alonso Sarria, Francisco; Valdivieso Ros, Carmen; Molina-Pérez, Gabriel; Geografía
    Floods are a major threat to human life and economic assets. Monitoring these events is therefore essential to quantify and minimize such losses. Remote sensing has been used to extract flooded areas, with SAR imagery being particularly useful as it is independent of weather conditions. This approach is more difficult when detecting flooded areas in semi-arid environments, without a reference permanent water body, than when monitoring the water level rise of permanent rivers or lakes. In this study, Random Forest is used to estimate flooded cells after 19 events in Campo de Cartagena, an agricultural area in SE Spain. Sentinel-1 SAR metrics are used as predictors and irrigation ponds as training areas. To minimize false positives, the pre- and post-event results are compared and only those pixels with a probability of water increase are considered as flooded areas. The ability of the RF model to detect water surfaces is demonstrated (mean accuracy = 0.941, standard deviation = 0.048) across the 19 events. Validation using optical imagery (Sentinel-2 MSI) reduces accuracy to 0.642. This form of validation can only be applied to a single event using an S2 image taken 3 days before the S1 image. A large number of false negatives is then expected. A procedure developed to correct for this error gives an accuracy of 0.886 for this single event. Another form of indirect validation consists in relating the area flooded in each event to the amount of rainfall recorded. An RF regression model using both rainfall metrics and season of the year gives a correlation coefficient of 0.451 and RMSE = 979 ha using LOO-CV. This result shows a clear relationship between flooded areas and rainfall metrics.
  • Publication
    Open Access
    Distributed real-time SlowDoS attacks detection over encrypted traffic using Artificial Intelligence
    (Elsevier, 2021) Garcia, Norberto; Alcaniz, Tomás; González Vidal, Aurora; Bernal Bernabé, Jorge; Rivera, Diego; Skarmeta Gómez, Antonio; Ingeniería de la Información y las Comunicaciones; Facultades de la UMU::Facultad de Informática
    SlowDoS attacks exploit slow transmissions on application-level protocols like HTTP to carry out denial of service against web servers. These attacks are difficult to detect with traditional signature-based intrusion detection approaches, even more so when the HTTP traffic is encrypted. To cope with this challenge, this paper describes an AI-based anomaly detection system for real-time detection of SlowDoS attacks over application-level encrypted traffic. Our system monitors the network traffic in real time, analyzing, processing and aggregating packets into conversation flows, and extracting valuable features and statistics that are dynamically analyzed in streaming for AI-based anomaly detection. The distributed AI model, running in Apache Spark Streaming, combines clustering analysis for anomaly detection with deep learning techniques to increase detection accuracy in those cases where clustering obtains ambiguous probabilities. The proposal has been implemented and validated in a real testbed, showing its feasibility, performance and accuracy for detecting different kinds of SlowDoS attacks over encrypted traffic in real time. The achieved results are close to the optimal precision value, with a success rate of 98%, while the false negative rate takes a value below 0.5%.
  • Publication
    Open Access
    Do governance structures drive green building adoption? A machine learning approach with random forests
    (Wiley, 2026-02-15) Valls Martínez, María del Carmen; Santos Jaén, José Manuel; Sánchez Pacheco, María Estefanía; Zambrano Farías, Fernando José; Economía Financiera y Contabilidad; Facultad de Economía y Empresa
    This study examines the determinants of firms' propensity to adopt green buildings in the Euro Stoxx 300 and the S&P 500 indices, during 2012–2023. Using random forest binary classifiers, we assess the relative importance of financial, sectoral, geographic, and climate governance predictors and uncover nonlinear relationships often overlooked by econometric approaches. Results show that sectoral affiliation is the most influential determinant in both markets. Governance-related predictors are collectively highly influential, although they exhibit different patterns across institutional contexts. In Europe, the presence of a Sustainability Committee charged with climate strategy is the most influential factor, whereas in the United States, nonfinancial/environmental performance disclosure plays a prominent role. CEO-related mechanisms show asymmetric effects. Other board characteristics, such as gender diversity, independence, size, skills, experience, turnover, meetings, and remuneration, also matter, but their impact varies by institutional context. Overall, the findings highlight that corporate governance plays a decisive yet asymmetric role in sustainable-building adoption.
  • Publication
    Open Access
    Estimation of nitrogen content in cucumber plant (Cucumis sativus L.) leaves using hyperspectral imaging data with neural network and partial least squares regressions
    (Elsevier, 2021-10-15) Sabzi, Sajad; Pourdarbani, Razieh; Rohban, Mohammad H.; García Mateos, Ginés; Arribas, J. I.; Informática y Sistemas; Facultades de la UMU::Facultad de Informática
    In recent years, farmers have often mistakenly resorted to overuse of chemical fertilizers to increase crop yield. However, excessive consumption of fertilizers might lead to severe food poisoning. Early detection of nutritional deficiencies can help farmers to design better fertigation practices before the problem becomes unsolvable. The aim of this study is to predict the amount of nitrogen (N) content in cucumber (Cucumis sativus L., var. Super Arshiya-F1) plant leaves using hyperspectral imaging (HSI) techniques and three different regression methods: a hybrid artificial neural networks-particle swarm optimization (ANN-PSO); partial least squares regression (PLSR); and unidimensional deep learning convolutional neural networks (CNN). Cucumber plant seeds were planted in 20 different pots. After growing the plants, pots were categorized and three levels of nitrogen overdose were applied, one to each category: 30%, 60% and 90% excesses, called N30%, N60% and N90%, respectively. HSI images of plant leaves were captured before and after the application of nitrogen excess. A prediction regression model was developed for each individual category. Results showed that mean regression coefficients (R) on the test set were in the ranges 0.937–0.965 for ANN-PSO, 0.975–0.997 for PLSR, and 0.965–0.985 for CNN. We conclude that regression models have a remarkable ability to accurately predict the amount of nitrogen content in cucumber plants from hyperspectral leaf images in a non-destructive way, with PLSR slightly ahead of the CNN and ANN-PSO methods.
  • Publication
    Open Access
    Estimation of soil properties using machine learning techniques to improve hydrological modeling in a semiarid environment: Campo de Cartagena (Spain)
    (Springer, 2025-03-11) Alonso Sarria, Francisco; Blanco Bernardeau, Arantzazu; Gomariz Castillo, Francisco; Romero Díaz, María Asunción; Geografía
    Soils are a key element in the hydrological cycle through a number of soil properties that are complex to estimate and exhibit considerable spatial variability. Therefore, several techniques have been proposed for their estimation and mapping from point data across a given study area. In this work, four machine learning methods, Random Forest, Support Vector Machines, XGBoost and Multilayer Perceptrons, are used to predict and map the proportions of organic carbon, clay, silt and sand in the soils of the Campo de Cartagena (SE Spain). These models depend on a number of hyperparameters that need to be optimised to maximise accuracy, although this process can lead to overtraining, which affects the generalisability of the models. In this work it was found that neural networks gave the best results in validation, but on the test data the decision-tree-based methods, Random Forest and XGBoost, were more accurate, although the differences were generally not significant. Accuracy values, as usual for soil variables, were not high. The RMSE values were 8.040 for SOC, 7.049 for clay, 10.227 for silt and 13.561 for loam. The layers obtained were then used to derive annual curve number (CN) layers whose ability to reproduce runoff hydrographs was compared with the official CN layer. For high flow events, the CN layers obtained in this study gave better results (NSE=0.807, PBIAS=-4.7 and RMSE=0.4) than the official CN layers (NSE=-2.28, PBIAS=135.82 and RMSE=1.8).
  • Publication
    Open Access
    FARMIT: Continuous Assessment of Crop Quality Using Machine Learning and Deep Learning Techniques for IoT-based Smart Farming
    (Springer, 2022-03-31) Perales Gómez, Ángel Luis; López de Teruel Alcolea, Pedro Enrique; Ruiz García, Alberto; García Mateos, Ginés; García Clemente, Félix Jesús; Ingeniería y Tecnología de Computadores
    The race for automation has reached farms and agricultural fields. Many of these facilities use Internet of Things (IoT) technologies to automate processes and increase productivity. In addition, Machine Learning and Deep Learning allow continuous decision making based on data analysis. In this work, we fill a gap in the literature and present a novel architecture based on IoT and Machine Learning / Deep Learning technologies for the continuous assessment of agricultural crop quality. This architecture is divided into three layers that work together to gather, process, and analyze data from different sources to evaluate crop quality. In the experiments, the proposed approach based on data aggregation from different sources reaches a lower percentage error than considering only one source. In particular, the percentage error achieved by our approach in the test dataset was 6.59, while the percentage error achieved exclusively using data from sensors was 6.71.

DSpace software copyright © 2002-2026 LYRASIS
