Search | arXiv e-print repository

Continual learning for rotating machinery fault diagnosis with cross-domain environmental and operational variations

Authors: Diogo Risca, Afonso Lourenço, Goreti Marreiros

Abstract: Although numerous machine learning models exist to detect issues like rolling bearing strain and deformation, typically caused by improper mounting, overloading, or poor lubrication, these models often struggle to isolate faults from the noise of real-world operational and environmental variability. Conditions such as variable loads, high temperatures, stress, and rotational speeds can mask early signs of failure, making reliable detection challenging. To address these limitations, this work proposes a continual deep learning approach capable of learning across domains that share underlying structure over time. This approach goes beyond traditional accuracy metrics by addressing four second-order challenges: catastrophic forgetting (where new learning overwrites past knowledge), lack of plasticity (where models fail to adapt to new data), forward transfer (using past knowledge to improve future learning), and backward transfer (refining past knowledge with insights from new domains). The method comprises a feature generator and domain-specific classifiers, allowing capacity to grow as new domains emerge with minimal interference, while an experience replay mechanism selectively revisits prior domains to mitigate forgetting. Moreover, nonlinear dependencies across domains are exploited by prioritizing replay from those with the highest prior errors, refining models based on the most informative past experiences. Experiments show high average domain accuracy (up to 88.96%), with forgetting measures as low as 0.0027 across non-stationary class-incremental environments.

Submitted 14 April, 2025; originally announced April 2025.

arXiv:2504.08554 [pdf, other]

Boosting-inspired online learning with transfer for railway maintenance

Authors: Diogo Risca, Afonso Lourenço, Goreti Marreiros

Abstract: The integration of advanced sensor technologies with deep learning algorithms has revolutionized fault diagnosis in railway systems, particularly at the wheel-track interface. Although numerous models have been proposed to detect irregularities such as wheel out-of-roundness, they often fall short in real-world applications due to the dynamic and nonstationary nature of railway operations. This paper introduces BOLT-RM (Boosting-inspired Online Learning with Transfer for Railway Maintenance), a model designed to address these challenges using continual learning for predictive maintenance. By allowing the model to continuously learn and adapt as new data become available, BOLT-RM overcomes the issue of catastrophic forgetting that often plagues traditional models. It retains past knowledge while improving predictive accuracy with each new learning episode, using a boosting-like knowledge sharing mechanism to adapt to evolving operational conditions such as changes in speed, load, and track irregularities. The methodology is validated through comprehensive multi-domain simulations of train-track dynamic interactions, which capture realistic railway operating conditions. The proposed BOLT-RM model demonstrates significant improvements in identifying wheel anomalies, establishing a reliable sequence for maintenance interventions.

Submitted 11 April, 2025; originally announced April 2025.

arXiv:2502.17788 [pdf, other]

On-device edge learning for IoT data streams: a survey

Authors: Afonso Lourenço, João Rodrigo, João Gama, Goreti Marreiros

Abstract: This literature review explores continual learning methods for on-device training in the context of neural networks (NNs) and decision trees (DTs) for classification tasks in smart environments. We highlight key constraints, such as data architecture (batch vs. stream) and network capacity (cloud vs. edge), which impact TinyML algorithm design, due to the uncontrolled natural arrival of data streams. The survey details the challenges of deploying deep learners on resource-constrained edge devices, including catastrophic forgetting, data inefficiency, and the difficulty of handling IoT tabular data in open-world settings. While decision trees are more memory-efficient for on-device training, they are limited in expressiveness, requiring dynamic adaptations, like pruning and meta-learning, to handle complex patterns and concept drifts. We emphasize the importance of multi-criteria performance evaluation tailored to edge applications, assessing both output-based and internal representation metrics. The key challenge lies in integrating these building blocks into autonomous online systems, taking into account stability-plasticity trade-offs, forward-backward transfer, and model convergence.

Submitted 24 February, 2025; originally announced February 2025.

arXiv:2502.16840 [pdf, other]

In-context learning of evolving data streams with tabular foundational models

Authors: Afonso Lourenço, João Gama, Eric P. Xing, Goreti Marreiros

Abstract: State-of-the-art data stream mining in supervised classification has traditionally relied on ensembles of incremental decision trees. However, the emergence of large tabular models, i.e., transformers designed for structured numerical data, marks a significant paradigm shift. These models move beyond traditional weight updates, instead employing in-context learning through prompt tuning. By using on-the-fly sketches to summarize unbounded streaming data, one can feed this information into a pre-trained model for efficient processing. This work bridges advancements from both areas, highlighting how transformers' implicit meta-learning abilities, pre-training on drifting natural data, and reliance on context optimization directly address the core challenges of adaptive learning in dynamic environments. Exploring real-time model adaptation, this research demonstrates that TabPFN, coupled with a simple sliding memory strategy, consistently outperforms ensembles of Hoeffding trees across all non-stationary benchmarks. Several promising research directions are outlined in the paper. The authors urge the community to explore these ideas, offering valuable opportunities to advance in-context stream learning.

Submitted 23 February, 2025; originally announced February 2025.

arXiv:2502.14011 [pdf, other]

DFDT: Dynamic Fast Decision Tree for IoT Data Stream Mining on Edge Devices

Authors: Afonso Lourenço, João Rodrigo, João Gama, Goreti Marreiros

Abstract: The Internet of Things generates massive data streams, with edge computing emerging as a key enabler for online IoT applications and 5G networks. Edge solutions facilitate real-time machine learning inference, but also require continuous adaptation to concept drifts. Ensemble-based solutions improve predictive performance, but incur higher resource consumption, latency, and memory demands. This paper presents DFDT: Dynamic Fast Decision Tree, a novel algorithm designed for energy-efficient memory-constrained data stream mining. DFDT improves Hoeffding tree growth efficiency by dynamically adjusting grace periods, tie thresholds, and split evaluations based on incoming data. It incorporates stricter evaluation rules (based on entropy, information gain, and leaf instance count), adaptive expansion modes, and a leaf deactivation mechanism to manage memory, allowing more computation on frequently visited nodes while conserving energy on others. Experiments show that the proposed framework can achieve increased predictive performance (0.43 vs 0.29 ranking) with constrained memory and a fraction of the runtime of VFDT or SVFDT.

Submitted 19 February, 2025; originally announced February 2025.

arXiv:2408.10482 [pdf, other]

doi 10.1109/CEC60901.2024.10611839

Evaluation Framework for AI-driven Molecular Design of Multi-target Drugs: Brain Diseases as a Case Study

Authors: Arthur Cerveira, Frederico Kremer, Darling de Andrade Lourenço, Ulisses B Corrêa

Abstract: The widespread application of Artificial Intelligence (AI) techniques has significantly influenced the development of new therapeutic agents. These computational methods can be used to design and predict the properties of generated molecules. Multi-target Drug Discovery (MTDD) is an emerging paradigm for discovering drugs against complex disorders that do not respond well to more traditional target-specific treatments, such as central nervous system, immune system, and cardiovascular diseases. Still, there is yet to be an established benchmark suite for assessing the effectiveness of AI tools for designing multi-target compounds. Standardized benchmarks allow for comparing existing techniques and promote rapid research progress. Hence, this work proposes an evaluation framework for molecule generation techniques in MTDD scenarios, considering brain diseases as a case study. Our methodology involves using large language models to select the appropriate molecular targets, gathering and preprocessing the bioassay datasets, training quantitative structure-activity relationship models to predict target modulation, and assessing other essential drug-likeness properties for implementing the benchmarks. Additionally, this work will assess the performance of four deep generative models and evolutionary algorithms over our benchmark suite. In our findings, both evolutionary algorithms and generative models can achieve competitive results across the proposed benchmarks.

Submitted 19 August, 2024; originally announced August 2024.

Comments: 8 pages, 1 figure, published in 2024 IEEE Congress on Evolutionary Computation (CEC)

Journal ref: 2024 IEEE Congress on Evolutionary Computation (CEC), Yokohama, Japan, 2024, pp. 1-8

arXiv:2303.13649 [pdf]

doi 10.1007/978-3-031-34344-5_13

Adversarial Robustness and Feature Impact Analysis for Driver Drowsiness Detection

Authors: João Vitorino, Lourenço Rodrigues, Eva Maia, Isabel Praça, André Lourenço

Abstract: Drowsy driving is a major cause of road accidents, but drivers are dismissive of the impact that fatigue can have on their reaction times. To detect drowsiness before any impairment occurs, a promising strategy is using Machine Learning (ML) to monitor Heart Rate Variability (HRV) signals. This work presents multiple experiments with different HRV time windows and ML models, a feature impact analysis using Shapley Additive Explanations (SHAP), and an adversarial robustness analysis to assess their reliability when processing faulty input data and perturbed HRV signals. The most reliable model was Extreme Gradient Boosting (XGB) and the optimal time window was between 120 and 150 seconds. Furthermore, SHAP enabled the selection of the 18 most impactful features and the training of new smaller models that achieved a performance as good as the initial ones. Despite the susceptibility of all models to adversarial attacks, adversarial training enabled them to preserve significantly higher results, especially XGB. Therefore, ML models can significantly benefit from realistic adversarial training to provide a more robust driver drowsiness detection.

Submitted 23 March, 2023; originally announced March 2023.

Comments: 10 pages, 2 tables, 3 figures, AIME 2023 conference

arXiv:2212.14006 [pdf, other]

Anxolotl, an Anxiety Companion App -- Stress Detection

Authors: Nuno Gomes, Matilde Pato, Pedro Santos, André Lourenço, Lourenço Rodrigues

Abstract: Stress has a great effect on people's lives that cannot be overstated. While it can be good, since it helps humans adapt to new and different situations, it can also be harmful when not dealt with properly, leading to chronic stress. The objective of this paper is to develop a stress monitoring solution that can be used in real life, while being able to tackle this challenge in a positive way. The SMILE data set was provided to team Anxolotl, and all that was needed was to develop a robust model. We developed a supervised learning model for classification in Python, with a final accuracy of 64.1% and an F1-score of 54.96%. The resulting solution passed the robustness test, presenting low variation between runs, which was a major point for its possible integration in the Anxolotl app in the future.

Submitted 3 January, 2023; v1 submitted 28 December, 2022; originally announced December 2022.

Comments: 7 pages, 3 figures, 2 tables, IEEE 44th International Engineering in Medicine and Biology Conference

ACM Class: J.3; I.5.5; I.5.4

arXiv:1412.0744 [pdf, other]

doi 10.1371/journal.pone.0122199

Extraction of Pharmacokinetic Evidence of Drug-drug Interactions from the Literature

Authors: Artemy Kolchinsky, Anália Lourenço, Heng-Yi Wu, Lang Li, Luis M. Rocha

Abstract: Drug-drug interaction (DDI) is a major cause of morbidity and mortality and a subject of intense scientific interest. Biomedical literature mining can aid DDI research by extracting evidence for large numbers of potential interactions from published literature and clinical databases. Though DDI is investigated in domains ranging in scale from intracellular biochemistry to human populations, literature mining has not been used to extract specific types of experimental evidence, which are reported differently for distinct experimental goals. We focus on pharmacokinetic evidence for DDI, essential for identifying causal mechanisms of putative interactions and as input for further pharmacological and pharmaco-epidemiology investigations. We used manually curated corpora of PubMed abstracts and annotated sentences to evaluate the efficacy of literature mining on two tasks: first, identifying PubMed abstracts containing pharmacokinetic evidence of DDIs; second, extracting sentences containing such evidence from abstracts. We implemented a text mining pipeline and evaluated it using several linear classifiers and a variety of feature transforms. The most important textual features in the abstract and sentence classification tasks were analyzed. We also investigated the performance benefits of using features derived from PubMed metadata fields, various publicly available named entity recognizers, and pharmacokinetic dictionaries. Several classifiers performed very well in distinguishing relevant and irrelevant abstracts (reaching F1≈0.93, MCC≈0.74, iAUC≈0.99) and sentences (F1≈0.76, MCC≈0.65, iAUC≈0.83). We found that word bigram features were important for achieving optimal classifier performance and that features derived from Medical Subject Headings (MeSH) terms significantly improved abstract classification. ...

Submitted 18 May, 2015; v1 submitted 1 December, 2014; originally announced December 2014.

Comments: PLOS One (2015)

ACM Class: H.2.8; H.3.1; J.3

arXiv:1210.0734 [pdf, other]

Evaluation of linear classifiers on articles containing pharmacokinetic evidence of drug-drug interactions

Authors: Artemy Kolchinsky, Anália Lourenço, Lang Li, Luis M. Rocha

Abstract: Background. Drug-drug interaction (DDI) is a major cause of morbidity and mortality. [...] Biomedical literature mining can aid DDI research by extracting relevant DDI signals from either the published literature or large clinical databases. However, though drug interaction is an ideal area for translational research, the inclusion of literature mining methodologies in DDI workflows is still very preliminary. One area that can benefit from literature mining is the automatic identification of a large number of potential DDIs, whose pharmacological mechanisms and clinical significance can then be studied via in vitro pharmacology and in populo pharmaco-epidemiology. Experiments. We implemented a set of classifiers for identifying published articles relevant to experimental pharmacokinetic DDI evidence. These documents are important for identifying causal mechanisms behind putative drug-drug interactions, an important step in the extraction of large numbers of potential DDIs. We evaluate performance of several linear classifiers on PubMed abstracts, under different feature transformation and dimensionality reduction methods. In addition, we investigate the performance benefits of including various publicly-available named entity recognition features, as well as a set of internally-developed pharmacokinetic dictionaries. Results. We found that several classifiers performed well in distinguishing relevant and irrelevant abstracts. We found that the combination of unigram and bigram textual features gave better performance than unigram features alone, and also that normalization transforms that adjusted for feature frequency and document length improved classification. For some classifiers, such as linear discriminant analysis (LDA), proper dimensionality reduction had a large impact on performance. Finally, the inclusion of NER features and dictionaries was found not to help classification.

Submitted 2 October, 2012; originally announced October 2012.

Comments: Pacific Symposium on Biocomputing, 2013

ACM Class: H.2.8; H.3.1; J.3

Journal ref: Pac Symp Biocomput. 2013:409-20

arXiv:1103.4090 [pdf]

A Linear Classifier Based on Entity Recognition Tools and a Statistical Approach to Method Extraction in the Protein-Protein Interaction Literature

Authors: Anália Lourenço, Michael Conover, Andrew Wong, Azadeh Nematzadeh, Fengxia Pan, Hagit Shatkay, Luis M. Rocha

Abstract: We participated in the Article Classification and the Interaction Method subtasks (ACT and IMT, respectively) of the Protein-Protein Interaction task of the BioCreative III Challenge. For the ACT, we pursued an extensive testing of available Named Entity Recognition and dictionary tools, and used the most promising ones to extend our Variable Trigonometric Threshold linear classifier. For the IMT, we experimented with a primarily statistical approach, as opposed to employing a deeper natural language processing strategy. Finally, we also studied the benefits of integrating the method extraction approach that we have used for the IMT into the ACT pipeline. For the ACT, our linear article classifier leads to a ranking and classification performance significantly higher than all the reported submissions. For the IMT, our results are comparable to those of other systems, which took very different approaches. For the ACT, we show that the use of named entity recognition tools leads to a substantial improvement in the ranking and classification of articles relevant to protein-protein interaction. Thus, we show that our substantially expanded linear classifier is a very competitive classifier in this domain. Moreover, this classifier produces interpretable surfaces that can be understood as "rules" for human understanding of the classification. In terms of the IMT task, in contrast to other participants, our approach focused on identifying sentences that are likely to bear evidence for the application of a PPI detection method, rather than on classifying a document as relevant to a method. As BioCreative III did not perform an evaluation of the evidence provided by the system, we have conducted a separate assessment; the evaluators agree that our tool is indeed effective in detecting relevant evidence for PPI detection methods.

Submitted 22 April, 2011; v1 submitted 21 March, 2011; originally announced March 2011.

Comments: BMC Bioinformatics. In Press

Showing 1–11 of 11 results for author: Lourenço, A