Search | arXiv e-print repository

The Amazon Nova Family of Models: Technical Report and Model Card

Authors: Amazon AGI, Aaron Langford, Aayush Shah, Abhanshu Gupta, Abhimanyu Bhatter, Abhinav Goyal, Abhinav Mathur, Abhinav Mohanty, Abhishek Kumar, Abhishek Sethi, Abi Komma, Abner Pena, Achin Jain, Adam Kunysz, Adam Opyrchal, Adarsh Singh, Aditya Rawal, Adok Achar Budihal Prasad, Adrià de Gispert, Agnika Kumar, Aishwarya Aryamane, Ajay Nair, Akilan M, Akshaya Iyengar, Akshaya Vishnu Kudlu Shanbhogue, et al. (761 additional authors not shown)

Abstract: We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly-capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents and text. Amazon Nova Micro is a text-only model that delivers our lowest-latency responses at very low cost. Amazon Nova Canvas is an image generation model that creates professional grade images with rich customization controls. Amazon Nova Reel is a video generation model offering high-quality outputs, customization, and motion control. Our models were built responsibly and with a commitment to customer trust, security, and reliability. We report benchmarking results for core capabilities, agentic performance, long context, functional adaptation, runtime performance, and human evaluation.

Submitted 17 March, 2025; originally announced June 2025.

Comments: 48 pages, 10 figures

Report number: 20250317

arXiv:2506.10689 [pdf, ps, other]

Underage Detection through a Multi-Task and MultiAge Approach for Screening Minors in Unconstrained Imagery

Authors: Christopher Gaul, Eduardo Fidalgo, Enrique Alegre, Rocío Alaiz Rodríguez, Eri Pérez Corral

Abstract: Accurate automatic screening of minors in unconstrained images demands models that are robust to distribution shift and resilient to the under-representation of children in publicly available data. To overcome these issues, we propose a multi-task architecture with dedicated under/over-age discrimination tasks based on a frozen FaRL vision-language backbone joined with a compact two-layer MLP that shares features across one age-regression head and four binary under-age heads for age thresholds of 12, 15, 18, and 21 years, focusing on the legally critical age range. To address the severe class imbalance, we introduce an α-reweighted focal-style loss and age-balanced mini-batch sampling, which equalizes twelve age bins during stochastic optimization. Further improvement is achieved with an age gap that removes edge cases from the loss. Moreover, we set a rigorous evaluation by proposing the Overall Under-Age Benchmark, with 303k cleaned training images and 110k test images, defining both the "ASORES-39k" restricted overall test, which removes the noisiest domains, and the age estimation wild shifts test "ASWIFT-20k" of 20k images, stressing extreme pose (>45°), expression, and low image quality to emulate real-world shifts. Trained on the cleaned overall set with resampling and age gap, our multiage model "F" lowers the root-mean-square-error on the ASORES-39k restricted test from 5.733 (age-only baseline) to 5.656 years and lifts under-18 detection from F2 score of 0.801 to 0.857 at 1% false-adult rate. Under the domain shift to the wild data of ASWIFT-20k, the same configuration nearly sustains 0.99 recall while boosting F2 from 0.742 to 0.833 with respect to the age-only baseline, demonstrating strong generalization under distribution shift. For the under-12 and under-15 tasks, the boosts in F2 are from 0.666 to 0.955 and from 0.689 to 0.916, respectively.

Submitted 12 June, 2025; originally announced June 2025.
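
Note on the F2 figures quoted in the entry above: the F-beta score is the weighted harmonic mean of precision and recall, and with beta = 2 it weights recall more heavily than precision, which suits screening tasks where a missed minor costs more than a false alarm. The sketch below is only an illustration of the standard formula; the function name `f_beta` and the sample precision and recall values are assumptions and are not taken from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Standard F-beta score: weighted harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # conventional value when both terms are zero
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical operating point, chosen for illustration only.
example_precision = 0.60
example_recall = 0.99
print(round(f_beta(example_precision, example_recall), 3))
```

Because beta = 2 puts more weight on recall, a model can reach a high F2 with modest precision as long as recall stays near 1.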

arXiv:2505.08382 [pdf, ps, other]

Continuous World Coverage Path Planning for Fixed-Wing UAVs using Deep Reinforcement Learning

Authors: Mirco Theile, Andres R. Zapata Rodriguez, Marco Caccamo, Alberto L. Sangiovanni-Vincentelli

Abstract: Unmanned Aerial Vehicle (UAV) Coverage Path Planning (CPP) is critical for applications such as precision agriculture and search and rescue. While traditional methods rely on discrete grid-based representations, real-world UAV operations require power-efficient continuous motion planning. We formulate the UAV CPP problem in a continuous environment, minimizing power consumption while ensuring complete coverage. Our approach models the environment with variable-size axis-aligned rectangles and UAV motion with curvature-constrained Bézier curves. We train a reinforcement learning agent using an action-mapping-based Soft Actor-Critic (AM-SAC) algorithm employing a self-adaptive curriculum. Experiments on both procedurally generated and hand-crafted scenarios demonstrate the effectiveness of our method in learning energy-efficient coverage strategies.

Submitted 13 May, 2025; originally announced May 2025.

Comments: Submitted to IROS 2025

arXiv:2504.06176 [pdf, other]

A Self-Supervised Framework for Space Object Behaviour Characterisation

Authors: Ian Groves, Andrew Campbell, James Fernandes, Diego Ramírez Rodríguez, Paul Murray, Massimiliano Vasile, Victoria Nockles

Abstract: Foundation Models, pre-trained on large unlabelled datasets before task-specific fine-tuning, are increasingly being applied to specialised domains. Recent examples include ClimaX for climate and Clay for satellite Earth observation, but a Foundation Model for Space Object Behavioural Analysis has not yet been developed. As orbital populations grow, automated methods for characterising space object behaviour are crucial for space safety. We present a Space Safety and Sustainability Foundation Model focusing on space object behavioural analysis using light curves (LCs). We implemented a Perceiver-Variational Autoencoder (VAE) architecture, pre-trained with self-supervised reconstruction and masked reconstruction on 227,000 LCs from the MMT-9 observatory. The VAE enables anomaly detection, motion prediction, and LC generation. We fine-tuned the model for anomaly detection & motion prediction using two independent LC simulators (CASSANDRA and GRIAL respectively), using CAD models of boxwing, Sentinel-3, SMOS, and Starlink platforms. Our pre-trained model achieved a reconstruction error of 0.01%, identifying potentially anomalous light curves through reconstruction difficulty. After fine-tuning, the model scored 88% and 82% accuracy, with 0.90 and 0.95 ROC AUC scores respectively in both anomaly detection and motion mode prediction (sun-pointing, spin, etc.). Analysis of high-confidence anomaly predictions on real data revealed distinct patterns including characteristic object profiles and satellite glinting. Here, we demonstrate how self-supervised learning can simultaneously enable anomaly detection, motion prediction, and synthetic data generation from rich representations learned in pre-training. Our work therefore supports space safety and sustainability through automated monitoring and simulation capabilities.

Submitted 11 April, 2025; v1 submitted 8 April, 2025; originally announced April 2025.

Comments: 15 pages, 10 figures

arXiv:2503.13573 [pdf]

doi 10.1016/j.patcog.2025.111581

Online Signature Verification based on the Lagrange formulation with 2D and 3D robotic models

Authors: Moises Diaz, Miguel A. Ferrer, Juan M. Gil, Rafael Rodriguez, Peirong Zhang, Lianwen Jin

Abstract: Online Signature Verification commonly relies on function-based features, such as time-sampled horizontal and vertical coordinates, as well as the pressure exerted by the writer, obtained through a digitizer. Although inferring additional information about the writer's arm pose, kinematics, and dynamics based on digitizer data can be useful, it constitutes a challenge. In this paper, we tackle this challenge by proposing a new set of features based on the dynamics of online signatures. These new features are inferred through a Lagrangian formulation, obtaining the sequences of generalized coordinates and torques for 2D and 3D robotic arm models. By combining kinematic and dynamic robotic features, our results demonstrate their significant effectiveness for online automatic signature verification and achieve state-of-the-art results when integrated into deep learning models.

Submitted 17 March, 2025; originally announced March 2025.

Journal ref: Science direct, March 17 2025

arXiv:2502.07105 [pdf, ps, other]

doi 10.1162/99608f92.db29c137

Toward a Principled Framework for Disclosure Avoidance

Authors: Michael B Hawes, Evan M Brassell, Anthony Caruso, Ryan Cumings-Menon, Jason Devine, Cassandra Dorius, David Evans, Kenneth Haase, Michele C Hedrick, Alexandra Krause, Philip Leclerc, James Livsey, Rolando A Rodriguez, Luke T Rogers, Matthew Spence, Victoria Velkoff, Michael Walsh, James Whitehorne, Sallie Ann Keller

Abstract: Responsible disclosure limitation is an iterative exercise in risk assessment and mitigation. From time to time, as disclosure risks grow and evolve and as data users' needs change, agencies must consider redesigning the disclosure avoidance system(s) they use. Discussions about candidate systems often conflate inherent features of those systems with implementation decisions independent of those systems. For example, a system's ability to calibrate the strength of protection to suit the underlying disclosure risk of the data (e.g., by varying suppression thresholds), is a worthwhile feature regardless of the independent decision about how much protection is actually necessary. Having a principled discussion of candidate disclosure avoidance systems requires a framework for distinguishing these inherent features of the systems from the implementation decisions that need to be made independent of the system selected. For statistical agencies, this framework must also reflect the applied nature of these systems, acknowledging that candidate systems need to be adaptable to requirements stemming from the legal, scientific, resource, and stakeholder environments within which they would be operating. This paper proposes such a framework. No approach will be perfectly adaptable to every potential system requirement. Because the selection of some methodologies over others may constrain the resulting systems' efficiency and flexibility to adapt to particular statistical product specifications, data user needs, or disclosure risks, agencies may approach these choices in an iterative fashion, adapting system requirements, product specifications, and implementation parameters as necessary to ensure the resulting quality of the statistical product.

Submitted 29 May, 2025; v1 submitted 10 February, 2025; originally announced February 2025.

arXiv:2501.14249 [pdf, other]

Humanity's Last Exam

Authors: Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Chen Bo Calvin Zhang, Mohamed Shaaban, John Ling, Sean Shi, Michael Choi, Anish Agrawal, Arnav Chopra, Adam Khoja, Ryan Kim, Richard Ren, Jason Hausenloy, Oliver Zhang, Mantas Mazeika, Dmitry Dodonov, Tung Nguyen, Jaeho Lee, Daron Anderson, Mikhail Doroshenko, Alun Cennyth Stokes, et al. (1084 additional authors not shown)

Abstract: Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage. HLE consists of 2,500 questions across dozens of subjects, including mathematics, humanities, and the natural sciences. HLE is developed globally by subject-matter experts and consists of multiple-choice and short-answer questions suitable for automated grading. Each question has a known solution that is unambiguous and easily verifiable, but cannot be quickly answered via internet retrieval. State-of-the-art LLMs demonstrate low accuracy and calibration on HLE, highlighting a significant gap between current LLM capabilities and the expert human frontier on closed-ended academic questions. To inform research and policymaking upon a clear understanding of model capabilities, we publicly release HLE at https://lastexam.ai.

Submitted 19 April, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

Comments: 29 pages, 6 figures

arXiv:2411.01144 [pdf, other]

LEARNER: Learning Granular Labels from Coarse Labels using Contrastive Learning

Authors: Gautam Gare, Jana Armouti, Nikhil Madaan, Rohan Panda, Tom Fox, Laura Hutchins, Amita Krishnan, Ricardo Rodriguez, Bennett DeBoisblanc, Deva Ramanan, John Galeotti

Abstract: A crucial question in active patient care is determining if a treatment is having the desired effect, especially when changes are subtle over short periods. We propose using inter-patient data to train models that can learn to detect these fine-grained changes within a single patient. Specifically, can a model trained on multi-patient scans predict subtle changes in an individual patient's scans? Recent years have seen increasing use of deep learning (DL) in predicting diseases using biomedical imaging, such as predicting COVID-19 severity using lung ultrasound (LUS) data. While extensive literature exists on successful applications of DL systems when well-annotated large-scale datasets are available, it is quite difficult to collect a large corpus of personalized datasets for an individual. In this work, we investigate the ability of recent computer vision models to learn fine-grained differences while being trained on data showing larger differences. We evaluate on an in-house LUS dataset and a public ADNI brain MRI dataset. We find that models pre-trained on clips from multiple patients can better predict fine-grained differences in scans from a single patient by employing contrastive learning.

Submitted 2 November, 2024; originally announced November 2024.

Comments: Under review at ISBI 2025 conference

arXiv:2410.21521 [pdf, other]

A Multi-Agent Reinforcement Learning Testbed for Cognitive Radio Applications

Authors: Sriniketh Vangaru, Daniel Rosen, Dylan Green, Raphael Rodriguez, Maxwell Wiecek, Amos Johnson, Alyse M. Jones, William C. Headley

Abstract: Technological trends show that Radio Frequency Reinforcement Learning (RFRL) will play a prominent role in the wireless communication systems of the future. Applications of RFRL range from military communications jamming to enhancing WiFi networks. Before deploying algorithms for these purposes, they must be trained in a simulation environment to ensure adequate performance. For this reason, we previously created the RFRL Gym: a standardized, accessible tool for the development and testing of reinforcement learning (RL) algorithms in the wireless communications space. This environment leveraged the OpenAI Gym framework and featured customizable simulation scenarios within the RF spectrum. However, the RFRL Gym was limited to training a single RL agent per simulation; this is not ideal, as most real-world RF scenarios will contain multiple intelligent agents in cooperative, competitive, or mixed settings, which is a natural consequence of spectrum congestion. Therefore, through integration with Ray RLlib, multi-agent reinforcement learning (MARL) functionality for training and assessment has been added to the RFRL Gym, making it even more of a robust tool for RF spectrum simulation. This paper provides an overview of the updated RFRL Gym environment. In this work, the general framework of the tool is described relative to comparable existing resources, highlighting the significant additions and refactoring we have applied to the Gym. Afterward, results from testing various RF scenarios in the MARL environment and future additions are discussed.

Submitted 2 December, 2024; v1 submitted 28 October, 2024; originally announced October 2024.

Comments: Accepted to IEEE CCNC 2025. Added revisions from paper reviews

arXiv:2410.13026 [pdf, other]

Design and Feasibility of a Community Motorcycle Ambulance System in the Philippines

Authors: Aaron Rodriguez, Aidan Chen, Ryan Rodriguez

Abstract: This study investigates the potential for motorcycle ambulance (motorlance) deployment in Metro Manila and Iloilo City to improve emergency medical care in high-traffic, underserved regions of the Philippines. VSee, a humanitarian technology company, has organized numerous free clinics in the Philippines and identified a critical need for improved emergency services. Motorlances offer a fast, affordable alternative to traditional ambulances, particularly in congested urban settings and remote rural locations. Pilot programs in Malawi, Thailand, and Iran have demonstrated significant improvements in response times and cost-efficiency with motorlance systems. This study presents a framework for motorlance operation and identifies three potential pilot locations: Mandaluyong, Smokey Mountain, and Iloilo City. Site visits, driver interviews, and user surveys indicate public trust in the motorlance concept and positive reception to potential motorlance deployment. Cost analysis verifies the financial feasibility of motorlance systems. Future work will focus on implementing a physical pilot in Mandaluyong, with the aim of expanding service to similar regions contingent on the Mandaluyong pilot's success.

Submitted 16 October, 2024; originally announced October 2024.

Comments: 7 pages, 8 figures

arXiv:2410.04173 [pdf, other]

Fast Object Detection with a Machine Learning Edge Device

Authors: Richard C. Rodriguez, Jonah Elijah P. Bardos

Abstract: This machine learning study investigates a low-cost edge device integrated with an embedded system having computer vision and resulting in an improved performance in inferencing time and precision of object detection and classification. A primary aim of this study focused on reducing inferencing time and low-power consumption and to enable an embedded device of a competition-ready autonomous humanoid robot and to support real-time object recognition, scene understanding, visual navigation, motion planning, and autonomous navigation of the robot. This study compares processors for inferencing time performance between a central processing unit (CPU), a graphical processing unit (GPU), and a tensor processing unit (TPU). CPUs, GPUs, and TPUs are all processors that can be used for machine learning tasks. Related to the aim of supporting an autonomous humanoid robot, there was an additional effort to observe whether or not there was a significant difference in using a camera having monocular vision versus stereo vision capability. TPU inference time results for this study reflect a 25% reduction in time over the GPU, and a whopping 87.5% reduction in inference time compared to the CPU. Much information in this paper is contributed to the final selection of Google's Coral brand, Edge TPU device. The Arduino Nano 33 BLE Sense Tiny ML Kit was also considered for comparison but due to initial incompatibilities and in the interest of time to complete this study, a decision was made to review the kit in a future experiment.

Submitted 5 October, 2024; originally announced October 2024.
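
Note on the percentages quoted in the entry above: the two reductions can be turned into relative speedups. The sketch below is a rough illustration under two assumptions that the abstract does not state explicitly: that "25% reduction over the GPU" means TPU time = 0.75 × GPU time, and that CPU time is normalised to 1.0 (the abstract gives no absolute timings). The function name `implied_times` is hypothetical.

```python
def implied_times(cpu_time: float = 1.0,
                  tpu_vs_cpu_reduction: float = 0.875,
                  tpu_vs_gpu_reduction: float = 0.25) -> dict:
    """Derive relative GPU and TPU inference times from the quoted reductions."""
    # TPU time is the CPU time minus the stated 87.5% reduction.
    tpu = cpu_time * (1.0 - tpu_vs_cpu_reduction)
    # Assumed reading: TPU time is 25% lower than GPU time, so GPU = TPU / 0.75.
    gpu = tpu / (1.0 - tpu_vs_gpu_reduction)
    return {"cpu": cpu_time, "gpu": gpu, "tpu": tpu}


times = implied_times()
speedup_tpu_over_cpu = times["cpu"] / times["tpu"]
speedup_tpu_over_gpu = times["gpu"] / times["tpu"]
print(times, speedup_tpu_over_cpu, speedup_tpu_over_gpu)
```

Under these assumptions the TPU would be 8 times faster than the CPU and roughly 1.33 times faster than the GPU.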

arXiv:2409.15090 [pdf, other]

Using Similarity to Evaluate Factual Consistency in Summaries

Authors: Yuxuan Ye, Edwin Simpson, Raul Santos Rodriguez

Abstract: Cutting-edge abstractive summarisers generate fluent summaries, but the factuality of the generated text is not guaranteed. Early summary factuality evaluation metrics are usually based on n-gram overlap and embedding similarity, but are reported to fail to align with human annotations. Therefore, many techniques for detecting factual inconsistencies build pipelines around natural language inference (NLI) or question-answering (QA) models with additional supervised learning steps. In this paper, we revisit similarity-based metrics, showing that this failure stems from the comparison text selection and its granularity. We propose a new zero-shot factuality evaluation metric, Sentence-BERT Score (SBERTScore), which compares sentences between the summary and the source document. It outperforms widely-used word-word metrics including BERTScore and can compete with existing NLI and QA-based factuality metrics on the benchmark without needing any fine-tuning. Our experiments indicate that each technique has different strengths, with SBERTScore particularly effective in identifying correct summaries. We demonstrate how a combination of techniques is more effective in detecting various types of error.

Submitted 23 September, 2024; originally announced September 2024.

arXiv:2408.12217 [pdf, other]

doi 10.1109/ACCESS.2024.3514603

Quantifying Psychological Sophistication of Malicious Emails

Authors: Theodore Longtchi, Rosana Montañez Rodriguez, Kora Gwartney, Ekzhin Ear, David P. Azari, Christopher P. Kelley, Shouhuai Xu

Abstract: Malicious emails including Phishing, Spam, and Scam are one significant class of cyber social engineering attacks. Despite numerous defenses to counter them, the problem remains largely open. The ineffectiveness of current defenses can be attributed to our superficial understanding of the psychological properties that make these attacks successful. This problem motivates us to investigate the psychological sophistication, or sophistication for short, of malicious emails. We propose an innovative framework that accommodates two important and complementary aspects of sophistication, dubbed Psychological Techniques, PTechs, and Psychological Tactics, PTacs. We propose metrics and grading rules for human experts to assess the sophistication of malicious emails via the lens of these PTechs and PTacs. To demonstrate the usefulness of the framework, we conduct a case study based on 1,036 malicious emails assessed by four independent graders. Our results show that malicious emails are psychologically sophisticated, while exhibiting both commonalities and different patterns in terms of their PTechs and PTacs. Results also show that previous studies might have focused on dealing with the less proliferated PTechs such as Persuasion and PTacs such as Reward, rather than the most proliferated PTechs such as Attention Grabbing and Impersonation, and PTacs such as Fit and Form and Familiarity that are identified in this study. We also found among others that social events are widely exploited by attackers in contextualizing their malicious emails. These findings could be leveraged to guide the design of effective defenses against malicious emails.

Submitted 22 August, 2024; originally announced August 2024.

Comments: 22 pages, 15 figures, 4 tables

Report number: Access-2024-45196

Journal ref: IEEE Access 12 (2024) 187512-187535

arXiv:2407.21783 [pdf, other]

The Llama 3 Herd of Models

Authors: Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurelien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Roziere, et al. (536 additional authors not shown)

Abstract: Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.

Submitted 23 November, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

arXiv:2405.19354 [pdf, ps, other]

doi 10.1007/978-3-031-08971-8_55

Rotations of Gödel algebras with modal operators

Authors: Tommaso Flaminio, Lluis Godo, Paula Menchón, Ricardo O. Rodriguez

Abstract: The present paper is devoted to studying the effect of connected and disconnected rotations of Gödel algebras with operators grounded on directly indecomposable structures. The structures resulting from this construction we will present are nilpotent minimum (with or without negation fixpoint, depending on whether the rotation is connected or disconnected) with special modal operators defined on a directly indecomposable algebra. In this paper we will present a (quasi-)equational definition of these latter structures. Our main results show that directly indecomposable nilpotent minimum algebras (with or without negation fixpoint) with modal operators are fully characterized as connected and disconnected rotations of directly indecomposable Gödel algebras endowed with modal operators.

Submitted 23 May, 2024; originally announced May 2024.

MSC Class: 03B50; 03B45

arXiv:2405.07369 [pdf, other]

Incorporating Anatomical Awareness for Enhanced Generalizability and Progression Prediction in Deep Learning-Based Radiographic Sacroiliitis Detection

Authors: Felix J. Dorfner, Janis L. Vahldiek, Leonhard Donle, Andrei Zhukov, Lina Xu, Hartmut Häntze, Marcus R. Makowski, Hugo J. W. L. Aerts, Fabian Proft, Valeria Rios Rodriguez, Judith Rademacher, Mikhail Protopopov, Hildrun Haibel, Torsten Diekhoff, Murat Torgutalp, Lisa C. Adams, Denis Poddubnyy, Keno K. Bressem

Abstract: Purpose: To examine whether incorporating anatomical awareness into a deep learning model can improve generalizability and enable prediction of disease progression. Methods: This retrospective multicenter study included conventional pelvic radiographs of 4 different patient cohorts focusing on axial spondyloarthritis (axSpA) collected at university and community hospitals. The first cohort, which consisted of 1483 radiographs, was split into training (n=1261) and validation (n=222) sets. The other cohorts comprising 436, 340, and 163 patients, respectively, were used as independent test datasets. For the second cohort, follow-up data of 311 patients was used to examine progression prediction capabilities. Two neural networks were trained, one on images cropped to the bounding box of the sacroiliac joints (anatomy-aware) and the other one on full radiographs. The performance of the models was compared using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity. Results: On the three test datasets, the standard model achieved AUC scores of 0.853, 0.817, 0.947, with an accuracy of 0.770, 0.724, 0.850. Whereas the anatomy-aware model achieved AUC scores of 0.899, 0.846, 0.957, with an accuracy of 0.821, 0.744, 0.906, respectively. The patients who were identified as high risk by the anatomy aware model had an odds ratio of 2.16 (95% CI: 1.19, 3.86) for having progression of radiographic sacroiliitis within 2 years.
Conclusion: Anatomical awareness can improve the generalizability of a deep learning model in detecting radiographic sacroiliitis. The model is published as fully open source alongside this study.

Submitted 12 May, 2024; originally announced May 2024.

arXiv:2402.13673 [pdf, other]

doi 10.3390/axioms13020083

Computing Transiting Exoplanet Parameters with 1D Convolutional Neural Networks

Authors: Santiago Iglesias Álvarez, Enrique Díez Alonso, María Luisa Sánchez Rodríguez, Javier Rodríguez Rodríguez, Saúl Pérez Fernández, Francisco Javier de Cos Juez

Abstract: The transit method allows the detection and characterization of planetary systems by analyzing stellar light curves. Convolutional neural networks appear to offer a viable solution for automating these analyses. In this research, two 1D convolutional neural network models, which work with simulated light curves in which transit-like signals were injected, are presented. One model operates on complete light curves and estimates the orbital period, and the other one operates on phase-folded light curves and estimates the semimajor axis of the orbit and the square of the planet-to-star radius ratio. Both models were tested on real data from TESS light curves with confirmed planets to ensure that they are able to work with real data. The results obtained show that 1D CNNs are able to characterize transiting exoplanets from their host star's detrended light curve and, furthermore, reducing both the required time and computational costs compared with the current detection and characterization algorithms.

Submitted 21 February, 2024; originally announced February 2024.

arXiv:2402.12394 [pdf, other]

Improving Model's Interpretability and Reliability using Biomarkers

Authors: Gautam Rajendrakumar Gare, Tom Fox, Beam Chansangavej, Amita Krishnan, Ricardo Luis Rodriguez, Bennett P deBoisblanc, Deva Kannan Ramanan, John Michael Galeotti

Abstract: Accurate and interpretable diagnostic models are crucial in the safety-critical field of medicine. We investigate the interpretability of our proposed biomarker-based lung ultrasound diagnostic pipeline to enhance clinicians' diagnostic capabilities. The objective of this study is to assess whether explanations from a decision tree classifier, utilizing biomarkers, can improve users' ability to identify inaccurate model predictions compared to conventional saliency maps. Our findings demonstrate that decision tree explanations, based on clinically established biomarkers, can assist clinicians in detecting false positives, thus improving the reliability of diagnostic models in medicine.

Submitted 30 January, 2025; v1 submitted 16 February, 2024; originally announced February 2024.

Comments: Accepted at BIAS 2023 Conference

arXiv:2401.12350 [pdf, other]

doi 10.1145/3615338.3618122

Scaling Up Quantization-Aware Neural Architecture Search for Efficient Deep Learning on the Edge

Authors: Yao Lu, Hiram Rayo Torres Rodriguez, Sebastian Vogel, Nick van de Waterlaat, Pavol Jancura

Abstract: Neural Architecture Search (NAS) has become the de-facto approach for designing accurate and efficient networks for edge devices. Since models are typically quantized for edge deployment, recent work has investigated quantization-aware NAS (QA-NAS) to search for highly accurate and efficient quantized models. However, existing QA-NAS approaches, particularly few-bit mixed-precision (FB-MP) methods, do not scale to larger tasks. Consequently, QA-NAS has mostly been limited to low-scale tasks and tiny networks. In this work, we present an approach to enable QA-NAS (INT8 and FB-MP) on large-scale tasks by leveraging the block-wise formulation introduced by block-wise NAS. We demonstrate strong results for the semantic segmentation task on the Cityscapes dataset, finding FB-MP models 33% smaller and INT8 models 17.6% faster than DeepLabV3 (INT8) without compromising task performance.

Submitted 22 January, 2024; originally announced January 2024.

Comments: Accepted at Workshop on Compilers, Deployment, and Tooling for Edge AI (CODAI '23), September 21, 2023, Hamburg, Germany

arXiv:2312.11283 [pdf, other]

The 2010 Census Confidentiality Protections Failed, Here's How and Why

Authors: John M. Abowd, Tamara Adams, Robert Ashmead, David Darais, Sourya Dey, Simson L. Garfinkel, Nathan Goldschlag, Daniel Kifer, Philip Leclerc, Ethan Lew, Scott Moore, Rolando A. Rodríguez, Ramy N. Tadros, Lars Vilhuber

Abstract: Using only 34 published tables, we reconstruct five variables (census block, sex, age, race, and ethnicity) in the confidential 2010 Census person records. Using the 38-bin age variable tabulated at the census block level, at most 20.1% of reconstructed records can differ from their confidential source on even a single value for these five variables. Using only published data, an attacker can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. The tabular publications in Summary File 1 thus have prohibited disclosure risk similar to the unreleased confidential microdata. Reidentification studies confirm that an attacker can, within blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with nonmodal characteristics) with 95% accuracy, the same precision as the confidential data achieve and far greater than statistical baselines. The flaw in the 2010 Census framework was the assumption that aggregation prevented accurate microdata reconstruction, justifying weaker disclosure limitation methods than were applied to 2010 Census public microdata. The framework used for 2020 Census publications defends against attacks that are based on reconstruction, as we also demonstrate here.
Finally, we show that alternatives to the 2020 Census Disclosure Avoidance System with similar accuracy (enhanced swapping) also fail to protect confidentiality, and those that partially defend against reconstruction attacks (incomplete suppression implementations) destroy the primary statutory use case: data for redistricting all legislatures in the country in compliance with the 1965 Voting Rights Act.

Submitted 18 December, 2023; originally announced December 2023.

arXiv:2312.10238 [pdf, other]

Hypothesis Testing for Class-Conditional Noise Using Local Maximum Likelihood

Authors: Weisong Yang, Rafael Poyiadzi, Niall Twomey, Raul Santos Rodriguez

Abstract: In supervised learning, automatically assessing the quality of the labels before any learning takes place remains an open research question. In certain particular cases, hypothesis testing procedures have been proposed to assess whether a given instance-label dataset is contaminated with class-conditional label noise, as opposed to uniform label noise. The existing theory builds on the asymptotic properties of the Maximum Likelihood Estimate for parametric logistic regression. However, the parametric assumptions on top of which these approaches are constructed are often too strong and unrealistic in practice. To alleviate this problem, in this paper we propose an alternative path by showing how similar procedures can be followed when the underlying model is a product of Local Maximum Likelihood Estimation that leads to more flexible nonparametric logistic regression models, which in turn are less susceptible to model misspecification. This different view allows for wider applicability of the tests by offering users access to a richer model class. Similarly to existing works, we assume we have access to anchor points which are provided by the users. We introduce the necessary ingredients for the adaptation of the hypothesis tests to the case of nonparametric logistic regression and empirically compare against the parametric approach presenting both synthetic and real-world case studies and discussing the advantages and limitations of the proposed approach.

Submitted 15 December, 2023; originally announced December 2023.

arXiv:2312.07161 [pdf, other]

doi 10.3390/axioms12040348

One-dimensional Convolutional Neural Networks for Detecting Transiting Exoplanets

Authors: Santiago Iglesias Álvarez, Enrique Díez Alonso, María Luisa Sánchez, Javier Rodríguez Rodríguez, Fernando Sánchez Lasheras, Francisco Javier de Cos Juez

Abstract: The transit method is one of the most relevant exoplanet detection techniques, which consists of detecting periodic eclipses in the light curves of stars. This is not always easy due to the presence of noise in the light curves, which is induced, for example, by the response of a telescope to stellar flux. For this reason, we aimed to develop an artificial neural network model that is able to detect these transits in light curves obtained from different telescopes and surveys. We created artificial light curves with and without transits to try to mimic those expected for the extended mission of the Kepler telescope (K2) in order to train and validate a 1D convolutional neural network model, which was later tested, obtaining an accuracy of 99.02 % and an estimated error (loss function) of 0.03. These results, among others, helped to confirm that the 1D CNN is a good choice for working with non-phased-folded Mandel and Agol light curves with transits. It also reduces the number of light curves that have to be visually inspected to decide if they present transit-like signals and decreases the time needed for analyzing each (with respect to traditional analysis).

Submitted 12 December, 2023; originally announced December 2023.

arXiv:2312.04308 [pdf, other]

Multi Actor-Critic DDPG for Robot Action Space Decomposition: A Framework to Control Large 3D Deformation of Soft Linear Objects

Authors: Mélodie Daniel, Aly Magassouba, Miguel Aranda, Laurent Lequièvre, Juan Antonio Corrales Ramon, Roberto Iglesias Rodriguez, Youcef Mezouar

Abstract: Robotic manipulation of deformable linear objects (DLOs) has great potential for applications in diverse fields such as agriculture or industry. However, a major challenge lies in acquiring accurate deformation models that describe the relationship between robot motion and DLO deformations. Such models are difficult to calculate analytically and vary among DLOs. Consequently, manipulating DLOs poses significant challenges, particularly in achieving large deformations that require highly accurate global models. To address these challenges, this paper presents MultiAC6: a new multi Actor-Critic framework for robot action space decomposition to control large 3D deformations of DLOs. In our approach, two deep reinforcement learning (DRL) agents orient and position a robot gripper to deform a DLO into the desired shape. Unlike previous DRL-based studies, MultiAC6 is able to solve the sim-to-real gap, achieving large 3D deformations up to 40 cm in real-world settings. Experimental results also show that MultiAC6 has a 66% higher success rate than a single-agent approach. Further experimental studies demonstrate that MultiAC6 generalizes well, without retraining, to DLOs with different lengths or materials.

Submitted 8 December, 2023; v1 submitted 7 December, 2023; originally announced December 2023.

Comments: 9 pages, 7 figures, 5 tables, Accepted for IEEE Robotics and Automation Letters (RA-L)

arXiv:2310.09398 [pdf, other]

doi 10.1073/pnas.2220558120

An In-Depth Examination of Requirements for Disclosure Risk Assessment

Authors: Ron S. Jarmin, John M. Abowd, Robert Ashmead, Ryan Cumings-Menon, Nathan Goldschlag, Michael B. Hawes, Sallie Ann Keller, Daniel Kifer, Philip Leclerc, Jerome P. Reiter, Rolando A. Rodríguez, Ian Schmutte, Victoria A. Velkoff, Pavel Zhuravlev

Abstract: The use of formal privacy to protect the confidentiality of responses in the 2020 Decennial Census of Population and Housing has triggered renewed interest and debate over how to measure the disclosure risks and societal benefits of the published data products. Following long-established precedent in economics and statistics, we argue that any proposal for quantifying disclosure risk should be based on pre-specified, objective criteria. Such criteria should be used to compare methodologies to identify those with the most desirable properties. We illustrate this approach, using simple desiderata, to evaluate the absolute disclosure risk framework, the counterfactual framework underlying differential privacy, and prior-to-posterior comparisons. We conclude that satisfying all the desiderata is impossible, but counterfactual comparisons satisfy the most while absolute disclosure risk satisfies the fewest. Furthermore, we explain that many of the criticisms levied against differential privacy would be levied against any technology that is not equivalent to direct, unrestricted access to confidential data. Thus, more research is needed, but in the near-term, the counterfactual approach appears best-suited for privacy-utility analysis.

Submitted 13 October, 2023; originally announced October 2023.

Comments: 47 pages, 1 table

Journal ref: PNAS, October 13, 2023, Vol. 120, No. 43

arXiv:2309.12744 [pdf, other]

Open Source Robot Localization for Non-Planar Environments

Authors: Francisco Martín Rico, José Miguel Guerrero Hernández, Rodrigo Pérez Rodríguez, Juan Diego Peña Narváez, Alberto García Gómez-Jacinto

Abstract: The operational environments in which a mobile robot executes its missions often exhibit non-flat terrain characteristics, encompassing outdoor and indoor settings featuring ramps and slopes. In such scenarios, the conventional methodologies employed for localization encounter novel challenges and limitations. This study delineates a localization framework incorporating ground elevation and incline considerations, deviating from traditional 2D localization paradigms that may falter in such contexts. In our proposed approach, the map encompasses elevation and spatial occupancy information, employing Gridmaps and Octomaps. At the same time, the perception model is designed to accommodate the robot's inclined orientation and the potential presence of ground as an obstacle, besides usual structural and dynamic obstacles. We provide an implementation of our approach fully working with Nav2, ready to replace the baseline AMCL approach when the robot is in non-planar environments. Our methodology was rigorously tested in both simulated environments and through practical application on actual robots, including the Tiago and Summit XL models, across various settings ranging from indoor and outdoor to flat and uneven terrains. Demonstrating exceptional precision, our approach yielded error margins below 10 centimeters and 0.05 radians in indoor settings and less than 1.0 meters in extensive outdoor routes.
While our results exhibit a slight improvement over AMCL in indoor environments, the enhancement in performance is significantly more pronounced when compared to 3D SLAM algorithms. This underscores the considerable robustness and efficiency of our approach, positioning it as an effective strategy for mobile robots tasked with navigating expansive and intricate indoor/outdoor environments.

Submitted 30 March, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

arXiv:2302.02291 [pdf, other]

doi 10.1145/3582768.3582789

A Semantic Approach to Negation Detection and Word Disambiguation with Natural Language Processing

Authors: Izunna Okpala, Guillermo Romera Rodriguez, Andrea Tapia, Shane Halse, Jess Kropczynski

Abstract: This study aims to demonstrate the methods for detecting negations in a sentence by uniquely evaluating the lexical structure of the text via word-sense disambiguation. The proposed framework examines all the unique features in the various expressions within a text to resolve the contextual usage of all tokens and decipher the effect of negation on sentiment analysis. The application of popular expression detectors skips this important step, thereby neglecting the root words caught in the web of negation and making text classification difficult for machine learning and sentiment analysis. This study adopts the Natural Language Processing (NLP) approach to discover and antonimize words that were negated for better accuracy in text classification using a knowledge base provided by an NLP library called WordHoard. Early results show that our initial analysis improved on traditional sentiment analysis, which sometimes neglects negations or assigns an inverse polarity score. The SentiWordNet analyzer was improved by 35%, the Vader analyzer by 20% and the TextBlob by 6%.

Submitted 22 February, 2023; v1 submitted 4 February, 2023; originally announced February 2023.

ACM Class: I.2.7; I.5.1; I.7.1; I.7.2

Journal ref: 6th International Conference on Natural Language Processing and Information Retrieval (NLPIR'2022)

arXiv:2212.07527 [pdf]

Plastic Contaminant Detection in Aerial Imagery of Cotton Fields with Deep Learning

Authors: Pappu Kumar Yadav, J. Alex Thomasson, Robert G. Hardin, Stephen W. Searcy, Ulisses Braga-Neto, Sorin C. Popescu, Roberto Rodriguez, Daniel E Martin, Juan Enciso, Karem Meza, Emma L. White

Abstract: Plastic shopping bags that get carried away from the side of roads and tangled on cotton plants can end up at cotton gins if not removed before the harvest. Such bags may not only cause problem in the ginning process but might also get embodied in cotton fibers reducing its quality and marketable value. Therefore, it is required to detect, locate, and remove the bags before cotton is harvested. Manually detecting and locating these bags in cotton fields is labor intensive, time-consuming and a costly process. To solve these challenges, we present application of four variants of YOLOv5 (YOLOv5s, YOLOv5m, YOLOv5l and YOLOv5x) for detecting plastic shopping bags using Unmanned Aircraft Systems (UAS)-acquired RGB (Red, Green, and Blue) images. We also show fixed effect model tests of color of plastic bags as well as YOLOv5-variant on average precision (AP), mean average precision (mAP@50) and accuracy. In addition, we also demonstrate the effect of height of plastic bags on the detection accuracy. It was found that color of bags had significant effect (p < 0.001) on accuracy across all the four variants while it did not show any significant effect on the AP with YOLOv5m (p = 0.10) and YOLOv5x (p = 0.35) at 95% confidence level. Similarly, YOLOv5-variant did not show any significant effect on the AP (p = 0.11) and accuracy (p = 0.73) of white bags, but it had significant effects on the AP (p = 0.03) and accuracy (p = 0.02) of brown bags including on the mAP@50 (p = 0.01) and inference speed (p < 0.0001).
Additionally, height of plastic bags had significant effect (p < 0.0001) on overall detection accuracy. The findings reported in this paper can be useful in speeding up removal of plastic bags from cotton fields before harvest and thereby reducing the amount of contaminants that end up at cotton gins.

Submitted 14 December, 2022; originally announced December 2022.

Comments: preprint

arXiv:2211.03471 [pdf, other]

Sittin'On the Dock of the (WiFi) Bay: On the Frame Aggregation under IEEE 802.11 DCF

Authors: Ricardo J. Rodríguez, José Luis Salazar, Julián Fernández-Navajas

Abstract: It is well known that frame aggregation in Internet communications improves transmission efficiency. However, it also causes a delay that for some real-time communications is inappropriate, thus creating a trade-off between efficiency and delay. In this paper, we establish the conditions for frame aggregation under the IEEE 802.11 DCF protocol to be beneficial on average delay. To do so, we first describe the transmission time in IEEE 802.11 in a stochastic framework and then we calculate the optimal value of the frames that, when aggregated, saves transmission time in the long term. Our findings, discussed with numerical experimentation, show that frame aggregation reduces transmission congestion and transmission delays.

Submitted 7 November, 2022; originally announced November 2022.

arXiv:2208.00519 [pdf]

Assessing The Performance of YOLOv5 Algorithm for Detecting Volunteer Cotton Plants in Corn Fields at Three Different Growth Stages

Authors: Pappu Kumar Yadav, J. Alex Thomasson, Stephen W. Searcy, Robert G. Hardin, Ulisses Braga-Neto, Sorin C. Popescu, Daniel E. Martin, Roberto Rodriguez, Karem Meza, Juan Enciso, Jorge Solorzano Diaz, Tianyi Wang

Abstract: The boll weevil (Anthonomus grandis L.) is a serious pest that primarily feeds on cotton plants. In places like Lower Rio Grande Valley of Texas, due to sub-tropical climatic conditions, cotton plants can grow year-round and therefore the left-over seeds from the previous season during harvest can continue to grow in the middle of rotation crops like corn (Zea mays L.) and sorghum (Sorghum bicolor L.). These feral or volunteer cotton (VC) plants when reach the pinhead squaring phase (5-6 leaf stage) can act as hosts for the boll weevil pest. The Texas Boll Weevil Eradication Program (TBWEP) employs people to locate and eliminate VC plants growing by the side of roads or fields with rotation crops but the ones growing in the middle of fields remain undetected. In this paper, we demonstrate the application of computer vision (CV) algorithm based on You Only Look Once version 5 (YOLOv5) for detecting VC plants growing in the middle of corn fields at three different growth stages (V3, V6, and VT) using unmanned aircraft systems (UAS) remote sensing imagery. All the four variants of YOLOv5 (s, m, l, and x) were used and their performances were compared based on classification accuracy, mean average precision (mAP), and F1-score. It was found that YOLOv5s could detect VC plants with a maximum classification accuracy of 98% and mAP of 96.3 % at the V6 stage of corn while YOLOv5s and YOLOv5m resulted in the lowest classification accuracy of 85% and YOLOv5m and YOLOv5l had the least mAP of 86.5% at the VT stage on images of size 416 x 416 pixels.
The developed CV algorithm has the potential to effectively detect and locate VC plants growing in the middle of corn fields as well as expedite the management aspects of TBWEP.

Submitted 31 July, 2022; originally announced August 2022.

Comments: Preprint Under Review

arXiv:2207.07334 [pdf]

Computer Vision for Volunteer Cotton Detection in a Corn Field with UAS Remote Sensing Imagery and Spot Spray Applications

Authors: Pappu Kumar Yadav, J. Alex Thomasson, Stephen W. Searcy, Robert G. Hardin, Ulisses Braga-Neto, Sorin C. Popescu, Daniel E. Martin, Roberto Rodriguez, Karem Meza, Juan Enciso, Jorge Solorzano Diaz, Tianyi Wang

Abstract: To control boll weevil (Anthonomus grandis L.) pest re-infestation in cotton fields, the current practices of volunteer cotton (VC) (Gossypium hirsutum L.) plant detection in fields of rotation crops like corn (Zea mays L.) and sorghum (Sorghum bicolor L.) involve manual field scouting at the edges of fields. This leads to many VC plants growing in the middle of fields remain undetected that continue to grow side by side along with corn and sorghum. When they reach pinhead squaring stage (5-6 leaves), they can serve as hosts for the boll weevil pests. Therefore, it is required to detect, locate and then precisely spot-spray them with chemicals. In this paper, we present the application of YOLOv5m on radiometrically and gamma-corrected low resolution (1.2 Megapixel) multispectral imagery for detecting and locating VC plants growing in the middle of tasseling (VT) growth stage of cornfield. Our results show that VC plants can be detected with a mean average precision (mAP) of 79% and classification accuracy of 78% on images of size 1207 x 923 pixels at an average inference speed of nearly 47 frames per second (FPS) on NVIDIA Tesla P100 GPU-16GB and 0.4 FPS on NVIDIA Jetson TX2 GPU. We also demonstrate the application of a customized unmanned aircraft systems (UAS) for spot-spray applications based on the developed computer vision (CV) algorithm and how it can be used for near real-time detection and mitigation of VC plants growing in corn fields for efficient management of the boll weevil pests.

Submitted 15 July, 2022; originally announced July 2022.

Comments: 39 pages

arXiv:2207.06673 [pdf]

Detecting Volunteer Cotton Plants in a Corn Field with Deep Learning on UAV Remote-Sensing Imagery

Authors: Pappu Kumar Yadav, J. Alex Thomasson, Robert Hardin, Stephen W. Searcy, Ulisses Braga-Neto, Sorin C. Popescu, Daniel E. Martin, Roberto Rodriguez, Karem Meza, Juan Enciso, Jorge Solorzano Diaz, Tianyi Wang

Abstract: The cotton boll weevil, Anthonomus grandis Boheman, is a serious pest to the U.S. cotton industry that has cost more than 16 billion USD in damages since it entered the United States from Mexico in the late 1800s. This pest has been nearly eradicated; however, the southern part of Texas still faces this issue and is prone to pest reinfestation each year due to its sub-tropical climate, where cotton plants can grow year-round. Volunteer cotton (VC) plants growing in the fields of inter-seasonal crops, like corn, can serve as hosts to these pests once they reach the pin-head square stage (5-6 leaf stage) and therefore need to be detected, located, and destroyed or sprayed. In this paper, we present a study to detect VC plants in a corn field using YOLOv3 on three-band aerial images collected by an unmanned aircraft system (UAS). The two-fold objectives of this paper were: (i) to determine whether YOLOv3 can be used for VC detection in a corn field using RGB (red, green, and blue) aerial images collected by UAS and (ii) to investigate the behavior of YOLOv3 on images at three different scales (320 x 320, S1; 416 x 416, S2; and 512 x 512, S3 pixels) based on average precision (AP), mean average precision (mAP) and F1-score at 95% confidence level. No significant differences existed for mAP among the three scales, while a significant difference was found for AP between S1 and S3 (p = 0.04) and S2 and S3 (p = 0.02). A significant difference was also found for F1-score between S2 and S3 (p = 0.02). The lack of significant differences of mAP at all the three scales indicated that the trained YOLOv3 model can be used on a computer vision-based remotely piloted aerial application system (RPAAS) for VC detection and spray application in near real-time.

Submitted 14 July, 2022; originally announced July 2022.

Comments: 38 Pages

arXiv:2206.08398 [pdf, other]

Learning Generic Lung Ultrasound Biomarkers for Decoupling Feature Extraction from Downstream Tasks

Authors: Gautam Rajendrakumar Gare, Tom Fox, Pete Lowery, Kevin Zamora, Hai V. Tran, Laura Hutchins, David Montgomery, Amita Krishnan, Deva Kannan Ramanan, Ricardo Luis Rodriguez, Bennett P deBoisblanc, John Michael Galeotti

Abstract: Contemporary artificial neural networks (ANN) are trained end-to-end, jointly learning both features and classifiers for the task of interest. Though enormously effective, this paradigm imposes significant costs in assembling annotated task-specific datasets and training large-scale networks. We propose to decouple feature learning from downstream lung ultrasound tasks by introducing an auxiliary pre-task of visual biomarker classification. We demonstrate that one can learn an informative, concise, and interpretable feature space from ultrasound videos by training models for predicting biomarker labels. Notably, biomarker feature extractors can be trained from data annotated with weak video-scale supervision. These features can be used by a variety of downstream Expert models targeted for diverse clinical tasks (Diagnosis, lung severity, S/F ratio). Crucially, task-specific expert models are comparable in accuracy to end-to-end models directly trained for such target tasks, while being significantly lower cost to train.

Submitted 16 June, 2022; originally announced June 2022.

arXiv:2203.08302 [pdf, other]

Internet-based Social Engineering Attacks, Defenses and Psychology: A Survey

Authors: Theodore Longtchi, Rosana Montañez Rodriguez, Laith Al-Shawaf, Adham Atyabi, Shouhuai Xu

Abstract: Social engineering attacks are a major cyber threat because they often serve as a first step for an attacker to break into an otherwise well-defended network, steal victims' credentials, and cause financial losses. The problem has received a due amount of attention, with many publications proposing defenses against them. Despite this, the situation has not improved. In this paper, we aim to understand and explain this phenomenon by looking into the root cause of the problem. To this end, we examine the literature on attacks and defenses through a unique lens we propose: psychological factors (PFs) and techniques (PTs). We find that there is a big discrepancy between attacks and defenses: attacks have deliberately exploited PFs by leveraging PTs, but defenses rarely take either of these into consideration, preferring technical solutions. This explains why existing defenses have achieved limited success. This prompts us to propose a roadmap for a more systematic approach towards designing effective defenses against social engineering attacks.

Submitted 1 August, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

Comments: * Shared first-author

arXiv:2203.04813 [pdf, other]

Social Engineering Attacks and Defenses in the Physical World vs. Cyberspace: A Contrast Study

Authors: Rosana Montañez Rodriguez, Adham Atyabi, Shouhuai

Abstract: Social engineering attacks are phenomena that are equally applicable to both the physical world and cyberspace. These attacks in the physical world have been studied for a much longer time than their counterpart in cyberspace. This motivates us to investigate how social engineering attacks in the physical world and cyberspace relate to each other, including their common characteristics and unique features. For this purpose, we propose a methodology to unify social engineering attacks and defenses in the physical world and cyberspace into a single framework, including: (i) a systematic model based on psychological principles for describing these attacks; (ii) a systematization of these attacks; and (iii) a systematization of defenses against them. Our study leads to several insights, which shed light on future research directions towards adequately defending against social engineering attacks in cyberspace.

Submitted 9 March, 2022; originally announced March 2022.

arXiv:2202.11333 [pdf, other]

Scalable Query Answering under Uncertainty to Neuroscientific Ontological Knowledge: The NeuroLang Approach

Authors: Gaston Zanitti, Yamil Soto, Valentin Iovene, Maria Vanina Martinez, Ricardo Rodriguez, Gerardo Simari, Demian Wassermann

Abstract: Researchers in neuroscience have a growing number of datasets available to study the brain, which is made possible by recent technological advances. Given the extent to which the brain has been studied, there is also available ontological knowledge encoding the current state of the art regarding its different areas, activation patterns, key words associated with studies, etc. Furthermore, there is an inherent uncertainty associated with brain scans arising from the mapping between voxels -- 3D pixels -- and actual points in different individual brains. Unfortunately, there is currently no unifying framework for accessing such collections of rich heterogeneous data under uncertainty, making it necessary for researchers to rely on ad hoc tools. In particular, one major weakness of current tools that attempt to address this kind of task is that only very limited propositional query languages have been developed. In this paper, we present NeuroLang, an ontology language with existential rules, probabilistic uncertainty, and built-in mechanisms to guarantee tractable query answering over very large datasets. After presenting the language and its general query answering architecture, we discuss real-world use cases showing how NeuroLang can be applied to practical scenarios for which current tools are inadequate.

Submitted 23 February, 2022; originally announced February 2022.

arXiv:2202.05134 [pdf]

Understanding Twitters behavior during the pandemic: Fake News and Fear

Authors: Guillermo Romera Rodriguez, Sanjana Gautam, Andrea Tapia

Abstract: The outbreak of the SARS-CoV-2 novel coronavirus (COVID-19) has been accompanied by a large amount of misleading and false information about the virus, especially on social media. During the pandemic, social media gained special interest as it went on to become an important medium of communication. This made the information being relayed on these platforms especially critical. In our work, we aim to explore the percentage of fake news being spread on Twitter as well as measure the sentiment of the public at the same time. We further study how the sentiment of fear is present among the public. In addition, we compare the rate of spread of the virus per day with the rate of spread of fake news on Twitter. Our study is useful in establishing the role of Twitter, and social media, during a crisis, and more specifically during crisis management.

Submitted 10 February, 2022; originally announced February 2022.

Comments: 21 pages, 5 figures

arXiv:2201.10166 [pdf, other]

doi 10.1109/ISBI48211.2021.9433826

Dense Pixel-Labeling for Reverse-Transfer and Diagnostic Learning on Lung Ultrasound for COVID-19 and Pneumonia Detection

Authors: Gautam Rajendrakumar Gare, Andrew Schoenling, Vipin Philip, Hai V Tran, Bennett P deBoisblanc, Ricardo Luis Rodriguez, John Michael Galeotti

Abstract: We propose using a pre-trained segmentation model to perform diagnostic classification in order to achieve better generalization and interpretability, terming the technique reverse-transfer learning. We present an architecture to convert segmentation models to classification models. We compare and contrast dense vs sparse segmentation labeling and study its impact on diagnostic classification. We compare the performance of U-Net trained with dense and sparse labels to segment A-lines, B-lines, and Pleural lines on a custom dataset of lung ultrasound scans from 4 patients. Our experiments show that dense labels help reduce false positive detection. We study the classification capability of the dense and sparse trained U-Net and contrast it with a non-pretrained U-Net, to detect and differentiate COVID-19 and Pneumonia on a large ultrasound dataset of about 40k curvilinear and linear probe images. Our segmentation-based models perform better classification when using pretrained segmentation weights, with the dense-label pretrained U-Net performing the best.

Submitted 25 January, 2022; originally announced January 2022.

Journal ref: 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI), 2021, pp. 1406-1410

arXiv:2201.07368 [pdf, other]

doi 10.1007/978-3-030-90874-4_14

The Role of Pleura and Adipose in Lung Ultrasound AI

Authors: Gautam Rajendrakumar Gare, Wanwen Chen, Alex Ling Yu Hung, Edward Chen, Hai V. Tran, Tom Fox, Pete Lowery, Kevin Zamora, Bennett P deBoisblanc, Ricardo Luis Rodriguez, John Michael Galeotti

Abstract: In this paper, we study the significance of the pleura and adipose tissue in lung ultrasound AI analysis. We highlight their more prominent appearance when using high-frequency linear (HFL) instead of curvilinear ultrasound probes, showing HFL reveals better pleura detail. We compare the diagnostic utility of the pleura and adipose tissue using an HFL ultrasound probe. Masking the adipose tissue during training and inference (while retaining the pleural line and Merlin's space artifacts such as A-lines and B-lines) improved the AI model's diagnostic accuracy.

Submitted 18 January, 2022; originally announced January 2022.

Comments: Published in MICCAI 2021 workshop on Lessons Learned from the development and application of medical imaging-based AI technologies for combating COVID-19 (LL-COVID19). The first two authors contributed equally to this work

Journal ref: LL-COVID19 2021. Lecture Notes in Computer Science, vol 12969. Springer, Cham

arXiv:2201.07357 [pdf, other]

Weakly Supervised Contrastive Learning for Better Severity Scoring of Lung Ultrasound

Authors: Gautam Rajendrakumar Gare, Hai V. Tran, Bennett P deBoisblanc, Ricardo Luis Rodriguez, John Michael Galeotti

Abstract: With the onset of the COVID-19 pandemic, ultrasound has emerged as an effective tool for bedside monitoring of patients. Due to this, a large number of lung ultrasound scans have been made available, which can be used for AI-based diagnosis and analysis. Several AI-based patient severity scoring models have been proposed that rely on scoring the appearance of the ultrasound scans. AI models are trained using ultrasound-appearance severity scores that are manually labeled based on standardized visual features. We address the challenge of labeling every ultrasound frame in the video clips. Our contrastive learning method treats the video clip severity labels as noisy weak severity labels for individual frames, thus requiring only video-level labels. We show that it performs better than conventional cross-entropy loss based training. We combine frame severity predictions to come up with video severity predictions and show that the frame-based model achieves comparable performance to a video-based TSM model on a large dataset combining public and private sources.

Submitted 18 January, 2022; originally announced January 2022.

Comments: Under Review for MIDL 2022 conference

arXiv:2110.07305 [pdf]

DI-AA: An Interpretable White-box Attack for Fooling Deep Neural Networks

Authors: Yixiang Wang, Jiqiang Liu, Xiaolin Chang, Jianhua Wang, Ricardo J. Rodríguez

Abstract: White-box Adversarial Example (AE) attacks towards Deep Neural Networks (DNNs) have a more powerful destructive capacity than black-box AE attacks in the fields of AE strategies. However, almost all the white-box approaches lack interpretation from the point of view of DNNs. That is, adversaries did not investigate the attacks from the perspective of interpretable features, and few of these approaches considered what features the DNN actually learns. In this paper, we propose an interpretable white-box AE attack approach, DI-AA, which explores the application of the interpretable approach of the deep Taylor decomposition in the selection of the most contributing features and adopts the Lagrangian relaxation optimization of the logit output and L_p norm to further decrease the perturbation. We compare DI-AA with six baseline attacks (including the state-of-the-art attack AutoAttack) on three datasets. Experimental results reveal that our proposed approach can 1) attack non-robust models with comparatively low perturbation, where the perturbation is closer to or lower than the AutoAttack approach; 2) break the TRADES adversarial training models with the highest success rate; 3) the generated AE can reduce the robust accuracy of the robust black-box models by 16% to 31% in the black-box transfer attack.

Submitted 14 October, 2021; originally announced October 2021.

Comments: 9 pages, 5 figures, 7 tables

arXiv:2106.06811 [pdf, other]

Case Study on Detecting COVID-19 Health-Related Misinformation in Social Media

Authors: Mir Mehedi A. Pritom, Rosana Montanez Rodriguez, Asad Ali Khan, Sebastian A. Nugroho, Esra'a Alrashydah, Beatrice N. Ruiz, Anthony Rios

Abstract: The COVID-19 pandemic has generated what public health officials called an infodemic of misinformation. As social distancing and stay-at-home orders came into effect, many turned to social media for socializing. This increase in social media usage has made it a prime vehicle for the spreading of misinformation. This paper presents a mechanism to detect COVID-19 health-related misinformation in social media following an interdisciplinary approach. Leveraging social psychology as a foundation and existing misinformation frameworks, we defined misinformation themes and associated keywords incorporated into the misinformation detection mechanism using applied machine learning techniques. Next, using the Twitter dataset, we explored the performance of the proposed methodology using multiple state-of-the-art machine learning classifiers. Our method shows promising results with at most 78% accuracy in classifying health-related misinformation versus true information using uni-gram-based NLP feature generations from tweets and the Decision Tree classifier. We also provide suggestions on alternatives for countering misinformation and ethical consideration for the study.

Submitted 12 June, 2021; originally announced June 2021.

Comments: 10 pages

arXiv:2105.06570 [pdf, ps, other]

Simplified Kripke semantics for K45-like Godel modal logics and its axiomatic extensions

Authors: Ricardo Rodriguez, Olim Tuyt, Lluis Godo, Francesc Esteva

Abstract: In this paper, we provide simplified semantics for the logic K45(G), i.e. the many-valued Godel counterpart of the classical modal logic K45. More precisely, we characterize K45(G) as the set of valid formulae of the class of possibilistic Godel Kripke Frames <W,π> where W is a non-empty set of worlds and π: W → [0, 1] is a possibility distribution on W.

Submitted 13 May, 2021; originally announced May 2021.

Comments: arXiv admin note: text overlap with arXiv:1611.04444

arXiv:2104.09785 [pdf, other]

Model-predictive control and reinforcement learning in multi-energy system case studies

Authors: Glenn Ceusters, Román Cantú Rodríguez, Alberte Bouso García, Rüdiger Franke, Geert Deconinck, Lieve Helsen, Ann Nowé, Maarten Messagie, Luis Ramirez Camargo

Abstract: Model-predictive-control (MPC) offers an optimal control technique to establish and ensure that the total operation cost of multi-energy systems remains at a minimum while fulfilling all system constraints. However, this method presumes an adequate model of the underlying system dynamics, which is prone to modelling errors and is not necessarily adaptive. This has an associated initial and ongoing project-specific engineering cost. In this paper, we present an on- and off-policy multi-objective reinforcement learning (RL) approach, that does not assume a model a priori, benchmarking this against a linear MPC (LMPC - to reflect current practice, though non-linear MPC performs better) - both derived from the general optimal control problem, highlighting their differences and similarities. In a simple multi-energy system (MES) configuration case study, we show that a twin delayed deep deterministic policy gradient (TD3) RL agent offers potential to match and outperform the perfect foresight LMPC benchmark (101.5%). This while the realistic LMPC, i.e. imperfect predictions, only achieves 98%. While in a more complex MES system configuration, the RL agent's performance is generally lower (94.6%), yet still better than the realistic LMPC (88.9%). In both case studies, the RL agents outperformed the realistic LMPC after a training period of 2 years using quarterly interactions with the environment. We conclude that reinforcement learning is a viable optimal control technique for multi-energy systems given adequate constraint handling and pre-training, to avoid unsafe interactions and long training periods, as is proposed in fundamental future work.

Submitted 9 September, 2021; v1 submitted 20 April, 2021; originally announced April 2021.

Comments: 43 pages, 29 figures

arXiv:2102.09529 [pdf, other]

Fuzzy clustering algorithms with distance metric learning and entropy regularization

Authors: Sara Ines Rizo Rodriguez, Francisco de Assis Tenorio de Carvalho

Abstract: Clustering methods have been used in a variety of fields such as image processing, data mining, pattern recognition, and statistical analysis. Generally, clustering algorithms consider all variables equally relevant or not correlated for the clustering task. Nevertheless, in real situations, some variables can be correlated or may be more or less relevant or even irrelevant for this task. This paper proposes partitioning fuzzy clustering algorithms based on Euclidean, City-block and Mahalanobis distances and entropy regularization. These methods are iterative three-step algorithms which provide a fuzzy partition, a representative for each fuzzy cluster, and the relevance weight of the variables or their correlation by minimizing a suitable objective function. Several experiments on synthetic and real datasets, including an application to noisy image texture segmentation, demonstrate the usefulness of these adaptive clustering methods.

Submitted 18 February, 2021; originally announced February 2021.

arXiv:2009.09088 [pdf, other]

An AI based talent acquisition and benchmarking for job

Authors: Rudresh Mishra, Ricardo Rodriguez, Valentin Portillo

Abstract: In the recruitment industry, selecting the best CV for a particular job post from a pile of thousands of CVs is quite challenging. Finding a perfect candidate who can fit within an organization's culture is a difficult task. To help recruiters fill these gaps, we leverage the help of AI. We propose a methodology to solve these problems by matching the skill graphs generated from the CV and the job post. In this report, our approach is to perform the business understanding in order to justify why such problems arise and how we intend to solve them using natural language processing and machine learning techniques. We limit our project to solving the problem in the domain of the computer science industry.

Submitted 12 August, 2020; originally announced September 2020.

Comments: 26 pages, 23 figures. This paper is yet to be published in conferences

arXiv:2008.12413 [pdf, other]

W-Net: Dense Semantic Segmentation of Subcutaneous Tissue in Ultrasound Images by Expanding U-Net to Incorporate Ultrasound RF Waveform Data

Authors: Gautam Rajendrakumar Gare, Jiayuan Li, Rohan Joshi, Mrunal Prashant Vaze, Rishikesh Magar, Michael Yousefpour, Ricardo Luis Rodriguez, John Michael Galeotti

Abstract: We present W-Net, a novel Convolution Neural Network (CNN) framework that employs raw ultrasound waveforms from each A-scan, typically referred to as ultrasound Radio Frequency (RF) data, in addition to the gray ultrasound image to semantically segment and label tissues. Unlike prior work, we seek to label every pixel in the image, without the use of a background class. To the best of our knowledge, this is also the first deep-learning or CNN approach for segmentation that analyses ultrasound raw RF data along with the gray image. International patent(s) pending [PCT/US20/37519]. We chose subcutaneous tissue (SubQ) segmentation as our initial clinical goal since it has diverse intermixed tissues, is challenging to segment, and is an underrepresented research area. SubQ potential applications include plastic surgery, adipose stem-cell harvesting, lymphatic monitoring, and possibly detection/treatment of certain types of tumors. A custom dataset consisting of images hand-labeled by an expert clinician and trainees is used for the experimentation, currently labeled into the following categories: skin, fat, fat fascia/stroma, muscle and muscle fascia. We compared our results with U-Net and Attention U-Net. Our novel W-Net's RF-Waveform input and architecture increased mIoU accuracy (averaged across all tissue classes) by 4.5% and 4.9% compared to regular U-Net and Attention U-Net, respectively. We present analysis as to why the Muscle fascia and Fat fascia/stroma are the most difficult tissues to label. Muscle fascia in particular, the most difficult anatomic class to recognize for both humans and AI algorithms, saw mIoU improvements of 13% and 16% from our W-Net vs U-Net and Attention U-Net respectively.

Submitted 2 September, 2020; v1 submitted 27 August, 2020; originally announced August 2020.

Comments: The paper is currently under review for publication in a peer-reviewed journal

arXiv:2007.04932 [pdf, other]

Human Cognition through the Lens of Social Engineering Cyberattacks

Authors: Rosana Montanez Rodriguez, Edward Golob, Shouhuai Xu

Abstract: Social engineering cyberattacks are a major threat because they often prelude sophisticated and devastating cyberattacks. Social engineering cyberattacks are a kind of psychological attack that exploits weaknesses in human cognitive functions. Adequate defense against social engineering cyberattacks requires a deeper understanding of what aspects of human cognition are exploited by these cyberattacks, why humans are susceptible to these cyberattacks, and how we can minimize or at least mitigate their damage. These questions have received some amount of attention but the state-of-the-art understanding is superficial and scattered in the literature. In this paper, we review human cognition through the lens of social engineering cyberattacks. Then, we propose an extended framework of human cognitive functions to accommodate social engineering cyberattacks. We cast existing studies on various aspects of social engineering cyberattacks into the extended framework, while drawing a number of insights that represent the current understanding and shed light on future research directions. The extended framework might inspire future research endeavors towards a new sub-field that can be called Cybersecurity Cognitive Psychology, which tailors or adapts principles of Cognitive Psychology to the cybersecurity domain while embracing new notions and concepts that are unique to the cybersecurity domain.

Submitted 9 July, 2020; v1 submitted 9 July, 2020; originally announced July 2020.

arXiv:1803.07693 [pdf, other]

Balanced Black and White Coloring Problem on knights chessboards

Authors: Luis Eduardo Urbán Rivero, Rafael López Bracho, Javier Ramírez Rodríguez

Abstract: The graph anticoloring problem is a partial coloring problem whose main feature is the opposite of the graph coloring rule, i.e., if two vertices are adjacent, their assigned colors must be the same or at least one of them is uncolored. In the same way, Berge in 1972 proposed the problem of placing b black queens and w white queens on an $n \times n$ chessboard such that no two queens of different colors can attack each other; the complexity of this problem remains open. In this work we deal with the knight piece under the balance property, since this special case is the most difficult for brute force algorithms.

Submitted 27 March, 2018; v1 submitted 20 March, 2018; originally announced March 2018.

Comments: 10 pages

arXiv:1711.10968 [pdf, other]

doi 10.5244/C.31.77

Colour Constancy: Biologically-inspired Contrast Variant Pooling Mechanism

Authors: Arash Akbarinia, Raquel Gil Rodríguez, C. Alejandro Parraga

Abstract: Pooling is a ubiquitous operation in image processing algorithms that allows for higher-level processes to collect relevant low-level features from a region of interest. Currently, max-pooling is one of the most commonly used operators in the computational literature. However, it can lack robustness to outliers due to the fact that it relies merely on the peak of a function. Pooling mechanisms are also present in the primate visual cortex where neurons of higher cortical areas pool signals from lower ones. The receptive fields of these neurons have been shown to vary according to the contrast by aggregating signals over a larger region in the presence of low contrast stimuli. We hypothesise that this contrast-variant-pooling mechanism can address some of the shortcomings of max-pooling. We modelled this contrast variation through a histogram clipping in which the percentage of pooled signal is inversely proportional to the local contrast of an image. We tested our hypothesis by applying it to the phenomenon of colour constancy where a number of popular algorithms utilise a max-pooling step (e.g. White-Patch, Grey-Edge and Double-Opponency). For each of these methods, we investigated the consequences of replacing their original max-pooling by the proposed contrast-variant-pooling. Our experiments on three colour constancy benchmark datasets suggest that previous results can significantly improve by adopting a contrast-variant-pooling mechanism.

Submitted 29 November, 2017; originally announced November 2017.

Journal ref: Proceedings of the British Machine Vision Conference (BMVC) 2017

arXiv:1704.03330 [pdf, other]

Food-bridging: a new network construction to unveil the principles of cooking

Authors: Tiago Simas, Michal Ficek, Albert Diaz-Guilera, Pere Obrador, Pablo R. Rodriguez

Abstract: In this manuscript we propose, analyse, and discuss a possible new principle behind traditional cuisine: the Food-bridging hypothesis and its comparison with the food-pairing hypothesis using the same dataset and graphical models employed in the food-pairing study by Ahn et al. [Scientific Reports, 1:196 (2011)]. The Food-bridging hypothesis assumes that if two ingredients do not share a strong molecular or empirical affinity, they may become affine through a chain of pairwise affinities. That is, in a graphical model as employed by Ahn et al., a chain represents a path that joins the two ingredients, and the shortest path represents the strongest pairwise chain of affinities between the two ingredients. Food-pairing and Food-bridging are different hypotheses that may describe possible mechanisms behind the recipes of traditional cuisines. Food-pairing intensifies flavour by mixing ingredients in a recipe with similar chemical compounds, and food-bridging smoothes contrast between ingredients. Both food-pairing and food-bridging are observed in traditional cuisines, as shown in this work. We observed four classes of cuisines according to food-pairing and food-bridging: (1) East Asian cuisines, at one extreme, tend to avoid food-pairing as well as food-bridging; and (4) Latin American cuisines, at the other extreme, follow both principles. For the two middle classes: (2) Southeastern Asian cuisines avoid food-pairing and follow food-bridging; and (3) Western cuisines follow food-pairing and avoid food-bridging.

Submitted 14 April, 2017; v1 submitted 7 April, 2017; originally announced April 2017.

Showing 1–50 of 59 results for author: Rodriguez, R