-
A Magnetic-like description of Oscillatory Behavior in Chemotactic Ants
Authors:
Rosa Flaquer-Galmés,
Daniel Campos,
Javier Cristín
Abstract:
We investigate the role of chemotaxis in the movement dynamics of Aphaenogaster senilis ants. To do so, we design an experimental setup in which individual ants are exposed to a narrow pheromone trail that guides their motion. As expected, ants locate and navigate the trail by detecting chemical scents, exhibiting a characteristic zigzag pattern: they move at a nearly constant speed while oscillating perpendicularly to the trail. This zigzagging motion is common across many species, yet its underlying mechanism remains unclear. Here, we propose a physical framework based on the Inertial Spin Model to quantitatively describe and explain this behavior. To this end, we implement chemotaxis as a magnetic-like interaction between the ant's velocity and the pheromone gradient. Under specific approximations, the model yields an analytical expression for the velocity correlations perpendicular to the trail, predicting a characteristic oscillatory decay. This prediction closely matches our experimental data, suggesting that the model captures the essential ingredients of ant dynamics. By fitting the model parameters to individual experimental trajectories, we further explore their potential biological significance and validate our assumptions. Overall, our findings contribute to the understanding of chemotaxis in ant motion and its physical features.
Submitted 23 May, 2025;
originally announced May 2025.
-
EnronQA: Towards Personalized RAG over Private Documents
Authors:
Michael J. Ryan,
Danmei Xu,
Chris Nivera,
Daniel Campos
Abstract:
Retrieval Augmented Generation (RAG) has become one of the most popular methods for bringing knowledge-intensive context to large language models (LLMs) because of its ability to bring local context at inference time without the cost or data leakage risks associated with fine-tuning. A clear separation of private information from the LLM training has made RAG the basis for many enterprise LLM workloads, as it allows a company to augment an LLM's understanding using customers' private documents. Despite its popularity for private documents in enterprise deployments, current RAG benchmarks for validating and optimizing RAG pipelines draw their corpora from public data such as Wikipedia or generic web pages and offer little to no personal context. Seeking to empower more personal and private RAG, we release the EnronQA benchmark, a dataset of 103,638 emails with 528,304 question-answer pairs across 150 different user inboxes. EnronQA enables better benchmarking of RAG pipelines over private data and allows for experimentation on the introduction of personalized retrieval settings over realistic data. Finally, we use EnronQA to explore the tradeoff between memorization and retrieval when reasoning over private documents.
Submitted 30 April, 2025;
originally announced May 2025.
-
Support Evaluation for the TREC 2024 RAG Track: Comparing Human versus LLM Judges
Authors:
Nandan Thakur,
Ronak Pradeep,
Shivani Upadhyay,
Daniel Campos,
Nick Craswell,
Jimmy Lin
Abstract:
Retrieval-augmented generation (RAG) enables large language models (LLMs) to generate answers with citations from source documents containing "ground truth", thereby reducing system hallucinations. A crucial factor in RAG evaluation is "support", whether the information in the cited documents supports the answer. To this end, we conducted a large-scale comparative study of 45 participant submissions on 36 topics to the TREC 2024 RAG Track, comparing an automatic LLM judge (GPT-4o) against human judges for support assessment. We considered two conditions: (1) fully manual assessments from scratch and (2) manual assessments with post-editing of LLM predictions. Our results indicate that for 56% of the manual from-scratch assessments, human and GPT-4o predictions match perfectly (on a three-level scale), increasing to 72% in the manual with post-editing condition. Furthermore, by carefully analyzing the disagreements in an unbiased study, we found that an independent human judge correlates better with GPT-4o than a human judge, suggesting that LLM judges can be a reliable alternative for support assessment. To conclude, we provide a qualitative analysis of human and GPT-4o errors to help guide future iterations of support assessment.
Submitted 21 April, 2025;
originally announced April 2025.
-
The Great Nugget Recall: Automating Fact Extraction and RAG Evaluation with Large Language Models
Authors:
Ronak Pradeep,
Nandan Thakur,
Shivani Upadhyay,
Daniel Campos,
Nick Craswell,
Jimmy Lin
Abstract:
Large Language Models (LLMs) have significantly enhanced the capabilities of information access systems, especially with retrieval-augmented generation (RAG). Nevertheless, the evaluation of RAG systems remains a barrier to continued progress, a challenge we tackle in this work by proposing an automatic evaluation framework that is validated against human annotations. We believe that the nugget evaluation methodology provides a solid foundation for evaluating RAG systems. This approach, originally developed for the TREC Question Answering (QA) Track in 2003, evaluates systems based on atomic facts that should be present in good answers. Our efforts focus on "refactoring" this methodology, where we describe the AutoNuggetizer framework that specifically applies LLMs to both automatically create nuggets and automatically assign nuggets to system answers. In the context of the TREC 2024 RAG Track, we calibrate a fully automatic approach against strategies where nuggets are created manually or semi-manually by human assessors and then assigned manually to system answers. Based on results from a community-wide evaluation, we observe strong agreement at the run level between scores derived from fully automatic nugget evaluation and human-based variants. The agreement is stronger when individual framework components such as nugget assignment are automated independently. This suggests that our evaluation framework provides tradeoffs between effort and quality that can be used to guide the development of future RAG systems. However, further research is necessary to refine our approach, particularly in establishing robust per-topic agreement to diagnose system failures effectively.
Submitted 21 April, 2025;
originally announced April 2025.
-
ColBERT-serve: Efficient Multi-Stage Memory-Mapped Scoring
Authors:
Kaili Huang,
Thejas Venkatesh,
Uma Dingankar,
Antonio Mallia,
Daniel Campos,
Jian Jiao,
Christopher Potts,
Matei Zaharia,
Kwabena Boahen,
Omar Khattab,
Saarthak Sarup,
Keshav Santhanam
Abstract:
We study serving retrieval models, specifically late interaction models like ColBERT, to many concurrent users at once and under a small budget, in which the index may not fit in memory. We present ColBERT-serve, a novel serving system that applies a memory-mapping strategy to the ColBERT index, reducing RAM usage by 90% and permitting its deployment on cheap servers, and incorporates a multi-stage architecture with hybrid scoring, reducing ColBERT's query latency and supporting many concurrent queries in parallel.
Submitted 21 April, 2025;
originally announced April 2025.
-
Identifying and Replicating Code Patterns Driving Performance Regressions in Software Systems
Authors:
Denivan Campos,
Luana Martins,
Emanuela Guglielmi,
Michele Tucci,
Daniele Di Pompeo,
Simone Scalabrino,
Vittorio Cortellessa,
Dario Di Nucci,
Rocco Oliveto
Abstract:
Context: Performance regressions negatively impact execution time and memory usage of software systems. Nevertheless, there is a lack of systematic methods to evaluate the effectiveness of performance test suites. Performance mutation testing, which introduces intentional defects (mutants) to measure and enhance fault-detection capabilities, is promising but underexplored. A key challenge is understanding if generated mutants accurately reflect real-world performance issues. Goal: This study evaluates and extends mutation operators for performance testing. Its objectives include (i) collecting existing performance mutation operators, (ii) introducing new operators from real-world code changes that impact performance, and (iii) evaluating these operators on real-world systems to see if they effectively degrade performance. Method: To this aim, we will (i) review the literature to identify performance mutation operators, (ii) conduct a mining study to extract patterns of code changes linked to performance regressions, (iii) propose new mutation operators based on these patterns, and (iv) apply and evaluate the operators to assess their effectiveness in exposing performance degradations. Expected Outcomes: We aim to provide an enriched set of mutation operators for performance testing, helping developers and researchers identify harmful coding practices and design better strategies to detect and prevent performance regressions.
Submitted 8 April, 2025;
originally announced April 2025.
-
Prospection and dispersal in metapopulations: a perspective from opinion dynamics models
Authors:
Daniela Molas,
Daniel Campos
Abstract:
Dispersal is often used by living beings to gather information from conspecifics, integrating it with personal experience to guide decision-making. This mechanism has only recently been studied experimentally, facilitated by advancements in tracking animal groups over extended periods. Such studies enable the analysis of the adaptive dynamics underlying sequential decisions and collective choices. Here, we present a theoretical framework based on the Voter Model to investigate these processes. The model, originally designed to study opinion or behavioral consensus within groups through imitation, is adapted to include the prospection of others' decisions as a mechanism for updating personal criteria. We demonstrate that several properties of our model (such as average consensus times and polarization dynamics) can be analytically mapped onto those of the classical Voter Model under simplifying assumptions. Finally, we discuss the potential of this framework for studying more complex scenarios.
Submitted 28 February, 2025;
originally announced February 2025.
-
GOTO065054+593624: a 8.5 mag amplitude dwarf nova identified in real time via Kilonova Seekers
Authors:
T. L. Killestein,
G. Ramsay,
M. Kennedy,
L. Kelsey,
D. Steeghs,
S. Littlefair,
B. Godson,
J. Lyman,
M. Pursiainen,
B. Warwick,
C. Krawczyk,
L. K. Nuttall,
E. Wickens,
S. D. Alexandrov,
C. M. da Silva,
R. Leadbeater,
K. Ackley,
M. J. Dyer,
F. Jiménez-Ibarra,
K. Ulaczyk,
D. K. Galloway,
V. S. Dhillon,
P. O'Brien,
K. Noysena,
R. Kotak
, et al. (40 additional authors not shown)
Abstract:
Dwarf novae are astrophysical laboratories for probing the nature of accretion, binary mass transfer, and binary evolution -- yet their diverse observational characteristics continue to challenge our theoretical understanding. We here present the discovery of, and subsequent observing campaign on GOTO065054+593624 (hereafter GOTO0650), a dwarf nova of the WZ Sge type, discovered in real-time by citizen scientists via the Kilonova Seekers citizen science project, which has an outburst amplitude of 8.5 mag. An extensive dataset charts the photometric and spectroscopic evolution of this object, covering the 2024 superoutburst. GOTO0650 shows an absence of visible emission lines during the high state, strong H and barely-detected HeII emission, and high-amplitude echo outbursts with a rapidly decreasing timescale. The comprehensive dataset presented here marks GOTO0650 as a candidate period bouncer, and highlights the important contribution that citizen scientists can make to the study of Galactic transients.
Submitted 8 May, 2025; v1 submitted 20 January, 2025;
originally announced January 2025.
-
CORD: Balancing COnsistency and Rank Distillation for Robust Retrieval-Augmented Generation
Authors:
Youngwon Lee,
Seung-won Hwang,
Daniel Campos,
Filip Graliński,
Zhewei Yao,
Yuxiong He
Abstract:
With the adoption of retrieval-augmented generation (RAG), large language models (LLMs) are expected to ground their generation in the retrieved contexts. Yet, this is hindered by the position bias of LLMs, which fail to attend evenly to all contexts. Previous work has addressed this by synthesizing contexts with perturbed positions of the gold segment, creating a position-diversified training set. We extend this intuition to propose consistency regularization with augmentation and distillation. First, we augment each training instance with its position perturbation to encourage consistent predictions, regardless of ordering. We also distill behaviors of this pair, although this can be counterproductive in certain RAG scenarios where the given order from the retriever is crucial for generation quality. We thus propose CORD, balancing COnsistency and Rank Distillation. CORD adaptively samples noise-controlled perturbations from an interpolation space, ensuring both consistency and respect for the rank prior. Empirical results show this balance enables CORD to outperform consistently in diverse RAG benchmarks.
Submitted 19 December, 2024;
originally announced December 2024.
-
Inference Scaling for Bridging Retrieval and Augmented Generation
Authors:
Youngwon Lee,
Seung-won Hwang,
Daniel Campos,
Filip Graliński,
Zhewei Yao,
Yuxiong He
Abstract:
Retrieval-augmented generation (RAG) has emerged as a popular approach to steering the output of a large language model (LLM) by incorporating retrieved contexts as inputs. However, existing work has observed generator bias, such that improving the retrieval results may negatively affect the outcome. In this work, we show that such bias can be mitigated through inference scaling, by aggregating inference calls over permuted orders of the retrieved contexts. The proposed Mixture-of-Intervention (MOI) explicitly models the debiased utility of each passage with multiple forward passes to construct a new ranking. We also show that MOI can leverage the retriever's prior knowledge to reduce the computational cost by minimizing the number of permutations considered and lowering the cost per LLM call. We showcase the effectiveness of MOI on diverse RAG tasks, improving ROUGE-L on MS MARCO and EM on HotpotQA benchmarks by ~7 points.
Submitted 14 December, 2024;
originally announced December 2024.
-
Arctic-Embed 2.0: Multilingual Retrieval Without Compromise
Authors:
Puxuan Yu,
Luke Merrick,
Gaurav Nuti,
Daniel Campos
Abstract:
This paper presents the training methodology of Arctic-Embed 2.0, a set of open-source text embedding models built for accurate and efficient multilingual retrieval. While prior works have suffered from degraded English retrieval quality, Arctic-Embed 2.0 delivers competitive retrieval quality on multilingual and English-only benchmarks, and supports Matryoshka Representation Learning (MRL) for efficient embedding storage with significantly lower compressed quality degradation compared to alternatives. We detail the design and implementation, presenting several important open research questions that arose during model development. We conduct experiments exploring these research questions and include extensive discussion aimed at fostering further discussion in this field.
Submitted 13 December, 2024; v1 submitted 3 December, 2024;
originally announced December 2024.
-
Initial Nugget Evaluation Results for the TREC 2024 RAG Track with the AutoNuggetizer Framework
Authors:
Ronak Pradeep,
Nandan Thakur,
Shivani Upadhyay,
Daniel Campos,
Nick Craswell,
Jimmy Lin
Abstract:
This report provides an initial look at partial results from the TREC 2024 Retrieval-Augmented Generation (RAG) Track. We have identified RAG evaluation as a barrier to continued progress in information access (and more broadly, natural language processing and artificial intelligence), and it is our hope that we can contribute to tackling the many challenges in this space. The central hypothesis we explore in this work is that the nugget evaluation methodology, originally developed for the TREC Question Answering Track in 2003, provides a solid foundation for evaluating RAG systems. As such, our efforts have focused on "refactoring" this methodology, specifically applying large language models to both automatically create nuggets and to automatically assign nuggets to system answers. We call this the AutoNuggetizer framework. Within the TREC setup, we are able to calibrate our fully automatic process against a manual process whereby nuggets are created by human assessors semi-manually and then assigned manually to system answers. Based on initial results across 21 topics from 45 runs, we observe a strong correlation between scores derived from a fully automatic nugget evaluation and a (mostly) manual nugget evaluation by human assessors. This suggests that our fully automatic evaluation process can be used to guide future iterations of RAG systems.
Submitted 14 November, 2024;
originally announced November 2024.
-
A Large-Scale Study of Relevance Assessments with Large Language Models: An Initial Look
Authors:
Shivani Upadhyay,
Ronak Pradeep,
Nandan Thakur,
Daniel Campos,
Nick Craswell,
Ian Soboroff,
Hoa Trang Dang,
Jimmy Lin
Abstract:
The application of large language models to provide relevance assessments presents exciting opportunities to advance information retrieval, natural language processing, and beyond, but to date many unknowns remain. This paper reports on the results of a large-scale evaluation (the TREC 2024 RAG Track) where four different relevance assessment approaches were deployed in situ: the "standard" fully manual process that NIST has implemented for decades and three different alternatives that take advantage of LLMs to different extents using the open-source UMBRELA tool. This setup allows us to correlate system rankings induced by the different approaches to characterize tradeoffs between cost and quality. We find that in terms of nDCG@20, nDCG@100, and Recall@100, system rankings induced by automatically generated relevance assessments from UMBRELA correlate highly with those induced by fully manual assessments across a diverse set of 77 runs from 19 teams. Our results suggest that automatically generated UMBRELA judgments can replace fully manual judgments to accurately capture run-level effectiveness. Surprisingly, we find that LLM assistance does not appear to increase correlation with fully manual assessments, suggesting that costs associated with human-in-the-loop processes do not bring obvious tangible benefits. Overall, human assessors appear to be stricter than UMBRELA in applying relevance criteria. Our work validates the use of LLMs in academic TREC-style evaluations and provides the foundation for future studies.
Submitted 12 November, 2024;
originally announced November 2024.
-
SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications
Authors:
Gabriele Oliaro,
Zhihao Jia,
Daniel Campos,
Aurick Qiao
Abstract:
Speculative decoding is widely adopted to reduce latency in large language model (LLM) inference by leveraging smaller draft models capable of handling diverse user tasks. However, emerging AI applications, such as LLM-based agents, present unique workload characteristics: instead of diverse independent requests, agentic frameworks typically submit repetitive inference requests, such as multi-agent pipelines performing similar subtasks or self-refinement loops iteratively enhancing outputs. These workloads result in long and highly predictable sequences, which current speculative decoding methods do not effectively exploit. To address this gap, we introduce SuffixDecoding, a novel method that utilizes efficient suffix trees to cache long token sequences from prompts and previous outputs. By adaptively speculating more tokens when acceptance likelihood is high and fewer when it is low, SuffixDecoding effectively exploits opportunities for longer speculations while conserving computation when those opportunities are limited. Evaluations on agentic benchmarks, including SWE-Bench and Text-to-SQL, demonstrate that SuffixDecoding achieves speedups of up to 5.3×, outperforming state-of-the-art methods -- 2.8× faster than model-based approaches like EAGLE-2/3 and 1.9× faster than model-free approaches such as Token Recycling. SuffixDecoding is open-sourced at https://github.com/snowflakedb/ArcticInference.
Submitted 2 June, 2025; v1 submitted 7 November, 2024;
originally announced November 2024.
-
Estimation of Psychosocial Work Environment Exposures Through Video Object Detection. Proof of Concept Using CCTV Footage
Authors:
Claus D. Hansen,
Thuy Hai Le,
David Campos
Abstract:
This paper examines the use of computer vision algorithms to estimate aspects of the psychosocial work environment using CCTV footage. We present a proof of concept for a methodology that detects and tracks people in video footage and estimates interactions between customers and employees by estimating their poses and calculating the duration of their encounters. We propose a pipeline that combines existing object detection and tracking algorithms (YOLOv8 and DeepSORT) with pose estimation algorithms (BlazePose) to estimate the number of customers and employees in the footage as well as the duration of their encounters. We use a simple rule-based approach to classify the interactions as positive, neutral or negative based on three different criteria: distance, duration and pose. The proposed methodology is tested on a small dataset of CCTV footage. While the data is quite limited, in particular with respect to the quality of the footage, we have chosen this case as it represents a typical setting where the method could be applied. The results show that the object detection and tracking part of the pipeline has a reasonable performance on the dataset, with a high degree of recall and reasonable accuracy. At this stage, the pose estimation is still too limited to fully detect the type of interactions, due to difficulties in tracking employees in the footage. We conclude that the method is a promising alternative to self-reported measures of the psychosocial work environment and could be used in future studies to obtain external observations of the work environment.
Submitted 6 November, 2024;
originally announced November 2024.
-
Vernacularizing Taxonomies of Harm is Essential for Operationalizing Holistic AI Safety
Authors:
Wm. Matthew Kennedy,
Daniel Vargas Campos
Abstract:
Operationalizing AI ethics and safety principles and frameworks is essential to realizing the potential benefits and mitigating potential harms caused by AI systems. To that end, actors across industry, academia, and regulatory bodies have created formal taxonomies of harm to support operationalization efforts. These include novel holistic methods that go beyond exclusive reliance on technical benchmarking. However, our paper argues that such taxonomies must also be transferred into local categories to be readily implemented in sector-specific AI safety operationalization efforts, and especially in underresourced or high-risk sectors. This is because many sectors are constituted by discourses, norms, and values that "refract" or even directly conflict with those operating in society more broadly. Drawing from emerging anthropological theories of human rights, we propose that the process of "vernacularization"--a participatory, decolonial practice distinct from doctrinary "translation" (the dominant mode of AI safety operationalization)--can help bridge this gap. To demonstrate this point, we consider the education sector, and identify precisely how vernacularizing a leading holistic taxonomy of harm leads to a clearer view of how harms AI systems may cause are substantially intensified when deployed in educational spaces. We conclude by discussing the generalizability of vernacularization as a useful AI safety methodology.
Submitted 21 October, 2024;
originally announced October 2024.
-
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning
Authors:
Jaeseong Lee,
Seung-won Hwang,
Aurick Qiao,
Daniel F Campos,
Zhewei Yao,
Yuxiong He
Abstract:
Mixture-of-experts (MoEs) have been adopted for reducing inference costs by sparsely activating experts in large language models (LLMs). Despite this reduction, the massive number of experts in MoEs still makes them expensive to serve. In this paper, we study how to address this by pruning MoEs. Among pruning methodologies, unstructured pruning has been known to achieve the highest performance for a given pruning ratio, compared to structured pruning, since the latter imposes constraints on the sparsification structure. This is intuitive, as the solution space of unstructured pruning subsumes that of structured pruning. However, our counterintuitive finding reveals that expert pruning, a form of structured pruning, can actually precede unstructured pruning to outperform unstructured-only pruning. As existing expert pruning, requiring $O(\frac{k^n}{\sqrt{n}})$ forward passes for $n$ experts, cannot scale for recent MoEs, we propose a scalable alternative with $O(1)$ complexity that nevertheless outperforms the more expensive methods. The key idea is leveraging a latent structure between experts, based on behavior similarity, such that the greedy decision of whether to prune closely captures the joint pruning effect. Our method is highly effective -- for Snowflake Arctic, a 480B-sized MoE with 128 experts, it needs only one H100 and two hours to achieve nearly no loss in performance with 40% sparsity, even in generative tasks such as GSM8K, where state-of-the-art unstructured pruning fails. The code will be made publicly available.
Submitted 10 September, 2024;
originally announced September 2024.
-
Gravitational Surface Tension as the Origin for the Black Hole Entropy
Authors:
S. D. Campos,
R. H. Longaresi
Abstract:
In this work, we explore the thermodynamics of black holes using the Gouy-Stodola theorem, traditionally applied to mechanical systems relating entropy production to the difference between reversible and irreversible work. We model black holes as gravitational bubbles with surface tension defined at the event horizon, deriving the Bekenstein-Hawking entropy relation for non-rotating black holes. One extends this approach to rotating black holes, incorporating the effects of angular momentum, demonstrating that the Gouy-Stodola theorem can similarly derive the entropy-area law in this case. Additionally, we analyze the merging of two black holes, showing that the resultant total entropy exceeds the sum of the individual entropies, thereby adhering to the second law of thermodynamics. Our results suggest that gravitational surface tension is a key factor in black hole thermodynamics, providing a novel and coherent framework for understanding the entropy production in these extreme astrophysical objects.
Submitted 4 September, 2024;
originally announced September 2024.
-
A New Control Law for TS Fuzzy Models: Less Conservative LMI Conditions by Using Membership Functions Derivative
Authors:
Leonardo Amaral Mozelli,
Victor Costa da Silva Campos
Abstract:
This note proposes a new type of Parallel Distributed Controller (PDC) for Takagi-Sugeno (TS) fuzzy models. Our idea consists of using two control terms based on state feedback, one composed of a convex combination of linear gains weighted by the normalized membership grade, as in traditional PDC, and the other composed of linear gains weighted by the time-derivatives of the membership functions. We present the design conditions as Linear Matrix Inequalities, solvable through numerical optimization tools. Numerical examples are given to illustrate the advantages of the proposed approach, which contains the traditional PDC as a special case.
Submitted 15 August, 2024;
originally announced August 2024.
-
Itô's Formula for Itô processes defined with respect to a cylindrical-martingale valued measure
Authors:
Santiago Cambronero,
David Campos,
C. A. Fonseca-Mora,
Darío Mena
Abstract:
Using the theory of stochastic integration developed recently by the authors, in this paper we prove an Itô formula for Hilbert space-valued Itô processes defined with respect to a cylindrical-martingale valued measure. As part of our study, we develop some tools from stochastic analysis, such as the predictable and optional quadratic variation of the stochastic integral, the continuous and purely discontinuous parts of the integral process, and a Riemann representation formula. Finally, as an application of Itô's formula we prove a Burkholder inequality for the stochastic integral defined with respect to a cylindrical-martingale valued measure.
Submitted 15 December, 2024; v1 submitted 22 July, 2024;
originally announced July 2024.
-
Managing O-RAN Networks: xApp Development from Zero to Hero
Authors:
Joao F. Santos,
Alexandre Huff,
Daniel Campos,
Kleber V. Cardoso,
Cristiano B. Both,
Luiz A. DaSilva
Abstract:
The Open Radio Access Network (O-RAN) Alliance proposes an open architecture that disaggregates the RAN and supports executing custom control logic in near-real time from third-party applications, the xApps. Despite O-RAN's efforts, the creation of xApps remains a complex and time-consuming endeavor, aggravated by the sometimes fragmented, outdated, or deprecated documentation from the O-RAN Software Community (OSC). These challenges hinder academia and industry from developing and validating solutions and algorithms on O-RAN networks. This tutorial addresses this gap by providing the first comprehensive guide for developing xApps to manage the O-RAN ecosystem from theory to practice. We provide a thorough theoretical foundation of the O-RAN architecture and detail the functionality offered by Near Real-Time RAN Intelligent Controller (Near-RT RIC) components. We examine the xApp design and configuration. We explore the xApp lifecycle and demonstrate how to deploy and manage xApps on a Near-RT RIC. We address the xApps' interfaces and capabilities, accompanied by practical examples. We provide comprehensive details on how xApps can control the RAN. We discuss debugging strategies and good practices to aid the xApp developers in testing their xApps. Finally, we review the current landscape and open challenges for creating xApps.
Submitted 5 February, 2025; v1 submitted 12 July, 2024;
originally announced July 2024.
-
Ragnarök: A Reusable RAG Framework and Baselines for TREC 2024 Retrieval-Augmented Generation Track
Authors:
Ronak Pradeep,
Nandan Thakur,
Sahel Sharifymoghaddam,
Eric Zhang,
Ryan Nguyen,
Daniel Campos,
Nick Craswell,
Jimmy Lin
Abstract:
Did you try out the new Bing Search? Or maybe you fiddled around with Google AI~Overviews? These might sound familiar because the modern-day search stack has recently evolved to include retrieval-augmented generation (RAG) systems. They allow searching and incorporating real-time data into large language models (LLMs) to provide a well-informed, attributed, concise summary in contrast to the traditional search paradigm that relies on displaying a ranked list of documents. Therefore, given these recent advancements, it is crucial to have an arena to build, test, visualize, and systematically evaluate RAG-based search systems. With this in mind, we propose the TREC 2024 RAG Track to foster innovation in evaluating RAG systems. In our work, we lay out the steps we've made towards making this track a reality -- we describe the details of our reusable framework, Ragnarök, explain the curation of the new MS MARCO V2.1 collection choice, release the development topics for the track, and standardize the I/O definitions which assist the end user. Next, using Ragnarök, we identify and provide key industrial baselines such as OpenAI's GPT-4o or Cohere's Command R+. Further, we introduce a web-based user interface for an interactive arena allowing benchmarking pairwise RAG systems by crowdsourcing. We open-source our Ragnarök framework and baselines to achieve a unified standard for future RAG systems.
Submitted 24 June, 2024;
originally announced June 2024.
-
Mimicking Negative Mass Properties
Authors:
S. D. Campos
Abstract:
In the present work, one analyzes two systems trying to obtain physical conditions where some properties attributed to negative mass can be mimicked by positive mass particles. The first one is the well-known 1/2-spin system described by the Dirac equation in the presence of an external electromagnetic field. Assuming some physical restrictions, one obtains that the use of $e\rightarrow-e$ can lead to the same results as using $m\rightarrow-m$. In particular, for a null dielectric function, it is possible to obtain a negative mass behavior from a positive mass system composed of negatively charged particles. The second system is based on the de Broglie matter wave. The dispersion relation of such a wave can be negative (real or imaginary valued) if one assumes an imaginary wavenumber. The consequence is the emergence of a negative refractive index for positive mass particles. However, this behavior is generally attributed to a negative mass system.
Submitted 20 May, 2024;
originally announced May 2024.
-
On the Existence and Smoothness of the Navier-Stokes Equation I
Authors:
Brian David Vasquez Campos
Abstract:
In this paper, we give a sufficient condition to guarantee the existence of a smooth solution of the Navier-Stokes Equation with the nice decreasing properties at infinity. In this way, we prove the existence of smooth physically reasonable solutions to the Navier-Stokes problem. Additionally, we show the existence of a smooth curve of entire vector fields of order 2 that extends the solution to the complex domain for positive time.
Submitted 6 December, 2024; v1 submitted 13 May, 2024;
originally announced May 2024.
-
Synthetic Test Collections for Retrieval Evaluation
Authors:
Hossein A. Rahmani,
Nick Craswell,
Emine Yilmaz,
Bhaskar Mitra,
Daniel Campos
Abstract:
Test collections play a vital role in evaluation of information retrieval (IR) systems. Obtaining a diverse set of user queries for test collection construction can be challenging, and acquiring relevance judgments, which indicate the appropriateness of retrieved documents to a query, is often costly and resource-intensive. Generating synthetic datasets using Large Language Models (LLMs) has recently gained significant attention in various applications. In IR, while previous work exploited the capabilities of LLMs to generate synthetic queries or documents to augment training data and improve the performance of ranking models, using LLMs for constructing synthetic test collections is relatively unexplored. Previous studies demonstrate that LLMs have the potential to generate synthetic relevance judgments for use in the evaluation of IR systems. In this paper, we comprehensively investigate whether it is possible to use LLMs to construct fully synthetic test collections by generating not only synthetic judgments but also synthetic queries. In particular, we analyse whether it is possible to construct reliable synthetic test collections and the potential risks of bias such test collections may exhibit towards LLM-based models. Our experiments indicate that using LLMs it is possible to construct synthetic test collections that can reliably be used for retrieval evaluation.
Submitted 13 May, 2024;
originally announced May 2024.
-
Arctic-Embed: Scalable, Efficient, and Accurate Text Embedding Models
Authors:
Luke Merrick,
Danmei Xu,
Gaurav Nuti,
Daniel Campos
Abstract:
This report describes the training dataset creation and recipe behind the family of \texttt{arctic-embed} text embedding models (a set of five models ranging from 22 to 334 million parameters with weights open-sourced under an Apache-2 license). At the time of their release, each model achieved state-of-the-art retrieval accuracy for models of their size on the MTEB Retrieval leaderboard, with the largest model, arctic-embed-l, outperforming closed source embedding models such as Cohere's embed-v3 and OpenAI's text-embed-3-large. In addition to the details of our training recipe, we have provided several informative ablation studies, which we believe are the cause of our model performance.
Submitted 8 May, 2024;
originally announced May 2024.
-
QCore: Data-Efficient, On-Device Continual Calibration for Quantized Models -- Extended Version
Authors:
David Campos,
Bin Yang,
Tung Kieu,
Miao Zhang,
Chenjuan Guo,
Christian S. Jensen
Abstract:
We are witnessing an increasing availability of streaming data that may contain valuable information on the underlying processes. It is thus attractive to be able to deploy machine learning models on edge devices near sensors such that decisions can be made instantaneously, rather than first having to transmit incoming data to servers. To enable deployment on edge devices with limited storage and computational capabilities, the full-precision parameters in standard models can be quantized to use fewer bits. The resulting quantized models are then calibrated using back-propagation and full training data to ensure accuracy. This one-time calibration works for deployments in static environments. However, model deployment in dynamic edge environments calls for continual calibration to adaptively adjust quantized models to fit new incoming data, which may have different distributions. The first difficulty in enabling continual calibration on the edge is that the full training data may be too large and thus not always available on edge devices. The second difficulty is that the use of back-propagation on the edge for repeated calibration is too expensive. We propose QCore to enable continual calibration on the edge. First, it compresses the full training data into a small subset to enable effective calibration of quantized models with different bit-widths. We also propose means of updating the subset when new streaming data arrives to reflect changes in the environment, while not forgetting earlier training data. Second, we propose a small bit-flipping network that works with the subset to update quantized model parameters, thus enabling efficient continual calibration without back-propagation. An experimental study, conducted with real-world data in a continual learning setting, offers insight into the properties of QCore and shows that it is capable of outperforming strong baseline methods.
Submitted 22 April, 2024;
originally announced April 2024.
-
A Semi-Lagrangian Approach for Time and Energy Path Planning Optimization in Static Flow Fields
Authors:
Víctor C. da S. Campos,
Armando A. Neto,
Douglas G. Macharet
Abstract:
Efficient path planning for autonomous mobile robots is a critical problem across numerous domains, where optimizing both time and energy consumption is paramount. This paper introduces a novel methodology that considers the dynamic influence of an environmental flow field and considers geometric constraints, including obstacles and forbidden zones, enriching the complexity of the planning problem. We formulate it as a multi-objective optimal control problem, propose a novel transformation called Harmonic Transformation, and apply a semi-Lagrangian scheme to solve it. The set of Pareto efficient solutions is obtained considering two distinct approaches: a deterministic method and an evolutionary-based one, both of which are designed to make use of the proposed Harmonic Transformation. Through an extensive analysis of these approaches, we demonstrate their efficacy in finding optimized paths.
Submitted 13 March, 2025; v1 submitted 25 March, 2024;
originally announced March 2024.
-
The Active Asteroids Citizen Science Program: Overview and First Results
Authors:
Colin Orion Chandler,
Chadwick A. Trujillo,
William J. Oldroyd,
Jay K. Kueny,
William A. Burris,
Henry H. Hsieh,
Jarod A. DeSpain,
Nima Sedaghat,
Scott S. Sheppard,
Kennedy A. Farrell,
David E. Trilling,
Annika Gustafsson,
Mark Jesus Mendoza Magbanua,
Michele T. Mazzucato,
Milton K. D. Bosch,
Tiffany Shaw-Diaz,
Virgilio Gonano,
Al Lamperti,
José A. da Silva Campos,
Brian L. Goodwin,
Ivan A. Terentev,
Charles J. A. Dukes,
Sam Deen
Abstract:
We present the Citizen Science program Active Asteroids and describe discoveries stemming from our ongoing project. Our NASA Partner program is hosted on the Zooniverse online platform and launched on 2021 August 31, with the goal of engaging the community in the search for active asteroids -- asteroids with comet-like tails or comae. We also set out to identify other unusual active solar system objects, such as active Centaurs, active quasi-Hilda asteroids, and Jupiter-family comets (JFCs). Active objects are rare in large part because they are difficult to identify, so we ask volunteers to assist us in searching for active bodies in our collection of millions of images of known minor planets. We produced these cutout images with our project pipeline that makes use of publicly available Dark Energy Camera (DECam) data. Since the project launch, roughly 8,300 volunteers have scrutinized some 430,000 images to great effect, which we describe in this work. In total we have identified previously unknown activity on 15 asteroids, plus one Centaur, that were thought to be asteroidal (i.e., inactive). Of the asteroids, we classify four as active quasi-Hilda asteroids, seven as JFCs, and four as active asteroids, consisting of one Main-belt comet (MBC) and three MBC candidates. We also include our findings concerning known active objects that our program facilitated, an unanticipated avenue of scientific discovery. These include discovering activity occurring during an orbital epoch for which objects were not known to be active, and the reclassification of objects based on our dynamical analyses.
Submitted 14 March, 2024;
originally announced March 2024.
-
Intermittent random walks under stochastic resetting
Authors:
Rosa Flaquer-Galmés,
Daniel Campos,
Vicenç Méndez
Abstract:
We analyze a one-dimensional intermittent random walk on an unbounded domain in the presence of stochastic resetting. In this process, the walker alternates between local intensive search, diffusion, and rapid ballistic relocations in which it does not react to the target. We demonstrate that Poissonian resetting leads to the existence of a non-equilibrium steady state. We calculate the distribution of the first arrival time to a target along with its mean and show the existence of an optimal reset rate. In particular, we prove that the initial condition of the walker, i.e., either starting diffusely or relocating, can significantly affect the long-time properties of the search process. Moreover, we demonstrate the presence of distinct parameter regimes for the global optimization of the mean first arrival time when ballistic and diffusive movements are in direct competition.
Submitted 30 January, 2024;
originally announced January 2024.
-
A Class of Matrix Schrödinger Bispectral Operators
Authors:
Brian D. Vasquez Campos
Abstract:
We prove the bispectrality of some class of matrix Schrödinger operators with polynomial potentials that satisfy a second-order matrix autonomous differential equation. The physical equation is constructed using the formal theory of the Laurent series and after that obtaining local solutions using estimations in the Frobenius norm. Furthermore, the characterization of the algebra of polynomial eigenvalues in the spectral variable is given using some family of functions $\left\{ P_{k}\right\}_{k\in \mathbb{N}}$ with the remarkable property of satisfying a general version of the Leibniz rule.
Submitted 1 February, 2024; v1 submitted 29 January, 2024;
originally announced January 2024.
-
First-passage time of a Brownian searcher with stochastic resetting to random positions
Authors:
Vicenç Mendez,
Rosa Flaquer-Galmés,
Daniel Campos
Abstract:
We study the effect of a resetting point randomly distributed around the origin on the mean first passage time of a Brownian searcher moving in one dimension. We compare the search efficiency with that corresponding to reset to the origin and find that the mean first passage time of the latter can be larger or smaller than the distributed case, depending on whether the resetting points are symmetrically or asymmetrically distributed. In particular, we prove the existence of an optimal reset rate that minimizes the mean first-passage time for distributed resetting to a finite interval if the target is located outside this interval. When the target position belongs to the resetting interval or it is infinite then no optimal reset rate exists, but there is an optimal resetting interval width or resetting characteristic scale which minimizes the mean first-passage time. We also show that the first-passage density averaged over the resetting points depends on its first moment only. As a consequence, there is an equivalent point such that the first-passage problem with resetting to that point is statistically equivalent to the case of distributed resetting. We end our study by analyzing the fluctuations of the first-passage times for these cases. All our analytical results are verified through numerical simulations.
Submitted 2 January, 2024;
originally announced January 2024.
-
Long-term temporal stability of the DarkSide-50 dark matter detector
Authors:
The DarkSide-50 Collaboration,
P. Agnes,
I. F. M. Albuquerque,
T. Alexander,
A. K. Alton,
M. Ave,
H. O. Back,
G. Batignani,
K. Biery,
V. Bocci,
W. M. Bonivento,
B. Bottino,
S. Bussino,
M. Cadeddu,
M. Cadoni,
F. Calaprice,
A. Caminata,
M. D. Campos,
N. Canci,
M. Caravati,
N. Cargioli,
M. Cariello,
M. Carlini,
V. Cataudella
, et al. (121 additional authors not shown)
Abstract:
The stability of a dark matter detector on the timescale of a few years is a key requirement due to the large exposure needed to achieve a competitive sensitivity. It is especially crucial to enable the detector to potentially detect any annual event rate modulation, an expected dark matter signature. In this work, we present the performance history of the DarkSide-50 dual-phase argon time projection chamber over its almost three-year low-radioactivity argon run. In particular, we focus on the electroluminescence signal that enables sensitivity to sub-keV energy depositions. The stability of the electroluminescence yield is found to be better than 0.5%. Finally, we show that the temporal evolution of the observed event rate around the sub-keV region is consistent with the background prediction.
Submitted 22 May, 2024; v1 submitted 30 November, 2023;
originally announced November 2023.
-
Dynamic redundancy as a mechanism to optimize collective random searches
Authors:
Daniel Campos,
Vicenç Méndez
Abstract:
We explore the case of a group of random walkers looking for a target randomly located in space, such that the number of walkers is not constant but new ones can join the search, or those that are active can abandon it, with constant rates $r_b$ and $r_d$, respectively. Exact analytical solutions are provided both for the fastest-first-passage time and for the collective search time required to reach the target, in the seminal case of Brownian walkers with $r_d=0$. We prove that even for such a simplified situation there exists an optimal rate $r_b$ at which walkers should join the search to minimize the collective effort required to reach the target. We discuss how these results open a new line to understand the optimal regulation of cooperative random searches, e.g. for the case of biological foraging in social species.
Submitted 29 November, 2023;
originally announced November 2023.
-
Overview of the TREC 2023 Product Product Search Track
Authors:
Daniel Campos,
Surya Kallumadi,
Corby Rosset,
Cheng Xiang Zhai,
Alessandro Magnani
Abstract:
This is the first year of the TREC Product search track. The focus this year was the creation of a reusable collection and evaluation of the impact of the use of metadata and multi-modal data on retrieval accuracy. This year we leverage the new product search corpus, which includes contextual metadata. Our analysis shows that in the product search domain, traditional retrieval systems are highly effective and commonly outperform general-purpose pretrained embedding models. Our analysis also evaluates the impact of using simplified and metadata-enhanced collections, finding no clear trend in the impact of the expanded collection. We also see some surprising outcomes; despite their widespread adoption and competitive performance on other tasks, we find single-stage dense retrieval runs can commonly be noncompetitive or generate low-quality results both in the zero-shot and fine-tuned domain.
Submitted 15 November, 2023; v1 submitted 13 November, 2023;
originally announced November 2023.
-
Riesz spaces of signed charges on semi-rings
Authors:
Santiago Cambronero,
David Campos,
C. A. Fonseca-Mora,
Darío Mena
Abstract:
A constructive definition of the supremum of a family of set functions is exploited in the context of Riesz spaces of signed measures and finitely additive functions (signed charges) on semi-rings. We explore applications, particularly to establish a Jordan decomposition for signed charges on semi-rings, whether the structure of Riesz space is present or not.
Submitted 25 November, 2024; v1 submitted 20 August, 2023;
originally announced August 2023.
-
Cylindrical Martingale-Valued Measures, Stochastic Integration and SPDEs
Authors:
Santiago Cambronero,
David Campos,
C. A. Fonseca-Mora,
Darío Mena
Abstract:
We develop a theory of Hilbert-space valued stochastic integration with respect to cylindrical martingale-valued measures. As part of our construction, we expand the concept of quadratic variation, introduced by Veraar and Yaroslavtsev (2016), to the case of cylindrical martingale-valued measures that are allowed to have discontinuous paths (this is carried out within the context of separable Banach spaces). Our theory of stochastic integration is applied to address the existence and uniqueness of solutions to stochastic partial differential equations in Hilbert spaces.
Submitted 21 November, 2024; v1 submitted 20 August, 2023;
originally announced August 2023.
-
Boundary conditions and infrared divergences
Authors:
Lissa de Souza Campos,
Claudio Dappiaggi,
Luca Sinibaldi
Abstract:
We review the procedure to construct quasi-free ground states, for real scalar fields whose dynamics is dictated by the Klein-Gordon equation, on standard static Lorentzian manifolds with a time-like boundary. We observe that, depending on the assigned boundary condition of Robin type, this procedure does not always lead to the existence of a suitable bi-distribution $w_2\in \mathcal{D}'(M\times M)$ due to the presence of infrared divergences. As a concrete example we consider a Bertotti-Robinson spacetime in two different coordinate patches. In one case we show that infrared divergences do not occur only for Dirichlet boundary conditions as one might expect a priori, while, in the other case, we prove that they occur only when Neumann boundary conditions are imposed at the time-like boundary.
Submitted 2 August, 2023;
originally announced August 2023.
-
On Negative Mass, Partition Function and Entropy
Authors:
S. D. Campos
Abstract:
This work examines some aspects related to the existence of negative mass. The requirement for the partition function to converge leads to two distinct approaches. Initially, convergence is achieved by assuming a negative absolute temperature, which results in an imaginary partition function and complex entropy. Subsequently, convergence is maintained by keeping the absolute temperature positive while introducing an imaginary velocity. This modification leads to a positive partition function and real entropy. It seems the utilization of imaginary velocity may yield more plausible physical results compared to the use of negative temperature, at least for the partition function and entropy.
Submitted 14 November, 2023; v1 submitted 25 July, 2023;
originally announced July 2023.
-
Search for dark matter annual modulation with DarkSide-50
Authors:
The DarkSide-50 Collaboration,
P. Agnes,
I. F. M. Albuquerque,
T. Alexander,
A. K. Alton,
M. Ave,
H. O. Back,
G. Batignani,
K. Biery,
V. Bocci,
W. M. Bonivento,
B. Bottino,
S. Bussino,
M. Cadeddu,
M. Cadoni,
F. Calaprice,
A. Caminata,
M. D. Campos,
N. Canci,
M. Caravati,
N. Cargioli,
M. Cariello,
M. Carlini,
V. Cataudella
, et al. (121 additional authors not shown)
Abstract:
Dark matter induced event rate in an Earth-based detector is predicted to show an annual modulation as a result of the Earth's orbital motion around the Sun. We searched for this modulation signature using the ionization signal of the DarkSide-50 liquid argon time projection chamber. No significant signature compatible with dark matter is observed in the electron recoil equivalent energy range above $40~{\rm eV_{ee}}$, the lowest threshold ever achieved in such a search.
Submitted 22 November, 2024; v1 submitted 14 July, 2023;
originally announced July 2023.
-
Hearing the voice of experts: Unveiling Stack Exchange communities' knowledge of test smells
Authors:
Luana Martins,
Denivan Campos,
Railana Santana,
Joselito Mota Junior,
Heitor Costa,
Ivan Machado
Abstract:
Refactorings are transformations to improve the code design without changing overall functionality and observable behavior. During the refactoring process of smelly test code, practitioners may struggle to identify refactoring candidates and define and apply corrective strategies. This paper reports on an empirical study aimed at understanding how test smells and test refactorings are discussed on the Stack Exchange network. Developers commonly count on Stack Exchange to pick the brains of the wise, i.e., to 'look up' how others are completing similar tasks. Therefore, in light of data from the Stack Exchange discussion topics, we could examine how developers understand and perceive test smells, the corrective actions they take to handle them, and the challenges they face when refactoring test code aiming to fix test smells. We observed that developers are interested in others' perceptions and hands-on experience handling test code issues. Besides, there is a clear indication that developers more often ask whether test smells or anti-patterns are good or bad testing practices than for code-based refactoring recommendations.
Submitted 5 May, 2023;
originally announced May 2023.
-
Noise-Robust Dense Retrieval via Contrastive Alignment Post Training
Authors:
Daniel Campos,
ChengXiang Zhai,
Alessandro Magnani
Abstract:
The success of contextual word representations and advances in neural information retrieval have made dense vector-based retrieval a standard approach for passage and document ranking. While effective and efficient, dual-encoders are brittle to variations in query distributions and noisy queries. Data augmentation can make models more robust but introduces overhead to training set generation and requires retraining and index regeneration. We present Contrastive Alignment POst Training (CAPOT), a highly efficient finetuning method that improves model robustness without requiring index regeneration or training set optimization or alteration. CAPOT enables robust retrieval by freezing the document encoder while the query encoder learns to align noisy queries with their unaltered root. We evaluate CAPOT on noisy variants of MSMARCO, Natural Questions, and Trivia QA passage retrieval, finding that CAPOT has a similar impact to data augmentation with none of its overhead.
Submitted 10 April, 2023; v1 submitted 6 April, 2023;
originally announced April 2023.
-
To Asymmetry and Beyond: Structured Pruning of Sequence to Sequence Models for Improved Inference Efficiency
Authors:
Daniel Campos,
ChengXiang Zhai
Abstract:
Sequence-to-sequence language models can be used to produce abstractive summaries which are coherent, relevant, and concise. Still, model sizes can make deployment in latency-sensitive or web-scale implementations difficult. This paper studies the relationship between model size, structured pruning, inference efficiency, and summarization accuracy on widely used summarization datasets. We show that model accuracy is tied to the encoder size while inference efficiency is connected to the decoder. Using asymmetric pruning can lead to nearly 3x improvement in inference latency with ~1 point loss in Rouge-2. Moreover, we find both the average degradation and the role of asymmetry to be consistent across model sizes and variations in datasets.
Submitted 12 June, 2023; v1 submitted 5 April, 2023;
originally announced April 2023.
-
Quick Dense Retrievers Consume KALE: Post Training Kullback Leibler Alignment of Embeddings for Asymmetrical dual encoders
Authors:
Daniel Campos,
Alessandro Magnani,
ChengXiang Zhai
Abstract:
In this paper, we consider the problem of improving the inference latency of language model-based dense retrieval systems by introducing structural compression and model size asymmetry between the context and query encoders. First, we investigate the impact of pre- and post-training compression on the MSMARCO, Natural Questions, TriviaQA, SQUAD, and SCIFACT datasets, finding that asymmetry in the dual encoders in dense retrieval can lead to improved inference efficiency. Knowing this, we introduce Kullback Leibler Alignment of Embeddings (KALE), an efficient and accurate method for increasing the inference efficiency of dense retrieval methods by pruning and aligning the query encoder after training. Specifically, KALE extends traditional Knowledge Distillation after bi-encoder training, allowing for effective query encoder compression without full retraining or index generation. Using KALE and asymmetric training, we can generate models which exceed the performance of DistilBERT despite having 3x faster inference.
Submitted 1 June, 2023; v1 submitted 31 March, 2023;
originally announced April 2023.
-
Dense Sparse Retrieval: Using Sparse Language Models for Inference Efficient Dense Retrieval
Authors:
Daniel Campos,
ChengXiang Zhai
Abstract:
Vector-based retrieval systems have become a common staple for academic and industrial search applications because they provide a simple and scalable way of extending the search to leverage contextual representations for documents and queries. As these vector-based systems rely on contextual language models, their usage commonly requires GPUs, which can be expensive and difficult to manage. Given recent advances in introducing sparsity into language models for improved inference efficiency, in this paper, we study how sparse language models can be used for dense retrieval to improve inference efficiency. Using the popular retrieval library Tevatron and the MSMARCO, NQ, and TriviaQA datasets, we find that sparse language models can be used as direct replacements with little to no drop in accuracy and up to 4.3x improved inference speeds.
Submitted 31 March, 2023;
originally announced April 2023.
-
oBERTa: Improving Sparse Transfer Learning via improved initialization, distillation, and pruning regimes
Authors:
Daniel Campos,
Alexandre Marques,
Mark Kurtz,
ChengXiang Zhai
Abstract:
In this paper, we introduce the range of oBERTa language models, an easy-to-use set of language models which allows Natural Language Processing (NLP) practitioners to obtain between 3.8 and 24.3 times faster models without expertise in model compression. Specifically, oBERTa extends existing work on pruning, knowledge distillation, and quantization and leverages frozen embeddings, improved distillation, and improved model initialization to deliver higher accuracy on a broad range of transfer tasks. In generating oBERTa, we explore how the highly optimized RoBERTa differs from BERT for pruning during pre-training and finetuning. We find it less amenable to compression during fine-tuning. We explore the use of oBERTa on seven representative NLP tasks and find that the improved compression techniques allow a pruned oBERTa model to match the performance of BERTbase and exceed the performance of Prune OFA Large on the SQUAD V1.1 Question Answering dataset, despite being 8x and 2x faster in inference, respectively. We release our code, training regimes, and associated model to encourage broad usage and experimentation.
Submitted 6 June, 2023; v1 submitted 29 March, 2023;
originally announced March 2023.
-
LightTS: Lightweight Time Series Classification with Adaptive Ensemble Distillation -- Extended Version
Authors:
David Campos,
Miao Zhang,
Bin Yang,
Tung Kieu,
Chenjuan Guo,
Christian S. Jensen
Abstract:
Due to the sweeping digitalization of processes, increasingly vast amounts of time series data are being produced. Accurate classification of such time series facilitates decision making in multiple domains. State-of-the-art classification accuracy is often achieved by ensemble learning where results are synthesized from multiple base models. This characteristic implies that ensemble learning needs substantial computing resources, preventing its use in resource-limited environments, such as edge devices. To extend the applicability of ensemble learning, we propose the LightTS framework that compresses large ensembles into lightweight models while ensuring competitive accuracy. First, we propose adaptive ensemble distillation that assigns adaptive weights to different base models such that their varying classification capabilities contribute purposefully to the training of the lightweight model. Second, we propose means of identifying Pareto optimal settings w.r.t. model accuracy and model size, thus enabling users with a space budget to select the most accurate lightweight model. We report on experiments using 128 real-world time series sets and different types of base models that justify key decisions in the design of LightTS and provide evidence that LightTS is able to outperform competitors.
Submitted 24 February, 2023;
originally announced February 2023.
-
Search for low mass dark matter in DarkSide-50: the bayesian network approach
Authors:
The DarkSide-50 Collaboration,
P. Agnes,
I. F. M. Albuquerque,
T. Alexander,
A. K. Alton,
M. Ave,
H. O. Back,
G. Batignani,
K. Biery,
V. Bocci,
W. M. Bonivento,
B. Bottino,
S. Bussino,
M. Cadeddu,
M. Cadoni,
F. Calaprice,
A. Caminata,
M. D. Campos,
N. Canci,
M. Caravati,
N. Cargioli,
M. Cariello,
M. Carlini,
V. Cataudella
, et al. (119 additional authors not shown)
Abstract:
We present a novel approach for the search of dark matter in the DarkSide-50 experiment, relying on Bayesian Networks. This method incorporates the detector response model into the likelihood function, explicitly maintaining the connection with the quantity of interest. No assumptions about the linearity of the problem or the shape of the probability distribution functions are required, and there is no need to morph signal and background spectra as a function of nuisance parameters. By expressing the problem in terms of Bayesian Networks, we have developed an inference algorithm based on a Markov Chain Monte Carlo to calculate the posterior probability. A clever description of the detector response model in terms of parametric matrices allows us to study the impact of systematic variations of any parameter on the final results. Our approach not only provides the desired information on the parameter of interest, but also potential constraints on the response model. Our results are consistent with recently published analyses and further refine the parameters of the detector response model.
Submitted 26 April, 2023; v1 submitted 3 February, 2023;
originally announced February 2023.
-
Compressing Cross-Lingual Multi-Task Models at Qualtrics
Authors:
Daniel Campos,
Daniel Perry,
Samir Joshi,
Yashmeet Gambhir,
Wei Du,
Zhengzheng Xing,
Aaron Colak
Abstract:
Experience management is an emerging business area where organizations focus on understanding the feedback of customers and employees in order to improve their end-to-end experiences. This results in a unique set of machine learning problems to help understand how people feel, discover issues they care about, and find which actions need to be taken on data that are different in content and distribution from traditional NLP domains. In this paper, we present a case study of building text analysis applications that perform multiple classification tasks efficiently in 12 languages in the nascent business area of experience management. In order to scale up modern ML methods on experience data, we leverage cross lingual and multi-task modeling techniques to consolidate our models into a single deployment to avoid overhead. We also make use of model compression and model distillation to reduce overall inference latency and hardware cost to the level acceptable for business needs while maintaining model prediction quality. Our findings show that multi-task modeling improves task performance for a subset of experience management tasks in both XLM-R and mBert architectures. Among the compressed architectures we explored, we found that MiniLM achieved the best compression/performance tradeoff. Our case study demonstrates a speedup of up to 15.61x with 2.60% average task degradation (or 3.29x speedup with 1.71% degradation) and estimated savings of 44% over using the original full-size model. These results demonstrate a successful scaling up of text classification for the challenging new area of ML for experience management.
Submitted 28 November, 2022;
originally announced November 2022.
-
Uma proposta metodologica para a aprendizagem: reflexao sobre as praticas pedagogicas da Estatistica ao elaborar os instrumentos de pesquisa sociais
Authors:
Manoel Benedito Nirdo da Silva Campos
Abstract:
This work presents a differentiated teaching proposal that allows the student to be the agent in the construction of knowledge, overcoming the difficulties that Mathematics presents. It aims to understand how the use of statistical tools can contribute to the improvement of the teaching-learning process and the construction of statistical knowledge, and was carried out with students from the University Campus of Rondonopolis/UFMT. To reach this objective, the didactic activities used in the teaching of basic statistics were analyzed with a qualitative-quantitative approach focused on everyday life, using software and the creation and simulation of models. The frequency of the students' attitudes was established through questionnaires and a four-point Likert scale, seeking data that would trace the profile of the actors involved and help in understanding the planned didactic activities. The study is justified by the lack of methodological and theoretical references on the subject in question. The didactic activities of Statistics were applied in alternating classes, traditional lectures and practical classes in a computer laboratory, and they constitute the main focus of characterization and reflection of this research during the teaching-learning process of Statistics. The results showed a correct attitude among those surveyed toward the methodological strategies used, which were adopted with reference to the research/action design. This study helped to motivate, to raise and answer questions, and to give meaning and understanding to work with Mathematical Modeling, as it can promote the improvement of the teaching-learning process and serves as an indispensable tool for education.
Submitted 11 October, 2022;
originally announced October 2022.