-
Temporal horizons in forecasting: a performance-learnability trade-off
Authors:
Pau Vilimelis Aceituno,
Jack William Miller,
Noah Marti,
Youssef Farag,
Victor Boussange
Abstract:
When training autoregressive models for dynamical systems, a critical question arises: how far into the future should the model be trained to predict? Too short a horizon may miss long-term trends, while too long a horizon can impede convergence due to accumulating prediction errors. In this work, we formalize this trade-off by analyzing how the geometry of the loss landscape depends on the traini…
▽ More
When training autoregressive models for dynamical systems, a critical question arises: how far into the future should the model be trained to predict? Too short a horizon may miss long-term trends, while too long a horizon can impede convergence due to accumulating prediction errors. In this work, we formalize this trade-off by analyzing how the geometry of the loss landscape depends on the training horizon. We prove that for chaotic systems, the loss landscape's roughness grows exponentially with the training horizon, while for limit cycles, it grows linearly, making long-horizon training inherently challenging. However, we also show that models trained on long horizons generalize well to short-term forecasts, whereas those trained on short horizons suffer exponentially (resp. linearly) worse long-term predictions in chaotic (resp. periodic) systems. We validate our theory through numerical experiments and discuss practical implications for selecting training horizons. Our results provide a principled foundation for hyperparameter optimization in autoregressive forecasting models.
△ Less
Submitted 4 June, 2025;
originally announced June 2025.
-
VibrantVS: A high-resolution multi-task transformer for forest canopy height estimation
Authors:
Tony Chang,
Kiarie Ndegwa,
Andreas Gros,
Vincent A. Landau,
Luke J. Zachmann,
Bogdan State,
Mitchell A. Gritts,
Colton W. Miller,
Nathan E. Rutenbeck,
Scott Conway,
Guy Bayes
Abstract:
This paper explores the application of a novel multi-task vision transformer (ViT) model for the estimation of canopy height models (CHMs) using 4-band National Agriculture Imagery Program (NAIP) imagery across the western United States. We compare the effectiveness of this model in terms of accuracy and precision aggregated across ecoregions and class heights versus three other benchmark peer-rev…
▽ More
This paper explores the application of a novel multi-task vision transformer (ViT) model for the estimation of canopy height models (CHMs) using 4-band National Agriculture Imagery Program (NAIP) imagery across the western United States. We compare the effectiveness of this model in terms of accuracy and precision aggregated across ecoregions and class heights versus three other benchmark peer-reviewed models. Key findings suggest that, while other benchmark models can provide high precision in localized areas, the VibrantVS model has substantial advantages across a broad reach of ecoregions in the western United States with higher accuracy, higher precision, the ability to generate updated inference at a cadence of three years or less, and high spatial resolution. The VibrantVS model provides significant value for ecological monitoring and land management decisions, including for wildfire mitigation.
△ Less
Submitted 24 January, 2025; v1 submitted 13 December, 2024;
originally announced December 2024.
-
Nonparametric IPSS: Fast, flexible feature selection with false discovery control
Authors:
Omar Melikechi,
David B. Dunson,
Jeffrey W. Miller
Abstract:
Feature selection is a critical task in machine learning and statistics. However, existing feature selection methods either (i) rely on parametric methods such as linear or generalized linear models, (ii) lack theoretical false discovery control, or (iii) identify few true positives. Here, we introduce a general feature selection method with finite-sample false discovery control based on applying…
▽ More
Feature selection is a critical task in machine learning and statistics. However, existing feature selection methods either (i) rely on parametric methods such as linear or generalized linear models, (ii) lack theoretical false discovery control, or (iii) identify few true positives. Here, we introduce a general feature selection method with finite-sample false discovery control based on applying integrated path stability selection (IPSS) to arbitrary feature importance scores. The method is nonparametric whenever the importance scores are nonparametric, and it estimates q-values, which are better suited to high-dimensional data than p-values. We focus on two special cases using importance scores from gradient boosting (IPSSGB) and random forests (IPSSRF). Extensive nonlinear simulations with RNA sequencing data show that both methods accurately control the false discovery rate and detect more true positives than existing methods. Both methods are also efficient, running in under 20 seconds when there are 500 samples and 5000 features. We apply IPSSGB and IPSSRF to detect microRNAs and genes related to cancer, finding that they yield better predictions with fewer features than existing approaches.
△ Less
Submitted 6 May, 2025; v1 submitted 3 October, 2024;
originally announced October 2024.
-
Governing dual-use technologies: Case studies of international security agreements and lessons for AI governance
Authors:
Akash R. Wasil,
Peter Barnett,
Michael Gerovitch,
Roman Hauksson,
Tom Reed,
Jack William Miller
Abstract:
International AI governance agreements and institutions may play an important role in reducing global security risks from advanced AI. To inform the design of such agreements and institutions, we conducted case studies of historical and contemporary international security agreements. We focused specifically on those arrangements around dual-use technologies, examining agreements in nuclear securit…
▽ More
International AI governance agreements and institutions may play an important role in reducing global security risks from advanced AI. To inform the design of such agreements and institutions, we conducted case studies of historical and contemporary international security agreements. We focused specifically on those arrangements around dual-use technologies, examining agreements in nuclear security, chemical weapons, biosecurity, and export controls. For each agreement, we examined four key areas: (a) purpose, (b) core powers, (c) governance structure, and (d) instances of non-compliance. From these case studies, we extracted lessons for the design of international AI agreements and governance institutions. We discuss the importance of robust verification methods, strategies for balancing power between nations, mechanisms for adapting to rapid technological change, approaches to managing trade-offs between transparency and security, incentives for participation, and effective enforcement mechanisms.
△ Less
Submitted 4 September, 2024;
originally announced September 2024.
-
Verification methods for international AI agreements
Authors:
Akash R. Wasil,
Tom Reed,
Jack William Miller,
Peter Barnett
Abstract:
What techniques can be used to verify compliance with international agreements about advanced AI development? In this paper, we examine 10 verification methods that could detect two types of potential violations: unauthorized AI training (e.g., training runs above a certain FLOP threshold) and unauthorized data centers. We divide the verification methods into three categories: (a) national technic…
▽ More
What techniques can be used to verify compliance with international agreements about advanced AI development? In this paper, we examine 10 verification methods that could detect two types of potential violations: unauthorized AI training (e.g., training runs above a certain FLOP threshold) and unauthorized data centers. We divide the verification methods into three categories: (a) national technical means (methods requiring minimal or no access from suspected non-compliant nations), (b) access-dependent methods (methods that require approval from the nation suspected of unauthorized activities), and (c) hardware-dependent methods (methods that require rules around advanced hardware). For each verification method, we provide a description, historical precedents, and possible evasion techniques. We conclude by offering recommendations for future work related to the verification and enforcement of international AI governance agreements.
△ Less
Submitted 4 November, 2024; v1 submitted 28 August, 2024;
originally announced August 2024.
-
Refinement of an Epilepsy Dictionary through Human Annotation of Health-related posts on Instagram
Authors:
Aehong Min,
Xuan Wang,
Rion Brattig Correia,
Jordan Rozum,
Wendy R. Miller,
Luis M. Rocha
Abstract:
We used a dictionary built from biomedical terminology extracted from various sources such as DrugBank, MedDRA, MedlinePlus, TCMGeneDIT, to tag more than 8 million Instagram posts by users who have mentioned an epilepsy-relevant drug at least once, between 2010 and early 2016. A random sample of 1,771 posts with 2,947 term matches was evaluated by human annotators to identify false-positives. Open…
▽ More
We used a dictionary built from biomedical terminology extracted from various sources such as DrugBank, MedDRA, MedlinePlus, TCMGeneDIT, to tag more than 8 million Instagram posts by users who have mentioned an epilepsy-relevant drug at least once, between 2010 and early 2016. A random sample of 1,771 posts with 2,947 term matches was evaluated by human annotators to identify false-positives. OpenAI's GPT series models were compared against human annotation. Frequent terms with a high false-positive rate were removed from the dictionary. Analysis of the estimated false-positive rates of the annotated terms revealed 8 ambiguous terms (plus synonyms) used in Instagram posts, which were removed from the original dictionary. To study the effect of removing those terms, we constructed knowledge networks using the refined and the original dictionaries and performed an eigenvector-centrality analysis on both networks. We show that the refined dictionary thus produced leads to a significantly different rank of important terms, as measured by their eigenvector-centrality of the knowledge networks. Furthermore, the most important terms obtained after refinement are of greater medical relevance. In addition, we show that OpenAI's GPT series models fare worse than human annotators in this task.
△ Less
Submitted 14 May, 2024;
originally announced May 2024.
-
myAURA: Personalized health library for epilepsy management via knowledge graph sparsification and visualization
Authors:
Rion Brattig Correia,
Jordan C. Rozum,
Leonard Cross,
Jack Felag,
Michael Gallant,
Ziqi Guo,
Bruce W. Herr II,
Aehong Min,
Deborah Stungis Rocha,
Xuan Wang,
Katy Börner,
Wendy Miller,
Luis M. Rocha
Abstract:
Objective: We report the development of the patient-centered myAURA application and suite of methods designed to aid epilepsy patients, caregivers, and researchers in making decisions about care and self-management.
Materials and Methods: myAURA rests on the federation of an unprecedented collection of heterogeneous data resources relevant to epilepsy, such as biomedical databases, social media,…
▽ More
Objective: We report the development of the patient-centered myAURA application and suite of methods designed to aid epilepsy patients, caregivers, and researchers in making decisions about care and self-management.
Materials and Methods: myAURA rests on the federation of an unprecedented collection of heterogeneous data resources relevant to epilepsy, such as biomedical databases, social media, and electronic health records. A generalizable, open-source methodology was developed to compute a multi-layer knowledge graph linking all this heterogeneous data via the terms of a human-centered biomedical dictionary.
Results: The power of the approach is first exemplified in the study of the drug-drug interaction phenomenon. Furthermore, we employ a novel network sparsification methodology using the metric backbone of weighted graphs, which reveals the most important edges for inference, recommendation, and visualization, such as pharmacology factors patients discuss on social media. The network sparsification approach also allows us to extract focused digital cohorts from social media whose discourse is more relevant to epilepsy or other biomedical problems. Finally, we present our patient-centered design and pilot-testing of myAURA, including its user interface, based on focus groups and other stakeholder input.
Discussion: The ability to search and explore myAURA's heterogeneous data sources via a sparsified multi-layer knowledge graph, as well as the combination of those layers in a single map, are useful features for integrating relevant information for epilepsy.
Conclusion: Our stakeholder-driven, scalable approach to integrate traditional and non-traditional data sources, enables biomedical discovery and data-powered patient self-management in epilepsy, and is generalizable to other chronic conditions.
△ Less
Submitted 10 May, 2024; v1 submitted 8 May, 2024;
originally announced May 2024.
-
Reproducible Parameter Inference Using Bagged Posteriors
Authors:
Jonathan H. Huggins,
Jeffrey W. Miller
Abstract:
Under model misspecification, it is known that Bayesian posteriors often do not properly quantify uncertainty about true or pseudo-true parameters. Even more fundamentally, misspecification leads to a lack of reproducibility in the sense that the same model will yield contradictory posteriors on independent data sets from the true distribution. To define a criterion for reproducible uncertainty qu…
▽ More
Under model misspecification, it is known that Bayesian posteriors often do not properly quantify uncertainty about true or pseudo-true parameters. Even more fundamentally, misspecification leads to a lack of reproducibility in the sense that the same model will yield contradictory posteriors on independent data sets from the true distribution. To define a criterion for reproducible uncertainty quantification under misspecification, we consider the probability that two confidence sets constructed from independent data sets have nonempty overlap, and we establish a lower bound on this overlap probability that holds for any valid confidence sets. We prove that credible sets from the standard posterior can strongly violate this bound, particularly in high-dimensional settings (i.e., with dimension increasing with sample size), indicating that it is not internally coherent under misspecification. To improve reproducibility in an easy-to-use and widely applicable way, we propose to apply bagging to the Bayesian posterior ("BayesBag"'); that is, to use the average of posterior distributions conditioned on bootstrapped datasets. We motivate BayesBag from first principles based on Jeffrey conditionalization and show that the bagged posterior typically satisfies the overlap lower bound. Further, we prove a Bernstein--Von Mises theorem for the bagged posterior, establishing its asymptotic normal distribution. We demonstrate the benefits of BayesBag via simulation experiments and an application to crime rate prediction.
△ Less
Submitted 3 November, 2023;
originally announced November 2023.
-
Explainable Equivariant Neural Networks for Particle Physics: PELICAN
Authors:
Alexander Bogatskiy,
Timothy Hoffman,
David W. Miller,
Jan T. Offermann,
Xiaoyang Liu
Abstract:
PELICAN is a novel permutation equivariant and Lorentz invariant or covariant aggregator network designed to overcome common limitations found in architectures applied to particle physics problems. Compared to many approaches that use non-specialized architectures that neglect underlying physics principles and require very large numbers of parameters, PELICAN employs a fundamentally symmetry group…
▽ More
PELICAN is a novel permutation equivariant and Lorentz invariant or covariant aggregator network designed to overcome common limitations found in architectures applied to particle physics problems. Compared to many approaches that use non-specialized architectures that neglect underlying physics principles and require very large numbers of parameters, PELICAN employs a fundamentally symmetry group-based architecture that demonstrates benefits in terms of reduced complexity, increased interpretability, and raw performance. We present a comprehensive study of the PELICAN algorithm architecture in the context of both tagging (classification) and reconstructing (regression) Lorentz-boosted top quarks, including the difficult task of specifically identifying and measuring the $W$-boson inside the dense environment of the Lorentz-boosted top-quark hadronic final state. We also extend the application of PELICAN to the tasks of identifying quark-initiated vs.~gluon-initiated jets, and a multi-class identification across five separate target categories of jets. When tested on the standard task of Lorentz-boosted top-quark tagging, PELICAN outperforms existing competitors with much lower model complexity and high sample efficiency. On the less common and more complex task of 4-momentum regression, PELICAN also outperforms hand-crafted, non-machine learning algorithms. We discuss the implications of symmetry-restricted architectures for the wider field of machine learning for physics.
△ Less
Submitted 23 February, 2024; v1 submitted 31 July, 2023;
originally announced July 2023.
-
Eigenvalue initialisation and regularisation for Koopman autoencoders
Authors:
Jack W. Miller,
Charles O'Neill,
Navid C. Constantinou,
Omri Azencot
Abstract:
Regularising the parameter matrices of neural networks is ubiquitous in training deep models. Typical regularisation approaches suggest initialising weights using small random values, and to penalise weights to promote sparsity. However, these widely used techniques may be less effective in certain scenarios. Here, we study the Koopman autoencoder model which includes an encoder, a Koopman operato…
▽ More
Regularising the parameter matrices of neural networks is ubiquitous in training deep models. Typical regularisation approaches suggest initialising weights using small random values, and to penalise weights to promote sparsity. However, these widely used techniques may be less effective in certain scenarios. Here, we study the Koopman autoencoder model which includes an encoder, a Koopman operator layer, and a decoder. These models have been designed and dedicated to tackle physics-related problems with interpretable dynamics and an ability to incorporate physics-related constraints. However, the majority of existing work employs standard regularisation practices. In our work, we take a step toward augmenting Koopman autoencoders with initialisation and penalty schemes tailored for physics-related settings. Specifically, we propose the "eigeninit" initialisation scheme that samples initial Koopman operators from specific eigenvalue distributions. In addition, we suggest the "eigenloss" penalty scheme that penalises the eigenvalues of the Koopman operator during training. We demonstrate the utility of these schemes on two synthetic data sets: a driven pendulum and flow past a cylinder; and two real-world problems: ocean surface temperatures and cyclone wind fields. We find on these datasets that eigenloss and eigeninit improves the convergence rate by up to a factor of 5, and that they reduce the cumulative long-term prediction error by up to a factor of 3. Such a finding points to the utility of incorporating similar schemes as an inductive bias in other physics-related deep learning approaches.
△ Less
Submitted 25 December, 2022; v1 submitted 22 December, 2022;
originally announced December 2022.
-
PELICAN: Permutation Equivariant and Lorentz Invariant or Covariant Aggregator Network for Particle Physics
Authors:
Alexander Bogatskiy,
Timothy Hoffman,
David W. Miller,
Jan T. Offermann
Abstract:
Many current approaches to machine learning in particle physics use generic architectures that require large numbers of parameters and disregard underlying physics principles, limiting their applicability as scientific modeling tools. In this work, we present a machine learning architecture that uses a set of inputs maximally reduced with respect to the full 6-dimensional Lorentz symmetry, and is…
▽ More
Many current approaches to machine learning in particle physics use generic architectures that require large numbers of parameters and disregard underlying physics principles, limiting their applicability as scientific modeling tools. In this work, we present a machine learning architecture that uses a set of inputs maximally reduced with respect to the full 6-dimensional Lorentz symmetry, and is fully permutation-equivariant throughout. We study the application of this network architecture to the standard task of top quark tagging and show that the resulting network outperforms all existing competitors despite much lower model complexity. In addition, we present a Lorentz-covariant variant of the same network applied to a 4-momentum regression task.
△ Less
Submitted 23 December, 2022; v1 submitted 1 November, 2022;
originally announced November 2022.
-
Innovations in trigger and data acquisition systems for next-generation physics facilities
Authors:
Rainer Bartoldus,
Catrin Bernius,
David W. Miller
Abstract:
Data-intensive physics facilities are increasingly reliant on heterogeneous and large-scale data processing and computational systems in order to collect, distribute, process, filter, and analyze the ever increasing huge volumes of data being collected. Moreover, these tasks are often performed in hard real-time or quasi real-time processing pipelines that place extreme constraints on various para…
▽ More
Data-intensive physics facilities are increasingly reliant on heterogeneous and large-scale data processing and computational systems in order to collect, distribute, process, filter, and analyze the ever increasing huge volumes of data being collected. Moreover, these tasks are often performed in hard real-time or quasi real-time processing pipelines that place extreme constraints on various parameters and design choices for those systems. Consequently, a large number and variety of challenges are faced to design, construct, and operate such facilities. This is especially true at the energy and intensity frontiers of particle physics where bandwidths of raw data can exceed 100 TB/s of heterogeneous, high-dimensional data sourced from 300M+ individual sensors. Data filtering and compression algorithms deployed at these facilities often operate at the level of 1 part in $10^5$, and once executed, these algorithms drive the data curation process, further highlighting the critical roles that these systems have in the physics impact of those endeavors. This White Paper aims to highlight the challenges that these facilities face in the design of the trigger and data acquisition instrumentation and systems, as well as in their installation, commissioning, integration and operation, and in building the domain knowledge and technical expertise required to do so.
△ Less
Submitted 17 March, 2022; v1 submitted 14 March, 2022;
originally announced March 2022.
-
Symmetry Group Equivariant Architectures for Physics
Authors:
Alexander Bogatskiy,
Sanmay Ganguly,
Thomas Kipf,
Risi Kondor,
David W. Miller,
Daniel Murnane,
Jan T. Offermann,
Mariel Pettee,
Phiala Shanahan,
Chase Shimmin,
Savannah Thais
Abstract:
Physical theories grounded in mathematical symmetries are an essential component of our understanding of a wide range of properties of the universe. Similarly, in the domain of machine learning, an awareness of symmetries such as rotation or permutation invariance has driven impressive performance breakthroughs in computer vision, natural language processing, and other important applications. In t…
▽ More
Physical theories grounded in mathematical symmetries are an essential component of our understanding of a wide range of properties of the universe. Similarly, in the domain of machine learning, an awareness of symmetries such as rotation or permutation invariance has driven impressive performance breakthroughs in computer vision, natural language processing, and other important applications. In this report, we argue that both the physics community and the broader machine learning community have much to understand and potentially to gain from a deeper investment in research concerning symmetry group equivariant machine learning architectures. For some applications, the introduction of symmetries into the fundamental structural design can yield models that are more economical (i.e. contain fewer, but more expressive, learned parameters), interpretable (i.e. more explainable or directly mappable to physical quantities), and/or trainable (i.e. more efficient in both data and computational requirements). We discuss various figures of merit for evaluating these models as well as some potential benefits and limitations of these methods for a variety of physics applications. Research and investment into these approaches will lay the foundation for future architectures that are potentially more robust under new computational paradigms and will provide a richer description of the physical systems to which they are applied.
△ Less
Submitted 11 March, 2022;
originally announced March 2022.
-
The RATTLE Motion Planning Algorithm for Robust Online Parametric Model Improvement with On-Orbit Validation
Authors:
Keenan Albee,
Monica Ekal,
Brian Coltin,
Rodrigo Ventura,
Richard Linares,
David W. Miller
Abstract:
Certain forms of uncertainty that robotic systems encounter can be explicitly learned within the context of a known model, like parametric model uncertainties such as mass and moments of inertia. Quantifying such parametric uncertainty is important for more accurate prediction of the system behavior, leading to safe and precise task execution. In tandem, providing a form of robustness guarantee ag…
▽ More
Certain forms of uncertainty that robotic systems encounter can be explicitly learned within the context of a known model, like parametric model uncertainties such as mass and moments of inertia. Quantifying such parametric uncertainty is important for more accurate prediction of the system behavior, leading to safe and precise task execution. In tandem, providing a form of robustness guarantee against prevailing uncertainty levels like environmental disturbances and current model knowledge is also desirable. To that end, the authors' previously proposed RATTLE algorithm, a framework for online information-aware motion planning, is outlined and extended to enhance its applicability to real robotic systems. RATTLE provides a clear tradeoff between information-seeking motion and traditional goal-achieving motion and features online-updateable models. Additionally, online-updateable low level control robustness guarantees and a new method for automatic adjustment of information content down to a specified estimation precision is proposed. Results of extensive experimentation in microgravity using the Astrobee robots aboard the International Space Station and practical implementation details are presented, demonstrating RATTLE's capabilities for real-time, robust, online-updateable, and model information-seeking motion planning capabilities under parametric uncertainty.
△ Less
Submitted 3 March, 2022;
originally announced March 2022.
-
Small Cohort of Epilepsy Patients Showed Increased Activity on Facebook before Sudden Unexpected Death
Authors:
Ian B. Wood,
Rion Brattig Correia,
Wendy R. Miller,
Luis M. Rocha
Abstract:
Sudden Unexpected Death in Epilepsy (SUDEP) remains a leading cause of death in people with epilepsy. Despite the constant risk for patients and bereavement to family members, to date the physiological mechanisms of SUDEP remain unknown. Here we explore the potential to identify putative predictive signals of SUDEP from online digital behavioral data using text and sentiment analysis. Specifically…
▽ More
Sudden Unexpected Death in Epilepsy (SUDEP) remains a leading cause of death in people with epilepsy. Despite the constant risk for patients and bereavement to family members, to date the physiological mechanisms of SUDEP remain unknown. Here we explore the potential to identify putative predictive signals of SUDEP from online digital behavioral data using text and sentiment analysis. Specifically, we analyze Facebook timelines of six epilepsy patients deceased due to SUDEP, donated by surviving family members. We find preliminary evidence for behavioral changes detectable by text and sentiment analysis tools. Namely, in the months preceding their SUDEP event patient social media timelines show: i) increase in verbosity; ii) increased use of functional words; and iii) sentiment shifts as measured by different sentiment analysis tools. Combined, these results suggest that social media engagement, as well as its sentiment, may serve as possible early-warning signals for SUDEP in people with epilepsy. While the small sample of patient timelines analyzed in this study prevents generalization, our preliminary investigation demonstrates the potential of social media data as complementary data in larger studies of SUDEP and epilepsy.
△ Less
Submitted 19 January, 2022;
originally announced January 2022.
-
Online Information-Aware Motion Planning with Inertial Parameter Learning for Robotic Free-Flyers
Authors:
Monica Ekal,
Keenan Albee,
Brian Coltin,
Rodrigo Ventura,
Richard Linares,
David W. Miller
Abstract:
Space free-flyers like the Astrobee robots currently operating aboard the International Space Station must operate with inherent system uncertainties. Parametric uncertainties like mass and moment of inertia are especially important to quantify in these safety-critical space systems and can change in scenarios such as on-orbit cargo movement, where unknown grappled payloads significantly change th…
▽ More
Space free-flyers like the Astrobee robots currently operating aboard the International Space Station must operate with inherent system uncertainties. Parametric uncertainties like mass and moment of inertia are especially important to quantify in these safety-critical space systems and can change in scenarios such as on-orbit cargo movement, where unknown grappled payloads significantly change the system dynamics. Cautiously learning these uncertainties en route can potentially avoid time- and fuel-consuming pure system identification maneuvers. Recognizing this, this work proposes RATTLE, an online information-aware motion planning algorithm that explicitly weights parametric model-learning coupled with real-time replanning capability that can take advantage of improved system models. The method consists of a two-tiered (global and local) planner, a low-level model predictive controller, and an online parameter estimator that produces estimates of the robot's inertial properties for more informed control and replanning on-the-fly; all levels of the planning and control feature online update-able models. Simulation results of RATTLE for the Astrobee free-flyer grappling an uncertain payload are presented alongside results of a hardware demonstration showcasing the ability to explicitly encourage model parametric learning while achieving otherwise useful motion.
△ Less
Submitted 10 December, 2021;
originally announced December 2021.
-
Towards an Interpretable Data-driven Trigger System for High-throughput Physics Facilities
Authors:
Chinmaya Mahesh,
Kristin Dona,
David W. Miller,
Yuxin Chen
Abstract:
Data-intensive science is increasingly reliant on real-time processing capabilities and machine learning workflows, in order to filter and analyze the extreme volumes of data being collected. This is especially true at the energy and intensity frontiers of particle physics where bandwidths of raw data can exceed 100 Tb/s of heterogeneous, high-dimensional data sourced from hundreds of millions of…
▽ More
Data-intensive science is increasingly reliant on real-time processing capabilities and machine learning workflows, in order to filter and analyze the extreme volumes of data being collected. This is especially true at the energy and intensity frontiers of particle physics where bandwidths of raw data can exceed 100 Tb/s of heterogeneous, high-dimensional data sourced from hundreds of millions of individual sensors. In this paper, we introduce a new data-driven approach for designing and optimizing high-throughput data filtering and trigger systems such as those in use at physics facilities like the Large Hadron Collider (LHC). Concretely, our goal is to design a data-driven filtering system with a minimal run-time cost for determining which data event to keep, while preserving (and potentially improving upon) the distribution of the output as generated by the hand-designed trigger system. We introduce key insights from interpretable predictive modeling and cost-sensitive learning in order to account for non-local inefficiencies in the current paradigm and construct a cost-effective data filtering and trigger model that does not compromise physics coverage.
△ Less
Submitted 14 April, 2021;
originally announced April 2021.
-
Improving Neural Networks for Time Series Forecasting using Data Augmentation and AutoML
Authors:
Indrajeet Y. Javeri,
Mohammadhossein Toutiaee,
Ismailcem B. Arpinar,
Tom W. Miller,
John A. Miller
Abstract:
Statistical methods such as the Box-Jenkins method for time-series forecasting have been prominent since their development in 1970. Many researchers rely on such models as they can be efficiently estimated and also provide interpretability. However, advances in machine learning research indicate that neural networks can be powerful data modeling techniques, as they can give higher accuracy for a p…
▽ More
Statistical methods such as the Box-Jenkins method for time-series forecasting have been prominent since their development in 1970. Many researchers rely on such models as they can be efficiently estimated and also provide interpretability. However, advances in machine learning research indicate that neural networks can be powerful data modeling techniques, as they can give higher accuracy for a plethora of learning problems and datasets. In the past, they have been tried on time-series forecasting as well, but their overall results have not been significantly better than the statistical models especially for intermediate length times series data. Their modeling capacities are limited in cases where enough data may not be available to estimate the large number of parameters that these non-linear models require. This paper presents an easy to implement data augmentation method to significantly improve the performance of such networks. Our method, Augmented-Neural-Network, which involves using forecasts from statistical models, can help unlock the power of neural networks on intermediate length time-series and produces competitive results. It shows that data augmentation, when paired with Automated Machine Learning techniques such as Neural Architecture Search, can help to find the best neural architecture for a given time-series. Using the combination of these, demonstrates significant enhancement in the forecasting accuracy of three neural network-based models for a COVID-19 dataset, with a maximum improvement in forecasting accuracy by 21.41%, 24.29%, and 16.42%, respectively, over the neural networks that do not use augmented data.
△ Less
Submitted 7 May, 2021; v1 submitted 2 March, 2021;
originally announced March 2021.
-
Lorentz Group Equivariant Neural Network for Particle Physics
Authors:
Alexander Bogatskiy,
Brandon Anderson,
Jan T. Offermann,
Marwah Roussi,
David W. Miller,
Risi Kondor
Abstract:
We present a neural network architecture that is fully equivariant with respect to transformations under the Lorentz group, a fundamental symmetry of space and time in physics. The architecture is based on the theory of the finite-dimensional representations of the Lorentz group and the equivariant nonlinearity involves the tensor product. For classification tasks in particle physics, we demonstra…
▽ More
We present a neural network architecture that is fully equivariant with respect to transformations under the Lorentz group, a fundamental symmetry of space and time in physics. The architecture is based on the theory of the finite-dimensional representations of the Lorentz group and the equivariant nonlinearity involves the tensor product. For classification tasks in particle physics, we demonstrate that such an equivariant architecture leads to drastically simpler models that have relatively few learnable parameters and are much more physically interpretable than leading approaches that use CNNs and point cloud approaches. The competitive performance of the network is demonstrated on a public classification dataset [27] for tagging top quark decays given energy-momenta of jet constituents produced in proton-proton collisions.
△ Less
Submitted 8 June, 2020;
originally announced June 2020.
-
Design, Construction, and Use of a Single Board Computer Beowulf Cluster: Application of the Small-Footprint, Low-Cost, InSignal 5420 Octa Board
Authors:
James J. Cusick,
William Miller,
Nicholas Laurita,
Tasha Pitt
Abstract:
In recent years development in the area of Single Board Computing has been advancing rapidly. At Wolters Kluwer's Corporate Legal Services Division a prototyping effort was undertaken to establish the utility of such devices for practical and general computing needs. This paper presents the background of this work, the design and construction of a 64 core 96 GHz cluster, and their possibility of y…
▽ More
In recent years development in the area of Single Board Computing has been advancing rapidly. At Wolters Kluwer's Corporate Legal Services Division a prototyping effort was undertaken to establish the utility of such devices for practical and general computing needs. This paper presents the background of this work, the design and construction of a 64 core 96 GHz cluster, and their possibility of yielding approximately 400 GFLOPs from a set of small footprint InSignal boards created for just over $2,300. Additionally this paper discusses the software environment on the cluster, the use of a standard Beowulf library and its operation, as well as other software application uses including Elastic Search and ownCloud. Finally, consideration will be given to the future use of such technologies in a business setting in order to introduce new Open Source technologies, reduce computing costs, and improve Time to Market.
Index Terms: Single Board Computing, Raspberry Pi, InSignal Exynos 5420, Linaro Ubuntu Linux, High Performance Computing, Beowulf clustering, Open Source, MySQL, MongoDB, ownCloud, Computing Architectures, Parallel Computing, Cluster Computing
△ Less
Submitted 5 January, 2015; v1 submitted 30 December, 2014;
originally announced January 2015.
-
Evolution of choices over time: The U.S. Presidential election 2012 and the NY City Mayoral Election, 2013
Authors:
Mukkai Krishnamoorthy,
Wesley Miller,
Raju Krishnamoorthy
Abstract:
We conducted surveys before and after the 2012 U.S. Presidential election and prior to the NY City Mayoral election in 2013. The surveys were done using Amazon Turk. This poster describes the results of our analysis of the surveys and predicts the winner of the NY City Mayoral Election.
We conducted surveys before and after the 2012 U.S. Presidential election and prior to the NY City Mayoral election in 2013. The surveys were done using Amazon Turk. This poster describes the results of our analysis of the surveys and predicts the winner of the NY City Mayoral Election.
△ Less
Submitted 8 October, 2013; v1 submitted 3 October, 2013;
originally announced October 2013.
-
A Polynomial Time Algorithm for Finding Bayesian Probabilities from Marginal Constraints
Authors:
J. W. Miller,
R. M. Goodman
Abstract:
A method of calculating probability values from a system of marginal constraints is presented. Previous systems for finding the probability of a single attribute have either made an independence assumption concerning the evidence or have required, in the worst case, time exponential in the number of attributes of the system. In this paper a closed form solution to the probability of an attribute g…
▽ More
A method of calculating probability values from a system of marginal constraints is presented. Previous systems for finding the probability of a single attribute have either made an independence assumption concerning the evidence or have required, in the worst case, time exponential in the number of attributes of the system. In this paper a closed form solution to the probability of an attribute given the evidence is found. The closed form solution, however does not enforce the (non-linear) constraint that all terms in the underlying distribution be positive. The equation requires O(r^3) steps to evaluate, where r is the number of independent marginal constraints describing the system at the time of evaluation. Furthermore, a marginal constraint may be exchanged with a new constraint, and a new solution calculated in O(r^2) steps. This method is appropriate for calculating probabilities in a real time expert system
△ Less
Submitted 27 March, 2013;
originally announced April 2013.
-
Reduced Criteria for Degree Sequences
Authors:
Jeffrey W. Miller
Abstract:
For many types of graphs, criteria have been discovered that give necessary and sufficient conditions for an integer sequence to be the degree sequence of such a graph. These criteria tend to take the form of a set of inequalities, and in the case of the Erdős-Gallai criterion (for simple undirected graphs) and the Gale-Ryser criterion (for bipartite graphs), it has been shown that the number of i…
▽ More
For many types of graphs, criteria have been discovered that give necessary and sufficient conditions for an integer sequence to be the degree sequence of such a graph. These criteria tend to take the form of a set of inequalities, and in the case of the Erdős-Gallai criterion (for simple undirected graphs) and the Gale-Ryser criterion (for bipartite graphs), it has been shown that the number of inequalities that must be checked can be reduced significantly. We show that similar reductions hold for the corresponding criteria for many other types of graphs, including bipartite r-multigraphs, bipartite graphs with structural edges, directed graphs, r-multigraphs, and tournaments. We also prove a reduction for imbalance sequences.
△ Less
Submitted 12 January, 2013; v1 submitted 11 May, 2012;
originally announced May 2012.
-
Exact Enumeration and Sampling of Matrices with Specified Margins
Authors:
Jeffrey W. Miller,
Matthew T. Harrison
Abstract:
We describe a dynamic programming algorithm for exact counting and exact uniform sampling of matrices with specified row and column sums. The algorithm runs in polynomial time when the column sums are bounded. Binary or non-negative integer matrices are handled. The method is distinguished by applicability to non-regular margins, tractability on large matrices, and the capacity for exact sampling.
We describe a dynamic programming algorithm for exact counting and exact uniform sampling of matrices with specified row and column sums. The algorithm runs in polynomial time when the column sums are bounded. Binary or non-negative integer matrices are handled. The method is distinguished by applicability to non-regular margins, tractability on large matrices, and the capacity for exact sampling.
△ Less
Submitted 2 April, 2011;
originally announced April 2011.
-
A student's guide to searching the literature using online databases
Authors:
Casey W. Miller,
Michelle D. Chabot,
Troy C. Messina
Abstract:
A method is described to empower students to efficiently perform general and literature searches using online resources. The method was tested on undergraduate and graduate students with varying backgrounds with scientific literature. Students involved in this study showed marked improvement in their awareness of how and where to find accurate scientific information.
A method is described to empower students to efficiently perform general and literature searches using online resources. The method was tested on undergraduate and graduate students with varying backgrounds with scientific literature. Students involved in this study showed marked improvement in their awareness of how and where to find accurate scientific information.
△ Less
Submitted 3 March, 2010;
originally announced March 2010.
-
A Robust System for Natural Spoken Dialogue
Authors:
James F. Allen,
Bradford W. Miller,
Eric K. Ringger,
Teresa Sikorski
Abstract:
This paper describes a system that leads us to believe in the feasibility of constructing natural spoken dialogue systems in task-oriented domains. It specifically addresses the issue of robust interpretation of speech in the presence of recognition errors. Robustness is achieved by a combination of statistical error post-correction, syntactically- and semantically-driven robust parsing, and ext…
▽ More
This paper describes a system that leads us to believe in the feasibility of constructing natural spoken dialogue systems in task-oriented domains. It specifically addresses the issue of robust interpretation of speech in the presence of recognition errors. Robustness is achieved by a combination of statistical error post-correction, syntactically- and semantically-driven robust parsing, and extensive use of the dialogue context. We present an evaluation of the system using time-to-completion and the quality of the final solution that suggests that most native speakers of English can use the system successfully with virtually no training.
△ Less
Submitted 18 June, 1996;
originally announced June 1996.