-
Developing 3D Virtual Safety Risk Terrain for UAS Operations in Complex Urban Environments
Authors:
Zhenyu Gao,
John-Paul Clarke,
Javid Mardanov,
Karen Marais
Abstract:
Unmanned Aerial Systems (UAS), an integral part of the Advanced Air Mobility (AAM) vision, are capable of performing a wide spectrum of tasks in urban environments. The societal integration of UAS is a pivotal challenge, as these systems must operate harmoniously within the constraints imposed by regulations and societal concerns. In complex urban environments, UAS safety has been a perennial obstacle to their large-scale deployment. To mitigate UAS safety risk and facilitate risk-aware UAS operations planning, we propose a novel concept called \textit{3D virtual risk terrain}. This concept converts public risk constraints in an urban environment into 3D exclusion zones that UAS operations should avoid to adequately reduce risk to Entities of Value (EoV). To implement the 3D virtual risk terrain, we develop a conditional probability framework that comprehensively integrates most existing basic models for UAS ground risk. To demonstrate the concept, we build risk terrains on a Chicago downtown model and observe their characteristics under different conditions. We believe that the 3D virtual risk terrain has the potential to become a new routine tool for risk-aware UAS operations planning, urban airspace management, and policy development. The same idea can also be extended to other forms of societal impacts, such as noise, privacy, and perceived risk.
Submitted 18 October, 2023;
originally announced October 2023.
-
Target Network and Truncation Overcome The Deadly Triad in $Q$-Learning
Authors:
Zaiwei Chen,
John Paul Clarke,
Siva Theja Maguluri
Abstract:
$Q$-learning with function approximation is one of the most empirically successful while theoretically mysterious reinforcement learning (RL) algorithms, and was identified in Sutton (1999) as one of the most important theoretical open problems in the RL community. Even in the basic linear function approximation setting, there are well-known divergent examples. In this work, we show that \textit{target network} and \textit{truncation} together are enough to provably stabilize $Q$-learning with linear function approximation, and we establish the finite-sample guarantees. The result implies an $O(\epsilon^{-2})$ sample complexity up to a function approximation error. Moreover, our results do not require strong assumptions or modifying the problem parameters as in existing literature.
Submitted 3 May, 2022; v1 submitted 4 March, 2022;
originally announced March 2022.
-
Predictive Criteria for Prior Selection Using Shrinkage in Linear Models
Authors:
Dean Dustin,
Bertrand Clarke,
Jennifer Clarke
Abstract:
Choosing a shrinkage method can be done by selecting a penalty from a list of pre-specified penalties or by constructing a penalty based on the data. If a list of penalties for a class of linear models is given, we provide comparisons based on sample size and number of non-zero parameters under a predictive stability criterion based on data perturbation. These comparisons provide recommendations for penalty selection in a variety of settings. If the preference is to construct a penalty customized for a given problem, then we propose a technique based on genetic algorithms, again using a predictive criterion. We find that, in general, a custom penalty never performs worse than any commonly used penalty, but that there are cases where the custom penalty reduces to a recognizable penalty. Since penalty selection is mathematically equivalent to prior selection, our method also constructs priors.
The techniques and recommendations we offer are intended for finite sample cases. In this context, we argue that predictive stability under perturbation is one of the few relevant properties that can be invoked when the true model is not known. Nevertheless, we study variable inclusion in simulations and, as part of our shrinkage selection strategy, we include oracle property considerations. In particular, we see that the oracle property typically holds for penalties that satisfy basic regularity conditions and therefore is not restrictive enough to play a direct role in penalty selection. In addition, our real data example also includes considerations emerging from model mis-specification.
Submitted 6 January, 2022;
originally announced January 2022.
-
Modeling association in microbial communities with clique loglinear models
Authors:
Adrian Dobra,
Camilo Valdes,
Dragana Ajdic,
Bertrand Clarke,
Jennifer Clarke
Abstract:
There is a growing awareness of the important roles that microbial communities play in complex biological processes. Modern investigation of these often uses next generation sequencing of metagenomic samples to determine community composition. We propose a statistical technique based on clique loglinear models and Bayes model averaging to identify microbial components in a metagenomic sample at various taxonomic levels that have significant associations. We describe the model class, a stochastic search technique for model selection, and the calculation of estimates of posterior probabilities of interest. We demonstrate our approach using data from the Human Microbiome Project and from a study of the skin microbiome in chronic wound healing. Our technique also identifies significant dependencies among microbial components as evidence of possible microbial syntrophy.
KEYWORDS: contingency tables, graphical models, model selection, microbiome, next generation sequencing
Submitted 23 January, 2018;
originally announced January 2018.
-
Clustering categorical data via ensembling dissimilarity matrices
Authors:
Saeid Amiri,
Bertrand Clarke,
Jennifer Clarke
Abstract:
We present a technique for clustering categorical data by generating many dissimilarity matrices and averaging over them. We begin by demonstrating our technique on low dimensional categorical data and comparing it to several other techniques that have been proposed. Then we give conditions under which our method should yield good results in general. Our method extends to high dimensional categorical data of equal lengths by ensembling over many choices of explanatory variables. In this context we compare our method with two other methods. Finally, we extend our method to high dimensional categorical data vectors of unequal length by using alignment techniques to equalize the lengths. We give examples to show that our method continues to provide good results, in particular, better in the context of genome sequences than clusterings suggested by phylogenetic trees.
Submitted 25 June, 2015;
originally announced June 2015.
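The averaging idea in the abstract above can be illustrated with a small sketch. This is not the authors' procedure: it assumes equal-length categorical vectors, uses plain Hamming dissimilarity on randomly chosen subsets of variables, and averages the resulting matrices. The function names `hamming` and `ensemble_dissimilarity`, the parameters `n_draws` and `subset_size`, and the toy sequences are all hypothetical.

```python
import random

def hamming(a, b, cols):
    """Fraction of the selected columns on which two categorical vectors differ."""
    return sum(a[c] != b[c] for c in cols) / len(cols)

def ensemble_dissimilarity(rows, n_draws=200, subset_size=2, seed=0):
    """Average Hamming dissimilarity matrices computed on random variable subsets.

    Assumption: every row has the same length. Each draw picks `subset_size`
    columns without replacement and contributes one dissimilarity matrix.
    """
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    total = [[0.0] * n for _ in range(n)]
    for _ in range(n_draws):
        cols = rng.sample(range(p), subset_size)
        for i in range(n):
            for j in range(i + 1, n):
                d = hamming(rows[i], rows[j], cols)
                total[i][j] += d
                total[j][i] += d
    return [[v / n_draws for v in row] for row in total]

# Toy data: rows 0 and 1 differ in one position; rows 0 and 2 differ everywhere.
rows = ["AAAA", "AAAT", "GGCC", "GGCA"]
D = ensemble_dissimilarity(rows)
```

The averaged matrix `D` could then be handed to any dissimilarity-based clustering routine; which routine the authors use is not stated in the abstract.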
-
A General Hybrid Clustering Technique
Authors:
Saeid Amiri,
Bertrand Clarke,
Jennifer Clarke,
Hoyt A. Koepke
Abstract:
Here, we propose a clustering technique for general clustering problems including those that have non-convex clusters. For a given desired number of clusters $K$, we use three stages to find a clustering. The first stage uses a hybrid clustering technique to produce a series of clusterings of various sizes (randomly selected). The key steps are to find a $K$-means clustering using $K_\ell$ clusters where $K_\ell \gg K$ and then join these small clusters by using single linkage clustering. The second stage stabilizes the result of stage one by reclustering via the `membership matrix' under Hamming distance to generate a dendrogram. The third stage is to cut the dendrogram to get $K^*$ clusters where $K^* \geq K$ and then prune back to $K$ to give a final clustering. A variant on our technique also gives a reasonable estimate for $K_T$, the true number of clusters.
We provide a series of arguments to justify the steps in the stages of our methods and we provide numerous examples involving real and simulated data to compare our technique with other related techniques.
Submitted 5 March, 2015; v1 submitted 3 March, 2015;
originally announced March 2015.
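The first stage described in the abstract above (a $K$-means clustering with $K_\ell \gg K$ small clusters, followed by single linkage joining) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it covers only one run of stage one, uses deterministic evenly spaced initial centers instead of random ones, and omits the stabilization and pruning stages. The names `kmeans`, `single_linkage_merge`, `hybrid_cluster`, and `k_small` are hypothetical.

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd iterations with deterministic, evenly spaced initial centers."""
    n = len(points)
    centers = [points[i * n // k] for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Move each non-empty center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

def single_linkage_merge(points, labels, k):
    """Join small clusters by single linkage until only k clusters remain."""
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, []).append(i)
    clusters = list(groups.values())

    def gap(a, b):
        # Single linkage: smallest point-to-point distance between two clusters.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > k:
        pairs = ((x, y) for x in range(len(clusters))
                 for y in range(x + 1, len(clusters)))
        a, b = min(pairs, key=lambda xy: gap(clusters[xy[0]], clusters[xy[1]]))
        clusters[a].extend(clusters.pop(b))  # b > a, so index a stays valid

    out = [0] * len(points)
    for c, members in enumerate(clusters):
        for i in members:
            out[i] = c
    return out

def hybrid_cluster(points, k, k_small):
    """K-means with k_small >> k clusters, then single linkage down to k."""
    return single_linkage_merge(points, kmeans(points, k_small), k)

# Toy data: two long parallel strips, 51 points each, 10 units apart.
strip_a = [(float(x), 0.0) for x in range(51)]
strip_b = [(float(x), 10.0) for x in range(51)]
labels = hybrid_cluster(strip_a + strip_b, k=2, k_small=17)
```

On this toy input each small cluster stays inside one strip, so the single linkage step joins neighbours along a strip before it ever bridges the gap between strips, and `labels` separates the two strips.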
-
Topological and Statistical Behavior Classifiers for Tracking Applications
Authors:
Paul Bendich,
Sang Chin,
Jesse Clarke,
Jonathan deSena,
John Harer,
Elizabeth Munch,
Andrew Newman,
David Porter,
David Rouse,
Nate Strawn,
Adam Watkins
Abstract:
We introduce the first unified theory for target tracking using Multiple Hypothesis Tracking, Topological Data Analysis, and machine learning. Our string of innovations is as follows: 1) robust topological features are used to encode behavioral information, 2) statistical models are fitted to distributions over these topological features, and 3) the target type classification methods of Wigren and Bar Shalom et al. are employed to exploit the resulting likelihoods for topological features inside of the tracking procedure. To demonstrate the efficacy of our approach, we test our procedure on synthetic vehicular data generated by the Simulation of Urban Mobility package.
Submitted 1 June, 2014;
originally announced June 2014.
-
An ensemble approach to improved prediction from multitype data
Authors:
Jennifer Clarke,
David Seo
Abstract:
We have developed a strategy for the analysis of newly available binary data to improve outcome predictions based on existing data (binary or non-binary). Our strategy involves two modeling approaches for the newly available data, one combining binary covariate selection via LASSO with logistic regression and one based on logic trees. The results of these models are then compared to the results of a model based on existing data with the objective of combining model results to achieve the most accurate predictions. The combination of model predictions is aided by the use of support vector machines to identify subspaces of the covariate space in which specific models lead to successful predictions. We demonstrate our approach in the analysis of single nucleotide polymorphism (SNP) data and traditional clinical risk factors for the prediction of coronary heart disease.
Submitted 21 May, 2008;
originally announced May 2008.