Mode Selection in Cognitive Radar Networks

Authors: William W. Howard, Samuel R. Shebert, Anthony F. Martone, R. Michael Buehrer

Abstract: Cognitive Radar Networks, which were popularized by Simon Haykin in 2006, have been proposed to address limitations with legacy radar installations. These limitations include large physical size, power consumption, fixed operating parameters, and single point vulnerabilities. Cognitive radar solves part of this problem through adaptability, using biologically inspired techniques to observe the environment and adjust operation accordingly. Cognitive radar networks (CRNs) extend the capabilities of cognitive radar spatially, providing the opportunity to observe targets from multiple angles to mitigate stealth effects; distribute resources over space and in time; obtain better tracking performance; and gain more information from a scene. Often, problems of cognition in CRNs are viewed through the lens of iterative learning problems - one or multiple cognitive processes are implemented in the network, where each process first observes the environment, then selects operating parameters (from discrete or continuous options) using the history of observations and previous rewards, then repeats the cycle. Further, cognitive radar networks often are modeled with a flexible architecture and wide-bandwidth front-ends, enabling the addition of electronic support measures such as passive signal estimation. In this work we consider questions of the form "How should a cognitive radar network choose when to observe targets?" and "How can a cognitive radar network reduce the amount of energy it uses?". We implement tools from the multi-armed bandit and age of information literature to select modes for the network, choosing either an active radar mode or a passive signal estimation mode. We show that through the use of target classes, the network can determine how often each target should be observed to optimize tracking performance.

Submitted 4 April, 2024; v1 submitted 30 November, 2023; originally announced December 2023.

Comments: 13 pages, 11 figures

arXiv:2307.12936 [pdf, other]

Timely Target Tracking: Distributed Updating in Cognitive Radar Networks

Authors: William W. Howard, Anthony F. Martone, R. Michael Buehrer

Abstract: Cognitive radar networks (CRNs) are capable of optimizing operating parameters in order to provide actionable information to an operator or secondary system. CRNs have been proposed to answer the need for low-cost devices tracking potentially large numbers of targets in geographically diverse regions. Networks of small-scale devices have also been shown to outperform legacy, large scale, high price, single-device installations. In this work, we consider a CRN tracking multiple targets with a goal of providing information which is both fresh and accurate to a measurement fusion center (FC). We show that under a constraint on the update rate of each radar node, the network is able to utilize Age of Information (AoI) metrics to maximize the resource utilization and minimize error per track. Since information freshness is critical to decision-making, this structure enables a CRN to provide the highest-quality information possible to a downstream system or operator. We discuss centralized and distributed approaches to solving this problem, taking into account the quality of node observations, the maneuverability of each target, and a limit on the rate at which any node may provide updates to the FC. We present a centralized AoI-inspired node selection metric, where a FC requests updates from specific nodes. We compare this against several alternative techniques. Further, we provide a distributed approach which utilizes the Age of Incorrect Information (AoII) metric, allowing each independent node to provide updates according to the targets it can observe. We provide mathematical analysis of the rate limits defined for the centralized and distributed approaches, showing that they are equivalent. We conclude with numerical simulations demonstrating that the performance of the algorithms exceeds that of alternative approaches, both in resource utilization and in tracking performance.

Submitted 2 March, 2024; v1 submitted 24 July, 2023; originally announced July 2023.

Comments: 15 pages, double column, 14 figures

arXiv:2207.06917 [pdf, ps, other]

Online Bayesian Meta-Learning for Cognitive Tracking Radar

Authors: Charles E. Thornton, R. Michael Buehrer, Anthony F. Martone

Abstract: A key component of cognitive radar is the ability to generalize, or achieve consistent performance across a range of sensing environments, since aspects of the physical scene may vary over time. This presents a challenge for learning-based waveform selection approaches, since transmission policies which are effective in one scene may be highly suboptimal in another. We address this problem by strategically biasing a learning algorithm by exploiting high-level structure across tracking instances, referred to as meta-learning. In this work, we develop an online meta-learning approach for waveform-agile tracking. This approach uses information gained from previous target tracks to speed up and enhance learning in new tracking instances. This results in sample-efficient learning across a class of finite state target channels by exploiting inherent similarity across tracking scenes, attributed to common physical elements such as target type or clutter statistics. We formulate the online waveform selection problem within the framework of Bayesian learning, and provide prior-dependent performance bounds for the meta-learning problem using Probably Approximately Correct (PAC)-Bayes theory. We present a computationally feasible meta-posterior sampling algorithm and study the performance in a simulation study consisting of diverse scenes. Finally, we examine the potential performance benefits and practical challenges associated with online meta-learning for waveform-agile tracking.

Submitted 7 February, 2023; v1 submitted 7 July, 2022; originally announced July 2022.

Comments: 14 pages, 5 figures

arXiv:2202.05294 [pdf, ps, other]

Universal Learning Waveform Selection Strategies for Adaptive Target Tracking

Authors: Charles E. Thornton, R. Michael Buehrer, Harpreet S. Dhillon, Anthony F. Martone

Abstract: Online selection of optimal waveforms for target tracking with active sensors has long been a problem of interest. Many conventional solutions utilize an estimation-theoretic interpretation, in which a waveform-specific Cramér-Rao lower bound on measurement error is used to select the optimal waveform for each tracking step. However, this approach is only valid in the high SNR regime, and requires a rather restrictive set of assumptions regarding the target motion and measurement models. Further, due to computational concerns, many traditional approaches are limited to near-term, or myopic, optimization, even though radar scenes exhibit strong temporal correlation. More recently, reinforcement learning has been proposed for waveform selection, in which the problem is framed as a Markov decision process (MDP), allowing for long-term planning. However, a major limitation of reinforcement learning is that the memory length of the underlying Markov process is often unknown for realistic target and channel dynamics, and a more general framework is desirable. This work develops a universal sequential waveform selection scheme which asymptotically achieves Bellman optimality in any radar scene which can be modeled as a $U^{\text{th}}$ order Markov process for a finite, but unknown, integer $U$. Our approach is based on well-established tools from the field of universal source coding, where a stationary source is parsed into variable length phrases in order to build a context-tree, which is used as a probabilistic model for the scene's behavior. We show that an algorithm based on a multi-alphabet version of the Context-Tree Weighting (CTW) method can be used to optimally solve a broad class of waveform-agile tracking problems while making minimal assumptions about the environment's behavior.

Submitted 10 February, 2022; originally announced February 2022.

Comments: 23 pages, 5 figures

arXiv:2110.11450 [pdf, ps, other]

Online Meta-Learning for Scene-Diverse Waveform-Agile Radar Target Tracking

Authors: Charles E. Thornton, R. Michael Buehrer, Anthony F. Martone

Abstract: A fundamental problem for waveform-agile radar systems is that the true environment is unknown, and transmission policies which perform well for a particular tracking instance may be sub-optimal for another. Additionally, there is a limited time window for each target track, and the radar must learn an effective strategy from a sequence of measurements in a timely manner. This paper studies a Bayesian meta-learning model for radar waveform selection which seeks to learn an inductive bias to quickly optimize tracking performance across a class of radar scenes. We cast the waveform selection problem in the framework of sequential Bayesian inference, and introduce a contextual bandit variant of the recently proposed meta-Thompson Sampling algorithm, which learns an inductive bias in the form of a prior distribution. Each track is treated as an instance of a contextual bandit learning problem, coming from a task distribution. We show that the meta-learning process results in appreciably faster learning, resulting in significantly fewer lost tracks than a conventional learning approach equipped with an uninformative prior.

Submitted 21 October, 2021; originally announced October 2021.

Comments: 6 pages, 6 figures

arXiv:2108.01656 [pdf, other]

Open Set Wireless Standard Classification Using Convolutional Neural Networks

Authors: Samuel R. Shebert, Anthony F. Martone, R. Michael Buehrer

Abstract: In congested electromagnetic environments, cognitive radios require knowledge about other emitters in order to optimize their dynamic spectrum access strategy. Deep learning classification algorithms have been used to recognize the wireless signal standards of emitters with high accuracy, but are limited to classifying signal classes that appear in their training set. This diminishes the performance of deep learning classifiers deployed in the field because they cannot accurately identify signals from classes outside of the training set. In this paper, a convolutional neural network-based open set classifier is proposed with the ability to detect if signals are not from known classes by thresholding the output sigmoid activation. The open set classifier was trained on 4G LTE, 5G NR, IEEE 802.11ax, Bluetooth Low Energy 5.0, and Narrowband Internet-of-Things signals impaired with Rayleigh or Rician fading, AWGN, frequency offsets, and in-phase/quadrature imbalances. Then, the classifier was tested on OFDM, SC-FDMA, SC, AM, and FM signals, which did not appear in the training set classes. The closed set classifier achieves an average accuracy of 94.5% for known signals with SNRs greater than 0 dB, but by design, has a 0% accuracy detecting signals from unknown classes. On the other hand, the open set classifier retains an 86% accuracy for known signal classes, but can detect 95.5% of signals from unknown classes with SNRs greater than 0 dB.

Submitted 3 August, 2021; originally announced August 2021.

arXiv:2108.01181 [pdf, ps, other]

Waveform Selection for Radar Tracking in Target Channels With Memory via Universal Learning

Authors: Charles E. Thornton, R. Michael Buehrer, Anthony F. Martone

Abstract: In tracking radar, the sensing environment often varies significantly over a track duration due to the target's trajectory and dynamic interference. Adapting the radar's waveform using partial information about the state of the scene has been shown to provide performance benefits in many practical scenarios. Moreover, radar measurements generally exhibit strong temporal correlation, allowing memory-based learning algorithms to effectively learn waveform selection strategies. This work examines a radar system which builds a compressed model of the radar-environment interface in the form of a context-tree. The radar uses this context tree-based model to select waveforms in a signal-dependent target channel, which may respond adversarially to the radar's strategy. This approach is guaranteed to asymptotically converge to the average-cost optimal policy for any stationary target channel that can be represented as a Markov process of order $U < \infty$, where the constant $U$ is unknown to the radar. The proposed approach is tested in a simulation study, and is shown to provide tracking performance improvements over two state-of-the-art waveform selection schemes.

Submitted 2 August, 2021; originally announced August 2021.

Comments: 6 pages, 2 figures

arXiv:2103.05541 [pdf, ps, other]

doi 10.1109/TAES.2021.3109110

Constrained Contextual Bandit Learning for Adaptive Radar Waveform Selection

Authors: Charles E. Thornton, R. Michael Buehrer, Anthony F. Martone

Abstract: A sequential decision process in which an adaptive radar system repeatedly interacts with a finite-state target channel is studied. The radar is capable of passively sensing the spectrum at regular intervals, which provides side information for the waveform selection process. The radar transmitter uses the sequence of spectrum observations as well as feedback from a collocated receiver to select waveforms which accurately estimate target parameters. It is shown that the waveform selection problem can be effectively addressed using a linear contextual bandit formulation in a manner that is both computationally feasible and sample efficient. Stochastic and adversarial linear contextual bandit models are introduced, allowing the radar to achieve effective performance in broad classes of physical environments. Simulations in a radar-communication coexistence scenario, as well as in an adversarial radar-jammer scenario, demonstrate that the proposed formulation provides a substantial improvement in target detection performance when Thompson Sampling and EXP3 algorithms are used to drive the waveform selection process. Further, it is shown that the harmful impacts of pulse-agile behavior on coherently processed radar data can be mitigated by adopting a time-varying constraint on the radar's waveform catalog.

Submitted 14 June, 2021; v1 submitted 9 March, 2021; originally announced March 2021.

Comments: 16 pages, 9 figures. arXiv admin note: text overlap with arXiv:2010.15698

Journal ref: IEEE Transactions on Aerospace and Electronic Systems, 2021

arXiv:2102.00274 [pdf, ps, other]

Multi-player Bandits for Distributed Cognitive Radar

Authors: William W. Howard, Charles E. Thornton, Anthony F. Martone, R. Michael Buehrer

Abstract: With new applications for radar networks such as automotive control or indoor localization, the need for spectrum sharing and general interoperability is expected to rise. This paper describes the application of multi-player bandit algorithms for waveform selection to a distributed cognitive radar network that must coexist with a communications system. Specifically, we make the assumption that radar nodes in the network have no dedicated communication channel. As we will discuss later, nodes can communicate indirectly by taking actions which intentionally interfere with other nodes and observing the resulting collisions. The radar nodes attempt to optimize their own spectrum utilization while avoiding collisions, not only with each other, but with the communications system. The communications system is assumed to statically occupy some subset of the bands available to the radar network. First, we examine models that assume each node experiences equivalent channel conditions, and later examine a model that relaxes this assumption.

Submitted 30 January, 2021; originally announced February 2021.

Comments: 6 pages, 2 figures, accepted to the 2021 IEEE Radar Conference

arXiv:2010.15698 [pdf, ps, other]

Constrained Online Learning to Mitigate Distortion Effects in Pulse-Agile Cognitive Radar

Authors: Charles E. Thornton, R. Michael Buehrer, Anthony F. Martone

Abstract: Pulse-agile radar systems have demonstrated favorable performance in dynamic electromagnetic scenarios. However, the use of non-identical waveforms within a radar's coherent processing interval may lead to harmful distortion effects when pulse-Doppler processing is used. This paper presents an online learning framework to optimize detection performance while mitigating harmful sidelobe levels. The radar waveform selection process is formulated as a linear contextual bandit problem, within which waveform adaptations which exceed a tolerable level of expected distortion are eliminated. The constrained online learning approach is effective and computationally feasible, evidenced by simulations in a radar-communication coexistence scenario and in the presence of intentional adaptive jamming. This approach is applied to both stochastic and adversarial contextual bandit learning models and the detection performance in dynamic scenarios is evaluated.

Submitted 26 February, 2021; v1 submitted 29 October, 2020; originally announced October 2020.

Comments: 6 pages, 4 figures, to be presented at IEEE Radar Conference, Atlanta GA, May 2021

arXiv:2008.10149 [pdf, ps, other]

Efficient Online Learning for Cognitive Radar-Cellular Coexistence via Contextual Thompson Sampling

Authors: Charles E. Thornton, R. Michael Buehrer, Anthony F. Martone

Abstract: This paper describes a sequential, or online, learning scheme for adaptive radar transmissions that facilitate spectrum sharing with a non-cooperative cellular network. First, the interference channel between the radar and a spatially distant cellular network is modeled. Then, a linear Contextual Bandit (CB) learning framework is applied to drive the radar's behavior. The fundamental trade-off between exploration and exploitation is balanced by a proposed Thompson Sampling (TS) algorithm, a pseudo-Bayesian approach which selects waveform parameters based on the posterior probability that a specific waveform is optimal, given discounted channel information as context. It is shown that the contextual TS approach converges more rapidly to behavior that minimizes mutual interference and maximizes spectrum utilization than comparable contextual bandit algorithms. Additionally, we show that the TS learning scheme results in a favorable SINR distribution compared to other online learning algorithms. Finally, the proposed TS algorithm is compared to a deep reinforcement learning model. We show that the TS algorithm maintains competitive performance with a more complex Deep Q-Network (DQN).

Submitted 23 August, 2020; originally announced August 2020.

Comments: 6 pages, 6 Figures, To Appear in Proc. IEEE GLOBECOM 2020, Taipei Taiwan

arXiv:2006.13173 [pdf, ps, other]

doi 10.1109/TCCN.2020.3019605

Deep Reinforcement Learning Control for Radar Detection and Tracking in Congested Spectral Environments

Authors: Charles E. Thornton, Mark A. Kozy, R. Michael Buehrer, Anthony F. Martone, Kelly D. Sherbondy

Abstract: In this paper, dynamic non-cooperative coexistence between a cognitive pulsed radar and a nearby communications system is addressed by applying nonlinear value function approximation via deep reinforcement learning (Deep RL) to develop a policy for optimal radar performance. The radar learns to vary the bandwidth and center frequency of its linear frequency modulated (LFM) waveforms to mitigate mutual interference with other systems and improve target detection performance while also maintaining sufficient utilization of the available frequency bands required for a fine range resolution. We demonstrate that our approach, based on the Deep Q-Learning (DQL) algorithm, enhances important radar metrics, including SINR and bandwidth utilization, more effectively than policy iteration or sense-and-avoid (SAA) approaches in a variety of realistic coexistence environments. We also extend the DQL-based approach to incorporate Double Q-learning and a recurrent neural network to form a Double Deep Recurrent Q-Network (DDRQN). We demonstrate that the DDRQN results in favorable performance and stability compared to DQL and policy iteration. Finally, we demonstrate the practicality of our proposed approach through a discussion of experiments performed on a software defined radar (SDRadar) prototype system. Our experimental results indicate that the proposed Deep RL approach significantly improves radar detection performance in congested spectral environments when compared to policy iteration and SAA.

Submitted 27 August, 2020; v1 submitted 23 June, 2020; originally announced June 2020.

Comments: To Appear in IEEE Transactions on Cognitive Communications and Networking, 2020

arXiv:2001.01799 [pdf, ps, other]

doi 10.1109/RADAR42522.2020.9114698

Experimental Analysis of Reinforcement Learning Techniques for Spectrum Sharing Radar

Authors: Charles E. Thornton, R. Michael Buehrer, Anthony F. Martone, Kelly D. Sherbondy

Abstract: In this work, we first describe a framework for the application of Reinforcement Learning (RL) control to a radar system that operates in a congested spectral setting. We then compare the utility of several RL algorithms through a discussion of experiments performed on Commercial off-the-shelf (COTS) hardware. Each RL technique is evaluated in terms of convergence, radar detection performance achieved in a congested spectral environment, and the ability to share 100 MHz of spectrum with an uncooperative communications system. We examine policy iteration, which solves an environment posed as a Markov Decision Process (MDP) by directly solving for a stochastic mapping between environmental states and radar waveforms, as well as Deep RL techniques, which utilize a form of Q-Learning to approximate a parameterized function that is used by the radar to select optimal actions. We show that RL techniques are beneficial over a Sense-and-Avoid (SAA) scheme and discuss the conditions under which each approach is most effective.

Submitted 13 March, 2020; v1 submitted 6 January, 2020; originally announced January 2020.

Comments: Accepted for publication at IEEE Intl. Radar Conference, Washington DC, Apr. 2020. This is the author's version of the work

arXiv:1709.08573 [pdf, other]

Coexistence between Communication and Radar Systems - A Survey

Authors: Mina Labib, Vuk Marojevic, Anthony F. Martone, Jeffrey H. Reed, Amir I. Zaghloul

Abstract: Data traffic demand in cellular networks has been growing tremendously and has led to a congested RF environment. Accordingly, innovative approaches for spectrum sharing have been proposed and implemented to accommodate several systems within the same frequency band. Spectrum sharing between radar and communication systems is one of the important research and development areas. In this paper, we present the fundamental spectrum sharing concepts and technologies, then we provide an updated and comprehensive survey of spectrum sharing techniques that have been developed to enable some of the wireless communication systems to coexist in the same band as radar systems.

Submitted 25 September, 2017; originally announced September 2017.

Comments: Accepted for publication in Radio Science Bulletin

Showing 1–14 of 14 results for author: Martone, A F