-
EXONEST: The Bayesian Exoplanetary Explorer
Authors:
Kevin H. Knuth,
Ben Placek,
Daniel Angerhausen,
Jennifer L. Carter,
Bryan D'Angelo,
Anthony D. Gai,
Bertrand Carado
Abstract:
The fields of astronomy and astrophysics are currently engaged in an unprecedented era of discovery as recent missions have revealed thousands of exoplanets orbiting other stars. While the Kepler Space Telescope mission has enabled most of these exoplanets to be detected by identifying transiting events, exoplanets often exhibit additional photometric effects that can be used to improve the charac…
▽ More
The fields of astronomy and astrophysics are currently engaged in an unprecedented era of discovery as recent missions have revealed thousands of exoplanets orbiting other stars. While the Kepler Space Telescope mission has enabled most of these exoplanets to be detected by identifying transiting events, exoplanets often exhibit additional photometric effects that can be used to improve the characterization of exoplanets. The EXONEST Exoplanetary Explorer is a Bayesian exoplanet inference engine based on nested sampling and originally designed to analyze archived Kepler Space Telescope and CoRoT (Convection Rotation et Transits planétaires) exoplanet mission data. We discuss the EXONEST software package and describe how it accommodates plug-and-play models of exoplanet-associated photometric effects for the purpose of exoplanet detection, characterization and scientific hypothesis testing. The current suite of models allows for both circular and eccentric orbits in conjunction with photometric effects, such as the primary transit and secondary eclipse, reflected light, thermal emissions, ellipsoidal variations, Doppler beaming and superrotation. We discuss our new efforts to expand the capabilities of the software to include more subtle photometric effects involving reflected and refracted light. We discuss the EXONEST inference engine design and introduce our plans to port the current MATLAB-based EXONEST software package over to the next generation Exoplanetary Explorer, which will be a Python-based open source project with the capability to employ third-party plug-and-play models of exoplanet-related photometric effects.
△ Less
Submitted 24 December, 2017;
originally announced December 2017.
-
Convergent Bayesian formulations of blind source separation and electromagnetic source estimation
Authors:
Kevin H. Knuth,
Herbert G. Vaughan Jr
Abstract:
We consider two areas of research that have been developing in parallel over the last decade: blind source separation (BSS) and electromagnetic source estimation (ESE). BSS deals with the recovery of source signals when only mixtures of signals can be obtained from an array of detectors and the only prior knowledge consists of some information about the nature of the source signals. On the other h…
▽ More
We consider two areas of research that have been developing in parallel over the last decade: blind source separation (BSS) and electromagnetic source estimation (ESE). BSS deals with the recovery of source signals when only mixtures of signals can be obtained from an array of detectors and the only prior knowledge consists of some information about the nature of the source signals. On the other hand, ESE utilizes knowledge of the electromagnetic forward problem to assign source signals to their respective generators, while information about the signals themselves is typically ignored. We demonstrate that these two techniques can be derived from the same starting point using the Bayesian formalism. This suggests a means by which new algorithms can be developed that utilize as much relevant information as possible. We also briefly mention some preliminary work that supports the value of integrating information used by these two techniques and review the kinds of information that may be useful in addressing the ESE problem.
△ Less
Submitted 21 January, 2015;
originally announced January 2015.
-
Difficulties applying recent blind source separation techniques to EEG and MEG
Authors:
Kevin H. Knuth
Abstract:
High temporal resolution measurements of human brain activity can be performed by recording the electric potentials on the scalp surface (electroencephalography, EEG), or by recording the magnetic fields near the surface of the head (magnetoencephalography, MEG). The analysis of the data is problematic due to the fact that multiple neural generators may be simultaneously active and the potentials…
▽ More
High temporal resolution measurements of human brain activity can be performed by recording the electric potentials on the scalp surface (electroencephalography, EEG), or by recording the magnetic fields near the surface of the head (magnetoencephalography, MEG). The analysis of the data is problematic due to the fact that multiple neural generators may be simultaneously active and the potentials and magnetic fields from these sources are superimposed on the detectors. It is highly desirable to un-mix the data into signals representing the behaviors of the original individual generators. This general problem is called blind source separation and several recent techniques utilizing maximum entropy, minimum mutual information, and maximum likelihood estimation have been applied. These techniques have had much success in separating signals such as natural sounds or speech, but appear to be ineffective when applied to EEG or MEG signals. Many of these techniques implicitly assume that the source distributions have a large kurtosis, whereas an analysis of EEG/MEG signals reveals that the distributions are multimodal. This suggests that more effective separation techniques could be designed for EEG and MEG signals.
△ Less
Submitted 21 January, 2015;
originally announced January 2015.
-
Information-Theoretic Methods for Identifying Relationships among Climate Variables
Authors:
Kevin H. Knuth,
Deniz Gençağa,
William B. Rossow
Abstract:
Information-theoretic quantities, such as entropy, are used to quantify the amount of information a given variable provides. Entropies can be used together to compute the mutual information, which quantifies the amount of information two variables share. However, accurately estimating these quantities from data is extremely challenging. We have developed a set of computational techniques that allo…
▽ More
Information-theoretic quantities, such as entropy, are used to quantify the amount of information a given variable provides. Entropies can be used together to compute the mutual information, which quantifies the amount of information two variables share. However, accurately estimating these quantities from data is extremely challenging. We have developed a set of computational techniques that allow one to accurately compute marginal and joint entropies. These algorithms are probabilistic in nature and thus provide information on the uncertainty in our estimates, which enable us to establish statistical significance of our findings. We demonstrate these methods by identifying relations between cloud data from the International Satellite Cloud Climatology Project (ISCCP) and data from other sources, such as equatorial pacific sea surface temperatures (SST).
△ Less
Submitted 19 December, 2014;
originally announced December 2014.
-
Bayesian Evidence and Model Selection
Authors:
Kevin H. Knuth,
Michael Habeck,
Nabin K. Malakar,
Asim M. Mubeen,
Ben Placek
Abstract:
In this paper we review the concepts of Bayesian evidence and Bayes factors, also known as log odds ratios, and their application to model selection. The theory is presented along with a discussion of analytic, approximate and numerical techniques. Specific attention is paid to the Laplace approximation, variational Bayes, importance sampling, thermodynamic integration, and nested sampling and its…
▽ More
In this paper we review the concepts of Bayesian evidence and Bayes factors, also known as log odds ratios, and their application to model selection. The theory is presented along with a discussion of analytic, approximate and numerical techniques. Specific attention is paid to the Laplace approximation, variational Bayes, importance sampling, thermodynamic integration, and nested sampling and its recent variants. Analogies to statistical physics, from which many of these techniques originate, are discussed in order to provide readers with deeper insights that may lead to new techniques. The utility of Bayesian model testing in the domain sciences is demonstrated by presenting four specific practical examples considered within the context of signal processing in the areas of signal detection, sensor characterization, scientific model selection and molecular force characterization.
△ Less
Submitted 23 November, 2015; v1 submitted 11 November, 2014;
originally announced November 2014.
-
Bayesian Source Separation Applied to Identifying Complex Organic Molecules in Space
Authors:
Kevin H. Knuth,
Man Kit Tse,
Joshua Choinsky,
Haley A. Maunu,
Duane F. Carbon
Abstract:
Emission from a class of benzene-based molecules known as Polycyclic Aromatic Hydrocarbons (PAHs) dominates the infrared spectrum of star-forming regions. The observed emission appears to arise from the combined emission of numerous PAH species, each with its unique spectrum. Linear superposition of the PAH spectra identifies this problem as a source separation problem. It is, however, of a formid…
▽ More
Emission from a class of benzene-based molecules known as Polycyclic Aromatic Hydrocarbons (PAHs) dominates the infrared spectrum of star-forming regions. The observed emission appears to arise from the combined emission of numerous PAH species, each with its unique spectrum. Linear superposition of the PAH spectra identifies this problem as a source separation problem. It is, however, of a formidable class of source separation problems given that different PAH sources potentially number in the hundreds, even thousands, and there is only one measured spectral signal for a given astrophysical site. Fortunately, the source spectra of the PAHs are known, but the signal is also contaminated by other spectral sources. We describe our ongoing work in developing Bayesian source separation techniques relying on nested sampling in conjunction with an ON/OFF mechanism enabling simultaneous estimation of the probability that a particular PAH species is present and its contribution to the spectrum.
△ Less
Submitted 18 March, 2014;
originally announced March 2014.
-
The Spatial Sensitivity Function of a Light Sensor
Authors:
N. K. Malakar,
A. J. Mesiti,
K. H. Knuth
Abstract:
The Spatial Sensitivity Function (SSF) is used to quantify a detector's sensitivity to a spatially-distributed input signal. By weighting the incoming signal with the SSF and integrating, the overall scalar response of the detector can be estimated. This project focuses on estimating the SSF of a light intensity sensor consisting of a photodiode. This light sensor has been used previously in the K…
▽ More
The Spatial Sensitivity Function (SSF) is used to quantify a detector's sensitivity to a spatially-distributed input signal. By weighting the incoming signal with the SSF and integrating, the overall scalar response of the detector can be estimated. This project focuses on estimating the SSF of a light intensity sensor consisting of a photodiode. This light sensor has been used previously in the Knuth Cyberphysics Laboratory on a robotic arm that performs its own experiments to locate a white circle in a dark field (Knuth et al., 2007). To use the light sensor to learn about its surroundings, the robot's inference software must be able to model and predict the light sensor's response to a hypothesized stimulus. Previous models of the light sensor treated it as a point sensor and ignored its spatial characteristics. Here we propose a parametric approach where the SSF is described by a mixture of Gaussians (MOG). By performing controlled calibration experiments with known stimulus inputs, we used nested sampling to estimate the SSF of the light sensor using an MOG model with the number of Gaussians ranging from one to five. By comparing the evidence computed for each MOG model, we found that one Gaussian is sufficient to describe the SSF to the accuracy we require. Future work will involve incorporating this more accurate SSF into the Bayesian machine learning software for the robotic system and studying how this detailed information about the properties of the light sensor will improve robot's ability to learn.
△ Less
Submitted 10 February, 2014;
originally announced February 2014.
-
Informed Source Separation: A Bayesian Tutorial
Authors:
Kevin H. Knuth
Abstract:
Source separation problems are ubiquitous in the physical sciences; any situation where signals are superimposed calls for source separation to estimate the original signals. In this tutorial I will discuss the Bayesian approach to the source separation problem. This approach has a specific advantage in that it requires the designer to explicitly describe the signal model in addition to any other…
▽ More
Source separation problems are ubiquitous in the physical sciences; any situation where signals are superimposed calls for source separation to estimate the original signals. In this tutorial I will discuss the Bayesian approach to the source separation problem. This approach has a specific advantage in that it requires the designer to explicitly describe the signal model in addition to any other information or assumptions that go into the problem description. This leads naturally to the idea of informed source separation, where the algorithm design incorporates relevant information about the specific problem. This approach promises to enable researchers to design their own high-quality algorithms that are specifically tailored to the problem at hand.
△ Less
Submitted 12 November, 2013;
originally announced November 2013.
-
Bayesian Odds-Ratio Filters: A Template-Based Method for Online Detection of P300 Evoked Responses
Authors:
Asim M. Mubeen,
Kevin H. Knuth
Abstract:
Template-based signal detection most often relies on computing a correlation, or a dot product, between an incoming data stream and a signal template. While such a correlation results in an ongoing estimate of the magnitude of the signal in the data stream, it does not directly indicate the presence or absence of a signal. Instead, the problem of signal detection is one of model-selection. Here we…
▽ More
Template-based signal detection most often relies on computing a correlation, or a dot product, between an incoming data stream and a signal template. While such a correlation results in an ongoing estimate of the magnitude of the signal in the data stream, it does not directly indicate the presence or absence of a signal. Instead, the problem of signal detection is one of model-selection. Here we explore the use of the Bayesian odds-ratio (OR), which is the ratio of posterior probabilities of a signal-plus-noise model over a noise-only model. We demonstrate this method by applying it to simulated electroencephalographic (EEG) signals based on the P300 response, which is widely used in both Brain Computer Interface (BCI) and Brain Machine Interface (BMI) systems. The efficacy of this algorithm is demonstrated by comparing the receiver operating characteristic (ROC) curves of the OR-based (logOR) filter to the usual correlation method where we find a significant improvement in P300 detection. The logOR filter promises to improve the accuracy and speed of the detection of evoked brain responses in BCI/BMI applications as well the detection of template signals in general.
△ Less
Submitted 4 April, 2013;
originally announced April 2013.
-
Modeling a Sensor to Improve its Efficacy
Authors:
N. K. Malakar,
D. Gladkov,
K. H. Knuth
Abstract:
Robots rely on sensors to provide them with information about their surroundings. However, high-quality sensors can be extremely expensive and cost-prohibitive. Thus many robotic systems must make due with lower-quality sensors. Here we demonstrate via a case study how modeling a sensor can improve its efficacy when employed within a Bayesian inferential framework. As a test bed we employ a roboti…
▽ More
Robots rely on sensors to provide them with information about their surroundings. However, high-quality sensors can be extremely expensive and cost-prohibitive. Thus many robotic systems must make due with lower-quality sensors. Here we demonstrate via a case study how modeling a sensor can improve its efficacy when employed within a Bayesian inferential framework. As a test bed we employ a robotic arm that is designed to autonomously take its own measurements using an inexpensive LEGO light sensor to estimate the position and radius of a white circle on a black field. The light sensor integrates the light arriving from a spatially distributed region within its field of view weighted by its Spatial Sensitivity Function (SSF). We demonstrate that by incorporating an accurate model of the light sensor SSF into the likelihood function of a Bayesian inference engine, an autonomous system can make improved inferences about its surroundings. The method presented here is data-based, fairly general, and made with plug-and play in mind so that it could be implemented in similar problems.
△ Less
Submitted 18 March, 2013;
originally announced March 2013.
-
Maximum Joint Entropy and Information-Based Collaboration of Automated Learning Machines
Authors:
N. K. Malakar,
K. H. Knuth,
D. J. Lary
Abstract:
We are working to develop automated intelligent agents, which can act and react as learning machines with minimal human intervention. To accomplish this, an intelligent agent is viewed as a question-asking machine, which is designed by coupling the processes of inference and inquiry to form a model-based learning unit. In order to select maximally-informative queries, the intelligent agent needs t…
▽ More
We are working to develop automated intelligent agents, which can act and react as learning machines with minimal human intervention. To accomplish this, an intelligent agent is viewed as a question-asking machine, which is designed by coupling the processes of inference and inquiry to form a model-based learning unit. In order to select maximally-informative queries, the intelligent agent needs to be able to compute the relevance of a question. This is accomplished by employing the inquiry calculus, which is dual to the probability calculus, and extends information theory by explicitly requiring context. Here, we consider the interaction between two question-asking intelligent agents, and note that there is a potential information redundancy with respect to the two questions that the agents may choose to pose. We show that the information redundancy is minimized by maximizing the joint entropy of the questions, which simultaneously maximizes the relevance of each question while minimizing the mutual information between them. Maximum joint entropy is therefore an important principle of information-based collaboration, which enables intelligent agents to efficiently learn together.
△ Less
Submitted 14 November, 2011;
originally announced November 2011.
-
Entropy-Based Search Algorithm for Experimental Design
Authors:
N. K. Malakar,
K. H. Knuth
Abstract:
The scientific method relies on the iterated processes of inference and inquiry. The inference phase consists of selecting the most probable models based on the available data; whereas the inquiry phase consists of using what is known about the models to select the most relevant experiment. Optimizing inquiry involves searching the parameterized space of experiments to select the experiment that p…
▽ More
The scientific method relies on the iterated processes of inference and inquiry. The inference phase consists of selecting the most probable models based on the available data; whereas the inquiry phase consists of using what is known about the models to select the most relevant experiment. Optimizing inquiry involves searching the parameterized space of experiments to select the experiment that promises, on average, to be maximally informative. In the case where it is important to learn about each of the model parameters, the relevance of an experiment is quantified by Shannon entropy of the distribution of experimental outcomes predicted by a probable set of models. If the set of potential experiments is described by many parameters, we must search this high-dimensional entropy space. Brute force search methods will be slow and computationally expensive. We present an entropy-based search algorithm, called nested entropy sampling, to select the most informative experiment for efficient experimental design. This algorithm is inspired by Skilling's nested sampling algorithm used in inference and borrows the concept of a rising threshold while a set of experiment samples are maintained. We demonstrate that this algorithm not only selects highly relevant experiments, but also is more efficient than brute force search. Such entropic search techniques promise to greatly benefit autonomous experimental design.
△ Less
Submitted 29 August, 2010;
originally announced August 2010.