-
Separating States in Astronomical Sources Using Hidden Markov Models: With a Case Study of Flaring and Quiescence on EV Lac
Authors:
Robert Zimmerman,
David A. van Dyk,
Vinay L. Kashyap,
Aneta Siemiginowska
Abstract:
We present a new method to distinguish between different states (e.g., high and low, quiescent and flaring) in astronomical sources with count data. The method models the underlying physical process as latent variables following a continuous-space Markov chain that determines the expected Poisson counts in observed light curves in multiple passbands. For the underlying state process, we consider several autoregressive processes, yielding continuous-space hidden Markov models of varying complexity. Under these models, we can infer the state that the object is in at any given time. The continuous state predictions from these models are then dichotomized with the help of a finite mixture model to produce state classifications. We apply these techniques to X-ray data from the active dMe flare star EV Lac, splitting the data into quiescent and flaring states. We find that a first-order vector autoregressive process efficiently separates flaring from quiescence: flaring occurs over 30-40% of the observation durations, a well-defined persistent quiescent state can be identified, and the flaring state is characterized by higher plasma temperatures and emission measures.
Submitted 3 September, 2024; v1 submitted 10 May, 2024;
originally announced May 2024.
-
Effect of Systematic Uncertainties on Density and Temperature Estimates in Coronae of Capella
Authors:
Xixi Yu,
Vinay L. Kashyap,
Giulio Del Zanna,
David A. van Dyk,
David C. Stenning,
Connor P. Ballance,
Harry P. Warren
Abstract:
We estimate the coronal density of Capella using the O VII and Fe XVII line systems in the soft X-ray regime that have been observed over the course of the Chandra mission. Our analysis combines measures of error due to uncertainty in the underlying atomic data with statistical errors in the Chandra data to derive meaningful overall uncertainties on the plasma density of the coronae of Capella. We consider two Bayesian frameworks. First, the so-called pragmatic-Bayesian approach considers the atomic data and their uncertainties as fully specified and uncorrectable. The fully-Bayesian approach, on the other hand, allows the observed spectral data to update the atomic data and their uncertainties, thereby reducing the overall errors on the inferred parameters. To incorporate atomic data uncertainties, we obtain a set of atomic data replicates, the distribution of which captures their uncertainty. A principal component analysis of these replicates allows us to represent the atomic uncertainty with a lower-dimensional multivariate Gaussian distribution. A $t$-distribution approximation of the uncertainties of a subset of plasma parameters including a priori temperature information, obtained from the temperature-sensitive-only Fe XVII spectral line analysis, is carried forward into the density- and temperature-sensitive O VII spectral line analysis. Markov Chain Monte Carlo based model fitting is implemented including Multi-step Monte Carlo Gibbs Sampler and Hamiltonian Monte Carlo. Our analysis recovers an isothermally approximated coronal plasma temperature of $\approx$5 MK and a coronal plasma density of $\approx$10$^{10}$ cm$^{-3}$, with uncertainties of 0.1 and 0.2 dex respectively.
Submitted 18 June, 2024; v1 submitted 16 April, 2024;
originally announced April 2024.
-
eBASCS: Disentangling Overlapping Astronomical Sources II, using Spatial, Spectral, and Temporal Information
Authors:
Antoine D. Meyer,
David A. van Dyk,
Vinay L. Kashyap,
Luis F. Campos,
David E. Jones,
Aneta Siemiginowska,
Andreas Zezas
Abstract:
The analysis of individual X-ray sources that appear in a crowded field can easily be compromised by the misallocation of recorded events to their originating sources. Even with a small number of sources, that nonetheless have overlapping point spread functions, the allocation of events to sources is a complex task that is subject to uncertainty. We develop a Bayesian method designed to sift high-energy photon events from multiple sources with overlapping point spread functions, leveraging the differences in their spatial, spectral, and temporal signatures. The method probabilistically assigns each event to a given source. Such a disentanglement allows more detailed spectral or temporal analysis to focus on the individual component in isolation, free of contamination from other sources or the background. We are also able to compute source parameters of interest like their locations, relative brightness, and background contamination, while accounting for the uncertainty in event assignments. Simulation studies that include event arrival time information demonstrate that the temporal component improves event disambiguation beyond using only spatial and spectral information. The proposed methods correctly allocate up to 65% more events than the corresponding algorithms that ignore event arrival time information. We apply our methods to two stellar X-ray binaries, UV Cet and HBC515 A, observed with Chandra. We demonstrate that our methods are capable of removing the contamination due to a strong flare on UV Cet B in its companion approximately 40 times weaker during that event, and that evidence for spectral variability at timescales of a few ks can be determined in HBC515 Aa and HBC515 Ab.
Submitted 18 May, 2021;
originally announced May 2021.
-
Change point detection and image segmentation for time series of astrophysical images
Authors:
Cong Xu,
Hans Moritz Günther,
Vinay L. Kashyap,
Thomas C. M. Lee,
Andreas Zezas
Abstract:
Many astrophysical phenomena are time-varying, in the sense that their intensity, energy spectrum, and/or the spatial distribution of the emission suddenly change. This paper develops a method for modeling a time series of images. Under the assumption that the arrival times of the photons follow a Poisson process, the data are binned into 4D grids of voxels (time, energy band, and x-y coordinates), and viewed as a time series of non-homogeneous Poisson images. The method assumes that at each time point, the corresponding multi-band image stack is an unknown 3D piecewise constant function including Poisson noise. It also assumes that all image stacks between any two adjacent change points (in time domain) share the same unknown piecewise constant function. The proposed method is designed to estimate the number and the locations of all the change points (in time domain), as well as all the unknown piecewise constant functions between any pairs of the change points. The method applies the minimum description length (MDL) principle to perform this task. A practical algorithm is also developed to solve the corresponding complicated optimization problem. Simulation experiments and applications to real datasets show that the proposed method enjoys very promising empirical properties. Applications to two real datasets, the XMM observation of a flaring star and an emerging solar coronal loop, illustrate the usage of the proposed method and the scientific insight gained from it.
Submitted 26 January, 2021;
originally announced January 2021.
-
Calibration Concordance for Astronomical Instruments via Multiplicative Shrinkage
Authors:
Yang Chen,
Xiao-Li Meng,
Xufei Wang,
David A. van Dyk,
Herman L. Marshall,
Vinay L. Kashyap
Abstract:
Calibration data are often obtained by observing several well-understood objects simultaneously with multiple instruments, such as satellites for measuring astronomical sources. Analyzing such data and obtaining proper concordance among the instruments is challenging when the physical source models are not well understood, when there are uncertainties in "known" physical quantities, or when data quality varies in ways that cannot be fully quantified. Furthermore, the number of model parameters increases with both the number of instruments and the number of sources. Thus, concordance of the instruments requires careful modeling of the mean signals, the intrinsic source differences, and measurement errors. In this paper, we propose a log-Normal hierarchical model and a more general log-t model that respect the multiplicative nature of the mean signals via a half-variance adjustment, yet permit imperfections in the mean modeling to be absorbed by residual variances. We present analytical solutions in the form of power shrinkage in special cases and develop reliable MCMC algorithms for general cases. We apply our method to several data sets obtained with a variety of X-ray telescopes such as Chandra. We demonstrate that our method provides helpful and practical guidance for astrophysicists when adjusting for disagreements among instruments.
Submitted 26 September, 2018; v1 submitted 26 November, 2017;
originally announced November 2017.
-
Bayesian Estimates of Astronomical Time Delays between Gravitationally Lensed Stochastic Light Curves
Authors:
Hyungsuk Tak,
Kaisey Mandel,
David A. van Dyk,
Vinay L. Kashyap,
Xiao-Li Meng,
Aneta Siemiginowska
Abstract:
The gravitational field of a galaxy can act as a lens and deflect the light emitted by a more distant object such as a quasar. Strong gravitational lensing causes multiple images of the same quasar to appear in the sky. Since the light in each gravitationally lensed image traverses a different path length from the quasar to the Earth, fluctuations in the source brightness are observed in the several images at different times. The time delay between these fluctuations can be used to constrain cosmological parameters and can be inferred from the time series of brightness data or light curves of each image. To estimate the time delay, we construct a model based on a state-space representation for irregularly observed time series generated by a latent continuous-time Ornstein-Uhlenbeck process. We account for microlensing, an additional source of independent long-term extrinsic variability, via a polynomial regression. Our Bayesian strategy adopts a Metropolis-Hastings within Gibbs sampler. We improve the sampler by using an ancillarity-sufficiency interweaving strategy and adaptive Markov chain Monte Carlo. We introduce a profile likelihood of the time delay as an approximation of its marginal posterior distribution. The Bayesian and profile likelihood approaches complement each other, producing almost identical results; the Bayesian method is more principled but the profile likelihood is simpler to implement. We demonstrate our estimation strategy using simulated data of doubly- and quadruply-lensed quasars, and observed data from quasars Q0957+561 and J1029+2623.
Submitted 30 January, 2017; v1 submitted 2 February, 2016;
originally announced February 2016.
-
Preprocessing Solar Images while Preserving their Latent Structure
Authors:
Nathan M Stein,
David A van Dyk,
Vinay L Kashyap
Abstract:
Telescopes such as the Atmospheric Imaging Assembly aboard the Solar Dynamics Observatory, a NASA satellite, collect massive streams of high resolution images of the Sun through multiple wavelength filters. Reconstructing pixel-by-pixel thermal properties based on these images can be framed as an ill-posed inverse problem with Poisson noise, but this reconstruction is computationally expensive and there is disagreement among researchers about what regularization or prior assumptions are most appropriate. This article presents an image segmentation framework for preprocessing such images in order to reduce the data volume while preserving as much thermal information as possible for later downstream analyses. The resulting segmented images reflect thermal properties but do not depend on solving the ill-posed inverse problem. This allows users to avoid the Poisson inverse problem altogether or to tackle it on each of $\sim$10 segments rather than on each of $\sim$10$^7$ pixels, reducing computing time by a factor of $\sim$10$^6$. We employ a parametric class of dissimilarities that can be expressed as cosine dissimilarity functions or Hellinger distances between nonlinearly transformed vectors of multi-passband observations in each pixel. We develop a decision theoretic framework for choosing the dissimilarity that minimizes the expected loss that arises when estimating identifiable thermal properties based on segmented images rather than on a pixel-by-pixel basis. We also examine the efficacy of different dissimilarities for recovering clusters in the underlying thermal properties. The expected losses are computed under scientifically motivated prior distributions. Two simulation studies guide our choices of dissimilarity function. We illustrate our method by segmenting images of a coronal hole observed on 26 February 2015.
Submitted 14 December, 2015;
originally announced December 2015.
-
Detecting Unspecified Structure in Low-Count Images
Authors:
Nathan M. Stein,
David A. van Dyk,
Vinay L. Kashyap,
Aneta Siemiginowska
Abstract:
Unexpected structure in images of astronomical sources often presents itself upon visual inspection of the image, but such apparent structure may either correspond to true features in the source or be due to noise in the data. This paper presents a method for testing whether inferred structure in an image with Poisson noise represents a significant departure from a baseline (null) model of the image. To infer image structure, we conduct a Bayesian analysis of a full model that uses a multiscale component to allow flexible departures from the posited null model. As a test statistic, we use a tail probability of the posterior distribution under the full model. This choice of test statistic allows us to estimate a computationally efficient upper bound on a p-value that enables us to draw strong conclusions even when there are limited computational resources that can be devoted to simulations under the null model. We demonstrate the statistical performance of our method on simulated images. Applying our method to an X-ray image of the quasar 0730+257, we find significant evidence against the null model of a single point source and uniform background, lending support to the claim of an X-ray jet.
Submitted 15 October, 2015;
originally announced October 2015.
-
Detecting Abrupt Changes in the Spectra of High-Energy Astrophysical Sources
Authors:
Raymond K. W. Wong,
Vinay L. Kashyap,
Thomas C. M. Lee,
David A. van Dyk
Abstract:
Variable-intensity astronomical sources are the result of complex and often extreme physical processes. Abrupt changes in source intensity are typically accompanied by equally sudden spectral shifts, i.e., sudden changes in the wavelength distribution of the emission. This article develops a method for modeling photon counts collected from observation of such sources. We embed change points into a marked Poisson process, where photon wavelengths are regarded as marks and both the Poisson intensity parameter and the distribution of the marks are allowed to change. To the best of our knowledge, this is the first effort to embed change points into a marked Poisson process. Between the change points, the spectrum is modeled non-parametrically using a mixture of a smooth radial basis expansion and a number of local deviations from the smooth term representing spectral emission lines. Because the model is overparameterized, we employ an $\ell_1$ penalty. The tuning parameter in the penalty and the number of change points are determined via the minimum description length principle. Our method is validated via a series of simulation studies and its practical utility is illustrated in the analysis of the ultra-fast rotating yellow giant star known as FK Com.
Submitted 10 December, 2015; v1 submitted 27 August, 2015;
originally announced August 2015.
-
Automatic estimation of flux distributions of astrophysical source populations
Authors:
Raymond K. W. Wong,
Paul Baines,
Alexander Aue,
Thomas C. M. Lee,
Vinay L. Kashyap
Abstract:
In astrophysics a common goal is to infer the flux distribution of populations of scientifically interesting objects such as pulsars or supernovae. In practice, inference for the flux distribution is often conducted using the cumulative distribution of the number of sources detected at a given sensitivity. The resulting "$\log(N>S)$-$\log (S)$" relationship can be used to compare and evaluate theoretical models for source populations and their evolution. Under restrictive assumptions the relationship should be linear. In practice, however, when simple theoretical models fail, it is common for astrophysicists to use prespecified piecewise linear models. This paper proposes a methodology for estimating both the number and locations of "breakpoints" in astrophysical source populations that extends beyond existing work in this field. An important component of the proposed methodology is a new interwoven EM algorithm that computes parameter estimates. It is shown that in simple settings such estimates are asymptotically consistent despite the complex nature of the parameter space. Through simulation studies it is demonstrated that the proposed methodology is capable of accurately detecting structural breaks in a variety of parameter configurations. This paper concludes with an application of our methodology to the Chandra Deep Field North (CDFN) data set.
Submitted 24 November, 2014; v1 submitted 4 May, 2013;
originally announced May 2013.
-
A Bayesian Analysis of the Correlations Among Sunspot Cycles
Authors:
Yaming Yu,
David A. van Dyk,
Vinay L. Kashyap,
C. Alex Young
Abstract:
Sunspot numbers form a comprehensive, long-duration proxy of solar activity and have been used numerous times to empirically investigate the properties of the solar cycle. A number of correlations have been discovered over the 24 cycles for which observational records are available. Here we carry out a sophisticated statistical analysis of the sunspot record that reaffirms these correlations, and sets up an empirical predictive framework for future cycles. An advantage of our approach is that it allows for rigorous assessment of both the statistical significance of various cycle features and the uncertainty associated with predictions. We summarize the data into three sequential relations that estimate the amplitude, duration, and time of rise to maximum for any cycle, given the values from the previous cycle. We find that there is no indication of a persistence in predictive power beyond one cycle, and conclude that the dynamo does not retain memory beyond one cycle. Based on sunspot records up to October 2011, we obtain, for Cycle 24, an estimated maximum smoothed monthly sunspot number of 97 ± 15, to occur in January–February 2014 ± 6 months.
Submitted 8 August, 2012;
originally announced August 2012.
-
Accounting for Calibration Uncertainties in X-ray Analysis: Effective Areas in Spectral Fitting
Authors:
Hyunsook Lee,
Vinay L. Kashyap,
David A. van Dyk,
Alanna Connors,
Jeremy J. Drake,
Rima Izem,
Xiao-Li Meng,
Shandong Min,
Taeyoung Park,
Pete Ratzlaff,
Aneta Siemiginowska,
Andreas Zezas
Abstract:
While considerable advance has been made to account for statistical uncertainties in astronomical analyses, systematic instrumental uncertainties have been generally ignored. This can be crucial to a proper interpretation of analysis results because instrumental calibration uncertainty is a form of systematic uncertainty. Ignoring it can underestimate error bars and introduce bias into the fitted values of model parameters. Accounting for such uncertainties currently requires extensive case-specific simulations if using existing analysis packages. Here we present general statistical methods that incorporate calibration uncertainties into spectral analysis of high-energy data. We first present a method based on multiple imputation that can be applied with any fitting method, but is necessarily approximate. We then describe a more exact Bayesian approach that works in conjunction with a Markov chain Monte Carlo based fitting. We explore methods for improving computational efficiency, and in particular detail a method of summarizing calibration uncertainties with a principal component analysis of samples of plausible calibration files. This method is implemented using recently codified Chandra effective area uncertainties for low-resolution spectral analysis and is verified using both simulated and actual Chandra data. Our procedure for incorporating effective area uncertainty is easily generalized to other types of calibration uncertainties.
Submitted 22 February, 2011;
originally announced February 2011.
-
On Computing Upper Limits to Source Intensities
Authors:
Vinay L. Kashyap,
David A. van Dyk,
Alanna Connors,
Peter Freeman,
Aneta Siemiginowska,
Jin Xu,
Andreas Zezas
Abstract:
A common problem in astrophysics is determining how bright a source could be and still not be detected. Despite the simplicity with which the problem can be stated, the solution involves complex statistical issues that require careful analysis. In contrast to the confidence bound, this concept has never been formally analyzed, leading to a great variety of often ad hoc solutions. Here we formulate and describe the problem in a self-consistent manner. Detection significance is usually defined by the acceptable proportion of false positives (the Type I error), and we invoke the complementary concept of false negatives (the Type II error), based on the statistical power of a test, to compute an upper limit to the detectable source intensity. To determine the minimum intensity that a source must have for it to be detected, we first define a detection threshold, and then compute the probabilities of detecting sources of various intensities at the given threshold. The intensity that corresponds to the specified Type II error probability defines that minimum intensity, and is identified as the upper limit. Thus, an upper limit is a characteristic of the detection procedure rather than the strength of any particular source and should not be confused with confidence intervals or other estimates of source intensity. This is particularly important given the large number of catalogs that are being generated from increasingly sensitive surveys. We discuss the differences between these upper limits and confidence bounds. Both measures are useful quantities that should be reported in order to extract the most science from catalogs, though they answer different statistical questions: an upper bound describes an inference range on the source intensity, while an upper limit calibrates the detection process. We provide a recipe for computing upper limits that applies to all detection algorithms.
Submitted 22 June, 2010;
originally announced June 2010.
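
The recipe outlined in the abstract of "On Computing Upper Limits to Source Intensities" (fix a detection threshold from the Type I error, then find the intensity whose detection probability matches the allowed Type II error) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the simplest possible detector, a single counting bin with Poisson counts and a known expected background, and the function names `poisson_sf`, `detection_threshold`, and `upper_limit` are hypothetical.

```python
import math


def poisson_sf(k, mu):
    """Return P(N >= k) for N ~ Poisson(mu), by direct summation.

    Adequate for the small means used here; exp(-mu) underflows for very large mu.
    """
    if k <= 0:
        return 1.0
    term = math.exp(-mu)  # P(N = 0)
    cdf = term
    for i in range(1, k):
        term *= mu / i    # P(N = i) from P(N = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def detection_threshold(background, alpha):
    """Smallest count n* such that P(N >= n* | background only) <= alpha."""
    n = 0
    while poisson_sf(n, background) > alpha:
        n += 1
    return n


def upper_limit(background, alpha, beta, tol=1e-6):
    """Smallest source intensity detected with probability >= 1 - beta.

    alpha: allowed Type I error (false-positive probability).
    beta:  allowed Type II error (false-negative probability).
    Returns (n_star, intensity) in expected counts.
    """
    n_star = detection_threshold(background, alpha)

    def power(s):
        # Probability that a source of intensity s exceeds the threshold.
        return poisson_sf(n_star, background + s)

    # Bracket the solution, then bisect; power(s) increases with s.
    lo, hi = 0.0, 1.0
    while power(hi) < 1.0 - beta:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if power(mid) >= 1.0 - beta:
            hi = mid
        else:
            lo = mid
    return n_star, hi


if __name__ == "__main__":
    # Assumed example values: 1 expected background count, alpha = 1%, beta = 50%.
    n_star, ul = upper_limit(background=1.0, alpha=0.01, beta=0.5)
    print(f"threshold = {n_star} counts, upper limit = {ul:.3f} counts")
```

In this sketch the result depends only on the background, the threshold, and the two error rates, not on any observed counts, which mirrors the abstract's point that an upper limit characterizes the detection procedure rather than a particular source.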