-
Testing Hypotheses of Covariate Effects on Topics of Discourse
Authors:
Gabriel Phelan,
David A. Campbell
Abstract:
We introduce an approach to topic modelling with document-level covariates that remains tractable in the face of large text corpora. This is achieved by de-emphasizing the role of parameter estimation in an underlying probabilistic model, assuming instead that the data come from a fixed but unknown distribution whose statistical functionals are of interest. We propose combining a convex formulation of non-negative matrix factorization with standard regression techniques as a fast-to-compute and useful estimate of such a functional. Uncertainty quantification can then be achieved by reposing non-parametric resampling methods on top of this scheme. This is in contrast to popular topic modelling paradigms, which posit a complex and often hard-to-fit generative model of the data. We argue that the simple, non-parametric approach advocated here is faster, more interpretable, and enjoys better inferential justification than said generative models. Finally, our methods are demonstrated with an application analysing covariate effects on discourse of flavours attributed to Canadian beers.
Submitted 5 June, 2025;
originally announced June 2025.
-
A survey for variable young stars with small telescopes: VIII -- Properties of 1687 Gaia selected members in 21 nearby clusters
Authors:
Dirk Froebrich,
Aleks Scholz,
Justyn Campbell-White,
Siegfried Vanaverbeke,
Carys Herbert,
Jochen Eislöffel,
Thomas Urtly,
Timothy P. Long,
Ivan L. Walton,
Klaas Wiersema,
Nick J. Quinn,
Tony Rodda,
Juan-Luis González-Carballo,
Mario Morales Aimar,
Rafael Castillo García,
Francisco C. Soldán Alfaro,
Faustino García de la Cuesta,
Domenico Licchelli,
Alex Escartin Perez,
José Luis Salto González,
Marc Deldem,
Stephen R. L. Futcher,
Tim Nelson,
Shawn Dvorak,
Dawid Moździerski
, et al. (38 additional authors not shown)
Abstract:
The Hunting Outbursting Young Stars (HOYS) project performs long-term, optical, multi-filter, high cadence monitoring of 25 nearby young clusters and star forming regions. Utilising Gaia DR3 data we have identified about 17000 potential young stellar members in 45 coherent astrometric groups in these fields. Twenty one of them are clear young groups or clusters of stars within one kiloparsec and they contain 9143 Gaia selected potential members. The cluster distances, proper motions and membership numbers are determined. We analyse long term (about 7yr) V, R, and I-band light curves from HOYS for 1687 of the potential cluster members. One quarter of the stars are variable in all three optical filters, and two thirds of these have light curves that are symmetric around the mean. Light curves affected by obscuration from circumstellar materials are more common than those affected by accretion bursts, by a factor of 2-4. The variability fraction in the clusters ranges from 10 to almost 100 percent, and correlates positively with the fraction of stars with detectable inner disks, indicating that a lot of variability is driven by the disk. About one in six variables shows detectable periodicity, mostly caused by magnetic spots. Two thirds of the periodic variables with disk excess emission are slow rotators, and amongst the stars without disk excess two thirds are fast rotators - in agreement with rotation being slowed down by the presence of a disk.
Submitted 30 January, 2024;
originally announced January 2024.
-
A survey for variable young stars with small telescopes: VI -- Analysis of the outbursting Be stars NSW284, Gaia19eyy, and VES263
Authors:
Dirk Froebrich,
Lynne A. Hillenbrand,
Carys Herbert,
Kishalay De,
Jochen Eislöffel,
Justyn Campbell-White,
Ruhee Kahar,
Franz-Josef Hambsch,
Thomas Urtly,
Adam Popowicz,
Krzysztof Bernacki,
Andrzej Malcher,
Slawomir Lasota,
Jerzy Fiolka,
Piotr Jozwik-Wabik,
Franky Dubois,
Ludwig Logie,
Steve Rau,
Mark Phillips,
George Fleming,
Rafael Gonzalez Farfán,
Francisco C. Soldán Alfaro,
Tim Nelson,
Stephen R. L. Futcher,
Samantha M. Rolfe
, et al. (22 additional authors not shown)
Abstract:
This paper is one in a series reporting results from small telescope observations of variable young stars. Here, we study the repeating outbursts of three likely Be stars based on long-term optical, near-infrared, and mid-infrared photometry for all three objects, along with follow-up spectra for two of the three. The sources are characterised as rare, truly regularly outbursting Be stars. We interpret the photometric data within a framework for modelling light curve morphology, and find that the models correctly predict the burst shapes, including their larger amplitudes and later peaks towards longer wavelengths. We are thus able to infer the start and end times of mass loading into the circumstellar disks of these stars. The disk sizes are typically 3-6 times the areas of the central star. The disk temperatures are ~40%, and the disk luminosities are ~10% of those of the central Be star, respectively. The available spectroscopy is consistent with inside-out evolution of the disk. Higher excitation lines have larger velocity widths in their double-horned shaped emission profiles. Our observations and analysis support the decretion disk model for outbursting Be stars.
Submitted 6 February, 2023;
originally announced February 2023.
-
A small actively-controlled high-resolution spectrograph based on off-the-shelf components
Authors:
H. R. A. Jones,
W. E. Martin,
G. Anglada-Escudé,
R. Errmann,
D. A. Campbell,
C. Baker,
C. Boonsri,
P. Choochalerm
Abstract:
We present the design and testing of a prototype in-plane echelle spectrograph based on an actively controlled fibre-fed double-pass design. This system aims to be small and efficient with the minimum number of optical surfaces - currently a collimator/camera lens, cross-dispersing prism, grating and a reflector to send light to the detector. It is built from catalogue optical components and has dimensions of approximately 20x30 cm. It works in the optical regime with a resolution of >70,000. The spectrograph is fed by a bifurcated fibre with one fibre to a telescope and the other used to provide simultaneous Thorium Argon light illumination for wavelength calibration. The positions of the arc lines on the detector are processed in real time and commercial auto-guiding software is used to treat the positions of the arc lines as guide stars. The guiding software sends any required adjustments to mechanical piezo-electric actuators which move the mirror sending light to the camera removing any drift in the position of the arc lines. The current configuration using an sCMOS detector provides a precision of 3.5 milli-pixels equivalent to 4 m/s in a standard laboratory environment.
Submitted 20 November, 2020;
originally announced November 2020.
-
Parallel Tempering via Simulated Tempering Without Normalizing Constants
Authors:
Biljana Jonoska Stojkova,
David A. Campbell
Abstract:
In this paper we develop a new general Bayesian methodology that simultaneously estimates parameters of interest and the marginal likelihood of the model. The proposed methodology builds on Simulated Tempering, which is a powerful algorithm that enables sampling from multi-modal distributions. However, Simulated Tempering comes with the practical limitation of needing to specify a prior for the temperature along a chosen discretization schedule that will allow calculation of normalizing constants at each temperature. Our proposed model defines the prior for the temperature so as to remove the need for calculating normalizing constants at each temperature and thereby enables a continuous temperature schedule, while preserving the sampling efficiency of the Simulated Tempering algorithm. The resulting algorithm simultaneously estimates parameters while estimating marginal likelihoods through thermodynamic integration. We illustrate the applicability of the new algorithm to different examples involving mixture models of Gaussian distributions and ordinary differential equation models.
Submitted 30 May, 2019;
originally announced May 2019.
-
Incremental Mixture Importance Sampling with Shotgun optimization
Authors:
Biljana Jonoska Stojkova,
David A. Campbell
Abstract:
This paper proposes a general optimization strategy, which combines results from different optimization or parameter estimation methods to overcome shortcomings of a single method. Shotgun optimization is developed as a framework which employs different optimization strategies, criteria, or conditional targets to enable wider likelihood exploration. The introduced Shotgun optimization approach is embedded into an incremental mixture importance sampling algorithm to produce improved posterior samples for multimodal densities and creates robustness in cases where the likelihood and prior are in disagreement. Despite using different optimization approaches, the samples are combined into samples from a single target posterior. The diversity of the framework is demonstrated on parameter estimation from differential equation models employing diverse strategies including numerical solutions and approximations thereof. Additionally the approach is demonstrated on mixtures of discrete and continuous parameters and is shown to ease estimation from synthetic likelihood models. R code of the implemented examples is stored in a zipped archive (codeSubmit.zip).
Submitted 13 November, 2017;
originally announced November 2017.
-
Sequentially Constrained Monte Carlo
Authors:
Shirin Golchi,
David A. Campbell
Abstract:
Constraints can be interpreted in a broad sense as any kind of explicit restriction over the parameters. While some constraints are defined directly on the parameter space, when they are instead defined by known behaviour on the model, transformation of constraints into features on the parameter space may not be possible. Difficulty in sampling from the posterior distribution as a result of incorporating constraints into the model is a common challenge, leading to truncations in the parameter space and inefficient sampling algorithms. We propose a variant of the sequential Monte Carlo algorithm for posterior sampling in the presence of constraints by defining a sequence of densities through the imposition of the constraint. Particles generated from an unconstrained or mildly constrained distribution are filtered and moved through sampling and resampling steps to obtain a sample from the fully constrained target distribution. General and model-specific forms of constraint-enforcing strategies are defined. The Sequentially Constrained Monte Carlo algorithm is demonstrated on constraints defined by monotonicity of a function, densities constrained to low dimensional manifolds, adherence to a theoretically derived model, and model feature matching.
Submitted 25 February, 2015; v1 submitted 29 October, 2014;
originally announced October 2014.
-
The first planet detected in the WTS: an inflated hot-Jupiter in a 3.35 day orbit around a late F-star [ERRATUM]
Authors:
M. Cappetta,
R. P. Saglia,
J. L. Birkby,
J. Koppenhoefer,
D. J. Pinfield,
S. T. Hodgkin,
P. Cruz,
G. Kovacs,
B. Sipocz,
D. Barrado,
B. Nefs,
Y. V. Pavlenko,
L. Fossati,
C. del Burgo,
E. L. Martin,
I. Snellen,
J. Barnes,
D. A. Campbell,
S. Catalan,
M. C. Galvez-Ortiz,
N. Goulding,
C. Haswell,
O. Ivanyuk,
H. Jones,
M. Kuznetsov
, et al. (13 additional authors not shown)
Abstract:
We report the discovery of WTS-1b, the first extrasolar planet found by the WFCAM Transit Survey, which began observations at the 3.8-m United Kingdom Infrared Telescope (UKIRT) in August 2007. Light curves comprising almost 1200 epochs with a photometric precision of better than 1 per cent to J ~ 16 were constructed for ~60000 stars and searched for periodic transit signals. For one of the most promising transiting candidates, high-resolution spectra taken at the Hobby-Eberly Telescope (HET) allowed us to estimate the spectroscopic parameters of the host star, a late-F main sequence dwarf (V=16.13) with possibly slightly subsolar metallicity, and to measure its radial velocity variations. The combined analysis of the light curves and spectroscopic data resulted in an orbital period of the substellar companion of 3.35 days, a planetary mass of 4.01 ± 0.35 Mj and a planetary radius of 1.49 (+0.16, -0.18) Rj. WTS-1b has one of the largest radius anomalies among the known hot Jupiters in the mass range 3-5 Mj. The high irradiation from the host star ranks the planet in the pM class.
Submitted 13 October, 2014;
originally announced October 2014.
-
Transdimensional Approximate Bayesian Computation for Inference on Invasive Species Models with Latent Variables of Unknown Dimension
Authors:
Oksana A. Chkrebtii,
Erin K. Cameron,
David A. Campbell,
Erin M. Bayne
Abstract:
Accurate information on patterns of introduction and spread of non-native species is essential for making predictions and management decisions. In many cases, estimating unknown rates of introduction and spread from observed data requires evaluating intractable variable-dimensional integrals. In general, inference on the large class of models containing latent variables of large or variable dimension precludes exact sampling techniques. Approximate Bayesian computation (ABC) methods provide an alternative to exact sampling but rely on inefficient conditional simulation of the latent variables. To accomplish this task efficiently, a new transdimensional Monte Carlo sampler is developed for approximate Bayesian model inference and used to estimate rates of introduction and spread for the non-native earthworm species Dendrobaena octaedra (Savigny) along roads in the boreal forest of northern Alberta. Using low and high estimates of introduction and spread rates, the extent of earthworm invasions in northeastern Alberta was simulated to project the proportion of suitable habitat invaded in the year following data collection.
Submitted 30 December, 2014; v1 submitted 10 October, 2013;
originally announced October 2013.
-
Monotone Function Estimation for Computer Experiments
Authors:
Shirin Golchi,
Derek R. Bingham,
Hugh Chipman,
David A. Campbell
Abstract:
In statistical modeling of computer experiments sometimes prior information is available about the underlying function. For example, the physical system simulated by the computer code may be known to be monotone with respect to some or all inputs. We develop a Bayesian approach to Gaussian process modelling capable of incorporating monotonicity information for computer model emulation. Markov chain Monte Carlo methods are used to sample from the posterior distribution of the process given the simulator output and monotonicity information. The performance of the proposed approach in terms of predictive accuracy and uncertainty quantification is demonstrated in a number of simulated examples as well as a real queueing system application.
Submitted 14 June, 2014; v1 submitted 15 September, 2013;
originally announced September 2013.
-
Bayesian Solution Uncertainty Quantification for Differential Equations
Authors:
Oksana A. Chkrebtii,
David A. Campbell,
Ben Calderhead,
Mark A. Girolami
Abstract:
We explore probability modelling of discretization uncertainty for system states defined implicitly by ordinary or partial differential equations. Accounting for this uncertainty can avoid posterior under-coverage when likelihoods are constructed from a coarsely discretized approximation to system equations. A formalism is proposed for inferring a fixed but a priori unknown model trajectory through Bayesian updating of a prior process conditional on model information. A one-step-ahead sampling scheme for interrogating the model is described, its consistency and first order convergence properties are proved, and its computational complexity is shown to be proportional to that of numerical explicit one-step solvers. Examples illustrate the flexibility of this framework to deal with a wide variety of complex and large-scale systems. Within the calibration problem, discretization uncertainty defines a layer in the Bayesian hierarchy, and a Markov chain Monte Carlo algorithm that targets this posterior distribution is presented. This formalism is used for inference on the JAK-STAT delay differential equation model of protein dynamics from indirectly observed measurements. The discussion outlines implications for the new field of probabilistic numerics.
Submitted 23 October, 2016; v1 submitted 10 June, 2013;
originally announced June 2013.
-
The first planet detected in the WTS: an inflated hot-Jupiter in a 3.35 day orbit around a late F-star
Authors:
M. Cappetta,
R. P. Saglia,
J. L. Birkby,
J. Koppenhoefer,
D. J. Pinfield,
S. T. Hodgkin,
P. Cruz,
G. Kovács,
B. Sipöcz,
D. Barrado,
B. Nefs,
Y. V. Pavlenko,
L. Fossati,
C. del Burgo,
E. L. Martín,
I. Snellen,
J. Barnes,
A. M. Bayo,
D. A. Campbell,
S. Catalan,
M. C. Gálvez-Ortiz,
N. Goulding,
C. Haswell,
O. Ivanyuk,
H. Jones
, et al. (14 additional authors not shown)
Abstract:
We report the discovery of WTS-1b, the first extrasolar planet found by the WFCAM Transit Survey, which began observations at the 3.8-m United Kingdom Infrared Telescope. Light curves comprising almost 1200 epochs with a photometric precision of better than 1 per cent to J=16 were constructed for 60000 stars and searched for periodic transit signals. For one of the most promising transiting candidates, high-resolution spectra taken at the Hobby-Eberly Telescope allowed us to estimate the spectroscopic parameters of the host star, a late-F main sequence dwarf (V=16.13) with possibly slightly subsolar metallicity, and to measure its radial velocity variations. The combined analysis of the light curves and spectroscopic data resulted in an orbital period of the substellar companion of 3.35 days, a planetary mass of 4.01 ± 0.35 Mj and a planetary radius of 1.49 ± 0.17 Rj. WTS-1b has one of the largest radius anomalies among the known hot Jupiters in the mass range 3-5 Mj.
Submitted 3 October, 2012;
originally announced October 2012.